arXiv Open Access 2023

IslamicPCQA: A Dataset for Persian Multi-hop Complex Question Answering in Islamic Text Resources

Arash Ghafouri, Hasan Naderi, Mohammad Aghajani asl, Mahdi Firouzmandi

Abstract

Answering complex questions from multiple sources of information remains one of the main challenges for question answering systems. Multi-hop questions are a type of complex question that requires multi-step reasoning to answer. This article introduces the IslamicPCQA dataset, the first Persian dataset for answering complex questions over unstructured information sources. It consists of 12,282 question-answer pairs extracted from 9 Islamic encyclopedias. The dataset was created following an approach inspired by the English HotpotQA dataset, customized to suit the complexities of the Persian language. Answering its questions requires reasoning over more than one paragraph. The questions are not limited to any prior knowledge base or ontology, and to support robust reasoning, the dataset also includes supporting facts and key sentences. The dataset covers a wide range of Islamic topics and aims to facilitate answering complex Persian questions within this subject matter.
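
As a rough illustration of what a multi-hop record with supporting facts might look like, the sketch below uses a HotpotQA-style layout. The field names (question, answer, context, supporting_facts), the helper functions, and the English placeholder values are all assumptions made for illustration; the abstract does not specify the actual IslamicPCQA schema.

```python
from typing import Any, Dict, List

# Hypothetical record in a HotpotQA-style layout. Field names and values
# are illustrative assumptions, not taken from the IslamicPCQA release.
record: Dict[str, Any] = {
    "question": "In which city was the author of Book X born?",
    "answer": "City Y",
    # Each context entry is [paragraph title, list of sentences].
    "context": [
        ["Book X", ["Book X is a work by Scholar Z.", "It has ten chapters."]],
        ["Scholar Z", ["Scholar Z was a jurist.", "He was born in City Y."]],
    ],
    # Each supporting fact is [paragraph title, sentence index].
    "supporting_facts": [["Book X", 0], ["Scholar Z", 1]],
}


def supporting_sentences(rec: Dict[str, Any]) -> List[str]:
    """Return the sentences referenced by supporting_facts, in listed order."""
    paragraphs = {title: sentences for title, sentences in rec["context"]}
    return [paragraphs[title][idx] for title, idx in rec["supporting_facts"]]


def is_multi_hop(rec: Dict[str, Any]) -> bool:
    """Treat a question as multi-hop if its evidence spans several paragraphs."""
    return len({title for title, _ in rec["supporting_facts"]}) > 1
```

In this placeholder record, the question can only be answered by combining one sentence from each of the two paragraphs, which is the property the abstract describes as requiring more than one paragraph and reasoning.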

Citation

Ghafouri, A., Naderi, H., asl, M.A., Firouzmandi, M. (2023). IslamicPCQA: A Dataset for Persian Multi-hop Complex Question Answering in Islamic Text Resources. https://arxiv.org/abs/2304.11664

Publication Information
Publication year: 2023
Language: en
Source database: arXiv
Access: Open Access