arXiv
Open Access
2023
NorQuAD: Norwegian Question Answering Dataset
Sardana Ivanova
Fredrik Aas Andreassen
Matias Jentoft
Sondre Wold
Lilja Øvrelid
Abstract
In this paper we present NorQuAD: the first Norwegian question answering dataset for machine reading comprehension. The dataset consists of 4,752 manually created question-answer pairs. We here detail the data collection procedure and present statistics of the dataset. We also benchmark several multilingual and Norwegian monolingual language models on the dataset and compare them against human performance. The dataset will be made freely available.
Journal Information
- Publication year: 2023
- Language: en
- Source database: arXiv
- Access: Open Access