arXiv Open Access 2024

Danoliteracy of Generative Large Language Models

Søren Vejlgaard Holm, Lars Kai Hansen, Martin Carsten Nielsen

Abstract

The language technology moonshot moment of Generative Large Language Models (GLLMs) was not limited to English: these models brought a surge of technological applications, investments, and hype to low-resource languages as well. However, the capabilities of these models in languages such as Danish were, until recently, difficult to verify beyond qualitative demonstrations due to a lack of applicable evaluation corpora. We present a GLLM benchmark to evaluate \emph{Danoliteracy}, a measure of Danish language and cultural competency across eight diverse scenarios such as Danish citizenship tests and abstractive social media question answering. This limited-size benchmark was found to produce a robust ranking that correlates with human feedback at $\rho \sim 0.8$, with GPT-4 and Claude Opus models achieving the highest rankings. Analyzing these model results across scenarios, we find one strong underlying factor explaining $95\%$ of scenario performance variance for GLLMs in Danish, suggesting a $g$ factor of model consistency in language adaptation.

Topics & Keywords

cs.CL, cs.AI, cs.LG

Authors (3)

Søren Vejlgaard Holm

Lars Kai Hansen

Martin Carsten Nielsen

Citation

Holm, S.V., Hansen, L.K., Nielsen, M.C. (2024). Danoliteracy of Generative Large Language Models. https://arxiv.org/abs/2410.22839

Journal Information

Year of Publication: 2024
Language: en
Source Database: arXiv
Access: Open Access ✓