arXiv Open Access 2026

ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs

Inês Vieira, Inês Calvo, Iago Paulo, James Furtado, Rafael Ferreira, and 4 others

Abstract

As Large Language Models (LLMs) expand across multilingual domains, evaluating their performance in under-represented languages becomes increasingly important. European Portuguese (pt-PT) is particularly affected, as existing training data and benchmarks are mainly in Brazilian Portuguese (pt-BR). To address this, we introduce ALBA, a linguistically grounded benchmark designed from the ground up to assess LLM proficiency in linguistics-related tasks in pt-PT across eight linguistic dimensions: Language Variety, Culture-bound Semantics, Discourse Analysis, Word Plays, Syntax, Morphology, Lexicology, and Phonetics and Phonology. ALBA is manually constructed by language experts and paired with an LLM-as-a-judge framework for scalable evaluation of pt-PT generated language. Experiments on a diverse set of models reveal performance variability across linguistic dimensions, highlighting the need for comprehensive, variety-sensitive benchmarks that support further development of tools in pt-PT.

Topics & Keywords

cs.CL cs.AI cs.LG

Authors (9)

Inês Vieira

Inês Calvo

Iago Paulo

James Furtado

Rafael Ferreira

Diogo Tavares

Diogo Glória-Silva

David Semedo

João Magalhães

Citation Format

Vieira, I., Calvo, I., Paulo, I., Furtado, J., Ferreira, R., Tavares, D. et al. (2026). ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs. https://arxiv.org/abs/2603.26516

Journal Information

Publication Year: 2026
Language: en
Database Source: arXiv
Access: Open Access ✓