arXiv Open Access 2026

ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs

Inês Vieira, Inês Calvo, Iago Paulo, James Furtado, Rafael Ferreira, +4 others

Abstract

As Large Language Models (LLMs) expand across multilingual domains, evaluating their performance in under-represented languages becomes increasingly important. European Portuguese (pt-PT) is particularly affected, as existing training data and benchmarks are mainly in Brazilian Portuguese (pt-BR). To address this, we introduce ALBA, a linguistically grounded benchmark designed from the ground up to assess LLM proficiency in linguistics-related tasks in pt-PT across eight linguistic dimensions: Language Variety, Culture-bound Semantics, Discourse Analysis, Word Plays, Syntax, Morphology, Lexicology, and Phonetics and Phonology. ALBA is manually constructed by language experts and paired with an LLM-as-a-judge framework for scalable evaluation of generated pt-PT text. Experiments on a diverse set of models reveal performance variability across linguistic dimensions, highlighting the need for comprehensive, variety-sensitive benchmarks that support further development of tools in pt-PT.


Authors (9)

Inês Vieira
Inês Calvo
Iago Paulo
James Furtado
Rafael Ferreira
Diogo Tavares
Diogo Glória-Silva
David Semedo
João Magalhães

Citation Format

Vieira, I., Calvo, I., Paulo, I., Furtado, J., Ferreira, R., Tavares, D. et al. (2026). ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs. https://arxiv.org/abs/2603.26516

Journal Information
Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access