arXiv Open Access 2025

LengClaro2023: A Dataset of Administrative Texts in Spanish with Plain Language adaptations

Belén Agüera-Marco, Itziar Gonzalez-Dios

Abstract

In this work, we present LengClaro2023, a dataset of legal-administrative texts in Spanish. Based on the most frequently used procedures from the Spanish Social Security website, we have created for each text two simplified equivalents. The first version follows the recommendations provided by arText claro. The second version incorporates additional recommendations from plain language guidelines to explore further potential improvements in the system. The linguistic resource created in this work can be used for evaluating automatic text simplification (ATS) systems in Spanish.

Authors (2)

Belén Agüera-Marco

Itziar Gonzalez-Dios

Citation Format

Agüera-Marco, B., Gonzalez-Dios, I. (2025). LengClaro2023: A Dataset of Administrative Texts in Spanish with Plain Language adaptations. https://arxiv.org/abs/2506.05927

Journal Information
Year published: 2025
Language: English
Source database: arXiv
Access: Open Access