DOAJ Open Access 2025

Keelekorpus kui leksikograafi abiline kõnekeelsuse tuvastamisel

Lydia Risberg, Maria Tuulik, Margit Langemets, Kristina Koppel, Ene Vainik, +2 others

Abstract

Using corpus data to support lexicographers in identifying informal language

This study examines how new corpus analysis tools can assist lexicographers in determining whether to assign a word an informal register label in a dictionary. Labelling words in dictionaries is necessary for language users seeking register information. Moreover, there have been calls for the upcoming Dictionary of Standard Estonian (DSE, 2025) to clearly distinguish standard language from other linguistic varieties. Informal language was chosen for analysis because it is more difficult to define than other marked registers. In DSE 2018, some words were labelled as informal based on language planning decisions rather than empirical analysis. As register labels should be data-driven and based on corpus evidence, a systematic review of these words is necessary for the revised edition. Our study investigates how corpus genre data can support lexicographers in deciding whether to add or remove the informal label. We found that corpus data provided useful insights in 82.1% of cases. Based on our experiment, we developed a guideline to assist in labelling word meanings as informal. Namely, if a word occurs in blogs and forums in 36% or more of its total corpus occurrences, it may be considered as tending towards informal usage. This guideline is not a rigid rule but a supportive tool, as additional factors should be considered based on the lexicographer’s linguistic expertise. Users value reliable linguistic information in dictionaries. Our proposed guideline helps lexicographers make more systematic decisions while maintaining expert judgment as the ultimate determinant.
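The 36% guideline stated in the abstract can be illustrated with a minimal sketch. This is not the authors' tooling: the genre labels `blogs` and `forums`, the function names, and the sample counts are hypothetical and chosen only to show the arithmetic. The result is a hint for the lexicographer, not a decision.

```python
from typing import Mapping

# Assumed genre labels; the actual corpus may name its genres differently.
INFORMAL_GENRES = ("blogs", "forums")
# Threshold taken from the abstract: 36% or more of total occurrences.
THRESHOLD_PERCENT = 36


def informal_share_percent(genre_counts: Mapping[str, int]) -> float:
    """Share (in percent) of a word's occurrences found in blogs and forums."""
    total = sum(genre_counts.values())
    if total == 0:
        return 0.0
    informal = sum(genre_counts.get(g, 0) for g in INFORMAL_GENRES)
    return 100.0 * informal / total


def tends_informal(genre_counts: Mapping[str, int],
                   threshold_percent: int = THRESHOLD_PERCENT) -> bool:
    """True if the blog+forum share meets the threshold (integer comparison)."""
    total = sum(genre_counts.values())
    if total == 0:
        return False  # no evidence, so no hint either way
    informal = sum(genre_counts.get(g, 0) for g in INFORMAL_GENRES)
    return informal * 100 >= threshold_percent * total


# Hypothetical per-genre frequencies for one word sense.
sample = {"blogs": 20, "forums": 20, "news": 45, "fiction": 15}
flag = tends_informal(sample)
```

A word flagged this way would still be reviewed by hand, since the abstract stresses that expert judgment remains the ultimate determinant.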

Topics & Keywords

Other Finnic languages and dialects

Authors (7)

Lydia Risberg

Maria Tuulik

Margit Langemets

Kristina Koppel

Ene Vainik

Esta Prangel

Eleri Aedmaa

Citation Format


Risberg, L., Tuulik, M., Langemets, M., Koppel, K., Vainik, E., Prangel, E. et al. (2025). Keelekorpus kui leksikograafi abiline kõnekeelsuse tuvastamisel. https://doi.org/10.54013/kk811a3


Journal Information

Publication Year: 2025
Source Database: DOAJ
DOI: 10.54013/kk811a3
Access: Open Access ✓