DOAJ Open Access 2022

Automatic Lemmatization of Old English Class III Strong Verbs (L-Y) with ALOEV3

Roberto Torre Alonso

Abstract

This article presents ALOEV3, a lemmatizer based on morphological generation that performs type-based automatic lemmatization of Old English Class III strong verbs beginning with the letters L–Y. The lemmatizer operates on the inflectional, derivational and morpho-phonological alternation rules characteristic of this class. The generated form types are checked against the two most reputable Old English corpora, the Dictionary of Old English Corpus and The York-Toronto-Helsinki Parsed Corpus of Old English Prose, to validate their attestation and assign the corresponding lemma. Results show that 97 percent of the validated forms are successfully assigned a single lemma. The remaining inflectional forms (38 out of 1,256) show competition between two lemmas, which implies that, despite the lemmatizer's high accuracy, contextual, token-based analysis is still needed for disambiguation. However, the research shows that competition occurs only in a limited set of lemma pairs and their derivatives. Although the research focuses on a single strong verb class, it confirms that exploring automatic lemmatization will contribute to Old English lexicography, either by lemmatizing attested inflectional form types or by highlighting areas for manual revision.
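The generate-then-validate workflow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ALOEV3 itself: the stem table, the ending sets, the optional ge- participle rule, and the tiny `attested` word list are all hypothetical stand-ins chosen for the example, and the real system's rules and corpus lookups are not given in this abstract.

```python
from collections import defaultdict

# Toy principal-part stems: (present, preterite sg., preterite pl., past participle).
# Illustrative entries only; this is NOT the article's rule set.
STEMS = {
    "singan": ("sing", "sang", "sung", "sung"),
    "gesingan": ("gesing", "gesang", "gesung", "gesung"),
}

# A small, simplified subset of endings per stem slot (assumption for the sketch).
ENDINGS = {
    0: ("an", "e", "est", "eþ", "aþ"),  # present-stem forms
    1: ("",),                           # preterite singular (1st/3rd person)
    2: ("on", "e"),                     # preterite plural and 2nd sg./subjunctive
    3: ("en",),                         # past participle
}


def generate(lemma):
    """Generate candidate form types for one lemma from its stems and endings."""
    stems = STEMS[lemma]
    forms = {stems[slot] + ending
             for slot, endings in ENDINGS.items()
             for ending in endings}
    # Past participles may optionally carry ge-; this is one way a simplex
    # verb and its ge- derivative can end up sharing a form.
    participle = stems[3] + "en"
    if not participle.startswith("ge"):
        forms.add("ge" + participle)
    return forms


def lemmatize(attested, lemmas):
    """Map each attested form type to every lemma that can generate it."""
    index = defaultdict(set)
    for lemma in lemmas:
        for form in generate(lemma):
            if form in attested:  # validation against the (stand-in) corpus
                index[form].add(lemma)
    return index


# Hypothetical stand-in for a word list extracted from a corpus.
attested = {"sang", "sungon", "gesungen", "singaþ"}

result = lemmatize(attested, STEMS)
unambiguous = {f: next(iter(ls)) for f, ls in result.items() if len(ls) == 1}
ambiguous = {f: sorted(ls) for f, ls in result.items() if len(ls) > 1}

print(unambiguous)
print(ambiguous)
```

In this toy run, three forms receive a single lemma, while "gesungen" is generated by both lemmas and would need token-based, contextual disambiguation. That is the kind of lemma competition the abstract reports for 38 of 1,256 validated forms.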

Citation Format

Alonso, R.T. (2022). Automatic Lemmatization of Old English Class III Strong Verbs (L-Y) with ALOEV3. https://doi.org/10.18172/jes.5324

Journal Information
Year of publication: 2022
Source database: DOAJ
DOI: 10.18172/jes.5324
Access: Open Access