Matrix Similarity Analysis of Texts Written in Belarusian and Ukrainian
Abstrak
This publication presents the results of a study on text similarity between Belarusian and Ukrainian, utilizing a matrix-based analysis method grounded in edit distance. A distinctive feature of this approach is the absence of language-specific vocabulary rules, highlighting the algorithm’s linguistic universality in similarity analysis. The analyzed texts were sourced from excerpts of online encyclopedias, translated using AI-powered online translation services provided by well-known companies. The primary objective of this study is to determine whether it is possible to compare texts written in these languages without prior translation into a common language. Additionally, it aims to assess whether a method that does not belong to the large language model (LLM) family or the broader category of AI-based approaches can effectively compare languages within the same linguistic group. Furthermore, the study provides insights into the degree of similarity between Belarusian and Ukrainian, investigating the extent to which speakers of one language might partially understand the other.
Topik & Kata Kunci
Penulis (2)
Artur Niewiarowski
Anna Plichta
Akses Cepat
- Tahun Terbit
- 2025
- Sumber Database
- DOAJ
- DOI
- 10.24423/cames.2025.1657
- Akses
- Open Access ✓