DOAJ Open Access 2025

Matrix Similarity Analysis of Texts Written in Belarusian and Ukrainian

Artur Niewiarowski, Anna Plichta

Abstract

This publication presents the results of a study on text similarity between Belarusian and Ukrainian, utilizing a matrix-based analysis method grounded in edit distance. A distinctive feature of this approach is the absence of language-specific vocabulary rules, highlighting the algorithm’s linguistic universality in similarity analysis. The analyzed texts were sourced from excerpts of online encyclopedias, translated using AI-powered online translation services provided by well-known companies. The primary objective of this study is to determine whether it is possible to compare texts written in these languages without prior translation into a common language. Additionally, it aims to assess whether a method that does not belong to the large language model (LLM) family or the broader category of AI-based approaches can effectively compare languages within the same linguistic group. Furthermore, the study provides insights into the degree of similarity between Belarusian and Ukrainian, investigating the extent to which speakers of one language might partially understand the other.
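
The abstract does not specify how the matrix is built, so the following is only a minimal sketch of one generic way to form a word-by-word similarity matrix from edit distance, not the authors' algorithm. It assumes the standard Levenshtein distance and a normalization by the longer word's length; the function names (`levenshtein`, `similarity`, `similarity_matrix`) and the sample word pairs are illustrative choices, not taken from the paper.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein distance (insert, delete, substitute; cost 1 each)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer word."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def similarity_matrix(words_a: List[str], words_b: List[str]) -> List[List[float]]:
    """Rows follow words_a, columns follow words_b."""
    return [[similarity(x, y) for y in words_b] for x in words_a]


if __name__ == "__main__":
    # Illustrative word pairs only: Belarusian vs. Ukrainian "water", "milk".
    belarusian = ["вада", "малако"]
    ukrainian = ["вода", "молоко"]
    for row in similarity_matrix(belarusian, ukrainian):
        print(["%.2f" % value for value in row])
```

Because the comparison works on raw characters, no dictionary or grammar rules for either language are needed, which is in the spirit of the language-independence the abstract emphasizes.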

Authors (2)

Artur Niewiarowski

Anna Plichta

Citation Format

Niewiarowski, A., Plichta, A. (2025). Matrix Similarity Analysis of Texts Written in Belarusian and Ukrainian. https://doi.org/10.24423/cames.2025.1657

Journal Information
Publication Year
2025
Source Database
DOAJ
DOI
10.24423/cames.2025.1657
Access
Open Access ✓