Analysis of Kazakh Language Abbreviations Based on Machine Learning Approach
Abstract
This research aimed to analyze the use of abbreviations in the Kazakh language using a machine learning approach. The most commonly used abbreviations in Kazakh texts were studied, analyzed, and classified. Several machine learning models, including naive Bayes, neural networks, and support vector machines (SVMs), were tested. The experimental part used the linguistic corpus files Abbreviations-abb.xml and abbreviations.csv. The results showed that the SVM outperformed the other models, with an accuracy of 0.85. The unique features of the Kazakh language, such as the use of the Cyrillic alphabet and complex word forms, were also discussed. The implications of this study for natural language processing and computational linguistics were presented, and the limitations of the study were discussed. This study contributes to understanding the use of abbreviations in the Kazakh language and demonstrates the potential of machine learning approaches for analyzing languages with complex characteristics.
Authors (3)
D. Rakhimova
Yerkin Suleimenov
Dinara Makulbek
- Year Published
- 2023
- Language
- en
- Total Citations
- 4×
- Source Database
- Semantic Scholar
- DOI
- 10.1109/ubmk59864.2023.10286751
- Access
- Open Access ✓