Hausa Handwriting Character Recognition Using CNN and Tesseract
Abstract
The Hausa language, spoken by over 64 million people, is a vital part of West Africa's cultural and social fabric. Despite its importance, handwritten Hausa script recognition has received little attention, owing to the language's tonal nature, its unique characters, and the scarcity of datasets. This study tackles these challenges by creating a dataset and evaluating two models commonly used in handwriting recognition: a custom Convolutional Neural Network (CNN) and Tesseract Optical Character Recognition (OCR). The CNN, trained on data from 30 volunteers with varied writing styles, achieved an accuracy of 96%, with strong precision, recall, and F1-score. In contrast, Tesseract OCR struggled, particularly with Hausa-specific characters such as Ɓ, Ƙ, Ƴ, and ɗ, for which it achieved its lowest accuracy. Although its recognition rate was non-zero for some characters, its performance was inconsistent, indicating difficulty in generalizing. This work provides a scalable solution for handwritten Hausa character recognition, contributing to research on low-resource African languages and paving the way for better access to handwritten Hausa literature and for integration into natural language processing applications.
Authors (3)
Muhammad Khamis Dauda
U. Musa
Hassanin M. Al-Barhamtoshy
- Publication Year
- 2025
- Language
- en
- Total Citations
- 2×
- Source Database
- Semantic Scholar
- DOI
- 10.1109/ICAISC64594.2025.10959458
- Access
- Open Access ✓