Semantic Scholar | Open Access | 2025 | 2 citations

Hausa Handwriting Character Recognition Using CNN and Tesseract

Muhammad Khamis Dauda, U. Musa, Hassanin M. Al-Barhamtoshy

Abstract

The Hausa language, spoken by over 64 million people, is a vital part of West Africa's cultural and social fabric. Despite its importance, handwritten Hausa script recognition has received little attention, owing to the language's tonal nature, unique characters, and limited datasets. This study tackles these challenges by creating a dataset and evaluating two models commonly used in handwriting recognition: a Convolutional Neural Network (CNN) and Tesseract Optical Character Recognition (OCR). The custom CNN, trained on data from 30 volunteers with varied writing styles, achieved an accuracy of 96%, with strong precision, recall, and F1-score metrics. In contrast, Tesseract OCR struggled, particularly with Hausa-specific characters such as Ɓ, Ƙ, Ƴ, and ɗ, achieving its lowest accuracy on these classes. Although its recognition rate was non-zero for some characters, its performance was inconsistent, indicating difficulty generalizing. This work provides a scalable solution for handwritten Hausa character recognition, contributing to research on low-resource African languages and paving the way for better access to Hausa handwritten literature and integration into natural language processing applications.
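The abstract reports accuracy, precision, recall, and F1-score, and singles out per-class results for the Hausa-specific characters. As a rough illustration of how such per-class figures are typically computed, here is a minimal sketch in plain Python. The function names and the toy labels are hypothetical and are not taken from the paper; the numbers they produce say nothing about the study's actual results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


# Toy, made-up labels (NOT data from the paper): a recognizer that
# sometimes confuses hooked Hausa letters with their plain Latin look-alikes.
y_true = ["Ɓ", "Ɓ", "Ƙ", "Ƙ", "Ƴ", "ɗ"]
y_pred = ["Ɓ", "B", "Ƙ", "K", "Ƴ", "ɗ"]

if __name__ == "__main__":
    print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
    for label, (p, r, f1) in per_class_metrics(y_true, y_pred).items():
        print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting the metrics per class, rather than only as an overall accuracy, is what makes weak spots such as Ɓ, Ƙ, Ƴ, and ɗ visible when most other characters are recognized well.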

Authors (3)

Muhammad Khamis Dauda

U. Musa

Hassanin M. Al-Barhamtoshy

Citation Format

Dauda, M.K., Musa, U., Al-Barhamtoshy, H.M. (2025). Hausa Handwriting Character Recognition Using CNN and Tesseract. https://doi.org/10.1109/ICAISC64594.2025.10959458

Journal Information
Publication Year
2025
Language
en
Total Citations
2
Source Database
Semantic Scholar
DOI
10.1109/ICAISC64594.2025.10959458
Access
Open Access ✓