DOAJ Open Access 2026

Prosodic information extraction and classification based on MFCC features and machine learning models

Sajid Habib Gill, Javed Ahmed Mahar, Shahid Ali Mahar, Mirza Abdur Razzaq, Arif Mehmood, +2 others

Abstract

Punjabi is an old Indo-Aryan language spoken across the world, particularly in Pakistan and India. Because Punjabi is a tonal, low-resource language, little research has been done on it so far, especially in the South Punjab belt. The language is divided into several dialects, and the core objective of this research is to characterize the diversity of tonal qualities in the Majhi Punjabi dialect. Speech-processing applications are usually influenced by prosodic properties such as pitch, amplitude, and duration. A speech corpus of 7712 spoken words was collected from 241 native speakers representing various age groups and genders. The proposed prosodic model uses Mel Frequency Cepstral Coefficients (MFCC) to extract prosodic features from the collected Majhi Punjabi utterances. The results suggest that tonal and dialectal word information has a considerable impact on the information delivered by the speaker, and the model shows gender-specific variations in tonal word amplitudes. The extracted prosodic information is classified with support vector machine, logistic regression, random forest, K-nearest neighbor, gradient boosting (GB), and extra tree classifier (ETC) models. The ETC and GB models performed best, with the highest accuracy of 97%. Four deep learning models were also implemented for comparison with the machine learning models; however, the deep learning models do not perform well on this dataset, and the highest accuracy among them, 86%, is achieved by CNN. This research will benefit Punjabi speech-processing applications. Additionally, the impact of dialectal variation highlights the rich diversity of spoken language and the importance of considering regional nuances in future investigations.
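To make the MFCC step mentioned in the abstract more concrete, the following is a minimal sketch of how cepstral coefficients can be computed for a single speech frame using only the Python standard library. It is not the authors' implementation: the abstract does not state the frame length, sampling rate, number of mel filters, or number of coefficients, so the values used here (256 samples, 8000 Hz, 20 filters, 13 coefficients) and all function names are assumptions chosen for illustration. The synthetic sine tones stand in for real utterances.

```python
import math

def hz_to_mel(f):
    # Common mel-scale formula (O'Shaughnessy variant).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Hamming-window the frame and return power at bins 0..N/2 (naive DFT)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale from 0 Hz to Nyquist."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):       # rising edge of the triangle
            row[k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge, peak of 1.0 at center
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=13):
    """MFCCs for one frame: power spectrum -> mel energies -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Floor the energies so the logarithm is always defined.
    log_e = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10))
             for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(n_coeffs)]

def tone(freq, sample_rate=8000, n=256):
    # Hypothetical stand-in for a real speech frame.
    return [math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]

low = mfcc(tone(200.0), 8000)    # energy concentrated in the low mel bands
high = mfcc(tone(2000.0), 8000)  # energy concentrated in the high mel bands
```

In a full pipeline, coefficient vectors like `low` and `high` would be computed per frame over each utterance, summarized per word, and passed to a classifier. In practice a dedicated audio library would normally replace the naive DFT used here.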

Authors (7)

Sajid Habib Gill
Javed Ahmed Mahar
Shahid Ali Mahar
Mirza Abdur Razzaq
Arif Mehmood
Gyu Sang Choi
Imran Ashraf

Citation Format

Gill, S.H., Mahar, J.A., Mahar, S.A., Razzaq, M.A., Mehmood, A., Choi, G.S. et al. (2026). Prosodic information extraction and classification based on MFCC features and machine learning models. https://doi.org/10.1177/00202940251315031

Journal Information
Publication Year: 2026
Source Database: DOAJ
DOI: 10.1177/00202940251315031
Access: Open Access