arXiv Open Access 2021

Linguistic and Gender Variation in Speech Emotion Recognition using Spectral Features

Zachary Dair, Ryan Donovan, Ruairi O'Reilly

Abstract

This work explores the effect of gender and linguistic-based vocal variations on the accuracy of emotive expression classification. Emotive expressions are considered from the perspective of spectral features in speech (Mel-frequency Cepstral Coefficients, Mel spectrogram, Spectral Contrast). Emotions are considered from the perspective of Basic Emotion Theory. A convolutional neural network is utilised to classify emotive expressions in emotive audio datasets in English, German, and Italian. Vocal variations for spectral features are assessed by (i) a comparative analysis identifying suitable spectral features, (ii) the classification performance for mono, multi and cross-lingual emotive data and (iii) an empirical evaluation of a machine learning model to assess the effects of gender and linguistic variation on classification accuracy. The results showed that spectral features provide a potential avenue for increasing the accuracy of emotive expression classification. Additionally, the accuracy of emotive expression classification was high within mono and cross-lingual emotive data, but poor in multi-lingual data. Similarly, there were differences in classification accuracy between gender populations. These results demonstrate the importance of accounting for population differences to enable accurate speech emotion recognition.
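
Two of the spectral features named in the abstract (Mel-frequency Cepstral Coefficients and the Mel spectrogram) are built on the mel scale. The abstract does not describe the authors' extraction pipeline, so the following is only a minimal sketch of the underlying idea, assuming the common HTK-style mel formula and triangular filters. All function names and parameters here are hypothetical and chosen for illustration.

```python
import math
from typing import List


def hz_to_mel(f_hz: float) -> float:
    """HTK-style mel scale (an assumption): m = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_mels: int, f_min: float, f_max: float) -> List[float]:
    """Return n_mels + 2 frequencies in Hz, equally spaced on the mel scale."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + i * step) for i in range(n_mels + 2)]


def triangular_weight(f_hz: float, lo: float, centre: float, hi: float) -> float:
    """Weight of frequency f_hz in a triangular filter spanning lo..centre..hi."""
    if f_hz <= lo or f_hz >= hi:
        return 0.0
    if f_hz <= centre:
        return (f_hz - lo) / (centre - lo)
    return (hi - f_hz) / (hi - centre)


def mel_energies(power_spectrum: List[float], sample_rate: int, n_mels: int) -> List[float]:
    """Collapse one frame's power spectrum (bins 0..Nyquist) into n_mels band energies."""
    n_bins = len(power_spectrum)
    bin_hz = (sample_rate / 2.0) / (n_bins - 1)  # frequency spacing between bins
    edges = mel_band_edges(n_mels, 0.0, sample_rate / 2.0)
    energies = []
    for b in range(n_mels):
        lo, centre, hi = edges[b], edges[b + 1], edges[b + 2]
        energies.append(
            sum(p * triangular_weight(k * bin_hz, lo, centre, hi)
                for k, p in enumerate(power_spectrum))
        )
    return energies


# Usage: a flat spectrum of 257 bins at 16 kHz reduced to 8 mel bands.
bands = mel_energies([1.0] * 257, sample_rate=16000, n_mels=8)
```

Stacking such band energies over successive frames gives a Mel spectrogram, and taking a log followed by a discrete cosine transform of each frame gives cepstral coefficients. In practice an audio library would normally be used instead of hand-written filters.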

Topics & Keywords

cs.SD eess.AS

Authors (3)

Zachary Dair

Ryan Donovan

Ruairi O'Reilly

Citation Format

Dair, Z., Donovan, R., O'Reilly, R. (2021). Linguistic and Gender Variation in Speech Emotion Recognition using Spectral Features. https://arxiv.org/abs/2112.09596

Journal Information

Publication Year: 2021
Language: en
Source Database: arXiv
Access: Open Access ✓