DOAJ Open Access 2024

Significance of relative phase features for shouted and normal speech classification

Khomdet Phapatanaburi, Longbiao Wang, Meng Liu, Seiichi Nakagawa, Talit Jumphoo, Peerapong Uthansakul

Abstract

Shouted and normal speech classification plays an important role in many speech-related applications. The existing works are often based on magnitude-based features and ignore phase-based features, which are directly related to magnitude information. In this paper, the importance of phase-based features is explored for the detection of shouted speech. The novel contributions of this work are as follows. (1) Three phase-based features, namely, relative phase (RP), linear prediction analysis estimated speech-based RP (LPAES-RP) and linear prediction residual-based RP (LPR-RP) features, are explored for shouted and normal speech classification. (2) We propose a new RP feature, called the glottal source-based RP (GRP) feature. The main idea of the proposed GRP feature is to exploit the difference between RP and LPAES-RP features to detect shouted speech. (3) A score combination of phase- and magnitude-based features is also employed to further improve the classification performance. The proposed feature and combination are evaluated using the shouted normal electroglottograph speech (SNE-Speech) corpus. The experimental findings show that the RP, LPAES-RP, and LPR-RP features provide promising results for the detection of shouted speech. We also find that the proposed GRP feature can provide better results than those of the standard mel-frequency cepstral coefficient (MFCC) feature. Moreover, compared to using individual features, the score combination of the MFCC and RP/LPAES-RP/LPR-RP/GRP features yields an improved detection performance. Performance analysis under noisy environments shows that the score combination of the MFCC and the RP/LPAES-RP/LPR-RP features gives more robust classification. These outcomes show the importance of RP features in distinguishing shouted speech from normal speech.
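The score combination mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted linear combination below, the weight `alpha`, the decision `threshold`, and the function names `fuse_scores` and `classify` are all assumptions made for illustration, not the authors' method. Scores are assumed to be per-utterance values where larger means "more likely shouted".

```python
def fuse_scores(mfcc_scores, phase_scores, alpha=0.5):
    """Weighted linear score-level fusion (an assumed rule, not from the paper).

    mfcc_scores:  per-utterance scores from a magnitude-based (MFCC) classifier
    phase_scores: per-utterance scores from a phase-based (e.g. RP) classifier
    alpha:        hypothetical weight on the MFCC scores, in [0, 1]
    """
    if len(mfcc_scores) != len(phase_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [alpha * m + (1.0 - alpha) * p
            for m, p in zip(mfcc_scores, phase_scores)]


def classify(fused_scores, threshold=0.0):
    """Label each utterance by comparing its fused score to a threshold."""
    return ["shouted" if s > threshold else "normal" for s in fused_scores]


if __name__ == "__main__":
    # Made-up scores for two utterances, for illustration only.
    fused = fuse_scores([1.0, -2.0], [3.0, -4.0], alpha=0.5)
    print(classify(fused))
```

In practice the weight and threshold would be tuned on held-out data, and the two score streams would usually be normalized to a comparable range before being combined.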

Authors (6)

Khomdet Phapatanaburi

Longbiao Wang

Meng Liu

Seiichi Nakagawa

Talit Jumphoo

Peerapong Uthansakul

Citation Format

Phapatanaburi, K., Wang, L., Liu, M., Nakagawa, S., Jumphoo, T., Uthansakul, P. (2024). Significance of relative phase features for shouted and normal speech classification. https://doi.org/10.1186/s13636-023-00324-4

Journal Information
Publication Year: 2024
Source Database: DOAJ
DOI: 10.1186/s13636-023-00324-4
Access: Open Access