Enhancing Biometric Security: A Robust Voice Frequency Detector with CNN-BiLSTM and Anti-Spoofing Mechanisms
Abstract
This study introduces a Voice Frequency Detector (VFD) framework that strengthens biometric password authentication against three key challenges: spoofing attacks, environmental noise, and natural variation in a speaker's voice due to health, emotion, or aging. The system uses dynamic vocal features, including fundamental frequency (F0), Mel-Frequency Cepstral Coefficients (MFCCs), and formant structures, and feeds them to a hybrid CNN-BiLSTM deep learning model with attention mechanisms for robust spectral-temporal analysis. An anti-spoofing subsystem uses spectral flatness and phase distortion features to detect synthetic and replayed voices. The methodology comprises signal preprocessing (Wiener filtering, voice activity detection), feature extraction, and score fusion, which combines the deep learning outputs with the anti-spoofing results. Experiments on a dataset of 100 speakers and 1,000 spoofed samples show an equal error rate (EER) of 2.8% in controlled conditions and 5.0% in noisy environments, with over 91% accuracy against replay, synthetic, and voice conversion attacks. Statistical analysis indicates that MFCCs are the most discriminative feature, accounting for 62% of the variance. The VFD framework offers a secure, adaptive, and practical voice authentication solution suitable for finance, IoT, and access control applications. Future enhancements may explore multi-modal integration and transformer-based architectures for broader applicability.
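Three of the quantities named in the abstract (spectral flatness, score fusion, and the EER) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the weighted-sum fusion rule, and the `weight` parameter are assumptions made for illustration, since the abstract does not specify how the scores are combined.

```python
import numpy as np


def spectral_flatness(frame, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum.

    Values near 1 indicate a noise-like (flat) spectrum; values near 0
    indicate a tonal spectrum. eps avoids log(0) on empty bins.
    """
    power = np.abs(np.fft.rfft(frame)) ** 2 + eps
    return float(np.exp(np.mean(np.log(power))) / np.mean(power))


def fuse_scores(speaker_score, spoof_score, weight=0.7):
    """Hypothetical weighted-sum fusion (assumption, not from the paper).

    speaker_score: verification score in [0, 1], higher = more likely genuine.
    spoof_score:   anti-spoofing score in [0, 1], higher = more likely spoofed.
    """
    return weight * speaker_score + (1.0 - weight) * (1.0 - spoof_score)


def equal_error_rate(genuine, impostor):
    """Approximate EER: the operating point where the false acceptance
    rate (FAR) and false rejection rate (FRR) are closest."""
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)
    thresholds = np.unique(np.concatenate([genuine, impostor]))
    # A trial is accepted when its score is >= the threshold.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    frr = np.array([(genuine < t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(far - frr)))
    return float((far[i] + frr[i]) / 2.0)
```

In a pipeline of the kind described, `spectral_flatness` would be computed per frame after preprocessing, and `equal_error_rate` would be evaluated on the fused scores of genuine and impostor trials. The fusion weight would need to be tuned on held-out data.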
Authors (2)
Mahfudz Ahnan Al Faruq
Mohammad Givi Efgivia
Quick Access
- Publication Year
- 2025
- Source Database
- DOAJ
- DOI
- 10.58482/ijeresm.v4i2.3
- Access
- Open Access ✓