Voice-Based Gender Identification Using Machine Learning

Authors

  • Umair Ijaz, Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan.
  • Muhammad Munwar Iqbal, Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan.
  • Ze Shan Ali, Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan.
  • Anees Tariq, Department of Robotics and AI, SZABIST University, Islamabad, Pakistan.
  • Romail Khan, Department of Computer Science, University of Engineering and Technology, Taxila, Pakistan.

Keywords

Gender Classification, Machine Learning, Audio Signal Processing, Automatic Gender Identification (AGI), Feature Extraction, TensorFlow, Classification Accuracy, Voice-Based Systems, Speech Applications

Abstract

Automatic gender classification (AGC) from voice signals plays a key role in biometric authentication, speech analytics, and human-computer interaction. This study proposes a hybrid machine learning framework that combines Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction, Principal Component Analysis (PCA) for dimensionality reduction, and a Convolutional Neural Network (CNN) for classification. The model was trained on a curated dataset of 3,497 Urdu-language voice samples collected from publicly available YouTube recordings, covering speakers of different genders and dialects. To address limitations of prior approaches, the proposed method pairs traditional spectral features with deep learning to improve classification performance. The system achieved an accuracy of 98.4%, along with strong precision, recall, and F1-score, outperforming baseline models such as Support Vector Machines (SVM) and k-Nearest Neighbours (KNN). These findings support the model’s applicability to real-world use cases, including virtual assistants, automated call routing, and emotion-aware computing systems.
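
The first two stages named in the abstract (MFCC extraction followed by PCA) can be illustrated with a minimal NumPy-only sketch. It is not the authors' implementation. The frame length, hop size, filter count, number of coefficients, per-clip mean pooling, and the synthetic sine-tone "clips" are all assumptions chosen for illustration, and the helper names (`mel_filterbank`, `mfcc`, `pca_fit_transform`) are hypothetical. The CNN classifier is omitted, since the abstract does not describe its architecture.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centres are evenly spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    # All default parameter values are illustrative assumptions.
    # 1) Pre-emphasis to boost high frequencies.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2) Slice into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # 3) Power spectrum, then log mel-filterbank energies.
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = np.log(power @ mel_filterbank(n_filters, n_fft, sample_rate).T + 1e-10)
    # 4) Unnormalised DCT-II over the filter axis; keep the first n_coeffs.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    return energies @ basis.T  # shape: (n_frames, n_coeffs)

def pca_fit_transform(X, n_components):
    # PCA via SVD of the mean-centred data matrix.
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Synthetic stand-ins for voice clips: noisy tones at assumed pitches (Hz).
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
clips = [np.sin(2 * np.pi * f0 * t) + 0.05 * rng.standard_normal(sr)
         for f0 in (110, 120, 130, 200, 210, 220)]

# One feature vector per clip: the mean MFCC over frames (an assumed pooling).
features = np.stack([mfcc(c, sr).mean(axis=0) for c in clips])
reduced = pca_fit_transform(features, n_components=2)
print(features.shape, reduced.shape)
```

In a full pipeline, the reduced features (or the frame-level MFCC matrix) would then be passed to a classifier; how the paper connects the PCA output to the CNN input is not stated in the abstract.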

Published

2025-06-01

How to Cite

Umair Ijaz, Muhammad Munwar Iqbal, Ze Shan Ali, Anees Tariq, & Romail Khan. (2025). Voice-Based Gender Identification Using Machine Learning. Journal of Computing & Biomedical Informatics, 9(01). Retrieved from https://jcbi.org/index.php/Main/article/view/955