Adaptive Boosted Support Vector Machine-Random Forest for Environmental Sound Classification

Faiz Ul Hasnain; Visha Iqbal; Tayyaba Javed; Muhammad Yasir

Authors

Faiz Ul Hasnain, Department of Computer Science, University of Education Lahore, Vehari Campus, Vehari, Pakistan.
Visha Iqbal, Department of Computer Science, University of Management and Technology Lahore, Pakistan.
Tayyaba Javed, Department of Computer Science, Barani Institute of Information Technology, Rawalpindi 46604, Pakistan.
Muhammad Yasir, Department of Computer Science, University of Management and Technology Lahore, Pakistan.

Keywords:

ESC, Environmental Sound, UrbanSound8K, ESC-10, ESC-50, AdaBoost, Feature Fusion

Abstract

Environmental sound classification (ESC) is the task of distinguishing audio recordings of different environmental sounds. Environmental sounds have a more complex time-frequency structure than structured sounds such as music and speech. To extract frequency- and time-based features from audio more accurately and effectively, this study evaluates a novel fusion of several features, including MFCCs, Mel-spectrogram, spectral skewness, spectral kurtosis, and normalized pitch frequency, to provide a comprehensive representation of environmental sounds. The fusion captures several aspects of the input audio, such as spectral characteristics, statistical properties, and frequency-related information. By using multimodal information fusion, the algorithm improves the discriminative power of the model so that it distinguishes between different sounds more effectively. Moreover, integrating a variety of machine learning models enhances the robustness and generalization ability of the model. Combining several machine learning models also reduces training time and improves the classification rate of environmental audio under limited computational resources. Furthermore, this study employs three data augmentation methods, namely time stretching, pitch tuning, and white noise addition, to reduce the risk of overfitting caused by the limited number of audio clips in each class of the datasets. The classification accuracy of the ensemble model is evaluated against baseline SVM and RF classifiers and other state-of-the-art approaches. On the UrbanSound8K, ESC-50, and ESC-10 datasets, the highest accuracies achieved with the AdaBoost SVM-RF classifier were 94%, 85%, and 95%, respectively. The experimental findings demonstrate that the proposed approach achieves superior performance on ESC tasks.
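
As a rough illustration of two ingredients named in the abstract, the sketch below computes spectral skewness and kurtosis for one audio frame and applies white-noise augmentation. It is not the authors' implementation. The moment definitions (normalized magnitudes treated as a distribution over frequency), the SNR-based noise scaling, and all function names (`spectral_moments`, `spectral_shape`, `add_white_noise`) are assumptions made for this example; MFCCs, the Mel-spectrogram, pitch features, and the AdaBoost SVM-RF classifier are left out.

```python
import cmath
import math
import random


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N//2 (adequate for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_moments(mags, freqs):
    """Return (centroid, spread, skewness, kurtosis).

    Assumed definition: magnitudes are normalized to sum to 1 and used as
    weights over the bin frequencies; skewness and kurtosis are the third
    and fourth standardized moments of that weighted distribution.
    """
    total = sum(mags)
    if total == 0:
        return 0.0, 0.0, 0.0, 0.0
    weights = [m / total for m in mags]
    centroid = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    spread = math.sqrt(variance)
    if spread == 0:
        return centroid, 0.0, 0.0, 0.0
    skewness = sum(((f - centroid) / spread) ** 3 * w
                   for f, w in zip(freqs, weights))
    kurtosis = sum(((f - centroid) / spread) ** 4 * w
                   for f, w in zip(freqs, weights))
    return centroid, spread, skewness, kurtosis


def spectral_shape(frame, sample_rate):
    """Spectral moments of one time-domain frame."""
    mags = magnitude_spectrum(frame)
    freqs = [k * sample_rate / len(frame) for k in range(len(mags))]
    return spectral_moments(mags, freqs)


def add_white_noise(signal, snr_db, rng):
    """Add Gaussian white noise scaled to roughly the requested SNR (dB)."""
    power = sum(x * x for x in signal) / len(signal)
    sigma = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Usage: features of a clean frame and of its noise-augmented copy.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate)
         + 0.5 * math.sin(2 * math.pi * 2500 * t / sample_rate)
         for t in range(64)]
augmented = add_white_noise(frame, snr_db=20.0, rng=random.Random(0))
clean_features = list(spectral_shape(frame, sample_rate))
augmented_features = list(spectral_shape(augmented, sample_rate))
```

In a fuller pipeline, per-frame values like these would typically be summarized over the clip and concatenated with the other features before being passed to a classifier; how the study does this is not stated in the abstract.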
