Performance Evaluation of Machine Learning and Deep Learning Models for Rainfall Classification Using Climatic Datasets of Pakistan

Authors

  • Hira Farman, Department of Computer Science, Karachi Institute of Economics and Technology, Pakistan & Department of Computer Science, Iqra University, Karachi, Pakistan.
  • Arif Hussain, Department of Computer Science, Karachi Institute of Economics and Technology, Pakistan.
  • Jahangir Baig, Department of Statistics, University of Karachi, Pakistan.
  • Sharaf Hussain, Department of Computer Science, Iqra University, Karachi, Pakistan.
  • Alisha Farman, Department of Computer Science, Iqra University, Karachi, Pakistan.
  • Qurat-ul-ain Mastoi, School of Computer Science and Creative Technologies, University of the West of England, Bristol, United Kingdom.

Keywords:

Forecasting, Machine Learning, Classification

Abstract

This research seeks to improve rainfall forecasting, which is essential to agricultural operations, water management, and disaster preparedness, especially during floods and droughts. Accurate rainfall forecasting is critical to sustainable development because it helps mitigate the effects of extreme weather such as flooding, which can cause loss of life, health problems, and economic disruption. However, because rainfall is inherently unpredictable, traditional forecasting models struggle: in most cases they cannot capture the complex interactions that shape meteorological patterns. To address this, the study applies several machine learning (ML) algorithms, namely Random Forest (RF), Logistic Regression (LR), Support Vector Machine (SVM), AdaBoost, Gradient Boosting, K-Nearest Neighbors (KNN), and Naive Bayes (NB), to improve rainfall prediction for Karachi, Pakistan. The dataset, provided by Visual Crossing, contains 33 weather-related variables, including temperature, humidity, wind speed, and air pressure, and 4,778 observations recorded between 2011 and 2023. Thorough preprocessing, comprising data cleaning, transformation, and feature selection, was applied before the dataset was divided into training and test sets. The models were evaluated using 5-fold cross-validation, and performance was assessed in terms of precision, recall, accuracy, and ROC curves. Random Forest proved the most accurate, at 99 percent, while Naive Bayes showed the greatest overfitting of all the models. AdaBoost and Gradient Boosting performed similarly, and both handled the problem of overfitting.
Moreover, a deep learning network (BiLSTM) was used to capture temporal correlations in the rainfall sequence. It achieved a test accuracy of 99.8 percent, which supports the reliability of deep learning models alongside conventional ML models. The results show that both machine learning and deep learning algorithms can learn complex climate patterns and can considerably improve the accuracy of weather predictions. These models can support better-informed, data-driven decisions on climate resilience, disaster preparedness, and sustainable environmental management.
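
The evaluation protocol described in the abstract (a train/test split followed by 5-fold cross-validation over seven classifiers, scored on accuracy, precision, recall, and ROC) can be sketched roughly as follows. This is a minimal illustration and not the authors' code. It assumes scikit-learn, uses a synthetic dataset in place of the Visual Crossing data, and picks hyperparameters, scaling steps, the 80/20 split, and the class balance arbitrarily. The BiLSTM model is not covered here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the weather table (assumption: NOT the real dataset).
# Label 1 plays the role of "rain", label 0 of "no rain".
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    weights=[0.8, 0.2], random_state=0,
)

# Hold out a test set first; cross-validation runs on the training part only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The seven classifier families named in the abstract, with illustrative settings.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Logistic Regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "Naive Bayes": GaussianNB(),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["accuracy", "precision", "recall", "roc_auc"]

results = {}
for name, model in models.items():
    scores = cross_validate(
        model, X_train, y_train, cv=cv, scoring=scoring, return_train_score=True
    )
    # Mean validation score per metric across the five folds.
    row = {metric: scores[f"test_{metric}"].mean() for metric in scoring}
    # A large gap between training and validation accuracy hints at overfitting.
    row["train_val_gap"] = (
        scores["train_accuracy"].mean() - scores["test_accuracy"].mean()
    )
    results[name] = row

for name, row in results.items():
    print(name, {key: round(value, 3) for key, value in row.items()})
```

The `train_val_gap` column is one simple way to compare how strongly each model overfits; the abstract does not say how the authors measured overfitting, so this choice is an assumption.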

Published

2025-09-01

How to Cite

Hira Farman, Arif Hussain, Jahangir Baig, Sharaf Hussain, Alisha Farman, & Qurat-ul-ain Mastoi. (2025). Performance Evaluation of Machine Learning and Deep Learning Models for Rainfall Classification Using Climatic Datasets of Pakistan. Journal of Computing & Biomedical Informatics, 9(02). Retrieved from https://jcbi.org/index.php/Main/article/view/1113