Predictive Analysis of Smog Exposure and Its Impact on Human Health Outcomes

Authors

  • Zonesha Pervaiz Department of Mathematics & Physics The Superior University Lahore, 54000, Pakistan.
  • Rehman Sharif Department of Mathematics & Physics The Superior University Lahore, 54000, Pakistan.
  • Ayesha Khalid Department of Mathematics & Physics The Superior University Lahore, 54000, Pakistan.
  • Muhammad Rshad Department of Mathematics & Physics The Superior University Lahore, 54000, Pakistan.

Keywords:

Smog Pollution, Air Quality Analysis, Human Health Effects, ML Prediction, Environmental Hazards, Lahore Pakistan

Abstract

This paper analyzes and predicts the human health effects of smog exposure in Lahore, Pakistan. Smog has become a persistent environmental hazard there, driven by industrial emissions, vehicular pollution, and rapid urbanization. This quantitative study combines two datasets: a five-year air quality dataset (2020-2024) and a population survey on awareness of smog and its effects on health. The air quality data, retrieved from the Kaggle repository Pakistani Cities AQI (2020-2024), cover the key pollutants PM2.5, PM10, NO2, NH3, SO2, CO, and O3. The data were cleaned, standardized, and aligned with the survey responses. Exploratory data analysis (EDA) with the Python tools Pandas, NumPy, Matplotlib, and Seaborn showed that PM2.5 and PM10 were strongly correlated (r = 0.84), which suggests shared sources, mostly vehicular and industrial activity. The negative relationship observed for ozone was indicative of photochemical reactions. Seasonal analysis showed that smog was most intense in winter, from November to January, because of temperature inversion and stagnant air. Health outcomes were predicted using the machine learning methods Random Forest, XGBoost, and Logistic Regression. Random Forest and XGBoost achieved higher predictive accuracy and identified the importance of factors such as income, education, particulate matter levels, and residential proximity to busy areas. The models were evaluated using accuracy, F1-score, and ROC-AUC. The results for Lahore show that particulate pollutants, in particular PM2.5 and PM10, are the main cause of respiratory and other health problems.
The study's predictive framework offers a data-driven way to forecast smog-related health risks and supports an evidence-based approach to improving population health and managing air quality.
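
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of how a Pearson correlation coefficient and the three evaluation measures (accuracy, F1-score, ROC-AUC) can be computed. It is not the authors' code. All readings, labels, and scores below are made-up illustrative values, the function names are hypothetical, and the 0.5 decision threshold is an assumption. The study itself used Pandas and machine learning libraries; this sketch uses only the Python standard library.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive case is scored above a random
    negative case; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up pollutant readings (not taken from the study's dataset).
pm25 = [35.0, 60.0, 90.0, 150.0, 210.0]
pm10 = [70.0, 110.0, 160.0, 240.0, 330.0]
r = pearson_r(pm25, pm10)

# Made-up labels (1 = health problem reported) and model scores.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"Pearson r: {r:.3f}")
print(f"Accuracy:  {accuracy(y_true, y_pred):.3f}")
print(f"F1-score:  {f1_score(y_true, y_pred):.3f}")
print(f"ROC-AUC:   {roc_auc(y_true, scores):.3f}")
```

Accuracy and F1-score depend on the chosen decision threshold, while ROC-AUC is computed from the ranking of the scores alone, which is one reason the three measures are often reported together.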

Published

2025-09-01

How to Cite

Zonesha Pervaiz, Rehman Sharif, Ayesha Khalid, & Muhammad Rshad. (2025). Predictive Analysis of Smog Exposure and Its Impact on Human Health Outcomes. Journal of Computing & Biomedical Informatics, 9(02). Retrieved from https://jcbi.org/index.php/Main/article/view/1107