Multivariate Air Pollution Anomaly Detection via LSTM Autoencoders on Beijing Multi-Site Sensor Data

Authors

  • Muhammad Talha Gulzar, Department of Computer Sciences, Lahore Garrison University, Lahore, Pakistan.
  • Maria Tariq, Department of Computer Sciences, Lahore Garrison University, Lahore, Pakistan.
  • Khushbu Khalid Butt, Department of Information Technology, Lahore Garrison University, Lahore, Pakistan.
  • Sundus Muir, Department of Criminology, Lahore Garrison University, Lahore, Pakistan.
  • Omer Irshad, Department of Software Engineering, Lahore Garrison University, Lahore, Pakistan.

Keywords:

Air Pollution, Anomaly Detection, Time-Series Analysis, LSTM Autoencoder, Deep Learning, Environmental Monitoring, Multivariate Sensor Data, Unsupervised Learning, Beijing Air Quality Data

Abstract

This paper investigates the application of a Long Short-Term Memory Autoencoder (LSTM-AE) to anomaly detection in multivariate time series of air-pollution measurements. The model was applied to sensor readings of PM2.5, CO, O3, NO2, temperature (TEMP), and wind speed (WSPM) from the Beijing Multi-Site Air-Quality Data Set available on Kaggle. The test set included fifty synthetic anomalies, which were used to assess performance. Anomalies were identified by comparing the reconstruction error, measured as Mean Squared Error (MSE), against a dynamic threshold of 0.002957. The model achieved a Precision, Recall, F1-Score, and ROC-AUC of 77.00%, 100.00%, 87.01%, and 99.86%, respectively, indicating that it can detect both subtle and abrupt changes in air-quality patterns.
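The decision rule summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window length of 24 time steps, the toy data, and the assumption that inputs are scaled to a small range are all hypothetical. Only the six features, the MSE criterion, the threshold value 0.002957, and the reported Precision and Recall come from the abstract. How the dynamic threshold was derived is not stated there, so the sketch takes it as given.

```python
import numpy as np

THRESHOLD = 0.002957  # threshold value reported in the abstract


def reconstruction_mse(windows, reconstructions):
    """Per-window MSE, averaged over time steps and features.

    Both inputs are assumed to have shape (n_windows, n_steps, n_features).
    """
    windows = np.asarray(windows, dtype=float)
    reconstructions = np.asarray(reconstructions, dtype=float)
    return np.mean((windows - reconstructions) ** 2, axis=(1, 2))


def flag_anomalies(errors, threshold=THRESHOLD):
    """Mark a window as anomalous when its error exceeds the threshold."""
    return np.asarray(errors) > threshold


def precision_recall_f1(y_true, y_pred):
    """Compute Precision, Recall, and F1 from boolean labels and flags."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return float(precision), float(recall), float(f1)


def f1_from(precision, recall):
    """Harmonic mean of Precision and Recall."""
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): 4 windows, 24 time steps, 6 features
# (PM2.5, CO, O3, NO2, TEMP, WSPM), standing in for scaled sensor data.
windows = np.zeros((4, 24, 6))
reconstructions = windows + 0.01   # small error on every window
reconstructions[3] += 0.09         # window 3 is reconstructed poorly

errors = reconstruction_mse(windows, reconstructions)
flags = flag_anomalies(errors)
labels = np.array([False, False, False, True])  # hypothetical ground truth
metrics = precision_recall_f1(labels, flags)

# Consistency check on the reported scores: Precision 0.77, Recall 1.00.
reported_f1 = f1_from(0.77, 1.0)
```

In this toy run only the last window exceeds the threshold. The final line confirms that the reported F1-Score of 87.01% is the harmonic mean of the reported Precision and Recall.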

Published

2025-11-29

How to Cite

Muhammad Talha Gulzar, Maria Tariq, Khushbu Khalid Butt, Sundus Muir, & Omer Irshad. (2025). Multivariate Air Pollution Anomaly Detection via LSTM Autoencoders on Beijing Multi-Site Sensor Data. Journal of Computing & Biomedical Informatics, 10(01). Retrieved from https://jcbi.org/index.php/Main/article/view/1127