Machine Learning Models for Air Pollution Health Risk Assessment

  • Lipatova A.V Moscow Technical University of Communications and Informatics, Moscow, Russian Federation
  • Potapchenko T.D Moscow Technical University of Communications and Informatics, Moscow, Russian Federation
Keywords: air pollution classification, public health risk assessment, machine learning, ensemble models, environmental monitoring

Abstract

This study explores the application of machine learning (ML) models and artificial neural networks (ANNs) to the assessment of public health risks associated with air pollution. Using a dataset of more than 12,000 records from India and Nepal, comprising both quantitative measurements and visual data, several classification models were built and evaluated to predict air quality index (AQI) categories that correspond to different health risk levels. The models comprise a decision tree (DT), a support vector machine (SVM), a random forest (RF), XGBoost, and deep neural networks (both convolutional and recurrent). The methodology consisted of data preprocessing, feature importance analysis, and model evaluation using accuracy metrics and ROC curves. All models achieved high classification accuracy (>90%), with ensemble-based methods performing best. XGBoost combined superior accuracy with efficient resource use, while the ANN models, especially long short-term memory (LSTM), reached 98% accuracy by the 15th training epoch. The feature importance analysis identified AQI, PM2.5, and PM10 as the primary predictors of the health risk category. The correlation analysis showed a strong association between the particulate matter measures (PM2.5 and PM10), underscoring their importance in air quality evaluation. The study proposes a methodological framework for automating risk assessment with machine learning, enabling more effective environmental health monitoring. The findings suggest that ensemble models offer the best balance between accuracy and computational efficiency for real-time air quality classification systems, with potential applications in early warning systems and public health interventions.
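
The evaluation and correlation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (accuracy, roc_auc, pearson) and every number in the toy data are hypothetical, chosen only to show how classification accuracy, a one-vs-rest ROC AUC, and the PM2.5/PM10 correlation might be computed with the Python standard library alone.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted AQI category matches the true one."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(is_positive, scores):
    """Area under the ROC curve for one class versus the rest.

    Uses the rank interpretation of AUC: the probability that a randomly
    chosen positive record receives a higher score than a randomly chosen
    negative record, with ties counted as one half.
    """
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative record")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data (NOT taken from the study's dataset).
true_category = [0, 1, 2, 2]          # integer-coded AQI categories
predicted_category = [0, 1, 1, 2]     # output of some classifier
is_high_risk = [True, True, False, False]
high_risk_score = [0.9, 0.4, 0.6, 0.1]  # classifier score for "high risk"
pm25 = [12.0, 35.0, 55.0, 150.0, 250.0]
pm10 = [30.0, 70.0, 110.0, 260.0, 420.0]

print("accuracy:", accuracy(true_category, predicted_category))
print("one-vs-rest AUC:", roc_auc(is_high_risk, high_risk_score))
print("PM2.5/PM10 correlation:", round(pearson(pm25, pm10), 4))
```

For a multi-class problem such as AQI category prediction, roc_auc would be called once per category, treating that category as positive and all others as negative. In practice a library such as scikit-learn would normally replace these hand-written helpers.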

Published
2025-04-30
How to Cite
Lipatova A.V, & Potapchenko T.D. (2025). Machine Learning Models for Air Pollution Health Risk Assessment. Journal of Systems Engineering and Information Technology (JOSEIT), 4(1). https://doi.org/10.29207/joseit.v4i1.6544
Section
Articles