Machine Learning Models for Air Pollution Health Risk Assessment
Abstract
This study explores the application of machine learning (ML) models and artificial neural networks (ANNs) to the assessment of public health risks associated with air pollution. Using a dataset of more than 12,000 records from India and Nepal, comprising both quantitative measurements and visual data, several classification models were built and evaluated to predict air quality index (AQI) categories corresponding to different health risk levels. The models were decision tree (DT), support vector machine (SVM), random forest (RF), XGBoost, and deep neural networks (both convolutional and recurrent). The methodology comprised data preprocessing, feature importance analysis, and model evaluation using accuracy metrics and ROC curves. All models achieved high classification accuracy (>90%), with ensemble-based methods performing best. XGBoost combined high accuracy with efficient resource use, while the ANN models, especially long short-term memory (LSTM), reached 98% accuracy by the 15th training epoch. The feature importance analysis identified AQI, PM2.5, and PM10 as the primary predictors of health risk category. The correlation analysis showed a strong association between the particulate matter measures (PM2.5 and PM10), underscoring their importance in air quality evaluation. This study proposes a methodological framework for automating risk assessment with machine learning, enabling more effective environmental health monitoring. The findings suggest that ensemble models offer a favorable balance between predictive accuracy and computational efficiency for real-time air quality classification systems, with potential applications in early warning systems and public health interventions.
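As an illustration of the evaluation step described in the abstract (accuracy and ROC curves for a multi-class problem), the following is a minimal sketch using only the Python standard library. It is not the authors' code: the three category names, the six toy records, and the per-class scores are hypothetical values chosen for illustration, and the one-vs-rest treatment of the ROC analysis is an assumption, since the abstract does not state how the multi-class curves were computed.

```python
# Hypothetical example: category names, records, and scores are invented
# for illustration and are NOT taken from the study's dataset.

def accuracy(y_true, y_pred):
    """Fraction of records whose predicted category equals the true one."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


def roc_auc_binary(is_positive, scores):
    """Area under the ROC curve for one binary split.

    Uses the rank formulation: the probability that a randomly chosen
    positive record scores higher than a randomly chosen negative one,
    with ties counted as one half.
    """
    pos = [s for flag, s in zip(is_positive, scores) if flag]
    neg = [s for flag, s in zip(is_positive, scores) if not flag]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative record")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(y_true, class_scores):
    """One-vs-rest AUC per category (assumed multi-class strategy)."""
    return {
        label: roc_auc_binary([t == label for t in y_true], scores)
        for label, scores in class_scores.items()
    }


# Toy ground truth for six records (hypothetical risk categories).
y_true = ["low", "low", "moderate", "high", "high", "moderate"]

# Hypothetical per-class scores, one value per record in each list.
class_scores = {
    "low":      [0.8, 0.3, 0.4, 0.1, 0.0, 0.2],
    "moderate": [0.1, 0.5, 0.6, 0.2, 0.1, 0.7],
    "high":     [0.1, 0.2, 0.0, 0.7, 0.9, 0.1],
}

# Predicted category = class with the highest score for each record.
labels = list(class_scores)
y_pred = [
    max(labels, key=lambda c: class_scores[c][i]) for i in range(len(y_true))
]

acc = accuracy(y_true, y_pred)
aucs = one_vs_rest_auc(y_true, class_scores)

print(f"accuracy: {acc:.3f}")
for label, auc in aucs.items():
    print(f"AUC ({label} vs rest): {auc:.3f}")
```

With real model outputs, `class_scores` would hold the predicted class probabilities from a fitted classifier; the pairwise loop is quadratic in the number of records, so a library routine would be preferable at the scale of the dataset described above.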
Copyright (c) 2025 Journal of Systems Engineering and Information Technology (JOSEIT)

This work is licensed under a Creative Commons Attribution 4.0 International License.