Comparative Analysis of Machine Learning and Deep Learning for Air Quality Prediction Using Meteorological and Climate Data
This repository contains the code for our research on air quality prediction from meteorological and climate data using XGBoost, LSTM, and Informer. The goal is to compare the models' predictive performance and efficiency, and to analyze feature importance, when predicting PM2.5 concentrations across multiple cities.
The associated paper is available on IEEE Xplore.
Comparative_Analysis_Of_Machine_Learning_and_Deep_Learning_For_Air_Quality_Prediction
│
├── Dataset and Training File/ # Dataset and scripts (training and testing)
│   ├── General_EDA.ipynb # General EDA on the dataset
│   ├── Informer_Model_Training_(Exponential_Smoothing)_Fix (1).ipynb # Informer model training and testing
│   ├── LSTM_Preprocessing_Training.ipynb # LSTM model training and testing
│   ├── XGBoost_EDA_and_Preprocessing_Training.ipynb # XGBoost model training and testing
│   ├── combined_dataset.csv # Processed dataset
│   └── t_paired_test_For_RMSE_per_city_From_Each_Model.ipynb # Paired t-test script
│
└── README.md # Main project documentation
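The paired t-test notebook listed above compares per-city RMSE values between models. As a rough illustration of that kind of test, here is a minimal sketch using only the Python standard library. The function name `paired_t_statistic` and the sample RMSE values are hypothetical and are not taken from the notebook or the paper; the notebook's own implementation may differ.

```python
import math
import statistics


def paired_t_statistic(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for two paired samples.

    Each position in sample_a and sample_b must refer to the same unit
    (here, the same city evaluated by two different models).
    """
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("samples must have equal length of at least 2")

    # Work on the per-pair differences, as the paired test requires.
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)

    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Made-up per-city RMSE values for two models (illustration only).
rmse_model_a = [0.30, 0.25, 0.40, 0.35]
rmse_model_b = [0.20, 0.20, 0.30, 0.20]
t_stat, dof = paired_t_statistic(rmse_model_a, rmse_model_b)
print(f"t = {t_stat:.4f}, degrees of freedom = {dof}")
```

This sketch stops at the t statistic. If SciPy is installed, `scipy.stats.ttest_rel(rmse_model_a, rmse_model_b)` returns the statistic together with a p-value.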
1. Clone the repository: `git clone https://github.com/Andersen-C/Comparative_Analysis_Of_Machine_Learning_and_Deep_Learning_For_Air_Quality_Prediction.git`
2. Open the notebook (`.ipynb`) files in Jupyter Notebook, Google Colab, or VS Code
3. Install all the required libraries
4. Run the notebook cells
The performance of each model is summarized below:
| Model | RMSE | MAE | R2 Score | MAPE |
|---|---|---|---|---|
| XGBoost | 0.1907 | 0.0939 | 0.9727 | 15.03% |
| LSTM | 0.0425 | 0.0215 | 0.9203 | 22.86% |
| Informer | 0.0253 | 0.0441 | 0.9666 | 69.12% |
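For readers unfamiliar with the four metrics in the table, the sketch below shows their standard definitions in plain Python. It is an illustration only: the function name `regression_metrics` and the sample values are hypothetical, and the notebooks may compute these metrics differently (for example with a library, or on scaled targets).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return a dict with RMSE, MAE, R2 and MAPE (in percent)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)   # residual sum of squares
    rmse = math.sqrt(ss_res / n)          # root mean squared error
    mae = sum(abs(e) for e in errors) / n  # mean absolute error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # R2 is undefined when the targets are constant.
    r2 = 1.0 - ss_res / ss_tot if ss_tot != 0 else float("nan")

    # MAPE is undefined for zero targets; this sketch skips them,
    # which is an assumption and only one of several workarounds.
    ratios = [abs(e / t) for e, t in zip(errors, y_true) if t != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    return {"rmse": rmse, "mae": mae, "r2": r2, "mape": mape}


# Made-up values for illustration; not data from this project.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

When RMSE and MAE are computed on the same predictions, RMSE is never smaller than MAE, which makes a quick sanity check when reading a results table.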
- Andersen Chandra - Lead Researcher
- Laurentius Nicholas - Lead Researcher
- Dr. Ir. Alexander Agung Santoso Gunawan, M.Si., M.Sc., IPM. - Supervisor
- Rilo Chandra Pradana - Supervisor