Comparative Evaluation of MLR and SVM Algorithms for DKI Jakarta Air Quality Prediction

Authors

  • Arfany Dhimas Muftareza, Sekolah Tinggi Meteorologi Klimatologi dan Geofisika

DOI:

https://doi.org/10.55123/jomlai.v4i2.5369

Keywords:

Air Quality, Machine Learning, Prediction, MLR, SVM

Abstract

This research applies two machine learning algorithms, Multiple Linear Regression (MLR) and Support Vector Machine (SVM), to predict air quality categories in Jakarta from key pollutant parameters: PM10, PM2.5, NO2, CO, SO2, and O3. The dataset consists of ISPU (Indonesia's Air Pollutant Standard Index) measurements from five air quality monitoring stations in DKI Jakarta Province in 2021. The research process comprises data collection, data cleaning, model implementation with the scikit-learn library, and model performance evaluation using accuracy, R-squared, RMSE, and MAE. The evaluation shows that SVM outperforms MLR on every metric: higher accuracy (91.78% vs. 90.41%), higher R-squared (69.63% vs. 64.56%), lower RMSE (0.2867 vs. 0.3097), and lower MAE (0.0822 vs. 0.0959), indicating that the SVM model's error is smaller than the MLR model's. The study demonstrates that machine learning-based models can provide accurate air quality category predictions, although predicting the “Good” category remains a challenge that requires further development, such as data balancing and advanced feature engineering, to improve prediction accuracy across all categories.
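As a rough illustration of how the four evaluation metrics relate when the target is an integer-coded category, the following sketch computes accuracy, R-squared, RMSE, and MAE in plain Python. The category coding (0 = Good, 1 = Moderate, 2 = Unhealthy), the `evaluate` helper, and the label arrays are all hypothetical. The labels were chosen only so that the arithmetic lands on the SVM figures reported in the abstract; the actual test set, its size, and its label coding are not given there.

```python
import math

def evaluate(y_true, y_pred):
    """Return accuracy, R-squared, RMSE, and MAE for integer-coded categories."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    sse = sum(e * e for e in errors)                 # sum of squared errors
    sst = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "accuracy": sum(e == 0 for e in errors) / n,
        "r2": 1 - sse / sst,
        "rmse": math.sqrt(sse / n),
        "mae": sum(abs(e) for e in errors) / n,
    }

# Hypothetical labels (assumed coding): 0 = Good, 1 = Moderate, 2 = Unhealthy.
y_true = [0] * 2 + [1] * 46 + [2] * 25   # 73 synthetic test samples
y_pred = list(y_true)
y_pred[0] = y_pred[1] = 1                # two "Good" days predicted as "Moderate"
for i in range(-4, 0):                   # four "Unhealthy" days predicted as "Moderate"
    y_pred[i] = 1

metrics = evaluate(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
# {'accuracy': 0.9178, 'r2': 0.6963, 'rmse': 0.2867, 'mae': 0.0822}
```

When every misclassification is off by exactly one category, each absolute error equals 1, so MAE equals the error rate (1 minus accuracy) and RMSE equals its square root. The reported values for both models fit that pattern (for example, 1 - 0.9178 = 0.0822 and the square root of 0.0822 is about 0.2867), though this is an observation about the published numbers rather than a statement about the author's procedure.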

Published

2025-06-20

How to Cite

Arfany Dhimas Muftareza. (2025). Comparative Evaluation of MLR and SVM Algorithms for DKI Jakarta Air Quality Prediction. JOMLAI: Journal of Machine Learning and Artificial Intelligence, 4(2), 116–126. https://doi.org/10.55123/jomlai.v4i2.5369

Section

Articles