Web-Based Flood Prediction Application Using the XGBoost Algorithm
Flood Prediction, XGBoost, Machine Learning, Data Preprocessing, Model Evaluation
Abstract
This study aims to develop a flood risk prediction model using the XGBoost algorithm and a dataset available on Kaggle. The dataset covers a range of factors that influence flood risk, such as dam quality, drainage system erosion, landslides, and wetland loss. The research began with data collection, followed by preprocessing that comprised handling missing values, regression-based feature selection to identify the most influential features, and data normalization. The XGBoost model was then trained on the processed data and evaluated with several metrics. The evaluation shows that the model performs very well, with a Cross-Validation RMSE of 0.00097, a Mean Squared Error (MSE) of 1.0336 × 10⁻⁶, a Root Mean Squared Error (RMSE) of 0.001017, a Mean Absolute Error (MAE) of 0.000801, and a Mean Absolute Percentage Error (MAPE) of 0.1605%. These values indicate relatively small prediction errors. Visualizations of the results also show that the model has no systematic bias and that prediction errors are evenly distributed. This research is urgent given the rising frequency and impact of floods driven by climate change and rapid urbanization. The model is expected to be usable for early warning and to support better spatial planning that reduces the impact of flood disasters.
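The error metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `regression_metrics` and the sample values are hypothetical, chosen only to show how MSE, RMSE, MAE, and MAPE relate to one another.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and MAPE (in percent) for paired values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "mape": mape}


# Hypothetical flood-probability values, for illustration only
# (these are not the study's data).
y_true = [0.45, 0.50, 0.55, 0.60]
y_pred = [0.451, 0.499, 0.551, 0.599]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Because RMSE is the square root of MSE, an RMSE near 0.001 corresponds to an MSE near 10⁻⁶, which is a quick consistency check when the two are reported side by side.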