TSUNAMI PREDICTION MODELING WITH MACHINE LEARNING USING PYCARET ON HISTORICAL EARTHQUAKE DATA
DOI: https://doi.org/10.31000/jika.v10i1.15571
Abstract
Indonesia faces a high tsunami risk due to its position on the Pacific Ring of Fire. This study analyzes the use of machine learning, through the PyCaret AutoML framework, for tsunami prediction based on earthquake parameters. The dataset consists of 782 earthquake records with 13 features. The methodology includes automated preprocessing with outlier removal (16.88%), an 80:20 train-test split, 10-fold cross-validation, and comprehensive evaluation. The results show that XGBoost achieved the best overall performance (93.95% accuracy, 97.19% AUC, 90.89% F1-score), LightGBM the highest AUC (97.35%), Random Forest the highest recall (93.11%), and SVM the lowest performance (75.81% accuracy). A detailed analysis of PyCaret's automated workflow validates the superiority of ensemble boosting for tsunami early warning systems in Indonesia.
Keywords: tsunami, machine learning, PyCaret, XGBoost, early warning
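The evaluation protocol named in the abstract (an 80:20 train-test split, 10-fold cross-validation, and metrics such as accuracy, recall, and F1-score) can be illustrated with a minimal standard-library sketch. This is not the study's PyCaret pipeline: the function names, the random seed, and the confusion-matrix counts are hypothetical, the folds are not stratified, and outlier removal is not modeled. Only the record count (782), the split ratio, and the fold count come from the abstract.

```python
import random


def train_test_split_indices(n_records, train_fraction=0.8, seed=42):
    """Shuffle record indices and split them into train and test lists."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    n_train = int(n_records * train_fraction)
    return indices[:n_train], indices[n_train:]


def k_fold_indices(indices, k=10):
    """Yield (train, validation) index lists for plain k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# 782 records as stated in the abstract; everything else is illustrative.
train_idx, test_idx = train_test_split_indices(782)
folds = list(k_fold_indices(train_idx, k=10))
print(len(train_idx), len(test_idx), len(folds))

# Hypothetical counts, chosen only to exercise the formulas.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

Each training record appears in exactly one validation fold, so every model is scored on data it did not see during that fold's fit. The held-out 20% stays untouched until the final evaluation.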
License and Copyright Agreement
In submitting the manuscript to the journal, the authors certify that:
- They are authorized by their co-authors to enter into these arrangements.
- The manuscript is not under consideration for publication elsewhere.
- Its publication has been approved by all the author(s) and by the responsible authorities, tacitly or explicitly, of the institutes where the work has been carried out.
- They secure the right to reproduce any material that has already been published or copyrighted elsewhere.
- They agree to the following license and copyright agreement.
Copyright
Authors who publish with International Journal of Advances in Intelligent Informatics agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
Licensing for Data Publication
The International Journal of Advances in Intelligent Informatics uses a variety of waivers and licenses that are specifically designed for, and appropriate for, the treatment of data:
- Open Data Commons Attribution License, http://www.opendatacommons.org/licenses/by/1.0/ (default)
- Creative Commons CC-Zero Waiver, http://creativecommons.org/publicdomain/zero/1.0/
- Open Data Commons Public Domain Dedication and Licence, http://www.opendatacommons.org/licenses/pddl/1-0/
Other data publishing licenses may be allowed as exceptions (subject to approval by the editor on a case-by-case basis) and should be justified with a written statement from the author, which will be published with the article.
Open Data and Software Publishing and Sharing
The journal strives to maximize the replicability of the research published in it. Authors are thus required to share all data, code or protocols underlying the research reported in their articles. Exceptions are permitted but have to be justified in a written public statement accompanying the article.
Datasets and software should be deposited and permanently archived in appropriate, trusted, general, or domain-specific repositories (please consult http://service.re3data.org and/or software repositories such as GitHub, GitLab, Bioinformatics.org, or equivalent). The associated persistent identifiers (e.g., DOI or others) of the dataset(s) must be included in the data or software resources section of the article. Reference(s) to datasets and software should also be included in the reference list of the article with DOIs (where available). Where no domain-specific data repository exists, authors should deposit their datasets in a general repository such as ZENODO, Dryad, Dataverse, or others.
Small datasets may also be published as data files or packages supplementary to a research article; however, authors should in all cases prefer deposition in a data repository.