OPTIMIZING THE RANDOM FOREST ALGORITHM WITH FEATURE SELECTION AND HYPERPARAMETER TUNING FOR MUSIC GENRE CLASSIFICATION
DOI:
https://doi.org/10.31000/jika.v9i1.12216
Abstract
Listening to music is an important part of human life, but because genres are recognized subjectively, classifying music by genre is difficult. A careful and reliable approach is therefore needed to analyze and group music data. Random Forest is widely used for music genre classification, but it needs careful optimization through feature selection and hyperparameter tuning. This study aims to clarify the role that feature selection and hyperparameter tuning play in optimizing the performance of the Random Forest algorithm. Getting the most out of the algorithm can improve genre classification accuracy, which matters for building more accurate music recommendation systems. The study begins with data collection, followed by preprocessing to obtain clean data. Feature selection is then applied to keep the features that best represent the genre classes. Random Forest is used for classification, followed by hyperparameter tuning to find the optimal parameters. In testing, the Random Forest model achieved a ROC AUC of 0.909. After optimization the ROC AUC rose to 0.913, an improvement of 0.004, which places the model in the "excellent" evaluation category.
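The workflow summarized in the abstract (a baseline Random Forest, then feature selection and hyperparameter tuning, each scored by ROC AUC) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the abstract does not name the dataset, the feature selection method, the tuning strategy, or the parameter grid, so the synthetic data from `make_classification`, the use of `SelectKBest` with `f_classif`, the use of `GridSearchCV`, and every grid value here are assumptions chosen only to make the example runnable with scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Assumption: synthetic 4-class data stands in for the (unnamed) music dataset.
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8, n_redundant=2,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def macro_ovr_auc(model, X_eval, y_eval):
    """Macro-averaged one-vs-rest ROC AUC for a multiclass classifier."""
    probabilities = model.predict_proba(X_eval)
    return roc_auc_score(y_eval, probabilities, multi_class="ovr", average="macro")


# Step 1: baseline Random Forest with default-style settings.
baseline = RandomForestClassifier(n_estimators=100, random_state=0)
baseline.fit(X_train, y_train)
baseline_auc = macro_ovr_auc(baseline, X_test, y_test)

# Step 2: feature selection and Random Forest in one pipeline, so the
# selector is refit inside every cross-validation fold (avoids leakage).
pipeline = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("forest", RandomForestClassifier(random_state=0)),
])

# Assumption: a small illustrative grid; the paper's actual grid is unknown.
param_grid = {
    "select__k": [8, 12, "all"],
    "forest__n_estimators": [100, 200],
    "forest__max_depth": [None, 10],
}

# Step 3: grid search scored by one-vs-rest ROC AUC.
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc_ovr", cv=3)
search.fit(X_train, y_train)
tuned_auc = macro_ovr_auc(search.best_estimator_, X_test, y_test)

print(f"baseline ROC AUC: {baseline_auc:.3f}")
print(f"tuned ROC AUC:    {tuned_auc:.3f}")
print(f"difference:       {tuned_auc - baseline_auc:+.3f}")
print("best parameters:", search.best_params_)
```

On synthetic data the size and even the sign of the difference will vary, so the numbers printed here should not be compared with the 0.909 and 0.913 reported in the abstract. Wrapping the selector and the classifier in a single `Pipeline` is a design choice that lets the number of selected features be tuned alongside the forest's hyperparameters.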
License
License and Copyright Agreement
In submitting the manuscript to the journal, the authors certify that:
- They are authorized by their co-authors to enter into these arrangements.
- The manuscript is not under consideration for publication elsewhere.
- Its publication has been approved by all the authors and by the responsible authorities, tacitly or explicitly, of the institutes where the work has been carried out.
- They secure the right to reproduce any material that has already been published or copyrighted elsewhere.
- They agree to the following license and copyright agreement.
Copyright
Authors who publish with International Journal of Advances in Intelligent Informatics agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License (CC BY-SA 4.0) that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.
Licensing for Data Publication
International Journal of Advances in Intelligent Informatics uses a variety of waivers and licenses that are specifically designed for and appropriate for the treatment of data:
Open Data Commons Attribution License, http://www.opendatacommons.org/licenses/by/1.0/ (default)
Creative Commons CC-Zero Waiver, http://creativecommons.org/publicdomain/zero/1.0/
Open Data Commons Public Domain Dedication and Licence, http://www.opendatacommons.org/licenses/pddl/1-0/
Other data publishing licenses may be allowed as exceptions (subject to approval by the editor on a case-by-case basis) and should be justified with a written statement from the author, which will be published with the article.
Open Data and Software Publishing and Sharing
The journal strives to maximize the replicability of the research published in it. Authors are thus required to share all data, code or protocols underlying the research reported in their articles. Exceptions are permitted but have to be justified in a written public statement accompanying the article.
Datasets and software should be deposited and permanently archived in appropriate, trusted, general, or domain-specific repositories (please consult http://service.re3data.org and/or software repositories such as GitHub, GitLab, Bioinformatics.org, or equivalent). The associated persistent identifiers (e.g. DOI, or others) of the dataset(s) must be included in the data or software resources section of the article. Reference(s) to datasets and software should also be included in the reference list of the article with DOIs (where available). Where no domain-specific data repository exists, authors should deposit their datasets in a general repository such as ZENODO, Dryad, Dataverse, or others.
Small data may also be published as data files or packages supplementary to a research article; however, authors should in all cases prefer deposition in data repositories.