Hepatitis Classification Analysis Using Synthetic Minority Oversampling Technique, Support Vector Machine, and Random Forest
DOI:
https://doi.org/10.51454/decode.v6i1.1630
Keywords:
Hepatitis, Random Forest, SMOTE, Support Vector Machine
Abstract
Hepatitis caused by viral infection remains a serious public health problem, so early detection based on clinical data is important for preventing further liver damage. This study analyzes the performance of the Support Vector Machine (SVM) and Random Forest algorithms for hepatitis classification and examines the effect of applying the Synthetic Minority Over-sampling Technique (SMOTE). The dataset is HepatitisCdata.csv from Kaggle, comprising 615 patient records with demographic attributes and liver biochemical parameters. The research stages cover data preprocessing, outlier handling, transformation of categorical attributes, and the construction of baseline and SMOTE models. Evaluation uses 10-fold cross-validation with accuracy, precision, recall, and F1-score as metrics. The results show that SMOTE improves the performance of both algorithms, with Random Forest + SMOTE giving the best result (accuracy 98.85%) compared with SVM + SMOTE (98.50%). The contribution of this study lies in using a uniform preprocessing and evaluation pipeline to compare the effect of SMOTE directly across two hepatitis classification algorithms.
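To make the oversampling step concrete, the following is a minimal sketch of the core SMOTE idea in plain Python: each synthetic sample is placed at a random point on the line segment between a minority-class sample and one of its nearest minority-class neighbours. It is not the authors' implementation, which the abstract does not describe. The function names `smote` and `balance`, the parameters `k` and `seed`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    minority: list of equal-length numeric feature lists (one class only).
    Hypothetical helper for illustration, not a library API.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than peers
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # New point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample minority_label until it matches the largest other class."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    counts = {}
    for label in y:
        counts[label] = counts.get(label, 0) + 1
    target = max((c for lab, c in counts.items() if lab != minority_label),
                 default=0)
    n_new = max(0, target - len(minority))
    new_points = smote(minority, n_new, k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy data (assumed values, not taken from HepatitisCdata.csv):
# four majority samples (label 0) and two minority samples (label 1).
X = [[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.2], [5.0, 5.0], [5.5, 4.8]]
y = [0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X, y, minority_label=1, k=1)
```

In a cross-validation setting, a common practice is to apply oversampling only to the training portion of each fold, so that synthetic points do not leak into the held-out data. The abstract does not state how this was handled in the study.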
License
Copyright (c) 2026 Amalia Nur Laily, Mula Agung Barata, Denny Nurdiansyah

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.








