Hepatitis Classification Analysis Using Synthetic Minority Oversampling Technique, Support Vector Machine, and Random Forest

Authors

  • Amalia Nur Laily, Informatics Engineering, Universitas Nahdlatul Ulama Sunan Giri
  • Mula Agung Barata, Informatics Engineering, Universitas Nahdlatul Ulama Sunan Giri
  • Denny Nurdiansyah, Statistics, Universitas Nahdlatul Ulama Sunan Giri

DOI:

https://doi.org/10.51454/decode.v6i1.1630

Keywords:

Hepatitis, Random Forest, SMOTE, Support Vector Machine

Abstract

Hepatitis caused by viral infection remains a serious public health problem, so early detection based on clinical data is important for preventing further liver damage. This study analyzes the performance of the Support Vector Machine (SVM) and Random Forest algorithms for hepatitis classification and examines the effect of applying the Synthetic Minority Over-sampling Technique (SMOTE). The dataset is HepatitisCdata.csv from Kaggle, containing 615 patient records with demographic attributes and liver biochemical parameters. The research stages comprise data preprocessing, outlier handling, transformation of categorical attributes, and construction of baseline and SMOTE models. Evaluation uses 10-fold cross-validation with accuracy, precision, recall, and F1-score as metrics. The results show that SMOTE improves the performance of both algorithms, with Random Forest + SMOTE giving the best result (98.85% accuracy) compared with SVM + SMOTE (98.50%). The contribution of this study lies in using a uniform preprocessing and evaluation pipeline to compare the effect of SMOTE directly on two hepatitis classification algorithms.
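
The oversampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' implementation; the function name `smote`, its parameters, and the toy two-feature data are assumptions made for illustration only, and the sketch uses only the Python standard library.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; not the paper's code.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base sample,
        # excluding the base sample itself.
        neighbour_ids = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbour_ids)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data with two made-up features per record (not values from the dataset).
majority = [(0.1 * i, 1.0 + 0.05 * i) for i in range(20)]
minority = [(5.0, 5.0), (5.5, 4.8), (6.0, 5.2), (5.2, 5.6)]

# Generate enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In a cross-validated setting, oversampling is normally fitted on the training folds only, so that synthetic points derived from test records do not leak into training. The abstract does not state how the authors combined SMOTE with the 10-fold split.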

Published

2026-03-31

How to Cite

Laily, A. N., Barata, M. A., & Nurdiansyah, D. (2026). Analisis Klasifikasi Hepatitis Menggunakan Synthetic Minority Oversampling Technique, Support Vector Machine, dan Random Forest. Decode: Jurnal Pendidikan Teknologi Informasi, 6(1), 223–236. https://doi.org/10.51454/decode.v6i1.1630

Section

Articles