Comparison of Decision Tree-Based Methods in Lung Disease Detection
DOI:
https://doi.org/10.30812/bite.v7i1.4909
Keywords:
C4.5, Decision Tree, Lung Disease, Machine Learning, Random Forest, XGBoost
Abstract
Background: Lung disease is a leading cause of death globally, with more than 4 million cases each year, including 500,000 new cases in Indonesia, most of which are detected at an advanced stage.
Objective: This study aims to compare the performance of three decision tree algorithms, XGBoost, C4.5, and Random Forest, in detecting lung disease and to determine the best method based on evaluation metrics.
Methods: A total of 30,000 data samples from Kaggle were cleaned using the interquartile range (IQR) method, categorical attributes were encoded, and the data were split into 80% for training and 20% for testing. Three classification models were trained: XGBoost, C4.5, and Random Forest. Model performance was evaluated using a confusion matrix, accuracy, precision, recall, and F1-score.
Results: The C4.5 algorithm performed best, with an accuracy of 94.33% and zero false negatives. XGBoost followed with an accuracy of 93.18%, while Random Forest had the lowest accuracy (90.07%).
Conclusion: These findings indicate that C4.5 is a promising basis for an accurate early detection system that helps reduce the risk of misdiagnosis, especially false negatives, and supports clinical decision making in health facilities.
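The preprocessing and evaluation steps named in the Methods can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the column names (age, smoker), the toy rows, the 1.5 x IQR fence multiplier, the random seed, and the confusion-matrix counts are all assumptions made for the example, and the abstract does not state which quartile convention or encoding scheme the study used.

```python
import random
import statistics


def iqr_bounds(values, k=1.5):
    """Return the fences Q1 - k*IQR and Q3 + k*IQR (k=1.5 is an assumed default)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(rows, column, k=1.5):
    """Keep only rows whose value in `column` lies inside the IQR fences."""
    low, high = iqr_bounds([r[column] for r in rows], k)
    return [r for r in rows if low <= r[column] <= high]


def encode_categories(rows, column):
    """Replace each distinct category with an integer code (alphabetical order)."""
    codes = {v: i for i, v in enumerate(sorted({r[column] for r in rows}))}
    return [{**r, column: codes[r[column]]} for r in rows], codes


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed, then hold out `test_fraction` of the rows."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy data: ten plausible ages plus one obvious outlier (400).
ages = [30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 400]
rows = [{"age": a, "smoker": "yes" if i % 2 else "no"}
        for i, a in enumerate(ages)]

cleaned = drop_outliers(rows, "age")
encoded, smoker_codes = encode_categories(cleaned, "smoker")
train, test = train_test_split(encoded, test_fraction=0.2, seed=0)

# Hypothetical counts, chosen only to show the effect of zero false negatives.
metrics = binary_metrics(tp=50, fp=5, fn=0, tn=45)
```

With zero false negatives, recall equals 1.0 regardless of the other counts, which is why the Results single out that property: in a screening setting, a false negative corresponds to a missed case of disease.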
License
Copyright (c) 2025 Lely Kurniawati, Dadang Priyanto, Neny Sulistia Ningsih, Moch Syahrir, Ria Rismayati

This work is licensed under a Creative Commons Attribution 4.0 International License.