Investigating Liver Disease Machine Learning Prediction Performance through Various Feature Selection Methods
DOI:
https://doi.org/10.30812/matrik.v24i3.4531
Keywords:
Feature Selection, Liver Disease Prediction, Ensemble Machine Learning
Abstract
Given the increasing prevalence and significant health burden of liver diseases globally, improving the accuracy of predictive models is essential for early diagnosis and effective treatment. This study systematically analyzes how different feature selection methods affect the performance of various machine learning classifiers for liver disease prediction. Four feature selection techniques (regular, analysis of variance (ANOVA), univariate, and model-based) were evaluated on a suite of classifiers: decision forest, decision tree, support vector classifier, multi-layer perceptron, and linear discriminant analysis. The results revealed a significant and variable impact of feature selection on model accuracy. Notably, the ANOVA method paired with the multi-layer perceptron achieved the highest accuracy of 0.801724, while the univariate method was optimal for the decision forest classifier (0.741379). In contrast, model-based selection often degraded performance, particularly for the decision tree classifier, likely due to the introduction of noise and overfitting. The support vector classifier, however, demonstrated robust and consistent accuracy across all selection techniques. These findings underscore that there is no universally superior feature selection method; instead, optimal predictive performance hinges on tailoring the selection technique to the specific machine learning model. This study contributes practical, evidence-based insights into the interplay between feature selection and model choice in medical data analysis, offering a guide for improving classification accuracy in liver disease prediction. Future work should explore more sophisticated and hybrid feature selection methods to further enhance model performance.
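To make the ANOVA technique named in the abstract concrete, the following is a minimal sketch of ANOVA-based feature ranking: each feature is scored by its one-way ANOVA F-statistic across the class labels, and the highest-scoring features are kept. It is an illustration only and not the authors' implementation. The function names `anova_f_score` and `select_k_best` and the toy data are hypothetical; in practice, a library routine such as scikit-learn's `f_classif` combined with `SelectKBest` would likely be used instead.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = mean(values)
    group_means = {y: mean(g) for y, g in groups.items()}
    # Between-group variability: distance of each class mean from the grand mean.
    ss_between = sum(
        len(g) * (group_means[y] - grand_mean) ** 2 for y, g in groups.items()
    )
    # Within-group variability: spread of samples around their own class mean.
    ss_within = sum(
        (v - group_means[y]) ** 2 for y, g in groups.items() for v in g
    )
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(rows, labels, k):
    """Return (sorted indices of the k highest-scoring features, all scores)."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels) for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Toy data (made up, not from the study): feature 0 separates the two classes
# strongly, feature 1 has identical class means, feature 2 separates weakly.
rows = [
    [1.0, 5.0, 2.0],
    [1.2, 3.0, 2.5],
    [0.9, 4.0, 1.5],
    [3.0, 4.5, 2.8],
    [3.1, 3.5, 3.2],
    [2.9, 4.0, 2.4],
]
labels = [0, 0, 0, 1, 1, 1]

selected, scores = select_k_best(rows, labels, k=2)
print(selected)
```

The selected columns would then be passed to each classifier in turn. Because the scores depend only on the data and not on the downstream model, the same subset can suit one classifier and hurt another, which is consistent with the abstract's conclusion that no single selection method is universally best.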
License
Copyright (c) 2025 Ahmad Zein Al Wafi, Febry Putra Rochim, Veda Bezaleel

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.