Investigating Liver Disease Machine Learning Prediction Performance through Various Feature Selection Methods
DOI: https://doi.org/10.30812/matrik.v24i3.4531
Keywords: Feature Selection, Liver Disease Prediction, Ensemble Machine Learning
Abstract
Given the increasing prevalence and significant health burden of liver diseases globally, improving the accuracy of predictive models is essential for early diagnosis and effective treatment. This study systematically analyzes how different feature selection methods affect the performance of various machine learning classifiers for liver disease prediction. Four feature selection techniques (regular, analysis of variance (ANOVA), univariate, and model-based) were evaluated on a suite of classifiers: decision forest, decision tree, support vector classifier, multi-layer perceptron, and linear discriminant analysis. The results revealed a significant and variable impact of feature selection on model accuracy. Notably, the ANOVA method paired with the multi-layer perceptron achieved the highest accuracy of 0.801724, while the univariate method was optimal for the decision forest classifier (0.741379). In contrast, model-based selection often degraded performance, particularly for the decision tree classifier, likely due to the introduction of noise and overfitting. The support vector classifier, however, demonstrated robust and consistent accuracy across all selection techniques. These findings underscore that there is no universally superior feature selection method; instead, optimal predictive performance hinges on tailoring the selection technique to the specific machine learning model. This study contributes practical, evidence-based insights into the interplay between feature selection and model choice in medical data analysis, offering a guide for improving classification accuracy in liver disease prediction. Future work should explore more sophisticated and hybrid feature selection methods to further enhance model performance.
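The abstract does not show how the ANOVA selection step was implemented, so the following is only a minimal sketch of the general idea: score each feature with a one-way ANOVA F statistic across the class labels and keep the highest-scoring ones. The function names (anova_f_score, select_k_best) and the toy data are hypothetical illustrations and are not taken from the study or its dataset.

```python
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic for one feature, grouped by class label."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Variation of the class means around the overall mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(X, y, k):
    """Return the indices of the k highest-scoring features, plus all scores."""
    n_features = len(X[0])
    scores = [anova_f_score([row[j] for row in X], y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy data (invented for illustration): 6 samples, 3 features, 2 classes.
X = [
    [1.0, 5.0, 0.3],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.1],
    [3.1, 5.5, 0.8],
    [2.9, 4.5, 0.2],
    [3.0, 5.0, 0.6],
]
y = [0, 0, 0, 1, 1, 1]

selected, scores = select_k_best(X, y, k=1)
print("selected feature indices:", selected)
print("F scores:", scores)
```

In this toy data only the first feature separates the two classes, so it receives the largest F score. A classifier would then be trained on the selected columns only; which classifier benefits from this step is exactly what the study reports as model-dependent.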
License
Copyright (c) 2025 Ahmad Zein Al Wafi, Febry Putra Rochim, Veda Bezaleel

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.