Enhancing Predictive Models: An In-depth Analysis of Feature Selection Techniques Coupled with Boosting Algorithms
DOI:
https://doi.org/10.30812/matrik.v23i2.3788
Keywords:
Boosting Algorithm, Feature Selection, Fetal Health Dataset, Fetal Health, Recursive Feature Elimination
Abstract
This research addresses the need to improve predictive models for fetal health classification using Cardiotocography (CTG) data. The literature review highlights three challenges: imbalanced labels, feature selection, and efficient data handling. To address them, the study combines Recursive Feature Elimination (RFE) with boosting algorithms (XGBoost, AdaBoost, LightGBM, CatBoost, and Histogram-Based Boosting) to refine model performance. The results show notable variation in precision, recall, F1-score, accuracy, and AUC across algorithms and RFE configurations. Random Forest with XGBoost performs best, with precision of 0.940, recall of 0.890, F1-score of 0.920, accuracy of 0.950, and AUC of 0.960. Conversely, Logistic Regression with AdaBoost performs worse, and omitting RFE also affects model effectiveness. In conclusion, the study applies RFE and boosting algorithms to improve fetal health classification models, contributing insights for improved prenatal diagnosis.
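The pairing of RFE with a boosting classifier described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn is available, uses synthetic data in place of the CTG dataset (the 21 features, three imbalanced classes, and 10 retained features are placeholder choices), and pairs a Random Forest ranking estimator with AdaBoost only because both ship with scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for CTG data (assumption): 21 features, 3 imbalanced classes.
X, y = make_classification(
    n_samples=600, n_features=21, n_informative=8, n_redundant=4,
    n_classes=3, weights=[0.78, 0.14, 0.08], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RFE refits the ranking estimator and drops the least important feature
# each round until n_features_to_select remain (10 is an arbitrary choice).
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=50, random_state=0),
    n_features_to_select=10,
    step=1,
)
X_train_sel = selector.fit_transform(X_train, y_train)
X_test_sel = selector.transform(X_test)

# Boosting classifier trained on the reduced feature set only.
booster = AdaBoostClassifier(n_estimators=100, random_state=0)
booster.fit(X_train_sel, y_train)
y_pred = booster.predict(X_test_sel)

# Macro averaging weights every class equally, which matters for imbalanced labels.
scores = {
    "precision": precision_score(y_test, y_pred, average="macro", zero_division=0),
    "recall": recall_score(y_test, y_pred, average="macro", zero_division=0),
    "f1": f1_score(y_test, y_pred, average="macro", zero_division=0),
}
print(scores)
```

Fitting the same booster on the full `X_train` gives a baseline without RFE, which is the kind of with/without comparison the abstract refers to. The selector is fitted on the training split only, so the test split does not influence which features are kept.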