Enhancing Predictive Models: An In-depth Analysis of Feature Selection Techniques Coupled with Boosting Algorithms
DOI:
https://doi.org/10.30812/matrik.v23i2.3788
Keywords:
Boosting Algorithm, Feature Selection, Fetal Health Dataset, Fetal Health, Recursive Feature Elimination
Abstract
This research addresses the need to improve predictive models for fetal health classification from Cardiotocography (CTG) data. The literature review highlights three challenges: imbalanced labels, feature selection, and efficient data handling. To address them, the study combines Recursive Feature Elimination (RFE) with boosting algorithms (XGBoost, AdaBoost, LightGBM, CatBoost, and Histogram-Based Boosting) to refine model performance. The results show notable variation in precision, recall, F1-score, accuracy, and AUC across algorithms and RFE applications. Random Forest with XGBoost performs best, with precision 0.940, recall 0.890, F1-score 0.920, accuracy 0.950, and AUC 0.960, whereas Logistic Regression with AdaBoost performs worse. Omitting RFE also affects model effectiveness. In conclusion, the study uses RFE and boosting algorithms to improve fetal health classification models, offering insights for improved prenatal diagnosis.