Enhancing Predictive Models: An In-depth Analysis of Feature Selection Techniques Coupled with Boosting Algorithms

Neny Sulistianingsih; Galih Hendro Martono

doi:10.30812/matrik.v23i2.3788

Authors

Neny Sulistianingsih Universitas Bumigora, Mataram, Indonesia
Galih Hendro Martono Universitas Bumigora, Mataram, Indonesia https://orcid.org/0000-0002-0697-010X

DOI:

https://doi.org/10.30812/matrik.v23i2.3788

Keywords:

Boosting Algorithm, Feature Selection, Fetal Health Dataset, Fetal Health, Recursive Feature Elimination

Abstract

This research addresses the need to improve predictive models for fetal health classification using Cardiotocography (CTG) data. The literature review highlights challenges in imbalanced labels, feature selection, and efficient data handling, and this paper targets those challenges. The study uses Recursive Feature Elimination (RFE) and boosting algorithms (XGBoost, AdaBoost, LightGBM, CatBoost, and Histogram-Based Boosting) to refine model performance. The results show notable variation in precision, recall, F1-score, accuracy, and AUC across algorithms and RFE configurations. Random Forest with XGBoost performs best, with precision 0.940, recall 0.890, F1-score 0.920, accuracy 0.950, and AUC 0.960. Conversely, Logistic Regression with AdaBoost demonstrates lower performance. The absence of RFE also affects model effectiveness. In conclusion, the study employs RFE and boosting algorithms to enhance fetal health classification models, contributing insights for improved prenatal diagnosis.

Author Biographies

Neny Sulistianingsih, Universitas Bumigora, Mataram, Indonesia

Lecturer in the Department of Computer Science, Faculty of Engineering, Universitas Bumigora, Mataram, Indonesia, 83127

Galih Hendro Martono, Universitas Bumigora, Mataram, Indonesia

Lecturer in the Department of Computer Science, Faculty of Engineering, Universitas Bumigora, Mataram, Indonesia, 83127
