Enhancement of Supervised Learning Models for Intrusion Detection Through Mutual Information and Hyperparameter Tuning
DOI: https://doi.org/10.30812/matrik.v25i2.5760
Keywords: Hyperparameter Tuning, Intrusion Detection, Mutual Information, Supervised Learning
Abstract
Improving the performance of supervised learning algorithms through feature selection and hyperparameter tuning remains challenging for practitioners, particularly in computer network intrusion detection. Whether a supervised learning algorithm performs optimally depends on the number of features used and the choice of hyperparameters. The purpose of this research is to enhance the network intrusion detection performance of three supervised learning algorithms, namely Support Vector Machine (SVM), eXtreme Gradient Boosting, and Random Forest, by using the Mutual Information feature selection approach and hyperparameter tuning. Mutual Information measures the dependency between each feature and the target; features with high values are the most informative. Hyperparameters are not learned from the data; they are set before training begins. Here, hyperparameters were selected according to the requirements of each of the three algorithms through iterative training and testing on the NSL-KDD dataset. The dataset was split at ratios of 80:20, 70:30, and 60:40. The fifteen features with the highest mutual information were identified, and the models were trained on them using the selected hyperparameters. With the 80:20 split, the accuracy of Support Vector Machine reached its maximum, increasing from 90% to 98%. eXtreme Gradient Boosting and Random Forest reached their maximum accuracy of 100%, up from 97% and 98%, respectively. The study's findings advance our understanding of how algorithm performance depends on feature and hyperparameter selection.
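The feature-ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which estimator or library was used, so this example assumes purely discrete features and computes a simple plug-in estimate of mutual information from counts. The function names (`mutual_information`, `top_k_features`) and the toy columns are hypothetical and are not taken from NSL-KDD; continuous features would first need binning or a different estimator.

```python
import math
from collections import Counter


def mutual_information(feature, target):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(feature)
    joint = Counter(zip(feature, target))  # counts of (x, y) pairs
    px = Counter(feature)                  # marginal counts of x
    py = Counter(target)                   # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def top_k_features(columns, target, k):
    """Rank features by mutual information with the target and keep the top k."""
    scores = {name: mutual_information(values, target)
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data for illustration only (assumed values, not from the dataset).
target = ["attack", "attack", "normal", "normal"]
columns = {
    "flag": ["S0", "S0", "SF", "SF"],            # determines the label exactly
    "proto": ["tcp", "udp", "tcp", "udp"],       # independent of the label
    "service": ["http", "http", "http", "ftp"],  # partially informative
}

selected, scores = top_k_features(columns, target, k=2)
print(selected)
```

In the study's setting, `k` would be 15, and the selected columns would then be passed to each classifier while its hyperparameters are tuned.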
License
Copyright (c) 2026 Deny Jollyta, Yoakhina Nicole Makaruku, Alyauma Hajjah, Yulvia Nora Marlim

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.











