Optimizing Random Forest for IoT Cyberattack Detection Using SMOTE: A Study on CIC-IoT2023 Dataset

Authors

  • Guntoro Guntoro, Universitas Lancang Kuning, Pekanbaru, Indonesia
  • Lisnawita Lisnawita, Universitas Lancang Kuning, Pekanbaru, Indonesia
  • Loneli Costaner, Universitas Lancang Kuning, Pekanbaru, Indonesia

DOI:

https://doi.org/10.30812/matrik.v25i1.5382

Keywords:

CIC-IoT2023, Internet of Things, Intrusion Detection System, SMOTE, Random Forest

Abstract

The growing number of Internet of Things devices has led to an increased risk of complex and diverse cyberattacks. A significant challenge in this domain is the imbalanced class distribution in most Internet of Things datasets, which causes classification algorithms to be biased towards the majority class and hinders effective threat detection. This study addresses this issue by using the Random Forest algorithm optimized with the Synthetic Minority Oversampling Technique (SMOTE). The research aims to develop an effective model for detecting cyberattacks in Internet of Things environments by resolving class imbalance within the CIC-IoT2023 dataset. The methodology involves several stages, including data preprocessing and applying SMOTE for data balancing. The balanced dataset was then used to train a Random Forest model, with its performance evaluated using accuracy, precision, recall, F1-score, and Cohen's Kappa metrics. The results demonstrate the model's effectiveness, achieving an accuracy of 99.01%, an F1-score of 98.96%, and a Cohen's Kappa of 98.92%. This marks a notable improvement in performance, particularly in detecting minority classes, compared to the model trained without SMOTE, which struggled to identify several less common attack types. The outcomes suggest that combining Random Forest with SMOTE can significantly enhance the development of intrusion detection systems by improving detection accuracy for all 33 attack types and reducing the risks associated with undetected threats. In conclusion, this study advances Internet of Things cybersecurity by presenting an effective and efficient method for addressing data imbalance in attack detection. Future research should focus on evaluating the model's robustness using more complex datasets and enhancing its performance for real-time deployment on resource-constrained Internet of Things devices.
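
The abstract names two techniques that are easier to follow with a small example: SMOTE, which balances classes by interpolating between existing minority samples, and Cohen's Kappa, which measures agreement corrected for chance. The sketch below is a minimal, standard-library-only illustration of both ideas and is not the authors' implementation, which this page does not show. The function names `smote` and `cohen_kappa`, the neighbour count `k=5`, and the toy data are all assumptions made for illustration; the Random Forest training step is omitted.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Illustrative sketch only: `minority` is a list of equal-length numeric tuples.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def cohen_kappa(y_true, y_pred):
    """Observed agreement corrected for the agreement expected by chance.

    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (observed - expected) / (1 - expected)


# Toy usage with made-up data: oversample a three-point minority class.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
new_points = smote(minority, n_new=4, k=2)
kappa = cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0])
```

Because each synthetic point lies on a line segment between two real minority samples, the new points stay inside the region the minority class already occupies. In a workflow like the one the abstract describes, oversampling would normally be applied to the training split only, so that the evaluation metrics are computed on unmodified data; whether the study did this is not stated on this page.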

Published

2025-11-21

Section

Articles

How to Cite

[1]
G. Guntoro, L. Lisnawita, and L. Costaner, “Optimizing Random Forest for IoT Cyberattack Detection Using SMOTE: A Study on CIC-IoT2023 Dataset”, MATRIK, vol. 25, no. 1, pp. 83–96, Nov. 2025, doi: 10.30812/matrik.v25i1.5382.
