Evaluation of Classification Methods for Predicting Junior High School Accreditation Ranks in Indonesia

Authors

  • Miftahul Jannah, Universitas Diponegoro, Semarang, Indonesia
  • Prajna Pramita Izati, Universitas Diponegoro, Semarang, Indonesia

DOI:

https://doi.org/10.30812/ijecsa.v5i1.6032

Keywords:

Classification, Random Forest, Boosting, Support Vector Machine, Area Under Curve (AUC)

Abstract

In Indonesia, school accreditation is the process used to assess whether educational institutions meet national education standards. The process is resource-intensive, demanding considerable time, personnel, and funding. This study explored three machine learning classification methods, Random Forest, Boosting, and Support Vector Machine (SVM), for predicting the accreditation ranks of junior high schools in Indonesia. The goal was to build an automated model that predicts accreditation status, makes the accreditation process more efficient, and supports better resource allocation. Data preparation included handling missing values, reducing the dimensionality of the data, and addressing class imbalance. The dataset covered 23,954 junior high schools from 34 provinces and comprised 37 variables: 36 predictors and one target variable (accreditation status). Random Forest outperformed Boosting and SVM, achieving the highest Area Under Curve (AUC), 0.8133, and the lowest average classification error, 19.32%. The results suggest that machine learning models, particularly Random Forest, can provide a more efficient and reliable alternative to manual accreditation evaluations. This approach could streamline educational assessments, improve resource allocation, and give policymakers insights for improving school performance, particularly in under-served regions.
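The two metrics reported in the abstract, AUC and average classification error, can be illustrated with a small worked example. The sketch below is not the authors' code. It assumes that a multi-class AUC is obtained by averaging one-vs-rest AUCs and that "average classification error" means the mean of the per-class misclassification rates; the abstract states neither definition. The function names, the class labels A/B/C, and the toy probabilities are all hypothetical.

```python
from collections import defaultdict


def binary_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(prob_rows, y_true, classes):
    """Unweighted mean of one-vs-rest AUCs (an assumed multi-class definition)."""
    aucs = []
    for k, c in enumerate(classes):
        scores = [row[k] for row in prob_rows]
        labels = [1 if y == c else 0 for y in y_true]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / len(aucs)


def mean_per_class_error(y_true, y_pred, classes):
    """Mean over classes of the fraction of that class's examples misclassified."""
    total = defaultdict(int)
    wrong = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t != p:
            wrong[t] += 1
    return sum(wrong[c] / total[c] for c in classes) / len(classes)


# Hypothetical toy data: three accreditation ranks, six schools.
classes = ["A", "B", "C"]
y_true = ["A", "A", "B", "B", "C", "C"]
prob_rows = [
    [0.7, 0.2, 0.1],
    [0.4, 0.5, 0.1],
    [0.2, 0.6, 0.2],
    [0.3, 0.4, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.3, 0.5],
]
# Predicted rank = class with the highest predicted probability.
y_pred = [classes[max(range(len(row)), key=row.__getitem__)] for row in prob_rows]

auc = macro_ovr_auc(prob_rows, y_true, classes)
err = mean_per_class_error(y_true, y_pred, classes)
print(f"macro one-vs-rest AUC: {auc:.4f}")
print(f"mean per-class error:  {err:.4f}")
```

Averaging per class rather than per school keeps a rare accreditation rank from being swamped by the common ones, which matters when the classes are imbalanced. The pairwise AUC loop is quadratic in the number of examples, so for a dataset of the size described here a rank-based implementation from an established library would be the practical choice.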

Published

2026-03-17

Issue

Section

Articles

How to Cite

M. Jannah and P. P. Izati, “Evaluation of Classification Methods for Predicting Junior High School Accreditation Ranks in Indonesia,” IJECSA, vol. 5, no. 1, pp. 43–54, Mar. 2026, doi: 10.30812/ijecsa.v5i1.6032.