Essay auto-scoring using N-Gram and Jaro Winkler based Indonesian Typos

  • Herlina Jayadianti Universitas Pembangunan Nasional Veteran Yogyakarta, Yogyakarta, Indonesia
  • Budi Santosa Universitas Pembangunan Nasional Veteran Yogyakarta, Yogyakarta, Indonesia
  • Judanti Cahyaning Universitas Pembangunan Nasional Veteran Yogyakarta, Yogyakarta, Indonesia
  • Shoffan Saifullah Universitas Pembangunan Nasional Veteran Yogyakarta, Yogyakarta, Indonesia
  • Rafal Drezewski AGH University of Science and Technology, Cracow, Poland
Keywords: Automation, Spelling error detection and correction, N-Gram, Jaro Winkler


Writing errors in e-essay exams reduce scores, so errors in written answers need to be detected and corrected automatically. Levenshtein Distance combined with N-Gram can detect writing errors, but the process is slow because of the distance method used. This research therefore combines the Jaro Winkler and N-Gram methods to detect and correct writing errors automatically. After preprocessing, the Jaro Winkler method finds the best word recommendations with reference to the Kamus Besar Bahasa Indonesia (KBBI), while the N-Gram method refers to a corpus. Final scoring uses the Vector Space Model (VSM) method, based on the similarity of words between the answer key and the respondent’s answer. The dataset consists of 115 answers from 23 respondents, containing some writing errors. The combined Jaro Winkler and N-Gram methods perform well in detecting and correcting Indonesian words: detection accuracy averages 83.64% (minimum 57.14%, maximum 100.00%), and error correction accuracy averages 78.44% (minimum 40.00%, maximum 100.00%). However, further Natural Language Processing (NLP) work is needed to improve the word recommendations.
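
The pipeline described in the abstract can be illustrated with a minimal Python sketch: Jaro Winkler similarity ranks dictionary candidates for an unknown word, bigram counts from a corpus choose among the top candidates, and a cosine similarity between term vectors scores the corrected answer against the answer key. This is not the authors' implementation. The tiny word list and corpus stand in for KBBI and the reference corpus, all function names are hypothetical, the bigram re-ranking rule is an assumption about how the two methods might be combined, and raw term frequencies are used as VSM weights where the paper may use a different weighting.

```python
import math
from collections import Counter

def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1, matched2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that appear in a different order count as transpositions.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - transpositions) / matches) / 3

def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted by the common prefix (at most 4 characters)."""
    sim = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return sim + prefix * p * (1 - sim)

def bigram_counts(sentences):
    """Count adjacent word pairs in a corpus given as a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts

def correct_word(prev_word, word, dictionary, bigrams, top_k=3):
    """Assumed rule: take the top_k Jaro Winkler candidates, then prefer the one
    seen most often after prev_word; ties keep the Jaro Winkler order."""
    if word in dictionary:
        return word
    ranked = sorted(dictionary, key=lambda w: jaro_winkler(word, w), reverse=True)
    return max(ranked[:top_k], key=lambda w: bigrams[(prev_word, w)])

def correct_answer(text, dictionary, bigrams):
    """Correct an answer token by token, left to right."""
    corrected, prev = [], None
    for token in text.lower().split():
        prev = correct_word(prev, token, dictionary, bigrams)
        corrected.append(prev)
    return corrected

def cosine_similarity(tokens_a, tokens_b) -> float:
    """Cosine similarity between raw term-frequency vectors (assumed weighting)."""
    va, vb = Counter(tokens_a), Counter(tokens_b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values()) * sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

# Hypothetical stand-ins for KBBI and the reference corpus.
DICTIONARY = ["belajar", "bekerja", "berlari", "siswa", "rajin"]
BIGRAMS = bigram_counts(["siswa rajin belajar", "siswa belajar di kelas"])

answer_key = "siswa rajin belajar".split()
student = correct_answer("siswa rajin belajr", DICTIONARY, BIGRAMS)
print(student, cosine_similarity(answer_key, student))
```

In this sketch the misspelled token "belajr" is replaced before scoring, so the typo no longer lowers the similarity between the respondent's answer and the answer key.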




How to Cite
Jayadianti, H., Santosa, B., Cahyaning, J., Saifullah, S., & Drezewski, R. (2023). Essay auto-scoring using N-Gram and Jaro Winkler based Indonesian Typos. MATRIK : Jurnal Manajemen, Teknik Informatika Dan Rekayasa Komputer, 22(2), 325-338.