Enhancing Semantic Similarity in Concept Maps Using Large Language Models
DOI:
https://doi.org/10.30812/matrik.v24i3.4727
Keywords:
Concept Map, Large Language Model, Semantic Similarity, Transformer
Abstract
This research uses two advanced models, Generative Pre-trained Transformer-4 and Bidirectional Encoder Representations from Transformers, to generate embeddings for analyzing semantic relationships in open-ended concept maps. The problem addressed is the difficulty of accurately capturing complex relationships between concepts in concept maps, which are commonly used in educational settings, especially in relational database learning. These maps, created by students, involve numerous interconnected concepts, making them difficult for traditional models to analyze effectively. In this study, we compare the two Artificial Intelligence models on their ability to generate semantic embeddings for a dataset consisting of 1,206 student-generated concepts and 616 link nodes (Mean Concept = 4, Standard Deviation = 4.73). These student-generated maps are compared with a reference map created by a teacher, containing 50 concepts and 25 link nodes. The goal is to assess the models' performance in capturing the relationships between concepts in an open-ended learning environment. The results demonstrate that the Generative Pre-trained Transformer outperforms the other models in generating accurate semantic embeddings. Specifically, the Generative Pre-trained Transformer achieves 92% accuracy, 96% precision, 96% recall, and a 96% F1-score. This highlights the Generative Pre-trained Transformer's ability to handle the complexity of large, student-generated concept maps while avoiding overfitting, an issue observed with the Bidirectional Encoder Representations from Transformers models. The key contribution of this research is showing that the two models can capture complex, multi-faceted relationships among concepts with high precision. This makes the approach particularly valuable in educational environments, where precise semantic analysis of open-ended data is crucial, and it offers potential for enhancing concept map-based learning with scalable and accurate solutions.
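As a rough illustration of the comparison described in the abstract, the sketch below matches one student concept embedding against teacher concept embeddings and checks the reported F1-score against the reported precision and recall. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not name the similarity measure, so cosine similarity is assumed here, and the concept labels, the 3-dimensional vectors, and the helper names `cosine_similarity`, `best_match`, and `f1_score` are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def best_match(student_vec, teacher_vecs):
    """Return (label, score) of the teacher concept closest to student_vec."""
    scored = (
        (label, cosine_similarity(student_vec, vec))
        for label, vec in teacher_vecs.items()
    )
    return max(scored, key=lambda pair: pair[1])


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up toy embeddings; real model embeddings have far more dimensions.
teacher = {
    "primary key": [1.0, 0.0, 0.0],
    "foreign key": [0.0, 1.0, 0.0],
}
student = [0.9, 0.1, 0.0]

label, score = best_match(student, teacher)
print(label, round(score, 3))

# Precision and recall values taken from the abstract.
print(round(f1_score(0.96, 0.96), 2))
```

With precision and recall both at 96%, the harmonic mean is also 96%, which is consistent with the F1-score reported in the abstract.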
Copyright (c) 2025 Muhammad Zaki Wiryawan, Didik Dwi Prasetya, Anik Nur Handayani, Tsukasa Hirashima, Wahyu Styo Pratama, Lalu Ganda Rady Putra

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.