Comparative Analysis of TF-IDF and Modern Text Embedding for the Classification of Islamic Ideologies on Indonesian Twitter
DOI:
https://doi.org/10.30812/matrik.v25i1.5600
Keywords:
Islamic Ideologies, Machine Learning, Social Media, Support Vector Machine, Text Classification
Abstract
Ideological polarization on social media platforms such as Twitter, particularly in discussions of Islamic ideologies in Indonesia, has led to the rapid spread of da’wah. It has also made it difficult to classify tweets into distinct Islamic ideologies, such as Liberal Islam and Moderate Islam (Wasathiyyah), and effective methods for classifying such nuanced content are lacking. To address this problem, this research developed and evaluated a machine learning model that compares a traditional word vectorization method (TF-IDF) with a modern text embedding model (Nomic Embed v2). The study followed the Knowledge Discovery in Databases (KDD) framework: relevant tweets were collected through the Twitter API and annotated by ideology, then preprocessed with case folding, stopword removal, and symbol removal. Classification was carried out with a Support Vector Machine (SVM), and cross-validation was used to assess the model’s accuracy. The findings indicate that the embedding model improved accuracy by capturing nuanced semantic context in the tweets, suggesting that modern semantic models can outperform traditional methods in classifying complex, context-dependent texts.
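The preprocessing steps and the TF-IDF baseline named in the abstract can be illustrated with a minimal sketch using only the Python standard library. It is not the authors' implementation: the stopword list, the sample tweets, and the function names `preprocess` and `tf_idf` are hypothetical, and the TF-IDF variant shown (relative term frequency times `log(N / df)`) is one common formulation assumed here for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny Indonesian stopword list for illustration only.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "itu", "ini"}


def preprocess(tweet: str) -> list[str]:
    """Apply case folding, symbol removal, then stopword removal."""
    text = tweet.lower()                    # case folding
    text = re.sub(r"[^a-z\s]", " ", text)   # symbol removal: keep letters and spaces
    return [tok for tok in text.split() if tok not in STOPWORDS]


def tf_idf(docs: list[list[str]]) -> list[dict[str, float]]:
    """Assumed variant: tf = count / document length, idf = log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample tweets, not taken from the study's dataset.
tweets = [
    "Islam Wasathiyyah itu moderat!",
    "Islam liberal dan tafsir kontekstual",
    "Dakwah moderat di media sosial #dakwah",
]
tokens = [preprocess(t) for t in tweets]
vectors = tf_idf(tokens)
```

In this toy corpus a term that appears in a single tweet, such as "wasathiyyah", receives a higher weight than "islam", which appears in two of the three tweets. Sparse vectors of this kind would then be passed to a classifier such as an SVM, whereas an embedding model replaces them with dense vectors that encode context.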
Copyright (c) 2025 Siti Ummi Masruroh, Cong Dai Nguyen, Doni Febrianus

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.











