Novel Application of K-Means Algorithm for Unique Sentiment Clustering in 2024 Korean Movie Reviews on TikTok Platform
DOI:
https://doi.org/10.30812/matrik.v24i2.4794
Keywords:
Clustering, IndoBERT, K-Means Algorithm, Korean, Sentiment Analysis, TikTok
Abstract
In recent years, social media has become one of the main factors influencing public perception of films. As a rapidly growing video-sharing platform, TikTok plays a crucial role in shaping audience opinions through comments, short reviews, and user discussions. This phenomenon is increasingly relevant to the Korean film industry, which attracts global attention with its diverse genres and engaging narratives. However, understanding of how audiences respond to films by genre remains limited, especially in the dynamic context of social media. This study therefore analyzes audience sentiment on TikTok toward Korean films released in 2024, focusing on sentiment distribution across four main genres: comedy, romance, action, and fun stories. The methodology comprises data collection through web crawling on TikTok, followed by text preprocessing and feature extraction using IndoBERT. Sentiment classification uses SentimentIntensityAnalyzer to categorize comments as positive, negative, or neutral. Because the dataset consists of unlabeled text, K-Means clustering is employed to identify sentiment groupings, with principal component analysis used to validate cluster quality. The findings indicate that the romance and comedy genres are predominantly associated with neutral sentiment, at 89.6% and 87.4%, respectively. In contrast, the action genre exhibits higher sentiment polarization, with 14.9% positive and 24.7% negative sentiment. The fun story genre shows a more evenly distributed sentiment pattern. The main challenges are determining the optimal number of clusters and addressing imbalanced sentiment distribution across genres. This study provides insights that help filmmakers and marketers better understand audience reactions on social media, enabling more targeted promotional strategies. Additionally, it contributes to the literature on sentiment analysis in the film industry, emphasizing the importance of genre-specific audience reception patterns for future research.
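As a rough illustration of the clustering step described in the abstract, the sketch below runs a plain K-Means loop (Lloyd's algorithm) over a handful of toy 2-D vectors that stand in for IndoBERT comment embeddings. It is not the authors' implementation: the function names `kmeans`, `inertia`, and `farthest_point_init`, the toy data, and the farthest-point initialisation (used here only to keep the example deterministic) are all assumptions made for illustration. The `inertia` helper computes the within-cluster sum of squares, the quantity typically compared across values of k when choosing the number of clusters.

```python
import math


def farthest_point_init(points, k):
    """Pick k starting centroids: the first point, then repeatedly the point
    farthest from its nearest already-chosen centroid (an assumed,
    deterministic choice for this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, iterations=50):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


def inertia(points, labels, centroids):
    """Within-cluster sum of squared distances to the assigned centroid."""
    return sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))


# Toy stand-ins for comment embeddings: three visually separate groups.
toy_embeddings = [
    (0.9, 1.0), (1.1, 0.8), (1.0, 1.2),
    (5.0, 5.1), (4.8, 5.3), (5.2, 4.9),
    (9.0, 1.0), (9.2, 0.8), (8.9, 1.1),
]

labels, centroids = kmeans(toy_embeddings, k=3)
print(labels)

# Compare inertia across candidate values of k.
for k in (1, 2, 3, 4):
    k_labels, k_centroids = kmeans(toy_embeddings, k)
    print(k, round(inertia(toy_embeddings, k_labels, k_centroids), 3))
```

In a real pipeline the input would be high-dimensional embeddings rather than 2-D tuples, and a library implementation would normally replace this hand-written loop. The sketch only shows the alternating assign and update steps and how inertia falls as k grows, which is why picking k remains a judgment call.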