Recognize The Polarity of Hotel Reviews using Support Vector Machine

  • Ni Wayan Sumartini Saraswati STMIK STIKOM Indonesia
  • I Gusti Ayu Agung Diatri Indradewi Universitas Pendidikan Ganesha, Bali, Indonesia
Keywords: Hotel Reviews, K-Fold Cross Validation, Support Vector Machines, Text Classification, TripAdvisor Review


A brand depends heavily on how consumers perceive its products and services. When assessing those perceptions, companies often face data analysis problems. Review data is one of the most useful sources for building a picture of consumer perception, so a company that can process review data gains a picture of the strength of its brand. Popular machine learning algorithms for building text classification models include the naive Bayes family of algorithms, support vector machines (SVM), and deep learning algorithms. SVM has proven to be a reliable method for pattern recognition. This study aims to produce a model that automatically classifies the polarity of hotel reviews. The experimental data consists of 38,000 TripAdvisor reviews of hotels in Europe. We also measure the quality of the classification model. The SVM model built from the hotel review data performs well, with an average accuracy of 92.48%. Because the recall and precision values are balanced, accuracy is considered sufficient to describe the quality of the classification.
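The evaluation described in the abstract (an SVM polarity classifier scored by k-fold cross-validation on accuracy, precision, and recall) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' pipeline: the abstract does not state the preprocessing, kernel, features, or number of folds. The sketch uses whitespace bag-of-words features, a linear SVM trained with a Pegasos-style subgradient update, labels encoded as +1 (positive) and -1 (negative), and contiguous folds. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def featurize(text: str) -> Dict[str, float]:
    """Bag-of-words counts over lowercase whitespace tokens (assumed preprocessing)."""
    counts: Dict[str, float] = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0.0) + 1.0
    return counts


def train_linear_svm(X: List[Dict[str, float]], y: List[int],
                     epochs: int = 20, lam: float = 0.01) -> Dict[str, float]:
    """Linear SVM (hinge loss + L2 penalty) via Pegasos-style subgradient steps."""
    w: Dict[str, float] = {}
    t = 0
    for _ in range(epochs):
        for x, label in zip(X, y):
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = label * sum(w.get(f, 0.0) * v for f, v in x.items())
            for f in w:  # shrink weights: the regularization part of the step
                w[f] *= 1.0 - eta * lam
            if margin < 1:  # example violates the margin: hinge-loss part
                for f, v in x.items():
                    w[f] = w.get(f, 0.0) + eta * label * v
    return w


def predict(w: Dict[str, float], x: Dict[str, float]) -> int:
    """Return +1 (positive review) or -1 (negative review)."""
    score = sum(w.get(f, 0.0) * v for f, v in x.items())
    return 1 if score >= 0 else -1


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n) into k contiguous folds; return (train, test) index lists."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Accuracy, precision, and recall with +1 treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def cross_validate(texts: List[str], labels: List[int], k: int = 5) -> Dict[str, float]:
    """Train on k-1 folds, test on the held-out fold, and average the metrics."""
    X = [featurize(t) for t in texts]
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        w = train_linear_svm([X[i] for i in train_idx], [labels[i] for i in train_idx])
        preds = [predict(w, X[i]) for i in test_idx]
        scores.append(binary_metrics([labels[i] for i in test_idx], preds))
    return {m: sum(s[m] for s in scores) / k for m in ("accuracy", "precision", "recall")}


if __name__ == "__main__":
    # Toy reviews invented for illustration; not taken from the TripAdvisor data.
    reviews = ["great clean room", "dirty rude staff"] * 5
    polarity = [1, -1] * 5
    print(cross_validate(reviews, polarity, k=5))
```

Reporting precision and recall next to accuracy matters because accuracy alone can look high on imbalanced data. When the two are close, as the abstract reports, the averaged accuracy is a fair summary of classifier quality.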




How to Cite
Saraswati, N. W., & Diatri Indradewi, I. G. A. (2022). Recognize The Polarity of Hotel Reviews using Support Vector Machine. MATRIK : Jurnal Manajemen, Teknik Informatika Dan Rekayasa Komputer, 22(1), 25-36.