Performance Improvement of The Random Forest Method Based on Smote-Tomek Link on Lombok Tourism Analysis Sentiment

  • Khairan Marzuki Universitas Bumigora, Mataram, Indonesia
  • Lalu Ganda Rady Putra Universitas Bumigora, Mataram, Indonesia
  • Hairani Hairani Universitas Bumigora, Mataram, Indonesia
  • Lalu Zazuli Azhar Mardedi Universitas Bumigora
  • Juvinal Ximenes Guterres Universidade Oriental Timur Lorosae
Keywords: Lombok Tourism, SMOTE-Tomek Link, Random Forest, Text Mining

Abstract

Background: Tourists visiting Lombok Island can draw on many sources of tourist information and share their views and experiences, both positive and negative, through social media. Objective: This research aims to analyze the sentiment of Lombok tourism reviews using the SMOTE-Tomek Link and Random Forest methods.
Methods: The research proceeded in several stages: collecting the Lombok tourism dataset, text preprocessing, term weighting using the Term Frequency-Inverse Document Frequency (TF-IDF) method, data sampling using SMOTE-Tomek Link, text classification using Random Forest, and performance testing based on accuracy. Result: Combining SMOTE-Tomek Link with Random Forest for sentiment analysis of tourist reviews about Lombok achieved an accuracy of 94%. Conclusion: Using SMOTE-Tomek Link with Random Forest for Lombok tourism sentiment analysis produces very good accuracy.
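
The data sampling stage named in the Methods can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, which the abstract does not describe. The function names `smote` and `remove_tomek_links`, the toy 2-D vectors standing in for TF-IDF rows, and the choice to drop both members of every Tomek link are all assumptions made for illustration. In practice a library such as imbalanced-learn, which provides a `SMOTETomek` class, would normally be used instead.

```python
import math
import random


def smote(X, y, minority, k=2, seed=0):
    """Oversample `minority` by interpolation until both classes are equal in size.

    Simplified sketch: assumes two classes and at least two minority samples.
    """
    rng = random.Random(seed)
    X, y = list(X), list(y)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    needed = (len(y) - len(minority_idx)) - len(minority_idx)
    for _ in range(needed):
        i = rng.choice(minority_idx)
        # Up to k nearest minority-class neighbours of the chosen sample.
        neighbours = sorted(
            (j for j in minority_idx if j != i),
            key=lambda j: math.dist(X[i], X[j]),
        )[:k]
        j = rng.choice(neighbours)
        gap = rng.random()
        # The synthetic point lies on the segment between the two samples.
        X.append(tuple(a + gap * (b - a) for a, b in zip(X[i], X[j])))
        y.append(minority)
    return X, y


def remove_tomek_links(X, y):
    """Drop both members of every Tomek link.

    A Tomek link is a pair of samples from different classes that are
    each other's nearest neighbour.
    """
    n = len(X)
    nearest = [
        min((j for j in range(n) if j != i), key=lambda j: math.dist(X[i], X[j]))
        for i in range(n)
    ]
    drop = {i for i in range(n) if nearest[nearest[i]] == i and y[i] != y[nearest[i]]}
    keep = [i for i in range(n) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy 2-D vectors standing in for TF-IDF rows (hypothetical data).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 6.0)]
y = ["positive"] * 4 + ["negative"] * 2
X_bal, y_bal = smote(X, y, minority="negative")
X_clean, y_clean = remove_tomek_links(X_bal, y_bal)
print(len(y), len(y_bal), len(y_clean))
```

Oversampling first and cleaning second means the Tomek step can also remove synthetic points that land too close to the opposite class. Resampling is usually applied to the training split only, so that the accuracy measured on the test split is not inflated by synthetic samples.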

Published
2024-01-04
How to Cite
Marzuki, K., Rady Putra, L., Hairani, H., Mardedi, L., & Guterres, J. (2024). Performance Improvement of The Random Forest Method Based on Smote-Tomek Link on Lombok Tourism Analysis Sentiment. Jurnal Bumigora Information Technology (BITe), 5(2), 151-158. https://doi.org/10.30812/bite.v5i2.3166