Ekstraksi Informasi Destinasi Wisata Populer Jawa Timur Menggunakan Depth-First Crawling (Information Extraction of Popular East Java Tourist Destinations Using Depth-First Crawling)

  • Sepyan Purnama Kristanto, Politeknik Negeri Banyuwangi
  • Lutfi Hakim, Politeknik Negeri Banyuwangi
Keywords: Web Mining, Web Crawling, Tripadvisor, Jawa Timur


Tourist destinations are an inseparable part of modern life. As one of the larger provinces, East Java is among the regions most visited for tourism. Many people search the internet for information about these destinations, and the TripAdvisor application is one of the sources they use. Among the many attractions, several offer a distinct appeal and a different experience on each visit. Tourists widely use TripAdvisor to decide where to go on vacation. With features ranging from reviews and recommendations to photo sharing, it is one of the best applications for cataloguing tourist attractions. Given the large number of destinations, it is necessary to analyze and evaluate both the attractions that draw many visitors and those rarely visited by local and foreign tourists. To this end, information mining (web mining) was carried out on TripAdvisor to obtain information on popular destinations in East Java Province. Crawling the TripAdvisor website yielded several kinds of information: names of tourist attractions, locations, visitor reviews, photos, and ratings. These results can then support spatial analysis and sentiment analysis of tourist reviews, and can be developed further into a recommendation system for the best tourist attractions in East Java Province.
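As an illustration of the depth-first crawling named in the title, the sketch below follows each link as deep as it goes before backtracking to the next sibling. It is a minimal example under stated assumptions, not the authors' implementation: the function `depth_first_crawl`, the `get_links` callback, the `max_depth` limit, and the page paths in `TOY_SITE` are all hypothetical, and an in-memory dictionary stands in for real HTTP requests to TripAdvisor.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple


def depth_first_crawl(
    start: str,
    get_links: Callable[[str], Iterable[str]],
    max_depth: int = 3,
) -> List[str]:
    """Return pages in the order a depth-first crawl visits them."""
    visited: Set[str] = set()
    order: List[str] = []
    # An explicit stack gives depth-first order without recursion.
    stack: List[Tuple[str, int]] = [(start, 0)]
    while stack:
        url, depth = stack.pop()
        if url in visited or depth > max_depth:
            continue
        visited.add(url)
        order.append(url)
        # Push links in reverse so the first link on a page is explored first.
        for link in reversed(list(get_links(url))):
            if link not in visited:
                stack.append((link, depth + 1))
    return order


# Hypothetical toy site: each key is a page, each value the links found on it.
TOY_SITE: Dict[str, List[str]] = {
    "/east-java": ["/east-java/bromo", "/east-java/ijen"],
    "/east-java/bromo": ["/east-java/bromo/reviews"],
    "/east-java/bromo/reviews": [],
    "/east-java/ijen": ["/east-java/ijen/reviews"],
    "/east-java/ijen/reviews": [],
}

crawl_order = depth_first_crawl("/east-java", lambda url: TOY_SITE.get(url, []))
for page in crawl_order:
    print(page)
```

In this toy run the crawler reaches the review page of the first attraction before it visits the second attraction at all, which is the defining property of depth-first traversal. A real crawler would replace the `get_links` callback with a function that downloads and parses each page, and would need to respect the target site's terms of use.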




How to Cite
Kristanto, S., & Hakim, L. (2021). Ekstraksi Informasi Destinasi Wisata Populer Jawa Timur Menggunakan Depth-First Crawling. MATRIK : Jurnal Manajemen, Teknik Informatika Dan Rekayasa Komputer, 21(1), 229-236. https://doi.org/10.30812/matrik.v21i1.1081