Detecting Disaster Trending Topics on Indonesian Tweets Using BNgram
Abstract
People on social media share information about natural disasters happening around them, such as details of the situation and where the disasters are occurring. This information is valuable for understanding real-time events, but it can be challenging to use because social media posts are often written informally and contain slang. This research aimed to detect trending topics as a way to monitor and summarize disaster-related data from social media, especially Twitter, into valuable information. The method used was BNgram, selected because previous research showed that it recalls topics well. The detection stages were data preprocessing, named entity recognition, DF-IDF calculation, and hierarchical clustering. The resulting trending topics were compared with the topics obtained using the Document pivot method as the baseline. The comparison showed that BNgram performed better than Document pivot in detecting trending topics about natural disasters. Overall, BNgram had a higher topic recall score, and its keyword precision and keyword recall values were slightly better. In conclusion, recognizing the significance of social media as a source of disaster-related information can improve disaster response strategies and situational awareness. Based on the comparison, BNgram was shown to be the more effective method for extracting important information from social media during natural disasters.
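The DF-IDF stage named in the abstract can be illustrated with a minimal sketch. It assumes the scoring commonly attributed to the original BNgram work by Aiello et al. (2013): a term's document frequency in the current time window is divided by the log of its mean document frequency over earlier windows, and named entities receive a boost. The function name `df_idf`, its parameters, the boost value of 1.5, and the sample tweets are illustrative assumptions, not details taken from this paper.

```python
import math
from collections import Counter


def df_idf(current_docs, previous_windows, boosted_terms=frozenset(), boost=1.5):
    """Score each term of the current time window (hypothetical helper).

    current_docs: list of token lists, one per tweet, for the current window.
    previous_windows: list of earlier windows, each a list of token lists.
    boosted_terms: terms (e.g. named entities) whose score is multiplied by `boost`.
    """
    # Document frequency: number of tweets in a window that contain the term.
    df_now = Counter(term for doc in current_docs for term in set(doc))
    df_past = [
        Counter(term for doc in window for term in set(doc))
        for window in previous_windows
    ]
    t = max(len(df_past), 1)  # avoid division by zero when there is no history

    scores = {}
    for term, df in df_now.items():
        # Mean document frequency of the term over the earlier windows.
        mean_past = sum(counts[term] for counts in df_past) / t
        score = (df + 1) / (math.log(mean_past + 1) + 1)
        if term in boosted_terms:
            score *= boost  # assumed boost for named entities
        scores[term] = score
    return scores


# Toy data: tokens could equally be n-grams produced by preprocessing.
current = [["gempa", "lombok"], ["gempa", "banjir"]]
previous = [[["banjir", "jakarta"]], [["banjir"]]]
scores = df_idf(current, previous, boosted_terms={"lombok"})
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(term, round(score, 3))
```

In this toy example, "gempa" is new in the current window and so scores higher than "banjir", which already appeared in both earlier windows. In a BNgram-style pipeline, the top-scoring terms would then be passed to hierarchical clustering to form topics.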
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.