Determining Bullying Text Classification Using Naive Bayes Classification on Social Media

  • Ade Clinton Sitepu, Universitas Potensi Utama
  • Wanayumini Wanayumini, Universitas Potensi Utama
  • Zakarias Situmorang, Universitas Katolik Santo Thomas Medan
Keywords: Text Mining, Sentiment Analysis, Naïve Bayes Algorithm, Confusion Matrix, Python Programming

Abstract

Cyber-bullying consists of repeated acts intended to scare, anger, or embarrass the people they target. It has grown alongside the rapid development of technology and social media in society. Media platforms and users need to filter out bullying comments because such comments can harm the mental health of those who read them, especially the person they are aimed at. Text mining makes it possible for a system to classify the information circulating in a community. One classification technique that can be applied to text is Naïve Bayes, which performs well on classification tasks. In this research, the precision of the algorithm was evaluated on a dataset of 1000 comments. The comments were first labeled manually as "bully" or "not bully", and the data were then divided into training data and test data. To test the system's ability, the classification results were analyzed using a confusion matrix. The results show that the Naïve Bayes algorithm reached a precision of 87% and an area under the curve (AUC) of 88%. The algorithm was also fast, with a completion time of 0.033 seconds.
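
The workflow described in the abstract (manually labeled comments, a training/test split, Naïve Bayes classification, and evaluation with a confusion matrix) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the multinomial model with Laplace smoothing, the whitespace tokenizer, the function names, and the toy comments are all assumptions made for the example, and AUC is not computed here.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed preprocessing: lowercase and split on whitespace.
    return text.lower().split()


def train(samples):
    """Count documents per label and words per label from (text, label) pairs."""
    doc_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        doc_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return doc_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior under Naive Bayes."""
    doc_counts, word_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                # Laplace (add-one) smoothed log likelihood
                count = word_counts[label][word] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def confusion_matrix(y_true, y_pred, positive="bully"):
    """Return (tp, fp, fn, tn) with `positive` treated as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision(tp, fp):
    return tp / (tp + fp) if (tp + fp) else 0.0


# Toy data invented for this sketch; the study used 1000 labeled comments.
train_data = [
    ("you are stupid and ugly", "bully"),
    ("nobody likes you loser", "bully"),
    ("shut up you idiot", "bully"),
    ("have a great day friend", "not bully"),
    ("thanks for the nice photo", "not bully"),
    ("congratulations on your great work", "not bully"),
]
test_data = [
    ("you stupid loser", "bully"),
    ("great photo friend", "not bully"),
]

model = train(train_data)
predictions = [predict(model, text) for text, _ in test_data]
tp, fp, fn, tn = confusion_matrix([label for _, label in test_data], predictions)
print("confusion matrix (tp, fp, fn, tn):", (tp, fp, fn, tn))
print("precision:", precision(tp, fp))
```

Precision here is TP / (TP + FP): the share of comments flagged as "bully" that were actually labeled "bully". With a real dataset, the same counts from the confusion matrix would also give recall and accuracy.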

Published
2021-04-30
How to Cite
A. Sitepu, W. Wanayumini, and Z. Situmorang, “Determining Bullying Text Classification Using Naive Bayes Classification on Social Media”, Jurnal Varian, vol. 4, no. 2, pp. 133–140, Apr. 2021.