Gender Classification of Twitter Users Using Convolutional Neural Network

  • Fitra Ahya Mubarok Universitas Lambung Mangkurat, Banjarmasin, Indonesia
  • Mohammad Reza Faisal Universitas Lambung Mangkurat, Banjarmasin, Indonesia http://orcid.org/0000-0001-5748-7639
  • Dwi Kartini Universitas Lambung Mangkurat, Banjarmasin, Indonesia
  • Dodon Turianto Nugrahadi Universitas Lambung Mangkurat, Banjarmasin, Indonesia
  • Triando Hamonangan Saragih Universitas Lambung Mangkurat, Banjarmasin, Indonesia
Keywords: Gender classification, Social media analysis, Twitter, Word2vec

Abstract

Social media has become a source of data from which analysts gain deeper insight into user behavior, trends, public opinion, and patterns of social media usage. Twitter is one of the most popular social media platforms, where users share short text messages called "tweets". Twitter does not display user information such as gender, but users often reveal it, knowingly or not, in unstructured form. Because gender is an important attribute in social media analytics, this research sought the most accurate approach to gender classification. The purpose of the study was to determine whether combining data from Twitter, namely tweets and descriptions, improves the accuracy of gender classification. The method used word vector representations generated with word2vec and a 2D Convolutional Neural Network (CNN) model. Word2vec generated word vectors that take into account the context and meaning of words in the text. The 2D CNN model extracted features from these word vectors and performed the gender classification. The research compared tweet data, description data, and a combination of tweets and descriptions to find which was the most accurate. The result of this study was that combined data between tweets and
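The pipeline outlined in the abstract (word vectors stacked into a matrix, followed by a 2D convolution) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vectors stand in for a trained word2vec model, the names `TOY_VECTORS`, `text_to_matrix`, `conv2d_valid`, and `combined_features` are hypothetical, and joining the description and tweet text by simple concatenation is only one plausible way to build the combined input. It is not the authors' architecture or code.

```python
# Toy illustration only: the vectors, kernel weights, and helper names are
# hypothetical and are not taken from the paper.
from typing import Dict, List

EMBED_DIM = 4  # real word2vec models usually have far more dimensions
MAX_LEN = 6    # fixed number of token rows per input text

# Stand-in for a trained word2vec lookup table (made-up values).
TOY_VECTORS: Dict[str, List[float]] = {
    "love": [0.9, 0.1, 0.0, 0.3],
    "football": [0.1, 0.8, 0.2, 0.0],
    "mom": [0.7, 0.0, 0.6, 0.1],
    "coffee": [0.2, 0.3, 0.1, 0.9],
}


def text_to_matrix(text: str) -> List[List[float]]:
    """Map each token to its vector, then truncate or zero-pad to MAX_LEN rows."""
    zero = [0.0] * EMBED_DIM
    rows = [TOY_VECTORS.get(tok, zero) for tok in text.lower().split()]
    rows = rows[:MAX_LEN]
    rows += [zero] * (MAX_LEN - len(rows))
    return rows


def conv2d_valid(matrix: List[List[float]], kernel: List[List[float]]) -> List[List[float]]:
    """2D cross-correlation with stride 1 and no padding, followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(matrix) - kh + 1):
        row = []
        for j in range(len(matrix[0]) - kw + 1):
            s = sum(matrix[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))  # ReLU activation
        out.append(row)
    return out


def global_max_pool(feature_map: List[List[float]]) -> float:
    """Reduce a feature map to its single strongest activation."""
    return max(max(row) for row in feature_map)


def combined_features(tweet: str, description: str, kernel: List[List[float]]) -> float:
    """Assumed combination strategy: concatenate description and tweet text."""
    matrix = text_to_matrix(description + " " + tweet)
    return global_max_pool(conv2d_valid(matrix, kernel))


if __name__ == "__main__":
    kernel = [[1.0, 1.0], [1.0, 1.0]]  # one 2x2 filter with made-up weights
    print(combined_features("football", "mom", kernel))
```

In a real classifier the kernel weights would be learned, many filters would be used, and the pooled features would feed a dense layer that outputs the gender label; the sketch only shows how text becomes a 2D input that a convolution can slide over.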

Published
2023-11-09
How to Cite
Mubarok, F. A., Faisal, M. R., Kartini, D., Nugrahadi, D. T., & Saragih, T. H. (2023). Gender Classification of Twitter Users Using Convolutional Neural Network. MATRIK : Jurnal Manajemen, Teknik Informatika Dan Rekayasa Komputer, 23(1), 79-92. https://doi.org/10.30812/matrik.v23i1.3318
Section
Articles