Hate Speech Detection for Banjarese Languages on Instagram Using Machine Learning Methods

Hate speech refers to verbal expression or communication that aims to provoke or discriminate against individuals. The Ministry of Communication and Information of Indonesia handled 3,640 cases of hate speech transmitted through digital channels between 2018 and 2021. In South Kalimantan in particular, hate speech in the local language, Banjarese, has become increasingly prevalent in recent years. Despite this, there is a lack of research on using machine learning to detect hate speech in the Banjarese language, specifically on Instagram. This study aimed to address this gap by constructing a dataset of Banjarese hate speech and comparing several combinations of feature extraction techniques and machine learning models for detecting it. The feature extraction methods used were Word N-Gram, Term Frequency-Inverse Document Frequency (TF-IDF), a combination of Word N-Gram and TF-IDF, Word2Vec, and GloVe, while the machine learning methods used were Support Vector Machine (SVM), Naïve Bayes, and Decision Tree. The results revealed that the combination of TF-IDF for feature extraction and SVM as the model achieved the best performance. Its average Recall, Precision, Accuracy, and F1-Score each exceeded 90%, demonstrating the model's ability to identify Banjarese hate speech accurately.


INTRODUCTION
Hate speech is an expression, writing, action, or performance intended to provoke violence or discrimination against someone based on characteristics of the group they represent, such as race, ethnicity, gender, sexual orientation, religion, and other characteristics [1]. Hate speech is an important topic in social media analysis, mainly because users are free to share content and opinions on social media platforms [2]. This freedom of opinion has also led to an increase in hate speech on social media. Hate speech containing harsh words or phrases accelerates social conflict because such language triggers emotions [3]. This problem affects the dynamics and interactions of online social communities. In Indonesia, the Ministry of Communication and Information Technology of the Republic of Indonesia (KOMINFO) handled 3,640 cases of SARA-based (ethnicity, religion, race, and inter-group) hate speech in the digital space from 2018 to April 26, 2021. In South Kalimantan, hate speech cases have been rampant in recent years. According to several news reports, in 2018 a social media account uploaded content that allegedly contained hate speech considered insulting to a cleric from Banjar, South Kalimantan. In 2020, a State Civil Apparatus (ASN) employee was arrested for spreading hoax news and hate speech against the Indonesian National Police (POLRI). In January 2021, when a major flood hit South Kalimantan, H Sahbirin Noor became the target of hate speech from South Kalimantan residents over his handling of the floods. Most of the hate speech uttered by residents of South Kalimantan uses the Banjarese language. Among social media platforms, Instagram is where such hate speech is most commonly found.
Hate speech detection has become crucial on social media platforms, including Instagram. The Banjarese language is one of the languages spoken in Indonesia, and detecting hate speech in this language on Instagram is a relatively new area of research. This review provides an overview of previous studies that support and clarify the novelty of detecting Banjarese hate speech on Instagram. Previous research has extensively explored the accuracy of machine learning methods in detecting hate speech on social media. The effectiveness of these methods depends on the language and dataset used [4]. For instance, a study on the English language employed a dataset of 14,509 tweets from Twitter and applied a linear SVM to classify hate speech, achieving an accuracy of 78%. A study on the Indonesian language used a dataset of 13,169 tweets from Twitter with the RFDT (Random Forest Decision Tree) classifier and the LP (Label Power-set) transformation method. Without identifying targets, categories, and levels, the classification achieved an accuracy of 77.36%; with the identification of targets, categories, and levels, it achieved 66.12% [3]. Salim and Suhartono [5] conducted a systematic literature review of machine learning methods for hate speech detection, which can guide experimental approaches to detecting hate speech and abusive language. Zhang et al. [2] observed that extremist violence tends to increase online hate speech, particularly in messages directly advocating violence [6]. Sinyangwe established that, to detect hate speech and offensive language on online social media platforms, the dataset must be categorized and presented in statistical form after running the model.
Ghosal and Jain [7] identified the need for artificial intelligence (AI) in hate speech research. Awal [8] explored fine-tuning language models (LMs) for hate speech detection, and these solutions have yielded strong performance.
Li and Ning [9] researched anti-Asian hate speech detection via data-augmented semantic relation inference. Boishakhi et al. [10] used a combined approach to detect hate speech in content containing video, audio, and speech by extracting feature images and feature values from audio and text. They used machine learning, deep learning, and natural language processing to detect hate speech. In [11], the researchers used Long Short-Term Memory for hate speech and abusive language detection on Indonesian YouTube comment sections. Deshpande et al. [12] conducted experiments for a binary hate speech classification task in Multilingual-Train Monolingual-Test, Monolingual-Train Monolingual-Test, and Language-Family-Train Monolingual-Test scenarios. Mozafari et al. [13] investigated the feasibility of a meta-learning approach to cross-lingual few-shot hate speech detection, leveraging two meta-learning models based on optimization-based and metric-based methods (MAML and Proto-MAML). These findings demonstrate that the performance of machine learning approaches in hate speech detection varies with the language and dataset under consideration. The novelty of this research therefore lies in investigating hate speech detection using machine learning techniques specifically for the Banjarese language on social media platforms. To address this gap in the literature, this study explores existing methods and identifies the most accurate approach for detecting hate speech in the Banjarese language.
The data used in this study comprise comments extracted from local Instagram accounts known for frequently containing hate speech. Three commonly employed text classification models were chosen: Support Vector Machine (SVM), Naïve Bayes, and Decision Tree. SVM is commonly employed as a binary classifier in natural language processing (NLP) tasks [14]. It constructs a decision boundary that maximizes the margin between the classes, thereby minimizing classification errors [15]. Naïve Bayes, widely recognized for its simplifying assumptions and ease of implementation, is extensively used for text classification [16]. Decision trees have been extensively employed in various machine learning tasks, as they have a clear structure that offers insight into the training data and are straightforward to implement [17]. This study aims to determine the most accurate method for detecting hate speech on social media, particularly Instagram. The findings can serve as a reference when selecting a machine learning method for detecting hate speech in the Banjarese language. The researchers hope that this study will benefit other scholars, particularly those working on low-resource local languages such as Banjarese.

RESEARCH METHOD
This research aims to create a Banjarese language hate speech dataset and try several combinations of feature extraction and machine learning models to determine which combination has the best accuracy in classifying hate speech. The method used in this study can be seen in Figure 1.

Data Collection
Because this study focuses on detecting hate speech in the Banjarese language, where previously there was no dataset, the researchers created a dataset for this study by collecting comments on local Instagram accounts where many comments were found in Banjarese. Comments are mainly collected from posts that discuss disasters, politics, or other topics that trigger hate speech.

Data Filtering and Annotation
At the data filtering stage, the researchers removed redundant data and translated comments written in languages other than Banjarese into Banjarese. The translation followed a Banjarese language dictionary and was validated by linguists. The dataset was labeled manually by the researchers themselves: each item was marked 1 for "hate speech" or 0 for "not hate speech". Before annotating the data, the researchers prepared guidelines defining what counts as hate speech in this study.

Preprocessing
Before classifying the data, several preprocessing procedures are carried out. Case folding converts all text to lowercase to facilitate further processing [18,19]. Stop word removal discards common words that appear frequently in sentences but carry little meaning [18]; removing them can increase the signal-to-noise ratio in unstructured text and thus the statistical significance of terms that may be important for a specific task [20]. Punctuation removal strips punctuation marks, which divide text into sentences, paragraphs, and phrases; because they occur so often, they affect the result of any text processing approach, especially those that depend on the frequency of words and phrases [21]. Most text and document datasets contain many unnecessary characters, such as punctuation and special characters [22]. These characters help human readers understand documents, but they can harm classification algorithms [23]. URL removal deletes URLs, which do not correlate with the meaning of a comment, can reduce classification performance, and are not used in later steps [24,25].
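The four preprocessing steps can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the stop-word set below is a hypothetical placeholder (the paper does not publish its list), the URL pattern is a simple heuristic, and URLs are removed before punctuation so that they are still recognizable, which may differ from the order used in the study.

```python
import re
import string
from typing import List

# Hypothetical placeholder stop words; the study's actual list is not given.
STOP_WORDS = {"nang", "wan"}

# Simple heuristic for URLs: anything starting with http(s):// or www.
URL_PATTERN = re.compile(r"(https?://\S+|www\.\S+)")

# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def preprocess(comment: str) -> List[str]:
    """Return the cleaned tokens of one comment."""
    text = comment.lower()                 # case folding
    text = URL_PATTERN.sub(" ", text)      # URL removal (before punctuation)
    text = text.translate(PUNCT_TABLE)     # punctuation removal
    tokens = text.split()                  # whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]  # stop word removal


tokens = preprocess("Lihat https://example.com/p?x=1 NANG ini, wan itu!!!")
```

Replacing punctuation with a space rather than deleting it keeps words separated when a comment omits the space after a comma.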

Feature Extraction
Machine learning algorithms cannot learn classification rules from unprocessed text; they need numeric features. Feature extraction is therefore one of the main steps in text classification: it extracts the main features from the raw text and represents them in numerical form [26]. The feature extraction methods used in this research are Word N-gram, TF-IDF, a combination of Word N-gram and TF-IDF, Word2Vec, and GloVe, as shown in Table 1.

Table 1. Feature extraction methods
TF-IDF: A feature representation technique that captures the importance of a word to a document within a document set. It combines the frequency of the word in a document with the number of documents containing that word [28].
Word2Vec: A technique for learning vector representations of words, which can then be used to train machine learning models [29].
GloVe: A global log-bilinear regression model that combines the advantages of the two main model families in the literature: global matrix factorization and local context window methods [30].
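To make the Word N-gram representation concrete, the sketch below counts unigrams and bigrams in a tokenized comment using only the standard library. The function names and the choice of n = 1 and n = 2 are assumptions for illustration; the paper does not state which n-gram range it used.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def word_ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous word n-grams of length n."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(tokens: List[str], n_values: Sequence[int] = (1, 2)) -> Counter:
    """Count every n-gram for each n in n_values (assumed range: 1 and 2)."""
    counts: Counter = Counter()
    for n in n_values:
        counts.update(word_ngrams(tokens, n))
    return counts


# Placeholder tokens, not real data.
counts = ngram_counts(["a", "b", "a", "b"])
```

Each distinct n-gram becomes one numeric feature, so a comment is represented by its vector of counts over the n-gram vocabulary.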

Classification
Each comment is assigned to one of two classes, hate speech or not hate speech. Comparing the predictions with the annotated labels yields four outcomes: true negatives (TN), false positives (FP), false negatives (FN), and true positives (TP). Three machine learning algorithms are applied to detect hate speech in the Banjarese language: SVM, Naïve Bayes, and Decision Tree. These algorithms are implemented using the scikit-learn library [31].
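As a minimal sketch of how the three classifiers might be trained with scikit-learn, the example below pairs a TF-IDF vectorizer with each model. The toy comments, the choice of `LinearSVC` (the paper does not state the SVM kernel), the choice of `MultinomialNB` (the Naïve Bayes variant is not stated), and the default hyperparameters are all assumptions for illustration, not the study's configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Toy, made-up comments (placeholders, not real Banjarese data).
train_texts = [
    "insult insult you",
    "you are insult",
    "nice day today",
    "thank you friend",
    "insult them all",
    "good morning friend",
]
train_labels = [1, 1, 0, 0, 1, 0]  # 1 = hate speech, 0 = not hate speech

# Assumed model variants; the paper only names the model families.
models = {
    "SVM": LinearSVC(),
    "Naive Bayes": MultinomialNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

predictions = {}
for name, model in models.items():
    # Vectorizer and classifier are fitted together on the training texts.
    pipeline = make_pipeline(TfidfVectorizer(), model)
    pipeline.fit(train_texts, train_labels)
    predictions[name] = pipeline.predict(["insult you", "good friend"]).tolist()
```

Wrapping the vectorizer and the classifier in one pipeline ensures that the vocabulary and IDF weights are learned from the training data only.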

Evaluation
This study uses Accuracy and the F1-measure as its main performance evaluation metrics [31], and also reports Precision and Recall. Classifier performance is measured by counting true negatives (TN), false positives (FP), false negatives (FN), and true positives (TP), which together form a confusion matrix, shown in Table 2. True Positive (TP) is the number of positive instances classified correctly [32]. False Positive (FP) is the number of instances incorrectly classified as hate speech [33]. False Negative (FN) is the number of instances incorrectly predicted as negative [34]. True Negative (TN) is the number of negative instances classified correctly [35]. Several metrics derived from these counts are used to assess the classifiers [26], and the models built in this experiment were evaluated chiefly by their F1-score [36,37]. Accuracy is the number of correctly classified samples (true positives and true negatives) divided by the total number of samples [26,38]; its formula is shown in (1). Recall is the proportion of actual positives that are predicted positive [38]; its formula is shown in (2). Precision, also called positive predictive value, is the proportion of predicted positives that are actually positive [26]; its formula is shown in (3). The F1-measure is the harmonic mean of recall and precision [38]; its formula is shown in (4).
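\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]
\[ \text{Recall} = \frac{TP}{TP + FN} \tag{2} \]
\[ \text{Precision} = \frac{TP}{TP + FP} \tag{3} \]
\[ \text{F1-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{4} \]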
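The four metrics defined in this section can be computed directly from the confusion-matrix counts. The sketch below is a plain-Python illustration; the function name and the example counts are hypothetical and are not results from the study. Returning 0.0 when a denominator is zero is a convention chosen here, not something the paper specifies.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, recall, precision, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "recall": recall, "precision": precision, "f1": f1}


# Hypothetical counts for illustration only.
scores = classification_metrics(tp=80, fp=20, fn=20, tn=880)
```

With these made-up counts, accuracy is 0.96 while precision, recall, and F1 are all 0.80, which illustrates how accuracy can sit well above the other metrics when negatives dominate.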

RESULT AND ANALYSIS

Banjarese Hate Speech Dataset
The Banjarese hate speech dataset was built from comments written in Banjarese on local South Kalimantan Instagram accounts. The work proceeded through several stages: data collection, data filtering and annotation, preprocessing, feature extraction, classification, and evaluation. The CSV-formatted dataset consists of 15,481 data instances, of which 2,039 are labeled as hate speech and 13,442 as not hate speech (see Table 3). Sample data and labels used in this study are shown in Table 4. Because the data are imbalanced, the F1-measure is used as the main performance metric. The F1-measure is a composite metric that considers both precision and recall. Precision measures correctly predicted hate speech instances out of all predicted hate speech instances, while recall measures correctly predicted hate speech instances out of all actual hate speech instances. The F1-measure therefore provides a balanced evaluation, particularly for imbalanced datasets, and gives a more reliable picture of model performance when the imbalance is taken into account [4].
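A short calculation shows why accuracy alone would be misleading on this dataset. The class counts below are taken from the paper; the classifier that labels every comment as "not hate speech" is a hypothetical baseline introduced only for illustration, and it is evaluated on the full dataset rather than on the study's test split.

```python
# Class counts reported for the dataset.
n_hate, n_not_hate = 2039, 13442
total = n_hate + n_not_hate

# Hypothetical baseline: predict "not hate speech" (0) for every comment.
tp, fp = 0, 0                 # nothing is ever predicted as hate speech
fn, tn = n_hate, n_not_hate   # all hate speech is missed, all non-hate is correct

baseline_accuracy = (tp + tn) / total
baseline_recall = tp / (tp + fn)
# F1 written in its equivalent count form: 2TP / (2TP + FP + FN).
baseline_f1 = 2 * tp / (2 * tp + fp + fn)

print(f"accuracy={baseline_accuracy:.3f} recall={baseline_recall:.1f} f1={baseline_f1:.1f}")
```

The baseline reaches roughly 86.8% accuracy without detecting a single hate speech comment, while its recall and F1 on the hate speech class are both zero.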

The Combination of Feature Extraction and Model for Detecting Banjarese Hate Speech
After the dataset was collected, each feature extraction method was paired with each model, and the combinations were compared on Recall, Precision, Accuracy, and F1-measure to find the most accurate combination for detecting hate speech in the Banjarese language. The dataset was divided 8:2 into training and testing sets. The results can be seen in Table 5. On Accuracy, the highest-scoring combination is TF-IDF with SVM, at 91%. On Recall, two combinations share the highest score of 91%: TF-IDF with Naïve Bayes and TF-IDF with SVM. On Precision, the result differs from the two previous metrics: the highest-scoring combination is TF-IDF with Naïve Bayes, at 91%. On F1-measure, TF-IDF with SVM again scores highest, at 91%. Table 5 shows that the Naïve Bayes and SVM models with N-Gram and TF-IDF feature extraction dominate the highest values across the F1-measure, Accuracy, Precision, and Recall metrics. However, because the data are imbalanced, the F1-measure is the deciding metric, so TF-IDF with SVM is the best combination found in this research for detecting hate speech in the Banjarese language. Table 6 compares this research with previous research. The research in [?] conducts a comparative analysis of studies focusing on different languages, including Javanese, Sundanese, Madurese, Minangkabau, and Musi. In contrast, the research in [?] specifically compares previous research on the Sundanese and Javanese languages. The novelty of each study is given in the corresponding column, and the outcomes of prior investigations are contrasted with the present study's findings. The results presented in reference [39] demonstrate a positive correlation between dataset size and performance improvement. The current study employs a Banjarese language dataset comprising 15,481 instances and achieves an F1-measure of 91%. These results indicate superior performance compared to previous studies conducted on other regional languages.
On the other hand, reference [?] focuses on comparing different algorithms and feature extraction techniques. The earlier research achieved F1-measures ranging from 80% to 82% using N-Gram feature extraction in combination with algorithms such as SVM, RFDT, and Naïve Bayes for the Sundanese and Javanese languages. The present study surpasses these findings by employing TF-IDF feature extraction: in conjunction with SVM, the F1-measure for detecting Banjarese hate speech reaches 91%. The effectiveness of TF-IDF stems from its ability to assign higher weights to words that carry more information within a specific document while accounting for their rarity across the entire dataset. This weighting scheme helps capture the discriminative power of words specific to hate speech in the Banjarese language. TF-IDF also mitigates the influence of common words that appear frequently in both hate speech and non-hate speech documents. By down-weighting these common words, the feature extraction method can focus on distinctive words and phrases that indicate hate speech. For comparison, the prior research on other regional languages used much smaller datasets: 3,449 instances for Javanese, 2,207 for Sundanese, 2,773 for Madurese, 3,125 for Minangkabau, and 2,564 for Musi.
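The weighting behavior described in this paragraph can be illustrated with a small hand-rolled TF-IDF calculation. This sketch uses the textbook form, term frequency times ln(N / document frequency), which is an assumption: library implementations typically add smoothing, and the paper does not state which variant it used. The toy documents are placeholders, not real data.

```python
import math
from typing import Dict, List


def tf_idf(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Textbook TF-IDF per document; assumes every document is non-empty."""
    n_docs = len(docs)

    # Document frequency: in how many documents does each term occur?
    df: Dict[str, int] = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1

    weights = []
    for doc in docs:
        doc_weights = {}
        for term in set(doc):
            tf = doc.count(term) / len(doc)       # relative term frequency
            idf = math.log(n_docs / df[term])     # rarity across the dataset
            doc_weights[term] = tf * idf
        weights.append(doc_weights)
    return weights


# "common" occurs in every document; "slur" occurs in only one.
docs = [["common", "slur"], ["common", "hello"], ["common", "thanks"]]
weights = tf_idf(docs)
```

In the first document, the word that appears everywhere receives a weight of zero, while the word unique to that document receives a positive weight, which is the down-weighting of common words that the paragraph describes.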

Table 6. Comparison with previous research (novelty column)
Novelty (comparison by dataset size): The results show that a larger dataset improves the performance obtained.
Novelty (comparison by algorithm and feature extraction): Using SVM, Naïve Bayes, and Decision Tree with TF-IDF feature extraction on the Banjarese language yielded a much better F1-measure.

CONCLUSION
This research investigated hate speech detection in the Banjarese language through experiments combining feature extraction methods and classification models. Using a dataset of 15,481 instances, comprising 2,039 hate speech samples and 13,442 non-hate speech samples, the study finds that the combination of TF-IDF feature extraction and the Support Vector Machine (SVM) model achieves average scores exceeding 90% on Recall, Precision, Accuracy, and F1-measure. The research addresses the lack of previous studies on hate speech detection for the Banjarese language and offers a starting point for future work on refining detection methods and improving accuracy. The effectiveness of TF-IDF with SVM underscores their potential as accurate tools for identifying Banjarese hate speech. The research also provides a dataset for further exploration, enabling researchers to investigate alternative approaches and refine detection methods specific to the Banjarese language.