Predicting Handling Covid-19 Opinion using Naive Bayes and TF-IDF for Polarity Detection

Government policies related to Covid-19 have drawn many public responses, both positive and negative, especially on the government's official social media portals. Twitter is one social media platform where people are free to express their opinions. This study applies sentiment analysis to tweets about the implementation of government policies related to Covid-19 in order to classify public opinion. The analysis proceeds in several stages. The first step is data mining to collect the tweets to be analyzed. The tweets are then cleaned and converted to lowercase. Next, each tweet's words are reduced to their base forms and their frequencies of occurrence are calculated. Finally, the Naive Bayes method is applied to determine the sentiment class of each tweet. The results show that Indonesian public sentiment about Covid-19 prevention is neutral. The application achieves an accuracy of 76.7%. This suggests that the Indonesian government needs to evaluate the policies taken to deal with Covid-19 in order to foster positive opinions and build solid cooperation between the government and residents in tackling the Covid-19 outbreak.


INTRODUCTION
A disease outbreak shook the world at the start of 2020. The epidemic spread rapidly, infecting nearly every country on the planet. The disease is a coronavirus infection, also known as coronavirus disease. The World Health Organization (WHO) has classified the virus as a global emergency since January 2020 [1]. The COVID-19 outbreak, which emerged in 2019, became widespread across the world. Within ten months, 38,085,762 infected people were found, with 28,628,813 recovering (96 percent of closed cases) and 1,086,055 dying (4 percent of closed cases). Transmission was extremely fast and reached almost the entire world within a short time, although the mortality rate was quite low. In response to the COVID-19 epidemic, practically all countries have enacted preventive policies such as social distancing and staying at home [2].
In Indonesia, the government declared a disaster emergency in response to the virus epidemic, to last until February 2020. The emergency continued and was followed by the rollout of countermeasures in different regions of Indonesia. The virus continued to spread to several areas through May 2020. Other countermeasures implemented by the Indonesian government

The results of sentiment analysis classification are assessed based on three values: precision, recall, and F-Measure. The study in [20] reports precision, recall, and F-Measure values below 60%, namely 59.11%, 56.80%, and 57.96%, using two polarities, positive and negative. This research instead examines three polarities: negative, neutral, and positive. The implementation of government policies related to Covid-19 has drawn public responses of many polarities, some positive and some negative, especially on Twitter, where people are free to express their opinions. Based on the previous discussion, the author investigates opinion sentiment analysis on Twitter regarding the implementation of government policies related to Covid-19 using Naive Bayes. The Naive Bayes algorithm performs better than competing algorithms in several studies. Feature weighting using TF-IDF is used to increase accuracy and to support more heterogeneous classification, producing more accurate analysis results. The results can be used as a reference for the government in making and evaluating Covid-19 prevention policies in Indonesia. To support the use of Naive Bayes for classifying public opinion on Twitter, an application was built using the Python language.
The purpose of this research is to find out the opinions of Indonesian people on Twitter regarding the policies taken by the government in tackling Covid-19 in Indonesia using natural language processing techniques, namely sentiment analysis by combining the Naive Bayes algorithm and TF-IDF feature selection.

RESEARCH METHOD
The stages of the research process carried out in this study are described in a research methodology flow as shown in Figure 1. The stages of the framework of thought have the following explanation:

Data Collection
Data are collected at this stage to support building the system. Before collecting Twitter data, access to the Twitter API must be prepared. The Twitter data are collected by scraping, using the keywords presented in Table 1. The TextBlob library is needed for textual data processing at the next stage. Before the preprocessing stage, the collected data are labeled by the author according to sentiment polarity and divided into three polarities, namely positive, neutral, and negative, which later serve as the reference for the system when classifying sentiments. To label the dataset, this study uses machine learning tools that have previously been trained and tested on datasets with known sentiments to assess whether a tweet has positive, negative, or neutral polarity.

Preprocessing
Case folding, tokenizing, stopword removal, and stemming are the preprocessing phases performed on tweet data before processing [21]. Case folding normalizes capitalization by changing all letters to lowercase. Tokenizing breaks sentences into words, or tokens, at the word separators. Tokenizing also includes removing numbers, removing unimportant symbols and punctuation, and removing extra whitespace. Stopword removal discards words that are ignored in processing and that are usually stored in stop lists. Stemming reduces the number of distinct index terms by returning words that carry prefixes and suffixes to their base form.
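The four preprocessing phases can be sketched in plain Python as follows. This is only an illustrative sketch: the stop list and the affix lists are hypothetical and deliberately tiny, and the `naive_stem` helper is a stand-in for a real Indonesian stemmer such as the Sastrawi library, which applies far more careful morphological rules.

```python
import re

# Hypothetical, deliberately tiny stop list (assumption, not the list used in the study).
STOPWORDS = {"sebuah", "oleh", "pada", "yang", "akan"}

# Hypothetical affix lists for illustration only.
PREFIXES = ("di", "me", "ber")
SUFFIXES = ("kan", "an", "i")


def case_fold(text):
    """Case folding: convert every letter to lowercase."""
    return text.lower()


def tokenize(text):
    """Tokenizing: drop digits and punctuation, then split on whitespace."""
    letters_only = re.sub(r"[^a-z\s]", " ", text)
    return letters_only.split()


def remove_stopwords(tokens):
    """Stopword removal: discard tokens found in the stop list."""
    return [t for t in tokens if t not in STOPWORDS]


def naive_stem(token):
    """Strip at most one prefix and one suffix, keeping at least 4 letters."""
    for prefix in PREFIXES:
        if token.startswith(prefix) and len(token) - len(prefix) >= 4:
            token = token[len(prefix):]
            break
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            token = token[:-len(suffix)]
            break
    return token


def preprocess(text):
    """Run the four phases in order and return the list of stemmed tokens."""
    tokens = remove_stopwords(tokenize(case_fold(text)))
    return [naive_stem(t) for t in tokens]
```

For instance, `preprocess("Akan dibicarakan oleh Gubernur")` drops the stopwords "akan" and "oleh" and reduces "dibicarakan" to "bicara" under these toy rules.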

Planning
At this stage, the division into training data and test data is designed based on the dataset obtained. Because the dataset in this study contains 1000 tweets, the study compares three splits of training to test data, namely 70%-30%, 80%-20%, and 90%-10%, based on references in previous studies.
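The three splits could be produced along the following lines. This is a minimal sketch: the `split_dataset` helper, the fixed random seed, and the placeholder tweet strings are assumptions made for illustration, not details taken from the study.

```python
import random


def split_dataset(items, train_percent, seed=42):
    """Shuffle a copy of the items and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Placeholder strings standing in for the 1000 labeled tweets.
tweets = [f"tweet {i}" for i in range(1000)]

split_sizes = {}
for train_percent in (70, 80, 90):
    train, test = split_dataset(tweets, train_percent)
    split_sizes[train_percent] = (len(train), len(test))
```

With 1000 tweets, the three splits contain 700/300, 800/200, and 900/100 training and test items respectively.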

Implementation
At this stage, the collected data are used to build a web-based application in the Python language that applies the Naive Bayes algorithm with TF-IDF feature weighting.
a. Term Weighting
In news classification, word weighting is used to determine a category. One of the weighting methods is TF-IDF (Term Frequency-Inverse Document Frequency). The weight of a word (term) expresses how important the term is in representing the title. In TF-IDF weighting, the weight is greater when the word occurs more frequently, but it decreases when the word also appears often in other news items. Equation (1) is used for the TF-IDF calculation, where N is the number of news items and df is the number of news items in which a word (term) appears.
b. Naive Bayes Classifier
The Naive Bayes algorithm in this study is used to classify public sentiment into three polarities: negative, neutral, and positive. The Naive Bayes classification method relies on probability and statistics, predicting future probabilities based on past experience. Equation (2) is used for the probability calculations in the program, while equation (3) is the Naive Bayes equation used to carry out the classification. In equation (3), P(V_j) represents the share of documents in category V_j within the collection of documents, while P(a_i | V_j) represents how likely the word W_k is to appear in a document of category V_j. The frequency of the k-th word in each category is denoted n_k, and |vocabulary| is the number of words in the test document.
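To make the roles of P(V_j), P(a_i | V_j), n_k, and |vocabulary| concrete, here is a minimal sketch of a multinomial Naive Bayes classifier on raw word counts. Several points are assumptions rather than the study's method: the word probability uses the textbook Laplace-smoothed estimate (n_k + 1) / (n + |vocabulary|), where n is the total word count of the class and |vocabulary| is taken as the number of distinct training words; unseen words are skipped; and the toy training tweets are invented for illustration. It is plain multinomial Naive Bayes, not DMNB, and it does not use TF-IDF weights.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(documents):
    """documents: list of (tokens, label) pairs. Returns (priors, counts, vocabulary)."""
    doc_counts = Counter(label for _, label in documents)
    word_counts = defaultdict(Counter)
    for tokens, label in documents:
        word_counts[label].update(tokens)
    vocabulary = {w for tokens, _ in documents for w in tokens}
    # P(V_j): share of training documents that belong to class V_j.
    priors = {label: n / len(documents) for label, n in doc_counts.items()}
    return priors, word_counts, vocabulary


def classify(tokens, model):
    """Return the class with the highest log posterior score."""
    priors, word_counts, vocabulary = model
    best_label, best_score = None, -math.inf
    for label, prior in priors.items():
        n = sum(word_counts[label].values())  # total words seen in this class
        score = math.log(prior)
        for w in tokens:
            if w not in vocabulary:
                continue  # assumption: words never seen in training are skipped
            n_k = word_counts[label][w]
            # Laplace-smoothed estimate of P(a_i | V_j).
            score += math.log((n_k + 1) / (n + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented toy training data, not taken from the study's dataset.
training = [
    (["kebijakan", "bagus", "bantu"], "positive"),
    (["vaksin", "bagus"], "positive"),
    (["kebijakan", "gagal"], "negative"),
    (["psbb", "gagal", "buruk"], "negative"),
    (["psbb", "mulai", "senin"], "neutral"),
    (["vaksin", "mulai"], "neutral"),
]
model = train_naive_bayes(training)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, which is why the sketch sums logarithms instead of multiplying probabilities directly.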

Testing
In the last stage, the trained model from each experiment is tested by looking at the level of accuracy it achieves. Sentiment analysis is then performed on the available data, and precision, recall, and accuracy are calculated using a confusion matrix. In this study, the 1000-tweet dataset is divided into training and testing data in three ways, namely 70%-30%, 80%-20%, and 90%-10%, to find a split that gives good accuracy for sentiment classification. The precision, recall, accuracy, and F1-measure values that determine the classification results are calculated with formulas (4), (5), (6), and (7) [6].
$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100$$

Predicting Handling Covid-19 . . . (Supangat) ISSN: 2476-9843

The accuracy formula gives the ratio of correct predictions (positive and negative) to the entire data. The precision formula measures the ratio of true positive predictions to all results predicted as positive. The recall formula calculates the ratio of true positive predictions to all data that are actually positive. The F1-score is the harmonic mean of precision and recall, summarizing both in one value.
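The four measures described above can be computed from a confusion matrix as sketched below. For three classes, each class is treated in turn as the "positive" class (a one-vs-rest reading, which is an assumption about how the study applied the formulas), and overall accuracy is the share of tweets on the matrix diagonal. The counts in `matrix` are hypothetical and are not the study's results.

```python
def evaluate(matrix, labels):
    """matrix[i][j] counts tweets whose true class is i and predicted class is j."""
    total = sum(sum(row) for row in matrix)
    report = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fp = sum(row[k] for row in matrix) - tp  # predicted k, actually another class
        fn = sum(matrix[k]) - tp                 # actually k, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    correct = sum(matrix[k][k] for k in range(len(labels)))
    report["accuracy"] = correct / total * 100  # percentage of correct predictions
    return report


labels = ["negative", "neutral", "positive"]
# Hypothetical counts for illustration only.
matrix = [
    [40, 5, 5],
    [10, 80, 10],
    [0, 15, 35],
]
report = evaluate(matrix, labels)
```

In this made-up example, 155 of 200 tweets lie on the diagonal, so the overall accuracy is 77.5%, while the positive class has precision and recall of 0.7 each.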

RESULT AND ANALYSIS

Data Collection
Data collection using the Twitter API yielded 1000 tweets as the dataset. The 1000 tweets were labeled and classified into three sentiments: negative, neutral, and positive. The author labeled the collected data according to sentiment polarity using machine learning tools that had previously been trained and tested on datasets with known sentiments, so that each tweet is assessed as having positive, negative, or neutral polarity. Table 2 shows the number of tweets per sentiment after labeling. As Table 2 shows, the dataset used in this study is not balanced across classes. Negative sentiment has the highest count, followed by neutral and positive sentiment. This imbalance will likely affect the results and the accuracy of the sentiment analysis.

a. Case Folding
The use of capital letters is not uniform across text sources. As a result, case folding is required to convert the complete text of a document into a standard format (uppercase or lowercase). For example, users who type "berita," "BERITA," or "Berita" to find information about "BERITA" get the same retrieval result, namely "berita." Case folding transforms all uppercase letters in a document to lowercase. Only the letters 'a' to 'z' are accepted; all other characters are removed. An example of the case folding stage can be seen in Table 3.
b. Tokenizing
The tokenizing stage then breaks the sentences in the string down into single-word chunks. An example of the tokenizing stage can be seen in Table 4.

Table 4. Tokenizing
khofifah menjadi pembicara sharing session penanggulangan covid yang digelar oleh gatra media group bersama satgas penanganan covid gubernur jatim diketahui bahwa gubernur khofifah menjadi pembicara sharing session penanggulangan covid gubernur jatim
Input Text: sharing session penanggulangan covid akan dibicarakan oleh gubernur
Output Text: sharing | session | penanggulangan | covid | akan | dibicarakan | oleh | gubernur

c. Stopword
At this stage, less important or frequently occurring words (stopwords) are discarded, such as conjunctions and adverbs that are not distinctive, for example "sebuah", "oleh", "pada", and so on. The stop words in this study were generated using a modified Sastrawi library [20]. Table 5 provides an illustration of the stopword stage.
d. Stemming
The stemming stage removes affixes (prefixes and suffixes) to return words to their base form. Table 6 provides an illustration of the stemming stage.
Word weighting is used in news classification to determine a category. TF-IDF (Term Frequency-Inverse Document Frequency) is one of the weighting methods. Its weight value expresses the importance of a word (term) in representing the title. In TF-IDF weighting, the weight is higher when the term occurs more frequently, and lower when the term also appears frequently in other news items.
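The behavior described above can be illustrated with a small sketch. It assumes the common form w = tf × log(N / df), with a base-10 logarithm and no smoothing; the exact variant used in the study is not stated here, so treat this form, the `tf_idf` helper, and the toy documents as assumptions.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    # df: number of documents in which each term appears at least once.
    df = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        tf = Counter(doc)  # raw count of each term in this document
        weights.append({t: tf[t] * math.log10(n_docs / df[t]) for t in tf})
    return weights


# Invented toy documents for illustration only.
docs = [
    ["covid", "vaksin", "vaksin"],
    ["covid", "psbb"],
    ["covid", "bantuan"],
]
weights = tf_idf(docs)
```

Here "covid" appears in every document, so its weight is zero, while "vaksin" occurs twice in a single document and receives the largest weight.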

Planning
After the preprocessing stage, the dataset is separated into training and testing data before being analyzed using Naive Bayes with TF-IDF weighting. Following [18], the dataset is divided with ratios of 70%:30% and 80%:20%, and the confusion matrix is used to calculate accuracy. This study additionally tries a 90%:10% ratio. Table 7 shows the number of training and testing data for each of these ratios.

Implementation and Testing
At this stage, the proposed model is implemented, tested, and evaluated using the confusion matrix and the values of precision, recall, F-score, and accuracy. Table 8 describes the results after data acquisition and preprocessing of the 1000 tweets, which are divided into three sentiments: category 1 is positive, category 2 is neutral, and category 3 is negative. The normalized data are separated into training data and test data before being entered into the classification engine. Based on calculations using a confusion matrix, sentiment analysis using discriminative multinomial naive Bayes (DMNB) and TF-IDF on Twitter regarding the Covid-19 response yields three categories, with positive sentiment at 28.7%, neutral at 43.9%, and negative at 27.4%.

The results in Table 8 are then evaluated using the formulas for precision, recall, and F-score, with the data grouped by class. The additional evaluation measures for negative, neutral, and positive tweets are presented in Table 9. According to Tables 8 and 9, the Naive Bayes classifier has a recall of 0.71 for negative tweets, 0.87 for neutral tweets, and 0.70 for positive tweets. In addition, the experiment achieves a weighted average precision of 0.77, a weighted average recall of 0.76, and a weighted average F-score of 0.76. The sentiment analysis therefore performs at roughly the 76% level. These results indicate that TF-IDF feature weighting and a larger number of polarities can increase the precision, recall, and F-score obtained from the confusion matrix compared with the studies in [17,20]. This is because term weighting using TF-IDF allows tweets to be analyzed and classified more heterogeneously. A further discussion based on these results follows.

Novelty:
This research shows that the 70%:30% split gives the highest accuracy, and that accuracy increases compared with previous studies [17,19,20].
Comparison:
According to [17]: Dataset: 1000. The results of previous studies show confusion matrix values below 70%. This study investigates whether enlarging the dataset and adding a neutral polarity can increase the accuracy and the confusion matrix measures in classification.

CONCLUSION
The contribution of this paper is to combine the discriminative multinomial naive Bayes (DMNB) method with the TF-IDF term weighting approach to classify tweets more heterogeneously and increase accuracy. This paper also shows that a neutral polarity is needed when evaluating which policies the Indonesian government should take to handle COVID-19 in the future. The dataset consists of 1000 Indonesian tweets. According to data testing, the proposed approach has an average precision of 77%, recall of 76%, and F-score of 76%. In addition, the accuracy is 76.7%. Sentiment analysis using DMNB and TF-IDF on Twitter divides the Covid-19 response into three categories: positive, neutral, and negative, with positive sentiment at 28.7%, neutral at 43.9%, and negative at 27.4%. This means that public sentiment toward the government's actions in dealing with Covid-19 in Indonesia is predominantly neutral, so the Indonesian government needs to evaluate its Covid-19 policies in order to foster positive opinions and build solid cooperation between the government and residents in tackling the Covid-19 outbreak. It is hoped that further, more specific research can examine the polarity of opinion on the handling of Covid-19 in Indonesia using other methods and by adding emotes to the dataset. This research implies that the government should pay more attention to policies from the community's perspective so that it can make better policies in the future. The dataset also suggests that the way policies are communicated to the community needs to be considered, so that the community understands them better and can collaborate to reduce the number of Covid-19 victims in Indonesia.
Suggestions for future research are to add a larger number of tweets to improve classification accuracy and to use a dataset that represents tweets from Indonesian people in all regions, in order to capture the overall sentiment of people in Indonesia, as well as a dataset that is not limited to words but also includes emotes and satirical expressions.

a critical revision of the article and final approval of the final version to be published.
FUNDING STATEMENT
This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.