Hybrid Model for Sentiment Analysis of Bitcoin Prices using Deep Learning Algorithm

Bitcoin is a decentralized digital currency that is not controlled by any single authority or government. Bitcoin uses blockchain technology to verify transactions and guarantee user security and privacy. The fluctuating value of bitcoin is influenced by public opinion, because many people use these opinions as a basis for buying or selling bitcoins. Knowing the market conditions of bitcoin based on public opinion is therefore necessary. This study aimed to develop a hybrid model for bitcoin sentiment analysis. The dataset came from comments in the chat room of the Indodax website; 2890 comments were collected, then preprocessed, translated into English, and labeled, and then classified using a hybrid parallel CNN and LSTM model with 100-dimensional GloVe word embeddings. In the experiments, the best model was obtained with a 90:10 data split and 100 epochs, with 88% accuracy, 86% precision, 78% recall, and 81% F1-score, while the classification of the Indodax chat comments resulted in 64.22% neutral, 21.14% positive, and 14.63% negative comments. Based on these results, a parallel hybrid model provides high accuracy in classifying text. Positive comments also outnumbered negative comments.


INTRODUCTION
In this increasingly modern era, digital currency is in great demand because its value can increase significantly, making it an attractive investment alternative for some people. Digital currencies are an alternative form of liquidity that differs greatly from traditional monetary assets in ownership, transaction, and production [1]. Examples of digital currencies include Bitcoin and Ethereum. Bitcoin is a cryptocurrency, that is, a digital currency based on online payment transactions. Bitcoin is mainly used for transactions on the internet without intermediaries, that is, without bank services [2]. The bitcoin exchange rate is very volatile, with unreasonable price increases, so it is vulnerable to bubbles that can harm the public [3]. One factor that can affect the price of bitcoin is the emergence of negative or positive public opinion. The circulation of these opinions can affect the level of public trust in bitcoin. For this reason, a way is needed to determine market conditions from the opinions that develop, and text mining provides one.
Text mining is the process of searching text, usually obtained from documents, for words that can represent the documents' contents. To analyze the connections between documents, the first stage is to convert the files from PDF to text, after which filtering is carried out [4]. Text mining aims to find valuable information hidden in structured and unstructured information sources [5]. Retrieving information from text can include, among others, text or document categorization and sentiment analysis [6]. The text mining technique used in this study is sentiment analysis. Sentiment analysis extracts information from the collected data, including the quality and quantity of positive and negative sentiments and the trends that appear in opinions and comments related to Bitcoin. This information can then be used to predict Bitcoin price movements in both the short and long term. Sentiment analysis is a field of study that analyzes a person's opinions, sentiments, evaluations, judgments, attitudes, and emotions towards a product, service, organization, individual, problem, event, or topic [7]. Its main task is to classify the polarity of the text in sentences or documents and to determine the opinions they express [8]. With sentiment analysis, the polarity of existing opinions can be collected and used to gauge the public mood, that is, whether netizens' feelings are negative or positive. Sentiment analysis can be performed using deep learning algorithms such as convolutional neural networks (CNN), long short-term memory (LSTM), recurrent neural networks (RNN), and gated recurrent units (GRU).
CNN is a deep learning method built on artificial neural networks. It consists of one or more convolutional layers, often with a subsampling layer, followed by one or more fully connected layers as in standard neural networks [9]. CNN has an advantage over other methods in that the model is trained before testing; at test time no repeated training is required, and the trained model can be used anywhere. However, CNN has a weakness: the larger the dataset used in training, the longer the training process [10]. LSTM is an evolution of the RNN architecture and was first introduced by Hochreiter & Schmidhuber in 1997. A recurrent neural network is a robust type of neural network designed to handle sequence dependence [11]. In addition, LSTM has the advantage of handling the vanishing gradient problem that is common when processing relatively long sequences [12]. Based on the disadvantages of CNN and the advantages of LSTM, this research combines the two algorithms to obtain better model performance.
Bitcoin price sentiment analysis has been carried out before. The study in [13] used a machine learning model with the K-Nearest Neighbors algorithm on data from Facebook and obtained an accuracy of 62%. The study in [14] used the Naive Bayes algorithm on data from Twitter and obtained an accuracy of 71.98%. The more recent study in [15] compared the Naive Bayes and Support Vector Machine algorithms on data from Twitter and obtained its best accuracy, 71.30%, with the Support Vector Machine. These previous studies on bitcoin price sentiment analysis were carried out in 2021, focused on social media, and used sentiment analysis methods with accuracy below 80%. Since then, many things have changed, such as transaction volumes and new regulatory policies that have affected investor behavior. This study therefore updates the data and the sentiment analysis method to account for these changes in the cryptocurrency industry. The novelty of this research is that the data comes from Indodax, a digital currency exchange in Indonesia that provides a chat room service, and that a more sophisticated sentiment analysis method, namely deep learning, is used to improve accuracy. By updating the methods and data sources, this research is expected to provide a better understanding of investor sentiment regarding bitcoin prices and, as a result, to support decisions in bitcoin transactions.

RESEARCH METHOD
Figure 1 shows the steps to be followed.
Based on Figure 1, the process begins with collecting data from the indodax.com website using the Instant Data Scraper extension. The data is then preprocessed and translated into English so that the text can be labeled using the VADER lexicon. At the word embedding stage, the text is converted into vectors using GloVe and then processed using the hybrid parallel CNN and LSTM model. Finally, the performance of the model is measured using the confusion matrix.

Data Collection
The process begins with collecting data from the Indodax.com website. The data collection process is presented in Figure 2. Based on Figure 2, data was collected with the Instant Data Scraper by taking comments from the chat room column, and the data was then converted into Excel format. The results of data collection are presented in Table 1. Table 1 presents the data collected from the Indodax.com site from June 26 to July 27, 2022. There were 2890 data samples, with two variables: usernames and comments.

Data Preprocessing
Data preprocessing is the stage carried out before data analysis or modeling. Its purpose is to prepare the data so that it fits the needs of, and can be processed by, the algorithm to be used. The preprocessing consists of the following stages:
a. Data cleaning identifies and addresses problems with the data, such as missing, duplicate, or invalid data.
b. Case folding converts uppercase letters to lowercase or vice versa; this is typically used in text search so that the results are not affected by case.
c. Tokenizing breaks the text down into parts that are easier to process.
d. Filtering removes data that is not needed.
e. Stemming reduces words with the same root to their basic form.
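As a minimal sketch of how stages a to e could be chained, the Python below uses only the standard library. The stopword list is a tiny hypothetical sample, and `stem` is a placeholder hook (identity by default), because the source does not name the stopword list or stemmer that was used.

```python
import re

# Hypothetical stopword sample for illustration only; a real run would use
# a complete stopword list for the language of the comments.
STOPWORDS = {"yang", "dan", "di", "ke", "ini", "itu"}


def clean(text):
    """Data cleaning: drop links, then every character that is not a letter, digit, or space."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def preprocess(text, stem=lambda word: word):
    """Apply cleaning, case folding, tokenizing, filtering, and stemming in that order.

    `stem` is an assumed hook; the default leaves each word unchanged.
    """
    cleaned = clean(text)                                    # a. data cleaning
    folded = cleaned.lower()                                 # b. case folding
    tokens = folded.split()                                  # c. tokenizing
    filtered = [t for t in tokens if t not in STOPWORDS]     # d. filtering
    return [stem(t) for t in filtered]                       # e. stemming


def preprocess_all(comments):
    """Preprocess every comment, skipping empty results and duplicates."""
    seen, result = set(), []
    for comment in comments:
        tokens = preprocess(comment)
        key = " ".join(tokens)
        if tokens and key not in seen:
            seen.add(key)
            result.append(tokens)
    return result


sample = preprocess("BTC naik!!! https://example.com yang ini")
```

Keeping the stemmer as a parameter means a language-specific stemmer can be supplied later without changing the rest of the pipeline.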

Translating to English
This process translates the text from Bahasa (Indonesian) to English so that the data can be labeled automatically with the VADER lexicon in the labeling process. The translation is done automatically using Google Translate.

Labelling
Labelling is the process of adding labels or tags to text to provide additional information about the text or to assign it to a category. At this stage, labeling is done automatically using the VADER lexicon.

Splitting Data
Splitting data refers to dividing a dataset into two or more subsets; this is typically done to evaluate the performance of a machine-learning model or to use the subsets for different purposes. This process uses a train-test split.

Word Embedding
This process represents each word in a document or text as a vector of numerical weights. This study uses GloVe with 100 dimensions, so each word is mapped to a 100-dimensional vector. GloVe vectors are derived from how often words occur together in text, so words that are used in similar contexts receive similar vectors.
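The sketch below illustrates one common way to turn pre-trained GloVe vectors into an embedding matrix. It assumes the usual GloVe plain-text format (a word followed by its vector components on each line); the three-dimensional toy vectors, the `word_index` mapping, and the function names are made up for illustration and stand in for the 100-dimensional vectors used in the study.

```python
def parse_glove_lines(lines):
    """Parse GloVe text format: each line is a word followed by its vector components."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector of the word whose index is i; row 0 is reserved for padding.

    Words that are missing from `vectors` keep an all-zero row.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix


# Toy 3-dimensional vectors; the real file would have 100 components per word.
toy_lines = ["price 0.1 0.2 0.3", "up 0.4 0.5 0.6"]
vectors = parse_glove_lines(toy_lines)
word_index = {"price": 1, "up": 2, "moon": 3}
matrix = build_embedding_matrix(word_index, vectors, dim=3)
```

A matrix built this way is typically used to initialize the embedding layer that feeds the network.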

Hybrid Model
The algorithms used to build the hybrid model are CNN and LSTM. The architecture of each algorithm is described below.
1. CNN uses a convolution process to filter and extract relevant features from the data and then classifies the data. CNN is very effective in image processing but can also be used for text by converting the text into a matrix. CNN also has a high degree of generalization, meaning it can handle new data well without retraining. The CNN architecture is presented in Figure 3. The convolutional layer converts the input into smaller and more structured features by performing a convolution between the input and the filter, which represents the desired features. The CNN layers are described by Formulas (1), (2), (3), and (4).
(c) Max Pooling
The pooling layer reduces the dimensions of the input and simplifies the existing information by selecting the maximum or average value from several related points in the input, using Formula (3) [18], where c is the maximum value of the feature map.
(d) Fully Connected Layer
Fully connected layers are commonly used in deep learning architectures. They are usually followed by one or more non-linear activation functions such as ReLU (Rectified Linear Unit) or sigmoid. They are useful for learning complex relationships between the input and the output and for making predictions based on those relationships.
The softmax layer is used to classify an input. This layer converts the previous layer's output into probabilities, one for each class, using Formula (4).
Here y is the output: it is nonzero when the value is greater than the threshold and 0 otherwise.
2. Long Short-Term Memory
LSTM can remember information from much earlier in a sequence, so it can understand the wider context of the data. LSTM can also control the information entering and leaving its long-term memory, so it can process data more accurately. The LSTM architecture is presented in Figure 4 [16]. The input gate controls the information that enters the LSTM unit, the forget gate controls the information that is discarded from the unit's memory, and the output gate controls the information that the LSTM unit outputs.
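The building blocks of the two algorithms can be illustrated with a small Python sketch. It uses the standard textbook definitions of a one-dimensional convolution with ReLU, max pooling, softmax, and a single LSTM time step. These are assumptions, not the paper's Formulas (1) to (4), and the scalar LSTM state and the `params` layout are simplifications chosen to keep the example short.

```python
import math


def conv1d_relu(sequence, kernel, bias=0.0):
    """Slide the kernel over the sequence (no padding) and apply ReLU to each sum."""
    k = len(kernel)
    out = []
    for i in range(len(sequence) - k + 1):
        s = sum(sequence[i + j] * kernel[j] for j in range(k)) + bias
        out.append(max(0.0, s))
    return out


def max_pool(feature_map):
    """Max pooling over the whole feature map: keep only the largest value."""
    return max(feature_map)


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM time step with scalar input and scalar state.

    `params` maps 'i', 'f', 'o', 'g' to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre("i"))      # input gate: how much new information enters
    f = sigmoid(pre("f"))      # forget gate: how much old memory is kept
    o = sigmoid(pre("o"))      # output gate: how much memory is exposed
    g = math.tanh(pre("g"))    # candidate memory content
    c = f * c_prev + i * g     # updated cell state
    h = o * math.tanh(c)       # updated hidden state
    return h, c


feature_map = conv1d_relu([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0])
pooled = max_pool(feature_map)
probabilities = softmax([2.0, 1.0, 0.1])
zero_params = {gate: (0.0, 0.0, 0.0) for gate in "ifog"}
h, c = lstm_step(1.0, 0.0, 2.0, zero_params)
```

With all weights set to zero, every gate equals 0.5, so the step simply halves the previous cell state; this makes the role of the forget gate easy to check by hand.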

CNN-LSTM
The hybrid CNN-LSTM combines two algorithms. CNN is used to extract features from the input data, while LSTM is used to process the input data sequentially. A hybrid CNN-LSTM can handle signal processing problems with spatial and temporal structure, such as language processing, image processing, and video processing. The model can retain the required information over a long period and ignore irrelevant information, which improves signal processing performance. The CNN-LSTM architecture is presented in Figure 5. The hybrid architecture consists of two main parts, CNN and LSTM. The CNN consists of several convolution layers that extract spatial features from the data. The results from the CNN are forwarded to the LSTM layer, which is a type of RNN. The LSTM layer processes sequential data by storing information from the previous timestep and using it to predict the next timestep. By combining the advantages of CNN and LSTM, the hybrid architecture handles sequential data with a spatial structure better than either type of network alone. The layers in the CNN-LSTM hybrid use the following parameters.

Convolutional layers
The number of parameters in this layer is determined by the filter shape. It is calculated using Formula (5).
$$\text{conv} = (\text{filter width} \times \text{filter height} \times \text{number of filters in the previous layer} + 1) \times \text{number of filters} \qquad (5)$$
The filter width and filter height are the dimensions of the filter used. Their product is multiplied by the number of filters in the previous layer, 1 is added, and the result is multiplied by the number of filters in this layer.

LSTM layer
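The number of parameters in this layer is determined by the size of its input and the number of units in the layer. It is calculated using Formula (6).
$$\text{lstm} = 4 \times ((\text{number of filters in the previous layer} + 1 + \text{number of units}) \times \text{number of units}) \qquad (6)$$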

Model Evaluation
Model evaluation is the process of assessing the ability of a model to solve a problem or handle data. This process is carried out using a confusion matrix, presented in Table 2. False Positive Neutral (FPosNet) counts data that is actually positive but is predicted as neutral. False Positive Negative (FPosNeg) counts data that is actually positive but is predicted as negative. False Negative Neutral (FNegNet) counts data that is actually negative but is predicted as neutral. False Negative Positive (FNegPos) counts data that is actually negative but is predicted as positive. False Neutral Positive (FNetPos) counts data that is actually neutral but is predicted as positive. False Neutral Negative (FNetNeg) counts data that is actually neutral but is predicted as negative.
Based on Table 2, the values of accuracy, precision, recall, and F1-score are calculated as follows.
1. Accuracy is the proportion of data that the system classifies correctly. It is calculated by dividing the number of correct predictions by the total number of predictions, using Formula (7).
2. Precision is the ratio of the number of correct predictions for a class to the total number of predictions of that class, using Formulas (8), (9), and (10).
$$\text{Positive precision} = \frac{TP}{TP + FNegPos + FNetPos} \qquad (8)$$
3. Recall is one of the indicators used to evaluate a classification model. It measures how well the model retrieves the data of a class out of all the data that actually belongs to that class, using Formulas (11), (12), and (13).
$$\text{Positive recall} = \frac{TP}{TP + FPosNeg + FPosNet} \qquad (11)$$
$$\text{Negative recall} = \frac{TNeg}{TNeg + FNegPos + FNegNet} \qquad (12)$$
4. F1-score is the harmonic mean of precision and recall (sensitivity), using Formula (14).
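The four metrics can be computed from a three-class confusion matrix as sketched below. The layout (rows are actual classes, columns are predicted classes) and the counts are assumptions made for illustration; they are not the paper's Table 2 or its results. Per-class precision divides the diagonal entry by its column sum, and per-class recall divides it by its row sum, which matches the positive-class formulas above.

```python
LABELS = ["positive", "negative", "neutral"]


def evaluate(matrix):
    """matrix[a][p] counts samples whose actual class is LABELS[a] and predicted class is LABELS[p]."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    report = {"accuracy": correct / total}
    for k, label in enumerate(LABELS):
        predicted_as_k = sum(matrix[a][k] for a in range(n))  # column sum
        actual_k = sum(matrix[k])                             # row sum
        precision = matrix[k][k] / predicted_as_k if predicted_as_k else 0.0
        recall = matrix[k][k] / actual_k if actual_k else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Made-up counts for illustration only.
example = [
    [8, 1, 1],  # actual positive
    [2, 6, 2],  # actual negative
    [0, 3, 7],  # actual neutral
]
report = evaluate(example)
```

Averaging the per-class values gives a single precision, recall, or F1 figure for the whole model; the source does not state which averaging scheme was used.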

Model Validation
Model validation is the process of evaluating the effectiveness of the model that has been created. If the results produced by the model follow the actual data, then the model is valid. However, if the results do not match the actual data, the model must be repaired or regenerated. Model validation is important to ensure that the model that has been created can be used correctly and effectively to solve the problem at hand.

RESULT AND ANALYSIS

Data Collection
The dataset was collected by scraping with the Instant Data Scraper extension. Opinions were taken from the chat room of the Indodax.com website, and only comments containing 'BTC' were taken. The data collection period was one month, from June 26 to July 27, 2022. After collection, the data was preprocessed and then translated into English so that it could be labeled automatically using the VADER lexicon. Labels fall into three classes: positive, neutral, and negative. A total of 2,890 comments were collected. The labeled data was divided into training data and test data four times, with training:test ratios of 60:40, 70:30, 80:20, and 90:10. The number of comments in each class is presented in Figure 6: 611 positive, 1856 neutral, and 423 negative.

Data Preprocessing
Data collected by scraping with the Instant Data Scraper extension is then cleaned to remove empty data, symbols, and links.
1. Data Cleaning
The result of data cleaning is presented in Table 3. This process ensures that the data to be used is clean, accurate, and suited to the needs of the analysis by checking for missing data, cleaning inappropriate data, and removing symbols and links.

Tokenizing
The tokenizing result is presented in Table 5; this process is intended for text analysis, as it helps in grouping and comparing words more easily.

Filtering
The filtering result is presented in Table 6; this process aims to eliminate irrelevant or useless data for the analysis to be carried out.

Stemming
The stemming result is presented in Table 7; this process reduces the words in a text to their basic forms in order to reduce the dimensions of the data.

Translating to English
The next step is to translate the text from Bahasa to English using Google Translate. The results of the Bahasa to English translation are presented in Table 8.

Labeling
Data that has been translated into English is then labeled automatically using the VADER lexicon under the following conditions. If the compound value is greater than or equal to 0.05, the sentiment is positive. If the compound value is less than or equal to -0.05, the sentiment is negative. Otherwise, the sentiment is neutral.
The sentiment assigned to each text is presented in Table 9.
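The labeling rule can be sketched as a small function, assuming the conventional VADER cut-offs of 0.05 and -0.05 on the compound score. The function name and the `threshold` parameter are illustrative; obtaining the compound score itself is left to a VADER implementation, as noted in the comment.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to one of three sentiment classes."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# In practice the compound score would come from a VADER analyzer, for example
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"] in NLTK.
# The scores below are made up to show the three outcomes.
labels = [label_from_compound(score) for score in (0.62, -0.31, 0.01)]
```

Scores that fall strictly between the two cut-offs are treated as neutral, which is consistent with neutral being the largest class in a chat room where many comments carry no clear opinion.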

Splitting Data
The data is divided four times, with ratios of 60:40, 70:30, 80:20, and 90:10 (as shown in Table 10). Table 10 presents the resulting training data and test data.
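A minimal sketch of the four splits is shown below, using a shuffled split from the Python standard library. The seed, the rounding rule, and the use of integers as stand-ins for the 2890 comments are assumptions, so the subset sizes may differ slightly from those in Table 10 depending on how the original tool rounds.

```python
import random


def split_data(samples, train_fraction, seed=42):
    """Shuffle a copy of the samples and cut it into training and test subsets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Integers stand in for the 2890 labeled comments.
comments = list(range(2890))
for train_fraction in (0.6, 0.7, 0.8, 0.9):
    train, test = split_data(comments, train_fraction)
    ratio = f"{round(train_fraction * 100)}:{round((1 - train_fraction) * 100)}"
    print(f"{ratio} -> {len(train)} training, {len(test)} test")
```

Because the classes are imbalanced, a stratified split that preserves the class proportions in both subsets is a common alternative; the source does not say whether stratification was used.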

Hybrid Model
The hybrid CNN-LSTM uses the parallel model presented in Figure 7. Based on Figure 7, the hybrid uses filters of dimension 3 * 1. The first and second convolutional layers each have 32 filters, and the LSTM layer has 100 units. To test the hybrid model, its parameters need to be checked. The parameters used in this study are presented in Figure 8, which shows that this study uses two algorithms, CNN and LSTM. The layers of each algorithm are explained as follows.
Conv1d (Conv1D) is the first convolution layer of the CNN. Its parameter count is obtained by multiplying the filter width and height, 3 * 1, by the number of filters in the previous layer, which is 100, adding 1, and then multiplying by the number of filters in this layer, which is 32, giving 9632.
$$\text{Parameter}_{\text{conv1d}} = (3 \times 1 \times 100 + 1) \times 32 = 9632$$
Conv1d_1 (Conv1D) is the second convolution layer of the CNN. Its parameter count is obtained by multiplying the filter width and height, 3 * 1, by the number of filters in the previous layer, which is 32, adding 1, and then multiplying by the number of filters in this layer, which is 32, giving 3104.
$$\text{Parameter}_{\text{conv1d\_1}} = (3 \times 1 \times 32 + 1) \times 32 = 3104$$
Lstm is the layer of the LSTM algorithm. Its parameter count is obtained by adding the number of filters in the previous layer, which is 32, 1, and the number of units in this layer, which is 100, multiplying the sum by the number of units in this layer, which is 100, and then multiplying by 4, for the four sets of weights in an LSTM unit, giving 53200.
$$\text{Parameter}_{\text{lstm}} = 4 \times ((32 + 1 + 100) \times 100) = 53200$$
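The three parameter counts can be checked with a few lines of Python. The function and argument names are illustrative; the arithmetic follows the calculations described above.

```python
def conv_params(filter_width, filter_height, previous_filters, filters):
    """Weights per filter (width * height * previous filters) plus one bias, for every filter."""
    return (filter_width * filter_height * previous_filters + 1) * filters


def lstm_params(input_size, units):
    """Four weight sets, each with input weights, recurrent weights, and one bias per unit."""
    return 4 * ((input_size + 1 + units) * units)


conv1d = conv_params(3, 1, 100, 32)    # first convolution layer
conv1d_1 = conv_params(3, 1, 32, 32)   # second convolution layer
lstm = lstm_params(32, 100)            # LSTM layer with 100 units

print(conv1d, conv1d_1, lstm)  # prints: 9632 3104 53200
```

The value 100 passed to the first convolution layer matches the 100-dimensional word embeddings that feed it.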

Model Evaluation
Model evaluation assesses how well the built model predicts or completes the specified task. Common metrics include accuracy, precision, recall, and F1-score. Choosing the right metric helps evaluate a model's performance and decide which model is most suitable in a given situation. Data splitting was carried out four times, and the results were compared to find the best model performance. The comparison of the data splits, derived from the confusion matrix, is presented in Table 11. Based on Table 11, the 90:10 data split with 100 epochs gives the best accuracy. The test results with 100 epochs on the 90:10 split are presented in Table 12. With 100 epochs, the model reaches an accuracy of 88%, the highest value found for the 90:10 split. Model accuracy and model loss are presented in Figure 9. Based on Figure 9, the training accuracy is higher than the validation accuracy, indicating that the model tends to overfit. Overfitting occurs when the model is too complex and too focused on the training data, so that it cannot generalize well to new data that it did not see during training. As a result, even though the training accuracy is high, the performance on validation data is lower than expected, and it can be much lower. In this case, the model may have "memorized" the training data and be unable to identify the more general patterns behind the data. This can be addressed in various ways, such as using regularization techniques, expanding the dataset, or reducing model complexity. The confusion matrix for the 90:10 data split is presented in Figure 10.
Based on Figure 10, the model obtained an accuracy of 88%, precision of 86%, recall of 78%, and F1-score of 81%. The performance of the combined CNN and LSTM architecture in this study was compared with several previous studies on three-class text classification. This study aimed to combine the CNN and LSTM algorithms as a text classifier with the additional feature of 100-dimensional GloVe word embeddings. The performance evaluation of training and testing showed good results, which indicates that the hybrid LSTM and CNN model is a good method for sequential text classification. Comparison results with previous research are shown in Table 13.
[18] 69.97% SVM [19] 82% RNN [20] 64.48% CNN [21] 86% LSTM [22] 81% CNN & LSTM + GloVe 88%
ISSN: 2476-9843
Table 13 shows that the hybrid LSTM and CNN model with GloVe word embeddings gives the best result in the comparison, because CNN can perform feature extraction at the word level and create a vector representation of each word. However, CNN cannot take the order of words in a sentence into account. In contrast, LSTM can remember word order and understand the context and relationships between words in a sentence. By combining the two models, the features from CNN are obtained while the ability of LSTM to attend to word order is retained.

Model Validation
The ROC curve is used to evaluate model performance by comparing the model's ability to predict the positive and negative classes. Table 14 lists the categories of model performance values. This study obtained a model validation value of 92.77%, presented in Figure 11. Based on the categories in Table 14, this value falls in the range 0.9-1.0, which places the model in the excellent category.
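Assuming the reported validation value is an area under the ROC curve (AUC), the sketch below shows how such a value can be computed for one class against the rest. The scores and labels are made up, and the source does not state how the curve was built for three classes, so this is an illustration of the measure rather than the paper's procedure.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of (positive, negative) pairs ranked correctly.

    `labels` holds 1 for the class of interest and 0 for every other class;
    tied scores count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predicted probabilities for one class and the matching one-vs-rest labels.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

For a three-class model, one option is to compute this value once per class, treating that class as 1 and the others as 0, and then average the three results.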

CONCLUSION
This study shows that the Convolutional Neural Network and Long Short-Term Memory algorithms can be combined, in contrast to previous studies that used only one deep learning algorithm. The best result was obtained with a 90:10 data split and 100 epochs, with 88% accuracy, 86% precision, 78% recall, and 81% F1-score. This study contributes a parallel hybrid model and uses direct data on investor opinion from Indodax, an Indonesian digital currency exchange, to conduct a sentiment analysis of bitcoin prices so that market conditions can be determined. The results are dominated by neutral comments, but positive comments outnumber negative comments. Therefore, investors should not sell and buy bitcoins. Furthermore, because market conditions change rapidly, future researchers are advised to update the data so that decisions can be made for the situation at hand.

DECLARATIONS
AUTHOR CONTRIBUTION The first and second authors carried out this work, with the main contribution from the first author. The third and fourth authors assisted in data analysis.
FUNDING STATEMENT This research is independently funded because it allows researchers to have the freedom to carry out research and pursue more specific goals without depending on sponsors or other parties.
COMPETING INTEREST This research is based on the assumption that no conflict of interest can affect the results and conclusions of the research.