Predicting Stock Markets Using Binary Logistic Regression Based on Bry-Boschan Algorithm

In the stock market, the terms bullish and bearish describe market conditions that are reflected in the movement of the stock price index. One of the stock price indices listed on the Indonesia Stock Exchange (IDX) is the IDX Composite. Stock market conditions fluctuate along with stock prices, which move randomly, while investors expect an active (bullish) market. Several factors influence the movement of the IDX Composite, among them macroeconomic factors. The purpose of this research is to identify the condition of the stock market and to predict it using macroeconomic indicators. Stock market conditions (bullish or bearish) are determined with the Bry-Boschan algorithm, which is widely used to detect peaks and troughs in business cycle analysis. The conditions are then predicted from macroeconomic indicators with binary logistic regression, which models a response that has two categories. Results show that the IDX Composite experienced 42 bearish months and 191 bullish months. The obtained model has an accuracy of 81.55%.


A. INTRODUCTION
The stock market is commonly described as bullish or bearish. Active market conditions (a bullish market) occur when prices increase, which raises transaction volume. Passive market conditions (a bearish market) occur when prices decrease and transaction volume falls with them (Husnan, 2018). The ups and downs in stock prices, as shown by the movement of the stock price index, indicate bullish and bearish market conditions (Novianto, 2011; Safitri, 2021). The IDX Composite, also known as the Indonesia Composite Index, is one of the stock price indices listed on the Indonesia Stock Exchange (IDX) (Sampurna et al., 2016). All equities listed on the Indonesia Stock Exchange are included in the IDX Composite (Syarina, 2020).
Investors need access to pertinent information on the state of the capital market in order to make sound investment choices. Stock market conditions are identified to determine whether the market is active (bullish) or passive (bearish). These conditions can be identified with the Bry-Boschan algorithm, an approach widely employed to identify the phases of business and financial cycles (Wu and Lee, 2015). Tüzen et al. (2022) examined the fundamental characteristics of the cyclical fluctuations in the Turkish economy and determined the business cycle (contraction and expansion) using the Bry-Boschan (BB) algorithm. Kaur (2020) aimed to identify the monetary sector's leading indicators for the Indian economy. Luvsannyam et al. (2019) compared the Mongolian business cycle using graphical and parametric methodologies.
RENANTA DZAKIYA NAFALANA | JURNAL VARIAN | e-ISSN: 2581-2017
Stock market conditions fluctuate along with changes in stock prices that move randomly, while investors expect market conditions to be active or bullish (Barus and Wijaya, 2021; Sasono, 2022; Triani, 2013). Prediction is used to anticipate the condition of the stock market, so that investors can decide whether to increase or decrease their portfolio allocation in the stock market (Widoatmodjo, 2012). Several factors influence the movement of the IDX Composite, among them macroeconomic factors (Blanchard, 2017). Macroeconomic factors that affect stock performance include international economic conditions, a country's economic cycle, inflation rates, tax regulations, money supply, exchange rates, and interest rates for Bank Indonesia certificates (Krisna and Wirawati, 2013; Samsul, 2015). Several methods can be used to predict stock market conditions (bearish or bullish) for the IDX Composite, one of which is binary logistic regression. Binary logistic regression is a form of regression used to model the relationship between a response variable and predictor variables, where the response variable is binary (Dewi and Pratiwi, 2021). For example, Ali et al. (2018) forecast stock performance using a logistic regression model. Their findings demonstrated that stock performance is well predicted by financial and accounting parameters, with a prediction accuracy of 89.77% for positive or negative stock performance.
Accordingly, this research aims not only to identify stock market conditions (bearish or bullish) with the Bry-Boschan algorithm but also to predict them with the binary logistic regression method using four macroeconomic indicators: inflation, the BI interest rate, the rupiah exchange rate against foreign currencies (especially the US dollar), and the amount of circulated money.

B. RESEARCH METHOD
This research uses secondary monthly data from January 2003 to May 2022. The data are sourced from Yahoo Finance, the Central Bureau of Statistics (BPS) website, and the Bank Indonesia website.
This research uses one dependent (response) variable, the IDX Composite, and four independent (predictor) variables, namely inflation, the BI rate, the US dollar exchange rate, and the amount of circulated money. The methods used are the Bry-Boschan algorithm and binary logistic regression. The flowchart of the research is presented in Figure 1.
A turning point is called a trough if:
$$y_{t-k} > y_t \quad \text{and} \quad y_t < y_{t+k} \qquad (2)$$
with $k = 5$, where $k$ is the minimum duration of an up or downtrend (Tüzen et al., 2022).
(b) Turning points delimit a bearish or bullish phase if they alternate: a peak is followed by a trough, and a trough is followed by a peak. The following rules apply:
• A peak-to-trough phase has a minimum duration of 5 months.
• A trough-to-peak phase has a minimum duration of 5 months.
• A peak-to-peak cycle has a minimum duration of 15 months.
• A trough-to-trough cycle has a minimum duration of 15 months.
• If two or more peaks occur in succession, the highest peak is selected.
• If two or more troughs occur in succession, the lowest trough is selected.
• If two or more turning points have the same value, the last point is designated as the potential turning point.
• Turning points that occur within 6 months of the beginning or end of the data series are not considered potential turning points.
5. Add the macroeconomic variables, namely inflation, BI interest rate, US dollar exchange rate, and the amount of money in circulation (M2).
6. Conduct a descriptive analysis of the macroeconomic variables (inflation, BI interest rate, US dollar exchange rate, money supply) to obtain a general picture of the data.
7. Standardize the data with the formula
$$Z = \frac{x - \bar{x}}{sd}$$
with $Z$: Z-score, $x$: data value, $\bar{x}$: mean of the data, $sd$: standard deviation.
8. Conduct modeling with binary logistic regression analysis.
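In binary logistic regression, the significance of the model parameters can be tested with the likelihood ratio test and the Wald test.
The partial test uses the hypotheses $H_0: \beta_j = 0$, $j = 1, 2, \ldots, p$ (the $j$-th predictor variable has no significant effect on the response variable). The Wald test statistic $W^2$ follows the chi-square distribution, so $H_0$ is rejected if $W^2 > \chi^2_{(\alpha, v)}$.
For the odds ratio (OR), it is known that
$$\pi(1) = \frac{e^{\beta_0 + \beta_j}}{1 + e^{\beta_0 + \beta_j}} \quad \text{and} \quad \pi(0) = \frac{e^{\beta_0}}{1 + e^{\beta_0}}$$
with $j = 1, 2, \ldots, p$. Dividing the odds $\pi(1) / (1 - \pi(1))$ by the odds $\pi(0) / (1 - \pi(0))$ gives the OR value in Equation 9:
$$OR = e^{\beta_j} \qquad (9)$$
The suitability of the model is tested with the Hosmer-Lemeshow statistic
$$C = \sum_{k=1}^{g} \frac{(O_k - n_k \bar{\pi}_k)^2}{n_k \bar{\pi}_k (1 - \bar{\pi}_k)}$$
with $g$: number of groups, $O_k$: sum of the response variable values in the $k$-th group, $\bar{\pi}_k$: mean estimated probability in the $k$-th group, $n_k$: number of observations in the $k$-th group. The hypotheses used are $H_0: \pi_k = \pi_0$, $k = 1, 2, \ldots, g$ (the model is suitable, or there is no difference between the observed results and the predicted results). Reject $H_0$ if the test statistic $C > \chi^2_{(\alpha, g-2)}$ or if the p-value $< \alpha$.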

(e) Confusion Matrix
A confusion matrix is a measuring tool in matrix form that is used to assess the classification accuracy of a classification algorithm (Witten et al., 2017). A response variable with two classes has four possible classification outcomes, namely true positive (TP), true negative (TN), false positive (FP), and false negative (FN). Prediction performance is assessed by calculating the accuracy, precision, and specificity. The confusion matrix also shows the error rate of the classification algorithm used. The following formulas may be used to determine the accuracy, precision, specificity, and error rate values:
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
$$Precision = \frac{TP}{TP + FP}$$
$$Specificity = \frac{TN}{TN + FP} \qquad (13)$$
$$Error\ rate = \frac{FP + FN}{TP + TN + FP + FN}$$
10. Interpret the results.
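The standardization and modeling steps of the procedure above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy values, and the coefficients are hypothetical, and the sample standard deviation (divisor n - 1) is an assumption, since the text does not say which form of the standard deviation is used.

```python
import math
from statistics import mean, stdev


def standardize(values):
    """Z-score each value: (x - mean) / sd.

    Assumption: sd is the sample standard deviation (divisor n - 1).
    """
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]


def logistic_probability(intercept, coefs, z_row):
    """Probability pi = e^eta / (1 + e^eta), with eta = b0 + sum(b_j * z_j)."""
    eta = intercept + sum(b * z for b, z in zip(coefs, z_row))
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical raw values of one indicator (not data from the study).
raw = [2.0, 4.0, 6.0]
z_scores = standardize(raw)

# Hypothetical intercept and coefficient, evaluated at the mean (z = 0).
p_at_mean = logistic_probability(0.0, [1.0], [z_scores[1]])
print(z_scores, p_at_mean)
```

Standardizing first puts all four indicators on the same scale, so each coefficient refers to a change of one standard deviation in its indicator.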

C. RESULTS AND DISCUSSION
1. Identify Stock Market Conditions (Bearish or Bullish) IDX Composite with Bry-Boschan Algorithm
Before identifying the stock market conditions of the IDX Composite, a descriptive analysis was performed to find out the general picture of the data. As   Table 4.
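The turning-point rule from the method section can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the full Bry-Boschan procedure: it assumes the inequality is checked for every lag from 1 to k, it applies only the alternation and tie rules, and it omits the minimum phase, minimum cycle, and end-of-series rules. All function names and the toy series are hypothetical.

```python
def find_turning_points(y, k=5):
    """Return (peaks, troughs) as index lists.

    Assumption: the inequality is checked for every lag 1..k, so y[t] must be
    a strict maximum (peak) or strict minimum (trough) within +/- k months.
    """
    y = list(y)
    peaks, troughs = [], []
    for t in range(k, len(y) - k):
        neighbours = y[t - k:t] + y[t + 1:t + k + 1]
        if all(y[t] > v for v in neighbours):
            peaks.append(t)
        elif all(y[t] < v for v in neighbours):
            troughs.append(t)
    return peaks, troughs


def enforce_alternation(peaks, troughs, y):
    """Keep the highest of consecutive peaks and the lowest of consecutive troughs."""
    points = sorted([(t, "peak") for t in peaks] + [(t, "trough") for t in troughs])
    kept = []
    for t, kind in points:
        if kept and kept[-1][1] == kind:
            prev = kept[-1][0]
            # ">=" / "<=" keeps the later point when two values tie.
            better = y[t] >= y[prev] if kind == "peak" else y[t] <= y[prev]
            if better:
                kept[-1] = (t, kind)
        else:
            kept.append((t, kind))
    return kept


def label_phases(n, turning_points):
    """Label months after a peak up to the next trough as bearish, the rest as bullish."""
    labels = ["bullish"] * n
    for (t0, kind0), (t1, kind1) in zip(turning_points, turning_points[1:]):
        if kind0 == "peak" and kind1 == "trough":
            for t in range(t0 + 1, t1 + 1):
                labels[t] = "bearish"
    return labels


# Toy monthly index (not IDX data): rises for 10 months, falls for 10, rises again.
series = list(range(0, 11)) + list(range(9, -1, -1)) + list(range(1, 11))
peaks, troughs = find_turning_points(series, k=5)
turning_points = enforce_alternation(peaks, troughs, series)
labels = label_phases(len(series), turning_points)
print(turning_points)
print(labels.count("bearish"), "bearish months")
```

The resulting bearish and bullish labels are the kind of binary response that the logistic regression in the next subsection takes as input.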

Prediction of Stock Market Conditions (Bearish or Bullish) IDX Composite Using Macroeconomic Indicators with the Binary Logistic Regression Method
Once the stock market conditions (bearish or bullish) of the IDX Composite have been identified, the next step is to predict them using macroeconomic indicators. This research uses four macroeconomic indicators, namely inflation, the BI rate, the US dollar exchange rate, and the amount of circulated money (M2).
Before starting the analysis using the binary logistic regression method, a descriptive analysis was carried out for each macroeconomic indicator. The results of the descriptive analysis on each macroeconomic indicator (inflation, BI interest rate, US dollar exchange rate, and the amount of circulated money) are presented in Figure 4.

Modeling with Binary Logistic Regression
Before modeling with binary logistic regression, the macroeconomic variables (inflation, BI rate, dollar exchange rate, and the amount of circulated money) are standardized because they have different units. Binary logistic regression modeling was then carried out to obtain the probability model.
The overall test is used to determine whether the independent variables jointly affect the dependent variable, and therefore whether the obtained model is feasible to use. Based on Table 4, the G value of 41.8757923 exceeds the critical value $\chi^2_{(0.05;4)} = 9.487729$. Thus, at the 95% confidence level, the available data reject $H_0$. It can be concluded that at least one independent (predictor) variable has a significant effect on the response variable.
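The decision in the overall test can be checked with a short Python sketch. The G value and the critical value are taken from the text; the helper `chi2_sf_even_df` is a hypothetical name, and it relies on the closed-form upper tail of the chi-square distribution that holds only for even degrees of freedom, which is sufficient here because the test has 4 degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with even df.

    Closed form: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


G = 41.8757923        # likelihood ratio statistic reported in the text
critical = 9.487729   # chi-square critical value (alpha = 0.05, df = 4) from the text

reject_h0 = G > critical
p_value = chi2_sf_even_df(G, 4)
tail_at_critical = chi2_sf_even_df(critical, 4)  # should be close to 0.05
print(reject_h0, p_value, tail_at_critical)
```

Comparing G with the critical value and comparing the p-value with 0.05 are equivalent decision rules, so both lead to rejecting the null hypothesis here.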

Partial Test
The partial test is performed to assess whether each independent variable individually affects the dependent variable. Based on Table 5, the Wald test statistic exceeds the chi-square critical value, $W^2 > \chi^2_{(\alpha, v)}$. Thus, at the 95% confidence level, the available data reject $H_0$. This means that the predictor variables (inflation, BI rate, dollar exchange rate, and the amount of circulated money) have a significant effect on the response variable (IDX Composite).

Odds Ratio
The odds ratio is used to simplify the interpretation of the binary logistic regression model. The overall and partial parameter significance tests show that the predictor variables that significantly affect the response variable are inflation, the BI rate, the dollar exchange rate, and the amount of circulated money. Because the predictors are standardized, a one-unit increase corresponds to an increase of one standard deviation. Based on the odds ratios in Table 6, the binary logistic regression model can be interpreted as follows:
• Each one-unit increase in inflation multiplies the odds of a bullish condition of the IDX Composite by 0.1761364, that is, the odds decrease.
• Each one-unit increase in the BI interest rate multiplies the odds of a bullish condition of the IDX Composite by 5.3024533.
• Each one-unit increase in the dollar exchange rate multiplies the odds of a bullish condition of the IDX Composite by 0.1243685, that is, the odds decrease.
• Each one-unit increase in the money supply multiplies the odds of a bullish condition of the IDX Composite by 4.1342299.
Based on Table 7, the C value of 13.682 is below the critical value $\chi^2_{(0.05;8)} = 15.50731$. Thus, at the 95% confidence level, the available data fail to reject $H_0$. This means that the model is appropriate (there is no difference between the observed results and the predicted results).
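As a small worked illustration of the relation OR = e^β, the reported odds ratios can be converted back to coefficients and to percentage changes in the odds. The odds ratios below are the values quoted in the text; the dictionary keys and the helper name `describe` are hypothetical, and the sketch says nothing about the intercept, which the text does not report.

```python
import math

# Odds ratios as quoted in the text (Table 6); the key names are hypothetical.
odds_ratios = {
    "inflation": 0.1761364,
    "bi_rate": 5.3024533,
    "usd_exchange_rate": 0.1243685,
    "money_supply_m2": 4.1342299,
}


def describe(odds_ratio):
    """Return (beta, percent change in odds, direction) for one odds ratio."""
    beta = math.log(odds_ratio)              # since OR = exp(beta)
    pct_change = (odds_ratio - 1.0) * 100.0  # change in odds per one-unit increase
    direction = "raises" if odds_ratio > 1.0 else "lowers"
    return beta, pct_change, direction


summary = {name: describe(value) for name, value in odds_ratios.items()}
for name, (beta, pct_change, direction) in summary.items():
    print(f"{name}: beta = {beta:.4f}, {direction} the odds by {abs(pct_change):.1f}%")
```

An odds ratio below 1 corresponds to a negative coefficient, and an odds ratio above 1 to a positive one.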

Confusion Matrix
The prediction results for the stock market conditions (bearish or bullish) of the IDX Composite with binary logistic regression are presented in Table 8. The accuracy is 81.55%, the precision is 70%, and the specificity is 98.39%. In addition, the error rate is 18.45%. Based on the accuracy value, the approach performs well both in identifying stock market conditions (bearish or bullish) with the Bry-Boschan algorithm and in predicting them using macroeconomic indicators.
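The four measures can be computed from confusion matrix counts with a few lines of Python. The counts below are an assumption: they are illustrative values chosen so that, for 233 observations, the formulas reproduce the reported percentages, and they are not taken from Table 8. The function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, specificity, and error rate from confusion matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "error_rate": 1.0 - accuracy,
    }


# Illustrative counts (assumption, not the contents of Table 8); they sum to 233.
metrics = classification_metrics(tp=7, tn=183, fp=3, fn=40)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.2f}%")
```

With these counts, precision is computed from only 10 positive predictions, which shows why accuracy alone gives an incomplete picture when the two classes are unbalanced.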

D. CONCLUSION AND SUGGESTION
This research was conducted in two stages. The first identifies stock market conditions (bullish or bearish). The second predicts stock market conditions with macroeconomic indicators. Based on the identification of stock market conditions on the IDX Composite using the Bry-Boschan algorithm, there were 5 bearish periods and 6 bullish periods from January 2003 to May 2022. The bearish trend that started in December 2007 lasted until November 2008, a duration of 11 months. The average duration of a bearish period is 8.4 months, and the average duration of a bullish period is 26.8 months.
Based on the prediction of the IDX Composite stock market conditions using macroeconomic indicators (inflation, BI rate, US dollar exchange rate, and the amount of circulated money), the results show that, of the 233 monthly observations from January 2003 to May 2022, the IDX Composite was in a bearish trend in 47 months and in a bullish trend in 186 months. The model achieves an accuracy of 81.55%, a precision of 70%, and a specificity of 98.39%. In addition, the error rate is 18.45%.