Mask Compliance Modeling Related COVID-19 in Indonesia Using Spline Nonparametric Regression

Coronavirus disease (COVID-19) remains a concern for Indonesia because of its significant spread and its impact on various sectors of life, which hampers progress toward the Sustainable Development Goals (SDGs). The SDG targets, such as reducing poverty and hunger, are very difficult to realize under current pandemic conditions. The uncertainty of the pandemic means the government needs new input when creating policies to encourage sustainable development in this situation. This article models the effect of second-dose vaccination coverage and the total number of COVID-19 cases, which are often considered reasons for general negligence in complying with health protocols, on mask wearing in particular. The research uses spline nonparametric regression because of its flexibility in handling uncertain data patterns. The result is a truncated spline nonparametric regression with 3 knots that produces an R-sq of 69.952%. Based on the results, second-dose vaccination coverage and total COVID-19 cases together affect mask compliance. This result is expected to serve as a benchmark for the government in handling COVID-19 and in efforts to achieve the SDGs.

infected with COVID-19 (Ministry of Health, 2021a). In July 2021, The Central Bureau of Statistics published survey results stating that some people hesitated to get vaccinated: 4.2% refused and 15.8% were still unsure. The reasons behind refusal and doubt are diverse, such as feeling that vaccination is unnecessary because health protocols are sufficient, lacking confidence in vaccine safety, doubting vaccine effectiveness, and fearing side effects (Central Bureau of Statistics, 2020). In addition to providing complete doses of primary vaccines for all citizens, the government also organizes booster vaccinations. Study results show a decrease in antibodies six months after the complete primary dose of COVID-19 vaccination, so a booster vaccine is necessary. Booster doses increase personal protection, especially in vulnerable groups (Ministry of State Apparatus Utilization and Bureaucratic Reform, 2022).
The people of Indonesia do not fully implement health protocols. The Central Bureau of Statistics survey in September 2020 showed that lack of awareness was the reason most often given for not implementing health protocols. According to the survey, 39% of the population did not implement the health protocols because no cases of COVID-19 had emerged (Central Bureau of Statistics, 2020). These results indicate that it is essential to know whether the total number of COVID-19 cases affects people's adherence to mask wearing. In addition, detik.com reported that compliance with health protocols had decreased because many people felt immune after being vaccinated. A National Disaster Management Agency broadcast also stated that compliance with health protocols had indeed declined, based on its behaviour change monitoring (Dwianto, A, 2021).
This study aims to model the effect of second-dose vaccination achievement and the total number of COVID-19 cases on public compliance in using masks, as a benchmark for government policy, based on nonparametric regression. Nonparametric regression analysis is flexible because it is not limited by the assumptions that must be met in parametric regression. One such method is the spline, a continuous piecewise (segmented) polynomial, which has the advantage of handling data patterns with sharp rises and falls with the help of knot points (Pratiwi, 2017). The resulting curve is relatively smooth. The best spline regression model depends on the optimal knot points. The criteria often used to find the optimal knot points are Generalized Cross-Validation (GCV), Mean Squared Error (MSE), and R-sq ($R^2$) (Sanusi et al., 2017). The optimal knot points are those with the minimum GCV and MSE values and the maximum R-sq.
This research positions itself against the state of the art by applying spline nonparametric regression to COVID-19 data. Regression modelling related to COVID-19 has been carried out by Ogundokun et al. (2020) using linear regression and by Almalki et al. (2022) using linear regression with a spatial approach. Existing studies do not use a nonparametric regression approach, in particular the spline estimator, and none has investigated compliance with mask use in Indonesia. The novelty of this study is therefore modelling the effect of second-dose vaccination achievement and the total number of COVID-19 cases on adherence to wearing masks using nonparametric spline regression, with results that can serve as a benchmark for the government in setting policies for sustainable development related to COVID-19 prevention. The urgency of this research is that the unpredictable and unfinished COVID-19 pandemic greatly affects daily life, so new insights are needed as a benchmark for future government policies in handling COVID-19, especially considering that variants of the COVID-19 virus continue to emerge and even create new spikes, such as the Omicron variant, which has developed since the end of December 2021 (Ministry of Health, 2021b).

B. LITERATURE REVIEW
1. Mask Use Compliance
Public understanding of compliance with mask wearing and social distancing is important for controlling the spread of disease in the absence of a vaccine or if a large proportion of the population refuses vaccination. Various studies modelling the spread of COVID-19 support the importance of wearing masks and maintaining physical distance from others (Cohen et al., 2022). One modelling study suggested that 80% compliance with wearing a mask would reduce deaths from COVID-19 by up to 45% (Eikenberry et al., 2020), and it has been suggested that masks can reduce inoculum size, leading to less severe infections (Gandhi and Rutherford, 2020).
Based on data from The Central Bureau of Statistics (Central Bureau of Statistics, 2021), mask compliance in Indonesia is quite high. The data show that 88.6% of the community adhered to using masks, 9.1% rarely used masks, and 2.3% ignored masks. In addition, according to The Task Force for Handling COVID-19 in Indonesia (Indonesian Task Force for Handling COVID-19, 2021a), most regions in Indonesia have a high level of mask compliance. There are 190 districts or cities with a mask compliance rate in the range of 91-100%. Nevertheless, 61 districts or cities in Indonesia have a low mask compliance rate of 75%.
The need to wear masks is likely to continue along with the entry of the Omicron variant in Indonesia. The first confirmed  COVID-19, 2021b). The use of masks in public places is most effective in stopping the spread of the virus when compliance is high (Howard et al., 2021). The use of masks for infected individuals without symptoms can potentially reduce the risk of infecting others when the individual wears a mask to protect himself (Li et al., 2020). Thus, the use of masks is expected to reduce cases of COVID-19 infection.

Vaccination of COVID-19
Vaccination aims to stimulate a person's immune response to the COVID-19 virus so that the body can fight infection. Vaccines are required to reduce COVID-19 related morbidity and mortality, and multiple platforms have been involved in the rapid development of vaccine candidates (Baden et al., 2020). Immunity against COVID-19 does not form instantly after vaccination, so the health protocols launched by the government must still be implemented to provide maximum protection against COVID-19 (Ministry of Health, 2021a). The Indonesian government, through the Minister of Health, stated that it had distributed 1.2 million doses of COVID-19 vaccine to 34 provinces in Indonesia as of 7 January 2021, while vaccination was planned to begin in the second week of January 2021, after the issuance of an Emergency Use Authorization by the Food and Drug Supervisory Agency (known as BPOM in Indonesia).

Spline Nonparametric Regression
Regression analysis is a statistical method used to investigate and model the relationship between variables (Montgomery et al., 2021). Regression analysis has become one of the most widely used statistical tools for analyzing multivariable data, which provides a simple conceptual method for investigating functional relationships between variables (Oyeyemi et al., 2015).
Regression analysis relies on several assumptions, and it is essential to know the type of relationship between the dependent and independent variables (Garba et al., 2021). Three approaches can be used to estimate the regression curve: parametric, nonparametric, and semiparametric. According to Erilli and Alakus (2014), parametric regression relies on several assumptions. When those assumptions are not met, the resulting estimates are poor, so nonparametric regression models can be used to obtain better estimates.
Many nonparametric regression approaches have been developed, including using splines. Spline regression is an alternative to polynomial regression, also called segmented regression, because the method focuses on fitting a set of models to various segments of the relationship between the response variable and the predictor variable (Darlington and Hayes, 2017). The spline model has high flexibility with an excellent ability to handle data with behaviour changes at certain sub-intervals (Sohibien et al., 2022). According to Sohibien (Sohibien et al., 2022), one of the advantages of the spline approach is that this model seeks its estimation by following the movement of data patterns.
In the nonparametric regression method, the relationship between the paired data $(x_{1i}, x_{2i}, \ldots, x_{pi}, y_i)$, $i = 1, 2, \ldots, n$, can be expressed by Equation 1:

$$y_i = f(x_{1i}, x_{2i}, \ldots, x_{pi}) + \varepsilon_i, \quad i = 1, 2, \ldots, n \tag{1}$$

where $\varepsilon_i$ is a random error assumed to be independent, identically and normally distributed with $E(\varepsilon_i) = 0$ and $\mathrm{var}(\varepsilon_i) = \sigma^2$. If the regression curve is an additive model, it can be written as Equation 2:

$$y_i = \sum_{j=1}^{p} f_j(x_{ji}) + \varepsilon_i \tag{2}$$

If the regression curve $f_j(x_{ji})$ is assumed to lie in a spline space of order $m$ with knot points $K_{1j}, K_{2j}, \ldots, K_{rj}$, $j = 1, 2, \ldots, p$, the general equation for the univariate spline nonparametric regression model is Equation 3:

$$f_j(x_{ji}) = \sum_{h=0}^{m} \beta_{hj}\, x_{ji}^{h} + \sum_{u=1}^{r} \beta_{(m+u)j}\, (x_{ji} - K_{uj})_+^{m} \tag{3}$$

so that the multivariate nonparametric regression model is obtained as Equation 4:

$$y_i = \sum_{j=1}^{p} \left( \sum_{h=0}^{m} \beta_{hj}\, x_{ji}^{h} + \sum_{u=1}^{r} \beta_{(m+u)j}\, (x_{ji} - K_{uj})_+^{m} \right) + \varepsilon_i \tag{4}$$

where $K_{uj}$ denotes a knot point, $u = 1, 2, \ldots, r$, and $(x_{ji} - K_{uj})_+^{m}$ is a truncated function, which can be written as Equation 5:

$$(x_{ji} - K_{uj})_+^{m} = \begin{cases} (x_{ji} - K_{uj})^{m}, & x_{ji} \geq K_{uj} \\ 0, & x_{ji} < K_{uj} \end{cases} \tag{5}$$

For $m = 1$, $2$, and $3$ we obtain the linear, quadratic, and cubic spline, respectively. The knot point is a joint point where the behaviour of the data changes. According to Sohibien et al. (2022), the knot point is the point at which the function changes its pattern across different sub-intervals. The criteria often used to find the optimal knot points are Generalized Cross-Validation (GCV) and R-sq ($R^2$).
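The truncated spline terms described above can be illustrated with a short Python sketch. It builds one row of the design matrix for an additive model, assuming a single shared intercept (separate intercepts per predictor cannot be estimated separately). The function names and the example values are hypothetical and are not taken from the study.

```python
def truncated_power(x, knot, degree=1):
    """Truncated power function: (x - knot)^degree if x >= knot, else 0."""
    return (x - knot) ** degree if x >= knot else 0.0


def spline_basis_row(x, knots, degree=1):
    """Basis terms for one predictor: x, ..., x^degree, then one truncated term per knot."""
    powers = [x ** h for h in range(1, degree + 1)]
    truncated = [truncated_power(x, k, degree) for k in knots]
    return powers + truncated


def design_row(xs, knots_per_predictor, degree=1):
    """Design-matrix row for an additive model: shared intercept plus each predictor's basis."""
    row = [1.0]
    for x, knots in zip(xs, knots_per_predictor):
        row.extend(spline_basis_row(x, knots, degree))
    return row


# Hypothetical observation with two predictors and made-up knots (not the study's values).
row = design_row([5.0, 0.5], [[4.0, 6.0], [0.25]], degree=1)
print(row)  # [1.0, 5.0, 1.0, 0.0, 0.5, 0.25]
```

With a linear spline (degree 1), each knot adds one column that is zero to the left of the knot and rises linearly to its right, which is what lets the fitted line change slope at the knot.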

Generalized Cross-Validation (GCV)
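The criterion used as a performance measure for a good estimator is Generalized Cross-Validation (GCV). The optimal knot points are chosen based on the smallest GCV value (Mardianto et al., 2021). In general, GCV is defined as Equation 6:

$$\mathrm{GCV} = \frac{\mathrm{MSE}}{\left[ n^{-1}\, \mathrm{trace}(I - S_{\lambda}) \right]^{2}} \tag{6}$$

where $I$ is the identity matrix, $n$ is the number of observations, and $S_{\lambda}$ is the $n \times n$ matrix that maps the responses to the fitted values, $\hat{y} = S_{\lambda}\, y$; for a least-squares fit with design matrix $X$, $S_{\lambda} = X (X^{T} X)^{-1} X^{T}$. The Mean Squared Error (MSE) is given by Equation 7:

$$\mathrm{MSE} = n^{-1} \sum_{i=1}^{n} (y_i - \hat{y}_i)^{2} \tag{7}$$

The minimum MSE value from Equation 7 indicates that the estimated value is close to the true value (Maharani and Saputro, 2021).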
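As a rough illustration of the GCV and MSE criteria, the sketch below computes both from observed and fitted values. It assumes the model was fitted by ordinary least squares with a full-rank design matrix, in which case the trace of the hat matrix equals the number of estimated parameters, so trace(I - S) = n - p. The function names and the toy numbers are hypothetical.

```python
def mse(y, y_hat):
    """Mean squared error between observed and fitted values."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat)) / len(y)


def gcv(y, y_hat, n_params):
    """GCV = MSE / (trace(I - S) / n)^2.

    Assumes a full-rank least-squares fit, so trace(S) = n_params.
    """
    n = len(y)
    if n_params >= n:
        raise ValueError("need more observations than parameters")
    return mse(y, y_hat) / ((n - n_params) / n) ** 2


# Toy data: fitted values are the means of the first and last two points.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]
print(mse(y, y_hat))     # 0.25
print(gcv(y, y_hat, 2))  # 1.0
```

Because the denominator shrinks as more parameters are used, GCV penalizes models with many knots even when their MSE is slightly lower.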
R-sq ($R^2$) measures the proportion of the total variance around the mean that can be explained by the regression model. Based on Mardianto et al. (2021), the best estimator is the one with the smallest MSE and the largest $R^2$. The formula can be written as Equation 8:

$$R^2 = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^{2}}{\sum_{i=1}^{n} (y_i - \bar{y})^{2}} \tag{8}$$

where $\hat{y}_i$ is the estimated value of the $i$-th response, $\bar{y}$ is the mean of the response variable, and $y_i$ is the observed value of the $i$-th response.
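The R-sq measure can be sketched in a few lines of Python. This version assumes the ratio of the regression sum of squares to the total sum of squares, which for a least-squares fit with an intercept equals one minus the ratio of the error sum of squares to the total. The function name and the toy data are hypothetical.

```python
def r_squared(y, y_hat):
    """Share of the total variation around the mean explained by the fitted values."""
    y_bar = sum(y) / len(y)
    ss_regression = sum((fi - y_bar) ** 2 for fi in y_hat)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    return ss_regression / ss_total


# Toy data: fitted values are the means of the first and last two points.
y = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]
print(r_squared(y, y_hat))  # 0.8
```

A value of 0.8 here would be read as the model explaining 80% of the variation in the response, in the same way the study reads its R-sq.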

C. RESEARCH METHOD
In this study, the data analyzed were the level of compliance with mask use in relation to second-dose vaccination achievement in 34 provinces in Indonesia during the COVID-19 peak period of 12 to 18 July 2021. The data are secondary data obtained from the official government websites covid.go.id, vaccine.kemkes.go.id, and kawalcovid19.id. The research variables consist of the dependent and independent variables, which are shown in Table 1. The level of compliance with wearing masks is based on The Central Bureau of Statistics data.
The data analysis steps for studying the effect of second-dose vaccination and the total number of COVID-19 cases on public compliance with using masks, as a benchmark for government policy, are as follows:
1. Conduct descriptive analysis to describe the data used.
2. Perform analysis with spline regression:
(a) Produce a model with nonparametric spline regression using three knot points with the GCV method
(b) Select the optimal knot points using the GCV method
(c) Model the data with spline nonparametric regression using the optimal knot points from the GCV method
(d) Compare the nonparametric spline regression models with optimal knot points obtained from the GCV method
(e) Compare the results and choose the best model using the R-sq and MSE criteria

D. RESULTS AND DISCUSSION
1. Descriptive Statistics
Descriptive statistics describe data as clearer and easier-to-understand information that provides an overview of the research. Based on the analyzed data, the following descriptive statistics were obtained. Table 2 shows that the average level of mask compliance across the 34 provinces in Indonesia is 84.72%. However, the difference between the maximum (North Kalimantan) and the minimum (North Maluku) is still large. The top five provinces average around 97.33%, but the same cannot be said of North Maluku, the province with the lowest level of compliance. Table 3 lists the five most mask-compliant provinces in Indonesia. Their average of 97.33% is higher than the average of all 34 provinces shown in Table 2. Table 4 gives more information about the maximum and minimum values of the variables. Three variables were used in this analysis: the mask compliance rate ($Y$) as the dependent variable, and second-dose vaccination coverage ($X_1$) and total positive cases of COVID-19 ($X_2$) as independent variables.

Identification of Relationship Patterns Between Predictor Variables on Response
Identification of the pattern of relationships between predictor variables and response variables is the first thing that must be done to determine the proper method for modelling it. The relationship pattern can be seen visually through a scatterplot. The modelling can use a parametric approach if the scatterplot forms a quadratic, linear, or other pattern. Meanwhile, it can take a nonparametric approach if the scatterplot does not have a pattern.
Figure 1 shows that the scatterplot between the response variable, the percentage of mask compliance, and the predictor variable $X_1$, the percentage of second-dose vaccination achievement, does not form a particular pattern. Therefore, the model cannot be estimated using a parametric regression approach, and $X_1$ is treated as a nonparametric component. Figure 1 likewise shows that the scatterplot between the percentage of mask compliance and the predictor variable $X_2$, the percentage of total positive cases of COVID-19, does not form a particular pattern. Therefore, $X_2$ is also treated as a nonparametric component.
Based on the explanation above, it can be concluded that the relationship between the percentage of mask compliance and each predictor does not follow a particular pattern.

Spline Regression
The best spline nonparametric regression model is the one with optimal knot points. Knot points are the points at which the behaviour of the function changes. One of the methods commonly used to select the optimal knot points is GCV: the optimal knot points are those that give the minimum GCV value. The number of knot points considered in this study is limited to one, two, and three.
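The knot selection just described can be sketched as a small grid search. The example below is a simplified illustration, not the study's procedure: it uses one predictor, one knot, a linear spline, made-up data, and a plain-Python least-squares solver, and it assumes that trace(S) equals the number of parameters. All function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_least_squares(design, y):
    """Ordinary least squares through the normal equations X'X b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def gcv_for_knot(x, y, knot):
    """GCV of a one-knot linear truncated spline with columns 1, x, (x - knot)_+."""
    design = [[1.0, xi, max(xi - knot, 0.0)] for xi in x]
    beta = fit_least_squares(design, y)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    n, p = len(y), len(beta)
    mse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / n
    return mse / ((n - p) / n) ** 2


def best_knot(x, y, candidates):
    """Return the candidate knot with the smallest GCV value."""
    return min(candidates, key=lambda k: gcv_for_knot(x, y, k))


# Made-up data whose slope changes at x = 3.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
print(best_knot(x, y, [2.0, 3.0, 4.0]))  # 3.0
```

Extending this to several predictors and several knots per predictor means searching over combinations of candidate knots, which grows quickly; that is one practical reason to cap the number of knots, as the study does at three.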
After fitting nonparametric spline regression models with one, two, and three knot points (in keeping with a parsimonious model), their minimum GCV values can be compared to select the best model. The model with the lowest GCV value is chosen as the best model. The table below compares the one-, two-, and three-knot models. Below is a table containing GCV values for the spline nonparametric regression model with three knot points. Based on Table 6, the minimum GCV value is 109.183, with the following optimal knot locations for each variable:
$X_1$: $K_1 = 4.519$, $K_2 = 4.938$, $K_3 = 6.614$
$X_2$: $K_4 = 0.439$, $K_5 = 0.577$, $K_6 = 1.128$
With these knots, the model of the effect of second-dose vaccination and the total number of COVID-19 cases on community compliance in using masks, as a benchmark for government policy, takes the linear truncated spline form

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 (x_1 - 4.519)_+ + \hat{\beta}_3 (x_1 - 4.938)_+ + \hat{\beta}_4 (x_1 - 6.614)_+ + \hat{\beta}_5 x_2 + \hat{\beta}_6 (x_2 - 0.439)_+ + \hat{\beta}_7 (x_2 - 0.577)_+ + \hat{\beta}_8 (x_2 - 1.128)_+$$

with the parameter estimates given in the table of model parameters. Based on this best spline nonparametric regression model using three knots, the coefficient of determination of the model is 69.952%, which means that second-dose vaccination coverage and the total number of COVID-19 cases explain 69.952% of the variation in the level of mask compliance, while other factors account for the rest.
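To make the structure of the three-knot model concrete, the sketch below builds its regressors from the knot locations reported above. It assumes a linear truncated spline (order m = 1), which gives nine parameters in total. The coefficient values in the example are placeholders chosen only to show the mechanics; they are not the study's estimates, and the function names are hypothetical.

```python
KNOTS_X1 = (4.519, 4.938, 6.614)  # reported knots for second-dose vaccination coverage
KNOTS_X2 = (0.439, 0.577, 1.128)  # reported knots for total COVID-19 cases


def basis(x1, x2):
    """Nine regressors: intercept, x1 and its three hinge terms, x2 and its three hinge terms."""
    hinges_x1 = [max(x1 - k, 0.0) for k in KNOTS_X1]
    hinges_x2 = [max(x2 - k, 0.0) for k in KNOTS_X2]
    return [1.0, x1, *hinges_x1, x2, *hinges_x2]


def predict(x1, x2, beta):
    """Fitted value for one observation, given nine coefficients."""
    terms = basis(x1, x2)
    if len(beta) != len(terms):
        raise ValueError("expected one coefficient per regressor")
    return sum(b * t for b, t in zip(beta, terms))


# Placeholder coefficients (NOT the study's estimates): intercept 80, slope 1 on x1.
placeholder_beta = [80.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(predict(5.0, 0.5, placeholder_beta))  # 85.0
```

Each hinge term stays at zero until the predictor passes its knot, so the slope of the fitted relationship with a predictor can differ on each side of every knot.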

Significance Test
The significance test of the regression model parameters was carried out to determine whether the predictor variables significantly affected the mask compliance rate. There are two stages in testing the significance of the parameters. The first stage is conducting simultaneous tests. If the conclusion of the simultaneous test shows that there is at least one significant parameter, then proceed to the individual test.
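The two-stage procedure can be outlined in Python as follows. This is a minimal sketch under stated assumptions: the F statistic is the regression mean square divided by the error mean square, each t statistic is an estimate divided by its standard error, and the critical values are passed in by the caller after being looked up in statistical tables. The function names, the toy data, and the F critical value used in the example are hypothetical.

```python
def f_statistic(y, y_hat, n_params):
    """F = (SS_regression / (p - 1)) / (SS_error / (n - p)), with p counting the intercept."""
    n = len(y)
    y_bar = sum(y) / n
    ss_regression = sum((fi - y_bar) ** 2 for fi in y_hat)
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return (ss_regression / (n_params - 1)) / (ss_error / (n - n_params))


def significant_terms(estimates, std_errors, t_critical):
    """Partial test: a parameter is significant when |estimate / std_error| > t_critical."""
    return [abs(b / se) > t_critical for b, se in zip(estimates, std_errors)]


def two_stage_test(f_value, f_critical, estimates, std_errors, t_critical):
    """Run the partial tests only if the simultaneous test rejects its null hypothesis."""
    if f_value <= f_critical:
        return None  # simultaneous test not rejected, so stop here
    return significant_terms(estimates, std_errors, t_critical)


# Toy inputs; f_critical = 4.0 is a placeholder, not a value from an F table.
f_value = f_statistic([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5], n_params=2)
print(f_value)  # 8.0
print(two_stage_test(f_value, 4.0, [2.0, 0.3], [0.5, 0.4], t_critical=2.060))  # [True, False]
```

Passing the critical values in as arguments keeps the sketch free of any particular statistics library; in practice they would come from the F and t distributions at the chosen significance level and degrees of freedom.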

Simultaneous Testing
Simultaneous testing is carried out to determine the joint significance of the regression model parameters. The hypotheses of the simultaneous test are as follows:
$H_0: \beta_1 = \beta_2 = 0$
$H_1$: there is at least one $\beta_j \neq 0$; $j = 1, 2$

Partial or individual testing is carried out if the simultaneous test concludes that at least one parameter is significant. It aims to determine which parameters do or do not have a significant effect in the regression model. The hypotheses used in testing the individual parameters are as follows:
$H_0: \beta_j = 0$
$H_1: \beta_j \neq 0$; $j = 1, 2$
Table 9 shows the results of the partial tests of the model parameters. If $|t_{value}|$ is greater than $t_{table} = t_{(0.025, 25)} = 2.060$ and the P-value is less than $\alpha = 0.05$, the decision is to reject $H_0$. Based on Table 9, five parameters lead to rejecting $H_0$, which means that these parameters are significant in the model. The other four parameters have a P-value greater than $\alpha = 0.05$, so the decision is to fail to reject $H_0$, which means that these parameters are not significant in the model. Although some parameters are not significant, the variables are retained because each variable has at least one significant parameter. Therefore, the predictor variables $X_1$ and $X_2$ have a significant effect on the percentage of mask compliance. This model can be used to represent the effect of second-dose vaccination achievement and total COVID-19 cases on mask compliance because it has an R-sq of 69.952%, which means that second-dose vaccination achievement and the total number of COVID-19 cases explain 69.952% of the variation in mask compliance, while other factors account for the rest.
Using the analysis results, several recommendations on public mask compliance were formulated for the Indonesian government, as follows.
1. The percentage of second-dose vaccination achievement illustrates the uneven distribution of vaccines. Vaccine distribution is a challenge for the government, considering that many areas are still difficult to reach. The government needs to allocate appropriate funding to deliver vaccination evenly to these areas because, according to the analysis results, vaccination achievement has a significant effect on public mask compliance.
2. Based on the analysis results, the total number of COVID-19 cases has a significant effect on public mask compliance. Transparency and ease of access to accurate COVID-19 data are therefore essential so that the public knows the current conditions and is aware of the risks.
3. In general, the level of mask compliance in Indonesia is quite good, with an average of 84.72% across all provinces and 97.33% among the top five, but certain areas still have very low levels of compliance, such as North Maluku. This could be due to uneven public outreach on the importance of health protocols, so that people are not sufficiently equipped to implement them. Therefore, the government needs to take a strategic approach to distributing information on the importance of implementing health protocols evenly.
These recommendations cannot be carried out by the government alone. Indonesian society must work together to implement the protocols that have been established. Through this research, the central and regional governments are expected to continue to provide transparent access to testing, tracing, and treatment without any discrimination against the public, and to ensure that the vaccination process is carried out quickly, evenly, and safely in order to achieve herd immunity promptly and control the spread of COVID-19 effectively. If strategic efforts can bring COVID-19 under control, the targets for achieving the SDGs can be realized.

E. CONCLUSION AND SUGGESTION
Public mask compliance is significantly affected by second-dose vaccination achievement and the total number of COVID-19 cases. Based on the analysis and discussion, the province with the highest percentage of mask compliance was North Kalimantan Province at 98.76%, and the lowest was North Maluku Province at 14.49%. In addition, the best nonparametric spline regression model for the percentage of community compliance in using masks in Indonesia uses three knot points, with a minimum GCV value of 109.183. With an R-sq of 69.952% for the best model, second-dose vaccination achievement and the total number of COVID-19 cases explain 69.952% of the variation in mask compliance, and the rest is influenced by other factors. Each of the two variables has a significant effect on the percentage of mask compliance. This research contributes a statistical study, based on nonparametric spline regression modelling, of how second-dose vaccination and the number of coronavirus cases affect the level of mask compliance in Indonesia, which had not previously been done. The recommendations presented at the end of the results and discussion section can have implications for the government and the general public if carried out. However, similar statistical modelling using other nonparametric regression estimators is still needed so that a model that may be better than this one can be obtained.