Structural Equation Modeling-Partial Least Square for Poverty Modeling in Papua Province

Article history: Accepted: 21-08-2020; Revision: 07-09-2020; Approved: 11-09-2020

Poverty in Papua Province increased in 2018 compared with the previous year, and the poverty rate reached 27,74% in March 2018. This study aims to analyze the factors that influence poverty so that it can be handled properly. The research method is Structural Equation Modeling (SEM) with the Partial Least Squares (PLS) approach. The research variables consist of 4 latent variables (Poverty, Economy, Human Resources (HR), and Health) with 16 indicators (manifest variables). The analysis shows that the economic and health variables have a negative and significant effect on poverty, with path coefficients of -0,421 and -0,270, respectively. The health variable has a positive and significant effect on HR, with a path coefficient of 0,496. Meanwhile, the HR variable has a positive and significant effect on the economy, with a path coefficient of 0,801. It can be concluded that two variables have a significant effect on poverty in Papua Province, namely the economy and health.

Keyword: Poverty; Economy; Human Resources (HR); Structural Equation Modeling (SEM); Partial Least Squares (PLS)

This is an open access article under the CC BY-SA license.
DOI: https://doi.org/10.30812/varian.v4i2.852

The high level of poverty in the area is caused by many factors. Various empirical studies have tested the factors that affect poverty. Suhartini (2017) states that one of the factors with a significant effect on poverty is health. Some experts argue that the most effective way to alleviate poverty is to create economic activity in the regions in order to build economic growth (Yacoub, 2012). Furthermore, Abukosim, A., Saleh, M., & Marwa (2010) state that the quality of human resources has a significant effect on poverty.
Poverty alleviation is a policy that must be implemented consistently by the government. As a form of development policy, it is the responsibility of all elements, including the government, the business sector, and society. This is because the government's financial capacity to fund poverty reduction policies is severely limited (Rah Adi Fahmi et al., 2018). Therefore, in-depth analysis and accurate modeling are needed so that every element can carry out its strategic role in poverty reduction appropriately.
Based on the explanation above, this study models poverty in Papua Province using Structural Equation Modeling (SEM) with the Partial Least Squares (PLS) approach. SEM-PLS was applied because it can accommodate a small sample size, analyze several factors simultaneously, measure reciprocal effects, and handle multilevel models. The factors used in this study (the economy, human resources, and health) have a high level of complexity. The modeling results are expected to describe the relationship between the analyzed factors and poverty in Papua Province, so that they can serve as a basis for policy making by stakeholders and for further research in the field of poverty management, especially in Papua Province.

Structural Equation Modeling (SEM)
Multivariate analysis is a statistical method for analyzing several variables simultaneously (Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, 2013). Second-generation multivariate analysis uses structural equation modeling (SEM). These methods allow researchers to include latent variables, that is, variables that cannot be observed directly and are measured only indirectly through indicator variables (Chin, 1998). For confirmatory analysis, Covariance-based Structural Equation Modeling (CB-SEM) requires several assumptions: the theory must be sufficiently supportive, the data must be normally distributed, and the sample size must be large. In practice, researchers often have difficulty meeting these assumptions. The Partial Least Squares (PLS) method is a second-generation SEM method that addresses this problem. PLS can be used with small sample sizes, although larger samples increase the precision of the estimates. PLS does not assume that the data are normally distributed, and constructs may be reflective or formative, with up to 1000 indicators (Hair, J. F., Hult, G. T. M., Ringle, C. M., & Sarstedt, 2013).

Partial Least Square (PLS)
PLS, often called soft modeling, is an analysis method that does not rely on the distributional assumptions required by regression with Ordinary Least Squares (OLS). It can therefore be used when those assumptions are not met, for example when the data are not normally distributed or when there is multicollinearity among the exogenous variables (Kafadar et al., 1997). In SEM-PLS there are two kinds of relationships between indicators and their latent variables, called the reflective model and the formative model. SEM-PLS aims to maximize the explained variance of the endogenous latent constructs (dependent variables) and to minimize the unexplained variance. An advantage of this method is that it makes no assumption of normality or of any other data distribution, because it is applied non-parametrically (Asyraf & Afthanorhan, 2013). A SEM-PLS model has two parts: the measurement model (outer model) and the structural model (inner model). The measurement model connects the observed manifest variables with their latent variables, while the structural model describes the relationships between latent variables.

a. Measurement Model (Outer Model)
The measurement model is the part of a structural equation model that describes the relationship of latent variables with their indicators. Measurement modeling is used to measure the dimensions that make up a factor. The measurement model represents a pre-existing hypothesis, namely the relationship between the indicators and their factors, which is evaluated using the Confirmatory Factor Analysis (CFA) technique (Akalili S.N & Haryono, 2014). In general, the measurement model is as follows:

$$ y = \Lambda_y \eta + \varepsilon, \qquad x = \Lambda_x \xi + \delta \tag{2} $$

where $y$ is the $p \times 1$ vector of indicators of the endogenous latent variables $\eta$, with $p$ the number of those indicators, and $x$ is the $q \times 1$ vector of indicators of the exogenous latent variables $\xi$, with $q$ the number of those indicators. $\Lambda_y$ ($p \times m$) and $\Lambda_x$ ($q \times n$) are the loading factor matrices, where $m$ is the number of endogenous latent variables and $n$ is the number of exogenous latent variables. Meanwhile, $\varepsilon$ ($p \times 1$) and $\delta$ ($q \times 1$) are the measurement error vectors.
There are four types of evaluation for reflective indicators: indicator reliability, internal consistency reliability (construct reliability), convergent validity, and discriminant validity (Hair et al., 2014). First, indicator reliability shows how much of an indicator's variance can be explained by its latent variable. The general threshold is that at least 50% of the indicator variance should be explained by the latent construct. Because the explained share is the squared loading, a loading (λ) of the latent construct on the indicator is accepted if it is greater than 0,7. This limit also means that the variance shared between the construct and its indicator is greater than the variance of the measurement error. Reflective indicators must be eliminated from the measurement model when the loading is less than 0,4 (Vinzi, V. E., Chin, W. W., Henseler, J., & Wang, 2010).
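The link between the 50% variance criterion and the 0,7 loading threshold can be illustrated with a short Python sketch. It assumes standardized loadings, so that the squared loading is the share of indicator variance explained by the latent variable. The function names, the example loadings, and the "review" label for loadings between 0,4 and 0,7 are illustrative choices and do not come from the cited sources.

```python
import math

def explained_variance(loading):
    """Share of an indicator's variance explained by its latent variable
    (assumes a standardized loading)."""
    return loading ** 2

def screen_indicator(loading, accept_above=0.7, drop_below=0.4):
    """Apply the two thresholds described in the text to one reflective indicator."""
    if abs(loading) > accept_above:
        return "accept"
    if abs(loading) < drop_below:
        return "eliminate"
    return "review"  # illustrative label; the text gives no rule for this range

# A loading of sqrt(0.5) explains exactly half of the indicator variance
print(round(math.sqrt(0.5), 3))

for loading in (0.91, 0.65, 0.35):  # hypothetical loadings
    print(loading, round(explained_variance(loading), 3), screen_indicator(loading))
```

The square root of 0,5 is about 0,707, which is why 0,7 is used as the practical cut-off.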
Second, internal consistency reliability (construct reliability) is assessed with two measures: Cronbach's alpha, which serves as the lower limit of internal consistency reliability, and composite reliability, which serves as the upper limit of the true (unknown) reliability (Hair et al., 2014). Composite reliability shows how well the construct is measured by its predefined indicators. It can be calculated with the following equation.

$$ \rho_c = \frac{\left(\sum_i \lambda_{ij}\right)^2}{\left(\sum_i \lambda_{ij}\right)^2 + \sum_i \mathrm{var}(\varepsilon_{ij})} \tag{3} $$

Here $\lambda_{ij}$ is the loading of indicator variable $i$ on latent variable $j$, $\varepsilon_{ij}$ is the measurement error of indicator variable $i$, and $j$ is the index over the reflective measurement models. The value of $\rho_c$ ranges from 0 to 1 and is acceptable if it is greater than 0,6 (Vinzi, V. E., Chin, W. W., Henseler, J., & Wang, 2010).
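As a worked illustration of composite reliability, the following Python sketch evaluates the ratio in equation (3) for a single construct. It assumes standardized indicators, so that the error variance of indicator i can be taken as 1 minus its squared loading; this assumption and the function name are not from the source. The loadings used are those reported for the Health indicators X8, X9, and X10 in equation (10).

```python
def composite_reliability(loadings):
    """Composite reliability of one reflective construct.

    Assumes standardized indicators, so the error variance of
    indicator i is taken as 1 - loading_i**2.
    """
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)

health_loadings = [0.647, 0.917, 0.880]  # X8, X9, X10 in equation (10)
rho_c = composite_reliability(health_loadings)
print(round(rho_c, 3))
print("acceptable:", rho_c > 0.6)
```

Under this assumption the Health construct gives a value of about 0,86, which is above the 0,6 threshold.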
Third, convergent validity, which in classical test theory is based on the correlation between responses obtained through maximally different methods of measuring the same construct. The common measure for checking convergent validity is the average variance extracted (AVE), which is calculated with the following equation.

$$ \mathrm{AVE}_j = \frac{\sum_i \lambda_{ij}^2}{\sum_i \lambda_{ij}^2 + \sum_i \mathrm{var}(\varepsilon_{ij})} $$

Here $\lambda_{ij}$ is the loading of indicator $i$ on latent variable $j$ and $\varepsilon_{ij}$ is its measurement error. The AVE value indicates the average percentage of variance of the construct's items that is explained by the construct. Convergent validity is considered good if the AVE value is at least 0,5, that is, if the latent variable explains on average at least half of the variance of its indicators (Vinzi, V. E., Chin, W. W., Henseler, J., & Wang, 2010).
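A minimal Python sketch of the AVE calculation follows. As an assumption not stated in the source, indicators are treated as standardized, so the error variance of each indicator is 1 minus its squared loading and the AVE reduces to the mean squared loading. The function name is illustrative; the loadings are those reported for Health in equation (10).

```python
def average_variance_extracted(loadings):
    """AVE of one reflective construct, assuming standardized indicators
    (error variance of indicator i taken as 1 - loading_i**2)."""
    explained = sum(l ** 2 for l in loadings)
    unexplained = sum(1 - l ** 2 for l in loadings)
    return explained / (explained + unexplained)

health_loadings = [0.647, 0.917, 0.880]  # X8, X9, X10 in equation (10)
ave = average_variance_extracted(health_loadings)
print(round(ave, 3))
print("convergent validity:", ave >= 0.5)
```

For these loadings the AVE is about 0,68, so the 0,5 criterion is met even though X8 has a comparatively low loading.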
Fourth, discriminant validity is evaluated by comparing the AVE of a construct with its squared correlations with the other constructs or, equivalently, by comparing the square root of the AVE with the correlations themselves. The criterion is that the square root of the AVE must be higher than the correlation between constructs, which is the same as requiring the AVE to be higher than the squared correlation (Vinzi, V. E., Chin, W. W., Henseler, J., & Wang, 2010).
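The comparison described above can be sketched in Python as follows. The construct names A, B, and C, the AVE values, and the correlations are hypothetical and serve only to show the mechanics; they are not the values of this study.

```python
import math

def discriminant_validity_ok(ave, correlations):
    """Check that sqrt(AVE) of both constructs in each pair exceeds
    the absolute correlation between them."""
    return all(
        min(math.sqrt(ave[a]), math.sqrt(ave[b])) > abs(r)
        for (a, b), r in correlations.items()
    )

# Hypothetical values, for illustration only
ave = {"A": 0.81, "B": 0.64, "C": 0.70}
correlations = {("A", "B"): 0.55, ("A", "C"): -0.62, ("B", "C"): 0.48}

print(discriminant_validity_ok(ave, correlations))
```

The absolute value is used so that strong negative correlations are treated the same way as strong positive ones.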

b. Structural Model (Inner Model)
The structural model describes the relationship between latent variables, namely the independent (exogenous) latent variables and the dependent (endogenous) latent variables. The structural equation model is as follows (Chin, 1998):

$$ \eta = B\eta + \Gamma\xi + \zeta $$

where $\eta$ is the $m \times 1$ random vector of endogenous latent variables, $B$ is the $m \times m$ coefficient matrix of the endogenous latent variables, $\xi$ is the $n \times 1$ random vector of exogenous latent variables, $\Gamma$ is the $m \times n$ coefficient matrix that relates $\xi$ to $\eta$, and $\zeta$ is the $m \times 1$ vector of random errors. Partial Least Squares (PLS) is designed for recursive models, that is, causal models in which effects run in one direction only and there are no reciprocal effects. The relationship between latent variables is therefore called a causal chain system and can be specified as follows (Chin, 1998):

$$ \eta_j = \sum_i \beta_{ji}\,\eta_i + \sum_b \gamma_{jb}\,\xi_b + \zeta_j $$

where $\beta_{ji}$ and $\gamma_{jb}$ are the elements of $B$ and $\Gamma$ (the path coefficients) and $\zeta_j$ is the $j$-th element of $\zeta$.
Evaluation of the structural model uses R², the effect size f², the path coefficient estimates, and Stone-Geisser's Q². First, R² is interpreted as in linear regression, that is, as the share of the variance of the endogenous variable that is explained by the exogenous variables. Its calculation follows Afifah (2013).
Second, the significance of the relationships between constructs is examined. This is done through the path coefficients, which describe the strength of the relationships between constructs. The sign of each path coefficient must agree with the hypothesized theory, and its significance is assessed with the t-test value obtained from the bootstrapping process (a resampling method) (Vinzi, V. E., Chin, W. W., Henseler, J., & Wang, 2010).
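The idea of a bootstrap t-test can be sketched with the Python standard library. This is not the PLS algorithm: as a simplifying assumption, the correlation between two synthetic standardized scores stands in for a path coefficient with a single predictor. The sample size of 30, the seeds, and all names are hypothetical; 500 resamples are used to mirror the number of bootstrap samples mentioned in this study.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation, used here as a stand-in for a standardized
    path coefficient with one predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def bootstrap_t(xs, ys, n_boot=500, seed=0):
    """Return the estimate, its bootstrap standard error, and their ratio."""
    rng = random.Random(seed)
    n = len(xs)
    estimate = pearson(xs, ys)
    replicates = []
    for _ in range(n_boot):
        # Resample observations (pairs) with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        replicates.append(pearson([xs[i] for i in idx], [ys[i] for i in idx]))
    std_error = statistics.stdev(replicates)
    return estimate, std_error, estimate / std_error

# Synthetic data, for illustration only
rng = random.Random(1)
x = [rng.gauss(0, 1) for _ in range(30)]
y = [0.8 * xi + rng.gauss(0, 0.5) for xi in x]

estimate, std_error, t_value = bootstrap_t(x, y)
print(round(estimate, 3), round(std_error, 3), round(t_value, 2))
print("significant at 5%:", abs(t_value) > 1.96)
```

The t-value is the original estimate divided by the standard deviation of the bootstrap replicates, and it is compared with the critical value 1,96 for a two-sided test at the 5% level.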
Third, it can be checked whether an exogenous latent variable has a substantial influence on an endogenous latent variable by calculating the effect size f² as follows:

$$ f^2 = \frac{R^2_{\mathrm{include}} - R^2_{\mathrm{exclude}}}{1 - R^2_{\mathrm{include}}} $$

where $R^2_{\mathrm{include}}$ is calculated with the exogenous latent variable in the model and $R^2_{\mathrm{exclude}}$ is calculated without it. Values of f² of 0,02, 0,15, and 0,35 are interpreted as a weak, moderate, and large influence of the exogenous latent variable, respectively (Cohen, 1988).
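The effect size calculation and Cohen's cut-offs can be sketched in Python as follows. The R² value with the predictor included is the one reported for Poverty in this study (0,612); the R² value with the predictor excluded (0,55) is hypothetical, because the source does not report it. The function names and the "negligible" label below 0,02 are illustrative.

```python
def f_squared(r2_include, r2_exclude):
    """Effect size of one exogenous latent variable on an endogenous one."""
    return (r2_include - r2_exclude) / (1 - r2_include)

def effect_label(f2):
    """Classify f2 with the 0.02 / 0.15 / 0.35 cut-offs (Cohen, 1988)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "moderate"
    if f2 >= 0.02:
        return "weak"
    return "negligible"  # illustrative label for values below 0.02

# 0.612 is the reported R2 for Poverty; 0.55 is a hypothetical R2 without one predictor
f2 = f_squared(0.612, 0.55)
print(round(f2, 3), effect_label(f2))
```

In practice the model is re-estimated once for each omitted predictor to obtain the corresponding R² value.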
Another measure of the predictive capability of the resulting model is Stone-Geisser's Q², which is obtained from the blindfolding procedure. If the value of Q² is above 0, the observed values have been reconstructed well and the model has predictive relevance (Henseler et al., 2009). Apart from the evaluation criteria above, there is also a criterion for the model as a whole. It evaluates the measurement model and the structural model together against the predictions of the resulting model, namely the GoF index, obtained with the following formula.

$$ \mathrm{GoF} = \sqrt{\overline{\mathrm{communality}} \times \overline{R^2}} \tag{9} $$

The average communality is the mean of the communality values, where communality shows the proportion of variance that is explained by the common factor and is obtained from the squared loadings of the variables on that factor (Afifah, 2013); $\overline{R^2}$ is the mean R² of the endogenous latent variables. The GoF index cannot be used in models with formative indicators (Hair et al., 2014). A GoF value below 0,1 is considered small, a value between 0,25 and 0,36 is considered moderate, and a value above 0,36 is considered large.

C. RESEARCH METHODS
The data used in this study are secondary data obtained from the publication "Data dan Informasi Kemiskinan Kabupaten/Kota Tahun 2018". The data and information presented in this publication are the results of calculations from the National Socio-Economic Survey (SUSENAS) for March 2018. The research variables consist of 4 latent variables (Poverty, Economy, Human Resources, and Health) with 16 indicators (manifest variables), which are presented in Table 1.

Health
X8 : Percentage of female users of birth control tools in poor households
X9 : Percentage of toddlers in poor households whose birth was assisted by health workers
X10 : Percentage of toddlers in poor households who have been immunized
X11 : Percentage of poor households with floor area per capita ≤ 8 m²
X12 : Percentage of poor households using clean water
X13 : Percentage of poor households with their own/shared latrines

The analysis steps in this research are as follows.
1. Create models based on concepts and theories
2. Design the measurement and structural models
3. Create a path chart
4. Perform SEM-PLS modeling
5. Evaluate the measurement model (outer model)
6. Evaluate the structural model (inner model)
7. Draw conclusions

Poverty in Papua Province
Papua Province has the highest poverty rate in Indonesia, with 27,74% of its population classified as poor. The regency with the highest percentage of poor people is Deiyai Regency at 43,49%; in other words, almost half of the population of Deiyai Regency is poor. In Merauke and Jayapura the percentages are 10,54% and 11,37%, respectively, which are still higher than the national figure of 9,82%.
The problem of poverty is not only the number and percentage of poor people (Y1). The poverty depth index (Y2) and the poverty severity index (Y3) are also of concern. In March 2018 the poverty depth index (Y2) and the poverty severity index (Y3) in Papua Province were 6,73 and 2,28, respectively. Sarmi Regency has the lowest poverty depth index and poverty severity index, at 1,72 and 0,30, respectively. Meanwhile, Lanny Jaya Regency has the highest poverty depth index and poverty severity index, at 14,59 and 8,30, respectively. The picture of poverty in Papua Province is presented as a graph in Figure 1.

Measurement Model (Outer Model)
Analysis of the outer model is carried out to ensure that the measurements used are fit for purpose (valid and reliable). There are four criteria in the analysis of the outer model, namely indicator reliability, composite reliability, convergent validity, and discriminant validity. Indicator reliability shows how much of the variance of an indicator can be explained by its latent variable and is assessed through the loading factor value. The loading factor limit used in this study is 0,60. Figure 2 shows the loading factor values obtained. Several indicators have loading factor values below 0,60, so the model is modified by removing these indicators. The resulting path chart is shown in Figure 3. Figure 3 shows that all loading factors are now above 0,60, so no further indicators are eliminated from the model. For the latent Economy variable, the loadings of X1 and X2 are each above 0,80. For the latent HR variable, the loadings of X6 and X7 are above 0,90. For the latent Health variable, the loadings of X8, X9, and X10 are each above 0,60. For the latent Poverty variable, the loadings of the three indicators Y1, Y2, and Y3 are each above 0,80.
The next stage is the evaluation of the measurement model based on the Cronbach's alpha, composite reliability, and AVE values presented in Table 2. Table 2 shows that Cronbach's alpha and composite reliability are above 0,7 for all four latent variables. This indicates that the indicators are reliable, that is, they measure their constructs (latent variables) well. Convergent validity is better when the correlation among the indicators that make up a construct is higher, and it can be assessed from the average variance extracted (AVE). In this study the AVE value of each construct is above 0,5. Therefore, there is no convergent validity problem in the tested model.
Because there is no convergent validity problem, the next test concerns discriminant validity, which compares the correlations between constructs with the square root of the AVE. The correlations between latent variables presented in Table 2 are compared with the square roots of the AVE presented in Table 3. From Table 4, it can be seen that the square root of the AVE of each of the four latent variables is greater than its correlations with the other constructs, so there is no discriminant validity problem.
The feasibility of the measurement model can also be assessed from the significance of its loadings, with the criterion that the p-value is less than 0,05 at a 5% significance level. The loadings obtained with 500 bootstrap resamples are shown in Table 5. Table 5 shows that each latent variable is related to its indicators, as every p-value is less than 0,05, so the measurement model for each latent variable is good. The following model is formed based on the measurement model.

$$
\begin{aligned}
X_1 &= 0{,}907\,\text{Economy} + \delta_1 \\
X_2 &= 0{,}897\,\text{Economy} + \delta_2 \\
X_6 &= 0{,}943\,\text{HR} + \delta_6 \\
X_7 &= 0{,}932\,\text{HR} + \delta_7 \\
X_8 &= 0{,}647\,\text{Health} + \delta_8 \\
X_9 &= 0{,}917\,\text{Health} + \delta_9 \\
X_{10} &= 0{,}880\,\text{Health} + \delta_{10} \\
Y_1 &= 0{,}892\,\text{Poverty} + \varepsilon_1 \\
Y_2 &= 0{,}982\,\text{Poverty} + \varepsilon_2 \\
Y_3 &= 0{,}895\,\text{Poverty} + \varepsilon_3
\end{aligned}
\tag{10}
$$

Based on equation (10), each latent variable is related to its indicators. The largest contribution is that of Y2 to the latent Poverty variable, with a loading of 0,982, and the smallest is that of X8 to the latent Health variable, with a loading of 0,647.

Structural Model (Inner Model)
Inner model analysis, or structural model analysis, is used to determine whether the structural model formed is accurate and robust. The structural model describes the relationships between constructs (latent variables). It can be evaluated with several indicators, namely the coefficient of determination (R²), predictive relevance (Q²), and the Goodness of Fit index (GoF). Hypotheses are tested using the p-value at α = 5% and the t-table value at α = 5%, which is 1,96. The path coefficients obtained with 500 bootstrap resamples are shown in Table 6. Based on Table 6, the model is formed as equations (11), (12), and (13):

$$ \text{Poverty} = -0{,}421\,\text{Economy} - 0{,}236\,\text{HR} - 0{,}270\,\text{Health} + \zeta_1 \tag{11} $$

$$ \text{Economy} = 0{,}801\,\text{HR} - 0{,}021\,\text{Health} + \zeta_2 \tag{12} $$

$$ \text{HR} = 0{,}496\,\text{Health} + \zeta_3 \tag{13} $$

The estimation results show that the economic and health variables have a negative and significant effect on poverty (p = 0,046 and p = 0,008, both below 0,05), with path coefficients of -0,421 and -0,270, respectively. This means that when the economic and health variables increase, poverty decreases. Meanwhile, the HR variable does not have a significant effect on poverty (p = 0,192 > 0,05). The health variable does not have a significant effect on the economy (p = 0,903 > 0,05), but it has a positive and significant effect on HR (p = 0,000 < 0,05) with a path coefficient of 0,496, which means that when the health variable increases, HR increases. The HR variable has a positive and significant effect on the economy (p = 0,000 < 0,05) with a path coefficient of 0,801, which means that when the HR variable increases, the economy increases.
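Because the model in equations (11) to (13) is recursive, the effect of one latent variable on another along all directed paths can be traced by multiplying coefficients along each path and summing over paths. The Python sketch below does this for the reported coefficients. Total effects are not reported in the source, and the sketch includes the two non-significant paths (HR to Poverty and Health to Economy), so the output is an illustration of the mechanics rather than a result of the study. The dictionary layout and function name are illustrative.

```python
# Direct path coefficients from equations (11)-(13): paths[source][target]
paths = {
    "Health": {"HR": 0.496, "Economy": -0.021, "Poverty": -0.270},
    "HR": {"Economy": 0.801, "Poverty": -0.236},
    "Economy": {"Poverty": -0.421},
    "Poverty": {},
}

def total_effect(source, target):
    """Sum, over every directed path from source to target, of the product
    of the coefficients along that path (valid for recursive models)."""
    if source == target:
        return 1.0
    return sum(
        coefficient * total_effect(successor, target)
        for successor, coefficient in paths[source].items()
    )

for source in ("Health", "HR", "Economy"):
    print(source, "-> Poverty:", round(total_effect(source, "Poverty"), 3))
```

For example, the effect of Health on Economy combines the direct coefficient -0,021 with the indirect route through HR, 0,496 × 0,801.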
Next, the feasibility of the model is tested using the R² values presented in Table 7. The R² value for the Poverty variable is 0,612, which means that 61,2% of the variability of poverty is explained by the variability of the economic, human resources, and health variables. For the Economy variable, the R² value is 0,625, which means that 62,5% of the variability of the economy is explained by the variability of the HR and health variables. For the HR variable, the R² value is 0,246, which means that 24,6% of the variability of HR is explained by the variability of the health variable.
In addition to R², the influence of the exogenous variables on the endogenous variables is examined through the effect size (f²) values presented in Table 8. Table 8 shows that health and human resources have a weak effect on poverty, while the economy has a moderate effect on poverty. Health has a weak effect on the economy and a moderate effect on human resources. Meanwhile, human resources have a large effect on the economy.
The overall model is validated with the Goodness of Fit (GoF) value. The GoF value obtained is 0,890 (large), meaning that the model has a high ability to explain the empirical data, so overall the model formed can be considered valid. The predictive strength of the model is tested with the Stone-Geisser Q² value. The Q² value obtained is 0,631 (above 0), so the structural model has predictive relevance, that is, the exogenous latent variables are suitable for explaining the endogenous latent variables in the model.
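As a cross-check, the GoF index of equation (9) can be recomputed from figures reported in this article: the loadings in equation (10) and the R² values in Table 7. The Python sketch below makes two assumptions that are not stated in the source. First, the communality of each construct is taken as the mean squared loading of its indicators, and these are averaged over the four constructs. Second, a product formula for Q², 1 minus the product of (1 - R²) over the endogenous variables, is evaluated for comparison; this formula is sometimes used in the PLS literature and is not the blindfolding procedure described in this article.

```python
import math

# Loadings from equation (10) and R2 values from Table 7
loadings = {
    "Economy": [0.907, 0.897],
    "HR": [0.943, 0.932],
    "Health": [0.647, 0.917, 0.880],
    "Poverty": [0.892, 0.982, 0.895],
}
r_squared = {"Poverty": 0.612, "Economy": 0.625, "HR": 0.246}

def mean(values):
    values = list(values)
    return sum(values) / len(values)

# Assumption: communality of a construct = mean squared loading of its indicators
communality = {name: mean(l ** 2 for l in block) for name, block in loadings.items()}
gof = math.sqrt(mean(communality.values()) * mean(r_squared.values()))

# Assumption: product formula for Q2, not the blindfolding procedure
q2_from_r2 = 1 - math.prod(1 - r2 for r2 in r_squared.values())

print("GoF:", round(gof, 3))
print("Q2 (product formula):", round(q2_from_r2, 3))
```

Under these assumptions the sketch gives a GoF of about 0,631 and a product-based Q² of about 0,890, the reverse of the values reported above. This suggests that the two reported figures may have been transposed, although the sketch cannot confirm it; either way, both values satisfy their respective criteria.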

E. CONCLUSION AND SUGGESTION
The economic and health variables have a negative and significant effect on poverty. The significant indicators of the economic variable are the percentage of poor people aged 15 and over who are not working and the percentage of poor people aged 15 and over working in the agricultural sector. The health variable has a positive and significant effect on human resources; its significant indicators are the percentage of female users of birth control tools in poor households, the percentage of toddlers in poor households whose birth was assisted by health workers, and the percentage of toddlers in poor households who have been immunized. The HR variable has a positive and significant effect on the economy; its significant indicators are the percentage of poor people aged 15 and over who did not finish elementary school and the literacy rate of poor people aged 15-55. Further research is recommended to consider other factors that were not tested in this study and to use time-series data, so that more comprehensive information can be obtained.