Hausman and Taylor Estimator Analysis on the Linear Panel Data Model

Panel data modelling in econometrics relies on two main approaches, namely the fixed effect and random effects estimators. The Hausman and Taylor approach is applied to real data to choose between fixed effects and random effects, based on the idea that the set of coefficient estimates obtained from fixed effect estimation, taken as a group, should not differ systematically from the set obtained from random effects estimation. A good estimator comes as close as possible to representing the characteristics of the population. The characteristics of a good estimator include unbiasedness, efficiency, and consistency. The purpose of this study is to identify the properties of the Hausman and Taylor estimator in the linear panel data model. Based on the analysis using panel data, it is found that the Hausman and Taylor estimator for the random effects panel data model is consistent and efficient, even though it is not unbiased.


A. INTRODUCTION
Panel data modelling in econometrics relies on two main approaches, namely the fixed effect and random effects estimators. (Kojima et al., 2016) investigated the global demand characteristics of major vegetable oils, for future research and policy analysis, by applying a fixed effect model to cross-country panel data covering 165 countries from 1991 to 2011. In the fixed effect approach, the time-invariant factors that cannot be observed for each unit of observation are captured by dummy variables. In the random effects model, these unobservable factors are instead treated as part of the error, under the assumption that their correlation with the regressors is zero, (Baltagi and Khanti-Akom, 1990). If this assumption is met, the random effects estimator is more efficient than the fixed effect estimator, (Frondel and Vance, 2010). The random effects panel data model assumes that all explanatory variables are uncorrelated with the random individual effects, while the fixed effect panel data model allows all explanatory variables to be correlated with the individual effects. To investigate which of these two approaches suits a given model, Hausman in (Baltagi and Liu, 2012) proposed the idea hereinafter referred to as the Hausman test. The test for fixed effects or random effects is based on the idea that the set of coefficient estimates derived from fixed effect estimation, taken as a group, should not differ systematically from the set derived from random effects estimation. The null hypothesis of the Hausman test is that the explanatory variables are uncorrelated with the unobservable individual-specific effects, so that the two sets of estimated coefficients do not differ systematically. If the test results suggest rejecting $H_0$, the applied researcher generally proceeds to draw conclusions based on the fixed effect model estimates.
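The Hausman test described above can be illustrated with a minimal sketch on simulated data. The function names, the simulated design, and the variance-component estimator (within residuals for the error variance, between residuals for the heterogeneity variance) are assumptions made for illustration; they are not taken from the studies cited here. Rows are assumed to be stacked unit by unit in a balanced panel.

```python
import numpy as np

def group_means(a, n, T):
    """Unit means repeated T times so the result aligns with the stacked rows."""
    return np.repeat(a.reshape(n, T, -1).mean(axis=1), T, axis=0)

def ols(X, y):
    """Return OLS coefficients and the residual sum of squares."""
    b = np.linalg.solve(X.T @ X, X.T @ y)
    e = y - X @ b
    return b, float(e @ e)

def hausman(X, y, n, T):
    """Hausman statistic comparing within (FE) and FGLS (RE) slope estimates."""
    K = X.shape[1]
    Xm, ym = group_means(X, n, T), group_means(y, n, T).ravel()
    # Fixed effects: OLS on deviations from unit means.
    Xw, yw = X - Xm, y - ym
    b_fe, rss_w = ols(Xw, yw)
    s2_eps = rss_w / (n * T - n - K)
    V_fe = s2_eps * np.linalg.inv(Xw.T @ Xw)
    # Between regression on unit means estimates sigma_u^2 + sigma_eps^2 / T.
    Xb = np.column_stack([np.ones(n), Xm[::T]])
    _, rss_b = ols(Xb, ym[::T])
    s2_u = max(rss_b / (n - K - 1) - s2_eps / T, 0.0)
    # Random effects: OLS on quasi-demeaned data (feasible GLS).
    theta = 1.0 - np.sqrt(s2_eps / (s2_eps + T * s2_u))
    Xr = np.column_stack([np.full(n * T, 1.0 - theta), X - theta * Xm])
    b_re, _ = ols(Xr, y - theta * ym)
    V_re = s2_eps * np.linalg.inv(Xr.T @ Xr)[1:, 1:]   # drop the constant
    d = b_fe - b_re[1:]
    return float(d @ np.linalg.solve(V_fe - V_re, d))

# Simulated example: one regressor, with and without correlation with u_i.
rng = np.random.default_rng(42)
n, T = 500, 5
u = np.repeat(rng.normal(size=n), T)                 # individual effect
eps = rng.normal(size=n * T)
x_exo = rng.normal(size=(n * T, 1))                  # uncorrelated with u
x_endo = u[:, None] + rng.normal(size=(n * T, 1))    # correlated with u

H_exo = hausman(x_exo, 1.0 + x_exo[:, 0] + u + eps, n, T)
H_endo = hausman(x_endo, 1.0 + x_endo[:, 0] + u + eps, n, T)
```

Under the null hypothesis the statistic is compared with a chi-squared distribution whose degrees of freedom equal the number of slope coefficients. In this simulated design, `H_endo` should be far larger than `H_exo`, which points toward the fixed effect estimates when the regressor is correlated with the individual effect.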
(Yusra et al., 2019) test the equivalence of fixed and random effects using the Hausman and Taylor test. The test uses a direct model specification that can simultaneously estimate fixed effects and random effects, based on either Ordinary Least Squares (OLS) or Generalized Least Squares (GLS), to test the equality of coefficients both for individual variables and for the entire set of coefficients. The Hausman test is used to choose between the random effect model and the fixed effect model. It works by testing whether the errors in the model are related to one or more of the explanatory variables in the model, (Agus Astapa et al., 2018). Hausman and Taylor estimators on panel data allow some of the explanatory variables to be correlated with the individual effects. One of the main drawbacks of the fixed effect estimator is that it removes the effects of time-invariant variables. In contrast, Hausman and Taylor estimators recover estimates for the time-invariant variables, which are important in empirical applications, (Baltagi et al., 2003). (Baltagi and Bresson, 2012) introduce the small sample properties and pretest estimation of a spatial Hausman and Taylor model.
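The drawback mentioned above, that the fixed effect estimator removes time-invariant variables, can be seen directly from the within transformation. The following is a small sketch on made-up data; the helper name `within_transform` and the simulated regressors are illustrative assumptions.

```python
import numpy as np

def within_transform(a, n, T):
    """Subtract each unit's time mean; rows are stacked unit by unit."""
    a = np.asarray(a, dtype=float).reshape(n, T, -1)
    return (a - a.mean(axis=1, keepdims=True)).reshape(n * T, -1)

rng = np.random.default_rng(0)
n, T = 5, 4
x = rng.normal(size=(n * T, 1))                      # time-varying regressor
z = np.repeat(rng.normal(size=(n, 1)), T, axis=0)    # time-invariant regressor

x_dm = within_transform(x, n, T)   # keeps the within-unit variation
z_dm = within_transform(z, n, T)   # every entry becomes zero
```

Because `z_dm` is identically zero, its coefficient cannot be estimated from the demeaned regression, which is the gap that the Hausman and Taylor estimator is meant to fill.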
(Sarafidis and Wansbeek, 2010) demonstrate the Monte Carlo method using the Hausman and Taylor estimator on robust panel data, as suggested by (Arellano, 1993). The size of the Mean Square Error (MSE) depends on the type and level of observation. The results of this study can be extended to the dynamic Hausman and Taylor model, in which a researcher can examine the sensitivity of using the difference transformation versus the mean or median transformation to eliminate individual effects. A further development of the Hausman and Taylor estimator was introduced by Harding and Lamarche (2014), who use a Hausman-Taylor instrumental variable approach to the penalized estimation of quantile panel models. In addition, (Maani et al., 2015) use the Hausman and Taylor estimator to show that immigrants' labor market integration is significantly affected by the local concentration and resources of their ethnic group.
The Hausman and Taylor estimator has been widely applied to real data. (Baltagi et al., 2014) used the Hausman and Taylor estimator setting to analyze SME data and trade, exports and FDI. The analysis makes it possible to control for the effect of common exogenous determinants and, at the same time, to estimate the direct relationship between exports and FDI in terms of bilateral effects that are unobserved and do not vary over time. This approach provides consistent parameter estimates and is superior to the corresponding fixed effect model. In addition, the analysis yielded parameter estimates for time-invariant variables, such as distance, that control for bilateral heterogeneity. (Egger and Pfaffemayr, 2001) derive a Hausman and Taylor estimator that allows testing for spatial dependence under homoscedastic or heteroscedastic disturbances. Their results contribute to the study of the large sample properties of the estimator and demonstrate its small sample performance through Monte Carlo simulations. (Cng et al., 2013) applied the Hausman and Taylor estimator to panel data for 1995 to 2011 that covered 18 of Vietnam's major partner countries and were provided by Vietnam's authorities and international organizations.
A good estimator comes as close as possible to representing the characteristics of the population, (Chapman and Robbins, 1951). The characteristics of a good estimator include unbiasedness, efficiency, and consistency, (Chan et al., 2008). The purpose of this study is to identify the properties of the Hausman and Taylor estimator in the linear panel data model. Identifying the characteristics of this estimator is useful for gaining broader insights in the field of spatial panel data as introduced by (Millo and Piras, 2012) and (Xu and Li, 2020).

B. LITERATURE REVIEW
1. Panel Data Random Effect Model
In the last four decades, longitudinal data analysis has attracted considerable research interest. In economics, this analysis is called panel data analysis and has developed into a major subfield of econometrics, (Greene, 2018). According to (Baltagi and Bresson, 2012), the advantages of analysis using panel data include more informative and more varied data, lower collinearity between variables, more degrees of freedom, and greater efficiency. A further advantage of studying repeated cross-sectional observations is that researchers can study the dynamics of change better, because panel data can detect and measure effects that cannot be observed in pure cross-sectional or pure time-series data. Another advantage is that panel data allow researchers to study more complex behavioural models, because autocorrelation between individuals can be handled better than in cross-sectional analysis or pure time series analysis, (Greene, 2018). The basic framework of the panel data regression model, which can capture differences in behavior between individuals, is expressed in the following equation:

$$y_{it} = x_{it}'\beta + z_i'\alpha + \varepsilon_{it},$$

with K independent variables in $x_{it}$. The heterogeneity arising from individual effects is expressed by $z_i'\alpha$, where $z_i$ contains a constant and a set of individual or group-specific variables, which may be observed, such as race, gender, location, and so on; or unobserved, such as family-specific characteristics, individual heterogeneity in skills or preferences, and so on, all of which are held to be constant over time (Gardebroek and Lansink, 2003). As it stands, this model is a classical regression model. If $z_i$ is observed for all individuals, then the entire model can be treated as an ordinary linear model and fit by least squares. Complications arise when the individual effect $c_i = z_i'\alpha$ is unobserved, which will be the case in most applications, (Baltagi and Liu, 2012).
The fixed effects model in panel data analysis allows unobserved individual effects to be correlated with the included variables, (Elhorst, 2014). It models the differences between units as parametric shifts of the regression function. This model may be viewed as applying only to the cross-sectional units in the study, not to additional ones outside the sample. For example, a comparison between countries might include the complete set of countries for which it is reasonable to assume that the model is constant.

As before, the random effects model can be formulated in blocks of T observations for group i, denoted $y_i$, $X_i$, $u_i i_T$, and $\varepsilon_i$. For these T observations, let $\eta_{it} = \varepsilon_{it} + u_i$ and $\eta_i = [\eta_{i1}, \eta_{i2}, \ldots, \eta_{iT}]'$, hereinafter referred to as the error component model. Assuming strict exogeneity, the error components satisfy

$$E[\eta_{it}^2 \mid X] = \sigma_\varepsilon^2 + \sigma_u^2 \quad \text{and} \quad E[\eta_{it}\eta_{is} \mid X] = \sigma_u^2, \quad t \neq s.$$

Thus, for the T observations of unit i, the variance matrix is

$$\Sigma = E[\eta_i \eta_i' \mid X] = \sigma_\varepsilon^2 I_T + \sigma_u^2 i_T i_T',$$

where $i_T$ is a T × 1 column vector of 1s.
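As a numerical illustration of the error component structure, the sketch below builds the variance matrix for one unit and checks it against simulated disturbances. It assumes the standard random effects form in which the matrix equals the error variance times the identity plus the heterogeneity variance times $i_T i_T'$; the function name and the parameter values are chosen only for illustration.

```python
import numpy as np

def error_component_cov(sigma_eps2, sigma_u2, T):
    """Sigma = sigma_eps^2 * I_T + sigma_u^2 * i_T i_T' for one unit."""
    i_T = np.ones((T, 1))                       # T x 1 column vector of 1s
    return sigma_eps2 * np.eye(T) + sigma_u2 * (i_T @ i_T.T)

Sigma = error_component_cov(sigma_eps2=1.0, sigma_u2=0.5, T=3)

# Monte Carlo check: eta_it = eps_it + u_i for many simulated units.
rng = np.random.default_rng(0)
n, T = 200_000, 3
eta = rng.normal(size=(n, T)) + np.sqrt(0.5) * rng.normal(size=(n, 1))
Sigma_hat = eta.T @ eta / n                     # sample second-moment matrix
```

The diagonal of `Sigma` holds the sum of the two variances, and every off-diagonal entry holds the heterogeneity variance alone, which is why observations of the same unit are equally correlated across all pairs of periods.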

C. RESEARCH METHOD
This research is basic research that aims to develop theory and to identify the characteristics of the Hausman and Taylor estimator. First, the author defines a linear regression model for panel data. Next, the author shows the difference between the fixed effect model and the random effect model. In the random effects model, the estimator used is the Hausman and Taylor estimator. After that, the author analyzes the desirable characteristics of the estimator, namely unbiasedness, minimum variance, consistency, and efficiency. Next, the author performs a data simulation. To determine whether the estimator has minimum variance, in practice it is necessary to compare the Hausman and Taylor estimator with other estimators such as the Generalized Least Squares estimator and the Feasible Generalized Least Squares estimator.

D. RESULT AND DISCUSSION
Consider the linear random effects panel data model, in which the component $u_i$ is the random heterogeneity specific to the i-th unit of observation and is constant over time t. Applying the ordinary least squares (OLS) estimation method to the stacked observations $y$ and $X$ gives the estimator $\hat{\beta}_{OLS} = (X'X)^{-1}X'y$.
Under the strict exogeneity assumption that underlies the Gauss-Markov theorem, the OLS estimator is an unbiased estimator, (Ahn and Schmidt, 2019). Statistically, an estimator is unbiased when its expected value equals the parameter value, (Chan et al., 2008). Practically speaking, unbiasedness means that the estimator on average represents the population parameter value. The unbiasedness of this estimator does not depend on the number of observations. Furthermore, the OLS estimator is a consistent estimator: when the sample is large, the estimator converges in probability to the parameter value.

The estimation is then continued using the Generalized Least Squares (GLS) method, which gives $\hat{\beta}_{GLS} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}y$, where $\Omega$ is the covariance matrix of the composite disturbances $\varepsilon_{it} + u_i$. The Hausman specification test is used to test the orthogonality of the common effects and the regressors. This test is based on the idea that under the hypothesis of no correlation, both the LSDV and FGLS estimators are consistent, but LSDV is inefficient, whereas under the alternative, LSDV is consistent, but FGLS is not. Breusch and Pagan in (Greene, 2018) introduced the Lagrange multiplier test for the random effects model based on OLS residuals, with statistical hypotheses $H_0: \sigma_u^2 = 0$ against $H_1: \sigma_u^2 \neq 0$.

From a practical point of view, the dummy variable approach is costly in terms of degrees of freedom, a cost that the random effects model avoids. The random effects model is based on the assumption that the unobserved individual-specific effects ($z_i$) are uncorrelated with the included variables ($x_{it}$). This assumption is the main weakness of the model. However, the random effects model allows the model to contain observed time-invariant characteristics, such as demographic characteristics, that cannot be estimated in the fixed effect model. Hausman and Taylor estimators overcome this weakness of the random effects model while still accommodating the possibility of a fixed effect model.
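The Lagrange multiplier test based on OLS residuals can be sketched as follows for a balanced panel. The statistic used here is the standard Breusch and Pagan form, nT / (2(T - 1)) times the square of (sum over units of the squared residual sums, divided by the total sum of squared residuals, minus 1); the function name and the simulated data are assumptions for illustration.

```python
import numpy as np

def breusch_pagan_lm(X, y, n, T):
    """LM statistic for H0: sigma_u^2 = 0, computed from pooled OLS residuals."""
    Xc = np.column_stack([np.ones(n * T), X])      # add the constant
    b = np.linalg.lstsq(Xc, y, rcond=None)[0]      # pooled OLS coefficients
    e = (y - Xc @ b).reshape(n, T)                 # one row of residuals per unit
    ratio = (e.sum(axis=1) ** 2).sum() / (e ** 2).sum()
    return n * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2

rng = np.random.default_rng(1)
n, T = 300, 5
X = rng.normal(size=(n * T, 2))
beta = np.array([1.0, -0.5])
eps = rng.normal(size=n * T)
u = np.repeat(rng.normal(size=n), T)               # random individual effect

lm_no_effect = breusch_pagan_lm(X, X @ beta + eps, n, T)
lm_effect = breusch_pagan_lm(X, X @ beta + u + eps, n, T)
```

Under the null hypothesis the statistic is compared with a chi-squared distribution with one degree of freedom. In this simulated design `lm_effect` should be very large, while `lm_no_effect` should be of the order expected under the null.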
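Consider the linear random effects model

$$y_{it} = x_{1it}'\beta_1 + x_{2it}'\beta_2 + z_{1i}'\alpha_1 + z_{2i}'\alpha_2 + \varepsilon_{it} + u_i.$$

The Hausman and Taylor estimator defines four sets of observed variables in the model as follows: $x_{1it}$ is a variable that varies with time and is not correlated with $u_i$, $z_{1i}$ is a time-invariant variable that is not correlated with $u_i$, $x_{2it}$ is a variable that varies with time and is correlated with $u_i$, and $z_{2i}$ is a time-invariant variable that is correlated with $u_i$. There are five assumptions needed to obtain the Hausman and Taylor estimator in the random effects model:
1. $E[u_i \mid x_{1it}, z_{1i}] = 0$, although $E[u_i \mid x_{2it}, z_{2i}] \neq 0$.
2. $Var[u_i \mid x_{1it}, z_{1i}, x_{2it}, z_{2i}] = \sigma_u^2$.
3. $Cov[\varepsilon_{it}, u_i \mid x_{1it}, z_{1i}, x_{2it}, z_{2i}] = 0$.
4. $Var[\varepsilon_{it} + u_i \mid x_{1it}, z_{1i}, x_{2it}, z_{2i}] = \sigma^2 = \sigma_\varepsilon^2 + \sigma_u^2$.
5. $Corr[\varepsilon_{it} + u_i, \varepsilon_{is} + u_i \mid x_{1it}, z_{1i}, x_{2it}, z_{2i}] = \rho = \sigma_u^2 / \sigma^2$.
The fourth assumption means that the variance of the sum of the error and the heterogeneity is constant and equals the sum of the error variance and the heterogeneity variance. The fifth assumption means that the correlation between the sums of error and heterogeneity in two periods, given the time-varying and time-invariant variables, is the ratio of the heterogeneity variance to the total variance.

The following are the steps to obtain a consistent and efficient Hausman and Taylor estimator:
1. Take deviations from group means.
2. Obtain the Least Squares Dummy Variable (LSDV) estimator $b_w$ of $\beta = (\beta_1, \beta_2)$ from the fixed effect model, based on $x_1$ and $x_2$.
3. Stack the group means of the residuals $e_{it}$ in a full-sample-length data vector, $e^*_{it} = \bar{e}_i = \frac{1}{T}\sum_{t=1}^{T}(y_{it} - x_{it}'b_w)$, where t = 1, . . . , T and i = 1, . . . , n. These group means are the dependent variable in an instrumental variable regression on $z_1$ and $z_2$, with $z_1$ and $x_1$ as instruments.
4. The residual variance in this regression is a consistent estimator of $\sigma^{*2} = \sigma_u^2 + \sigma_\varepsilon^2 / T$. This means that as T → ∞, $\sigma^{*2} = \sigma_u^2$.
5. Define the weighted instrumental variable estimator. With $w_{it}' = (x_{1it}', x_{2it}', z_{1i}', z_{2i}')$, the data are transformed as $w^*_{it} = w_{it} - \hat{\theta}\bar{w}_i$ and $y^*_{it} = y_{it} - \hat{\theta}\bar{y}_i$, where $\hat{\theta}$ is an estimator of $\theta = 1 - \sqrt{\sigma_\varepsilon^2 / (\sigma_\varepsilon^2 + T\sigma_u^2)}$. The transformed data are collected in the rows of the data matrix $W^*$ and in the column vector $y^*$.
6. For the time-invariant variables in $w_{it}$, the group mean is the original variable. The instrumental variables are $v_{it}' = [(x_{1it} - \bar{x}_{1i})', (x_{2it} - \bar{x}_{2i})', z_{1i}', \bar{x}_{1i}']$, stacked in the rows of the matrix $V$. Thus, the instrumental variable estimator for the Hausman and Taylor random effects model is

$$(\hat{\beta}', \hat{\alpha}')' = [(W^{*\prime}V)(V'V)^{-1}(V'W^*)]^{-1}[(W^{*\prime}V)(V'V)^{-1}(V'y^*)].$$

As a case study, a data set obtained from the Panel Study of Income Dynamics (PSID) for 1968-1972, covering 750 men aged 25 to 55 years, is used to examine the economic benefits of education. The variables of concern include experience, schooling, and health. The main focus of this study is the coefficient on schooling. Because schooling and, possibly, experience and unemployment are correlated with the latent effects, there may be serious bias in conventional estimates of this equation. The schooling variable was treated as endogenous (correlated with $u_i$) in both cases. The coefficient on the schooling variable was estimated to be 0.067, a value which the authors suspected was too small. As we saw earlier, even in the presence of a correlation between the measured effect and the latent effect, in this model the Hausman and Taylor estimators provide a consistent estimator of the coefficients on the time-varying variables. The HT column presents the instrumental variable estimates. Therefore, in future studies we can use them in the Hausman specification test to test the correlation between the included variables and latent heterogeneity.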
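The steps for the Hausman and Taylor estimator can be sketched in code on simulated data. This is a minimal illustration under stated assumptions, not the procedure applied to the PSID data: the panel is balanced with rows stacked unit by unit, the constant is placed in the first time-invariant block, the weight is taken as 1 - sqrt(s2_eps / (s2_eps + T * s2_u)), and all function names, variable names, and parameter values are hypothetical.

```python
import numpy as np

def unit_means(a, n, T):
    """Unit means repeated T times, aligned with rows stacked unit by unit."""
    return np.repeat(a.reshape(n, T, -1).mean(axis=1), T, axis=0)

def iv(W, V, y):
    """Instrumental variable estimator (W'P_V W)^{-1} W'P_V y with instruments V."""
    W_hat = V @ np.linalg.lstsq(V, W, rcond=None)[0]   # projection of W on V
    return np.linalg.solve(W_hat.T @ W, W_hat.T @ y)

def hausman_taylor(y, X1, X2, Z1, Z2, n, T):
    """Z1 is assumed to contain the constant column."""
    X, Z = np.hstack([X1, X2]), np.hstack([Z1, Z2])
    # Deviations from group means and the LSDV (within) estimator of beta.
    Xw = X - unit_means(X, n, T)
    yw = y - unit_means(y, n, T).ravel()
    b_w = np.linalg.solve(Xw.T @ Xw, Xw.T @ yw)
    e_w = yw - Xw @ b_w
    s2_eps = e_w @ e_w / (n * T - n - X.shape[1])
    # Group means of y - x'b_w, stacked to full length, regressed on Z by IV
    # with instruments (Z1, X1).
    d = unit_means(y - X @ b_w, n, T).ravel()
    a_iv = iv(Z, np.hstack([Z1, X1]), d)
    # The residual variance estimates sigma_u^2 + sigma_eps^2 / T.
    r = d - Z @ a_iv
    s2_u = max(r @ r / (n * T) - s2_eps / T, 0.0)
    theta = 1.0 - np.sqrt(s2_eps / (s2_eps + T * s2_u))
    # Weighted IV on the quasi-demeaned data.
    W = np.hstack([X, Z])
    W_star = W - theta * unit_means(W, n, T)
    y_star = y - theta * unit_means(y, n, T).ravel()
    V = np.hstack([Xw, Z1, unit_means(X1, n, T)])
    return iv(W_star, V, y_star), theta

# Simulated panel in which x2 and z2 are correlated with the effect u_i.
rng = np.random.default_rng(7)
n, T = 1000, 5
rep = lambda a: np.repeat(a, T, axis=0)            # unit-level value -> T rows
u, m = rng.normal(size=(n, 1)), rng.normal(size=(n, 1))
X1 = rep(m) + rng.normal(size=(n * T, 1))          # time-varying, exogenous
X2 = rep(u) + rng.normal(size=(n * T, 1))          # time-varying, endogenous
Z1 = np.hstack([np.ones((n * T, 1)), rep(rng.normal(size=(n, 1)))])
Z2 = rep(m + u + rng.normal(size=(n, 1)))          # time-invariant, endogenous
truth = np.array([1.0, 2.0, 0.5, -1.0, 1.5])       # x1, x2, constant, z1, z2
y = np.hstack([X1, X2, Z1, Z2]) @ truth + rep(u).ravel() + rng.normal(size=n * T)

coef, theta = hausman_taylor(y, X1, X2, Z1, Z2, n, T)
```

In this design the group mean of `X1` is correlated with `Z2` through the shared component `m`, which is what allows the exogenous time-varying variable to serve as an instrument for the endogenous time-invariant one. The number of variables in `X1` must be at least as large as the number in `Z2` for the coefficients to be identified.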

E. CONCLUSION AND SUGGESTION
Based on the results and discussion, it is found that the Hausman and Taylor estimator for the random effects panel data model is consistent and efficient, even though it is not unbiased.