3. Panel Data Models
Panel data may have group effects, time effects, or both. These effects are either fixed or random. A fixed effect model assumes differences in intercepts across groups or time periods, whereas a random effect model explores differences in error variances. A one-way model includes only one set of dummy variables (e.g., firm), while a two-way model considers two sets of dummy variables (e.g., firm and year). Model 2 in Chapter 2 is, in fact, a one-way fixed group effect panel data model.
3.1 Functional Forms and Notation
The functional forms of one-way panel data models are as follows.
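Fixed group effect model: $y_{it} = (\alpha + \mu_i) + X_{it}'\beta + v_{it}$, where $v_{it} \sim \mathrm{IID}(0, \sigma_v^2)$
Random group effect model: $y_{it} = \alpha + X_{it}'\beta + (\mu_i + v_{it})$, where $v_{it} \sim \mathrm{IID}(0, \sigma_v^2)$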
The dummy variable is a part of the intercept in the fixed effect model and a part of error in the random effect model.
$v_{it} \sim \mathrm{IID}(0, \sigma_v^2)$ indicates that errors are independent and identically distributed.
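The notation used in this document is as follows: n is the number of groups (e.g., firms); T is the number of time periods; N = nT is the total number of observations; k is the number of regressors excluding dummy variables; K = k + 1 also counts the intercept; and $\bar{y}_{i\bullet}$, $\bar{y}_{\bullet t}$, and $\bar{y}_{\bullet\bullet}$ denote the mean of y for group i, the mean at time t, and the overall mean (similarly for x).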
3.2 Fixed Effect Models
There are several strategies for estimating fixed effect models. The least squares dummy variable model (LSDV) uses dummy variables, whereas the within effect model does not. These two strategies produce identical slopes for the non-dummy independent variables. The between effect model also does not use dummies, but it produces different parameter estimates. Each strategy has pros and cons (Table 5).
3.2.1 Estimations: LSDV, Within Effect, and Between Effect Model
As discussed in Chapter 2, LSDV is widely used because it is relatively easy to estimate and interpret substantively. LSDV, however, becomes problematic when there are many groups or subjects in the panel data. If T is fixed and the number of groups n goes to infinity, only the coefficients of the regressors are consistent. The coefficients of the dummy variables are not consistent, since the number of these parameters increases as n increases (Baltagi 2001). This is the so-called incidental parameter problem. Under this circumstance LSDV is of little use, which calls for another strategy, the within effect model.
The within effect model does not use dummy variables; it uses deviations from group means instead. Thus, this model is the OLS of $(y_{it} - \bar{y}_{i\bullet}) = (x_{it} - \bar{x}_{i\bullet})'\beta + (v_{it} - \bar{v}_{i\bullet})$ without an intercept. You do not need to worry about the incidental parameter problem any more. The parameter estimates of the regressors are identical to those of LSDV. The within effect model, in turn, has several disadvantages.
Table 5. Three Strategies for Fixed Effect Models
Since this model does not report dummy coefficients, you need to compute them using the formula $d_g^* = \bar{y}_{g\bullet} - \bar{x}_{g\bullet}'\beta$. Since no dummy is used, the within effect model has larger degrees of freedom for error, resulting in a smaller MSE (mean square error) and incorrect (too small) standard errors of the parameter estimates. Thus, you have to adjust the standard errors using the formula $se_k^* = se_k \sqrt{\frac{df_{error}^{within}}{df_{error}^{LSDV}}} = se_k \sqrt{\frac{nT-k}{nT-n-k}}$. Finally, the $R^2$ of the within effect model is not correct because the intercept is suppressed.
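The relationship between LSDV and the within effect model can be checked numerically. The following is a minimal NumPy sketch on simulated data (the panel sizes, coefficients, and variable names are illustrative assumptions, not part of the original text). It fits LSDV with one dummy per group and no intercept, fits the within model on group-demeaned data, recovers the dummy coefficients as group mean of y minus group mean of x times the slopes, and rescales the naive within standard errors from nT-k to nT-n-k degrees of freedom.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 5, 8, 2                       # groups, periods, regressors (illustrative sizes)
group = np.repeat(np.arange(n), T)      # group index of each of the nT rows
mu = rng.normal(size=n)                 # simulated group effects (assumed data)
X = rng.normal(size=(n * T, k)) + mu[group][:, None]
y = 1.0 + X @ np.array([0.5, -2.0]) + mu[group] + rng.normal(scale=0.3, size=n * T)

# LSDV: one dummy per group, no separate intercept.
D = (group[:, None] == np.arange(n)).astype(float)
coef_lsdv, *_ = np.linalg.lstsq(np.hstack([D, X]), y, rcond=None)
d_lsdv, b_lsdv = coef_lsdv[:n], coef_lsdv[n:]

def group_means(a):
    """Return the n group means of a (rows are grouped by `group`)."""
    return np.array([a[group == g].mean(axis=0) for g in range(n)])

# Within effect model: OLS on deviations from group means, no intercept.
Xbar, ybar = group_means(X), group_means(y)
Xw, yw = X - Xbar[group], y - ybar[group]
b_within, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

# Dummy coefficients recovered from the within estimates.
d_within = ybar - Xbar @ b_within

# Naive within standard errors use df = nT - k; rescale to df = nT - n - k.
sse = np.sum((yw - Xw @ b_within) ** 2)
se_naive = np.sqrt(sse / (n * T - k) * np.diag(np.linalg.inv(Xw.T @ Xw)))
se_adjusted = se_naive * np.sqrt((n * T - k) / (n * T - n - k))
```

In this sketch `b_within` matches `b_lsdv` and `d_within` matches `d_lsdv` up to floating point error, while `se_naive` is smaller than `se_adjusted`.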
The between group effect model, also called group mean regression, uses the group means of the dependent and independent variables. That is, you run OLS of $\bar{y}_{i\bullet} = \alpha + \bar{x}_{i\bullet}'\beta + \varepsilon_i$. The number of observations decreases to n. This model uses aggregated data to test effects between groups (or individuals), assuming no group or time effects. Table 5 contrasts LSDV, the within effect model, and the between group effect model. In the two-way fixed effect model, LSDV2 and the between effect model are not valid.
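As a rough illustration of group mean regression, the sketch below collapses a simulated panel to n group means and runs OLS on them. The function name `between_estimator` and the data are hypothetical and only serve the example.

```python
import numpy as np

def between_estimator(y, X, group):
    """OLS on group means; returns [alpha, beta_1, ..., beta_k] from n observations."""
    labels = np.unique(group)
    ybar = np.array([y[group == g].mean() for g in labels])
    Xbar = np.array([X[group == g].mean(axis=0) for g in labels])
    Z = np.column_stack([np.ones(labels.size), Xbar])   # intercept plus group-mean regressors
    coef, *_ = np.linalg.lstsq(Z, ybar, rcond=None)
    return coef

# Simulated balanced panel (assumed data): 20 groups observed over 4 periods.
rng = np.random.default_rng(1)
n, T = 20, 4
group = np.repeat(np.arange(n), T)
X = rng.normal(size=(n * T, 1))
y = 2.0 + 3.0 * X[:, 0] + rng.normal(scale=0.1, size=n * T)
coef_between = between_estimator(y, X, group)
```

Only n rows enter the regression, so the error degrees of freedom are n minus the number of estimated parameters rather than anything based on nT.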
3.2.2 Testing Group Effects
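The null hypothesis is that all dummy parameters except one are zero: $H_0: \mu_1 = \dots = \mu_{n-1} = 0$. This hypothesis is tested by an F test based on the loss of goodness of fit. In the following formula, the robust model is LSDV and the efficient model is the pooled regression.
$F = \frac{(e'e_{Efficient} - e'e_{Robust})/(n-1)}{e'e_{Robust}/(nT-n-k)} = \frac{(R^2_{Robust} - R^2_{Efficient})/(n-1)}{(1 - R^2_{Robust})/(nT-n-k)} \sim F(n-1,\; nT-n-k)$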
If the null hypothesis is rejected, you may conclude that the fixed group effect model is better than the pooled OLS model.
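The F test for fixed group effects compares the SSE of the pooled regression with the SSE of LSDV. A minimal sketch follows, assuming a balanced panel; `group_effect_f` and `sse_ols` are hypothetical helper names, and the data are simulated. It returns the statistic and its degrees of freedom (n-1 and nT-n-k) and leaves the p-value lookup to whatever F distribution routine you prefer.

```python
import numpy as np

def sse_ols(Z, y):
    """Sum of squared residuals from OLS of y on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return float(resid @ resid)

def group_effect_f(y, X, group):
    """F statistic for H0: all group dummies are zero, with (df1, df2)."""
    labels = np.unique(group)
    n, N, k = labels.size, y.size, X.shape[1]
    ones = np.ones((N, 1))
    D = (group[:, None] == labels[1:]).astype(float)   # n-1 dummies, first group is the baseline
    sse_pooled = sse_ols(np.hstack([ones, X]), y)      # efficient model: pooled OLS
    sse_lsdv = sse_ols(np.hstack([ones, D, X]), y)     # robust model: LSDV
    df1, df2 = n - 1, N - n - k
    f_stat = ((sse_pooled - sse_lsdv) / df1) / (sse_lsdv / df2)
    return f_stat, df1, df2

# Simulated panel with clearly different group intercepts (assumed data).
rng = np.random.default_rng(2)
n, T = 5, 8
group = np.repeat(np.arange(n), T)
mu = np.linspace(-2.0, 2.0, n)
X = rng.normal(size=(n * T, 2))
y = 1.0 + X @ np.array([0.5, -1.0]) + mu[group] + rng.normal(scale=0.3, size=n * T)
f_stat, df1, df2 = group_effect_f(y, X, group)
```

A large `f_stat` relative to the F(df1, df2) critical value leads to rejecting the pooled model in favor of the fixed group effect model.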
3.2.3 Fixed Time Effect and Two-way Fixed Effect Models
For the fixed time effects model, you need to switch n and T, and i and t in the formulas.
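Written out, the switch of n and T (and of i and t) gives the following expressions. This is a sketch derived from the one-way group effect formulas, with $\tau_t$ as an assumed symbol for the time effect.

```latex
\begin{align*}
y_{it} &= (\alpha + \tau_t) + X_{it}'\beta + v_{it}
  && \text{(fixed time effect model)} \\
y_{it} - \bar{y}_{\bullet t} &= (x_{it} - \bar{x}_{\bullet t})'\beta + (v_{it} - \bar{v}_{\bullet t})
  && \text{(within effect model)} \\
d_t^* &= \bar{y}_{\bullet t} - \bar{x}_{\bullet t}'\beta
  && \text{(dummy coefficients)} \\
se_k^* &= se_k \sqrt{\frac{Tn - k}{Tn - T - k}}
  && \text{(adjusted standard errors)} \\
\bar{y}_{\bullet t} &= \alpha + \bar{x}_{\bullet t}'\beta + \varepsilon_t
  && \text{(between effect model, } T \text{ observations)} \\
F &= \frac{(e'e_{Efficient} - e'e_{Robust})/(T-1)}{e'e_{Robust}/(Tn - T - k)}
  \sim F(T-1,\; Tn - T - k)
  && \text{(test of } H_0: \tau_1 = \dots = \tau_{T-1} = 0\text{)}
\end{align*}
```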
The fixed group and time effect model uses slightly different formulas. The within effect model of this two-way fixed model has four approaches for LSDV (see 6.1 for details).
3.3 Random Effect Models
A random effect model is estimated by generalized least squares (GLS) when the variance structure is known, and by feasible generalized least squares (FGLS) when it is unknown. Compared to fixed effect models, random effect models are relatively difficult to estimate. This document assumes that panel data are balanced.
3.3.1 Generalized Least Squares (GLS)
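When $\Omega$ is known (given), GLS based on the true variance components is BLUE, and all the feasible GLS estimators considered are asymptotically efficient as either n or T approaches infinity (Baltagi 2001). The $\Omega$ matrix looks like
$\Omega_{T \times T} = \begin{bmatrix} \sigma_\mu^2 + \sigma_v^2 & \sigma_\mu^2 & \cdots & \sigma_\mu^2 \\ \sigma_\mu^2 & \sigma_\mu^2 + \sigma_v^2 & \cdots & \sigma_\mu^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_\mu^2 & \sigma_\mu^2 & \cdots & \sigma_\mu^2 + \sigma_v^2 \end{bmatrix}$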
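In GLS, you just need to compute $\theta$ from the variance components in the $\Omega$ matrix: $\theta = 1 - \sqrt{\frac{\sigma_v^2}{T\sigma_\mu^2 + \sigma_v^2}}$. Then transform the variables as follows.
$y_{it}^* = y_{it} - \theta \bar{y}_{i\bullet}$
$x_{it}^* = x_{it} - \theta \bar{x}_{i\bullet}$ for all regressors
The column of ones for the intercept becomes $1 - \theta$.
Finally, run OLS with the transformed variables: $y_{it}^* = \alpha(1 - \theta) + x_{it}^{*\prime}\beta + v_{it}^*$. Since $\Omega$ is often unknown, FGLS is used more frequently than GLS.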
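The GLS steps can be sketched in NumPy as follows, assuming a balanced panel and variance components that are simply handed in as known. The function name `re_gls` and the simulated data are hypothetical. The sketch computes theta, subtracts theta times the group means from y and from each regressor, replaces the constant by 1 - theta, and runs OLS on the transformed variables.

```python
import numpy as np

def re_gls(y, X, group, sigma2_mu, sigma2_v):
    """One-way random effect GLS for a balanced panel with known variance components."""
    labels, idx = np.unique(group, return_inverse=True)
    n = labels.size
    T = y.size // n                                    # balanced panel assumed
    theta = 1.0 - np.sqrt(sigma2_v / (T * sigma2_mu + sigma2_v))
    counts = np.bincount(idx)
    ybar = np.bincount(idx, weights=y) / counts        # group means of y
    Xbar = np.column_stack(
        [np.bincount(idx, weights=X[:, j]) / counts for j in range(X.shape[1])]
    )                                                  # group means of each regressor
    y_star = y - theta * ybar[idx]
    X_star = X - theta * Xbar[idx]
    const_star = np.full((y.size, 1), 1.0 - theta)     # transformed column of ones
    coef, *_ = np.linalg.lstsq(np.hstack([const_star, X_star]), y_star, rcond=None)
    return theta, coef                                 # coef[0] is alpha, coef[1:] is beta

# Simulated balanced panel (assumed data) with sigma_mu^2 = 1 and sigma_v^2 = 0.25.
rng = np.random.default_rng(3)
n, T = 6, 5
group = np.repeat(np.arange(n), T)
X = rng.normal(size=(n * T, 2))
y = (1.0 + X @ np.array([0.5, -1.0])
     + rng.normal(size=n)[group] + rng.normal(scale=0.5, size=n * T))
theta, coef = re_gls(y, X, group, sigma2_mu=1.0, sigma2_v=0.25)
```

With theta equal to 0 the transformation leaves the data unchanged (pooled OLS), and as theta approaches 1 it approaches the within transformation, which is one way to see the random effect estimator as lying between the two.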
3.3.2 Feasible Generalized Least Squares (FGLS)
The estimation of the two-way random effect model is skipped here, since it is complicated.
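For the one-way case, FGLS first needs estimates of the variance components. The sketch below uses one common choice, stated here as an assumption rather than as this document's own formulas: the error variance is estimated from the SSE of the within effect model, the between variance from the SSE of the group mean regression, and the group variance component as their difference, truncated at zero. The function name `fgls_components` and the simulated data are hypothetical.

```python
import numpy as np

def fgls_components(y, X, group):
    """Estimate (sigma2_mu, sigma2_v, theta) for a balanced one-way random effect model."""
    labels, idx = np.unique(group, return_inverse=True)
    n, k = labels.size, X.shape[1]
    T = y.size // n                                     # balanced panel assumed
    counts = np.bincount(idx)
    ybar = np.bincount(idx, weights=y) / counts
    Xbar = np.column_stack(
        [np.bincount(idx, weights=X[:, j]) / counts for j in range(k)]
    )

    # Within regression: SSE / (nT - n - k) estimates sigma_v^2.
    yw, Xw = y - ybar[idx], X - Xbar[idx]
    bw, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    sigma2_v = np.sum((yw - Xw @ bw) ** 2) / (n * T - n - k)

    # Between regression: SSE / (n - K), with K = k + 1, estimates sigma_mu^2 + sigma_v^2 / T.
    Zb = np.column_stack([np.ones(n), Xbar])
    bb, *_ = np.linalg.lstsq(Zb, ybar, rcond=None)
    sigma2_between = np.sum((ybar - Zb @ bb) ** 2) / (n - k - 1)

    # The difference can be negative in small samples; truncate at zero (a design choice).
    sigma2_mu = max(sigma2_between - sigma2_v / T, 0.0)
    theta = 1.0 - np.sqrt(sigma2_v / (T * sigma2_mu + sigma2_v))
    return sigma2_mu, sigma2_v, theta

# Simulated panel (assumed data) with true sigma_mu^2 = 1 and sigma_v^2 = 0.25.
rng = np.random.default_rng(4)
n, T = 200, 5
group = np.repeat(np.arange(n), T)
X = rng.normal(size=(n * T, 1))
y = (1.0 + 0.5 * X[:, 0]
     + rng.normal(scale=1.0, size=n)[group] + rng.normal(scale=0.5, size=n * T))
s2_mu, s2_v, theta_hat = fgls_components(y, X, group)
```

The estimated `theta_hat` would then be used to transform y, the regressors, and the constant before running OLS, exactly as in GLS with a known variance structure.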
3.3.3 Testing Random Effects (LM test)
The null hypothesis is that the cross-sectional variance components are zero, $H_0: \sigma_\mu^2 = 0$. Breusch and Pagan (1980) developed the Lagrange multiplier (LM) test (Greene 2003; Judge et al. 1988). In the following formula, $\bar{e}$ is the $n \times 1$ vector of the group-specific means of the pooled regression residuals, and $e'e$ is the SSE of the pooled OLS regression. The LM statistic is distributed as chi-squared with one degree of freedom.
$LM_\mu = \frac{nT}{2(T-1)}\left[\frac{T^2\,\bar{e}'\bar{e}}{e'e} - 1\right]^2 \sim \chi^2(1)$
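Baltagi (2001) presents the same LM test in a different way.
$LM_\mu = \frac{nT}{2(T-1)}\left[\frac{\sum_i \left(\sum_t e_{it}\right)^2}{\sum_i \sum_t e_{it}^2} - 1\right]^2 = \frac{nT}{2(T-1)}\left[\frac{\sum_i \left(T\bar{e}_{i\bullet}\right)^2}{\sum_i \sum_t e_{it}^2} - 1\right]^2 \sim \chi^2(1)$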
The two-way random effect model has the null hypothesis that both the cross-sectional and the time-series variance components are zero. The LM test adds the two one-way statistics for group and time: $LM_{group} + LM_{time} \sim \chi^2(2)$.
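The one-way LM statistic needs only the pooled OLS residuals. Here is a minimal sketch, assuming a balanced panel; `breusch_pagan_lm` is a hypothetical function name and the data are simulated. It computes nT / (2(T-1)) times the squared difference between 1 and the ratio of the summed squared group residual totals to the pooled SSE.

```python
import numpy as np

def breusch_pagan_lm(resid, group):
    """One-way Breusch-Pagan LM statistic from pooled OLS residuals (balanced panel)."""
    labels, idx = np.unique(group, return_inverse=True)
    n = labels.size
    T = resid.size // n
    group_sums = np.bincount(idx, weights=resid)       # sum over t of e_it, i.e. T * ebar_i
    ratio = np.sum(group_sums ** 2) / np.sum(resid ** 2)
    return n * T / (2.0 * (T - 1)) * (ratio - 1.0) ** 2

# Simulated panel with a sizable group variance component (assumed data).
rng = np.random.default_rng(5)
n, T = 30, 6
group = np.repeat(np.arange(n), T)
X = rng.normal(size=(n * T, 1))
y = (1.0 + 0.5 * X[:, 0]
     + rng.normal(scale=1.0, size=n)[group] + rng.normal(scale=0.5, size=n * T))

# Pooled OLS residuals.
Z = np.column_stack([np.ones(n * T), X])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
resid = y - Z @ coef
lm_stat = breusch_pagan_lm(resid, group)
```

The statistic is compared with a chi-squared distribution with one degree of freedom, whose 5 percent critical value is about 3.84. Passing a time index instead of `group` (with n and T swapped accordingly) gives the time component for the two-way test.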
3.4 Hausman Test: Fixed Effects versus Random Effects
The Hausman specification test compares the fixed and random effect models under the null hypothesis that the individual effects are uncorrelated with the other regressors in the model (Hausman 1978). If they are correlated (H0 is rejected), a random effect model produces biased estimators, violating one of the Gauss-Markov assumptions, so a fixed effect model is preferred. Hausman's essential result is that the covariance of an efficient estimator with its difference from an inefficient estimator is zero (Greene 2003).
$m = (b_{robust} - b_{efficient})'\,\hat{\Sigma}^{-1}\,(b_{robust} - b_{efficient}) \sim \chi^2(k)$
$\hat{\Sigma} = Var(b_{robust}) - Var(b_{efficient})$ is the difference between the estimated covariance matrix of the parameter estimates in the LSDV model (robust) and that of the random effect model (efficient). Note that the intercept and the dummy variables should be excluded from the computation.
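Given the slope estimates and their covariance matrices from both models, the Hausman statistic is a single quadratic form. The sketch below uses made-up numbers purely to show the arithmetic; `hausman` is a hypothetical function name, and the inputs must already exclude the intercept and the dummy coefficients.

```python
import numpy as np

def hausman(b_fe, V_fe, b_re, V_re):
    """Hausman statistic m and its degrees of freedom (number of slopes compared)."""
    diff = b_fe - b_re
    sigma_hat = V_fe - V_re                 # Var(robust) - Var(efficient)
    m = float(diff @ np.linalg.solve(sigma_hat, diff))
    return m, diff.size

# Illustrative inputs only (not estimates from any real data set).
b_fe = np.array([1.0, 2.0])                 # fixed effect (LSDV or within) slopes
b_re = np.array([0.8, 2.1])                 # random effect slopes
V_fe = np.diag([0.05, 0.04])
V_re = np.diag([0.01, 0.02])
m_stat, df = hausman(b_fe, V_fe, b_re, V_re)
```

In finite samples the difference of the two covariance matrices is not guaranteed to be positive definite, in which case the statistic can be negative or unreliable; the sketch does not guard against that.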
3.5 Poolability Test
What is poolability? It asks whether slopes are the same across groups or over time. Thus, the null hypothesis of the poolability test across groups is $H_0: \beta_{ik} = \beta_k$. Remember that slopes remain constant in fixed and random effect models; only intercepts and error variances matter.
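The poolability test is undertaken under the assumption of normally distributed errors, $v \sim N(0, \sigma^2 I_{nT})$. This test uses the F statistic
$F_{obs} = \frac{\left(e'e - \sum_i e_i'e_i\right)/\left((n-1)K\right)}{\sum_i e_i'e_i / \left(n(T-K)\right)} \sim F\left[(n-1)K,\; n(T-K)\right]$,
where $e'e$ is the SSE of the pooled OLS, $e_i'e_i$ is the SSE of the OLS regression for group i, and K = k + 1 is the number of parameters including the intercept. If the null hypothesis is rejected, the panel data are not poolable. Under this circumstance, you may turn to the random coefficient model or the hierarchical regression model.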
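Similarly, the null hypothesis of the poolability test over time is $H_0: \beta_{tk} = \beta_k$. The F test is
$F_{obs} = \frac{\left(e'e - \sum_t e_t'e_t\right)/\left((T-1)K\right)}{\sum_t e_t'e_t / \left(T(n-K)\right)} \sim F\left[(T-1)K,\; T(n-K)\right]$,
where $e_t'e_t$ is the SSE of the OLS regression at time t.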
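Both poolability tests follow the same recipe: compare the pooled SSE with the sum of the SSEs from separate regressions, one per group or one per time period. A minimal sketch under that reading follows; `poolability_f` is a hypothetical function name and the data are simulated with slopes that differ by group.

```python
import numpy as np

def sse_ols(Z, y):
    """Sum of squared residuals from OLS of y on the columns of Z."""
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return float(resid @ resid)

def poolability_f(y, X, index):
    """F statistic for equal coefficients across the units in `index` (groups or periods)."""
    labels = np.unique(index)
    m = labels.size                          # n groups, or T periods
    N, K = y.size, X.shape[1] + 1            # K counts the intercept
    Z = np.column_stack([np.ones(N), X])
    sse_pooled = sse_ols(Z, y)
    sse_separate = sum(sse_ols(Z[index == g], y[index == g]) for g in labels)
    df1, df2 = (m - 1) * K, N - m * K        # N - mK equals n(T-K) or T(n-K) when balanced
    f_stat = ((sse_pooled - sse_separate) / df1) / (sse_separate / df2)
    return f_stat, df1, df2

# Simulated panel whose slope differs across the four groups (assumed data).
rng = np.random.default_rng(6)
n, T = 4, 10
group = np.repeat(np.arange(n), T)
time = np.tile(np.arange(T), n)
X = rng.normal(size=(n * T, 1))
slopes = np.array([0.5, 1.0, 1.5, 2.0])
y = 1.0 + slopes[group] * X[:, 0] + rng.normal(scale=0.2, size=n * T)

f_group, df1_g, df2_g = poolability_f(y, X, group)   # poolability across groups
f_time, df1_t, df2_t = poolability_f(y, X, time)     # poolability over time
```

Each separate regression needs more observations than parameters (T > K for the group test, n > K for the time test); otherwise its SSE is zero by construction and the statistic is not meaningful.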