## 7. Random Effect Models

The random effects model examines how group and/or time affect error variances. This model is appropriate for n individuals who were drawn randomly from a large population. This chapter focuses on the feasible generalized least squares (FGLS) with variance component estimation methods from Baltagi and Chang (1994), Fuller and Battese (1974), and Wansbeek and Kapteyn (1989).
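As a compact reminder of what "affecting error variances" means, the one-way random group effect model can be sketched as follows. The notation is an assumption chosen to match the labels in the Stata and LIMDEP outputs in this chapter ($u_i$ for the group effect, $\varepsilon_{it}$ for the idiosyncratic error, $v_{it}$ for the composite error), and the sketch assumes that $u_i$ and $\varepsilon_{it}$ are independent of each other and of the regressors.

```latex
\begin{aligned}
y_{it} &= \alpha + X_{it}'\beta + v_{it}, \qquad v_{it} = u_i + \varepsilon_{it},\\
\operatorname{Var}(v_{it}) &= \sigma_u^2 + \sigma_\varepsilon^2,\\
\operatorname{Cov}(v_{it}, v_{is}) &= \sigma_u^2 \quad (t \neq s),\\
\operatorname{Corr}(v_{it}, v_{is}) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_\varepsilon^2}.
\end{aligned}
```

The last ratio corresponds to what Stata labels rho and LIMDEP labels Corr[v(i,t),v(i,s)] in the outputs of this chapter.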

7.1 The One-way Random Group Effect Model

When the omega matrix is not known, you have to estimate theta using the SSEs of the between effect model (.0317) and the fixed effect model (.2926).

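The variance component of error is .00361263 = .292622872/(6*15-6-3).
The variance component of group is .01559712 = .031675926/(6-4) - .00361263/15.

Thus, the theta estimate is $\hat{\theta} = 1 - \sqrt{\hat{\sigma}_\varepsilon^2 / (T\hat{\sigma}_u^2 + \hat{\sigma}_\varepsilon^2)} = 1 - \sqrt{.00361263 / (15 \times .01559712 + .00361263)} = .87668488$.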
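The arithmetic in this step can be checked with a short do-file fragment. This is a minimal sketch: the scalar names are illustrative, the SSEs are the values quoted in the text, and the theta expression is the usual FGLS form, 1 - sqrt(s2_e/(T*s2_u + s2_e)), which is assumed here because it agrees with the .87668488 used in the transformation commands.

```stata
* Sketch: variance components and theta for the one-way group effect model.
* Scalar names are illustrative; the SSEs are the values quoted in the text.
scalar n      = 6              // number of groups (airlines)
scalar T      = 15             // time periods per group
scalar k      = 3              // regressors, excluding the intercept
scalar sse_fe = .292622872     // SSE of the fixed effect (within) model
scalar sse_be = .031675926     // SSE of the between effect model

scalar s2_e  = sse_fe/(n*T - n - k)         // error variance component
scalar s2_u  = sse_be/(n - k - 1) - s2_e/T  // group variance component
scalar theta = 1 - sqrt(s2_e/(T*s2_u + s2_e))

display "sigma2_e = " s2_e
display "sigma2_u = " s2_u
display "theta    = " theta
```

The three displayed values should come out close to the .0036, .0156, and .8767 reported in this section.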

Now, transform the dependent and independent variables including the intercept.

. gen rg_cost = cost - .87668488*gm_cost // transform variables
. gen rg_output = output - .87668488*gm_output
. gen rg_fuel = fuel - .87668488*gm_fuel
. gen rg_load = load - .87668488*gm_load
. gen rg_int = 1 - .87668488 // for the intercept
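The gm_ variables used in this transformation are not created in this section. A minimal sketch of how they could be built, assuming they are plain within-airline averages of each variable (the names simply follow the commands above):

```stata
* Sketch: group means used in the transformation, assuming they are
* within-airline averages; run this before the gen commands.
egen gm_cost   = mean(cost),   by(airline)
egen gm_output = mean(output), by(airline)
egen gm_fuel   = mean(fuel),   by(airline)
egen gm_load   = mean(load),   by(airline)
```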

Finally, run OLS on the transformed variables. Do not forget to suppress the intercept with the noc option, because rg_int already carries the transformed constant. This is the groupwise heteroscedastic regression model (Greene 2003).

. regress rg_cost rg_int rg_output rg_fuel rg_load, noc

Source |       SS       df       MS              Number of obs =      90
-------------+------------------------------           F(  4,    86) =19642.72
Model |  284.670313     4  71.1675783           Prob > F      =  0.0000
Residual |  .311586777    86  .003623102           R-squared     =  0.9989
Total |    284.9819    90  3.16646556           Root MSE      =  .06019

------------------------------------------------------------------------------
rg_cost |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
rg_int |   9.627911   .2101638    45.81   0.000     9.210119     10.0457
rg_output |   .9066808   .0256249    35.38   0.000     .8557401    .9576215
rg_fuel |   .4227784   .0140248    30.15   0.000      .394898    .4506587
rg_load |    -1.0645   .2000703    -5.32   0.000    -1.462226   -.6667731
------------------------------------------------------------------------------

7.2 Estimations in SAS, Stata, and LIMDEP

The SAS TSCSREG and PANEL procedures have the /RANONE option to fit the one-way random effect model. These procedures by default use the Fuller and Battese (1974) estimation method, which produces slightly different estimates from FGLS.

PROC TSCSREG DATA=masil.airline;
ID airline year;
MODEL cost = output fuel load /RANONE;
RUN;

The TSCSREG Procedure

Dependent Variable: cost

Model Description

Estimation Method             RanOne
Number of Cross Sections           6
Time Series Length                15

Fit Statistics

SSE              0.3090    DFE                  86
MSE              0.0036    Root MSE         0.0599
R-Square         0.9923

Variance Component Estimates

Variance Component for Cross Sections    0.018198
Variance Component for Error             0.003613

Hausman Test for
Random Effects

DF    m Value    Pr > m

3       0.92    0.8209

Parameter Estimates

Standard
Variable        DF    Estimate       Error    t Value    Pr > |t|

Intercept        1       9.637      0.2132      45.21      <.0001
output           1    0.908024      0.0260      34.91      <.0001
fuel             1    0.422199      0.0141      29.95      <.0001
load             1    -1.06469      0.1995      -5.34      <.0001

The PANEL procedure has the /VCOMP=WK option for the Wansbeek and Kapteyn (1989) method, which is close to groupwise heteroscedastic regression. The BP option of the MODEL statement, not available in the TSCSREG procedure, conducts the Breusch-Pagan LM test for random effects. Note that the two procedures estimate the same variance component for error (.0036) but different variance components for groups (.0182 versus .0160).

PROC PANEL DATA=masil.airline;
ID airline year;
MODEL cost = output fuel load /RANONE BP VCOMP=WK;
RUN;

The PANEL Procedure
Wansbeek and Kapteyn Variance Components (RanOne)

Dependent Variable: cost

Model Description

Estimation Method             RanOne
Number of Cross Sections           6
Time Series Length                15

Fit Statistics

SSE              0.3111    DFE                  86
MSE              0.0036    Root MSE         0.0601
R-Square         0.9923

Variance Component Estimates

Variance Component for Cross Sections    0.016015
Variance Component for Error             0.003613

Hausman Test for
Random Effects

DF    m Value    Pr > m

2       1.63    0.4429

Breusch Pagan Test for Random
Effects (One Way)

DF    m Value    Pr > m

1     334.85    <.0001

Parameter Estimates

Standard
Variable        DF    Estimate       Error    t Value    Pr > |t|

Intercept        1    9.629513      0.2107      45.71      <.0001
output           1    0.906918      0.0257      35.30      <.0001
fuel             1    0.422676      0.0140      30.11      <.0001
load             1    -1.06452      0.2000      -5.32      <.0001

The Stata .xtreg command has the re option to produce FGLS estimates. The .iis command specifies the panel identification variable, such as a grouping or cross-section variable that is used in the i() option.

. iis airline

. xtreg cost output fuel load, re i(airline) theta

Random-effects GLS regression                   Number of obs      =        90
Group variable (i): airline                     Number of groups   =         6

R-sq:  within  = 0.9925                         Obs per group: min =        15
between = 0.9856                                        avg =      15.0
overall = 0.9876                                        max =        15

Random effects u_i ~ Gaussian                   Wald chi2(3)       =  11091.33
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
theta              = .87668503

------------------------------------------------------------------------------
cost |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
output |   .9066805    .025625    35.38   0.000     .8564565    .9569045
fuel |   .4227784   .0140248    30.15   0.000     .3952904    .4502665
load |  -1.064499   .2000703    -5.32   0.000    -1.456629    -.672368
_cons |   9.627909    .210164    45.81   0.000     9.215995    10.03982
-------------+----------------------------------------------------------------
sigma_u |  .12488859
sigma_e |  .06010514
rho |  .81193816   (fraction of variance due to u_i)
------------------------------------------------------------------------------

The theta option reports the estimated theta (.8767). The sigma_u and sigma_e are the square roots of the variance components for groups and errors (.0156 = .1249^2 and .0036 = .0601^2).
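The relationship between the reported standard deviations, the variance components, and rho can be checked directly. A small sketch using the sigma_u and sigma_e values printed in the output above:

```stata
* Sketch: recover the variance components and rho from sigma_u and sigma_e.
display "Var(u) = " .12488859^2
display "Var(e) = " .06010514^2
display "rho    = " .12488859^2/(.12488859^2 + .06010514^2)
```

The last line should agree with the rho of about .8119 in the xtreg output, the fraction of variance due to u_i.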

In LIMDEP, you have to specify the Panel\$ and Het\$ subcommands for the groupwise heteroscedastic model. Note that LIMDEP presents the pooled OLS regression and the least squares dummy variable model as well.

+-----------------------------------------------------------------------+
| OLS Without Group Dummy Variables                                     |
| Ordinary    least squares regression    Weighting variable = none     |
| Dep. var. = COST     Mean=   13.36560933    , S.D.=   1.131971444     |
| Model size: Observations =      90, Parameters =   4, Deg.Fr.=     86 |
| Residuals:  Sum of squares= 1.335449522    , Std.Dev.=         .12461 |
| Fit:        R-squared=  .988290, Adjusted R-squared =          .98788 |
| Model test: F[  3,     86] = 2419.33,    Prob value =          .00000 |
| Diagnostic: Log-L =     61.7699, Restricted(b=0) Log-L =    -138.3581 |
|             LogAmemiyaPrCrt.=   -4.122, Akaike Info. Crt.=     -1.284 |
| Panel Data Analysis of COST       [ONE way]                           |
|           Unconditional ANOVA (No regressors)                         |
| Source      Variation        Deg. Free.     Mean Square               |
| Between       74.6799                5.         14.9360               |
| Residual      39.3611               84.         .468584               |
| Total         114.041               89.         1.28136               |
+-----------------------------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
OUTPUT       .8827386341   .13254552E-01   66.599   .0000    -1.1743092
FUEL         .4539777119   .20304240E-01   22.359   .0000     12.770359
LOAD        -1.627507797       .34530293   -4.713   .0000     .56046016
Constant     9.516912231       .22924522   41.514   .0000
(Note: E+nn or E-nn means multiply by 10 to + or -nn power.)

+-----------------------------------------------------------------------+
| Least Squares with Group Dummy Variables                              |
| Ordinary    least squares regression    Weighting variable = none     |
| Dep. var. = COST     Mean=   13.36560933    , S.D.=   1.131971444     |
| Model size: Observations =      90, Parameters =   9, Deg.Fr.=     81 |
| Residuals:  Sum of squares= .2926207777    , Std.Dev.=         .06010 |
| Fit:        R-squared=  .997434, Adjusted R-squared =          .99718 |
| Model test: F[  8,     81] = 3935.82,    Prob value =          .00000 |
| Diagnostic: Log-L =    130.0865, Restricted(b=0) Log-L =    -138.3581 |
|             LogAmemiyaPrCrt.=   -5.528, Akaike Info. Crt.=     -2.691 |
| Estd. Autocorrelation of e(i,t)     .573531                           |
| White/Hetero. corrected covariance matrix used.                       |
+-----------------------------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |t-ratio |P[|T|>t] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
OUTPUT       .9192881432   .19105357E-01   48.117   .0000    -1.1743092
FUEL         .4174910457   .13532534E-01   30.851   .0000     12.770359
LOAD        -1.070395015       .21662097   -4.941   .0000     .56046016
(Note: E+nn or E-nn means multiply by 10 to + or -nn power.)

+------------------------------------------------------------------------+
|                Test Statistics for the Classical Model                 |
|                                                                        |
|        Model            Log-Likelihood    Sum of Squares    R-squared  |
| (1)  Constant term only     -138.35814   .1140409821D+03     .0000000  |
| (2)  Group effects only      -90.48804   .3936109461D+02     .6548513  |
| (3)  X - variables only       61.76991   .1335449522D+01     .9882897  |
| (4)  X and group effects     130.08647   .2926207777D+00     .9974341  |
|                                                                        |
|                                Hypothesis Tests                        |
|               Likelihood Ratio Test                F Tests             |
|          Chi-squared   d.f.  Prob.         F    num. denom. Prob value |
| (2) vs (1)    95.740      5     .00000    31.875    5    84     .00000 |
| (3) vs (1)   400.256      3     .00000  2419.329    3    86     .00000 |
| (4) vs (1)   536.889      8     .00000  3935.818    8    81     .00000 |
| (4) vs (2)   441.149      3     .00000  3604.832    3    81     .00000 |
| (4) vs (3)   136.633      5     .00000    57.733    5    81     .00000 |
+------------------------------------------------------------------------+
Error:   425: REGR;PANEL. Could not invert VC matrix for Hausman test.

+--------------------------------------------------+
| Random Effects Model: v(i,t) = e(i,t) + u(i)     |
| Estimates:  Var[e]              =   .361260D-02  |
|             Var[u]              =   .119159D-01  |
|             Corr[v(i,t),v(i,s)] =   .767356      |
| Lagrange Multiplier Test vs. Model (3) =  334.85 |
| ( 1 df, prob value =  .000000)                   |
| (High values of LM favor FEM/REM over CR model.) |
| Fixed vs. Random Effects (Hausman)     =     .00 |
| ( 3 df, prob value = 1.000000)                   |
| (High (low) values of H favor FEM (REM).)        |
| Reestimated using GLS coefficients:              |
| Estimates:  Var[e]              =   .362491D-02  |
|             Var[u]              =   .392309D-01  |
| Var[e] above is an average. Groupwise            |
| heteroscedasticity model was estimated.          |
|             Sum of Squares          .147779D+01  |
+--------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
OUTPUT       .9041238041   .24615477E-01   36.730   .0000    -1.1743092
FUEL         .4238986905   .13746498E-01   30.837   .0000     12.770359
LOAD        -1.064558659       .19933132   -5.341   .0000     .56046016
Constant     9.610634379       .20277404   47.396   .0000
(Note: E+nn or E-nn means multiply by 10 to + or -nn power.)

Like the SAS TSCSREG and PANEL procedures, LIMDEP estimates a slightly different variance component for groups (.0119), thus producing different parameter estimates. In addition, the Hausman test fails in this example because LIMDEP could not invert the variance-covariance matrix.

7.3 The One-way Random Time Effect Model

Let us compute the theta estimate using the SSEs of the between effect model (.0056) and the fixed effect model (1.0882).

The variance component for error is .01511375 = 1.08819022/(15*6-15-3).
The variance component for time is -.00201072 = .005590631/(15-4) - .01511375/6.

The theta estimate is $\hat{\theta} = 1 - \sqrt{.01511375 / (6 \times (-.00201072) + .01511375)} = -1.226263$.
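The same arithmetic can be reproduced for the time effect model. This is a sketch with illustrative scalar names; the roles of groups and periods are swapped relative to the group effect case, and the theta expression is again assumed to be the usual FGLS form because it agrees with the -1.226263 used in the commands that follow.

```stata
* Sketch: variance components and theta for the one-way time effect model.
scalar n      = 6              // airlines observed in each period
scalar T      = 15             // number of time periods
scalar k      = 3              // regressors, excluding the intercept
scalar sse_fe = 1.08819022     // SSE of the fixed time effect model
scalar sse_be = .005590631     // SSE of the between (time) effect model

scalar s2_e  = sse_fe/(T*n - T - k)         // error variance component
scalar s2_t  = sse_be/(T - k - 1) - s2_e/n  // time variance component
scalar theta = 1 - sqrt(s2_e/(n*s2_t + s2_e))

display "sigma2_e = " s2_e
display "sigma2_t = " s2_t
display "theta    = " theta
```

A negative sigma2_t pushes the ratio under the square root above one, which is why theta falls below zero here.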

. gen rt_cost = cost - (-1.226263)*tm_cost // transform variables
. gen rt_output = output - (-1.226263)*tm_output
. gen rt_fuel = fuel - (-1.226263)*tm_fuel
. gen rt_load = load - (-1.226263)*tm_load
. gen rt_int = 1 - (-1.226263) // for the intercept

. regress rt_cost rt_int rt_output rt_fuel rt_load, noc

Source |       SS       df       MS              Number of obs =      90
-------------+------------------------------           F(  4,    86) =       .
Model |  79944.1804     4  19986.0451           Prob > F      =  0.0000
Residual |  1.79271995    86  .020845581           R-squared     =  1.0000
Total |  79945.9732    90  888.288591           Root MSE      =  .14438

------------------------------------------------------------------------------
rt_cost |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
rt_int |   9.516098   .1489281    63.90   0.000     9.220038    9.812157
rt_output |   .8883838   .0143338    61.98   0.000     .8598891    .9168785
rt_fuel |   .4392731   .0129051    34.04   0.000     .4136186    .4649277
rt_load |  -1.279176   .2482869    -5.15   0.000    -1.772754   -.7855982
------------------------------------------------------------------------------

However, a variance cannot be negative, so the negative estimate of the variance component for time is implausible and the results above should be read with caution. This section presents examples of procedures and commands for the one-way random time effect model without outputs.

In SAS, use the TSCSREG or PANEL procedure with the /RANONE option.

PROC SORT DATA=masil.airline;
BY year airline;

PROC TSCSREG DATA=masil.airline;
ID year airline;
MODEL cost = output fuel load /RANONE;
RUN;

PROC PANEL DATA=masil.airline;
ID year airline;
MODEL cost = output fuel load /RANONE BP;
RUN;

In Stata, you have to switch the grouping and time variables using the .tsset command.

. tsset year airline

panel variable:  year, 1 to 15
time variable:  airline, 1 to 6

. xtreg cost output fuel load, re i(year) theta
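In more recent Stata releases, xtset is the documented command for declaring the panel structure. A minimal sketch of the same time effect estimation, assuming such a release is available:

```stata
* Sketch for newer Stata releases: declare year as the grouping variable.
xtset year
xtreg cost output fuel load, re theta

* Restore the usual declaration before running group effect models again.
xtset airline year
```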

In LIMDEP, you need to use the Period\$ and Random\$ subcommands.

7.4 The Two-way Random Effect Model in SAS

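The random group and time effect model is formulated as $y_{it} = \alpha + X_{it}'\beta + u_i + \gamma_t + \varepsilon_{it}$, where $u_i$ and $\gamma_t$ are the random group and time effects. Let us first estimate the two-way FGLS using the SAS PANEL procedure with the /RANTWO option. The BP2 option conducts the Breusch-Pagan LM test for the two-way random effect model.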

PROC PANEL DATA=masil.airline;
ID airline year;
MODEL cost = output fuel load /RANTWO BP2;
RUN;

The PANEL Procedure
Fuller and Battese Variance Components (RanTwo)

Dependent Variable: cost

Model Description

Estimation Method             RanTwo
Number of Cross Sections           6
Time Series Length                15

Fit Statistics

SSE              0.2322    DFE                  86
MSE              0.0027    Root MSE         0.0520
R-Square         0.9829

Variance Component Estimates

Variance Component for Cross Sections    0.017439
Variance Component for Time Series       0.001081
Variance Component for Error              0.00264

Hausman Test for
Random Effects

DF    m Value    Pr > m

3       6.93    0.0741

Breusch Pagan Test for Random
Effects (Two Way)

DF    m Value    Pr > m

2     336.40    <.0001

Parameter Estimates

Standard
Variable        DF    Estimate       Error    t Value    Pr > |t|

Intercept        1    9.362677      0.2440      38.38      <.0001
output           1    0.866448      0.0255      33.98      <.0001
fuel             1    0.436163      0.0172      25.41      <.0001
load             1    -0.98053      0.2235      -4.39      <.0001

Similarly, you may run the TSCSREG procedure with the /RANTWO option.

PROC TSCSREG DATA=masil.airline;
ID airline year;
MODEL cost = output fuel load /RANTWO;
RUN;

7.5 Testing Random Effect Models

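The Breusch-Pagan Lagrange multiplier (LM) test is designed to test random effects. The null hypothesis of the one-way random group effect model is that the variance component for groups is zero. If the null hypothesis is not rejected, the pooled regression model is appropriate. The $e'e$ of the pooled OLS is 1.33544153 and $\bar{e}'\bar{e}$, the sum of squared group means of the residuals, is .0665147.

LM is $\frac{nT}{2(T-1)}\left[\frac{T^2\bar{e}'\bar{e}}{e'e} - 1\right]^2 = \frac{6 \times 15}{2(15-1)}\left[\frac{15^2 \times .0665147}{1.33544153} - 1\right]^2 = 334.8496$ with p < .0001.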
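The LM statistic can also be computed by hand from the pooled OLS residuals. The following do-file fragment is a sketch under stated assumptions: the variable and scalar names (ehat, ehat_bar, tag1, and so on) are illustrative, and the formula is the one-way form nT/(2(T-1)) * [T^2 * (sum of squared group-mean residuals) / e'e - 1]^2, which agrees with the 334.8496 reported in this section.

```stata
* Sketch: Breusch-Pagan LM statistic for the one-way group effect model.
quietly regress cost output fuel load           // pooled OLS
predict double ehat, residuals
egen double ehat_bar = mean(ehat), by(airline)  // group means of residuals
egen byte tag1 = tag(airline)                   // flags one row per airline

generate double ehat2 = ehat^2
quietly summarize ehat2
scalar ee = r(sum)                              // e'e

generate double ehat_bar2 = ehat_bar^2 if tag1
quietly summarize ehat_bar2
scalar ebar = r(sum)                            // sum of squared group means

scalar n = 6
scalar T = 15
scalar LM = (n*T)/(2*(T - 1)) * ((T^2*ebar)/ee - 1)^2
display "LM = " LM "   p = " chi2tail(1, LM)
```

The p-value uses the chi-squared distribution with one degree of freedom, matching the chi2(1) reported by .xttest0.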

With the large chi-squared, we reject the null hypothesis in favor of the random group effect model. The SAS PANEL procedure with the /BP option and the LIMDEP Panel\$ and Het\$ subcommands report the LM statistic. In Stata, run the .xttest0 command right after estimating the one-way random effect model.

. quietly xtreg cost output fuel load, re i(airline)

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects:

cost[airline,t] = Xb + u[airline] + e[airline,t]

Estimated results:
|       Var     sd = sqrt(Var)
---------+-----------------------------
cost |   1.281358       1.131971
e |   .0036126       .0601051
u |   .0155972       .1248886

Test:   Var(u) = 0
chi2(1) =   334.85
Prob > chi2 =     0.0000

The null hypothesis of the one-way random time effect model is that the variance component for time is zero. The following LM test uses Baltagi's formula. The small chi-squared of 1.5472 does not reject the null hypothesis at any conventional significance level.

LM is 1.5472 with p = .2135.

. quietly xtreg cost output fuel load, re i(year)

. xttest0

Breusch and Pagan Lagrangian multiplier test for random effects:

cost[year,t] = Xb + u[year] + e[year,t]

Estimated results:
|       Var     sd = sqrt(Var)
---------+-----------------------------
cost |   1.281358       1.131971
e |   .0151138        .122938
u |          0              0

Test:   Var(u) = 0
chi2(1) =     1.55
Prob > chi2 =     0.2135

The two way random effects model has the null hypothesis that variance components for groups and time are all zero. The LM statistic with two degrees of freedom is 336.3968 = 334.8496 + 1.5472 (p<.0001).
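The two-way statistic and its p-value can be checked in one step. A small sketch using the two one-way LM values quoted in the text (LM2 is an illustrative scalar name):

```stata
* Sketch: two-way LM statistic as the sum of the one-way statistics.
scalar LM2 = 334.8496 + 1.5472
display "LM (two-way) = " LM2 "   p = " chi2tail(2, LM2)
```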

7.6 Fixed Effects versus Random Effects

How do we compare a fixed effect model and its counterpart random effect model? The Hausman specification test examines if the individual effects are uncorrelated with the other regressors in the model. Since computation is complicated, let us conduct the test in Stata.

. tsset airline year

panel variable:  airline, 1 to 6
time variable:  year, 1 to 15

. quietly xtreg cost output fuel load, fe

. estimates store fixed_group

. quietly xtreg cost output fuel load, re

. hausman fixed_group .

---- Coefficients ----
|      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
|   fix_group        .          Difference          S.E.
-------------+----------------------------------------------------------------
output |    .9192846     .9066805        .0126041        .0153877
fuel |    .4174918     .4227784       -.0052867        .0058583
load |   -1.070396    -1.064499       -.0058974        .0255088
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg

Test:  Ho:  difference in coefficients not systematic

chi2(3) = (b-B)'[(V_b-V_B)^(-1)](b-B)
=        2.12
Prob>chi2 =      0.5469
(V_b-V_B is not positive definite)

The Hausman statistic 2.12 is different from the PANEL procedure's 1.63 and Greene (2003)'s 4.16. This is because SAS, Stata, and LIMDEP use different estimation methods and so produce slightly different parameter estimates. None of these tests, however, rejects the null hypothesis, so the random effect model is favored.
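The Stata output above also warns that V_b-V_B is not positive definite. One commonly used remedy, offered here only as a sketch and not as the author's procedure, is the sigmamore option of .hausman, which bases both covariance matrices on the disturbance variance estimate from the efficient (random effect) estimator. The stored name random_group is illustrative.

```stata
* Sketch: Hausman test with a common disturbance variance estimate.
quietly xtreg cost output fuel load, fe
estimates store fixed_group
quietly xtreg cost output fuel load, re
estimates store random_group
hausman fixed_group random_group, sigmamore
```

The resulting statistic will generally differ from the 2.12 shown above, so it should be reported together with the option used.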

7.7 Summary

Table 7 summarizes random effect estimations in SAS, Stata, and LIMDEP. The SAS PANEL procedure is highly recommended.

Table 7 Comparison of the Random Effect Model in SAS, Stata, LIMDEP*