R
This section discusses how to center variables and estimate multilevel
models using R. Install and load the package lme4,
which fits linear and generalized linear mixed-effects models.
From the drop-down menu, select Install Packages and then Load Package
under Packages, or, alternatively, use the following syntax:
> install.packages(lme4)
> library(lme4)
Next, load the High School and Beyond dataset in R and attach the dataset to the R search path:
> HSBdata <- read.table("C:/user/temp/hsbALL.txt", header=T, sep=",")
> attach(HSBdata)
To center a level-1 variable around the mean of all cases
within the same level-2 group, use the ave() function, which estimates group averages. The new
variable meanses is the average of ses for each high school
group id. The new variable centses is centered around the mean of all
cases within each level-2 group by subtracting meanses from ses.
The text HSBdata$, which is attached to the new variable name, indicates that the
variables will be created within the existing dataset HSBdata. Remember
to attach the dataset to the R search path after making any changes to the variables.
> HSBdata$meanses <- ave(ses, list(id))
> HSBdata$centses <- ses - meanses}
> attach(HSBdata)
Within the lme4 package, the lme() function estimates linear
mixed effects models. To use lme(), specify the dependent variable,
the fixed components after the tilde sign and the random components in
parentheses. Indicate which dataset R should use. To fit the empty
model described above (5), use the following sintax:
> results1 <- lmer(mathach ~ 1 + (1 | id), data = HSBdata)
> summary(results1)
R saves the results of the model in an object called results1, which is
stored in memory and may be retrieved with the function summary().
The function lmer() estimates a model, in which mathach is the
dependent variable. The intercept, denoted by 1 immediately following
the tilde sign, is the intercept for the fixed effects. Within the parentheses, 1
denotes the random effects intercept, and the variable id is specified
as the level-2 grouping variable. R uses the HSBdata for this analysis.
The results are displayed in Table 4. The average test
score across schools, reflected in the fixed effects intercept term, is 12.6370.
The variance component corresponding to the random intercept is 8.614. The two
variance components can be used to partition the variance
across levels. The intraclass correlation coefficient is equal to
8.614/(39.148+8.614)=18.04, meaning that roughly 18%
of the variance is attributable to the school level.
To explain some of the school-level variance in student math achievement
scores, incorporate school-level predictors in the empty the model.
The socioeconomic status of the typical student and the school's status as
public or private may influence test performance. The following R syntax indicates
how to incorporate these two variables as fixed effects:
> results2 <- lmer(mathach ~ 1 + sesmeans + sector + (1 | id), data = HSBdata)
> summary(results2)
The intercept, which now corresponds to the expected math achievement
score in a public school with average SES scores, is 12.1282. Moving
to a private school bumps the expected score by 1.2254 points. In
addition, a one-unit increase in the average SES score is associated
with an expected increased in math achievement of 5.3328. These estimates
are all significant.
The variance component corresponding to the random intercept has decreased
to 2.3140, indicating that the inclusion of the level-2
variables has accounted for some of the unexplained variance in the math achievement.
Nonetheless, the estimate is still more than twice the size
of its standard error, suggesting that there remains unexplained variance.
The final model introduces the student socioeconomic status (SES) variable and cross-level interaction terms.
The centered SES slope is treated as random because individual SES status may vary across schools.
In addition, a school's average SES score and its sector (public
or private) may interact with student-level SES, thus accounting for some of the variance in the math achievement
slope. To include cross-level interaction terms in your model, place an asterisk between the two variables composing the interaction.
> results3 <- lmer(mathach ~ sesmeans + sector + centses + sesmeans*centses + sector*centses + (1 + centses|id), data = HSBdata)
> summary(results3)
The results are displayed in the final column of Table 4.
Because there are interactions
in the model, the marginal fixed effects of each variable now depend
on the value of the other variable(s) involved in the interaction.
The marginal effect of a one-unit change in student's SES on math
achievement will depend on whether a school is public or private as
well as on the average SES score for the school.
When cross-level interactions are present, graphical means may be
appropriate for exploring the contingent nature of marginal effects
in greater detail. Here the simplest interpretation of the interaction
coefficients is that the effect of student-level SES is significantly
higher in wealthier schools and significantly lower in private schools.
The variance component for the random intercept is 2.37956, which
is still large relative to its standard deviation of 1.54258. Thus some school-level
variance remains unexmplained in the final model. The
variance component corresponding to the slope, however, is quite small
relative to its standard deviation. This suggests that the researcher
may be justified in constraining the effect to be fixed.
R displays the deviance and AIC and BIC. Comparing both the AIC and BIC
statistics in Table 4 it is clear that the final model is preferable to the
first two models.



