Introduction

Multilevel data are pervasive in the social sciences. Students may be nested within schools, voters within districts, or workers within firms, to name a few examples. Statistical methods that explicitly account for hierarchically structured data have gained popularity in recent years, and there now exist several special-purpose programs for estimating multilevel models (e.g., HLM, MLwiN). In addition, the increasing use of multilevel models (also known as hierarchical linear models and mixed models) has led general-purpose packages such as SPSS, Stata, and SAS to introduce their own procedures for handling nested data.

Nonetheless, researchers may face two challenges when attempting to determine the appropriate syntax for estimating multilevel/mixed models using general-purpose software. First, many users from the social sciences come to multilevel modeling with a background in regression models, whereas much of the software documentation uses examples from experimental disciplines, because multilevel modeling methodology evolved out of ANOVA methods for analyzing experiments with random effects (Searle, Casella, and McCulloch 1992). Second, notation for multilevel models is often inconsistent across disciplines (Ferron 1997).

The purpose of this document is to demonstrate how to estimate multilevel models using SPSS, Stata, and SAS. It first clarifies the vocabulary of multilevel models by defining what is meant by fixed effects, random effects, and variance components. It then compares the model-building notation frequently employed in social science applications with the more general matrix notation found in much of the software documentation. Finally, it presents the syntax for centering variables and estimating multilevel models in each package.

