3. Confirmatory Factor Analysis with Missing Data
Missing data are a pervasive problem in the social sciences. A subject may fail to complete a test in an experimental setting, refuse to answer a particular survey item, or drop out of a panel. In many cases, including the example in the previous section, researchers drop every subject who has a missing observation on any of the items included in the model. This approach to handling missing data is referred to as listwise deletion and is the default in programs such as SPSS and Stata. Unfortunately, dropping incomplete cases discards information from the sample and can lead to biased estimates when the data are not missing completely at random.
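To make the cost of listwise deletion concrete, the following minimal sketch applies it to a small invented data set. The subjects, item names, and values are hypothetical and are not taken from the political values data; missing answers are marked with None.

```python
# Hypothetical survey responses: one row per subject, None marks a missing item.
responses = [
    {"id": 1, "item1": 4, "item2": 5, "item3": 3},
    {"id": 2, "item1": 2, "item2": None, "item3": 4},
    {"id": 3, "item1": 5, "item2": 4, "item3": None},
    {"id": 4, "item1": 3, "item2": 3, "item3": 2},
    {"id": 5, "item1": None, "item2": 2, "item3": 1},
]
ITEMS = ["item1", "item2", "item3"]


def listwise_delete(rows, items):
    """Keep only the rows that have an observed value on every item."""
    return [row for row in rows if all(row[item] is not None for item in items)]


complete = listwise_delete(responses, ITEMS)

# Count the observed item values available before and after deletion.
observed_before = sum(row[item] is not None for row in responses for item in ITEMS)
observed_after = len(complete) * len(ITEMS)

print(f"subjects kept: {len(complete)} of {len(responses)}")
print(f"observed values used: {observed_after} of {observed_before}")
```

In this invented example each incomplete subject is missing only one answer, yet listwise deletion discards that subject's other two answers as well, so half of the observed values never enter the analysis.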
Over the last 30 years, more sophisticated means have emerged for dealing with missing data, many of which have been incorporated into structural equation modeling software. This document considers Full Information Maximum Likelihood (FIML) because it is available in Amos, LISREL, and Mplus. FIML is an estimator that makes maximal use of all the data available from every subject in the sample, including subjects with incomplete responses. Other approaches to dealing with missing data, such as multiple imputation, may also be available depending on the specific program. A non-technical overview of different methods for handling missing data in the context of structural equation models is available in Enders (2001), though its description of the capabilities of specific computer packages is already dated.
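The idea behind FIML can be sketched in a few lines: each subject contributes a log-likelihood based only on the variables that subject actually answered, and the contributions are summed over the sample. The sketch below assumes just two normally distributed variables and takes the mean vector and covariance matrix as given; in a structural equation model these would be implied by the model parameters, and the software would search for the parameter values that maximize the total. The function names and data are hypothetical and do not reproduce the algorithm of any particular program.

```python
import math

LOG_2PI = math.log(2 * math.pi)


def case_loglik(x, mu, sigma):
    """Normal log-likelihood of one case, using only its observed variables.

    x     -- two values, with None marking a missing observation
    mu    -- two means
    sigma -- 2x2 covariance matrix as nested lists
    """
    observed = [j for j, value in enumerate(x) if value is not None]
    if not observed:
        return 0.0  # a case with nothing observed carries no information
    if len(observed) == 1:
        # Only one variable observed: use its univariate normal density.
        j = observed[0]
        variance = sigma[j][j]
        dev = x[j] - mu[j]
        return -0.5 * (LOG_2PI + math.log(variance) + dev * dev / variance)
    # Both variables observed: use the full bivariate normal density.
    d0, d1 = x[0] - mu[0], x[1] - mu[1]
    s00, s01, s11 = sigma[0][0], sigma[0][1], sigma[1][1]
    det = s00 * s11 - s01 * s01
    quad = (s11 * d0 * d0 - 2 * s01 * d0 * d1 + s00 * d1 * d1) / det
    return -0.5 * (2 * LOG_2PI + math.log(det) + quad)


def fiml_loglik(data, mu, sigma):
    """Sum the casewise log-likelihoods over all subjects."""
    return sum(case_loglik(x, mu, sigma) for x in data)


# Hypothetical data: the second and third subjects each miss one variable.
data = [[1.0, 2.0], [0.5, None], [None, 1.5]]
mu = [0.0, 1.0]
sigma = [[1.0, 0.3], [0.3, 1.0]]

print(fiml_loglik(data, mu, sigma))
```

No subject is dropped and no value is filled in: the two incomplete subjects still inform the estimates for the variable they answered.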
This section shows how to estimate the two-factor model for political values introduced in the previous section when the raw data matrix includes missing observations. The data to be analyzed have been saved as an SPSS file named values_full.sav in the C:\temp\CFA folder. All missing observations have been coded as system-missing (.) in SPSS.



