This class deals with techniques referred to broadly as multivariate methods. We focus on how these methods can be used to transform a set of related variables into a smaller number of more fundamental measures. This is sometimes referred to as "scaling". Examples include using multiple test scores to create a measure of ability, using variables for exposure to cultural events to create a scale of cultural capital, and using questions about interactions with people who have a mental illness to create a measure of social distance. Creating scales is often a critical first step in data analysis. Too often a simple summated scale, presented along with Cronbach's alpha, is all that is done, possibly obscuring as much as it reveals. After reviewing methods such as multidimensional scaling, principal components, and cluster analysis, we focus on latent structure analysis (LSA). LSA includes exploratory factor analysis, confirmatory factor analysis, latent class analysis, item response models, and structural equation modeling. Assignments will involve exercises applying these models to real data. Feel free to email me at jslong at indiana dot edu with questions.

**Prerequisites**: Students need a prior course on the linear regression model and a course on models for categorical outcomes.

**This web page serves as the syllabus for the class**

- Policies and logistics
- Workflow
- Computing and data
- Getting help
- Books
- Resources
- FTP downloads
- Getting ready

### News

### Policies and logistics

**Enrollment**: Sometimes there are more students who want to take S651 than there are seats. First priority is given to students for whom the course is required. Otherwise, authorizations are given on a first-come, first-served basis. If you are interested in taking the class, contact Scott Long (jslong at indiana dot edu). If you are given an authorization, sign up for the class during the normal enrollment period; if you do not, your authorization will be given to the next student on the wait list.

**Logistics**: Lectures are Tu and Th 9:30-10:45AM (Woodburn 109). **Please arrive on time.** Lab is Th 5:00-7:00PM. Lab is not required, but attending ensures that you have access to the software used in the class. My office is Ballantine 842B, which is directly across from the elevator. Enter 842 (no need to knock) and my office is at the end of the hall. If I am talking with someone, let me know you are waiting. Office hours are **TO BE DETERMINED**. Other times are available by appointment. If you contact me during the week and don't hear back within 12 hours, try again. (2014-01-12)

**Time conflicts**: If you have another class that overlaps with the lecture, you will need to take the class another semester.

**Lecture notes**: PDFs of lectures will be made available on the class LAN.

**Turning in assignments**: Assignments are due at the start of class on the due date. Pedagogically it is critical to complete assignments on time. *Late assignments are penalized 25%*. If there are special circumstances, let me know and we'll figure out something.

**LAN access**: Students on the registrar's enrollment list will be given R/W permission to the course LAN. If you enrolled late, contact sochelp at indiana dot edu (cc jslong at indiana dot edu) and request R/W access to the LAN for Soc 651. Copies of all your work should be saved on the LAN, but be sure you have it backed up elsewhere. Documentation on accessing the LAN is here.

**IU students can run Stata for free over the internet**. For details go here. Opinions vary on how well this works.

**Grading errors**: If a mistake is made in grading, I apologize. Unfortunately, it sometimes happens. Return the assignment to me along with a cover page explaining the error. If I do not return the assignment documenting the change within two class periods, please remind me by e-mail.

### Workflow for reproducible results

An essential part of being an effective researcher is having a workflow to organize your efforts and allow others to reproduce your results. Since this is a class in applied data analysis, a portion of your grade is based on the workflow you use in completing your assignments. More general information and a detailed treatment of workflow are available on Long’s workflow page and in his book *The Workflow of Data Analysis Using Stata*. For this class you are not required to implement the full workflow from the book, but you must follow the guidelines provided in class.

### Computing and data

**Downloading files**: Files for the class will be on the LAN.

**Getting Started with Stata**: Information on getting started with Stata is here. Also, google "Stata Youtube" for helpful information.

**Data sets**: You will need to find data to use for your class project. ICPSR (google ICPSR) has many datasets that would work. If you are using data that you have not collected yourself, make sure that you have explicit permission to use the data for the class.

### Getting help

If you need help debugging a program, the best thing to do is to place the relevant files on the LAN in the subdirectory called \_Students\_Help\<username> (e.g., \LAN\_Students\_Help\jslong Scott Long). Include the do-file, the dataset, and the log file in text format, not smcl. Please follow the guidelines below; otherwise it is much less likely that you will get a quick and helpful answer. For further details on getting help, check here.

- The do-file must be self-contained. It must load the data, create needed variables (if any), reproduce the problem, and save a log file in text format. The do-file must have comments explaining what you are doing and what the problem is.
- If a command is causing a problem, include the command which *command-name* for that command. This tells me which version of the command you are using.
- Do **not** refer to specific directories (e.g., do not: use d:\mydata\science3.dta). Assume that your data is located in your Stata working directory.
- Here is an example of what the do-file might look like:

capture log close

log using jslong_assgn1_problem, text replace

// Scott Long - 2011-08-31

// Assignment 4: binary regression

// ERROR: see #3 below.

// #1: load data and check data

spex science2, clear

tab y

sum x1 x2

// #2: estimate logit

logit y xl x2, nolog

// #3: compute discrete change

// ERROR: variable xl not found

which prchange

prchange, x(x1=1 x2=3)

log close

exit

### Books and lecture notes

**Lecture notes**: PDFs will be put on the LAN. Bring paper or electronic copies to class. *Required.*

**Required**: Analysis of Multivariate Social Science Data, Second Edition by Bartholomew, Steele, Moustaki, and Galbraith. If you have the first edition, that should be fine (as long as you don't mind searching a bit when I refer to a specific page number). The second edition adds multiple regression, a brief discussion of the structural equation model (SEM), and an introduction to multilevel models. (amazon.com)

**Recommended**: Latent Class and Latent Transition Analysis by Collins and Lanza. We will be using their software for LCA as well. (amazon.com)

**Recommended**: The Workflow of Data Analysis Using Stata. (WF at www.stata.com $52 plus shipping; amazon.com for $61). If several people order from Stata together, shipping is much more reasonable. *Highly recommended for your work in graduate school.*

**Optional**: Confirmatory Factor Analysis: A Preface to LISREL by Scott Long is an old but remarkable book on the confirmatory factor model. It has a great deal of detail on identification. (CFA at amazon.com $17 or $10 on Kindle) *Optional*.

**Optional**: Structural Equations with Latent Variables by Ken Bollen is the most comprehensive book available on the structural equation model treated in Chapter 11 of AMSSD. (SELV at amazon.com $125) *Optional*.

### Resources

The AMSSD site has datasets and sample chapters that you can download.

The Collins and Lanza book site has data, software, and publications.

Monitoring the Future data from the Collins and Lanza book.

The Mplus site has a lot of extremely valuable information. The examples there, along with the discussion forum, are often more helpful than the official manual.

The Workflow site has supplementary information for the Workflow book.

### Getting ready for Soc651

There are several things you can do to get ready for the class.

- Review a book on the linear regression model.
- Learn the Greek alphabet, upper and lower case.
- Review basic matrix algebra.
- Skim the entire AMSSD to get an overview of the models to be considered.