J. Scott Long - Indiana University
Department of Sociology :: Department of Statistics :: Interuniversity Consortium for Political and Social Research
Bureau of Social Science Research :: Schuessler Institute for Social Research :: The Kinsey Institute
Stat503 Categorical Data Analysis

Fall 2009/2010

Stat503 (also taught as Soc650) is a second course in applied statistics. The first course (e.g., Soc554) deals with regression models in which the dependent variable is continuous. These models include the linear regression model, seemingly unrelated regressions, and systems of simultaneous equations. Stat503/Soc650 deals with regression models in which the dependent variable is limited or categorical. Such models include probit, logit, ordered logit, and Poisson regression, among others. The prerequisite for this class is a prior course in regression. To see the syllabus, click here.

FTP downloads · Due dates · Codebooks · Help · Books · Computing · Enrolling & time conflicts

News

  • CLASSPAK - If the bookstore doesn't have copies of the ClassPak, go to the textbook register at the IU Bookstore and purchase a voucher. They will have a copy by 3 PM the next day. If the staff tell you that they can't do this, ask to talk to Keith Waits or e-mail Kathy Parker (cparker@indiana.edu). Contact the TA if you have problems.

Due dates - assignments are due at the end of lab.

  • Assignment 1: Math review. Due day 3, September 8, 2009.
  • Assignment 2: Picking your variables. Due day 4, September 10, 2009.
  • Assignment 3: Data cleaning. Due day 6, September 17, 2009.
  • Assignment 4: LRM. Due day 9, September 29, 2009.
  • Assignment 5: BRM. Progress report due day 12, October 12, 2009. Assignment due day 14, October 15, 2009.
  • Assignment 6: Testing and Fit. Due day 17, October 27, 2009.
  • Assignment 7: OLM. Due day 22, November 12, 2009.
  • Assignment 8: MNLM. Due day 27, December 3, 2009.
  • Assignment 9: Count models. Due Tuesday of finals week, December 15, 2009.

Codebooks

You can use the following datasets for assignments:

  • gss9098extract: General Social Survey 1990-1998 (codebook).
  • hsb: High School and Beyond Study 1983 (codebook).
  • nes: 1992 National Election Study (codebook).
  • science3 (science2): data on the careers of biochemists (codebook).
  • wls: Wisconsin Longitudinal Study data on Wisconsin high school graduates (codebook).
  • addhealth3: Add Health data (codebook).

Getting help

If you need help debugging a program, the best thing to do is to place the relevant files in your directory on the LAN in a subdirectory called \Help (e.g., \jslong\help). Include the do-file, dataset, and log file (in text format, not SMCL). Please follow the guidelines listed below. If you don't follow these guidelines, I won't be able to help. For further details on getting help, check here.

1) The do-file must be self-contained. It must load the data, create any needed variables, generate the problem, and save a log file in text format. The do-file must have comments explaining what you are doing and what the problem is.

2) If an SPost command is causing a problem, include the line which command-name in the do-file, where command-name is the command causing the problem. This tells me which version of the command you are using.

3) Do not refer to specific directories. Assume that your data is located in your working directory. If you specify a particular directory, the do-file won't run on my computer.

Here is an example of what the do-file might look like:

capture log close
log using jslong_assgn1_problem, text replace
// Scott Long - 2008-08-31
// Assignment 4: binary regression

// ERROR: prchange produces variable not found error.

// load data and check data
spex science2, clear
tab y
sum x1 x2

// estimate logit
logit y x1 x2, nolog

// compute discrete change
// ERROR: variable xl not found
which prchange
prchange, x(xl=1 x2=3)

log close
exit
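
Guideline 3 can be sketched as follows for a dataset of your own rather than one loaded with spex. The names mydata and mydata_problem and the directory in the commented-out line are hypothetical; the sketch assumes mydata.dta sits in the working directory.

```stata
// Sketch only: mydata and mydata_problem are hypothetical names.
capture log close
log using mydata_problem, text replace

// show the working directory so the log records where Stata looked for files
pwd

// load the data from the working directory (no path), as guideline 3 asks
use mydata, clear

// avoid a hard-coded directory such as the hypothetical one below,
// which would exist only on your own computer:
// use "c:\mywork\stat503\mydata.dta", clear

log close
exit
```

Because no directory is named, the same do-file runs unchanged on any computer where the dataset has been copied into the working directory.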

Books

  1. ClassPak - be sure to bring this the first day of class. It includes lecture notes and reprints. Required.
  2. Long, J. Scott and Freese, Jeremy. 2005. Regression Models for Categorical Dependent Variables Using Stata, 2nd Edition. Stata Press: College Station, TX. Required. If you have the “revised” edition, you do not need to buy the 2nd edition.
  3. Recommended: Long, J. Scott. 2008. The Workflow of Data Analysis Using Stata. Stata Press: College Station, TX.
  4. Recommended: Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks, CA: Sage.

Files to download

Most materials other than the course notes (available at TIS or the Campus Bookstore) can be downloaded here. Files will be added throughout the semester.

  • ftp site for S650/Stat503

Computing

If you want to install the ado files needed for this class, follow this link. You will also find sample programs and data sets at that location. While you may freely use my ado files, they require Stata to run. Stata is installed in campus computing labs. Personal copies can be purchased from the IU Stat/Math Center.
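
A minimal sketch of checking the installation from within Stata. It assumes that the class ado files are the SPost commands and that the package is named spost9_ado, neither of which is stated on this page; treat the link above as the authoritative source.

```stata
// Sketch only: the package name spost9_ado is an assumption, not taken from this page.

// search for the package; follow the result in the Viewer to install it
findit spost9_ado

// confirm that an SPost command is installed and report its version
which prchange
```

If which reports that the command is not found, the ado files are not installed on that computer.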

Enrolling in Soc650/Stat503 and Time Conflicts

Enrollment: Unfortunately, there are more students who want to take S650 than there are seats in the class. First priority is given to graduate students in sociology since this is a required course for them. Otherwise, authorizations for the class are given on a first-come, first-served basis. If you are interested in taking the class, contact the graduate secretary in sociology (socgrad@indiana.edu) to get on the list. The graduate secretary will contact you regarding authorization for the class. If you are given an authorization, you need to sign up for the class during the normal enrollment period; if you do not, your authorization will be given to the next student on the wait list.

Time conflicts: If you have another class that overlaps with the lecture time for Stat503/Soc650, you will need to take one of the classes in another semester. If you have a time conflict with all of the lab times, you should take the class some other semester. If you can attend some of the labs each week and you are already familiar with Stata (or can learn it on your own), you will probably do fine but might have to work harder than students who can attend lab. While most of the lab time is used for students doing independent work, the teaching assistant will give some short lectures related to the assignments. For example, he/she might provide additional information about keeping a research log or how to format tables using Word.

Getting ready for Soc650/Stat503

There are several things you can do to get ready for the class.

  1. Review a book on the linear regression model.
  2. If you are rusty on mathematics, you can review the materials in this file.
  3. Feel free to start reading the main text, which is listed in the syllabus.
© 2007 J. Scott Long