P533 Intro. to Bayesian Data Analysis I, Prof. Kruschke

P533
Introduction to Bayesian Data Analysis I.

Prof. John K. Kruschke

Fall 2008: M, W, 10:10-11:35am, room Psych 111.
(Registrar class number 17632)

Under construction, May 2008.

This course is a tutorial introduction to doing Bayesian statistics for data analysis. We will start from the basics of probabilities and Bayes' theorem, and gradually work our way through contemporary Monte Carlo methods in the context of simple analyses, building up to simple examples of hierarchical models (see list of topics below). The course is intended to make advanced Bayesian methods accessible to real graduate students (see prerequisites below). The course is "hands on": We will build computer-based analyses so that you can actually get in the kitchen and make a meal, rather than just consume fast food at the drive-through. This way you can adapt the methods to your own research scenarios.
      This course is the first semester of a two-semester sequence. This first semester emphasizes the simplest data situation: two-valued measurements such as yes/no, agree/disagree, remember/forget, detect/miss, male/female, heads/tails, and so on. The main goal is to use this simple situation to develop the methods of contemporary Bayesian analysis. The second semester, P534, applies the methods to data situations involving continuous measures (e.g., response time). The second semester emphasizes inferences regarding means, trends, and so forth.

Schedule of Topics:

  1. Models, parameters, beliefs. Intro to R.
  2. Probability: Inside and outside the head. Mass and density. Conditional probabilities.
  3. Bayes' theorem. Three goals of statistical inference.
  4. Inferring a binomial proportion via exact mathematical analysis.
  5. Inferring a binomial proportion via grid approximation.
  6. Inferring a binomial proportion via Monte Carlo approximation. The Metropolis algorithm.
  7. Inferences regarding two binomial proportions. Goals of inference. Three approaches: Analytical, grid approximation, Monte Carlo approximation. Metropolis algorithm and Gibbs sampling.
  8. Binomial likelihood with hierarchical priors. Intro to "BUGS" software.
  9. Comparison of Bayesian inference with null hypothesis significance testing.
The second semester of the course, P534, covers analysis of means, linear models, etc.
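
As a rough taste of topics 4 and 5 above, here is a minimal sketch that infers a binomial proportion by grid approximation and compares the result with the exact analytical answer. It is an illustration only, not course material: the course uses R, while this sketch is written in Python; the data (7 heads in 10 flips), the uniform prior, and the function names `grid_posterior` and `posterior_mean` are all assumptions made up for the example.

```python
def grid_posterior(heads, flips, n_grid=1001):
    """Grid approximation to the posterior over a binomial proportion theta.

    Assumes a uniform prior over theta in [0, 1] (an assumption of this sketch).
    Returns the grid of theta values and the posterior probability mass at each.
    """
    thetas = [i / (n_grid - 1) for i in range(n_grid)]
    prior = [1.0] * n_grid  # uniform prior, unnormalized
    # Bernoulli likelihood of the observed sequence at each candidate theta
    likelihood = [t ** heads * (1 - t) ** (flips - heads) for t in thetas]
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]  # Bayes' theorem on the grid
    return thetas, posterior


def posterior_mean(thetas, posterior):
    """Mean of theta under the discrete grid posterior."""
    return sum(t * p for t, p in zip(thetas, posterior))


# Hypothetical data: 7 heads in 10 flips
heads, flips = 7, 10
thetas, post = grid_posterior(heads, flips)
grid_mean = posterior_mean(thetas, post)

# Exact analysis: a uniform prior is Beta(1, 1), so the posterior is
# Beta(heads + 1, flips - heads + 1), whose mean is (heads + 1) / (flips + 2).
exact_mean = (heads + 1) / (flips + 2)

print("grid approximation of posterior mean:", grid_mean)
print("exact posterior mean:", exact_mean)
```

With a fine enough grid the two means should agree closely. The grid approach is easy to program but scales poorly as the number of parameters grows, which is one motivation for the Monte Carlo methods in topics 6 and 7.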

Intended audience: The course is aimed at graduate students and advanced undergraduates.

Prerequisites:

  • This is not a mathematical statistics course, but a fair amount of mathematics is unavoidable. If you know what someone means when she says, "The integral of x squared is one-third x cubed," then you should be okay. You will not have to generate a lot of mathematical derivations, but you will have to understand some, and most will involve basic calculus.
  • We will be doing a lot of computer programming in a language called R. R is free and can be installed on any computer. The road to understanding will be much smoother if you have already had some programming experience, in any language. It's easy to learn basic programming, but it can be time consuming, so if you don't have any previous experience, just anticipate spending more time. Learning to program can have huge payoffs in multiple situations later in your career, so it's worth the effort.
  • A previous course in traditional statistics (such as K300) or probability can be helpful as background. Although the course will be developed independently of traditional ("null hypothesis significance testing") statistical methods, you might find the concepts of probability easier to understand if you have already had some exposure to them.

    Materials:

  • The primary materials will be information delivered at lecture and in extensive course notes being written by Prof. Kruschke. Readings will be posted on Oncourse under the "Resources" link.
  • Please also discuss the assignments and lectures on Oncourse under the "Message Center" link. If you are attending the class but cannot get access to the Oncourse page, please e-mail Prof. Kruschke.

    The following textbooks are recommended (not required) as additional sources:

  • Albert, J. H., & Rossman, A. J. (2001). Workshop Statistics: Discovery with Data, a Bayesian Approach. Emeryville, CA: Key College Publishing. The last few chapters of this book give a wonderfully "hands on" introduction to the basics of Bayesian statistics --- but only the basics.
  • Bolstad, W. M. (2004). Introduction to Bayesian Statistics. Hoboken, NJ: Wiley. Terrific tutorial that uses only basic calculus; highly recommended. The only downside is that it does not cover hierarchical models or computer implementation of any numerical approximation methods.
  • Gill, J. (2002). Bayesian Methods: A Social and Behavioral Sciences Approach. Boca Raton, FL: Chapman and Hall/CRC Press.
    and
  • Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004). Bayesian Data Analysis, 2nd Ed. Boca Raton, FL: Chapman and Hall/CRC Press. These two books offer more advanced examples of Bayesian methods, including Monte Carlo techniques, but require a lot more "connecting the dots" by the reader.

    Grading; Homework; Exams: There will be weekly homework assignments. At this time I plan no exams and no final project. Grades will be determined by performance on the homework assignments. All assignments are mandatory. There will be penalties for late homework unless you have a cogent excuse. These penalties are designed as an incentive to you because the material is cumulative; the penalties also help keep things fair to all students. If you must be late with an assignment, please let me know immediately.

    Disclaimer: All the information here is subject to change. Changes will be announced in class.

    This web page is at URL = http://www.indiana.edu/~jkkteach/P533/