P533/P534 Bayesian Data Analysis I & II, Prof. Kruschke

Introduction to Bayesian Data Analysis

Spring 2012: Mondays and Wednesdays, 5:45pm-8:15pm,
Room 111 of the Psychology Building.

P533 and P534 are consecutive 8-week courses in one semester.
P533 is required for P534.
P533 is Section 26725, P534 is Section 22134.

Prof. John K. Kruschke

Figure 1. Why you should enroll. (Plot: success increases with knowledge of Bayesian data analysis.)
(Notice that the Bayesian analysis reveals many credible regression lines, for which the slopes and intercepts trade off, instead of just one "best" line.)

P533/P534 is a tutorial introduction to doing Bayesian statistics for data analysis. In P533, we start from the basics of probabilities and Bayes' theorem, and gradually work our way through contemporary Monte Carlo methods in the context of simple analyses, building up to hierarchical models, model comparison, and power analysis. In P534, we do a variety of realistic applications, covering the Bayesian versions of linear regression, logistic regression, t-tests, analysis of variance, etc., including a look at repeated measures designs. More details about topic coverage are provided below. The course is intended to make advanced Bayesian methods genuinely accessible to real graduate students, and even unreal undergraduates (see prerequisites below). Many complete computer programs are provided for you to adapt to your own research.

Why should we do Bayesian analysis instead of 20th-century null hypothesis significance testing? Sciences from astronomy to zoology are making the change to Bayesian data analysis. Read more:
  • An open letter explaining why it's time to go Bayesian.
  • An article* that explains a critical flaw of p-values in null hypothesis significance testing, and two different Bayesian approaches to assessing null values.
      Kruschke, J. K. (2010). Bayesian data analysis. Wiley Interdisciplinary Reviews: Cognitive Science, 1(5), 658-676. (doi:10.1002/wcs.72)
  • An article* that emphasizes advantages of Bayesian data analysis and the fact that Bayesian data analysis is appropriate regardless of the status of Bayesian models of cognition.
      Kruschke, J. K. (2010). What to believe: Bayesian methods for data analysis. Trends in Cognitive Sciences, 14(7), 293-300. (doi:10.1016/j.tics.2010.05.001)
  • More at the blog!
*Your click on this link constitutes your request to the author for a personal copy of the article exclusively for individual research.

Prerequisites:

  • This is not a mathematical statistics course, but some math is unavoidable. If you can handle basic summation notation like Σ_i x_i and integral notation like ∫ x dx, you're in good shape. You will not need to generate mathematical derivations.
  • We will be doing a lot of computer programming in a language called R. R is free and can be installed on any computer. (We will be using a free add-on package called BRugs that only works with Windows, but which you can run with WINE in MacOS or Linux.) The road to understanding will be much smoother if you already have some programming experience, in any language. It's easy to learn basic programming, but it can be time-consuming, so if you don't have any previous experience, just anticipate spending more time. Learning to program can have huge payoffs in multiple situations later in your career, so it's worth the effort.
  • A previous course in traditional statistics (such as K300) or probability can be helpful as background, but is not essential. P533 and P534 proceed independently of traditional ("null hypothesis significance testing") statistical methods.

    Topics covered, in a little more detail: P533 emphasizes the simplest data situation: two-valued measurements such as yes/no, agree/disagree, remember/forget, detect/miss, male/female, heads/tails, and so on. The main goal is to use this simple situation to develop all the methods of contemporary Bayesian analysis, including hierarchical models and even the impress-your-friends-with-this "transdimensional Markov chain Monte Carlo" method for model comparison! P534 applies the methods to more complex data designs, corresponding to classical methods of multiple linear regression, logistic regression, t-tests, analysis of variance, etc. For complete details of coverage, see the textbook's Table of Contents, linked from the textbook web page.
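
    To give a flavor of the two-valued data situation described above, here is a minimal sketch of a Bayesian update for a yes/no proportion. It is only an illustration, not course material: it is written in Python rather than the R and BUGS programs used in the course, and the uniform Beta(1, 1) prior, the made-up data (7 "yes" and 3 "no"), the equal-tailed 95% interval, and all function names are assumptions chosen for the example.

```python
import random


def update_beta(prior_a, prior_b, successes, failures):
    """Conjugate update: a Beta(a, b) prior plus two-valued data gives a Beta posterior."""
    return prior_a + successes, prior_b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


def sampled_interval(a, b, mass=0.95, n_draws=20000, seed=1):
    """Approximate an equal-tailed credible interval by Monte Carlo sampling."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    tail = (1.0 - mass) / 2.0
    low = draws[int(tail * n_draws)]
    high = draws[int((1.0 - tail) * n_draws) - 1]
    return low, high


# Hypothetical data: 7 "yes" and 3 "no" responses, with a uniform Beta(1, 1) prior.
post_a, post_b = update_beta(1, 1, successes=7, failures=3)
posterior_mean = beta_mean(post_a, post_b)
interval = sampled_interval(post_a, post_b)

print("Posterior: Beta(%d, %d)" % (post_a, post_b))
print("Posterior mean of the proportion: %.3f" % posterior_mean)
print("Approximate 95%% equal-tailed interval: %.2f to %.2f" % interval)
```

    The posterior is a whole distribution over the underlying proportion rather than a single "best" value, which is the same idea as the many credible regression lines in Figure 1.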

    Textbook: Doing Bayesian Data Analysis: A Tutorial with R and BUGS, by J. K. Kruschke. Academic Press, 2010.

    Discussion: Please discuss the assignments and lectures on Oncourse under the "Forums" link. If you are attending the class but cannot get access to the Oncourse page, please e-mail Prof. Kruschke.

    Grading; Homework; Exams: There are homework exercises assigned every week or so. No exams or projects. Grades will be determined by performance on the homework assignments. All assignments are mandatory. There will be penalties for late homework unless you have a cogent excuse. These penalties are designed as an incentive to you because the material is cumulative; the penalties also help keep things fair to all students. If you must be late with an assignment, please notify the professor immediately.

    How does this course (P533/P534) differ from S626? The Dept. of Statistics offers S626, Bayesian theory and data analysis. S626 has a prerequisite of "two statistics courses at the graduate level", and provides a mathematical treatment of Bayesian data analysis. Students are encouraged to consider S626 after taking P533/P534.

    Disclaimer: All the information here is subject to change. Changes will be announced in class.

    This web page is at URL = http://www.indiana.edu/~jkkteach/P533/