P533
Bayesian Data Analysis, Prof. John K. Kruschke
Spring 2019: Tu/Th 9:30am-10:45am, Room 111 Psych.
Overview:
P533 is a tutorial
introduction to doing Bayesian data analysis. The course is intended to make
advanced Bayesian methods genuinely accessible to graduate students in the
social sciences. The course covers all the fundamental concepts of Bayesian
methods, and works from the simplest models up through hierarchical models
(a.k.a. multilevel models) applied to various types of data. More details about
content are provided below in the Schedule of Topics. Students from all fields
are welcome and encouraged to enroll (see figure at right). The course uses
examples from a variety of disciplines.
Prerequisites: This is not a mathematical statistics
course, but some math is unavoidable. If you understand basic summation
notation like Σ_{i} x_{i} and
integral notation like ∫ x dx,
then you're in good shape. We will be doing a lot of computer programming in a
language called R. R is free and can be installed on any computer. The textbook
includes an introductory chapter on R. A previous course in traditional
statistics or probability can be helpful as background, but is not essential.
P533 proceeds independently of traditional ("null hypothesis significance
testing") statistical methods.
Credit toward I.U.
Statistics Department requirements:
P533 counts toward the Ph.D. minor in STAT and toward the 12-hour "area
relevant to statistics" section of the MSAS (Master's in Applied Statistics).
Homework: There will be weekly homework
assignments. You are encouraged to use whatever resources help you understand
the homework and complete it with full comprehension, but ultimately you must
write your own answers on your own and in your own words. Each homework
assignment begins with an honor statement indicating that you are writing your
answers on your own in your own words. In the answers you submit, please
provide explanations and thoroughly show all your computations, with annotation
that explains what you are doing. An unannotated succession of computations
will not get full credit, even if it is numerically correct.
Course Grading Method: Grading is based on your total
homework score, as a percentile relative to the class. There are no exams and
no projects. N.B.: Scores tend to be very high, so do not assume that, say, 96%
must be a grade of A; it could end up below an A if, say, two thirds of
the class does better than 96%. Typically the late
penalties turn out to be a bigger deduction than points missed due to errors,
so don't fall behind. As this is a graduate course, grades are typically in the
A to high-B range, and only rarely is a C or lower assigned.
All
assignments are mandatory. Late homework is exponentially penalized with a
half-life of one week, meaning that after one week 50% is the maximum possible
score. (The R program for the exponential decay is in the Canvas files; see LatePenalty.R.) No homework may be turned in more than
three weeks after its due date (and no homework may be turned in after
12:00 noon of Wednesday of finals week). There are two reasons for this policy:
First, the course moves quickly and the material is
largely cumulative, so the late penalty acts as an extra incentive to keep up.
Second, the assistant, who will be grading the homework, must not be given a
flood of late homework papers at the end of the semester. In recognition of the
fact that "life happens" (e.g., short-term illness, personal turmoil,
overwhelming confluence of deadlines, etc.), your two worst late penalties will
be dropped. In other words, for every homework we will record the scores with
and without a late penalty. The two homeworks with
the largest difference between with and without late penalty will have their
late penalty dropped. Note, therefore, that any homework not turned in will
count as zero.
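The half-life rule works like radioactive decay: each additional week late halves the maximum attainable score. A minimal sketch in R of this idea (the function name and exact formula here are illustrative assumptions; the official computation is the LatePenalty.R program in the Canvas files):

```r
# Illustrative sketch of the late-penalty decay (half-life = one week).
# ASSUMPTION: this reconstructs the idea described in the syllabus;
# the graded computation is in LatePenalty.R on Canvas.
latePenaltyMax <- function(daysLate, halfLifeDays = 7) {
  # Maximum attainable fraction of full credit after daysLate days:
  0.5^(daysLate / halfLifeDays)
}

latePenaltyMax(0)   # 1.00: on time, full credit possible
latePenaltyMax(7)   # 0.50: one week late, at most 50%
latePenaltyMax(14)  # 0.25: two weeks late, at most 25%
```

So a homework turned in a few days late loses only modestly, but the penalty compounds quickly after the first week, which is why keeping up matters more than perfection.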
Required textbook: Doing
Bayesian Data Analysis, 2nd Edition: A Tutorial with R, JAGS, and Stan. Go
to the web page, https://sites.google.com/site/doingbayesiandataanalysis/purchase,
for a link to purchase the book. The book is also available online through the
IU Library.
Instructor: John K. Kruschke,
johnkruschke@gmail.com. Office hours by appointment; please do ask.
Assistant: Brad Celestin, bcelesti@umail.iu.edu.
Office hours to be posted on Canvas.
Discussion: Please discuss the assignments and
lectures on Canvas. If you are
attending the class but cannot get access to the Canvas page, please email Prof.
Kruschke.
Disclaimer: All information in this document is
subject to change. Changes will be announced in class.
Schedule of Topics
(The exact day of each topic might flex as the course progresses. Numbers refer to textbook chapters.)

Week 1 (Tu): 2. Introduction: Credibility, models, and parameters. Strongly recommended article: "Bayesian data analysis for newcomers" at https://link.springer.com/article/10.3758/s13423-017-1272-1 or https://psyarxiv.com/nqfr5/
Week 1 (Th): 3. The R programming language. Instructions for installation of software are here: https://sites.google.com/site/doingbayesiandataanalysis/software-installation
Week 2 (Tu): 4. Probability.
Week 2 (Th): 5. Bayes' rule.
Week 3 (Tu): 6. Inferring a probability via mathematical analysis.
Week 3 (Th): 7. Markov chain Monte Carlo (MCMC).
Week 4 (Tu): 8. JAGS.
Week 4 (Th): 8, continued.
Week 5 (Tu): 9. Hierarchical models.
Week 5 (Th): 9, continued. 10. Model comparison.
Week 6 (Tu): 10, continued. 11. Null hypothesis significance testing (NHST). Strongly recommended article: "The Bayesian New Statistics" at https://link.springer.com/article/10.3758/s13423-016-1221-4 or https://osf.io/ksfyr/
Week 6 (Th): 11. NHST, continued.
Week 7 (Tu): 12. Bayesian null assessment. See also articles "Bayesian assessment of null values via parameter estimation and model comparison" at http://www.indiana.edu/~kruschke/articles/Kruschke2011PoPScorrected.pdf and "Rejecting or accepting parameter values in Bayesian estimation" at http://journals.sagepub.com/doi/full/10.1177/2515245918771304
Week 7 (Th): 12, continued.
Week 8 (Tu): 13. Goals, power, and sample size. See also video at http://www.youtube.com/playlist?list=PL_mlm7M63Y7j641Y7QJG3TfSxeZMGOsQ4
Week 8 (Th): 13, continued.
Week 9 (Tu): 15. The generalized linear model. 16. Metric predicted variable, 1- or 2-group predictor variable.
Week 9 (Th): 16, continued. Also power analysis applied to 2 groups. See article titled "Bayesian estimation supersedes the t test" at http://www.indiana.edu/~kruschke/BEST/
Week 10 (Tu): 17. Metric predicted variable, metric predictor variable.
Week 10 (Th): 17, continued. 18. Metric predicted variable, metric predictor variables. See also article "The time has come: Bayesian methods for data analysis in the organizational sciences" at http://www.indiana.edu/~kruschke/BMLR/
Week 11 (Tu): 18, continued.
Week 11 (Th): 19. Metric predicted variable, nominal predictor variable.
Week 12 (Tu): 19, continued. 20. Metric predicted variable, nominal predictor variables.
Week 12 (Th): 20, continued.
Week 13 (Tu): 21. Dichotomous predicted variable (logistic regression).
Week 13 (Th): 22. Nominal predicted variable (softmax regression). For an applied example of hierarchical conditional logistic regression, see article titled "Ostracism and fines in a public goods game with accidental contributions: The importance of punishment type" at http://journal.sjdm.org/14/14721a/jdm14721a.pdf
Week 14 (Tu): 22, continued.
Week 14 (Th): 23. Ordinal predicted variable (ordered probit regression). For more about the perils of applying metric models to ordinal data, see "Analyzing ordinal data with metric models: What could possibly go wrong?" at https://www.sciencedirect.com/science/article/pii/S0022103117307746 For another example of ordinal regression, see manuscript titled "Moral Foundation Sensitivity and Perceived Humor" at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2519218
Week 15 (Tu): 23, continued.
Week 15 (Th): 24. Count predicted variable.
Finals: No final exam, but final homework is due during finals week at date TBA.