P533
Bayesian Data Analysis, Prof. John K. Kruschke
Spring 2019: Tu, Th 9:30–10:45am, Room 111 Psych.
Overview:
P533 is a tutorial introduction
to doing Bayesian data analysis. The course is intended to make advanced
Bayesian methods genuinely accessible to graduate students in the social sciences.
Students from all fields are welcome and encouraged to enroll (see figure at
right). The course uses examples from a variety of disciplines. The course
covers all the fundamental concepts of Bayesian methods, and works from the simplest
models up through hierarchical models (a.k.a. multilevel models) applied to
various types of data. More details about content are provided below in the
Schedule of Topics.
Prerequisites: This is not a mathematical statistics
course, but some math is unavoidable. If you understand basic summation
notation like Σ_{i} x_{i} and
integral notation like ∫ x dx ,
then you're in good shape. We will not be using much math, but we will be doing
a lot of computer programming in a language called R. R is free and can be
installed on any computer. The textbook includes an introductory chapter on R
and there are lots of resources online. A previous course in traditional
statistics or probability can be helpful as background, but
is not essential. P533 proceeds independently of traditional ("null
hypothesis significance testing") statistical methods.
Credit toward I.U.
Statistics Department requirements:
P533 counts toward the Ph.D. minor in STAT and toward the 12-hour "area
relevant to statistics" section of the MSAS (Master's in Applied
Statistics).
Homework: There will be weekly homework
assignments. You are encouraged to use whatever resources help you understand
the homework and complete it with full comprehension, but ultimately you must
write your own answers on your own and in your own words. Each homework
assignment begins with an honor statement indicating that you are writing your
answers on your own in your own words. In your answers that you submit, please
provide explanations and thoroughly show all your computations, with annotation
that explains what you are doing. An unannotated succession of computations
will not get full credit, even if it is numerically correct.
Course Grading Method: Grading is based on your total
homework score, as a percentile relative to the class. There are no exams and
no projects. N.B.: Scores tend to be very high, so do not assume that, say, 96%
must be an A; it could end up being a B if, say, two thirds of
the class does better than 96%. Typically the late
penalties turn out to be a bigger deduction than points missed due to errors,
so don't fall behind. As this is a graduate course, grades are typically in the
A to high B range, and only rarely is a C or less assigned.
All
assignments are mandatory. Late homework is exponentially penalized with a
half-life of one week, meaning that after one week 50% is the maximum possible
score. (The R program for the exponential decay is in the Canvas files; see LatePenaltyCalculator.R.) No homework may be turned in more
than three weeks after its due date (and no homework may be turned in
after 12:00 noon on Wednesday of finals week). There are two reasons for this
policy: First, the course moves quickly and the material
is cumulative, so the late penalty acts as an extra incentive to keep up.
Second, the assistant, who will be grading the homework, must not be given a
flood of late homework papers at the end of the semester. In recognition of the
fact that "life happens" (e.g., short-term illness, personal turmoil,
overwhelming confluence of deadlines, etc.), your two worst late penalties will
be dropped. In other words, for every homework we will record the scores with
and without a late penalty. The two homeworks with
the largest difference between with and without late penalty will have their
late penalty dropped. Note, therefore, that any homework not turned in will
count as zero.
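The half-life rule above amounts to a one-line formula: the maximum attainable score halves for each week a homework is late. The official computation is the LatePenaltyCalculator.R program in the Canvas files; purely as an illustration (the function name and Python rendering here are not part of the course materials), the same decay can be sketched as:

```python
# Illustrative sketch of the half-life late penalty.
# The official computation is LatePenaltyCalculator.R in the Canvas files;
# this Python version only mirrors the rule stated in the syllabus:
# the maximum possible score halves for each week the homework is late.

def max_score_fraction(days_late: float, half_life_days: float = 7.0) -> float:
    """Fraction of full credit still attainable after days_late days."""
    if days_late <= 0:
        return 1.0  # on time (or early): full credit possible
    return 0.5 ** (days_late / half_life_days)

# e.g., one week late -> 0.5, two weeks late -> 0.25, three weeks late -> 0.125
```

So an assignment turned in one week late can earn at most half credit even if every answer is correct, which is why keeping up is worth more than polishing.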
Required textbook: Doing
Bayesian Data Analysis, 2nd Edition: A Tutorial with R, JAGS, and Stan. Go
to the web page, https://sites.google.com/site/doingbayesiandataanalysis/purchase,
for a link to purchase the book. The book is also available online through the
IU Library.
Instructor: John K. Kruschke,
johnkruschke@gmail.com. Office hours by appointment; please do ask.
Assistant: Brad Celestin, bcelesti@umail.iu.edu.
Office hours to be posted on Canvas.
Discussion: Please discuss the assignments and lectures
on Canvas. If you are attending the
class but cannot get access to the Canvas
page, please email Prof. Kruschke.
Disclaimer: All information in this document is subject
to change. Changes will be announced in class.
Schedule of Topics
(Exact day of each topic might flex as the course progresses.)

Week    Day   Chapter and topic
1       Tu    2. Introduction: Credibility, models, and parameters. Strongly recommended article: "Bayesian data analysis for newcomers" at https://link.springer.com/article/10.3758/s13423-017-1272-1
1       Th    3. The R programming language. Instructions for installation of software are here: https://sites.google.com/site/doingbayesiandataanalysis/software-installation
2       Tu    4. Probability.
2       Th    5. Bayes' rule.
3       Tu    6. Inferring a probability via mathematical analysis.
3       Th    7. Markov chain Monte Carlo (MCMC).
4       Tu    8. JAGS.
4       Th    8, continued.
5       Tu    9. Hierarchical models.
5       Th    9, continued. 10. Model comparison.
6       Tu    10, continued. 11. Null hypothesis significance testing (NHST). Strongly recommended article: "The Bayesian New Statistics" at https://link.springer.com/article/10.3758/s13423-016-1221-4
6       Th    11. NHST, continued.
7       Tu    12. Bayesian null assessment. See also articles "Bayesian assessment of null values via parameter estimation and model comparison" at http://www.indiana.edu/~kruschke/articles/Kruschke2011PoPScorrected.pdf and "Rejecting or accepting parameter values in Bayesian estimation" at http://journals.sagepub.com/doi/full/10.1177/2515245918771304
7       Th    12, continued.
8       Tu    13. Goals, power, and sample size. See also video at http://www.youtube.com/playlist?list=PL_mlm7M63Y7j641Y7QJG3TfSxeZMGOsQ4
8       Th    13, continued.
9       Tu    15. The generalized linear model. 16. Metric predicted variable, 1 or 2 group predictor variable.
9       Th    16, continued. Also power analysis applied to 2 groups. See article titled "Bayesian estimation supersedes the t test" at http://www.indiana.edu/~kruschke/BEST/
10      Tu    17. Metric predicted variable, metric predictor variable.
10      Th    17, continued. 18. Metric predicted variable, metric predictor variables. See also article "The time has come: Bayesian methods for data analysis in the organizational sciences" at http://www.indiana.edu/~kruschke/BMLR/
11      Tu    18, continued.
11      Th    19. Metric predicted variable, nominal predictor variable.
12      Tu    19, continued. 20. Metric predicted variable, nominal predictor variables.
12      Th    20, continued.
13      Tu    21. Dichotomous predicted variable (logistic regression).
13      Th    22. Nominal predicted variable (softmax regression). For an applied example of hierarchical conditional logistic regression, see article titled "Ostracism and fines in a public goods game with accidental contributions: The importance of punishment type" at http://journal.sjdm.org/14/14721a/jdm14721a.pdf
14      Tu    22, continued.
14      Th    23. Ordinal predicted variable (ordered probit regression). For more about the perils of applying metric models to ordinal data, see "Analyzing ordinal data with metric models: What could possibly go wrong?" at https://www.sciencedirect.com/science/article/pii/S0022103117307746. For another example of ordinal regression, see manuscript titled "Moral Foundation Sensitivity and Perceived Humor" at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2519218
15      Tu    23, continued.
15      Th    24. Count predicted variable.
Finals  --    No final exam, but final homework is due during finals week at a date TBA.