P533 Intro. to Bayesian Data Analysis I, Prof. Kruschke
P533
Introduction to Bayesian Data Analysis I.
Fall 2008:
M, W, 10:10-11:35am, room Psych 111.
(Registrar class number 17632)
Under construction, May 2008.
This course is a tutorial introduction to doing Bayesian statistics
for data analysis. We will start from the basics of probabilities and
Bayes' theorem, and gradually work our way through contemporary Monte
Carlo methods in the context of simple analyses, building up to simple
examples of hierarchical models (see list of topics below). The course
is intended to make advanced Bayesian methods accessible to real
graduate students (see prerequisites below). The course is "hands on": we
will build computer-based analyses so that you can actually get in the
kitchen and make a meal, rather than just consume fast food at the
drive-through. This way you can adapt the methods to your own research
scenarios.
This course is the
first semester of a two-semester sequence. This first semester
emphasizes the simplest data situation: two-valued measurements such
as yes/no, agree/disagree, remember/forget, detect/miss, male/female,
heads/tails, and so on. The main goal is to use this simple situation
to develop the methods of contemporary Bayesian analysis. The second
semester, P534, applies the methods to data
situations involving continuous measures (e.g., response time).
The second semester emphasizes inferences regarding means,
trends, and so forth.
Schedule of Topics:
- Models, parameters, beliefs. Intro to R.
- Probability: Inside and outside the head. Mass and
density. Conditional probabilities.
- Bayes' theorem. Three goals of statistical inference.
- Inferring a binomial proportion via exact mathematical analysis.
- Inferring a binomial proportion via grid approximation.
- Inferring a binomial proportion via Monte Carlo
approximation. The Metropolis algorithm.
- Inferences regarding two binomial proportions. Goals of
inference. Three approaches: Analytical, grid approximation, Monte
Carlo approximation. Metropolis algorithm and Gibbs sampling.
- Binomial likelihood with hierarchical priors. Intro to
"BUGS" software.
- Comparison of Bayesian inference with null hypothesis
significance testing.
The second semester of the course, P534, covers analysis of means,
linear models, etc.
Intended audience: The course is aimed at graduate students
and advanced undergraduates.
Prerequisites:
This is not a mathematical
statistics course, but a fair amount of mathematics is unavoidable. If
you know what someone means when she says, "The integral of x squared
is one-third x cubed," then you should be okay. You will not have to
generate a lot of mathematical derivations, but you will have
to understand some, and most will involve basic calculus.
We will be doing a lot of computer programming in a language
called R. R is free and can be installed on any computer. The road to
understanding will be much smoother if you have already had some
programming experience, in any language. It's easy to learn basic
programming, but it can be time-consuming, so if you don't have any
previous experience, just anticipate spending more time. Learning to
program can have huge payoffs in multiple situations later in your
career, so it's worth the effort. A previous course in traditional
statistics (such as K300) or probability can be helpful as
background. Although the course will be developed independently of
traditional ("null hypothesis significance testing") statistical
methods, you might find the concepts of probability easier to
understand if you have already had some exposure to them.
Materials:
The primary materials will be information delivered
in lecture and in extensive course notes being written by
Prof. Kruschke. Readings will be posted on Oncourse under the
"Resources" link.
Please also discuss the assignments and lectures on Oncourse under the
"Message Center" link. If you are attending the class but cannot get
access to the Oncourse page, please e-mail Prof. Kruschke.
The following textbooks are recommended (not required) as additional sources:
Albert, J. H., & Rossman, A. J. (2001). Workshop Statistics:
Discovery with Data, a Bayesian Approach. Emeryville, CA: Key College
Publishing. The last few chapters of this book give a wonderfully
"hands on" introduction to the basics of Bayesian statistics --- but
only the basics.
Bolstad, W. M. (2004). Introduction to Bayesian
Statistics. Hoboken, NJ: Wiley. Terrific tutorial that uses only
basic calculus; highly recommended. The only downside is that it does not
cover hierarchical models or computer implementation of any numerical
approximation methods.
Gill, J. (2002). Bayesian Methods: A Social and Behavioral
Sciences Approach. Boca Raton, FL: Chapman and Hall/CRC Press.
and
Gelman, A., Carlin, J. B., Stern, H. S., & Rubin, D. B. (2004).
Bayesian Data Analysis, 2nd Ed. Boca Raton, FL: Chapman and Hall/CRC
Press. These two books offer more advanced examples of Bayesian
methods, including Monte Carlo techniques, but require a lot more
"connecting the dots" by the reader.
Grading; Homework; Exams: There will be weekly homework
assignments. At this time I plan no exams and no final
project. Grades will be determined by performance on the homework
assignments. All assignments are mandatory. There will be penalties
for late homework unless you have a cogent excuse. These penalties are
designed as an incentive to you because the material is cumulative;
the penalties also help keep things fair to all students. If you must
be late with an assignment, please let me know immediately.
Disclaimer: All the information here is subject to
change. Changes will be announced in class.
This web page is at URL = http://www.indiana.edu/~jkkteach/P533/