Fall 2017, MWF 10:30am-11:45am,

| | Instructor | Assistant |
|---|---|---|
| Name: | John K. Kruschke | Brad Celestin |
| Office Room: | PY 364 | PY 243 |
| Office Hours: | By appt. (please do ask) | Mondays after class, 11:45 - 12:45 |
| E-mail: | johnkruschke@gmail.com | bcelesti@umail.iu.edu |

**Course Description:** This course is an introduction to basic statistics (despite the official title, "Advanced Statistics..."). We will cover fundamental concepts of statistical inference, focusing on classical "frequentist" methods but also getting some exposure to Bayesian methods. We will explore some of the most commonly used models, including *t*-tests, ANOVA, regression, etc. More information about content is provided below, and on the schedule.

**Prerequisites:** This course is intended to bring all the incoming graduate students in Psychology "up to pace," so it is not intended to "weed out" students with relatively weak previous training in statistics. On the other hand, this course is definitely *not* remedial --- it moves quickly and covers a lot of material, so expect to devote a lot of time to the course. You should have previously taken an undergraduate course in statistics. The course emphasizes conceptual unification, not rote mechanics. A purpose of P553 is to enrich and solidify your understanding of the conceptual underpinnings of methods to which you were previously exposed. (After taking various previous instantiations of this course, many students have told me that although they have *taken* stats courses before, this is the first time they have *understood* statistics! My hope is that regardless of your previous level of understanding, you come away from this course with a better understanding.)

Students with relatively strong previous training in statistics should also find this course useful to refresh their knowledge and to gain a deeper understanding of the basic concepts. If you are a Psychology major and have already taken a comparable graduate-level course, and feel that you are already thoroughly familiar with the material in P553, please see the instructor to discuss a possible exemption from the P553 requirement. Students exempted from P553 are encouraged to take other statistics courses, such as Prof. Kruschke's Bayesian course.

Students from all disciplines are welcome in this course.

**After taking this course you should be able to...**

- understand what a *p* value really is.
- know the models underlying typical analyses such as *t*-tests, ANOVA, regression, etc.
- conduct analyses in the R computing language.
- transfer your knowledge to more complex analyses and to other software.
- eagerly enroll in a subsequent course on Bayesian statistics.

**Homework:** There will be weekly homework assignments. You are encouraged to use whatever resources help you understand the homework and complete it with full comprehension, but ultimately you must write your answers on your own and in your own words. Each homework assignment begins with an honor statement indicating that you are writing your answers on your own in your own words. In the answers you submit, please provide explanations and thoroughly show all your computations, *with annotation* that explains what you are doing. An unannotated succession of computations will not get full credit, even if it is numerically correct.

**Course Grading Method:** Grading is based on your total homework score,

All assignments are mandatory. Late homework is exponentially penalized with a half-life of one week, meaning that after one week 50% is the maximum possible score. (The R program for the exponential decay is in the Canvas files; see LatePenalty.R.) No homework may be turned in more than three weeks after its due date (and no homework may be turned in after 12:00 noon on Wednesday of finals week). There are two reasons for this policy: First, the course moves quickly and the material is largely cumulative, so the late penalty acts as an extra incentive to keep up. Second, the assistant, who will be grading the homework, must not be given a flood of late homework papers at the end of the semester. In recognition of the fact that "life happens" (e.g., short-term illness, personal turmoil, overwhelming confluence of deadlines), your two worst late penalties will be dropped. In other words, for every homework we will record the scores with and without a late penalty. The two homeworks with the largest difference between the score with the late penalty and the score without it will have their late penalties dropped. Note, however, that only late penalties are dropped, not homework scores, so any homework not turned in will count as zero.
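To make the arithmetic of the late policy concrete, here is a minimal sketch in Python. It is not the LatePenalty.R program from the Canvas files, which is not reproduced here. It assumes the penalty decays continuously with the number of days late, and the names `late_multiplier` and `total_score` are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only; the course's actual LatePenalty.R may differ.
HALF_LIFE_DAYS = 7.0  # from the syllabus: half-life of one week
CUTOFF_DAYS = 21.0    # from the syllabus: nothing accepted more than three weeks late


def late_multiplier(days_late):
    """Fraction of credit retained (assumption: continuous decay in days)."""
    if days_late <= 0:
        return 1.0
    return 0.5 ** (days_late / HALF_LIFE_DAYS)


def total_score(homeworks, n_dropped=2):
    """Total homework score after dropping the n_dropped largest late penalties.

    homeworks: list of (raw_score, days_late) pairs. A homework that was
    never turned in should be entered with a raw score of 0.
    """
    without_penalty = []
    with_penalty = []
    for raw, days in homeworks:
        if days > CUTOFF_DAYS:
            # Past the cutoff the homework is not accepted, so it counts as
            # zero and has no late penalty that could be dropped.
            raw, days = 0.0, 0.0
        without_penalty.append(raw)
        with_penalty.append(raw * late_multiplier(days))

    # Penalty = points lost to lateness; forgive the largest n_dropped.
    penalties = sorted(
        (wo - w for wo, w in zip(without_penalty, with_penalty)),
        reverse=True,
    )
    return sum(without_penalty) - sum(penalties[n_dropped:])
```

Under these assumptions, `late_multiplier(7)` is 0.5 and `late_multiplier(14)` is 0.25. For four 10-point homeworks turned in 0, 7, 14, and 7 days late, the penalties are 0, 5, 7.5, and 5 points; dropping the two largest (7.5 and 5) leaves a total of 35 out of 40.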

**Software:** We'll be using software called R and RStudio. Both are free to download and install on your personal computer. Details will be provided in class. Both are also on all IU computers.

**Lecture Notes:** Lecture materials will be posted online. The early weeks have some extensive written notes, but the later weeks have only slides without annotation. Therefore, if you must miss a lecture, please get notes from a classmate and then see the assistant during office hours, or Prof. Kruschke, if you have questions.

**Recommended Book:** We will not be following this book chapter by chapter, but it is a very accessible reference for many of the topics we'll be covering. I highly recommend it as a very useful resource for this course and for your future data analysis with R. The book is:

Fischetti, Tony (2015). *Data Analysis with R*. Birmingham, UK: Packt Publishing. ISBN: 978-1-78528-814-2.

**Other Reference Materials:** There are many online materials about R, including the official R documentation. Another nice resource is Using R for psychological research by the Personality Project. Another useful online site is Quick-R (which also promotes a book that shows examples of R but does not explain statistical concepts).

**Canvas:** We will use an online system called Canvas for posting announcements, discussion, and grades. To get to Canvas, go to https://canvas.iu.edu/ and click on the Login button. You need to have an IU computer account, and you need to be enrolled in this course.

**Schedule:** Weekly homework is assigned on Wednesdays and is due the following Wednesday. In class there will be a mix of lecture and computer demonstration. The schedule below is a guideline only; please see updated details posted in Canvas!

| Week | Topics |
|---|---|
| 1 | Describing noisy data with mathematical models. Getting started with R. |
| 2 | Finding the parameter values of a model that best fit the data: maximum likelihood estimation. Examples: Single group, two groups, linear regression. |
| 3 | Sampling distributions and p values (null hypothesis significance testing). |
| 4 | Sampling distributions and confidence intervals. |
| 5 | Sampling distributions (hence p values and confidence intervals) depend on stopping and testing intentions. |
| 6 | Model comparison and deciding among models. |
| 7 | The generalized linear model. Linear regression. |
| 8 | Multiple linear regression. |
| 9 | Oneway ANOVA. Multiple comparisons. |
| 10 | Multi-factor ANOVA. Interaction. |
| 11 | Dichotomous predicted variable: Logistic regression. |
| 12 | Nominal predicted variable: Softmax regression. |
| 13 | Ordinal predicted variable: Ordinal probit regression. |
| 14 | Count predicted variable: Log-linear models. |
| 15 | Bayesian methods: Bayesian for newcomers: published article, final manuscript. Bayesian and frequentist, hypothesis testing and estimation: published article, final manuscript. Two-group comparison: JEP:G article. Linear regression: ORM article. Hierarchical models: chapter. |

**Disclaimer:** This syllabus is meant to be suggestive, not absolute. Any and all of the information on this syllabus is subject to change at any time, including due dates, grading policies, etc. Changes will be announced in class.