Statistics | Statistical Model Selection
S482 | 29521 | Guilherme Rocha
Estimation techniques in statistics are often based on choosing
parameters within a given family of models to maximize the fit to an
observed data set. While such model estimates are known to enjoy many
desirable properties, goodness of fit alone is not an adequate criterion
for selecting the best among models of different "complexity."
On the one hand, "simpler" models can be more revealing of the
structure in the data. On the other hand, they are often restricted
versions of more "complex" models, so they can never fit the observed
data better and hence will never be preferred based on goodness of fit
alone.
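As a small illustration of this point, the sketch below fits a sequence of nested least-squares regressions on simulated data and records the residual sum of squares (RSS) of each. The data-generating model, the sample size, and the names `rss` and `rss_path` are assumptions made for the example and are not part of the course materials.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

rng = np.random.default_rng(0)
n = 50
# Design matrix: an intercept column followed by five candidate predictors.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 5))])
# Assumed truth for this example: only the first predictor matters.
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=n)

# Nested models: intercept only, then one more column at a time.
rss_path = [rss(X[:, :k], y) for k in range(1, X.shape[1] + 1)]
for k, value in enumerate(rss_path, start=1):
    print(f"{k} coefficient(s): RSS = {value:.3f}")
```

Because each model is a restricted version of the next, the RSS cannot increase as columns are added, so ranking by fit alone always favors the largest model.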
In this course, we cover model selection techniques with an emphasis
on variable selection in the regression setting. We review classical
variable selection methods such as AIC, BIC, and Mallows' Cp and
discuss some of the computational issues involved. In addition, we
introduce some alternative measures of the complexity of a model and
review how they can be used for model selection purposes. Finally, we
briefly review some of the issues specific to high-dimensional data
sets and how they can be addressed.
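For orientation, the following is a minimal sketch of how the three classical criteria named above can be computed for nested Gaussian linear regressions. It assumes the common textbook forms AIC = n log(RSS/n) + 2k and BIC = n log(RSS/n) + k log(n), both up to an additive constant, and Cp = RSS/s^2 - n + 2k, where k counts the fitted regression coefficients and s^2 is the error variance estimated from the largest model. The simulated data and the helper name `nested_criteria` are hypothetical; other conventions (for example, counting the variance as a parameter) shift the values but not the comparison between models.

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def nested_criteria(X, y):
    """AIC, BIC and Mallows' Cp for the models using the first k columns of X."""
    n, p = X.shape
    # Error variance estimated from the full model (needed for Cp).
    sigma2_full = rss(X, y) / (n - p)
    rows = []
    for k in range(1, p + 1):
        r = rss(X[:, :k], y)
        rows.append({
            "k": k,
            "aic": n * np.log(r / n) + 2 * k,
            "bic": n * np.log(r / n) + k * np.log(n),
            "cp": r / sigma2_full - n + 2 * k,
        })
    return rows

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 5))])
# Assumed truth for this example: only the first predictor matters.
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=n)

table = nested_criteria(X, y)
for row in table:
    print(f"k={row['k']}  AIC={row['aic']:.2f}  BIC={row['bic']:.2f}  Cp={row['cp']:.2f}")
```

Each criterion adds a complexity penalty to a measure of fit, and the model with the smallest value is selected. With this definition, Cp for the full model equals its number of coefficients, which serves as a quick sanity check.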