Lecture | Title | Description | Notes | Code | |
---|---|---|---|---|---|
NA | Review | Stuff you should know: Basics of probability, the central limit theorem, and inference | NA | ||
1 | Introduction to Regression and Prediction | We will describe linear regression in the context of a prediction problem. | R | ||
2 | Overview of Supervised Learning | Regression for predicting bivariate data, K nearest neighbors (KNN), bin smoothers, and an introduction to the bias/variance trade-off. | R | ||
3-4 | Linear Methods for Regression | Subset selection and ridge regression. We will use singular value decomposition (SVD) and principal component analysis (PCA) to understand these methods. | R | ||
5 | Linear Methods for Classification | Linear Regression, Linear Discriminant Analysis (LDA), and Logistic Regression | R | ||
6 | Kernel Methods | Kernel smoothers, including loess. We will briefly describe two-dimensional smoothers. We will also define degrees of freedom in the context of smoothing and learn about density estimators. | R | ||
7 | Model Assessment and Selection | We revisit the bias/variance trade-off. We describe how Monte Carlo simulations can be used to assess bias and variance. We then introduce cross-validation, AIC, and BIC. | R | ||
8 | The Bootstrap | We give a short introduction to the bootstrap and demonstrate its utility in smoothing problems. | R | ||
9-10 | Splines, Wavelets, and Friends | We give intuitive and mathematical descriptions of splines and wavelets. We use the SVD to understand these better and see connections with signal processing methods. | R | ||
11-12 | Additive Models, GAM and Neural Networks | We move back to cases with many covariates. We introduce projection pursuit, additive models, and generalized additive models. We briefly describe neural networks and explain the connection to projection pursuit. | NA | ||
13-14 | CART, Boosting and Additive Trees | We introduce classification and regression trees (CART) as well as more modern versions such as random forests. | archive for CART, archive for others | ||
15 | Model Averaging | Bayesian Statistics, Boosting and Bagging | NA | ||
16 | Clustering algorithms | Notes and code taken from my microarray class | R |
Homework:
If your field is mathematical (statistics, biostatistics, engineering, etc.), then look through the top journal of your favorite public health application. If you don't have one, then use the American Journal of Epidemiology (there should be plenty of regression analyses in this journal).
Data-sets: