Predictive Modeling
Preface
Welcome
Main references and credits
Contributions
License
Citation
1 Introduction
1.1 Course overview
1.2 What is Predictive Modeling?
1.3 General notation and background
1.4 Scripts and datasets
2 Linear models I: multiple linear model
2.1 Case study: The Bordeaux equation
2.2 Model formulation and least squares
2.2.1 Simple linear model
2.2.2 Case study application
2.2.3 Multiple linear model
2.2.4 Case study application
2.3 Assumptions of the model
2.4 Inference for model parameters
2.4.1 Distributions of the fitted coefficients
2.4.2 Confidence intervals for the coefficients
2.4.3 Testing on the coefficients
2.4.4 Case study application
2.5 Prediction
2.5.1 Case study application
2.6 ANOVA
2.6.1 Case study application
2.7 Model fit
2.7.1 The \(R^2\)
2.7.2 The \(R^2\!{}_{\text{Adj}}\)
2.7.3 Case study application
3 Linear models II: model selection, extensions, and diagnostics
3.1 Case study: Housing values in Boston
3.2 Model selection
3.2.1 Case study application
3.2.2 Consistency in model selection
3.3 Use of qualitative predictors
3.3.1 Case study application
3.4 Nonlinear relationships
3.4.1 Transformations in the simple linear model
3.4.2 Polynomial transformations
3.4.3 Interactions
3.4.4 Case study application
3.5 Model diagnostics
3.5.1 Linearity
3.5.2 Normality
3.5.3 Homoscedasticity
3.5.4 Independence
3.5.5 Multicollinearity
3.5.6 Outliers and high-leverage points
3.5.7 Case study application
3.6 Dimension reduction techniques
3.6.1 Review on principal component analysis
3.6.2 Principal components regression
3.6.3 Partial least squares regression
4 Linear models III: shrinkage, multivariate response, and big data
4.1 Shrinkage
4.1.1 Ridge regression
4.1.2 Lasso
4.1.3 Variable selection with lasso
4.2 Constrained linear models
4.3 Multivariate multiple linear model
4.3.1 Model formulation and least squares
4.3.2 Assumptions and inference
4.3.3 Shrinkage
4.4 Big data considerations
5 Generalized linear models
5.1 Case study: The Challenger disaster
5.2 Model formulation and estimation
5.2.1 Logistic regression
5.2.2 General case
5.3 Inference for model parameters
5.3.1 Distributions of the fitted coefficients
5.3.2 Confidence intervals for the coefficients
5.3.3 Testing on the coefficients
5.3.4 Case study application
5.4 Prediction
5.4.1 Case study application
5.5 Deviance
5.6 Model selection
5.7 Model diagnostics
5.7.1 Linearity
5.7.2 Response distribution
5.7.3 Independence
5.7.4 Multicollinearity
5.8 Shrinkage
5.9 Big data considerations
6 Nonparametric regression
6.1 Nonparametric density estimation
6.1.1 Histogram and moving histogram
6.1.2 Kernel density estimation
6.1.3 Bandwidth selection
6.1.4 Multivariate extension
6.2 Kernel regression estimation
6.2.1 Nadaraya–Watson estimator
6.2.2 Local polynomial regression
6.2.3 Asymptotic properties
6.2.4 Bandwidth selection
6.3 Kernel regression with mixed multivariate data
6.4 Prediction and confidence intervals
6.5 Local likelihood
Appendix
A Further topics
A.1 Informal review on hypothesis testing
A.2 Least squares and maximum likelihood estimation
A.3 Multinomial logistic regression
A.4 Dealing with missing data
A.5 A note of caution with inference after model-selection
B Software
B.1 Installation of R and RStudio
B.2 Introduction to RStudio
B.3 Introduction to R
Simple computations
Variables and assignment
Vectors
Some functions
Matrices, data frames, and lists
More on data frames
Vector-related functions
Logical conditions and subsetting
Plotting functions
Distributions
Functions
Control structures
References