Advanced Regression Methods
1
Introduction
1.1
Course Description
1.2
Course Information
1.3
Student Learning Outcomes
1.4
Other Course Materials
2
R-Basics
2.1
Class Instructions
2.2
Online Materials
2.3
DataCamp Courses
3
Play with R
3.1
Read the data from an Excel/SPSS file
3.2
Transform the data
3.3
Compute Variable
3.4
Class Activity 1: Calculate the aggregated data
3.4.1
Create a new variable, ‘ScienceTotal’, using the average of score1-score5.
3.4.2
Create a new variable, ‘ParentSupport’, using the mean of 4 variables (parentsupport1-parentsupport4)
3.4.3
Create a new variable, ‘StudentsBullied’, using the sum of 6 variables (studentbullied1-studentbullied6)
3.4.4
Perform descriptive analysis (mean, median, mode, and S.D.) on ScienceScore, ParentSupport, and StudentsBullied.
3.5
Recoding Variables
3.5.1
Load the car package for reverse coding
3.5.2
Reverse code the test item
3.6
Class Activity 2: Recode Variables
3.6.1
Recode into the same variables for ‘learning2, learning4’
3.6.2
Recode ‘confidence2, confidence3, confidence6’ variables only for students who are born in 1999 (‘year’ variable) and save the recoded variables into ‘confidence2_re, confidence3_re, confidence6_re’ variables.
3.6.3
Perform frequency analysis for ‘learning2, learning4, confidence2_re, confidence3_re, confidence6_re’.
3.7
Class Activity 3: Select Cases
3.7.1
Select ‘gender = girl’ and ‘year = 2000’ and create a new dataset named GIRL_2000. What is the mean of StudentBullied1?
3.7.2
Select students who have id=3001 to id=4000 and filter out unselected cases. What is the variance of parentsupport1?
3.7.3
Select 10% of all students at random and delete unselected cases. What is the frequency of learning1?
3.8
Sorting Cases & Merging files
3.8.1
Sorting Cases
3.8.2
Merging files: Add Cases
3.8.3
Merging files: Add Variables
3.9
Class Exercise
4
Running Correlations in R
4.1
Pearson & Spearman Correlation
4.1.1
Pearson Correlation
4.1.2
Spearman Correlation
4.1.3
Calculating the Pearson/Spearman Correlation in R
4.2
Point Biserial Correlation & Phi Correlation
4.2.1
Point Biserial Correlation
4.2.2
Phi Correlation
4.3
Partial and Semi-partial Correlation
4.4
Supplementary Learning Materials
5
Multiple Regression
5.1
Introduction to Multiple Regression
5.2
Simple Regression using R
5.2.1
Load your data
5.2.2
Scatterplot the data
5.2.3
Check the correlation
5.2.4
Run a simple regression model
5.2.5
Centering Variable for better interpretation
5.3
Multiple Regression
5.3.1
Variable Selection Methods
5.4
Assessing the regression model
5.4.1
Run the Multiple Regression model
5.5
Assumptions
5.5.1
Check the correlation matrix & the P-value matrix
5.5.2
Check the independent errors
5.5.3
Check the outliers by using Mahalanobis Distance
5.5.4
Check the outliers by using Cook’s Distance
5.5.5
Check the other assumptions
5.6
Supplementary Learning Materials
6
Curvilinear Regression
6.1
Introduction to Curvilinear Regression
6.2
Descriptive Data Analysis
6.3
Run the Curvilinear Regression Model
6.3.1
Linear regression model
6.3.2
Quadratic regression model
6.3.3
Cubic regression model
6.4
Model Comparison
6.5
Supplementary Learning Materials
7
Multi-level Models Part A
7.1
Introduction
7.1.1
Hierarchical Data
7.1.2
Multilevel Model
7.1.3
Intraclass Correlation (ICC)
7.1.4
Benefits of Multilevel Models
7.2
R Lab: Running Multilevel models in R
7.2.1
Prepare the data & R packages
7.2.2
Setting up the simple linear model
7.2.3
Setting up an Unconditional Model
7.2.4
Random intercepts model
7.2.5
Random intercepts and slopes model
7.2.6
Adding an interaction term to the model
7.3
Supplementary Learning Materials
8
Multi-level Models Part B
8.1
Understanding the data set
8.2
Prepare the data set and review
8.3
Centering for continuous X
8.4
Question 1 - How much do U.S. high schools vary in their mean math achievement?
8.4.1
Answering Question 1:
8.5
Question 2 - Do schools with high MEAN SES also have high math achievement?
8.5.1
Answering Question 2:
8.6
Question 3 - Is the strength of association between student CSES and math achievement similar across schools? Or is CSES a better predictor of student math achievement in some schools than others?
8.6.1
Answering Question 3:
8.7
Question 4 - How do public and Catholic schools compare in terms of mean math achievement and in terms of the strength of the SES-math achievement relationship, after we control for MEAN SES?
8.7.1
Answering Q4:
8.8
Supplementary Learning Materials
9
Growth Model
9.1
Introduction to Growth Model
9.2
R Lab: Growth Model
9.2.1
Organize Longitudinal Data: Long Format vs. Wide Format
9.2.2
An Example of Honeymoon Data
9.2.3
Adding higher-order polynomials
9.3
Supplementary Learning Materials
10
Binary Logistic Regression
10.1
Introduction
10.2
The Purpose of Binary Logistic Regression
10.3
Log Transformation
10.4
Equation
10.5
Hypothesis Test
10.6
Likelihood Ratio Test for Nested Models
10.7
R Lab: Running Binary Logistic Regression Model
10.7.1
Data Explanations (Data set: class.sav)
10.7.2
Explore the data
10.7.3
Running a logistic regression model
10.8
Things to consider
10.9
Supplementary Learning Materials
11
Multinomial Logistic Regression
11.1
Introduction to Multinomial Logistic Regression
11.2
Equation
11.3
Hypothesis Test of Coefficients
11.4
Likelihood Ratio Test
11.5
Checking Assumption: Multicollinearity
11.6
Features of Multinomial logistic regression
11.7
R Labs: Running Multinomial Logistic Regression in R
11.7.1
Understanding the Data: Choice of Programs
11.7.2
Prepare and review the data
11.7.3
Run the Multinomial Model using the “nnet” package
11.7.4
Check the model fit information
11.7.5
Calculate the Goodness of fit
11.7.6
Calculate the Pseudo R-Square
11.7.7
Likelihood Ratio Tests
11.7.8
Parameter Estimates
11.7.9
Interpretation of the Predictive Equation
11.7.10
Build a classification table
11.8
Supplementary Learning Materials
12
Ordinal Logistic Regression
12.1
Introduction to Ordinal Logistic Regression
12.2
Cumulative Probability
12.3
Link Function
12.4
Prepare the Data
12.5
Descriptive Analysis
12.6
Run the ordinal logistic Regression model using the MASS package
12.7
Check the Overall Model Fit
12.8
Check the model fit information
12.9
Compute a confusion table and misclassification error (R exclusive)
12.10
Measuring Strength of Association (Calculating the Pseudo R-Square)
12.11
Parameter Estimates
12.12
Calculating Expected Values
12.13
References
13
Probit Analysis
13.1
Introduction to Probit Analysis
13.2
R-Lab: Running Probit Analysis in R
13.2.1
Understanding the Data
13.2.2
Descriptive data analysis
13.2.3
Run the Probit Regression model using the stats package
13.2.4
Compare the overall model fit
13.2.5
Check the model fit information
13.2.6
Measuring Strength of Association (Calculating the Pseudo R-Square)
13.2.7
Parameter Estimates
13.3
Supplementary Learning Materials
14
References
Companion to BER 642: Advanced Regression Methods
Chapter 14
References