Chapter 20 Simple Linear Model and Mixed Methods

Note: this section is partially adapted from Fox’s Linear Mixed Models and Bates et al. (2015). We will first focus on the simple linear model, then extend it to the fixed effect model, and finally discuss random effects modelling. We will also use the data from the study by Belenky et al. (2003). This is a fairly famous dataset that is easy to find online, and you can find help on it easily, as many people use it either to teach or to learn about mixed effects.

20.1 Data sets

Some of the data that we will use today is available via respective packages. All the additional data sets can be found here.

20.2 Longitudinal Data

We have seen quite a few examples of linear models last week; let us now finally get to the ones you probably came here to learn more about. The world can, of course, be slightly more complex than the traditional set-up of linear models. Things change over time, and they change differently for different units of analysis (i.e. participants). We want to control for the variation in this change when making claims about causal relationships we find using data. In psychology and linguistics research especially, you will often have repeated measures of your variable of interest for participants (i.e. memory after performing certain tests, reaction scores, attitudes, etc.). These are related either to multiple treatments given to participants and/or arise from measuring the same relationships at different time points.

Remember the old model set-up: \[\begin{equation} Y_i = \beta_0 + \beta_1 X_i + \epsilon_i \end{equation}\]

And remember the assumptions required for the linear model. Can you recall those for the residual term? You will see that in the repeated measures setting most of these remain. As in the simple linear model, we care most about our residual term. We want it to be random and normally distributed with zero mean. The main reason, of course, is that we want our model to explain all the meaningful variation without leaving anything in the residual term.
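To be explicit, the standard assumptions on the residual term can be written as:

\[\begin{equation} \epsilon_i \sim N(0, \sigma^2), \quad E[\epsilon_i] = 0, \quad Cov(\epsilon_i, X_i) = 0 \end{equation}\]

That is, the residuals are independent draws from a normal distribution with zero mean and constant variance, uncorrelated with the predictor.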

The only difference is that now we want to control for each individual's unobserved characteristics, so we introduce an extra parameter that accounts for them.

\[\begin{equation} Y_i = \beta_0 + \beta_1 X_i + \alpha_i + \epsilon_i \end{equation}\]

where \(\alpha_i\) is the fixed effect for the individual to whom observation i belongs.

This is the fixed effect model.

We can also introduce further parameters that allow us to include effects at the individual level (i.e. pupils) and at the entity level (i.e. schools):

\[\begin{equation} Y_{ij} = \beta_0 + \beta_1 X_{ij} + \theta_i + \gamma_j + \epsilon_{ij} \end{equation}\]

where i indexes individuals and j indexes entities.

20.3 Why a new model?

The reason why we are introducing a new model for repeated measures data, or data that has several levels, is the consideration of ecological fallacy. By that we mean that the effects at the level of groups of units may differ from the effects at the individual level. Let's look at an example based on some data. Below, I have a generic plot of data points that illustrates the corresponding values of Y for each unit of some variable X.

Do you think we have a positive relationship or a negative one? Is it easy to tell what is happening?

I think it's hard to tell! Let us look at something a bit more specific.

20.4 Ecological Fallacy (quick illustration) - no need to run

Let's get some data in and load the lme4 package:

library(lme4)
disease<-read.csv('disease.csv')

20.5 Simple Example

We would like to study the relationship between the prevalence of deadly diseases and average income across cities. We will run three models, but first let's provide a simple visualisation of the data:

#Make sure you have ggplot2
library(ggplot2)
qplot(disease$Income, disease$HDisease, xlab='Income', ylab='Disease')

Looks like we are dealing with a clearly positive relationship. Let's get to the models:

#Not controlling for city
model_0 <- lm(HDisease~Income, data = disease)
#Controlling for city effects
model_1 <- lm(HDisease~Income+factor(City), data = disease)
#Controlling for city and across cities effects
model_2 <- lmer(HDisease~Income + (1 | City), data = disease)

We can compare the outputs:

#Produce tidy outputs using texreg
library(texreg)
screenreg(list(model_0, model_1, model_2))
## 
## =========================================================
##                        Model 1     Model 2     Model 3   
## ---------------------------------------------------------
## (Intercept)              1.65 ***    3.07 ***   16.41 ***
##                         (0.27)      (0.11)      (2.87)   
## Income                   1.95 ***   -0.99 ***   -0.98 ***
##                         (0.05)      (0.07)      (0.07)   
## factor(City)2                        2.88 ***            
##                                     (0.16)               
## factor(City)3                        5.96 ***            
##                                     (0.21)               
## factor(City)4                        8.91 ***            
##                                     (0.26)               
## factor(City)5                       12.06 ***            
##                                     (0.33)               
## factor(City)6                       14.89 ***            
##                                     (0.40)               
## factor(City)7                       17.78 ***            
##                                     (0.47)               
## factor(City)8                       20.90 ***            
##                                     (0.54)               
## factor(City)9                       23.86 ***            
##                                     (0.59)               
## factor(City)10                      26.93 ***            
##                                     (0.68)               
## ---------------------------------------------------------
## R^2                      0.95        1.00                
## Adj. R^2                 0.95        1.00                
## Num. obs.              100         100         100       
## RMSE                     1.34        0.32                
## AIC                                            148.61    
## BIC                                            159.03    
## Log Likelihood                                 -70.31    
## Num. groups: City                               10       
## Var: City (Intercept)                           81.14    
## Var: Residual                                    0.10    
## =========================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

20.6 Now for the Advanced Part: Model Set-up

The previous example was what is known as a toy example. We can now use some real data examples. Let's load all the packages we will need.

#Load necessary packages
library(lattice)
library(arm)

There are two main illustrations we want to focus on. The first will consider clustering of temporal variation by unit of analysis (i.e. participant), and the second will consider clustering of units by group (i.e. school).

For the first example, we will use the data on sleep deprivation. The example is based on the Belenky et al. (2003) study, which looks at patterns of sleep across individuals going through the stages of sleep deprivation. On day 0 the subjects had their normal amount of sleep. Starting that night, they were restricted to 3 hours of sleep per night. The observations represent the average reaction time on a series of tests given each day to each subject.

#Read data in
sleep<-read.csv('sleep.csv')
#Check what's inside
str(sleep)
## 'data.frame':    180 obs. of  4 variables:
##  $ X       : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ Reaction: num  250 259 251 321 357 ...
##  $ Days    : int  0 1 2 3 4 5 6 7 8 9 ...
##  $ Subject : int  308 308 308 308 308 308 308 308 308 308 ...

Let us just plot the data as is. We will use Days as our X and Reaction as our Y. You will recognize something we have just seen.

#Plot reaction dynamics across days
qplot(sleep$Days, sleep$Reaction, xlab='Days', ylab='Reaction')

It is hard to tell whether there is a single positive effect, or whether there may be different levels of effect. There is clearly some tendency for reaction time to increase as time passes, but since we have various individuals, we may observe a more detailed picture at the individual level.

We can use xyplot() to plot the variation in reaction time on the tests given to participants across the ten days of the study.

To check the arguments that can be modified, have a look here.

xyplot(Reaction ~ Days|Subject, sleep, type = c("g","p","r"), #we are mixing the types of plots here, try removing the option
       index = function(x,y) coef(lm(y ~ x))[1], #order the panels by the intercept of a per-subject lm() fit 
       xlab = "Days of sleep deprivation",
       ylab = "Average reaction time (ms)", aspect = "xy")

What do you see? It looks like for some individuals there were pretty fast changes in reaction times across periods, while others show a steady growth in reaction time. Mostly we observe an upward movement, but the speed seems to vary, as do the starting points.

Note how in the plot I included Reaction ~ Days | Subject together with the data set sleep. The | Subject part specifies that we are looking at the effect of time on reaction separately for each subject.

Let us now see what we will observe numerically if we were to run a model to fit this data.

20.6.1 Pooling

We can start with a simple model, also known as complete pooling, where we assume that the effect of the variable Days is the same for everyone, just like in the simple linear model.

#Let's regress reactions on days
model_simple<-lm(Reaction~Days, sleep)
summary(model_simple)
## 
## Call:
## lm(formula = Reaction ~ Days, data = sleep)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -110.848  -27.483    1.546   26.142  139.953 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  251.405      6.610  38.033  < 2e-16 ***
## Days          10.467      1.238   8.454 9.89e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 47.71 on 178 degrees of freedom
## Multiple R-squared:  0.2865, Adjusted R-squared:  0.2825 
## F-statistic: 71.46 on 1 and 178 DF,  p-value: 9.894e-15

20.6.2 No pooling

Some may say that it is quite naive to pool everything together, as there is variability within each individual's records that we do not account for. Alternatively, we can control for each subject; this is known as a fixed effect and is equivalent to having a dummy variable for each of the individuals. That way we ensure that we separate out the variation that is due to individuals' unique characteristics. Here is an example:

#Let's regress reactions on days but now control for individuals by including 'Subject'. 
model_subject<-lm(Reaction~Days+as.factor(Subject), sleep)
summary(model_subject)
## 
## Call:
## lm(formula = Reaction ~ Days + as.factor(Subject), data = sleep)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -100.540  -16.389   -0.341   15.215  131.159 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            295.0310    10.4471  28.240  < 2e-16 ***
## Days                    10.4673     0.8042  13.015  < 2e-16 ***
## as.factor(Subject)309 -126.9008    13.8597  -9.156 2.35e-16 ***
## as.factor(Subject)310 -111.1326    13.8597  -8.018 2.07e-13 ***
## as.factor(Subject)330  -38.9124    13.8597  -2.808 0.005609 ** 
## as.factor(Subject)331  -32.6978    13.8597  -2.359 0.019514 *  
## as.factor(Subject)332  -34.8318    13.8597  -2.513 0.012949 *  
## as.factor(Subject)333  -25.9755    13.8597  -1.874 0.062718 .  
## as.factor(Subject)334  -46.8318    13.8597  -3.379 0.000913 ***
## as.factor(Subject)335  -92.0638    13.8597  -6.643 4.51e-10 ***
## as.factor(Subject)337   33.5872    13.8597   2.423 0.016486 *  
## as.factor(Subject)349  -66.2994    13.8597  -4.784 3.87e-06 ***
## as.factor(Subject)350  -28.5311    13.8597  -2.059 0.041147 *  
## as.factor(Subject)351  -52.0361    13.8597  -3.754 0.000242 ***
## as.factor(Subject)352   -4.7123    13.8597  -0.340 0.734300    
## as.factor(Subject)369  -36.0992    13.8597  -2.605 0.010059 *  
## as.factor(Subject)370  -50.4321    13.8597  -3.639 0.000369 ***
## as.factor(Subject)371  -47.1498    13.8597  -3.402 0.000844 ***
## as.factor(Subject)372  -24.2477    13.8597  -1.750 0.082108 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 30.99 on 161 degrees of freedom
## Multiple R-squared:  0.7277, Adjusted R-squared:  0.6973 
## F-statistic: 23.91 on 18 and 161 DF,  p-value: < 2.2e-16

We have got some improvement in terms of explained variance (check the Adjusted R-squared).

We can also test whether controlling for individual variation improves on the simple model. An F test helps here, since we have nested models:

#F test 
anova(model_simple, model_subject)
## Analysis of Variance Table
## 
## Model 1: Reaction ~ Days
## Model 2: Reaction ~ Days + as.factor(Subject)
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    178 405252                                  
## 2    161 154634 17    250618 15.349 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Not bad! Guess we can stop here? We can think for a moment…

Quick note: in the structure above we do not consider anything with regard to between-unit variation. We are looking at what is happening within individuals' reaction scores, but we do not account for how they vary between each other.

20.6.3 Partial Pooling (varying intercepts)

An attempt to do a bit of both (partial pooling) can be made using a random effects modelling structure. Such a model first looks at the variation between individuals and then measures each individual's distance from the average individual effect, so we are partially pooling everyone together. Please note that we are assuming the slopes are the same; it is the intercepts that will vary for each person.
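In equation form, the varying-intercept model we are about to fit (with observations i nested in subjects j) is:

\[\begin{equation} Y_{ij} = \beta_0 + u_j + \beta_1 Days_{ij} + \epsilon_{ij}, \quad u_j \sim N(0, \tau^2), \quad \epsilon_{ij} \sim N(0, \sigma^2) \end{equation}\]

Here \(u_j\) is the random intercept for subject j, measuring that subject's distance from the average intercept \(\beta_0\).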

There are quite a few variations of mixed model specifications; a useful summary table can be found in Bates et al. (2015).

#Run baseline model (we will use it later for comparison; it has no controls)
model_null <- lmer(Reaction ~  (1|Subject), # note how we use 1 to indicate that we keep the slope constant and vary the intercept
                  data=sleep)
#Run the model (note that we now also include the effect of time, Days)
model_mix <- lmer(Reaction ~ Days + (1|Subject) , sleep)
#Summarise
summary(model_mix)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (1 | Subject)
##    Data: sleep
## 
## REML criterion at convergence: 1786.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.2257 -0.5529  0.0109  0.5188  4.2506 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  Subject  (Intercept) 1378.2   37.12   
##  Residual              960.5   30.99   
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 251.4051     9.7467   25.79
## Days         10.4673     0.8042   13.02
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.371

Double-check the fixed effect estimates: the coefficient for Days is identical to the one we found earlier when we controlled for each subject, while the intercept matches the complete pooling model. The coefficient for ‘Days’ reports the average effect of an extra day of sleep deprivation on reaction score.

Let's get a bit more compressed output:

#Use display()
display(model_mix)
## lmer(formula = Reaction ~ Days + (1 | Subject), data = sleep)
##             coef.est coef.se
## (Intercept) 251.41     9.75 
## Days         10.47     0.80 
## 
## Error terms:
##  Groups   Name        Std.Dev.
##  Subject  (Intercept) 37.12   
##  Residual             30.99   
## ---
## number of obs: 180, groups: Subject, 18
## AIC = 1794.5, DIC = 1801.7
## deviance = 1794.1

We can extract fixed effect estimation:

#fixef() will extract the fixed effects for us
fixef(model_mix) 
## (Intercept)        Days 
##   251.40510    10.46729
#se.fixef() will extract the standard errors of the fixed effects 
se.fixef(model_mix)
## (Intercept)        Days 
##   9.7467163   0.8042214

Or random effect:

#ranef() will extract the random effects for us
ranef(model_mix)
## $Subject
##     (Intercept)
## 308   40.783710
## 309  -77.849554
## 310  -63.108567
## 330    4.406442
## 331   10.216189
## 332    8.221238
## 333   16.500494
## 334   -2.996981
## 335  -45.282127
## 337   72.182686
## 349  -21.196249
## 350   14.111363
## 351   -7.862221
## 352   36.378425
## 369    7.036381
## 370   -6.362703
## 371   -3.294273
## 372   18.115747
## 
## with conditional variances for "Subject"
#se.ranef() will extract the standard errors of the random effects 
se.ranef(model_mix)
## $Subject
##     (Intercept)
## 308    9.475668
## 309    9.475668
## 310    9.475668
## 330    9.475668
## 331    9.475668
## 332    9.475668
## 333    9.475668
## 334    9.475668
## 335    9.475668
## 337    9.475668
## 349    9.475668
## 350    9.475668
## 351    9.475668
## 352    9.475668
## 369    9.475668
## 370    9.475668
## 371    9.475668
## 372    9.475668
#We can also get the predicted intercept + slope for each subject (fixef() plus the value of ranef())
coef(model_mix)
## $Subject
##     (Intercept)     Days
## 308    292.1888 10.46729
## 309    173.5556 10.46729
## 310    188.2965 10.46729
## 330    255.8115 10.46729
## 331    261.6213 10.46729
## 332    259.6263 10.46729
## 333    267.9056 10.46729
## 334    248.4081 10.46729
## 335    206.1230 10.46729
## 337    323.5878 10.46729
## 349    230.2089 10.46729
## 350    265.5165 10.46729
## 351    243.5429 10.46729
## 352    287.7835 10.46729
## 369    258.4415 10.46729
## 370    245.0424 10.46729
## 371    248.1108 10.46729
## 372    269.5209 10.46729
## 
## attr(,"class")
## [1] "coef.mer"

20.6.4 Partial Pooling Extended - (varying intercepts and/or slopes)

We can also vary the slopes if we want to; we saw earlier that the fitted line for reaction time may look very different for different individuals.
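Allowing the slopes to vary adds a second random term per subject. In equation form, the model we fit next is:

\[\begin{equation} Y_{ij} = (\beta_0 + u_{0j}) + (\beta_1 + u_{1j}) Days_{ij} + \epsilon_{ij} \end{equation}\]

where \(u_{0j}\) and \(u_{1j}\) are the subject-specific intercept and slope deviations, assumed jointly normal with a correlation that lmer estimates as well.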

#Run the model (note that we now allow the effect of time to vary by subject)
model_mix_slope <- lmer(Reaction ~ Days + (Days|Subject), # note that now 1 was replaced with Days to suggest varying slopes
                        sleep)
#Summarise
summary(model_mix_slope)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (Days | Subject)
##    Data: sleep
## 
## REML criterion at convergence: 1743.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.9536 -0.4634  0.0231  0.4633  5.1793 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  Subject  (Intercept) 611.90   24.737       
##           Days         35.08    5.923   0.07
##  Residual             654.94   25.592       
## Number of obs: 180, groups:  Subject, 18
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  251.405      6.824  36.843
## Days          10.467      1.546   6.771
## 
## Correlation of Fixed Effects:
##      (Intr)
## Days -0.138

Each subject now has their own slope estimate, which, like the intercept, is partially pooled towards the centre of the distribution of the Days effects. Note carefully the results for the Days coefficient in the main output.

Whatever decision you make, it should always be driven by your data. You may want to do model comparison exercises and analyse the residuals in the model output to see how much varying intercepts/slopes improved your understanding of the underlying relationships in your data.

In the mixed model setting, we can use a log-likelihood ratio test to compare the models we just built:
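The test statistic is twice the difference in log-likelihoods between the larger and the smaller model, compared against a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters:

\[\begin{equation} \chi^2 = 2(\ell_{full} - \ell_{reduced}) \end{equation}\]

(One caveat worth knowing: models fitted with REML that differ in their fixed effects are, strictly speaking, better refitted with maximum likelihood before such a comparison.)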

#Make sure lmtest is loaded
library(lmtest)
#Compare the null model with random intercept model using likelihood ratio test
lrtest(model_null,model_mix)
## Likelihood ratio test
## 
## Model 1: Reaction ~ (1 | Subject)
## Model 2: Reaction ~ Days + (1 | Subject)
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   3 -952.16                         
## 2   4 -893.23  1 117.86  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Note how we add incrementally each model to the test:

#Now also compare varying intercept with both varying interecept + slopes
lrtest(model_mix, model_mix_slope)
## Likelihood ratio test
## 
## Model 1: Reaction ~ Days + (1 | Subject)
## Model 2: Reaction ~ Days + (Days | Subject)
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   4 -893.23                         
## 2   6 -871.81  2 42.837   4.99e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The model with a random intercept improves on the null model. Having a random slope and intercept improves the model even more. Anything else to add for your results write-up?

We can also use something known as pseudo R-squared to see how much variance was explained by our explanatory factor. We will need to calculate it by hand. We will use our model without controls, where we have random intercepts, and then add Days to compare the explained variation that is due to the addition of the explanatory factor. Just a reminder of the models we need:
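The quantity we will compute by hand is the proportional reduction in total variance (random intercept variance plus residual variance) when moving from the null model to the model with Days:

\[\begin{equation} R^2_{pseudo} = \frac{(\tau^2_{null} + \sigma^2_{null}) - (\tau^2_{mix} + \sigma^2_{mix})}{\tau^2_{null} + \sigma^2_{null}} \end{equation}\]

where \(\tau^2\) is the random intercept variance and \(\sigma^2\) the residual variance in each model.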

#Null-baseline
model_null <- lmer(Reaction ~  (1|Subject) , sleep)
#Extended-Days
model_mix <- lmer(Reaction ~ Days + (1|Subject) , sleep)

Let us extract the variance:

## We can try to extract the partial variance that was explained by Days for mixed model
#Calculate variance for the null
totvar_model_null <- (summary(model_null)$sigma)^2 +  
  as.numeric(summary(model_null)$varcor)
#Calculate variance for the mixed
totvar_model_mix <- (summary(model_mix)$sigma)^2 + as.numeric(summary(model_mix)$varcor) 
#Check the ratio of the difference between the two to the overall total variance in the null
Var_expl <-(totvar_model_null-totvar_model_mix)/totvar_model_null
Var_expl
## [1] 0.2775756

We find that about 28% of the variation in reaction time can be explained by time, when controlling for subjects.

Let's finally put all these models side by side using texreg. We can check what our main coefficients are and the likelihoods, and also export the table for a paper if we want to.

#Make sure to have texreg loaded
library(texreg)

And now put the main models that we were working with into the overall comparison. Note how tidy the output looks; we can now focus specifically on the effect of Days.

screenreg(list(model_null, model_mix))
## 
## ==================================================
##                           Model 1      Model 2    
## --------------------------------------------------
## (Intercept)                298.51 ***   251.41 ***
##                             (9.05)       (9.75)   
## Days                                     10.47 ***
##                                          (0.80)   
## --------------------------------------------------
## AIC                       1910.33      1794.47    
## BIC                       1919.91      1807.24    
## Log Likelihood            -952.16      -893.23    
## Num. obs.                  180          180       
## Num. groups: Subject        18           18       
## Var: Subject (Intercept)  1278.34      1378.18    
## Var: Residual             1958.87       960.46    
## ==================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

20.7 Multilevel modelling with random intercepts and slopes

Let’s consider another example; we can now try to fit similar models to a structure with levels represented by groups. Previously we had participant-level variability, but what if participants were further grouped by another variable (i.e. country of birth, cohort, study trial)?

We can use mixed models here as well. A very common way to illustrate this is with students' grades across schools, which vary between pupils within a school but also vary across schools. This is relevant in cases where grades are measured at different times as well. Let us work through something a bit more meaningful.

For examples of studies that look at this closely, have a look at this review paper. The data set we will work with has information on IQ scores for school students and is taken from the Snijders and Bosker (2012) textbook. We have different levels here: variability of grades within schools and across schools. What we want to understand is whether the average score on a language test has anything to do with students' SES and IQ, and with those same variables measured at the average school level.

20.7.1 Overview of the data set

variable description
school School id
pupil Student id
lang_score Language score
ses Socioeconomic status of the student (mean centered)
IQ_verb Verbal IQ of the student (mean centered)
sex Gender of the student
Minority Dummy indicator of whether student is of minority background
sch_ses Average SES in a school (mean centered)
sch_iqv Average verbal IQ in a school (mean centered)
sch_min Proportion of minority students in a school

20.7.2 Prepare

Before we start let us load all the packages we will need later.

library(nlme)
library(lattice)
#Read data in:
schools<-read.csv('schools.csv')
head(schools)
##   school pupil lang_score    ses IQ_verb sex Minority sch_ses sch_iqv
## 1      1     3         46  -4.73    3.13   0        0 -14.035 -1.4039
## 2      1     4         45 -17.73    2.63   0        1 -14.035 -1.4039
## 3      1     5         33 -12.73   -2.37   0        0 -14.035 -1.4039
## 4      1     6         46  -4.73   -0.87   0        0 -14.035 -1.4039
## 5      1     7         20 -17.73   -3.87   0        0 -14.035 -1.4039
## 6      1     8         30 -17.73   -2.37   0        1 -14.035 -1.4039
##   sch_min
## 1    0.63
## 2    0.63
## 3    0.63
## 4    0.63
## 5    0.63
## 6    0.63

Let us first build a simple mixed effects model; we will try to predict language scores and will control for the school level only:

#Simple model (account for school level)
model_schools_0 <- lmer(lang_score ~  (1|school), data = schools)
summary(model_schools_0)
## Linear mixed model fit by REML ['lmerMod']
## Formula: lang_score ~ (1 | school)
##    Data: schools
## 
## REML criterion at convergence: 26595.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.1850 -0.6417  0.0905  0.7226  2.5281 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  school   (Intercept) 18.24    4.271   
##  Residual             62.85    7.928   
## Number of obs: 3758, groups:  school, 211
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  41.0038     0.3257   125.9

We can also set up a null model, whose explained variance we will use to calculate a pseudo R-squared later:

#Null model - constant intercept
model_schools_null<- lm(lang_score ~1, data = schools)
summary(model_schools_null)
## 
## Call:
## lm(formula = lang_score ~ 1, data = schools)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -33.413  -5.413   0.587   6.587  16.587 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  41.4130     0.1451   285.5   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.893 on 3757 degrees of freedom

We can quickly check whether having varying intercepts improves over having a single constant one for all students:

#Log-likelihood ratio test ( null versus control)
lrtest(model_schools_null,model_schools_0)
## Likelihood ratio test
## 
## Model 1: lang_score ~ 1
## Model 2: lang_score ~ (1 | school)
##   #Df LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -13544                         
## 2   3 -13298  1 492.54  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let's start adding a few relevant covariates; we can start with IQ and then see how much additional variance can be explained by this variable:

#Add IQ scores to the model
model_schools_IQ <- lmer(lang_score ~ IQ_verb + (1|school), data = schools) 
summary(model_schools_IQ)
## Linear mixed model fit by REML ['lmerMod']
## Formula: lang_score ~ IQ_verb + (1 | school)
##    Data: schools
## 
## REML criterion at convergence: 24917.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.1952 -0.6378  0.0659  0.7098  3.2132 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  school   (Intercept)  9.909   3.148   
##  Residual             40.479   6.362   
## Number of obs: 3758, groups:  school, 211
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 41.05442    0.24402  168.24
## IQ_verb      2.50722    0.05439   46.09
## 
## Correlation of Fixed Effects:
##         (Intr)
## IQ_verb 0.003

Let's calculate the variance of the empty model and then of the model with IQ, so we can see the added contribution from the new covariate:

#Extract the variances:
totvar_model_0 <- (summary(model_schools_0)$sigma)^2 + as.numeric(summary(model_schools_0)$varcor)

totvar_model_IQ <- (summary(model_schools_IQ)$sigma)^2 + as.numeric(summary(model_schools_IQ)$varcor)

# Proportion of variance explained by IQ:
Rsq_IQ <- (totvar_model_0 - totvar_model_IQ)/totvar_model_0
Rsq_IQ
## [1] 0.3786057

There are also more standard ways to assess model fit. Remember that the mixed effects model is fit via maximum likelihood. We can still build an equivalent of the R-squared measure for mixed models, and there is a package for that too:
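For reference, and assuming the package follows the definitions of Nakagawa and Schielzeth (2013), the two quantities reported by r.squaredGLMM are:

\[\begin{equation} R^2_m = \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\epsilon}, \quad R^2_c = \frac{\sigma^2_f + \sigma^2_\alpha}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\epsilon} \end{equation}\]

where \(\sigma^2_f\) is the variance of the fixed-effect predictions, \(\sigma^2_\alpha\) the random intercept variance and \(\sigma^2_\epsilon\) the residual variance.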

library(MuMIn)
r.squaredGLMM(model_schools_IQ)
## Warning: 'r.squaredGLMM' now calculates a revised statistic. See the help
## page.
##            R2m       R2c
## [1,] 0.3419141 0.4713251

We get two outputs: R-squared (marginal) = 0.342 and R-squared (conditional) = 0.471; the former accounts for the fixed effects only, while the latter comes from the combination of both random and fixed effects.

On top of that, we can also assess the intraclass correlation (ICC), which allows us to investigate whether there are systematic similarities in grades within schools. The ICC is quite a useful measure and can also be used to diagnose your models: a very strong correlation between units within a group, or a really weak one, may both require attention.
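For a random intercept model, the ICC is simply the share of the total variance that sits at the group level:

\[\begin{equation} ICC = \frac{\tau^2}{\tau^2 + \sigma^2} \end{equation}\]

where \(\tau^2\) is the between-school (random intercept) variance and \(\sigma^2\) the residual variance; this is exactly what the by-hand calculation below computes.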

Let's get one for our first model:

#ICC for our first model (by hand)
ICC_0 <- as.numeric(summary(model_schools_0)$varcor)/((summary(model_schools_0)$sigma)^2 + as.numeric(summary(model_schools_0)$varcor))
ICC_0
## [1] 0.2249341

This tells us that about 22 percent of the variation in language scores is at the school level. This can also be stated as follows: if we picked two students at random from the same school, we would expect about a 0.22 correlation in their language scores.

## ICC from the model with IQ  (by hand)
ICC_IQ<-as.numeric(summary(model_schools_IQ)$varcor)/((summary(model_schools_IQ)$sigma)^2 + as.numeric(summary(model_schools_IQ)$varcor))
ICC_IQ
## [1] 0.1966476

Or via a direct function from the sjstats package:

library(sjstats)
## 
## Attaching package: 'sjstats'
## The following objects are masked from 'package:psych':
## 
##     pca, phi
icc(model_schools_0)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.225
##   Conditional ICC: 0.225
icc(model_schools_IQ)
## # Intraclass Correlation Coefficient
## 
##      Adjusted ICC: 0.197
##   Conditional ICC: 0.129

The ICC dropped to around 0.2 now (see Adjusted). This tells us that, once we account for IQ scores, part of the within-school correlation in lang_score can be explained by similarities in IQ scores within the school.

Quick note: there is a slight difference between the adjusted and the conditional ICC. The former relates the random effects variance to the random effects plus residual variance only, while the latter also takes the variance explained by the fixed effects into account.

Finally, let's now add the school-level average IQ scores to our model:

#Add schools averages in
model_schools_IQ_sc <- lmer(lang_score ~ IQ_verb + sch_iqv + (1|school), data = schools)
summary(model_schools_IQ_sc)
## Linear mixed model fit by REML ['lmerMod']
## Formula: lang_score ~ IQ_verb + sch_iqv + (1 | school)
##    Data: schools
## 
## REML criterion at convergence: 24893.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2201 -0.6399  0.0631  0.7054  3.2173 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  school   (Intercept)  8.785   2.964   
##  Residual             40.442   6.359   
## Number of obs: 3758, groups:  school, 211
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  41.1132     0.2329 176.526
## IQ_verb       2.4536     0.0555  44.212
## sch_iqv       1.3127     0.2627   4.997
## 
## Correlation of Fixed Effects:
##         (Intr) IQ_vrb
## IQ_verb -0.007       
## sch_iqv  0.043 -0.209

Try to get the ICC and R squared by yourself here. You will find that controlling for average IQ in the school improves the model as well. We can thus let the effect of IQ vary across schools in the final model and perform a likelihood ratio test.

#Finally: we would like to let the effect of IQ vary across schools
model_school_final <- lmer(lang_score ~ IQ_verb + sch_iqv + (1+IQ_verb|school), #note: a random intercept and a random slope for IQ_verb, plus the group-level predictor
           data = schools)
summary(model_school_final)
## Linear mixed model fit by REML ['lmerMod']
## Formula: lang_score ~ IQ_verb + sch_iqv + (1 + IQ_verb | school)
##    Data: schools
## 
## REML criterion at convergence: 24870.5
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.2604 -0.6337  0.0676  0.7035  2.7622 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  school   (Intercept)  8.9787  2.9964        
##           IQ_verb      0.1995  0.4466   -0.63
##  Residual             39.6858  6.2997        
## Number of obs: 3758, groups:  school, 211
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  41.1275     0.2347 175.248
## IQ_verb       2.4802     0.0645  38.453
## sch_iqv       1.0305     0.2633   3.913
## 
## Correlation of Fixed Effects:
##         (Intr) IQ_vrb
## IQ_verb -0.279       
## sch_iqv -0.002 -0.187
## convergence code: 0
## Model failed to converge with max|grad| = 0.00224192 (tol = 0.002, component 1)
lrtest(model_schools_IQ_sc, model_school_final)
## Likelihood ratio test
## 
## Model 1: lang_score ~ IQ_verb + sch_iqv + (1 | school)
## Model 2: lang_score ~ IQ_verb + sch_iqv + (1 + IQ_verb | school)
##   #Df LogLik Df  Chisq Pr(>Chisq)    
## 1   5 -12447                         
## 2   7 -12435  2 23.394   8.32e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Finally, as usual by now, let's put everything side by side:

#Put all the models together for comparison
screenreg(list(model_schools_0, model_schools_IQ, model_schools_IQ_sc, model_school_final))
## 
## ===========================================================================================
##                                  Model 1        Model 2        Model 3        Model 4      
## -------------------------------------------------------------------------------------------
## (Intercept)                          41.00 ***      41.05 ***      41.11 ***      41.13 ***
##                                      (0.33)         (0.24)         (0.23)         (0.23)   
## IQ_verb                                              2.51 ***       2.45 ***       2.48 ***
##                                                     (0.05)         (0.06)         (0.06)   
## sch_iqv                                                             1.31 ***       1.03 ***
##                                                                    (0.26)         (0.26)   
## -------------------------------------------------------------------------------------------
## AIC                               26601.69       24925.14       24903.93       24884.54    
## BIC                               26620.38       24950.07       24935.09       24928.16    
## Log Likelihood                   -13297.84      -12458.57      -12446.97      -12435.27    
## Num. obs.                          3758           3758           3758           3758       
## Num. groups: school                 211            211            211            211       
## Var: school (Intercept)              18.24           9.91           8.78           8.98    
## Var: Residual                        62.85          40.48          40.44          39.69    
## Var: school IQ_verb                                                                0.20    
## Cov: school (Intercept) IQ_verb                                                   -0.84    
## ===========================================================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

Looking now at the various combinations, which one do you think makes the most sense, both statistically and substantively?

20.8 Random slopes, intercepts and cross level interactions (optional)

For this part we will use the dataset taken from Hox’s (2010) textbook on multilevel modelling. The dataset includes the following variables:

variable   description
PUPIL      pupil id within class
CLASS      class id (1,…,100)
POPULAR    popularity score of the student
SEX        sex of the student (0=male, 1=female)
T.EXP      years of experience of the teacher
FEMALE     student is FEMALE

#Let's read the data in
popularity<-read.csv('Popularity.csv')

We will go straight on to building our models, starting with the null model once again for later comparison.

#Usual set up of the null and simple model controlling for the group
model_null <- lm(POPULAR ~ 1, data = popularity)
model_pop_0 <- lmer(POPULAR ~ (1|CLASS), data = popularity)
summary(model_pop_0)
## Linear mixed model fit by REML ['lmerMod']
## Formula: POPULAR ~ (1 | CLASS)
##    Data: popularity
## 
## REML criterion at convergence: 5115.6
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -2.88825 -0.63376 -0.05155  0.71091  3.00393 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  CLASS    (Intercept) 0.8798   0.9380  
##  Residual             0.6387   0.7992  
## Number of obs: 2000, groups:  CLASS, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   5.3076     0.0955   55.58

Test whether the random intercept is an improvement:

#Quick model comparison
lrtest(model_pop_0,model_null)
## Likelihood ratio test
## 
## Model 1: POPULAR ~ (1 | CLASS)
## Model 2: POPULAR ~ 1
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   3 -2557.8                         
## 2   2 -3244.8 -1 1373.9  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
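The test statistic is just twice the difference in log-likelihoods, which you can reproduce from the output above (the reported 1373.9 differs only through rounding of the log-likelihoods):

\[\begin{equation}
\chi^2 = 2 \times \big(-2557.8 - (-3244.8)\big) = 2 \times 687.0 = 1374.0
\end{equation}\]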

Let's see whether gender has anything to do with popularity:

#Adding an extra covariate
model_pop_sex <- lmer(POPULAR ~ FEMALE+(1|CLASS), data = popularity)
summary(model_pop_sex)
## Linear mixed model fit by REML ['lmerMod']
## Formula: POPULAR ~ FEMALE + (1 | CLASS)
##    Data: popularity
## 
## REML criterion at convergence: 4492.9
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3184 -0.6892  0.0018  0.5961  3.8239 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  CLASS    (Intercept) 0.8622   0.9286  
##  Residual             0.4599   0.6782  
## Number of obs: 2000, groups:  CLASS, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  4.89722    0.09530   51.39
## FEMALETRUE   0.84370    0.03096   27.25
## 
## Correlation of Fixed Effects:
##            (Intr)
## FEMALETRUE -0.158

And let's see how this variable affects our pseudo R-squared:

## Pseudo R squared:
totvar_mod_pop_0 <- (summary(model_pop_0)$sigma)^2 + as.numeric(summary(model_pop_0)$varcor)
totvar_mod_pop_sex <- (summary(model_pop_sex)$sigma)^2 + as.numeric(summary(model_pop_sex)$varcor)
R_sq_pop <- (totvar_mod_pop_0 - totvar_mod_pop_sex)/totvar_mod_pop_0
R_sq_pop
## [1] 0.1293066
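By hand, using the variance components printed in the two model summaries above, the same value follows:

\[\begin{equation}
R^2 = \frac{(0.8798 + 0.6387) - (0.8622 + 0.4599)}{0.8798 + 0.6387}
    = \frac{1.5185 - 1.3221}{1.5185} \approx 0.129
\end{equation}\]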

What about ICC?

#Again either by hand
as.numeric(summary(model_pop_sex)$varcor)/((summary(model_pop_sex)$sigma)^2 + as.numeric(summary(model_pop_sex)$varcor))
## [1] 0.652142

Anything else that could be important here? Let's add teaching experience:

model_pop_teach <- lmer(POPULAR ~ FEMALE+T.EXP+(1|CLASS), data = popularity)
summary(model_pop_teach)
## Linear mixed model fit by REML ['lmerMod']
## Formula: POPULAR ~ FEMALE + T.EXP + (1 | CLASS)
##    Data: popularity
## 
## REML criterion at convergence: 4444.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3585 -0.6797  0.0244  0.5933  3.7851 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  CLASS    (Intercept) 0.4860   0.6971  
##  Residual             0.4599   0.6782  
## Number of obs: 2000, groups:  CLASS, 100
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  3.56068    0.17148  20.765
## FEMALETRUE   0.84467    0.03095  27.291
## T.EXP        0.09345    0.01085   8.609
## 
## Correlation of Fixed Effects:
##            (Intr) FEMALE
## FEMALETRUE -0.088       
## T.EXP      -0.905  0.000

Let's check whether there is an improvement:

totvar_mod_pop_0 <- (summary(model_pop_0)$sigma)^2 + as.numeric(summary(model_pop_0)$varcor)
totvar_mod_teach <- (summary(model_pop_teach)$sigma)^2 + as.numeric(summary(model_pop_teach)$varcor)
Rsq_teach <- (totvar_mod_pop_0 - totvar_mod_teach)/totvar_mod_pop_0
Rsq_teach
## [1] 0.3770827

Teacher experience plus sex of the student explains an extra chunk of the variability in popularity. Does this effect vary across the classes? Explore the following model yourself:

#Random slope for FEMALE across classes
model_teach_sc <- lmer(POPULAR ~ FEMALE+T.EXP+(1+FEMALE|CLASS), data = popularity)
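In equation form, this model lets both the intercept and the FEMALE slope vary by class, with \(u_{0j}\) and \(u_{1j}\) denoting the class-level random intercept and slope:

\[\begin{equation}
POPULAR_{ij} = \beta_0 + \beta_1 FEMALE_{ij} + \beta_2 T.EXP_j + u_{0j} + u_{1j} FEMALE_{ij} + \epsilon_{ij}
\end{equation}\]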

Run a likelihood ratio test to see if there are any differences:

#Test
lrtest(model_pop_teach,model_teach_sc)
## Likelihood ratio test
## 
## Model 1: POPULAR ~ FEMALE + T.EXP + (1 | CLASS)
## Model 2: POPULAR ~ FEMALE + T.EXP + (1 + FEMALE | CLASS)
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   5 -2222.2                         
## 2   7 -2137.9  2 168.46  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

You may note a warning.

#I'll have to change the optimizer here to ensure convergence (our model is getting a bit complex)
library(optimx)
model_pop_interaction <- lmer(POPULAR ~ FEMALE*T.EXP+(1+FEMALE|CLASS), data = popularity, 
                              control=lmerControl(optimizer="optimx", #make a note of the optimizer here, you can try to vary those
                                 optCtrl=list(method='nlminb')))
summary(model_pop_interaction)
## Linear mixed model fit by REML ['lmerMod']
## Formula: POPULAR ~ FEMALE * T.EXP + (1 + FEMALE | CLASS)
##    Data: popularity
## Control: 
## lmerControl(optimizer = "optimx", optCtrl = list(method = "nlminb"))
## 
## REML criterion at convergence: 4268.4
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.9337 -0.6519  0.0216  0.5307  3.4883 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  CLASS    (Intercept) 0.4120   0.6419       
##           FEMALETRUE  0.2264   0.4758   0.08
##  Residual             0.3924   0.6264       
## Number of obs: 2000, groups:  CLASS, 100
## 
## Fixed effects:
##                   Estimate Std. Error t value
## (Intercept)       3.313521   0.161015  20.579
## FEMALETRUE        1.329594   0.133049   9.993
## T.EXP             0.110235   0.010232  10.774
## FEMALETRUE:T.EXP -0.034035   0.008457  -4.024
## 
## Correlation of Fixed Effects:
##             (Intr) FEMALETRUE T.EXP 
## FEMALETRUE  -0.046                  
## T.EXP       -0.909  0.042           
## FEMALETRUE:  0.042 -0.908     -0.046
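The negative interaction means the gender gap in popularity shrinks with teacher experience. From the fixed effects above, the expected gap for a class whose teacher has \(t\) years of experience is:

\[\begin{equation}
\Delta_{FEMALE}(t) \approx 1.33 - 0.034 \times t
\end{equation}\]

so with, say, 10 years of experience the expected gap is about \(1.33 - 0.34 \approx 0.99\) points (keeping in mind that the random slope makes the actual gap vary from class to class).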

Find the change in explained variance by yourself here, and for the grand finale, put everything side by side:

#Model summaries
screenreg(list( model_pop_sex, model_pop_teach, model_teach_sc, model_pop_interaction))
## 
## =========================================================================================
##                                    Model 1       Model 2       Model 3       Model 4     
## -----------------------------------------------------------------------------------------
## (Intercept)                            4.90 ***      3.56 ***      3.34 ***      3.31 ***
##                                       (0.10)        (0.17)        (0.16)        (0.16)   
## FEMALETRUE                             0.84 ***      0.84 ***      0.84 ***      1.33 ***
##                                       (0.03)        (0.03)        (0.06)        (0.13)   
## T.EXP                                                0.09 ***      0.11 ***      0.11 ***
##                                                     (0.01)        (0.01)        (0.01)   
## FEMALETRUE:T.EXP                                                                -0.03 ***
##                                                                                 (0.01)   
## -----------------------------------------------------------------------------------------
## AIC                                 4500.89       4454.36       4289.89       4284.43    
## BIC                                 4523.30       4482.36       4329.10       4329.24    
## Log Likelihood                     -2246.45      -2222.18      -2137.95      -2134.22    
## Num. obs.                           2000          2000          2000          2000       
## Num. groups: CLASS                   100           100           100           100       
## Var: CLASS (Intercept)                 0.86          0.49          0.41          0.41    
## Var: Residual                          0.46          0.46          0.39          0.39    
## Var: CLASS FEMALETRUE                                              0.27          0.23    
## Cov: CLASS (Intercept) FEMALETRUE                                  0.02          0.02    
## =========================================================================================
## *** p < 0.001, ** p < 0.01, * p < 0.05

What do you conclude? Which model should we go with?