2 Stepwise methods

2.1 Introduction

We have seen that fitting all the models to select the best one may be computationally intensive. Stepwise methods decrease the number of models to fit by adding (forward) or removing (backward) on variable at each step.

In backward stepwise, we fit with all the predictors in the model. We then remove the predictor with lower contribution to the model. This can be based on the change of AIC or some other statistics, if the variable is removed.
It is also common to remove the predictor with the highest p-value. One keeps removing variables until the removal of any other predictor will decrease the prediction ability (or all the predictors have a significant p-value).

The forward stepwise starts by choosing the predictor with best prediction ability. Than, with that predictor in the model, looks for the next predictor that most improves the model. This process stops when no more predictors improve the model.

Despite being computationally appealing, stepwise methods don’t necessarily choose the correct model, or even the best one.

2.2 Readings

Read the following chapters of An introduction to statistical learning:

6.1.2 Stepwise Selection

2.3 Practical session

Task 1 - Fit a linear model with stepwise backward

As in the previous section, we will use the fat dataset in the library(faraway), we want to fit a linear model to predict body fat (variable brozek) using the other variables available, except for siri (another way of computing body fat), density (it is used in the brozek and siri formulas) and free (it is computed using brozek formula).

Let’s first read the data

#install.packages(c("leaps", "faraway"))
library(leaps)   #function to search for the best model
library(faraway) #has the dataset fat
set.seed(1221)

data("fat")
head(fat)

##   brozek siri density age weight height adipos  free neck chest abdom   hip
## 1   12.6 12.3  1.0708  23 154.25  67.75   23.7 134.9 36.2  93.1  85.2  94.5
## 2    6.9  6.1  1.0853  22 173.25  72.25   23.4 161.3 38.5  93.6  83.0  98.7
## 3   24.6 25.3  1.0414  22 154.00  66.25   24.7 116.0 34.0  95.8  87.9  99.2
## 4   10.9 10.4  1.0751  26 184.75  72.25   24.9 164.7 37.4 101.8  86.4 101.2
## 5   27.8 28.7  1.0340  24 184.25  71.25   25.6 133.1 34.4  97.3 100.0 101.9
## 6   20.6 20.9  1.0502  24 210.25  74.75   26.5 167.0 39.0 104.5  94.4 107.8
##   thigh knee ankle biceps forearm wrist
## 1  59.0 37.3  21.9   32.0    27.4  17.1
## 2  58.7 37.3  23.4   30.5    28.9  18.2
## 3  59.6 38.9  24.0   28.8    25.2  16.6
## 4  60.1 37.3  22.8   32.4    29.4  18.2
## 5  63.2 42.2  24.0   32.2    27.7  17.7
## 6  66.0 42.0  25.6   35.7    30.6  18.8

The function regsubset() that we have used before, it also implements backward and forward selection.

We will start with backward selection for variables selection.

############################################################
#BACKWARD SELECTION
############################################################
model.bwrd <- regsubsets(brozek ~ age + weight + 
                           height + adipos +
                           neck + chest + 
                           abdom + hip + thigh +
                           knee + ankle + 
                           biceps + forearm + 
                           wrist,
                         data = fat, 
                         nvmax = 14, 
                         method = "backward")
summary(model.bwrd)

## Subset selection object
## Call: regsubsets.formula(brozek ~ age + weight + height + adipos + 
##     neck + chest + abdom + hip + thigh + knee + ankle + biceps + 
##     forearm + wrist, data = fat, nvmax = 14, method = "backward")
## 14 Variables  (and intercept)
##         Forced in Forced out
## age         FALSE      FALSE
## weight      FALSE      FALSE
## height      FALSE      FALSE
## adipos      FALSE      FALSE
## neck        FALSE      FALSE
## chest       FALSE      FALSE
## abdom       FALSE      FALSE
## hip         FALSE      FALSE
## thigh       FALSE      FALSE
## knee        FALSE      FALSE
## ankle       FALSE      FALSE
## biceps      FALSE      FALSE
## forearm     FALSE      FALSE
## wrist       FALSE      FALSE
## 1 subsets of each size up to 14
## Selection Algorithm: backward
##           age weight height adipos neck chest abdom hip thigh knee ankle biceps
## 1  ( 1 )  " " " "    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 2  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 3  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 4  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 5  ( 1 )  "*" "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 6  ( 1 )  "*" "*"    " "    " "    " "  " "   "*"   " " "*"   " "  " "   " "   
## 7  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   " " "*"   " "  " "   " "   
## 8  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   " "   
## 9  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   "*"   
## 10  ( 1 ) "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 11  ( 1 ) "*" "*"    "*"    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 12  ( 1 ) "*" "*"    "*"    " "    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 13  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 14  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   "*"  "*"   "*"   
##           forearm wrist
## 1  ( 1 )  " "     " "  
## 2  ( 1 )  " "     " "  
## 3  ( 1 )  " "     "*"  
## 4  ( 1 )  "*"     "*"  
## 5  ( 1 )  "*"     "*"  
## 6  ( 1 )  "*"     "*"  
## 7  ( 1 )  "*"     "*"  
## 8  ( 1 )  "*"     "*"  
## 9  ( 1 )  "*"     "*"  
## 10  ( 1 ) "*"     "*"  
## 11  ( 1 ) "*"     "*"  
## 12  ( 1 ) "*"     "*"  
## 13  ( 1 ) "*"     "*"  
## 14  ( 1 ) "*"     "*"

#prints the Cp and matrix of predictors
cbind(summary(model.bwrd)$outmat, 
      Cp=round(summary(model.bwrd)$cp,4))

##           age weight height adipos neck chest abdom hip thigh knee ankle biceps
## 1  ( 1 )  " " " "    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 2  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 3  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 4  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 5  ( 1 )  "*" "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 6  ( 1 )  "*" "*"    " "    " "    " "  " "   "*"   " " "*"   " "  " "   " "   
## 7  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   " " "*"   " "  " "   " "   
## 8  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   " "   
## 9  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   "*"   
## 10  ( 1 ) "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 11  ( 1 ) "*" "*"    "*"    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 12  ( 1 ) "*" "*"    "*"    " "    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 13  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 14  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   "*"  "*"   "*"   
##           forearm wrist Cp       
## 1  ( 1 )  " "     " "   "71.0755"
## 2  ( 1 )  " "     " "   "19.6177"
## 3  ( 1 )  " "     "*"   "13.2793"
## 4  ( 1 )  "*"     "*"   "8.148"  
## 5  ( 1 )  "*"     "*"   "8.3016" 
## 6  ( 1 )  "*"     "*"   "6.5901" 
## 7  ( 1 )  "*"     "*"   "5.3117" 
## 8  ( 1 )  "*"     "*"   "5.2307" 
## 9  ( 1 )  "*"     "*"   "6.2969" 
## 10  ( 1 ) "*"     "*"   "7.5953" 
## 11  ( 1 ) "*"     "*"   "9.1148" 
## 12  ( 1 ) "*"     "*"   "11.0509"
## 13  ( 1 ) "*"     "*"   "13"     
## 14  ( 1 ) "*"     "*"   "15"

Start reading the above matrix from below. First, you have the model will all the predictors (line 14). The knee is removed, followed by adipos. The last model fitted only has abdom. Among these models, the one with 8 predictors has the lowest $Cp$ .

We can now fit the model with those predictors:

#final model from backward 
bwrd.model <- lm(brozek ~ age + weight + 
                    neck + abdom + hip +
                    thigh + forearm + 
                    wrist,
                  data = fat)
summary(bwrd.model)

## 
## Call:
## lm(formula = brozek ~ age + weight + neck + abdom + hip + thigh + 
##     forearm + wrist, data = fat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.0574  -2.7411  -0.1912   2.6929   9.4977 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -20.06213   10.84654  -1.850  0.06558 .  
## age           0.05922    0.02850   2.078  0.03876 *  
## weight       -0.08414    0.03695  -2.277  0.02366 *  
## neck         -0.43189    0.20799  -2.077  0.03889 *  
## abdom         0.87721    0.06661  13.170  < 2e-16 ***
## hip          -0.18641    0.12821  -1.454  0.14727    
## thigh         0.28644    0.11949   2.397  0.01727 *  
## forearm       0.48255    0.17251   2.797  0.00557 ** 
## wrist        -1.40487    0.47167  -2.978  0.00319 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.965 on 243 degrees of freedom
## Multiple R-squared:  0.7467, Adjusted R-squared:  0.7383 
## F-statistic: 89.53 on 8 and 243 DF,  p-value: < 2.2e-16

Now, let’s use forward stepwise. Using the $Cp$ to choose the best model, will result, in this case,in the same set of the predictors as the backward stepwise. This is not always the case but it is quite common to happen.

Also, the matrix in the output is not exactly the same as the backward method.

############################################################
#FORWARD SELECTION
############################################################
model.fwrd <- regsubsets(brozek ~ age + weight + 
                           height + adipos +
                           neck + chest + 
                           abdom + hip + thigh +
                           knee + ankle + 
                           biceps + forearm + 
                           wrist,
                         data = fat, 
                         nvmax = 14, 
                         method = "forward")
summary(model.fwrd)

## Subset selection object
## Call: regsubsets.formula(brozek ~ age + weight + height + adipos + 
##     neck + chest + abdom + hip + thigh + knee + ankle + biceps + 
##     forearm + wrist, data = fat, nvmax = 14, method = "forward")
## 14 Variables  (and intercept)
##         Forced in Forced out
## age         FALSE      FALSE
## weight      FALSE      FALSE
## height      FALSE      FALSE
## adipos      FALSE      FALSE
## neck        FALSE      FALSE
## chest       FALSE      FALSE
## abdom       FALSE      FALSE
## hip         FALSE      FALSE
## thigh       FALSE      FALSE
## knee        FALSE      FALSE
## ankle       FALSE      FALSE
## biceps      FALSE      FALSE
## forearm     FALSE      FALSE
## wrist       FALSE      FALSE
## 1 subsets of each size up to 14
## Selection Algorithm: forward
##           age weight height adipos neck chest abdom hip thigh knee ankle biceps
## 1  ( 1 )  " " " "    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 2  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 3  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 4  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 5  ( 1 )  " " "*"    " "    " "    "*"  " "   "*"   " " " "   " "  " "   " "   
## 6  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   " " " "   " "  " "   " "   
## 7  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   " " "*"   " "  " "   " "   
## 8  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   " "   
## 9  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   "*"   
## 10  ( 1 ) "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 11  ( 1 ) "*" "*"    "*"    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 12  ( 1 ) "*" "*"    "*"    " "    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 13  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 14  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   "*"  "*"   "*"   
##           forearm wrist
## 1  ( 1 )  " "     " "  
## 2  ( 1 )  " "     " "  
## 3  ( 1 )  " "     "*"  
## 4  ( 1 )  "*"     "*"  
## 5  ( 1 )  "*"     "*"  
## 6  ( 1 )  "*"     "*"  
## 7  ( 1 )  "*"     "*"  
## 8  ( 1 )  "*"     "*"  
## 9  ( 1 )  "*"     "*"  
## 10  ( 1 ) "*"     "*"  
## 11  ( 1 ) "*"     "*"  
## 12  ( 1 ) "*"     "*"  
## 13  ( 1 ) "*"     "*"  
## 14  ( 1 ) "*"     "*"

cbind(summary(model.fwrd)$outmat, 
      Cp=round(summary(model.fwrd)$cp,4))

##           age weight height adipos neck chest abdom hip thigh knee ankle biceps
## 1  ( 1 )  " " " "    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 2  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 3  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 4  ( 1 )  " " "*"    " "    " "    " "  " "   "*"   " " " "   " "  " "   " "   
## 5  ( 1 )  " " "*"    " "    " "    "*"  " "   "*"   " " " "   " "  " "   " "   
## 6  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   " " " "   " "  " "   " "   
## 7  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   " " "*"   " "  " "   " "   
## 8  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   " "   
## 9  ( 1 )  "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  " "   "*"   
## 10  ( 1 ) "*" "*"    " "    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 11  ( 1 ) "*" "*"    "*"    " "    "*"  " "   "*"   "*" "*"   " "  "*"   "*"   
## 12  ( 1 ) "*" "*"    "*"    " "    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 13  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   " "  "*"   "*"   
## 14  ( 1 ) "*" "*"    "*"    "*"    "*"  "*"   "*"   "*" "*"   "*"  "*"   "*"   
##           forearm wrist Cp       
## 1  ( 1 )  " "     " "   "71.0755"
## 2  ( 1 )  " "     " "   "19.6177"
## 3  ( 1 )  " "     "*"   "13.2793"
## 4  ( 1 )  "*"     "*"   "8.148"  
## 5  ( 1 )  "*"     "*"   "7.4123" 
## 6  ( 1 )  "*"     "*"   "7.0794" 
## 7  ( 1 )  "*"     "*"   "5.3117" 
## 8  ( 1 )  "*"     "*"   "5.2307" 
## 9  ( 1 )  "*"     "*"   "6.2969" 
## 10  ( 1 ) "*"     "*"   "7.5953" 
## 11  ( 1 ) "*"     "*"   "9.1148" 
## 12  ( 1 ) "*"     "*"   "11.0509"
## 13  ( 1 ) "*"     "*"   "13"     
## 14  ( 1 ) "*"     "*"   "15"

#final model from forward 
fwrd.model <- lm(brozek ~ age + weight + 
                   neck + abdom + 
                   hip + thigh +
                  forearm + wrist,
                 data = fat)
summary(fwrd.model)

## 
## Call:
## lm(formula = brozek ~ age + weight + neck + abdom + hip + thigh + 
##     forearm + wrist, data = fat)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -10.0574  -2.7411  -0.1912   2.6929   9.4977 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -20.06213   10.84654  -1.850  0.06558 .  
## age           0.05922    0.02850   2.078  0.03876 *  
## weight       -0.08414    0.03695  -2.277  0.02366 *  
## neck         -0.43189    0.20799  -2.077  0.03889 *  
## abdom         0.87721    0.06661  13.170  < 2e-16 ***
## hip          -0.18641    0.12821  -1.454  0.14727    
## thigh         0.28644    0.11949   2.397  0.01727 *  
## forearm       0.48255    0.17251   2.797  0.00557 ** 
## wrist        -1.40487    0.47167  -2.978  0.00319 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.965 on 243 degrees of freedom
## Multiple R-squared:  0.7467, Adjusted R-squared:  0.7383 
## F-statistic: 89.53 on 8 and 243 DF,  p-value: < 2.2e-16

TRY IT YOURSELF:

What variables are selected in the example above using forward stepwise, if instead of the $Cp$ we use the $BIC$ ?

See the solution code

#4 predictors: weight, abdom, forearm and wrist
cbind(summary(model.fwrd)$outmat, 
      BIC=round(summary(model.fwrd)$bic,4))

What is the adjusted r-square for the model in 1) and the model with the 8 predictors?

See the solution code

#4 pred = 0.731, 8 pred = 0.738
cbind(summary(model.fwrd)$outmat, 
      AdjR2=round(summary(model.fwrd)$adjr2,4))

Task 2 - Stepwise in a logistic model

With the lowbwt.csv dataset, in the library(faraway), we want to fit a logistic regression to predict low birthweight (<2500gr), using age, lwt, race, smoke, ptl, ht, ui, ftv.

Let’s read the data and make sure that race and ftv are factor predictors. And recode ftv into (0, 1, 2+)

#install.packages("bestglm")
#install.packages("dplyr")
library(bestglm)
library(dplyr)                            #to use the rename() function

lowbwt <- read.csv("https://www.dropbox.com/s/1odxxsbwd5anjs8/lowbwt.csv?dl=1",
                   stringsAsFactors = TRUE)
lowbwt$race <- factor(lowbwt$race, 
                      levels = c(1,2,3),
                      labels = c("White", "Black", "Other"))
#number of previous appointments 
lowbwt$ftv[lowbwt$ftv>1] <- 2       #categories 0, 1, 2+
lowbwt$ftv <- factor(lowbwt$ftv,
                     levels = c(0,1,2),
                     labels = c("0", "1", "2+"))

head(lowbwt)

##   id low age lwt  race smoke ptl ht ui ftv  bwt
## 1 85   0  19 182 Black     0   0  0  1   0 2523
## 2 86   0  33 155 Other     0   0  0  0  2+ 2551
## 3 87   0  20 105 White     1   0  0  0   1 2557
## 4 88   0  21 108 White     1   0  0  1  2+ 2594
## 5 89   0  18 107 White     1   0  0  1   0 2600
## 6 91   0  21 124 Other     0   0  0  0   0 2622

Let’s first fit the model for low using all the predictors.

lowbwt.logit <- glm(low ~ age + 
                      lwt + race + 
                      smoke + ptl +
                      ht + ui + ftv, 
                    family=binomial, data=lowbwt)
summary(lowbwt.logit)

## 
## Call:
## glm(formula = low ~ age + lwt + race + smoke + ptl + ht + ui + 
##     ftv, family = binomial, data = lowbwt)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.9907  -0.8247  -0.5154   0.9791   2.1624  
## 
## Coefficients:
##              Estimate Std. Error z value Pr(>|z|)   
## (Intercept)  0.590930   1.213568   0.487  0.62630   
## age         -0.026945   0.037414  -0.720  0.47141   
## lwt         -0.015748   0.006963  -2.262  0.02371 * 
## raceBlack    1.254016   0.529382   2.369  0.01784 * 
## raceOther    0.820827   0.451042   1.820  0.06878 . 
## smoke        0.865245   0.414450   2.088  0.03683 * 
## ptl          0.604607   0.357445   1.691  0.09075 . 
## ht           1.894689   0.705977   2.684  0.00728 **
## ui           0.732822   0.460442   1.592  0.11148   
## ftv1        -0.299057   0.463918  -0.645  0.51917   
## ftv2+        0.178370   0.450777   0.396  0.69233   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 234.67  on 188  degrees of freedom
## Residual deviance: 200.61  on 178  degrees of freedom
## AIC: 222.61
## 
## Number of Fisher Scoring iterations: 4

The function step() also implements stepwise selection based on AIC for generalised linear models. We will use it to select the best predictors using backward stepwise

step(lowbwt.logit)

## Start:  AIC=222.61
## low ~ age + lwt + race + smoke + ptl + ht + ui + ftv
## 
##         Df Deviance    AIC
## - ftv    2   201.43 219.43
## - age    1   201.13 221.13
## <none>       200.61 222.61
## - ui     1   203.10 223.10
## - ptl    1   203.58 223.58
## - smoke  1   205.06 225.06
## - race   2   207.41 225.41
## - lwt    1   206.35 226.35
## - ht     1   208.22 228.22
## 
## Step:  AIC=219.43
## low ~ age + lwt + race + smoke + ptl + ht + ui
## 
##         Df Deviance    AIC
## - age    1   201.99 217.99
## <none>       201.43 219.43
## - ptl    1   203.95 219.95
## - ui     1   204.11 220.11
## - race   2   208.77 222.77
## - lwt    1   206.81 222.81
## - smoke  1   206.92 222.92
## - ht     1   208.81 224.81
## 
## Step:  AIC=217.99
## low ~ lwt + race + smoke + ptl + ht + ui
## 
##         Df Deviance    AIC
## <none>       201.99 217.99
## - ptl    1   204.22 218.22
## - ui     1   204.90 218.90
## - smoke  1   207.73 221.73
## - lwt    1   208.11 222.11
## - race   2   210.31 222.31
## - ht     1   209.46 223.46

## 
## Call:  glm(formula = low ~ lwt + race + smoke + ptl + ht + ui, family = binomial, 
##     data = lowbwt)
## 
## Coefficients:
## (Intercept)          lwt    raceBlack    raceOther        smoke          ptl  
##    -0.08655     -0.01591      1.32572      0.89708      0.93873      0.50321  
##          ht           ui  
##     1.85504      0.78570  
## 
## Degrees of Freedom: 188 Total (i.e. Null);  181 Residual
## Null Deviance:       234.7 
## Residual Deviance: 202   AIC: 218

We can see, step by step, the variables removed. First ftv is removed, followed by age, up to the final model that includes the variables above.

If you get an error because there are missing values in dataset and the

function does not know how to handle them, try to restrict the dataset to 
complete cases only as below.


```r
lowbwt.complete <- lowbwt[complete.cases(lowbwt), ]
lowbwt.logit <- glm(low ~ age + 
                      lwt + race + 
                      smoke + ptl +
                      ht + ui + ftv, 
                    family=binomial, data=lowbwt.complete)

step(lowbwt.logit)

We can see, step by step, the variables removed. First ftv is removed, followed by age, up to the final model that includes the variables above.

2.4 Exercises

With the fat dataset (Task 1), use the step() function to implement backward selection, to select the predictors for body fat (variable brozek) using all the other variables available, except for siri, density and free
The dataset prostate available in the package prostate contains information on 97 men who were about to receive a radical prostatectomy. These data come from a study examining the correlation between the prostate specific antigen (lpsa) and a number of other clinical measures.

Use backward stepwise to select a subset of predictors of lpsa, using all other the variables in the model. You can use the $adjusted-r^2$ to select the best model. You can compare the model that you have obtained with the model using best subset selection (section 1.3)