Beyond Linearity

Introduction

This module will cover methods to explore non-linear effects of numerical predictors on the outcome.

By the end of this module you should be able to:

Identify approaches to model non-linear effects
Implement linear and polynomial piecewise regression
Understand the differences between polynomial splines, B-splines and natural splines
Fit a GLM with different splines
Use smoothing splines to approximate non-linear effects
Integrate smoothing splines in modeling strategies using generalised additive models

Dataset used in the examples

The dataset triceps is available in the MultiKink package. To use it, install the package with install.packages("MultiKink"), load it with library(MultiKink), and then run data("triceps").

The data are derived from an anthropometric study of 892 females under 50 years in three Gambian villages in West Africa. There are 892 observations on the following 3 variables:

age - Age of respondents.
lntriceps - Log of the triceps skinfold thickness.
triceps - Triceps skinfold thickness.
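
As a minimal sketch, the loading steps described above could look like this in R. The guarded install and the CRAN mirror URL are assumptions added for convenience; only install.packages("MultiKink"), library(MultiKink) and data("triceps") come from the instructions above.

```r
# Install MultiKink only if it is missing (assumption: internet access
# and the cloud.r-project.org CRAN mirror are available)
if (!requireNamespace("MultiKink", quietly = TRUE)) {
  install.packages("MultiKink", repos = "https://cloud.r-project.org")
}

library(MultiKink)

# Load the triceps data frame into the workspace
data("triceps")

# Check the structure against the description above:
# 892 observations on age, lntriceps and triceps
str(triceps)
summary(triceps$age)
```

A quick scatter plot such as plot(triceps$age, triceps$triceps) is one way to see why a straight line may be a poor description of these data.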

The dataset SA_heart.csv is a retrospective sample of males in a heart-disease high-risk region of the Western Cape, South Africa. There are roughly two controls per case of CHD.

Many of the CHD-positive men have undergone blood pressure reduction treatment and other programs to reduce their risk factors after their CHD event. In some cases the measurements were made after these treatments. These data are taken from a larger dataset, described in Rousseauw et al., 1983, South African Medical Journal.

The data contains 462 observations on the following 10 variables.

sbp - systolic blood pressure
tobacco - cumulative tobacco (kg)
ldl - low density lipoprotein cholesterol
adiposity - a numeric vector
famhist - family history of heart disease, a factor with levels Absent Present
typea - type-A behavior
obesity - a numeric vector
alcohol - current alcohol consumption
age - age at onset
chd - response, coronary heart disease (1 = CHD, 0 = no CHD)
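
A possible way to read this file in R is sketched below. The file location (the current working directory) and the object name sa_heart are assumptions made for illustration; adjust the path to wherever you saved SA_heart.csv.

```r
# Assumption: SA_heart.csv has been saved in the current working directory
sa_path <- "SA_heart.csv"

if (file.exists(sa_path)) {
  # Read character columns such as famhist as factors
  sa_heart <- read.csv(sa_path, stringsAsFactors = TRUE)

  # Compare with the description above: 462 observations
  str(sa_heart)

  # Cases (1) and controls (0)
  table(sa_heart$chd)
} else {
  message("SA_heart.csv not found in ", getwd())
}
```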

Slides from the videos

You can download the slides used in the videos for Beyond Linearity:

Slides

Machine Learning for Biostatistics

Module 5
