Chapter 8 Analyzing Complex Survey Data

In this chapter, you will learn:

  • Complex survey concepts and terminology; and
  • How to incorporate a complex survey design to obtain estimates of population quantities using the following statistical analysis methods:
    • Descriptive statistics;
    • Linear regression;
    • Binary logistic regression;
    • Kaplan-Meier estimate of the survival function;
    • Log-rank test to compare survival functions between groups; and
    • Cox proportional hazards regression.

This chapter assumes that you have read the previous chapters, which discuss the unweighted versions of these statistical analysis methods. In places, the presentation is brief and highlights only what is new when analyzing data from a complex survey.

To use the code in this chapter, first load the tidyverse and survey (Lumley 2004, 2023) packages, set two global survey options, and source the file Functions_rmph.R (downloadable from RMPH Resources).

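library(tidyverse)
library(survey)
options(survey.lonely.psu = "adjust")        # single-PSU strata: center at the grand mean (conservative)
options(survey.adjust.domain.lonely = TRUE)  # apply the same adjustment within domains (subpopulations)
source("Functions_rmph.R")                   # helper functions used in this book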
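As a quick check that the setup works, the following minimal sketch builds a survey design object and computes a design-based mean. It does not use this book's data: it assumes the `api` example data that ship with the survey package (the stratified sample `apistrat`, with stratum `stype`, sampling weight `pw`, and finite population correction `fpc`). The object names `des` and `est` are illustrative only.

```r
library(survey)

# Same global options as set in this chapter
options(survey.lonely.psu = "adjust")
options(survey.adjust.domain.lonely = TRUE)

# Example data included with the survey package (not this book's data)
data(api)

# Stratified sample of schools: one school per sampling unit (id = ~1),
# strata defined by school type, with sampling weights and a finite
# population correction
des <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                 fpc = ~fpc, data = apistrat)

# Design-based estimate of the population mean of api00, with its standard error
est <- svymean(~api00, des)
print(est)

# Unweighted sample mean, for comparison only; it ignores the design
print(mean(apistrat$api00))
```

The weighted estimate and the unweighted mean generally differ because the strata were sampled at different rates; the design object carries the information needed to account for that in both the estimate and its standard error.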

References

Lumley, Thomas. 2004. “Analysis of Complex Survey Samples.” Journal of Statistical Software 9 (1): 1–19.
———. 2023. Survey: Analysis of Complex Survey Samples. http://r-survey.r-forge.r-project.org/survey/.