3.7 Distribution Across Subgroups
Suppose we want to investigate whether birthweight differs by the gender of the baby. A clear place to start is to look at variable summaries by group. There are a couple of ways to do this in R; the easiest is the describeBy() function from the psych package.
Recall that the magrittr package offers some alternative pipes. Here we use the %$% pipe, which exposes the variables of the dataset on its left to the expression on its right, so that bweight is read as bab9$bweight. It is useful for streamlining code: we only have to type the dataset name once, and it is immediately clear which dataset we are working with. This code will normally use the %$% pipe where it enhances clarity.
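To see what the %$% pipe does in isolation, the minimal sketch below compares a piped call with its dollar-sign equivalent. It uses R's built-in mtcars dataset rather than bab9, and the object names piped and unpiped are chosen only for illustration.

```r
library(magrittr)

# With %$%, the columns of mtcars are visible by name on the right-hand side
piped <- mtcars %$% cor(mpg, wt)

# The same call written out with the dataset name repeated
unpiped <- cor(mtcars$mpg, mtcars$wt)

# Both forms should give the same correlation
print(all.equal(piped, unpiped))
```

The piped version names the dataset once, which is the readability gain described above.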
#--- Examine birthweight by gender
bab9 %$% describeBy(bweight, sex)
##
## Descriptive statistics by group
## group: male
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 326 3211 666 3290 3256 526 700 4650 3950 -0.88 1.59 36.9
## --------------------------------------------------------
## group: female
## vars n mean sd median trimmed mad min max range skew kurtosis se
## X1 1 315 3044 629 3120 3108 445 630 4416 3786 -1.15 2.04 35.4
#--- Same code, no pipe
#describeBy(bab9$bweight, bab9$sex)
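As one of the other ways to get summaries by group, base R's tapply() can compute a single statistic within each level of a grouping variable. The sketch below uses a small made-up data frame called toy (hypothetical values, not the bab9 data) so that it runs on its own; with the real data, the same pattern would presumably apply to bweight and sex in bab9.

```r
# Toy data: hypothetical birthweights, NOT taken from bab9
toy <- data.frame(
  bweight = c(3200, 3400, 3000, 3100, 2900, 3300),
  sex     = factor(c("male", "male", "male", "female", "female", "female"))
)

# Mean and standard deviation of bweight within each level of sex
group_means <- with(toy, tapply(bweight, sex, mean))
group_sds   <- with(toy, tapply(bweight, sex, sd))

print(group_means)
print(group_sds)
```

Unlike describeBy(), this approach returns one statistic per call, so it is more typing when a full set of descriptives is wanted.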
Exercise 9.1: Descriptively, do the means and standard deviations differ by gender?
Recall that if working in the provided R Notebooks, you can write the answer to this question in the blank space directly beneath the code chunk relevant for answering it.