3.7 Distribution Across Subgroups

Suppose we wanted to investigate whether birthweight differed by the gender of the baby. One clear place to start is to look at variable summaries by group. There are a couple of ways to do this in R; the easiest is the describeBy() function from the psych package.

Recall that the magrittr package offers some alternative pipes. Here we use the %$% (exposition) pipe. This pipe exposes the columns of the data frame on its left to the expression on its right, so each column can be referred to by name, just as if we had written the dataset name and a dollar sign in front of it. It is useful for streamlining code: we only have to type the dataset name once, and it is immediately clear which dataset we are working with. We will normally use the %$% pipe where it enhances clarity.
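To make the behaviour of %$% concrete, here is a minimal sketch using R's built-in mtcars dataset rather than bab9 (an assumption made only so that the example runs on its own). It computes the same mean twice, once with the pipe and once with the dollar sign, and checks that the two results agree.

```r
library(magrittr)

# With the exposition pipe: mpg is looked up inside mtcars
piped <- mtcars %$% mean(mpg)

# Without the pipe: the dataset name is typed in front of the column
plain <- mean(mtcars$mpg)

# Both approaches give the same value
identical(piped, plain)
```

The saving is small with a single column, but it grows when a function takes several columns from the same dataset.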

#--- Examine birthweight by gender 
bab9 %$% describeBy(bweight, sex)
## 
##  Descriptive statistics by group 
## group: male
##    vars   n mean  sd median trimmed mad min  max range  skew kurtosis   se
## X1    1 326 3211 666   3290    3256 526 700 4650  3950 -0.88     1.59 36.9
## -------------------------------------------------------- 
## group: female
##    vars   n mean  sd median trimmed mad min  max range  skew kurtosis   se
## X1    1 315 3044 629   3120    3108 445 630 4416  3786 -1.15     2.04 35.4
#--- Same code, no pipe
#describeBy(bab9$bweight, bab9$sex)
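If the psych package is not available, group summaries can also be obtained with base R. The sketch below is one possible approach using tapply(). It runs on a small toy data frame whose values are made up purely for illustration; only the column names bweight and sex are borrowed from bab9.

```r
# Toy stand-in for bab9: these values are invented for illustration only
toy <- data.frame(
  bweight = c(3200, 3400, 2900, 3100, 3000, 3300),
  sex     = factor(c("male", "male", "male", "female", "female", "female"))
)

# Apply mean() and sd() to birthweight within each level of sex
means <- tapply(toy$bweight, toy$sex, mean)
sds   <- tapply(toy$bweight, toy$sex, sd)

# Combine into one table with a row per group
by_sex <- cbind(mean = means, sd = sds)
by_sex
```

Replacing toy with bab9 should give the group means and standard deviations reported by describeBy(), although without the other statistics it prints.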

Exercise 9.1: Descriptively, do the means and standard deviations differ by gender?

Recall that if you are working in the provided R Notebooks, you can write your answer to this question in the blank space directly beneath the relevant code chunk.