5.4 Pre-prepared Tables

Sometimes we may wish to work from a prepared (aggregated) table rather than from individual rows. Let’s examine the use of bednets and the presence of an enlarged spleen. We want to manually enter four columns: three variables, plus the frequency of each combination of their values. We can do this using the command c(); ‘c’ stands for concatenate and creates ‘vectors’ of numbers (recall: a vector is a sequence of values of the same data type). Each of the three variables takes two codes. Village A is coded 1, and village B is coded 2. Spleen enlarged is coded 1, and spleen normal is coded 0. Those with bednets are coded 1 and those without are coded 0.
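As a quick illustration of how c() behaves, the sketch below builds a small vector and checks its length and type. The object names codes and mixed are illustrative only and are not part of the bednets data.

```r
# Build a numeric vector with c(); object names here are illustrative
codes <- c(1, 1, 0, 0)
length(codes)   # number of elements in the vector
class(codes)    # "numeric"

# A vector holds a single data type, so mixing types coerces every element
mixed <- c(1, "a")
class(mixed)    # "character"
```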

We can then bind the individual vectors together, column by column, using the cbind() command, short for column bind; the result is converted to a dataframe in the next step. There is an equivalent rbind() command for rows.
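To make the difference between cbind() and rbind() concrete, here is a minimal sketch using two short made-up vectors (x and y are hypothetical names). It also shows that binding plain vectors gives a matrix, which as.data.frame() turns into a dataframe.

```r
# Two short illustrative vectors (hypothetical names and values)
x <- c(1, 2, 3)
y <- c(10, 20, 30)

by_col <- cbind(x, y)   # each vector becomes a column: 3 rows, 2 columns
by_row <- rbind(x, y)   # each vector becomes a row: 2 rows, 3 columns
dim(by_col)
dim(by_row)

# cbind() on plain vectors returns a matrix; as.data.frame() converts it
is.matrix(by_col)
df <- as.data.frame(by_col)
class(df)               # "data.frame"
```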

We can expand the frequency table to individual rows (one row per person) using the expand.dft() command from the vcdExtra package, to mimic Stata’s expand command. Note that we have to wrap the output in the as.data.frame() command to explicitly tell R we would like the resulting dataset in the dataframe format, the same format we have been using for all our other datasets.
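To see what this expansion does, the same idea can be sketched in base R without vcdExtra: repeat each row index as many times as its frequency, then drop the frequency column. This is only an illustration of the principle, not how expand.dft() is implemented; the names freq_tab, idx and expanded are made up for the example.

```r
# Frequency table entered directly as a data frame (same values as this section)
freq_tab <- data.frame(
  village = c(1, 1, 1, 1, 2, 2, 2, 2),
  spleen  = c(1, 1, 0, 0, 1, 1, 0, 0),
  bednet  = c(1, 0, 1, 0, 1, 0, 1, 0),
  freq    = c(12, 42, 12, 29, 15, 4, 52, 12)
)

# Repeat each row index 'freq' times, then keep only the three variables
idx <- rep(seq_len(nrow(freq_tab)), times = freq_tab$freq)
expanded <- freq_tab[idx, c("village", "spleen", "bednet")]
rownames(expanded) <- NULL

nrow(expanded)                            # equals sum(freq_tab$freq)
table(expanded$spleen, expanded$bednet)   # cross-tabulation of the expanded rows
```

The number of expanded rows should equal the sum of the freq column, which is a useful check after any expansion of this kind.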

We can then tabulate the data in the usual way, using the cc() command for the overall table and tabpct() together with filter() for the village-specific tables.

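#--- Load the packages used below (skip if already loaded)
library(vcdExtra)    # expand.dft()
library(epiDisplay)  # cc(), tabpct()
library(magrittr)    # %$%
library(dplyr)       # filter(), %>%

#--- Manually enter the data for each column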
village <- c(1,1,1,1,2,2,2,2)
spleen <- c(1,1,0,0,1,1,0,0)
bednet <- c(1,0,1,0,1,0,1,0)
freq <- c(12,42,12,29,15,4,52,12)

#--- Bind together the columns into one data frame
bednets <- cbind(village, spleen, bednet, freq)

#--- Expand the frequency table into individual records
bednets <- as.data.frame(expand.dft(bednets, freq = "freq"))

#--- Get a summary table
bednets %$% cc(spleen, bednet, graph = FALSE)
## 
##        bednet
## spleen    0   1 Total
##   0      41  64   105
##   1      46  27    73
##   Total  87  91   178
## 
## OR =  0.38 
## 95% CI =  0.2, 0.7  
## Chi-squared = 9.9, 1 d.f., P value = 0.002
## Fisher's exact test (2-sided) P value = 0.002

#--- Get village-specific summary tables
bednets %>% filter(village == 1) %$% tabpct(spleen, bednet, graph = FALSE)
## 
## Original table 
##        bednet
## spleen    0   1  Total
##   0      29  12     41
##   1      42  12     54
##   Total  71  24     95
## 
## Row percent 
##       bednet
## spleen       0       1  Total
##      0      29      12     41
##         (70.7)  (29.3)  (100)
##      1      42      12     54
##         (77.8)  (22.2)  (100)
## 
## Column percent 
##        bednet
## spleen    0       %   1      %
##   0      29  (40.8)  12   (50)
##   1      42  (59.2)  12   (50)
##   Total  71   (100)  24  (100)

bednets %>% filter(village == 2) %$% tabpct(spleen, bednet, graph = FALSE)
## 
## Original table 
##        bednet
## spleen    0   1  Total
##   0      12  52     64
##   1       4  15     19
##   Total  16  67     83
## 
## Row percent 
##       bednet
## spleen       0       1  Total
##      0      12      52     64
##         (18.8)  (81.2)  (100)
##      1       4      15     19
##         (21.1)  (78.9)  (100)
## 
## Column percent 
##        bednet
## spleen    0      %   1       %
##   0      12   (75)  52  (77.6)
##   1       4   (25)  15  (22.4)
##   Total  16  (100)  67   (100)
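
The odds ratio that cc() reports for the overall table can be checked by hand from the four cell counts, using the cross-product formula (a × d) / (b × c). The sketch below copies the counts from the overall table in the output; the helper odds_ratio() is a hypothetical name written for this illustration and is not part of any package.

```r
# Cell counts copied from the overall cc() table
# (rows: spleen 0/1, columns: bednet 0/1)
overall <- matrix(c(41, 64,
                    46, 27),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(spleen = c("0", "1"), bednet = c("0", "1")))

# Cross-product odds ratio: (a * d) / (b * c)
odds_ratio <- function(tab) {
  (tab[1, 1] * tab[2, 2]) / (tab[1, 2] * tab[2, 1])
}

round(odds_ratio(overall), 2)   # 0.38, in line with the OR shown by cc()
```

The same helper could be applied to a matrix built from either village-specific table.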

Exercise 13.4:

* a) Which is the response variable?
* b) Does the prevalence of spleen enlargement differ between villages?
* c) Use cc() to examine the association of using bednets and the prevalence of spleen enlargement, ignoring village.
* d) Use filter() and cc() to examine the association of using bednets and the prevalence of spleen enlargement separately by village, and to examine this association when controlling for village.