5.10 Time: Wave participation & time-point presence

5.10.1 Data, packages & functions

  • Plot type: Stacked bar plot
  • tidyr::expand(): generates all combinations of the given variables, including combinations that are not observed in the data

5.10.2 Graph

  • Here we’ll reproduce, critique and possibly improve Figure 5.17 (Bauer et al. 2020)
  • Questions:
    • What does the graph show? What are the underlying variables (and data)?
    • How many scales/mappings does it use? Could we reduce them?
    • What do you like, what do you dislike about the figure? What is good, what is bad?
    • What kind of information could we add to the graph (if any)?
    • How would you approach a replication of the graph?


Figure 5.17: Presence/participation at/in different time points/waves



5.10.3 Lab: Data & Code

  • The code for Figure 5.17 is shown below (and creates Figure 5.18).

  • Learning objectives

    • How to make stacked barplots
    • How to expand data

We’ll start by preparing the data for our plot. As you can see below, the data are already in long format and contain an individual identifier, pid, as well as two variables that hold the same information, the wave identifier, in different formats: wave.num (numeric) and wave (label).

If you want to move directly to the plot…

pid        wave.num  wave
421518540  1         Wave 1
441620046  1         Wave 1
454072144  1         Wave 1
477478244  1         Wave 1
481214044  1         Wave 1
453648542  1         Wave 1
## [1] 6258
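
To make the long format concrete, here is a minimal sketch with made-up data. The pid values and the object name data are assumptions for illustration only and are not the survey data used in this chapter.

```r
# Hypothetical toy panel in long format: one row per respondent and attended wave.
# pid 1 took part in all three waves, pid 2 in waves 1 and 2,
# and pids 3, 4 and 5 in a single wave each.
data <- data.frame(
  pid      = c(1, 1, 1, 2, 2, 3, 4, 5),
  wave.num = c(1, 2, 3, 1, 2, 1, 2, 3)
)

# Add the wave identifier as a label, mirroring the wave column above
data$wave <- paste("Wave", data$wave.num)

head(data)
nrow(data)
```

Note that waves a respondent did not attend simply have no row here, which is why the data need to be expanded in the next step.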



We expand the data, creating a new data frame with one row for every respondent \(\times\) wave combination, and join it with the original one. This way we end up with a data frame in which unobserved respondent \(\times\) wave combinations are marked as missing (NA in wave).

pid        wave.num
401008246  1
401008246  2
401008246  3
401008443  1
401008443  2
401008443  3
## [1] 10269
## Joining, by = c("pid", "wave.num")
pid        wave.num  wave
401008246  1         NA
401008246  2         NA
401008246  3         Wave 3
401008443  1         Wave 1
401008443  2         NA
401008443  3         NA
## [1] 10269
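
The expand-and-join step can be sketched as follows on made-up data. The toy values and the object names data_expanded and data_full are assumptions; only tidyr::expand() and the join on pid and wave.num are taken from the text above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy panel in long format (not the chapter's data)
data <- data.frame(
  pid      = c(1, 1, 1, 2, 2, 3),
  wave.num = c(1, 2, 3, 1, 2, 1)
)
data$wave <- paste("Wave", data$wave.num)

# All pid x wave.num combinations, observed or not
data_expanded <- expand(data, pid, wave.num)
nrow(data_expanded)

# Join the observed rows back on; unobserved combinations get NA in wave
data_full <- left_join(data_expanded, data, by = c("pid", "wave.num"))
data_full
```

Passing by explicitly avoids the "Joining, by = ..." message that appears when the join columns are guessed.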



Subsequently, we go through several steps to summarize the data across waves and to drop the categories with the smallest numbers (participants only in W2 and W3 (N = 3) and only in W1 and W3 (N = 2)). If you like, you can skip this whole part and go directly to the function below.

## Warning: NAs introduced by coercion
wave   samples  samples_labels
Wave1  532      Only W1 (N = 532)
Wave1  292      W1 and W2 (N = 292)
Wave1  1269     W1, W2 and W3 (N = 1269)
Wave2  482      Only W2 (N = 482)
Wave2  292      W1 and W2 (N = 292)
Wave2  1269     W1, W2 and W3 (N = 1269)
Wave3  843      Only W3 (N = 843)
Wave3  1269     W1, W2 and W3 (N = 1269)
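
One possible way to arrive at a table of this shape is sketched below. This is not the code used for the chapter: the toy data, the object names patterns and data_plot, the threshold min_n and the simplified label wording are all assumptions.

```r
library(dplyr)

# Hypothetical toy panel in long format (not the chapter's data)
data <- data.frame(
  pid      = c(1, 1, 1, 2, 2, 3, 4, 5, 6, 6, 6),
  wave.num = c(1, 2, 3, 1, 2, 1, 2, 3, 1, 2, 3)
)

# One participation pattern per respondent, e.g. "W1, W2"
patterns <- data %>%
  arrange(pid, wave.num) %>%
  group_by(pid) %>%
  summarise(pattern = paste0("W", wave.num, collapse = ", "), .groups = "drop")

# Assumed threshold for dropping very small categories
# (1 keeps every pattern in this toy example)
min_n <- 1

# Count respondents per wave and pattern and build the legend labels
data_plot <- data %>%
  left_join(patterns, by = "pid") %>%
  count(wave.num, pattern, name = "samples") %>%
  filter(samples >= min_n) %>%
  mutate(
    wave = paste0("Wave", wave.num),
    samples_labels = paste0(pattern, " (N = ", samples, ")")
  ) %>%
  select(wave, samples, samples_labels)

data_plot
```

Each pattern appears once for every wave it covers, so summing samples within a wave gives the number of participants in that wave.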

Finally, we plot the participation across waves in Figure 5.18.

Figure 5.18: Presence/participation at/in different time points/waves
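
A minimal sketch of a stacked bar plot built from the summary table shown above. The numbers are copied from that table; the object names data_plot and p, the axis labels and the theme are assumptions and may differ from the code behind Figure 5.18.

```r
library(ggplot2)

# Summary table as printed above: one row per wave and participation pattern
data_plot <- data.frame(
  wave = c("Wave1", "Wave1", "Wave1", "Wave2", "Wave2", "Wave2", "Wave3", "Wave3"),
  samples = c(532, 292, 1269, 482, 292, 1269, 843, 1269),
  samples_labels = c(
    "Only W1 (N = 532)", "W1 and W2 (N = 292)", "W1, W2 and W3 (N = 1269)",
    "Only W2 (N = 482)", "W1 and W2 (N = 292)", "W1, W2 and W3 (N = 1269)",
    "Only W3 (N = 843)", "W1, W2 and W3 (N = 1269)"
  )
)

# geom_col() stacks bars by default, so each wave's bar is split by pattern
p <- ggplot(data_plot, aes(x = wave, y = samples, fill = samples_labels)) +
  geom_col() +
  labs(x = "Wave", y = "Number of participants", fill = "Participation") +
  theme_minimal()

p
```

Because the labels already contain the group sizes, the legend doubles as a small frequency table.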

5.10.4 Exercise

  • Try to produce such a graph with a panel survey that you are currently using. Store the panel data in long format, keep only the participant ID and the wave number, rename these to pid and wave.num, and then start with the code.
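
As a starting point for the exercise, the preparation step might look like the sketch below. The data frame my_panel and its column names respondent_id, survey_wave and trust are hypothetical placeholders for whatever your own panel data contain.

```r
library(dplyr)

# Hypothetical panel data in long format with an extra substantive variable
my_panel <- data.frame(
  respondent_id = c("a", "a", "b", "c", "c", "c"),
  survey_wave   = c(1, 2, 1, 1, 2, 3),
  trust         = c(3, 4, 2, 5, 5, 4)
)

# Keep only the ID and the wave number, renaming them on the way,
# and drop any duplicated respondent-wave rows
data <- my_panel %>%
  select(pid = respondent_id, wave.num = survey_wave) %>%
  distinct()

data
```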

References

Bauer, Paul C, Frederic Gerdon, Florian Keusch, and Frauke Kreuter. 2020. “The Impact of the GDPR Policy on Data Sharing/Privacy Attitudes.” Preliminary Draft, 1–22.