2.7 Why visualize? Further examples

  • See the figures below.
    • Q: What do you like about them and what do you dislike? What is good and what is bad? (I will provide some background afterwards)

Figure 2.3, published in Figures and Legewie (2019), visualizes descriptive data on police stops (see note 1 below).

Figure 2.3: Distribution of police stops

Figure 2.4, published in Helbling and Traunmüller (2018), visualizes results from a survey experiment (see note 2 below).

Figure 2.4: Survey experimental results

Figure 2.5, published in Bauer (2018), visualizes estimates from many models in a single graph (2 different datasets \(\times\) 3 outcomes \(\times\) 6 different estimation strategies). The graph summarizes three different tables (Tables 3, 4, and 5).

Figure 2.5: Coefficient plot of many models
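
A coefficient plot of this kind is built from one row per model: a point estimate plus a confidence interval. The following is a minimal sketch in Python of that preparation step, with a crude text rendering in place of a real plot. The model labels, estimates, and standard errors are made up for illustration and are not taken from Bauer (2018); the helpers `ci_rows` and `text_coefplot` are hypothetical names, and the interval assumes a normal approximation (estimate ± 1.96 standard errors).

```python
from typing import NamedTuple


class CoefRow(NamedTuple):
    label: str
    estimate: float
    lower: float
    upper: float


def ci_rows(models, z=1.96):
    """Turn (label, estimate, std_error) triples into plot-ready rows."""
    rows = []
    for label, estimate, std_error in models:
        half_width = z * std_error
        rows.append(CoefRow(label, estimate, estimate - half_width, estimate + half_width))
    return rows


def text_coefplot(rows, lo=-1.0, hi=1.0, width=41):
    """Draw each interval as dashes, the estimate as 'o', and zero as '|'."""
    def col(x):
        x = min(max(x, lo), hi)  # clip to the plotting range
        return round((x - lo) / (hi - lo) * (width - 1))

    lines = []
    for row in rows:
        cells = [" "] * width
        for i in range(col(row.lower), col(row.upper) + 1):
            cells[i] = "-"
        cells[col(0.0)] = "|"
        cells[col(row.estimate)] = "o"
        lines.append(f"{row.label:<22}" + "".join(cells))
    return "\n".join(lines)


# Hypothetical estimates, for illustration only.
example_models = [
    ("Data A, pooled OLS", -0.40, 0.10),
    ("Data A, fixed effects", -0.15, 0.12),
    ("Data B, pooled OLS", -0.30, 0.08),
]
rows = ci_rows(example_models)
print(text_coefplot(rows))
```

Stacking the rows on a shared axis is what makes many models comparable at a glance: an interval that crosses the zero marker (here the second hypothetical model) stands out immediately, which is much harder to see across several regression tables.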

Figure 2.6, published in Imai, Kim, and Wang (2018), visualizes a dichotomous treatment variable over time periods (panel data; see note 3 below).

Figure 2.6: Dichotomous treatment variable over time periods
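
The layout behind such a treatment variation plot can be sketched as a unit-by-period grid in which every cell is treated, control, or missing. The Python sketch below assumes the panel arrives as `(unit, period, treated)` records; the records, the unit names, and the helpers `treatment_grid` and `render` are hypothetical and only illustrate the data structure, not the authors' implementation.

```python
def treatment_grid(records):
    """Arrange (unit, period, treated) records into a unit-by-period grid.

    Cells are True (treated), False (control), or None (no observation,
    e.g. the unit did not exist in that period).
    """
    units = sorted({unit for unit, _, _ in records})
    periods = sorted({period for _, period, _ in records})
    lookup = {(unit, period): treated for unit, period, treated in records}
    grid = {
        unit: [lookup.get((unit, period)) for period in periods]
        for unit in units
    }
    return periods, grid


def render(periods, grid):
    """Text stand-in for coloured tiles: T = treated, c = control, . = missing."""
    symbols = {True: "T", False: "c", None: "."}
    return "\n".join(
        f"{unit:<10}" + "".join(symbols[cell] for cell in cells)
        for unit, cells in grid.items()
    )


# Hypothetical panel, for illustration only.
records = [
    ("A", 2000, False), ("A", 2001, False), ("A", 2002, True), ("A", 2003, True),
    ("B", 2000, True), ("B", 2001, False), ("B", 2002, True), ("B", 2003, False),
    ("C", 2002, False), ("C", 2003, False),  # unit C has no rows before 2002
]
periods, grid = treatment_grid(records)
print(render(periods, grid))
```

In this toy panel, unit A switches into treatment once, unit B goes back and forth, and unit C is never treated and is missing in the first two periods. These are the patterns such a plot is meant to expose before any model is estimated.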

Figure 2.7, published in Hainmueller, Mummolo, and Xu (2016), visualizes simulated data to illustrate the distributions that may underlie linear estimates of interaction effects (see note 4 below).

Figure 2.7: Distribution underlying interaction effects
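
The idea behind the figure can be illustrated with a small Python sketch: when the effect of a binary treatment D varies non-linearly with a moderator X, a linear interaction model can report an almost flat effect, while effects computed within low, medium, and high bins of X reveal the pattern. Everything here is an assumption made for illustration: the data are noise-free and generated with a true effect of (X - 3)^2 - 1, and `binned_effects` is a simplified difference in means within terciles, not the regression-based binning estimator of Hainmueller, Mummolo, and Xu (2016).

```python
from statistics import mean


def simple_ols(xs, ys):
    """Intercept and slope of a least-squares line y = a + b * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


def linear_marginal_effect(data):
    """Fit Y ~ D + X + D*X by fitting one line per treatment group.

    With a binary D this is equivalent to the fully interacted regression.
    Returns (effect of D at X = 0, change in that effect per unit of X).
    """
    a0, b0 = simple_ols(*zip(*[(x, y) for x, d, y in data if d == 0]))
    a1, b1 = simple_ols(*zip(*[(x, y) for x, d, y in data if d == 1]))
    return a1 - a0, b1 - b0


def binned_effects(data, n_bins=3):
    """Treated-minus-control difference in mean Y within equal-sized bins of X."""
    ordered = sorted(data)  # tuples sort by X first
    size = len(ordered) // n_bins
    effects = []
    for b in range(n_bins):
        start = b * size
        stop = len(ordered) if b == n_bins - 1 else start + size
        chunk = ordered[start:stop]
        treated = [y for _, d, y in chunk if d == 1]
        control = [y for _, d, y in chunk if d == 0]
        effects.append(mean(treated) - mean(control))
    return effects


# Hypothetical, noise-free data: the true effect of D is (X - 3)^2 - 1.
n = 600
data = []
for i in range(n):
    x = 6 * i / (n - 1)   # X evenly spaced on [0, 6]
    d = i % 2             # alternate control / treated
    y = 1 + 0.5 * x + d * ((x - 3) ** 2 - 1)
    data.append((x, d, y))

base_effect, interaction_slope = linear_marginal_effect(data)
low, mid, high = binned_effects(data)
print(f"linear model: effect = {base_effect:.2f} + {interaction_slope:.2f} * X")
print(f"binned: low = {low:.2f}, mid = {mid:.2f}, high = {high:.2f}")
```

Under these assumptions the fitted interaction slope is close to zero, so the linear model suggests a roughly constant positive effect, whereas the binned differences are positive, negative, and positive again across the three terciles of X.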

Finally, here is one of many examples of interactive data visualization: Wuttke, Schimpf, and Schoen (2020) study the measurement of populist attitudes. In addition to the paper itself, they provide a Shiny app in which users can interact with the data and produce various plots for subsets.


Bauer, Paul C. 2018. “Unemployment, Trust in Government, and Satisfaction with Democracy: An Empirical Investigation.” Socius 4 (January): 1–14.

Figures, Kalisha Dessources, and Joscha Legewie. 2019. “Visualizing Police Exposure by Race, Gender, and Age in New York City.” Socius 5 (January): 2378023119828913.

Hainmueller, Jens, Jonathan Mummolo, and Yiqing Xu. 2016. “How Much Should We Trust Estimates from Multiplicative Interaction Models? Simple Tools to Improve Empirical Practice.”

Helbling, Marc, and Richard Traunmüller. 2018. “What Is Islamophobia? Disentangling Citizens’ Feelings Toward Ethnicity, Religion and Religiosity Using a Survey Experiment.” British Journal of Political Science, 1–18.

Imai, Kosuke, In Song Kim, and Erik Wang. 2018. “Matching Methods for Causal Inference with Time-Series Cross-Section Data.” Princeton University 1.

Wuttke, Alexander, Christian Schimpf, and Harald Schoen. 2020. “When the Whole Is Greater Than the Sum of Its Parts: On the Conceptualization and Measurement of Populist Attitudes and Other Multidimensional Constructs.” American Political Science Review 114 (2): 356–74.

  1. Description of the figure: “Because this visualization provides data across the lines of race, gender, and age, there are several comparative analyses that can be drawn. Trends can be seen within racial groups, gender groups, and age groups. All charts are parabolic in shape. Across all race-gender intersections, police exposure increases in late adolescence, peaks in early adulthood, and steadily declines as age increases thereafter. Across each race group, police stops on men far outnumber police stops on women. For both men and women, black residents experience the highest rate of pedestrian stops, with whites and Hispanic residents following in that particular order. For example, at age 20, black males are stopped 2.4 times more than their Hispanic counterparts and 5.6 times more than their white counterparts. Black females are stopped 2.2 times more than Hispanic women and 3.5 times more than white women” (Figures and Legewie 2019, 1–2).

  2. The study is titled “What Is Islamophobia? Disentangling Citizens’ Feelings Toward Ethnicity, Religion and Religiosity Using a Survey Experiment.” Description of the figure: “Figure 1 presents the average feeling thermometer scores that respondents gave toward the groups we differentiate in our survey experiment. The average feeling thermometer score across all groups is around 38 (vertical gray line). The key comparison we are interested in is the difference in attitudes toward Christians (white circles) and Muslims (black triangles), and how this difference behaves when the groups are described as more or less religious and whether they are immigrants or native British. The horizontal line segments indicate the difference” (Helbling and Traunmüller 2018, 10).

  3. Description of the figure: “In the left panel of Figure 1, we present the distribution of the treatment variable for the Acemoglu et al. (2017) study where a red (blue) rectangle represents a treated (control) country-year observation. White areas indicate the years when countries did not exist. We observe that many countries stayed either democratic or autocratic throughout years with no regime change. Among those who experienced a regime change, most have transitioned from autocracy to democracy, but some of them have gone back and forth multiple times. When ascertaining the causal effects of democratization, therefore, we may consider the effect of a transition from democracy to autocracy as well as that of a transition from autocracy to democracy. The treatment variation plot suggests that researchers can make a variety of comparisons between the treated and control observations. For example, we can compare the treated and control observations within the same country over time, following the idea of regression models with unit fixed effects (Imai and Kim, 2016). With such an identification strategy, it is important not to compare the observations far from each other to keep the comparison credible. We also need to be careful about potential carryover effects where democratization may have a long term effect, introducing post-treatment bias. Alternatively, researchers can conduct comparison within the same year, which would correspond to the identification strategy of year fixed effects models. In this case, we wish to compare similar countries with one another for the same year and yet we may be concerned about unobserved differences among those countries. The right panel of Figure 1 shows the treatment variation plot for the Scheve and Stasavage (2012) study, in which a treated (control) observation represents the time of interstate war (peace) indicated by a red (blue) rectangle. As in the left plot of the figure, a white area represent the time period when a country did not exist. We observe that most of the treated observations are clustered around the time of two world wars. This implies that although the data set extends from 1816 to 2000, most observations in earlier and recent years would not serve as comparable control observations for the treated country-year observations” (Imai, Kim, and Wang 2018, 7–8).

  4. Description of the figure: “Figure 2(a) was generated using the DGP where the standard multiplicative interaction model is the correct model and therefore the LIE assumption holds. Hence, as Figure 2(a) shows, the conditional effect estimates from the binning estimator and the standard multiplicative interaction model are similar in both datasets. Even with a small sample size (i.e., N=200), the three estimates from the binning estimator, labeled L, M, and H, sit almost right on the estimated linear marginal-effect line from the true standard multiplicative interaction model. Note that the estimates from the binning estimator are only slightly less precise than those from the true multiplicative interaction model, which demonstrates that there is at best a modest cost in terms of decreased efficiency from using this more flexible estimator. We also see from the histogram that the three estimates from the binning estimator are computed at typical low, medium, and high values of X with sufficient common support which is what we expect given the binning based on terciles. Contrast these results with those in Figure 2(b), which were generated using our simulated data in which the true marginal effect of D is nonlinear. In this case, the standard linear model indicates a slightly negative, but overall very weak, interaction effect, whereas the binning estimates reveal that the effect of D is actually strongly conditioned by X: D exerts a positive effect in the low range of X, a negative effect in the mid range of X, and a positive effect again in the high range of X. In the event of such a nonlinear effect, the standard linear model delivers the wrong conclusion. When the estimates from the binning estimator are far off the line or when they are non-monotonic, we have evidence that the LIE assumption does not hold” (Hainmueller, Mummolo, and Xu 2016, 10–11).