Chapter 13 Malicious Graphics
For the most part, we have looked at examples of good and bad graphs where we assumed the creators were attempting to be truthful about the evidence. Next we will examine cases where people deliberately misrepresent the data for their own purposes.
13.1 Dual Y-Axes
In general, dual y-axes are hard to do correctly, and can be easily used maliciously.
Lisa Charlotte Rost has a great example of how the arbitrary scaling of each axis can result in unintentionally confusing graphs.
In particular, the relative percent change, or rate of change, can no longer be compared between the two series, because the same slope on screen means something different on each axis. Malicious tweaking of the scales would allow us to show whatever we want.
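To make the scaling problem concrete, here is a minimal Python sketch. The two series (`revenue` and `users`), their values, and the axis limits are all made up for illustration. The function `apparent_rise` computes how far a line climbs on screen, as a fraction of the plot height, for a given choice of axis limits. The point is that the same data can be made to look steep or flat, while the percent change stays fixed.

```python
def screen_position(value, axis_min, axis_max):
    """Fraction of the plot height (0 = bottom, 1 = top) where a value is drawn."""
    return (value - axis_min) / (axis_max - axis_min)


def apparent_rise(series, axis_min, axis_max):
    """How far the line climbs on screen between its first and last point."""
    first = screen_position(series[0], axis_min, axis_max)
    last = screen_position(series[-1], axis_min, axis_max)
    return last - first


def percent_change(series):
    """Relative change from first to last value; independent of any axis."""
    return 100.0 * (series[-1] - series[0]) / series[0]


# Hypothetical data: revenue grows 10%, users grow 100%.
revenue = [100.0, 102.0, 105.0, 110.0]  # plotted against the left axis
users = [10.0, 13.0, 16.0, 20.0]        # plotted against the right axis

pct_revenue = percent_change(revenue)
pct_users = percent_change(users)

# Axis choice 1: zoom the left axis in on revenue, stretch the right axis.
rise_revenue_zoomed = apparent_rise(revenue, 100.0, 110.0)
rise_users_wide = apparent_rise(users, 0.0, 100.0)

# Axis choice 2: the opposite, stretch the left axis and zoom in on users.
rise_revenue_wide = apparent_rise(revenue, 0.0, 1000.0)
rise_users_zoomed = apparent_rise(users, 10.0, 20.0)

print(f"percent change: revenue {pct_revenue:.0f}%, users {pct_users:.0f}%")
print(f"choice 1 rise on screen: revenue {rise_revenue_zoomed:.2f}, users {rise_users_wide:.2f}")
print(f"choice 2 rise on screen: revenue {rise_revenue_wide:.2f}, users {rise_users_zoomed:.2f}")
```

Under the first choice of limits, revenue appears to climb far faster than users; under the second, the picture reverses. Neither plot changes the underlying percent changes, which is why comparing slopes across two independently scaled axes is unreliable.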
Example. Market share of a particular cookie manufacturer.
13.2 Examples of bad graphs
- Example Here we examine another case where the same quantity is shown using two different EPT scales and they do not agree.
- Example This is an Obama administration graphic where the graphic design considerations were put before the data visualization considerations.
- Example This is an example of trying to take data and make it “look” like a chart.
- Example Another example of taking raw percentages and pasting them onto a pie chart without making the area of the slices correspond.
- Example This graphic shows cricket scores for various teams. To indicate each team's city, the designers added the most iconic building from that city. Does the area/height/color of the building indicate anything?
- Example Here an analyst compares the stock prices of Tesla and Netflix, but cherry picks certain time points to try to conflate the recovery of Netflix with his prediction of the future of Tesla.
- Example In this graph reporting the 2018 election results, the labels are extremely confusing.
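For the pie chart case in the list above, one quick sanity check is to compare the angle each slice actually spans with the angle its label implies (a slice labeled p% should span 360 × p / 100 degrees, and slice area is proportional to that angle). The sketch below is only an illustration: the labels, the drawn angles, and the helper names `expected_angles` and `pie_problems` are hypothetical and are not taken from any of the examples.

```python
def expected_angles(percentages):
    """Angle in degrees each slice should span if it matches its label."""
    return [360.0 * p / 100.0 for p in percentages]


def pie_problems(percentages, drawn_angles, tolerance_deg=1.0):
    """Return a description of every inconsistency found in the chart."""
    problems = []
    total = sum(percentages)
    if abs(total - 100.0) > 0.5:
        problems.append(f"labels sum to {total:g}%, not 100%")
    wanted = expected_angles(percentages)
    for i, (want, got) in enumerate(zip(wanted, drawn_angles)):
        if abs(want - got) > tolerance_deg:
            problems.append(
                f"slice {i}: labeled {percentages[i]:g}% "
                f"(should span {want:g} degrees) but drawn at {got:g} degrees"
            )
    return problems


# Hypothetical chart: three very different percentages pasted onto
# three equal-sized slices.
labels = [60.0, 30.0, 10.0]
drawn = [120.0, 120.0, 120.0]

problems = pie_problems(labels, drawn)
for line in problems:
    print(line)
```

A chart whose slices were drawn from the labels themselves would produce an empty list of problems.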
13.3 Cherry Picking Issues
Gish Gallop: A debating technique that attempts to overwhelm the listener with as many arguments as possible without any regard for accuracy or coherency.
Alberto Brandolini’s Law of Bullshit Asymmetry: The amount of energy necessary to refute bullshit is an order of magnitude larger than to produce it.
Cherry Picking: Selectively looking at data that supports your position while ignoring evidence that is counter to it.
Example This simplistic graph, which shows the change in the number of abortions vs. cancer screenings done at Planned Parenthood, disregards the increases in other services.
For another example of all of these, check out this opinion article published by the Washington Examiner. Notice that you can click on the highlights to see commentary by climate scientists who took the time to explain the best scientific consensus on each point. It is exhausting.
- Climate change can’t be true because:
  - In 2019, Ted Cruz claims there hasn’t been any global warming since 1998
    - This claim is based on satellite data from 1997-2012. Those years are chosen very carefully. He further considers only satellite data, which is widely considered less accurate than ground station data. Broadly, scientists use satellite data only when there aren’t reliable ground measurements.
    - After the satellite data has been aligned with the more accurate ground measurements, the “pause” he refers to disappears even with the highly optimized start/stop points of his pause interval.
    - On the wider time scale, there is a clear upward trend.
- Climate change can’t be true because:
  - Regions in Georgia have cooled since the early parts of the 1900s!
    - Yes, but the rest of the country has warmed.
    - Also, 1934 was the 2nd hottest year in the US, so comparing present day to the hottest part of the country in that particularly hot year results in seeing no increase.
Be alert whenever an argument rests on a specific interval or a specific measurement. Ask questions like:
- Does the time interval presented represent all the data, or could the author have expanded/contracted the time window?
- Are there other relevant sources of data we aren’t talking about?
- Is the effect consistent across sources of data?
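The first question can be explored numerically. The sketch below uses a synthetic series, not real temperature data: a steady upward trend of 0.02 units per year with a single unusually warm year in 1998. It fits an ordinary least-squares slope to the full record, then searches every window of at least ten years for the one with the flattest trend. The helper names `ols_slope` and `flattest_window` and all of the numbers are assumptions made for illustration.

```python
def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


def flattest_window(xs, ys, min_len=10):
    """Search all contiguous windows of at least min_len points and
    return (slope, first_x, last_x) for the window with the lowest slope."""
    best = None
    for start in range(len(xs)):
        for end in range(start + min_len, len(xs) + 1):
            slope = ols_slope(xs[start:end], ys[start:end])
            if best is None or slope < best[0]:
                best = (slope, xs[start], xs[end - 1])
    return best


# Synthetic data (an assumption, not measurements): a linear trend
# plus one spike in 1998.
years = list(range(1980, 2020))
anomaly = [0.02 * (y - 1980) + (0.5 if y == 1998 else 0.0) for y in years]

full_slope = ols_slope(years, anomaly)
cherry_slope, cherry_start, cherry_end = flattest_window(years, anomaly)

print(f"full record {years[0]}-{years[-1]}: slope {full_slope:+.4f} per year")
print(f"cherry-picked {cherry_start}-{cherry_end}: slope {cherry_slope:+.4f} per year")
```

In this synthetic series the search lands on a window that starts at the spike year, and the fitted slope over that window is negative even though the full record trends clearly upward. Starting an interval at an unusually high point is exactly the kind of choice the questions above are meant to catch.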