5.2 Boxplots

A boxplot is a visual way to present much of the key information about a variable. More specifically, for a numerical variable, it displays the minimum, 25% quantile (Q1), median, 75% quantile (Q3), and maximum. If there are outliers, these are also displayed in the boxplot as individual points.
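The five values listed above can be computed directly. The following is a minimal sketch using only Python's standard library; the function name `five_number_summary` and the data are made up for illustration, and the choice of `method="inclusive"` is an assumption, since several quantile conventions exist and they can give slightly different values for Q1 and Q3.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # n=4 splits the data into quarters and returns the three cut points.
    # method="inclusive" is one convention among several (an assumption here).
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


data = [7, 1, 9, 3, 5, 2, 8, 4, 6]  # made-up example values
summary = five_number_summary(data)
print(summary)
```

These five numbers are exactly what a boxplot without outliers draws: the two whisker ends, the two box edges, and the line inside the box.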

The figure below displays two example boxplots: one with no outliers, and one with two outliers. In the first plot, the box represents the interquartile range (IQR), while the vertical line inside the box represents the median. The two vertical lines that make up the edges of the box mark Q1 and Q3, so the box contains the middle 50% of the data. The lines extending out from either side of the box (the whiskers) cover the lower and upper 25% of the data. The second boxplot can be interpreted the same way and additionally has two outliers present, each represented by a dot towards the right-hand side of the plot.
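The text does not say how a point comes to be classed as an outlier. A common convention, assumed here rather than taken from this section, is to flag any value more than 1.5 times the IQR below Q1 or above Q3, and to draw each whisker out to the most extreme value that is not flagged. The sketch below applies that rule to made-up data; `boxplot_parts` and the multiplier `k` are hypothetical names.

```python
from statistics import quantiles


def boxplot_parts(values, k=1.5):
    """Split data into the pieces a boxplot draws, using the k * IQR rule."""
    ordered = sorted(values)
    q1, median, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    # Values beyond these fences are treated as outliers (assumed convention).
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    inside = [v for v in ordered if lower_fence <= v <= upper_fence]
    outliers = [v for v in ordered if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],    # most extreme non-outlier on the left
        "whisker_high": inside[-1],  # most extreme non-outlier on the right
        "outliers": outliers,
    }


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30, 40]  # made-up values, two large ones
parts = boxplot_parts(data)
print(parts)
```

With these values the two largest observations fall above the upper fence, so they would be drawn as separate dots to the right of the whisker, as in the second example boxplot.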

Another useful property of boxplots is that they can give us an indication of how skewed the data are. For example, the figure below shows four histograms along with their corresponding boxplots for four different sets of data. The first two histograms and corresponding boxplots show relatively symmetric data. The last two pairs show positively and negatively skewed data, respectively. In a boxplot, positive skew typically appears as a longer upper whisker with the median sitting closer to Q1, and negative skew as the reverse.
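One way to put a number on what the boxplot shows is to compare how far the median sits from each edge of the box. The sketch below uses the quartile (Bowley) skewness coefficient, (Q3 + Q1 - 2 × median) / (Q3 - Q1), which is not introduced in this section and is used here only as an illustration. The function name and all three data sets are made up.

```python
from statistics import quantiles


def quartile_skewness(values):
    """Quartile (Bowley) skewness: 0 if the median is centred in the box."""
    q1, median, q3 = quantiles(sorted(values), n=4, method="inclusive")
    # Positive when the median is closer to Q1, negative when closer to Q3.
    return (q3 + q1 - 2 * median) / (q3 - q1)


symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]           # made-up data
right_skewed = [1, 2, 2, 3, 3, 4, 6, 9, 15]       # long tail of large values
left_skewed = [1, 7, 10, 12, 13, 13, 14, 14, 15]  # long tail of small values

for name, sample in [("symmetric", symmetric),
                     ("right skewed", right_skewed),
                     ("left skewed", left_skewed)]:
    print(name, quartile_skewness(sample))
```

The coefficient only looks at the box, not the whiskers or outliers, so it is a rough guide rather than a replacement for looking at the plot itself.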