Recall the concept of quantiles we learned about earlier. Quartiles are special quantiles that divide the ordered data into four quarters, so that each quarter contains 25% of the observations. Quartiles 1, 2, and 3, or Q1, Q2, and Q3, can be defined as follows:
- Q1 is the 25% quantile
- Q2 is the 50% quantile (this is in fact also the median)
- Q3 is the 75% quantile
For example, for the Height variable from the survey data set, we have that Q1 = 165, Q2 = 171, and Q3 = 180. We could therefore make the following interpretations about this sample of students:
- 25% of students are shorter than 165cm and 75% of students are taller than 165cm
- 50% of students are shorter than 171cm and 50% of students are taller than 171cm
- 75% of students are shorter than 180cm and 25% of students are taller than 180cm.
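As a rough illustration of how quartiles might be computed in practice, here is a minimal Python sketch. The `heights` list is hypothetical (it is not the survey data) and was chosen only so that the quartiles match the values quoted above. The sketch assumes the "inclusive" method of Python's `statistics.quantiles`; other software and other methods can give slightly different quartiles for the same data.

```python
import statistics

# Hypothetical heights in cm (NOT the survey data), chosen so that the
# quartiles come out as Q1 = 165, Q2 = 171 and Q3 = 180.
heights = [150, 160, 165, 168, 171, 175, 180, 190, 200]

# n=4 asks for the three cut points that split the data into quarters.
# method="inclusive" treats the sample minimum and maximum as the
# 0% and 100% quantiles (an assumption; other conventions exist).
q1, q2, q3 = statistics.quantiles(heights, n=4, method="inclusive")

print(q1, q2, q3)                        # 165.0 171.0 180.0
print(q2 == statistics.median(heights))  # True: Q2 is the median
```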
The position of the quartiles can give us some insight into how spread out the data is, and also the shape of the data. We will be considering measures of shape in the next section.
The inter-quartile range (IQR) is the distance spanned by the middle 50% of the data. In other words, it is the distance between the 25% quantile (Q1) and the 75% quantile (Q3), and can be calculated as Q3 - Q1. In our example, we have that
\[IQR = Q3 - Q1 = 180 - 165 = 15.\]
In a similar way to the variance and standard deviation, the width of the IQR gives us a good indication as to how spread out our data is.
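The IQR calculation can be sketched in Python as follows. The helper name `iqr` and the `heights` list are both hypothetical (the list is not the survey data; it was picked so that Q1 = 165 and Q3 = 180), and the "inclusive" quartile method is an assumption, since different conventions can give slightly different answers.

```python
import statistics


def iqr(values):
    """Return Q3 - Q1, using the 'inclusive' quartile method (an assumption)."""
    q1, _q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


# Hypothetical heights in cm (NOT the survey data).
heights = [150, 160, 165, 168, 171, 175, 180, 190, 200]

print(iqr(heights))  # 15.0, i.e. 180 - 165
```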
The range of the data is simply the distance spanned across the whole data set. In other words, it is the difference between the maximum and minimum values. Considering the Height variable again, we have that the minimum and maximum heights are 150cm and 200cm respectively, so that the range is 200 - 150 = 50cm. One way of interpreting this value would be to say that the maximum difference in height between any two students in the sample is 50cm.
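A minimal Python sketch of the range calculation is given below. The `heights` list is hypothetical (it is not the survey data); its smallest value is 150 and its largest is 200, so it gives the same range as the example.

```python
# Hypothetical heights in cm (NOT the survey data).
heights = [150, 160, 165, 168, 171, 175, 180, 190, 200]

# The range is the largest value minus the smallest value.
data_range = max(heights) - min(heights)

print(data_range)  # 50
```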