Chapter 1 The Chi-squared distribution
Before we go much further, we will look at the sampling distribution used for chi-squared tests: the chi-squared distribution. "Chi" is the Greek letter χ, pronounced "ky" (rhyming with "sky").
The chi-squared distribution is defined by its degrees of freedom (df). So, supposing a random variable $X^2$ follows a chi-squared distribution, we would write this as $X^2 \sim \chi^2_{df}$. If, for example, we had df = 5, we would write $X^2 \sim \chi^2_5$. The figure below shows some example density curves of the $\chi^2$ distribution for varying degrees of freedom:
As we can see, the chi-squared distribution is positively skewed. However, as the degrees of freedom increase, the density curve becomes flatter and more symmetric, and begins to resemble the density of a normal distribution. The chi-squared distribution takes only non-negative values.
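To make the shape of the distribution concrete, the following is a minimal sketch that evaluates the chi-squared density and its skewness for a few degrees of freedom. It uses only the Python standard library. The function names `chi2_pdf` and `chi2_skewness` are illustrative choices, and the formulas are the standard textbook ones (density x^(df/2 - 1) e^(-x/2) / (2^(df/2) Γ(df/2)) and skewness sqrt(8/df)), not something stated earlier in this chapter.

```python
import math


def chi2_pdf(x, df):
    """Density of a chi-squared distribution with `df` degrees of freedom at x."""
    if x < 0:
        # The distribution puts no mass on negative values.
        return 0.0
    if x == 0:
        # Behaviour at zero depends on df: unbounded for df < 2,
        # exactly 0.5 for df == 2, and 0 for df > 2.
        if df < 2:
            return math.inf
        return 0.5 if df == 2 else 0.0
    k = df / 2
    # Work on the log scale to avoid overflow in the gamma function.
    log_density = (k - 1) * math.log(x) - x / 2 - k * math.log(2) - math.lgamma(k)
    return math.exp(log_density)


def chi2_skewness(df):
    """Skewness of a chi-squared distribution (standard result: sqrt(8 / df))."""
    return math.sqrt(8 / df)


# Illustrative degrees of freedom (assumed values, chosen only for this sketch).
for df in (1, 2, 5, 10, 50):
    peak_region = chi2_pdf(max(df - 2, 0.5), df)  # density near the mode
    print(f"df={df:>2}  skewness={chi2_skewness(df):.3f}  density near mode={peak_region:.4f}")
```

The skewness is always positive but shrinks toward zero as df grows, and the density near the mode gets smaller, which matches the flatter, more symmetric curves described above.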
When we carry out a chi-squared test, the observed test statistic, $\chi^2$, is placed within the context of the corresponding sampling distribution, and we calculate the p-value as $p = P(X^2 \geq \chi^2)$. This means that a large test statistic will result in a small p-value (and, if it falls below the chosen significance level, a significant result), whereas a small test statistic will result in a large p-value (and a non-significant result).
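The p-value calculation can be sketched in code as an upper-tail probability. The example below is a minimal, standard-library-only illustration: the function name `chi2_sf`, the tolerance, and the example test statistics are all assumptions made for this sketch. It computes the lower-tail probability with a series expansion of the regularized incomplete gamma function and subtracts it from 1, so it loses accuracy for extremely small p-values and may overflow for very large statistics. In practice a statistics library (for example, `scipy.stats.chi2.sf` in SciPy) would normally be used instead.

```python
import math


def chi2_sf(stat, df, tol=1e-12, max_terms=10_000):
    """Approximate P(X^2 >= stat) for X^2 chi-squared with `df` degrees of freedom."""
    if stat <= 0:
        # Every possible value is at least 0, so the upper tail is everything.
        return 1.0
    a, x = df / 2, stat / 2
    # Series for the regularized lower incomplete gamma function:
    #   P(a, x) = x^a * e^(-x) / Gamma(a + 1) * sum_{n>=0} x^n / ((a+1)...(a+n))
    term = 1.0
    total = 1.0
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    cdf = math.exp(a * math.log(x) - x - math.lgamma(a + 1)) * total
    # Guard against tiny negative values caused by rounding.
    return max(0.0, 1.0 - cdf)


# Hypothetical observed test statistics, all with df = 5 (illustrative only).
for observed in (2.0, 11.07, 20.0):
    print(f"chi-squared = {observed:>5}, df = 5, p-value = {chi2_sf(observed, 5):.4f}")
```

Running the loop shows the pattern described above: as the observed statistic grows, the p-value shrinks.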