3.5 Single population variance

  • We can use the chi-square distribution to test hypotheses about the variance of a population

  • For example, the quality of a production process is measured not only by how closely the machine matches the target, but also by the variability of the process

  • Here are three important features of the \(\chi^2-\)distribution:

  1. it is defined only for non-negative values
  2. it is not symmetrical about its mean; instead, it is positively skewed
  3. it is uniquely determined by a single parameter known as the degrees of freedom, which is also equal to its mean
  • With a larger number of degrees of freedom, the distribution becomes more symmetrical and begins to resemble the normal distribution
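
The effect of the degrees of freedom on the shape can be illustrated numerically. The sketch below relies on the standard moments of the \(\chi^2-\)distribution (mean \(k\), variance \(2k\), skewness \(\sqrt{8/k}\) for \(k\) degrees of freedom); the function name `chi2_moments` is only an illustrative choice, not part of any library.

```python
import math

def chi2_moments(df):
    """Return (mean, variance, skewness) of a chi-square distribution with df degrees of freedom."""
    mean = df                     # the mean equals the degrees of freedom
    variance = 2 * df             # the variance is twice the degrees of freedom
    skewness = math.sqrt(8 / df)  # positive for every df, shrinking toward 0 as df grows
    return mean, variance, skewness

# Illustrative degrees of freedom (arbitrary choices)
for df in (2, 8, 32, 128):
    mean, variance, skewness = chi2_moments(df)
    print(f"df={df:>3}  mean={mean:>3}  variance={variance:>3}  skewness={skewness:.3f}")
```

The skewness stays positive but approaches zero as the degrees of freedom increase, which is consistent with the distribution becoming more symmetrical.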

  • The null and alternative hypotheses are stated in terms of the population variance \(\sigma^2\) assuming that the underlying distribution is normal

\[\begin{equation} H_0:~~\sigma^2=\sigma_0^2 \tag{3.15} \end{equation}\]

  • The hypothesized population variance \(\sigma_0^2\) is the value assumed to be true under the null hypothesis

  • Three alternative hypotheses are possible:

\[\begin{align} H_1:&~~\sigma^2 \ne \sigma_0^2 &\text{two-tailed test} \\ \\ H_1:&~~\sigma^2 < \sigma_0^2 &\text{left-tailed test} \\ \\ H_1:&~~\sigma^2 > \sigma_0^2 &\text{right-tailed test} \\ \tag{3.16} \end{align}\]

  • The test statistic follows a \(\chi^2-\)distribution with \((n-1)\) degrees of freedom:

\[\begin{align} \chi^2&=\frac{(n-1)S^2}{\sigma_0^2} \\ \\ S^2&=\frac{\displaystyle\sum_{i=1}^n(x_i-\bar{x})^2}{n-1} \\ \tag{3.17} \end{align}\]

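  • The test statistic (3.17) compares the sample variance \(S^2\) with the hypothesized population variance \(\sigma_0^2\) (the specified value). The statistic cannot be negative, so evidence against the null hypothesis comes from values that are either very close to zero (\(S^2\) much smaller than \(\sigma_0^2\)) or very large (\(S^2\) much larger than \(\sigma_0^2\)), depending on the alternative hypothesis. If the statistic falls in the corresponding tail of the \(\chi^2-\)distribution with \((n-1)\) degrees of freedom, the null hypothesis is rejected. Otherwise, the null hypothesis is not rejected.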
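
The test can be sketched in a few lines of Python using only the standard library. The helper names `chi2_sf` and `variance_test` are hypothetical. The tail probability is computed for integer degrees of freedom with the recurrence for the regularized upper incomplete gamma function. The two-tailed p-value is taken as twice the smaller tail probability, which is one common convention and an assumption here rather than something fixed by the text.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)              # starting value Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))   # starting value Q(1/2, z)
    # Recurrence: Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while a < df / 2.0:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def variance_test(s2, sigma0_sq, n, alternative="two-sided"):
    """Return (test statistic, p-value) for H0: sigma^2 = sigma0_sq, as in (3.17)."""
    df = n - 1
    stat = df * s2 / sigma0_sq
    upper = chi2_sf(stat, df)    # right-tail probability
    lower = 1.0 - upper          # left-tail probability
    if alternative == "less":
        p_value = lower
    elif alternative == "greater":
        p_value = upper
    else:
        # Assumed convention: double the smaller tail, capped at 1
        p_value = min(1.0, 2.0 * min(lower, upper))
    return stat, p_value

# Hypothetical numbers: n = 16, sample variance 9, hypothesized variance 4
stat, p_value = variance_test(s2=9.0, sigma0_sq=4.0, n=16, alternative="greater")
print(f"chi-square = {stat:.2f}, p-value = {p_value:.4f}")
```

The null hypothesis is rejected when the p-value is below the chosen significance level. Here `chi2_sf` plays the role of Excel's right-tail function, and `1 - chi2_sf` gives the left-tail (cumulative) probability.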

Example 3.9 A post office finds that the standard deviation of waiting times for customers on Friday afternoon is 7.2 minutes. The post office experiments with new software to reduce the waiting time, and finds that for a random sample of 25 customers, the waiting times have a standard deviation of 3.5 minutes on a Friday afternoon. Test the hypothesis that the new software reduces the variation among waiting times. Compute the p-value in Excel using the function =CHISQ.DIST(chi;df;TRUE).
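
One possible way to set up Example 3.9 is as a left-tailed test, since the claim is that the variation has decreased. The 5 percent significance level used below is an assumption, because the example does not state one.

\[\begin{align} H_0:&~~\sigma^2 = 7.2^2 \qquad H_1:~~\sigma^2 < 7.2^2 \\ \\ \chi^2&=\frac{(25-1)\cdot 3.5^2}{7.2^2}=\frac{294}{51.84}\approx 5.67 \end{align}\]

With 24 degrees of freedom, the lower 5 percent critical value from standard \(\chi^2\) tables is approximately 13.85. Since 5.67 is smaller than 13.85, the null hypothesis would be rejected at that level. The left-tail p-value, =CHISQ.DIST(5.67;24;TRUE), is well below 0.001, which supports the same conclusion.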

Example 3.10 Suppose that an investor chooses a sample of 30 stocks from his portfolio. He calculates the standard deviation of the returns on these stocks (that is, their volatility) to be 23 percent on an annual basis. The investor wants to know whether the volatility of the entire portfolio is more than 20 percent on an annual basis at the 5 percent level of significance. Compute the p-value in Excel using function =CHISQ.DIST.RT(chi;df).
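
Example 3.10 can be set up as a right-tailed test at the stated 5 percent level, working in percentage units (a sketch of one reasonable formulation):

\[\begin{align} H_0:&~~\sigma^2 = 20^2 \qquad H_1:~~\sigma^2 > 20^2 \\ \\ \chi^2&=\frac{(30-1)\cdot 23^2}{20^2}=\frac{15341}{400}\approx 38.35 \end{align}\]

With 29 degrees of freedom, the upper 5 percent critical value from standard \(\chi^2\) tables is approximately 42.56. Since 38.35 is smaller than 42.56, the null hypothesis is not rejected, so the sample does not provide sufficient evidence that the portfolio volatility exceeds 20 percent. The right-tail p-value, =CHISQ.DIST.RT(38.35;29), is above 0.10, in line with this decision.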