3.2 Two population means

In addition to testing claims about the mean of a single population, hypothesis testing can be used to test whether the means of two different populations are equal.
The procedure for testing hypotheses about two population means is similar to the procedure for a single population mean. The null hypothesis states that there is no difference between two population means $\mu_1$ and $\mu_2$ :

$\begin{equation} H_0:~~\mu_1=\mu_2 \tag{3.6} \end{equation}$

The alternative hypothesis can take one of three forms:

$\begin{align} H_1:&~~\mu_1 \ne \mu_2 &\text{two-tailed test} \\ \\ H_1:&~~\mu_1 < \mu_2 &\text{left-tailed test} \\ \\ H_1:&~~\mu_1 > \mu_2 &\text{right-tailed test} \\ \tag{3.7} \end{align}$

Computing the test statistic usually relies on the following assumptions:

Both populations are normally distributed
The two samples are drawn independently, one from each population
The variances of the two populations are equal, although unknown

When these assumptions hold and the null hypothesis is true, the test statistic follows a Student’s t-distribution with $(n_1+n_2-2)$ degrees of freedom:

$\begin{align} t&=\frac{\bar{x}_1-\bar{x}_2}{\sqrt{S_p^2 \bigg(\frac{1}{n_1}+\frac{1}{n_2} \bigg)}} \\ \\ S_p^2 &=\frac{S_1^2 (n_1-1)+S_2^2 (n_2-1)}{n_1+n_2-2} \\ \tag{3.8} \end{align}$

Here is what each term means:

$\begin{align} \bar{x}_1&~\text{is the mean of the first sample (chosen from population 1)} \\ \\ \bar{x}_2&~\text{is the mean of the second sample (chosen from population 2)} \\ \\ n_1&~\text{is the size of the first sample} \\ \\ n_2&~\text{is the size of the second sample} \\ \\ S_1^2&~\text{is the variance of the first sample} \\ \\ S_2^2&~\text{is the variance of the second sample} \\ \\ S_p^2&~\text{is the pooled (common) variance of both samples} \end{align}$
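The calculation in (3.8) can be sketched in a few lines of Python. This is only an illustration under the equal-variance assumption stated above. The function name `pooled_t_statistic` and the two small data sets are hypothetical and do not come from the text.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(sample_1, sample_2):
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    n_1, n_2 = len(sample_1), len(sample_2)
    x_bar_1, x_bar_2 = mean(sample_1), mean(sample_2)
    # statistics.variance divides by n - 1, i.e. it is the sample variance S^2
    s2_1, s2_2 = variance(sample_1), variance(sample_2)
    df = n_1 + n_2 - 2
    # Pooled variance: the sample variances weighted by their degrees of freedom
    s2_p = (s2_1 * (n_1 - 1) + s2_2 * (n_2 - 1)) / df
    t = (x_bar_1 - x_bar_2) / sqrt(s2_p * (1 / n_1 + 1 / n_2))
    return t, df


# Hypothetical data, made up purely for illustration
sample_a = [1.0, 2.0, 3.0, 4.0, 5.0]
sample_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t, df = pooled_t_statistic(sample_a, sample_b)
print(f"t = {t:.4f}, df = {df}")
```

The sign of $t$ depends on which sample is labelled as the first one, so the labelling has to match the direction of the alternative hypothesis in a one-tailed test.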

Example 3.3 A company produces coconut milk. It runs both a day shift and a night shift and would like to know whether the two shifts are equally efficient in processing coconuts. A study is done by sampling $9$ shifts during the day and $16$ shifts during the night. The number of hours required to process $100$ pounds of coconuts is presented in Table 3.1. Is there a statistically significant difference between the two shifts in the average number of hours needed to process $100$ pounds of coconuts? The significance level is $5\%$ .

TABLE 3.1: Time required for processing of 100 pounds of coconuts
Shift	Sample size	Sample mean (hours)	Sample standard deviation (hours)
Day	9	2.1	0.85
Night	16	3.2	0.96
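One possible way to check Example 3.3 numerically is sketched below, working directly from the summary statistics in Table 3.1. The sketch assumes the day shift is sample 1 and the night shift is sample 2, and it assumes equal population variances as required by (3.8). The two-tailed critical value $2.069$ for $23$ degrees of freedom is taken from a standard t-table, not from the text.

```python
from math import sqrt

# Summary statistics from Table 3.1 (assumed order: day = sample 1, night = sample 2)
n_1, x_bar_1, s_1 = 9, 2.1, 0.85
n_2, x_bar_2, s_2 = 16, 3.2, 0.96

df = n_1 + n_2 - 2

# Pooled variance and test statistic, as in (3.8); the table gives standard
# deviations, so they are squared to obtain the sample variances
s2_p = (s_1**2 * (n_1 - 1) + s_2**2 * (n_2 - 1)) / df
t = (x_bar_1 - x_bar_2) / sqrt(s2_p * (1 / n_1 + 1 / n_2))

# Two-tailed critical value for alpha = 0.05 and 23 degrees of freedom (t-table)
t_critical = 2.069
reject_h0 = abs(t) > t_critical

print(f"pooled variance = {s2_p:.4f}")
print(f"t = {t:.4f} with {df} degrees of freedom")
print(f"reject H0 at the 5% level: {reject_h0}")
```

With these inputs the pooled variance is about $0.852$ and the statistic is about $-2.86$. Since $|t|$ exceeds $2.069$, the null hypothesis of equal means would be rejected at the $5\%$ level, which suggests that the two shifts differ in average processing time.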

Example 3.4 At the $1\%$ significance level, test the hypothesis that the mean annual revenue of the companies listed on the stock exchange is greater than the mean annual revenue of the companies that are not listed. Use the data from the Excel file. Obtain the p-value in Excel using the function =T.TEST(). Perform the same test using the Data Analysis ToolPak.