Chapter 6 Two Samples Hypothesis Testing (Section on Mar 9th)
Testing the Mean and Variance of Two Samples
Generally, this kind of problem gives you two groups of randomly sampled data from two normally distributed populations. The task is to test whether the two normal populations have the same mean and variance. The procedure is to first test whether the variances are equal (Chapter 8-6 of the textbook, starting on page 414), then, based on that result, test the means (Chapter 8-3 of the textbook, starting on page 389).
Definition 6.2 (Test Variance of Two Samples) Let $s_1^2$ denote the larger of the two sample variances, $n_1$ the corresponding sample size, and $\sigma_1^2$ the corresponding population variance. Let $s_2^2$, $n_2$, $\sigma_2^2$ denote the sample variance, sample size and population variance for the other sample. The two-sample variance test is the hypothesis test of whether $\sigma_1^2$ equals $\sigma_2^2$. The null and alternative hypotheses are
$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \neq \sigma_2^2$$
The test statistic is
$$F = \frac{s_1^2}{s_2^2}$$
Since the test statistic follows an F distribution under the null hypothesis, the critical value is found from the F distribution table. To get the critical value, we need the following three values:
Significance level $\alpha$: usually specified in the problem.
Numerator degrees of freedom $df_1$: computed as $n_1 - 1$.
Denominator degrees of freedom $df_2$: computed as $n_2 - 1$.

FIGURE 6.1: Two samples variance hypothesis testing reject region
Some tips for this test:
Remember that in the calculation of F, the larger sample variance goes in the numerator, so the F value is always at least 1. Therefore, in a two-tailed test it is not necessary to compare your F value with the lower critical value $F_{\alpha/2, df_1, df_2}$; compare it only with the upper critical value $F_{1-\alpha/2, df_1, df_2}$.
To get the correct critical value, make sure you use the correct degrees of freedom by correctly identifying $n_1$ and $n_2$. You also need to judge whether this is a two-tailed or a one-tailed test.
The F distribution table can be found as Table A-5 in your textbook or online.
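The first tip above can be sketched in a few lines of Python. This is only an illustration, not the textbook's procedure: the function name `f_statistic` is hypothetical, the data are the scores from Exercise 6.1, and the critical value 6.85 is read from the F table rather than computed.

```python
from statistics import variance

def f_statistic(sample_a, sample_b):
    """Return (F, df1, df2) with the larger sample variance in the numerator."""
    var_a, var_b = variance(sample_a), variance(sample_b)  # n - 1 divisor
    if var_a >= var_b:
        larger, smaller = (var_a, len(sample_a)), (var_b, len(sample_b))
    else:
        larger, smaller = (var_b, len(sample_b)), (var_a, len(sample_a))
    # df1 belongs to the numerator sample, df2 to the denominator sample
    return larger[0] / smaller[0], larger[1] - 1, smaller[1] - 1

# Scores from Exercise 6.1 of these notes.
male = [81, 84, 89, 79, 82, 90]
female = [85, 89, 92, 94, 81, 78, 89, 86]

f_value, df1, df2 = f_statistic(male, female)
upper_critical = 6.85  # F_{0.975, 7, 5}, taken from the F table (assumed input)
print(f"F = {f_value:.2f}, df = ({df1}, {df2}), reject H0: {f_value > upper_critical}")
```

Because the function swaps the samples when needed, the order in which the two groups are passed does not matter.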
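For the two-sample mean test, there are two procedures, chosen according to the result of the variance test. If equal variances is not rejected, use the pooled test of Definition 6.3. If it is rejected, use the unpooled statistic $t = (\bar{x}_1 - \bar{x}_2)\big/\sqrt{s_1^2/n_1 + s_2^2/n_2}$, as in Exercise 6.2.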
Definition 6.3 (Test Mean of Two Samples, When the Variances Can Be Assumed Equal) Let $\bar{x}_i, n_i, \mu_i, s_i^2$, $i = 1, 2$, denote the sample mean, sample size, population mean and sample variance of the two groups. In this test, we usually care about whether $\mu_1$ equals $\mu_2$, so the two hypotheses are
$$H_0: \mu_1 = \mu_2 \qquad H_1: \mu_1 \neq \mu_2$$
The test statistic is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}}$$
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)}$$
Since the test statistic follows a t distribution under the null hypothesis, the critical value is found from the t distribution table. To get the critical value, we need the following two values:
Significance level $\alpha$: usually specified in the problem.
Degrees of freedom: calculated as $df = n_1 + n_2 - 2$.

FIGURE 6.2: Procedure of finding p-value
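When no t table is at hand, the two-tailed p-value can be approximated numerically. The sketch below is one possible way to do it, not the textbook's method: it integrates the Student t density with Simpson's rule, and the names `t_pdf` and `two_sided_p_value` are hypothetical. The example call uses the statistic $t = -0.55$ with 6 degrees of freedom from Exercise 6.4.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=2000):
    """Two-tailed p-value 2 * P(T_df > |t_stat|); steps must be even."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps
    # Simpson's rule on [0, |t|]: endpoint weights 1, odd nodes 4, even nodes 2
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| < T < |t|) by symmetry
    return 1 - central_area

print(f"{two_sided_p_value(-0.55, 6):.2f}")
```

Integrating only over $[0, |t|]$ and doubling relies on the symmetry of the t distribution, which keeps the integration range finite.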
Requirements for using these tests
The two samples are independent.
Both samples are simple random samples.
At least one of these conditions is satisfied: the two sample sizes are both large (with $n_1 > 30$ and $n_2 > 30$), or both samples come from populations having normal distributions.
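The two mean-test statistics can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `pooled_t` and `unpooled_t` are hypothetical, the pooled version follows Definition 6.3, and the unpooled version uses the statistic that the solution of Exercise 6.2 applies. The data are the samples from Exercises 6.1 and 6.2.

```python
import math
from statistics import mean, variance

def pooled_t(sample_1, sample_2):
    """Pooled two-sample t statistic and its degrees of freedom (Definition 6.3)."""
    n1, n2 = len(sample_1), len(sample_2)
    s2_pooled = ((n1 - 1) * variance(sample_1)
                 + (n2 - 1) * variance(sample_2)) / (n1 + n2 - 2)
    t_stat = (mean(sample_1) - mean(sample_2)) / math.sqrt(s2_pooled / n1 + s2_pooled / n2)
    return t_stat, n1 + n2 - 2

def unpooled_t(sample_1, sample_2):
    """t statistic without pooling, for use when equal variances was rejected."""
    n1, n2 = len(sample_1), len(sample_2)
    std_error = math.sqrt(variance(sample_1) / n1 + variance(sample_2) / n2)
    return (mean(sample_1) - mean(sample_2)) / std_error

# Samples from Exercise 6.1 (scores) and Exercise 6.2 (heights).
female = [85, 89, 92, 94, 81, 78, 89, 86]
male = [81, 84, 89, 79, 82, 90]
country_a = [163, 160, 159, 159, 161]
country_b = [149, 182, 145, 143, 184, 185, 140]

t_pooled, df = pooled_t(female, male)
t_unpooled = unpooled_t(country_b, country_a)
print(f"pooled t = {t_pooled:.2f} on {df} df; unpooled t = {t_unpooled:.2f}")
```

The code works from the raw data, so its results may differ in the last digit from a hand calculation that rounds the means and variances first.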
Exercise 6.1 The test scores of 8 randomly selected female students and 6 randomly selected male students are given below:
scores for male students: 81,84,89,79,82,90
scores for female students: 85,89,92,94,81,78,89,86.
(a) Assuming the scores of females and males follow $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$ distributions respectively, test the hypothesis $H_0: \sigma_1 = \sigma_2$ vs. $H_1: \sigma_1 \neq \sigma_2$. Would you reject $H_0$ at the 5% level of significance?
(b) Test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$ and provide the p-value. Would you reject $H_0$ at the 5% level of significance?
Proof. (a) We do this test step by step, as the textbook does.
Step 0: We compute the sample variance and sample size for each group, and denote the group with the larger variance as group 1. This gives $s_1^2 = 29.07$, $n_1 = 8$ (female students) and $s_2^2 = 19.77$, $n_2 = 6$ (male students).
Step 1: The claim of equal standard deviations is equivalent to a claim of equal variances, which we express symbolically as $\sigma_1^2 = \sigma_2^2$.
Step 2: If the original claim is false, then $\sigma_1^2 \neq \sigma_2^2$.
Step 3: Because the null hypothesis is the statement of equality and the alternative hypothesis cannot contain equality, we have
$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 \neq \sigma_2^2$$
Step 4: The significance level is α=0.05.
Step 5: Because this test involves two population variances, we use the F distribution.
Step 6: The test statistic is
$$F = \frac{s_1^2}{s_2^2} = \frac{29.07}{19.77} = 1.47$$
For the critical value, we also need the degrees of freedom, which are $df_1 = 7$ and $df_2 = 5$. Thus, the critical value is $F_{0.975,7,5} = 6.85$.
Step 7: Since $F = 1.47 < 6.85$, we fail to reject the null hypothesis: there is not sufficient evidence that the two population standard deviations differ.
(b) We follow the steps given in the textbook examples.
Step 0: The sample mean for group 1 (female students) is $\bar{x}_1 = 86.75$, with variance $s_1^2 = 29.07$ and sample size $n_1 = 8$. For group 2 (male students), $\bar{x}_2 = 84.17$, $s_2^2 = 19.77$ and $n_2 = 6$.
Step 1: The claim of equal means can be expressed symbolically as $\mu_1 = \mu_2$.
Step 2: If the original claim is false, then $\mu_1 \neq \mu_2$.
Step 3: The alternative hypothesis is the expression not containing equality, and the null hypothesis is an expression of equality, so we have
$$H_0: \mu_1 = \mu_2 \qquad H_1: \mu_1 \neq \mu_2$$
Step 4: The significance level is α=0.05.
Step 5: Because we have two independent samples and we are testing a claim about the two population means, we use a t distribution with the test statistic given earlier in this section.
Step 6: Since equal variances was not rejected in part (a), we use Definition 6.3 to compute the test statistic:
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{7(29.07) + 5(19.77)}{12} = 25.19$$
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}} = \frac{86.75 - 84.17}{\sqrt{\dfrac{25.19}{8} + \dfrac{25.19}{6}}} = 0.95$$
Step 7: Since the degrees of freedom are $df = n_1 + n_2 - 2 = 12$, the p-value for this problem is $2(1 - P(T_{12} < 0.95)) = 0.36 > 0.05$, so we do not reject the null hypothesis: there is not sufficient evidence that the two population means differ.
Exercise 6.2 The heights of 5 randomly selected males from country A and 7 randomly selected males from country B are given below:
heights for males in country A: 163,160,159,159,161
heights for males in country B: 149,182,145,143,184,185,140.
(a) Assuming the heights of males from country A and B follow $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$ distributions respectively, test the hypothesis $H_0: \sigma_1 = \sigma_2$ vs. $H_1: \sigma_1 \neq \sigma_2$. Would you reject $H_0$ at the 5% level of significance?
(b) Test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$ and provide the p-value.
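Proof. (a) The larger sample variance belongs to country B, so we label it group 1: $s_1^2 = 451.81$, $n_1 = 7$ and $s_2^2 = 2.8$, $n_2 = 5$. We are testing $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$ at significance level $\alpha = 0.05$. The test statistic is $F = \frac{451.81}{2.8} = 161.36$. The critical value is $F_{0.975,6,4} = 9.20$. Since $F \gg 9.20$, we reject the null hypothesis and conclude that $\sigma_1 \neq \sigma_2$.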
(b) Since $\bar{x}_1 = 161.14$ and $\bar{x}_2 = 160.4$, we are testing $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$ at significance level $\alpha = 0.05$, without assuming equal variances. The test statistic is therefore
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{161.14 - 160.4}{\sqrt{\dfrac{451.81}{7} + \dfrac{2.8}{5}}} = 0.09$$
You can then compute the p-value either as $2(1 - P(Z < 0.09)) = 0.93$ or as $2(1 - P(T_4 < 0.09)) = 0.93$. Either way you fail to reject the null hypothesis: there is not sufficient evidence that the two population means differ.
Testing Two Proportions
This kind of hypothesis testing problem is discussed in detail in Chapter 8-2 on your textbook (start from Page-379).
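For a quick numerical check of the two-proportion z test treated in this section, a minimal sketch follows. It assumes the pooled statistic that the solution of Exercise 6.3 uses; the function name `two_proportion_z` is hypothetical, and the counts in the example call are those of Exercise 6.3.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z statistic and its two-tailed p-value."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled proportion under H0: p1 = p2
    q_bar = 1 - p_bar
    std_error = math.sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    z = (p1_hat - p2_hat) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # standard normal reference
    return z, p_value

z, p_value = two_proportion_z(230, 600, 180, 600)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

The code does not round the intermediate proportions, so its z value can differ slightly from a hand calculation that rounds $\hat{p}_1$ and $\bar{p}$ to two decimals.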
Exercise 6.3 A survey is conducted in Santa Cruz and Monterey counties to assess the proportion of smokers. Among 600 people surveyed in each county, 230 in Santa Cruz and 180 in Monterey are found to be smokers.
If $p_1$ and $p_2$ denote the proportions of smokers in the entire Santa Cruz and Monterey counties respectively, test $H_0: p_1 = p_2$ vs. $H_1: p_1 \neq p_2$ at $\alpha = 0.05$.
Proof. We do this step by step, as the textbook does.
Step 0: From the sample data, we have $n_1 = n_2 = 600$, $x_1 = 230$, $x_2 = 180$, $\hat{p}_1 = \frac{230}{600} = 0.3833$ and $\hat{p}_2 = \frac{180}{600} = 0.3$.
Step 1: The claim of equal proportions can be expressed symbolically as $p_1 = p_2$.
Step 2: If the original claim is false, then $p_1 \neq p_2$.
Step 3: The alternative hypothesis is the expression not containing equality, and the null hypothesis is an expression of equality, so we have
$$H_0: p_1 = p_2 \qquad H_1: p_1 \neq p_2$$
Step 4: The significance level is $\alpha = 0.05$.
Step 5: The reference distribution is the standard normal distribution.
Step 6: We calculate the test statistic using (6.13) and (6.14) as follows, with $\bar{q} = 1 - \bar{p}$:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{230 + 180}{600 + 600} = 0.3417$$
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}} = \frac{0.3833 - 0.3}{\sqrt{\dfrac{0.3417 \times 0.6583}{600} + \dfrac{0.3417 \times 0.6583}{600}}} = 3.04$$
Step 7: Either find the rejection region $(-\infty, z_{\alpha/2}) \cup (z_{1-\alpha/2}, \infty) = (-\infty, -1.96) \cup (1.96, \infty)$ and note that $z$ falls in it, or compute the p-value as $2(1 - P(Z < 3.04)) = 0.002 < 0.05$. With either method we reject the null hypothesis and conclude that the two population proportions are not the same.
Testing of Correlation
This kind of hypothesis testing problem is discussed in detail in Chapter 9-2 of your textbook (starting on page 437).
Exercise 6.4 Theories have been developed about the heights of winning candidates for the U.S. presidency and the heights of candidates who were runners-up. Listed below are heights (in inches) from a few presidential elections.
Heights of winner: 69.5, 73, 73, 74, 74.5, 74.5, 71, 71
Heights of runners-up: 72, 69.5, 70, 68, 74, 74, 73, 76.
(a) What is the correlation between the heights of winning and losing candidates?
(b) Provide the p-value for the test $H_0: \rho = 0$ vs. $H_1: \rho \neq 0$, where $\rho$ represents the unknown population correlation between the heights of the winners and the runners-up.
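Proof. (a) Plugging the data into formula (6.17), we have $r = -0.22$.
(b) Since $n = 8$, using (6.18) we have
$$t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \frac{-0.22}{\sqrt{\dfrac{1 - (-0.22)^2}{8 - 2}}} = -0.55$$
Hence, the p-value is $2P(T_6 < -0.55) = 0.60$.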
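The correlation and its test statistic can be reproduced with a short script. This is a sketch under assumptions: formula (6.17) is taken to be the usual Pearson sample correlation, the t statistic is the one used in part (b), and the function names `pearson_r` and `correlation_t` are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson sample correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def correlation_t(r, n):
    """t statistic for H0: rho = 0, referred to a t distribution with n - 2 df."""
    return r / math.sqrt((1 - r * r) / (n - 2))

# Heights (in inches) from Exercise 6.4.
winners = [69.5, 73, 73, 74, 74.5, 74.5, 71, 71]
runners_up = [72, 69.5, 70, 68, 74, 74, 73, 76]

r = pearson_r(winners, runners_up)
t_stat = correlation_t(r, len(winners))
print(f"r = {r:.2f}, t = {t_stat:.2f}, df = {len(winners) - 2}")
```

The p-value is then looked up for this t value with $n - 2 = 6$ degrees of freedom, as in part (b).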
In your quiz or exam, you can solve the problem more simply by calculating the test statistic and the rejection region or p-value, then stating the conclusion, without writing out the steps. However, if you are not confident in your answer, please follow the steps: you will need to write a lot more, but you will get more credit if you make a mistake in the calculation.