19 Day 20
Announcements
Exam grades will take longer to publish
- Some people (with proper excusal) still have to take it
Homework corrections for HW 1-8 close this Friday
- Zero exceptions or extensions. Don’t even try.
Hypothesis Testing for Population Means
We’ve spoken about stating hypotheses
Now we’ll learn how to perform the actual test
Starting with a hypothesis test for population mean \(\mu\)
- When population standard deviation \(\sigma\) is unknown
We need the Central Limit Theorem to hold in order to proceed
\[\text{if} \ \ n>30 \ \ \text{then} \ \ \bar{X} \sim N\left(\mu,\frac{\sigma^2}{n}\right) \ \ \text{(approximately)}\]
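The Central Limit Theorem says the sample mean is centered at \(\mu\) with standard deviation \(\sigma/\sqrt{n}\). As a rough illustration (not part of the lecture), the sketch below simulates it. The exponential population, the seed, and the sample counts are arbitrary choices made for the demonstration.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 36                   # sample size (> 30)
num_samples = 4000       # number of simulated samples

# Assumed population: exponential with rate 1, so mu = 1 and sigma = 1.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)   # should be close to mu = 1
sd_of_means = statistics.stdev(sample_means)     # should be close to sigma / sqrt(n) = 1/6
print(mean_of_means, sd_of_means)
```

Even though the population is strongly skewed, the simulated sample means cluster around \(\mu\) with a spread near \(\sigma/\sqrt{n}\).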
There are two hypothesis test methods
- They produce the same results:
Critical value method
P-value method
We’re going to learn the P-value method:
- State the null and alternate hypotheses
- Choose a significance level \(\alpha\) (the allowed probability of a Type I error).
- Compute the test statistic:
\[t = \frac{\bar{x} - \mu_0}{{s}/{\sqrt{n}}}\]
Since \(\sigma\) is unknown, we replace it with the sample standard deviation \(s\)
We use the \(t\) statistic, which comes from the \(t\) distribution with \(\text{df} = n - 1\)
- Compute the P-value of the test statistic \(t\).
Left-tailed test: \(P\)-value = area under the \(t\) distribution to the left of \(t\), i.e., \(P(T < t)\)
Right-tailed test: \(P\)-value = area under the \(t\) distribution to the right of \(t\), i.e., \(P(T > t)\)
Two-tailed test: \(P\)-value = sum of the areas under the \(t\) distribution to the left of \(-|t|\) and right of \(|t|\), i.e., \(2 \cdot P(T < -|t|)\)
The degrees of freedom for the \(t\) distribution is \(\text{df} = n - 1\)
- Determine whether to reject \(H_0\):
Reject \(H_0\) if \(P\)-value \(\leq \alpha\)
Do not reject \(H_0\) if \(P\)-value \(> \alpha\)
- State a conclusion
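The steps above can be sketched in Python. In class the P-value is read from a t-table. This sketch instead approximates the tail area by numerically integrating the \(t\) density with Simpson's rule, which is a stand-in chosen so the code needs only the standard library. The function names (`t_pdf`, `t_right_tail`, `t_test_from_summary`) and the numbers in the usage lines are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_right_tail(t, df, steps=2000):
    """Approximate P(T > t) with Simpson's rule on [0, |t|] plus symmetry."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area under the density from 0 to |t|
    return 0.5 - area if t >= 0 else 0.5 + area

def t_test_from_summary(xbar, s, n, mu0, tail):
    """Return (t, p_value); tail is 'left', 'right', or 'two'."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    df = n - 1
    if tail == "right":
        p = t_right_tail(t, df)
    elif tail == "left":
        p = t_right_tail(-t, df)       # P(T < t) = P(T > -t) by symmetry
    elif tail == "two":
        p = 2 * t_right_tail(abs(t), df)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return t, p

# Hypothetical summary statistics, only to show the call.
alpha = 0.05
t, p = t_test_from_summary(xbar=10.5, s=2.0, n=16, mu0=10.0, tail="right")
print(t, p, "reject H0" if p <= alpha else "do not reject H0")
```

A table lookup only brackets the P-value between two column headings, while the numerical version returns a single number. Both should lead to the same decision.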
Example 1
In a recent medical study, 76 subjects were placed on a low-fat diet. After 12 months, their sample mean weight loss was \(\bar{x} = 2.2\) kilograms, with a sample standard deviation of \(s = 6.1\) kilograms. Use the \(\alpha = 0.05\) level of significance to test the claim that the mean weight loss is greater than 0.
Step 1: State the null and alternate hypotheses
\[ H_0 : \mu = 0 \\ H_1 : \mu > 0 \\ (\text{right-tailed test}) \]
Step 2: Choose a significance level \(\alpha\)
\[ \alpha = 0.05 \]
Step 3: Compute the value of the test statistic
\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{2.2 - 0}{6.1 / \sqrt{76}} \approx 3.144 \]
Step 4: Use the t-table to compute the P-value
Since \(H_1 : \mu > 0\) is right-tailed, the P-value is \(P(T > 3.144)\)
The degrees of freedom are \(df = n - 1 = 75\), which does not appear in the t-table, so we round down to the nearest degrees of freedom that the table does list, \(df = 60\)
In the t-table with \(df = 60\), we find that \(P(T > 3.144)\) is between \(P(T > 3.232)\) and \(P(T > 2.915)\), so the P-value is between \(0.001\) and \(0.0025\)
Step 5: Determine whether to reject \(H_0\)
Our P-value is between \(0.001\) and \(0.0025\)
Since the P-value is less than \(\alpha = 0.05\), we reject \(H_0\)
Step 6: State your conclusion
- We conclude that the mean weight loss of people who were placed on a low-fat diet for \(12\) months is greater than \(0\)
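The table lookup in Step 4 can be mimicked in code. The sketch below hard-codes the \(df = 60\) row as it appears in commonly printed t-tables, which is an assumption because the course table may list different columns. It then finds the two tail areas that bracket the test statistic. The names `T_TABLE_DF60` and `bracket_right_tail` are made up for this illustration.

```python
import math

# (right-tail area, critical value) pairs for df = 60, assumed from a standard t-table.
T_TABLE_DF60 = [
    (0.10, 1.296), (0.05, 1.671), (0.025, 2.000), (0.01, 2.390),
    (0.005, 2.660), (0.0025, 2.915), (0.001, 3.232), (0.0005, 3.460),
]

def bracket_right_tail(t, row):
    """Return (low, high) with low < P(T > t) <= high; None if the table gives no bound."""
    low, high = None, None
    for area, crit in row:        # row is sorted by increasing critical value
        if crit <= t:
            high = area           # t is at least this critical value
        else:
            low = area            # first critical value beyond t
            break
    return low, high

t = (2.2 - 0) / (6.1 / math.sqrt(76))
low, high = bracket_right_tail(t, T_TABLE_DF60)
print(round(t, 3), low, high)     # 3.144 0.001 0.0025
```

Both bounds are far below \(\alpha = 0.05\), so the decision does not depend on where the P-value falls inside the bracket.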
Example 2
A type of steel used by a manufacturing company is supposed to have an average hardness of 62 on the Rockwell hardness index. If the steel is too hard or too soft, defects can appear in the final product. A random sample of 10 specimens for a new steel supplier had a mean hardness of 64 with a standard deviation of 4. Test at the 5% significance level whether the mean hardness of the new supplier’s steel is different from the desired hardness of 62. (Assume that the population is normally distributed).
Step 1: State the null and alternate hypotheses
\[ H_0 : \mu = 62 \\ H_1 : \mu \neq 62 \\ (\text{two-tailed test}) \]
Step 2: Choose a significance level
\[ \alpha = 0.05 \]
Step 3: Compute the test statistic
Given:
Sample mean \(\bar{x} = 64\)
Hypothesized population mean \(\mu_0 = 62\)
Sample standard deviation \(s = 4\)
Sample size \(n = 10\)
The test statistic is: \[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{64 - 62}{4 / \sqrt{10}} \approx 1.581 \]
Step 4 & 5: Determine the P-value (two-tailed test)
Using the t-table with \(df = 9\), we find that \(P(T > 1.581)\) is between \(P(T > 1.833)\) and \(P(T > 1.383)\), so \(P(T > 1.581)\) is between \(0.05\) and \(0.10\)
Therefore, the two-tailed P-value is \(2 \cdot P(T > 1.581)\), which lies between \(0.10\) and \(0.20\): \[ 0.10 < \text{P-value} < 0.20 \]
Since the P-value is greater than \(\alpha = 0.05\), we fail to reject \(H_0\)
Step 6: State the conclusion
There is not enough evidence to conclude that the mean hardness of the new supplier’s steel is different from 62
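These notes mention that the critical value method gives the same result as the P-value method. As a brief aside, the sketch below cross-checks Example 2 that way. It assumes the two-tailed critical value \(t_{0.025} = 2.262\) for \(df = 9\), as listed in standard t-tables. The variable names are illustrative.

```python
import math

xbar, mu0, s, n = 64, 62, 4, 10
t = (xbar - mu0) / (s / math.sqrt(n))

# Assumed table value: with alpha = 0.05 split over two tails, each tail has area 0.025.
t_crit = 2.262

reject = abs(t) >= t_crit         # reject H0 only if |t| reaches the critical value
print(round(t, 3), reject)        # 1.581 False
```

Since \(|t| = 1.581\) is below \(2.262\), this method also fails to reject \(H_0\), in agreement with the P-value conclusion.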
Hypothesis Testing for Proportions
Now, we want to test a hypothesis for population proportion \(p\)
- We still need the Central Limit Theorem for proportions to hold:
\[np_0 \geq 10 \quad \text{and} \quad n(1 - p_0) \geq 10,\]
Where \(p_0\) is the population proportion specified by \(H_0\)
- We will use the P-value method to perform the hypothesis test.
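The two conditions are quick to check before running a test. Here is a minimal sketch. The helper name `normal_approx_ok` and the sample sizes in the usage lines are hypothetical.

```python
def normal_approx_ok(n, p0, threshold=10):
    """True when both n*p0 and n*(1 - p0) are at least the threshold."""
    return n * p0 >= threshold and n * (1 - p0) >= threshold

print(normal_approx_ok(200, 0.3))   # 60 and 140 both clear the threshold
print(normal_approx_ok(40, 0.2))    # n*p0 = 8 falls short
```

If either condition fails, the normal approximation behind the \(z\) statistic is not reliable for that sample size.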
Step 1: State the null and alternate hypotheses
The null hypothesis is of the form:
\[ H_0 : p = p_0 \]
The alternate hypothesis is in one of the three forms:
Left-tailed: \(H_1 : p < p_0\)
Right-tailed: \(H_1 : p > p_0\)
Two-tailed: \(H_1 : p \neq p_0\)
Step 2: Choose a significance level \(\alpha\)
Step 3: Compute the test statistic:
\[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} \]
Step 4: Compute the P-value of the test statistic \(z\)
Left-tailed: P-value = area under the standard normal distribution to the left of \(z\)
- i.e., \(P(Z < z)\)
Right-tailed: P-value = area under the standard normal distribution to the right of \(z\)
- i.e., \(P(Z > z)\)
Two-tailed: P-value = sum of the areas under the standard normal distribution to the left of \(-|z|\) and right of \(|z|\)
- i.e., \(2 \cdot P(Z < -|z|)\)
Step 5: Determine whether to reject \(H_0\):
Reject \(H_0\) if P-value \(\leq \alpha\)
Do not reject \(H_0\) if P-value \(> \alpha\)
Step 6: State a conclusion
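Steps 3 through 5 can be sketched with Python's standard library, using `statistics.NormalDist` for areas under the standard normal curve in place of a z-table. The function name `one_proportion_z_test` and the counts in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0, tail):
    """Return (z, p_value); tail is 'left', 'right', or 'two'."""
    p_hat = successes / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()                 # mean 0, standard deviation 1
    if tail == "left":
        p = std_normal.cdf(z)                 # P(Z < z)
    elif tail == "right":
        p = 1 - std_normal.cdf(z)             # P(Z > z)
    elif tail == "two":
        p = 2 * std_normal.cdf(-abs(z))       # 2 * P(Z < -|z|)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p

# Hypothetical data: 130 successes in 200 trials, testing H1: p > 0.6.
alpha = 0.05
z, p = one_proportion_z_test(130, 200, 0.6, tail="right")
print(z, p, "reject H0" if p <= alpha else "do not reject H0")
```

Because the code keeps \(z\) unrounded, its P-value can differ in the third or fourth decimal place from a table lookup that first rounds \(z\) to two decimals.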
Example 3
Suppose that 67% of all auto damage insurance claims in the US are made by singles under 25 years old. Also suppose that in a random sample of 53 auto damage claims in Manhattan, KS, there were 42 made by singles under 25.
Test at the 5% significance level whether the proportion of auto damage claims made by singles under 25 in Manhattan is different from the proportion for the entire US.
a. State the null and alternate hypotheses
\[ H_0 : p = 0.67 \\ H_1 : p \neq 0.67 \\ \quad (\text{two-tailed test}) \]
b. Compute the value of the test statistic
Given:
Sample proportion \(\hat{p} = \frac{42}{53} \approx 0.7925\)
Hypothesized population proportion \(p_0 = 0.67\)
The test statistic is:
\[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.7925 - 0.67}{\sqrt{\frac{0.67(1 - 0.67)}{53}}} \approx 1.90 \]
c. Determine whether to reject \(H_0\)
Using the z-table for a two-tailed test:
\[ \text{P-value} = 2 \cdot P(Z < -1.90) = 2(0.0287) = 0.0574 \]
Since the P-value \(> \alpha (= 0.05)\), we fail to reject \(H_0\)
d. State your conclusion
There is not enough evidence to conclude that the proportion of auto damage claims made in Manhattan by singles under 25 is different from the national proportion
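The example does not show the check of the normal-approximation conditions from the start of this section. With \(n = 53\) and \(p_0 = 0.67\), a quick verification gives:

\[ np_0 = 53(0.67) = 35.51 \geq 10 \quad \text{and} \quad n(1 - p_0) = 53(0.33) = 17.49 \geq 10 \]

Both conditions hold, so using the \(z\) statistic here is justified.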
Example 4
An educational technology specialist is studying attitudes of teachers about the use of virtual reality in the classroom. She samples 500 teachers and finds that 471 of them believe that virtual reality would have a positive effect. Can she conclude that the proportion of teachers who believe that virtual reality would have a positive effect is greater than 0.90? Use the \(\alpha = 0.05\) level of significance.
Step 1: State the null and alternate hypotheses
\[ H_0 : p = 0.90 \\ H_1 : p > 0.90 \\ \quad (\text{right-tailed test}) \]
Step 2: Choose a significance level
\[ \alpha = 0.05 \]
Step 3: Compute the test statistic
Given:
Sample proportion \(\hat{p} = \frac{471}{500} = 0.942\)
Hypothesized population proportion \(p_0 = 0.90\)
The test statistic is: \[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.942 - 0.90}{\sqrt{\frac{0.90(1 - 0.90)}{500}}} \approx 3.13 \]
Step 4 & 5: Determine the P-value and whether to reject \(H_0\)
\[ \text{P-value} = P(Z > 3.13) = 0.0009 \]
Since the P-value \(< \alpha (= 0.05)\), we reject \(H_0\)
Step 6: State the conclusion
We conclude that more than 90% of teachers believe that virtual reality would have a positive effect on education