Chapter 19 Hypothesis Testing
19.1 Introduction to hypothesis testing
In estimation, we ask what the value of some particular parameter of interest is in the population. For example, what is the average annual income of residents in the UK?
Often there are times in statistics when we are not interested in the specific value of the parameter, but rather are interested in asserting some statement regarding the parameter of interest. Some examples:
- We want to claim that the average annual income of UK residents is more than or equal to £35,000.
- We want to assess whether the average annual income of men in academia in the UK is the same as that of women in similar ranks.
- We want to determine whether the number of cars crossing a certain intersection follows a Poisson distribution or whether it is more likely to come from a geometric distribution.
To perform a statistical hypothesis test, one needs to specify two disjoint hypotheses in terms of the parameters of the distribution that are of interest. They are
- H0: Null Hypothesis,
- H1: Alternative Hypothesis.

Traditionally, we choose H0 to be the claim that we would like to assert.
Returning to our examples:
- We want to claim that the average annual income of UK residents is more than or equal to £35,000. We test
H0: μ ≥ 35,000 vs. H1: μ < 35,000.
- We want to assess whether the average annual income of men in academia in the UK is the same as that of women at similar ranks. We test
H0: μmen = μwomen vs. H1: μmen ≠ μwomen.
- We want to determine whether the number of cars crossing a certain intersection follows a Poisson distribution or whether it is more likely to come from a geometric distribution. We test
H0: X ∼ Po(2) vs. H1: X ∼ Geom(0.5).
Hypotheses where the distribution is completely specified are called simple hypotheses. For example, H0 and H1 in the car example and H0 in the gender wage example are all simple hypotheses.
Hypotheses where the distribution is not completely specified are called composite hypotheses. For example, H0 and H1 in the average annual income example and H1 in the gender wage example are all composite hypotheses.
Note that in the average annual income and gender wage examples, the null and alternative hypotheses cover all possibilities, whereas for the car example there are many other choices of distribution which could be hypothesized.
The conclusion of a hypothesis test
We will reject H0 if there is sufficient information from our sample indicating that the null hypothesis cannot be true, thereby concluding that the alternative hypothesis is true.
We will not reject H0 if there is not sufficient information in the sample to refute our claim.
The remainder of this section is structured as follows. We define Type I and Type II errors, the two ways of reaching a wrong decision in a hypothesis test. In Section 19.3 we show how to construct hypothesis tests, starting with tests for the mean of a normal distribution with known variance. This is extended to the case where the variance is unknown and to the case where we have two samples to compare. We introduce p-values, which give a measure of how likely (or unlikely) the observed data are if the null hypothesis is true. We then consider hypothesis testing in a wide range of scenarios.
19.2 Type I and Type II errors
Type I error
A Type I error occurs when one chooses to incorrectly reject a true null hypothesis.
A Type I error is also commonly referred to as a false positive.
Type II error
A Type II error occurs when one fails to reject a false null hypothesis.
A Type II error is also commonly referred to as a false negative.
Type I error and Type II error are summarised in the following decision table.
| | One accepts the Null | One rejects the Null |
|---|---|---|
| Null hypothesis is true | Correct Conclusion | Type I Error |
| Null hypothesis is false | Type II Error | Correct Conclusion |
Significance level
The significance level or size of the test is
α = P(Type I error) = P(reject H0 | H0 is true).
Typical choices for α are 0.01, 0.05 and 0.10.
Probability of Type II error
The probability of a Type II error is
β = P(Type II error) = P(do not reject H0 | H0 is false).
Consider the following properties of α and β:
- It can be shown that there is an inverse relationship between α and β; that is, as α increases, β decreases, and vice versa. Therefore, for a fixed sample size, one can only choose to control one of the two types of error. In hypothesis testing we choose to control the Type I error, so we set up our hypotheses at the outset so that the more serious ("worse") error is the Type I error, whose probability we fix.
- The values of both α and β depend on the values of the underlying parameters. We can control α by choosing H0 to include an equality of the parameter and then showing that the Type I error is largest at this point of equality; we may therefore take the error probability at the point of equality to be the size of the test. To illustrate, in the average annual income example above,
α = P(reject H0 | μ = 35,000) ≥ P(reject H0 | μ), for any μ ≥ 35,000.
Therefore H0: μ ≥ 35,000 is often just written as H0: μ = 35,000.
- Because H0 describes an equality, H1 is therefore a composite hypothesis. Therefore β = P(Type II error) is a function of the parameter within the alternative parameter space.
Power of a Test
The power of the test is
1 − β = P(reject H0 | H0 is false).
The power of a test can be thought of as the probability of making the correct decision to reject a false null hypothesis.
19.3 Tests for normal means, σ known
In this section we study a number of standard hypothesis tests that one might perform on a random sample.
We assume throughout this section that x1, x2, …, xn are i.i.d. samples from X with E[X] = μ, where μ is unknown, and var(X) = σ² is known.
Test 1: H0: μ = μ0 vs. H1: μ < μ0; σ² known.
Watch Video 28 for the construction of Hypothesis Test 1.
Video 28: Hypothesis Test 1
A summary of the construction of Hypothesis Test 1 is given below.
Data assumptions. We assume either
- X1, X2, …, Xn are a random sample from a normal distribution with known variance σ²; or
- the sample size n is sufficiently large that we can assume X̄ is approximately normally distributed by the Central Limit Theorem, and either the variance is known or the sample variance s² ≈ σ².
Step 1: Choose a test statistic based upon the random sample for the parameter we want to base our claim on. For example, we are interested in μ, so we choose a good estimator of μ as our test statistic: μ̂ = X̄.
Step 2: Specify a decision rule. The smaller X̄ is, the more the evidence points towards the alternative hypothesis μ < μ0. Therefore our decision rule is to reject H0 if X̄ < c, where c is called the cut-off value for the test.
Step 3: Based upon the sampling distribution of the test statistic and the specified significance level of the test, solve for the cut-off value c. To find c, note that under H0 we have X̄ ∼ N(μ0, σ²/n), so
α = P(X̄ < c | μ = μ0) = P(Z < (c − μ0)/(σ/√n)),
where Z ∼ N(0,1). Since P(Z < −zα) = α, where zα can be found using qnorm(1-alpha)
(P(Z < zα) = 1 − α) or statistical tables, then
−zα = (c − μ0)/(σ/√n)
and c = μ0 − zα · σ/√n.
So, the decision rule is to reject H0 if X̄ < μ0 − zα · σ/√n or, equivalently, Z = (X̄ − μ0)/(σ/√n) < −zα.
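The three steps above can be sketched in code. This is an illustrative sketch only, written in Python rather than the R used elsewhere in these notes, with made-up sample values; `NormalDist().inv_cdf(1 - alpha)` plays the role of `qnorm(1-alpha)`:

```python
from statistics import NormalDist
from math import sqrt

def z_test_lower(xbar, mu0, sigma, n, alpha=0.05):
    """One-sided z-test of H0: mu = mu0 vs. H1: mu < mu0, with sigma known.

    Returns the observed Z statistic and whether H0 is rejected at level alpha.
    """
    z = (xbar - mu0) / (sigma / sqrt(n))       # Z = (xbar - mu0) / (sigma / sqrt(n))
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-alpha point, as qnorm(1-alpha) in R
    return z, z < -z_alpha                     # reject H0 if Z < -z_alpha

# Hypothetical sample: xbar = 5.90 from n = 20 observations with sigma = 0.2
z, reject = z_test_lower(xbar=5.90, mu0=6.0, sigma=0.2, n=20, alpha=0.05)
```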
Test 2: H0: μ = μ0 vs. H1: μ > μ0; σ² known.
This is similar to the previous test, except the decision rule is to reject H0 if X̄ > μ0 + zα · σ/√n or, equivalently, Z = (X̄ − μ0)/(σ/√n) > zα.
Note that both these tests are called one-sided tests, since the rejection region falls on only one side of the outcome space.
Test 3: H0: μ = μ0 vs. H1: μ ≠ μ0; σ² known.
The test statistic X̄ does not change but the decision rule will. The decision rule is to reject H0 if X̄ is sufficiently far (above or below) from μ0. Specifically, reject H0 if X̄ < μ0 − zα/2 · σ/√n or X̄ > μ0 + zα/2 · σ/√n. Equivalent to both of these is
|Z| = |(X̄ − μ0)/(σ/√n)| > zα/2.
This is called a two-sided test because the rejection region consists of two disjoint intervals, one in each tail of the outcome space.
Coffee machine.
Suppose that a coffee machine is designed to dispense 6 ounces of coffee per cup with standard deviation σ = 0.2, where we assume the amount of coffee dispensed is normally distributed. A random sample of n = 20 cups gives x̄ = 5.94. Test whether the machine is correctly filling the cups.
We test H0: μ = 6.0 vs. H1: μ ≠ 6.0 at significance level α = 0.05.
Using a two-sided test with known variance, the decision rule is to reject H0 if |Z| = |(x̄ − 6.0)/(0.2/√20)| > z0.05/2 = z0.025 = 1.96. Now
|Z| = |(5.94 − 6.0)/(0.2/√20)| = 0.06/0.0447 = 1.34 < 1.96.
Therefore, we conclude that there is not enough statistical evidence to reject H0 at α = 0.05.
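The calculation in this example can be reproduced numerically. A minimal sketch in Python (rather than the R used in these notes), using the data of the example:

```python
from statistics import NormalDist
from math import sqrt

# Coffee machine data from the example
xbar, mu0, sigma, n, alpha = 5.94, 6.0, 0.2, 20, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))          # observed test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{0.025} = 1.96
reject = abs(z) > z_crit                      # two-sided decision rule
```

Here `reject` is `False`, matching the conclusion above.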
19.4 p-values
When our sample information determines a particular conclusion to our hypothesis test, we only report that we either reject or do not reject H0 at a particular significance level α. Hence when we report our conclusion the reader doesn’t know how sensitive our decision is to the choice of α.
To illustrate, in Example 1 (Coffee machine) we would have reached the same conclusion, that there is not enough statistical evidence to reject H0 at α = 0.05, if |Z| = 1.95 rather than |Z| = 1.34, since both are less than 1.96. Whereas, if the significance level were α = 0.10, we would have rejected H0 if |Z| = 1.95 > z0.10/2 = 1.6449, but we would not reject H0 if |Z| = 1.34 < z0.10/2 = 1.6449.
Note that the choice of α should be made before the test is performed; otherwise, we run the risk of inducing experimenter bias!
p-value
The p-value of a test is the probability of obtaining a test statistic at least as extreme as the observed data, given H0 is true.
In other words, the p-value is the smallest significance level α at which H0 would be rejected given the value of the test statistic obtained from the data; it is the critical value of α with regard to the hypothesis test decision.
If we report the conclusion of the test as well as the p-value, then the reader can decide how sensitive our result was to our choice of α.
Coffee machine (continued).
Compute the p-value for the test in Example 1.
In Example 1 (Coffee machine), we were given x̄ = 5.94, n = 20 and σ = 0.2. Our decision rule was to reject H0 if |Z| = |(x̄ − 6.0)/(0.2/√20)| > z0.025.
To compute the p-value for the test, assume H0 is true, that is, μ = 6.0. We want to find
p = P(|Z| ≥ 1.34) = 2 P(Z ≥ 1.34) = 2 × 0.0901 = 0.1802.
Consider the following remarks on Example 2.
- The multiplication factor of 2 has arisen since we are computing the p-value for a two-sided test, so there is an equal-sized rejection region at both tails of the distribution. For a one-tailed test we only need to compute the probability of rejecting in one direction.
- The p-value implies that if we had chosen an α of at least 0.1802 then we would have been able to reject H0.
- In applied statistics, the p-value is interpreted as the sample providing:
  - strong evidence against H0, if p ≤ 0.01;
  - evidence against H0, if p ≤ 0.05;
  - slight evidence against H0, if p ≤ 0.10;
  - no evidence against H0, if p > 0.10.
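The p-value calculation for Example 2 can be sketched in Python (the notes use R; `NormalDist().cdf` plays the role of `pnorm`). Using the exact statistic |Z| = 1.3416 gives p ≈ 0.1797; the value 0.1802 quoted above comes from rounding |Z| to 1.34 before using tables:

```python
from statistics import NormalDist
from math import sqrt

z = (5.94 - 6.0) / (0.2 / sqrt(20))      # observed statistic, about -1.3416
# Two-sided p-value: probability of a statistic at least this extreme under H0
p = 2 * (1 - NormalDist().cdf(abs(z)))
```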
19.5 Tests for normal means, σ unknown
Assume X1,X2,…,Xn is a random sample from a normal distribution with unknown variance σ2.
Test 4: H0: μ = μ0 vs. H1: μ < μ0; σ² unknown.
Since σ² is unknown, we estimate it from the data and use the test statistic
T = (X̄ − μ0)/(s/√n),
where s² is the sample variance.
Under H0, T follows a t-distribution with n − 1 degrees of freedom, so the decision rule is to reject H0 if T < −tn−1,α. The critical value tn−1,α can be found using the qt
function in R with tn−1,α = qt(1-alpha, n-1)
or using statistical tables similar to those of the normal tables in Section 5.7.
Test 5: H0: μ = μ0 vs. H1: μ > μ0; σ² unknown. The decision rule is to reject H0 if T > tn−1,α.
Test 6: H0: μ = μ0 vs. H1: μ ≠ μ0; σ² unknown. The decision rule is to reject H0 if |T| = |(X̄ − μ0)/(s/√n)| > tn−1,α/2.
Coffee machine (continued).
Suppose that σ is unknown in Example 1, though we still assume the amount of coffee dispensed is normally distributed. A random sample of n = 20 cups gives mean x̄ = 5.94 and sample variance s² = 0.1501².
Test whether the machine is correctly filling the cups.
We test H0:μ=6.0 vs. H1:μ≠6.0 at significance level α=0.05.
The decision rule is to reject H0 if |T| = |(x̄ − 6.0)/(0.1501/√20)| > t20−1,0.05/2 = t19,0.025 = 2.093.
Now
|T| = |(5.94 − 6.0)/(0.1501/√20)| = 0.06/0.0336 = 1.79 < 2.093,
so there is not enough statistical evidence to reject H0 at α = 0.05.
19.6 Confidence intervals and two-sided tests
Consider the two-sided t-test of size α. We reject H0 if |T| = |(X̄ − μ0)/(s/√n)| > tn−1,α/2. This implies we do not reject H0 if
X̄ − tn−1,α/2 · s/√n ≤ μ0 ≤ X̄ + tn−1,α/2 · s/√n,
and (X̄ − tn−1,α/2 · s/√n, X̄ + tn−1,α/2 · s/√n) is a 100(1−α)% confidence interval for μ. Consequently, if μ0, the value of μ under H0, falls within the 100(1−α)% confidence interval for μ, then we will not reject H0 at significance level α.
In general, therefore, there is a correspondence between the “acceptance region” of a statistical test of size α and the related 100(1−α)% confidence interval. Therefore, we will not reject H0:θ=θ0 vs. H1:θ≠θ0 at level α if and only if θ0 lies within the 100(1−α)% confidence interval for θ.
Coffee machine (continued).
For the coffee machine in Example 3 (Coffee machine - continued) we wanted to test H0: μ = 6.0 vs. H1: μ ≠ 6.0 at significance level α = 0.05. We were given a random sample of n = 20 cups with x̄ = 5.94 and s² = 0.1501².
Construct a 95% confidence interval for μ.
The limits of a 95% confidence interval for μ are
x̄ ± t19,0.025 · s/√n = 5.94 ± 2.093 × 0.1501/√20 = 5.94 ± 0.0702,
so the 95% confidence interval for μ is (5.8698, 6.0102).
If we use the confidence interval to perform our test, we see that
μ0 = 6.0 lies inside (5.8698, 6.0102),
so we will not reject H0 at α = 0.05.
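The confidence-interval check in this example is easy to reproduce. A sketch in Python (the notes use R), taking the critical value t19,0.025 = 2.093 from tables as in the text:

```python
from math import sqrt

xbar, s, n, mu0 = 5.94, 0.1501, 20, 6.0
t_crit = 2.093                      # t_{19, 0.025}, from tables (qt(0.975, 19) in R)

half_width = t_crit * s / sqrt(n)   # half-width of the 95% confidence interval
lo, hi = xbar - half_width, xbar + half_width
contains_mu0 = lo <= mu0 <= hi      # True: do not reject H0 at the 5% level
```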
19.7 Distribution of the variance
Thus far we have considered hypothesis testing for the mean but we can also perform hypothesis tests for the variance of a normal distribution. However, first we need to consider the distribution of the sample variance.
Suppose that Z1, Z2, …, Zn ∼ N(0,1). Then we have shown in Section 14.2 that
Z1² + Z2² + ⋯ + Zn² ∼ χ²n.
This can be extended to show that
(n − 1)s²/σ² ∼ χ²n−1.
Note that the degrees of freedom of the χ² distribution is n − 1: the number of observations n minus 1 for the estimation of μ by X̄.
It follows that (n − 1)s²/σ² can be used as the basis of hypothesis tests for the variance of a normal distribution.
19.8 Other types of tests
Test 7: H0: σ1² = σ2² vs. H1: σ1² ≠ σ2².
Let X1, X2, …, Xm ∼ N(μ1, σ1²) and Y1, Y2, …, Yn ∼ N(μ2, σ2²) be two independent random samples from normal populations.
The test statistic is F = s1²/s2², where s1² and s2² are the sample variances of the two samples. Under H0, F follows an F-distribution with m − 1 and n − 1 degrees of freedom, and the decision rule is to reject H0 if F < Fm−1,n−1,1−α/2 or F > Fm−1,n−1,α/2. These critical values can be found in R using qf(alpha/2,m-1,n-1)
and qf(1-alpha/2,m-1,n-1)
, respectively. Alternatively, Statistical Tables can be used. For the latter you may need to use the identity
Fm,n,1−α = 1/Fn,m,α
to obtain the required values from the table.
Test 8: H0: μ1 = μ2 vs. H1: μ1 ≠ μ2; σ² unknown.
Assume X1,X2,…,Xm∼N(μ1,σ2) and Y1,Y2,…,Yn∼N(μ2,σ2) are two independent random samples with unknown but equal variance σ2.
Note that
- (X̄ − Ȳ) ∼ N(μ1 − μ2, σ²(1/m + 1/n)), which implies
  ((X̄ − Ȳ) − (μ1 − μ2)) / √(σ²(1/m + 1/n)) ∼ N(0,1);
- (m + n − 2)sp²/σ² ∼ χ²m+n−2;
- sp² is independent of X̄ − Ȳ;
where sp² = ((m − 1)sX² + (n − 1)sY²)/(m + n − 2) is the pooled sample variance. Combining these results, under H0,
T = (X̄ − Ȳ) / (sp √(1/m + 1/n)) ∼ tm+n−2,
and the decision rule is to reject H0 if |T| > tm+n−2,α/2.
Blood bank.
Suppose that one wants to test whether the time it takes to get from a blood bank to a hospital via two different routes is the same on average. Independent random samples are selected from each of the different routes and we obtain the following information:
| Route X | m = 10 | x̄ = 34 | sX² = 17.111 |
| Route Y | n = 12 | ȳ = 30 | sY² = 9.454 |

Figure 19.1: Routes from blood bank to hospital.
Test H0: μX = μY vs. H1: μX ≠ μY at significance level α = 0.05, where μX and μY denote the mean travel times on routes X and Y, respectively.
Attempt Exercise 1: Blood bank and then watch Video 29 for the solutions.
Video 29: Blood bank
Alternatively worked solutions are provided:
Solution to Exercise 1: Blood bank
Compute
- F = s1²/s2² = 17.111/9.454 = 1.81;
- F9,11,0.975 = 1/F11,9,0.025 = 1/3.915 = 0.256;
- F9,11,0.025 = 3.588.
Hence F9,11,0.975 < F < F9,11,0.025, so we do not reject H0 at α = 0.05. Therefore we can assume the variances of the two samples are the same.
Now we test H0: μX = μY vs. H1: μX ≠ μY at significance level α = 0.05.
The decision rule is to reject H0 if
|T| = |x̄ − ȳ| / (sp √(1/m + 1/n)) > t20,0.025 = 2.086.
Here the pooled sample variance is sp² = (9 × 17.111 + 11 × 9.454)/20 = 12.900, so sp = 3.592, and
|T| = |34 − 30| / (3.592 × √(1/10 + 1/12)) = 4/1.538 = 2.601 > 2.086.
Hence we reject H0 at α = 0.05: there is evidence that the mean travel times on the two routes differ.
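A sketch of the pooled two-sample t computation for Exercise 1, in Python rather than R, with the critical value t20,0.025 = 2.086 taken from tables:

```python
from math import sqrt

m, xbar, s2x = 10, 34.0, 17.111   # Route X summary statistics
n, ybar, s2y = 12, 30.0, 9.454    # Route Y summary statistics

s2p = ((m - 1) * s2x + (n - 1) * s2y) / (m + n - 2)  # pooled sample variance
t = (xbar - ybar) / sqrt(s2p * (1 / m + 1 / n))      # two-sample t statistic
t_crit = 2.086                                       # t_{20, 0.025} from tables
reject = abs(t) > t_crit
```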
Test 9: H0:μ1=μ2 vs. H1:μ1≠μ2; non-independent samples.
Suppose that we have two groups of observations X1,X2,…,Xn and Y1,Y2,…,Yn where there is an obvious pairing between the observations. For example consider before and after studies or comparing different measuring devices. This means the samples are no longer independent.
An equivalent hypothesis test to the one stated is H0: μd = μ1 − μ2 = 0 vs. H1: μd = μ1 − μ2 ≠ 0. With this in mind define Di = Xi − Yi for i = 1, …, n, and assume D1, D2, …, Dn ∼ N(μd, σd²) and are i.i.d.
The decision rule is to reject H0 if
|T| = |D̄| / (sd/√n) > tn−1,α/2.
Drug Trial.
In a medical study of patients given a drug and a placebo, sixteen patients were paired up, with the members of each pair having a similar age and being of the same sex. One of each pair received the drug and the other received the placebo. The response score for each patient was found.
Pair Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
Given Drug | 0.16 | 0.97 | 1.57 | 0.55 | 0.62 | 1.12 | 0.68 | 1.69 |
Given Placebo | 0.11 | 0.13 | 0.77 | 1.19 | 0.46 | 0.41 | 0.40 | 1.28 |
Are the responses for the drug and placebo significantly different?
This is a “matched-pair” problem, since we expect a relation between the values of each pair. The difference within each pair is
Pair Number | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
Di = (drug)i − (placebo)i | 0.05 | 0.84 | 0.80 | −0.64 | 0.16 | 0.71 | 0.28 | 0.41 |
We consider the Di’s to be a random sample from N(μD, σD²). We can calculate that D̄ = 0.326 and sD² = 0.24, so sD = 0.49.
To test H0: μD = 0 vs. H1: μD ≠ 0 at the 10% level, the decision rule is to reject H0 if
|T| = |D̄| / (sD/√8) > t7,0.05.
Now |T| = 0.326/(0.49/√8) = 1.88 and t7,0.05 = 1.895, so we would not reject H0 at the 10% level (just).
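The matched-pair calculation can be verified directly. A sketch in Python (rather than R), with t7,0.05 = 1.895 from tables; `statistics.stdev` uses the n − 1 divisor, matching the sample standard deviation above:

```python
from statistics import mean, stdev
from math import sqrt

drug    = [0.16, 0.97, 1.57, 0.55, 0.62, 1.12, 0.68, 1.69]
placebo = [0.11, 0.13, 0.77, 1.19, 0.46, 0.41, 0.40, 1.28]

d = [x - y for x, y in zip(drug, placebo)]  # within-pair differences
dbar, s_d, n = mean(d), stdev(d), len(d)

t = dbar / (s_d / sqrt(n))                  # paired t statistic
reject_10pct = abs(t) > 1.895               # t_{7, 0.05} from tables
```

Here `t` is about 1.88, just below 1.895, so `reject_10pct` is `False`, matching the conclusion above.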
19.9 Sample size calculation
We have noted that for a fixed sample size n, if we decrease the Type I error α then we increase the Type II error β, and vice versa.
To control both the Type I and Type II errors, that is, to ensure that α and β are both sufficiently small, we need to choose an appropriate sample size n.
Sample size calculations are appropriate when we have two simple hypotheses to compare. For example, we have a random variable X with unknown mean μ=E[X] and known variance σ2=Var(X). We compare the hypotheses:
- H0:μ=μ0,
- H1:μ=μ1.
Without loss of generality we will assume that μ0<μ1.
Suppose that x1, x2, …, xn represent i.i.d. samples from X. Then by the central limit theorem X̄ is (approximately) N(μ, σ²/n) distributed, and the size-α test of H0 against H1 rejects H0 if
X̄ > μ0 + zα · σ/√n.
Note that as n increases, the cut-off for rejecting H0 decreases towards μ0.
We now consider the choice of n to ensure that the Type II error is at most β, or equivalently, that the power of the test is at least 1−β.
The power of the test is
1 − β = P(X̄ > μ0 + zα · σ/√n | μ = μ1) = 1 − Φ(zα − (μ1 − μ0)√n/σ),
where Φ denotes the standard normal c.d.f. Lemma 1 (Sample size calculation) gives the smallest sample size n to bound the Type I and Type II errors by α and β in the case where the variance σ² is known.
Sample size calculation.
Suppose that X is a random variable with unknown mean μ and known variance σ2.
The required sample size, n, to ensure significance level α and power 1−β for comparing hypotheses:
- H0:μ=μ0
- H1:μ=μ1
is: n = ((σ/(μ1 − μ0)) (zα − z1−β))².
The details of the proof of Lemma 1 (Sample size calculation) are provided but can be omitted.
Proof of Sample Size calculations.
The test of size α rejects H0 if X̄ > μ0 + zα · σ/√n. For the power of the test to be at least 1 − β we require
1 − Φ(zα − (μ1 − μ0)√n/σ) ≥ 1 − β,
that is, Φ(zα − (μ1 − μ0)√n/σ) ≤ β. Since Φ(z1−β) = β, this holds when
zα − (μ1 − μ0)√n/σ ≤ z1−β,
which rearranges to √n ≥ (σ/(μ1 − μ0))(zα − z1−β). Squaring gives the stated n. □
Note:
- We need larger n as σ increases. (More variability in the observations.)
- We need larger n as μ1−μ0 gets closer to 0. (Harder to detect a small difference in mean.)
- Typically α, β < 0.5, so zα > 0 and z1−β < 0. Hence, zα − z1−β becomes larger as α and β decrease. (Smaller errors require larger n.)
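Lemma 1 can be turned into a short function. An illustrative sketch in Python (the notes use R), where `inv_cdf` plays the role of `qnorm` and the example values (μ0 = 6.0, μ1 = 6.1, σ = 0.2) are made up:

```python
from statistics import NormalDist
from math import ceil

def sample_size(mu0, mu1, sigma, alpha, beta):
    """Smallest n giving significance level alpha and power at least 1 - beta
    for H0: mu = mu0 vs. H1: mu = mu1 (mu1 > mu0), sigma known (Lemma 1)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # z_alpha: P(Z > z_alpha) = alpha
    z_1mb = NormalDist().inv_cdf(beta)         # z_{1-beta}, negative when beta < 0.5
    n = (sigma / (mu1 - mu0) * (z_alpha - z_1mb)) ** 2
    return ceil(n)                             # round up to a whole number of observations

n = sample_size(mu0=6.0, mu1=6.1, sigma=0.2, alpha=0.05, beta=0.10)
```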
The Shiny App - Sample Size lets you explore the effect of μ1−μ0, σ and α on the sample size n or power 1−β.
Task: Lab 10
Attempt the R Markdown file for Lab 10:
Lab 10: Confidence intervals and hypothesis testing
Student Exercises
Attempt the exercises below.
Note that throughout the exercises, for a random variable X and 0<β<1, cβ satisfies P(X>cβ)=β.
Question 1.
Eleven bags of sugar, each nominally containing 1 kg, were randomly selected from a large batch. The weights of sugar were: … You may assume these values are from a normal distribution.
- Calculate a 95% confidence interval for the mean weight for the batch.
- Test the hypothesis H0:μ=1 vs H1:μ≠1. Give your answer in terms of a p-value.
Note that
Solution to Question 1.
Hence, the sample standard deviation is s = √`r varx` = `r sx`.
- The 95% confidence interval for the mean is given by x̄ ± tn−1,0.025 · s/√n. Now t10,0.025 = `r ta`. Hence the confidence interval is
`r mean(x)` ± (`r ta`)(`r sx`/√`r n`) = `r mean(x)` ± `r inta` = (`r mean(x)-inta`, `r mean(x)+inta`).
- The population variance is unknown so we apply a t-test with test statistic
t = (x̄ − μ0)/(s/√n) = (`r mean(x)` − 1)/(`r sx`/√`r n`) = `r tb`
and n − 1 degrees of freedom. The p-value is P(|t10| > `r tb`). From the critical values given, P(t10 > `r round(qnorm(0.999),4)`) = 0.001 and P(t10 > `r round(qnorm(0.9995),4)`) = 0.0005, so 0.0005 < P(t10 > `r tb`) < 0.001. Hence 0.001 < p < 0.002. Therefore, there is strong evidence that μ ≠ 1.
Question 2.
Random samples of 13 and 11 chicks, respectively, were given from birth a protein supplement, either oil meal or meat meal. The weights of the chicks when six weeks old were recorded and the following sample statistics obtained.
- Carry out an F-test to examine whether or not the groups have significantly different variances.
- Calculate a 95% confidence interval for the difference between weights of 6-week-old chicks on the two diet supplements.
- Do you consider that the supplements have a significantly different effect? Justify your answer.
Note that
F10,12:
Solution to Question 2.
We regard the data as being from two independent normal distributions with unknown variances.
- F-test: H0:σ21=σ22 vs. H1:σ21≠σ22.
We reject H0 if
F = s1²/s2² > Fn1−1,n2−1,α/2 or F = s1²/s2² < Fn1−1,n2−1,1−α/2 = 1/Fn2−1,n1−1,α/2.
Now, F12,10,0.025 = `r round(qf(0.975,12,10),4)` and F12,10,0.975 = 1/`r round(qf(0.975,10,12),4)` = `r round(1/qf(0.975,10,12),4)`. From the data, F = s1²/s2² = `r round(v1/v2,4)`, so we do not reject H0. There is no evidence against equal population variances.
- Assume σ1² = σ2² = σ² (unknown). The pooled estimate of the common variance σ² is
sp² = ((n1 − 1)s1² + (n2 − 1)s2²)/(n1 + n2 − 2) = `r vp`, so sp = `r sp`. The 95% confidence limits for μ1 − μ2 are
x̄1 − x̄2 ± t22,0.025 · sp √(1/n1 + 1/n2) = (`r x1` − `r x2`) ± (`r round(qt(0.975,nu),4)` × `r round(sqrt((1/n1)+(1/n2)),4)` × `r sp`) = `r x1-x2` ± `r AA`. So the interval is (`r x1-x2-AA`, `r x1-x2+AA`).
- Since the confidence interval in (b) includes zero (i.e. μ1 − μ2 = 0, so μ1 = μ2 is plausible), we conclude that the diet supplements do not have a significantly different effect (at the 5% level).
Question 3.
A random sample of 12 car drivers took part in an experiment to find out if alcohol increases the average reaction time. Each driver’s reaction time was measured in a laboratory before and after drinking a specified amount of alcoholic beverage. The reaction times were as follows:
Let μB and μA be the population mean reaction time, before and after drinking alcohol.
- Test H0:μB=μA vs. H1:μB≠μA assuming the two samples are independent.
- Test H0:μB=μA vs. H1:μB≠μA assuming the two samples contain `matched pairs’.
- Which of the tests in (a) and (b) is more appropriate for these data, and why?
Note that
and the critical values for t22 are given above in Question 2.
Solution to Question 3.
- The summary statistics of the reaction times before alcohol are x̄ = `r mb` and sx² = `r vb`. Similarly, the summary statistics after alcohol are ȳ = `r ma` and sy² = `r va`. Assuming both samples are from normal distributions with the same variance, the pooled variance estimator is
sp² = ((n − 1)sx² + (n − 1)sy²)/(2(n − 1)) = `r vp`. The null hypothesis is rejected at α = 0.05 if
t = |(x̄ − ȳ)/(sp √(1/n + 1/n))| > t`r 2*(n-1)`,0.025 = `r round(qt(0.975,2*nu),4)`. From the data,
t = |(`r mb` − `r ma`)/(`r sp` √(2/`r n`))| = `r T1`.
Hence, the null hypothesis is not rejected. There is no significant difference between the reaction times.
- The difference in reaction time for each driver is
- The difference in reaction time for each driver is

| Driver | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Before | `r before[1]` | `r before[2]` | `r before[3]` | `r before[4]` | `r before[5]` | `r before[6]` | `r before[7]` | `r before[8]` | `r before[9]` | `r before[10]` | `r before[11]` | `r before[12]` |
| After | `r after[1]` | `r after[2]` | `r after[3]` | `r after[4]` | `r after[5]` | `r after[6]` | `r after[7]` | `r after[8]` | `r after[9]` | `r after[10]` | `r after[11]` | `r after[12]` |
| Difference | `r diff[1]` | `r diff[2]` | `r diff[3]` | `r diff[4]` | `r diff[5]` | `r diff[6]` | `r diff[7]` | `r diff[8]` | `r diff[9]` | `r diff[10]` | `r diff[11]` | `r diff[12]` |

The sample mean and standard deviation of the differences are d̄ = `r md` and sd = `r sd`. Assuming the differences are samples from a normal distribution, the null hypothesis is rejected at α = 0.05 if
t = |d̄/(sd/√`r n`)| > t`r nu`,0.025 = `r round(qt(0.975,nu),4)`. From the data,
t = |`r md`/(`r sd`/√`r n`)| = `r round(md/(sd/sqrt(n)),4)`.
Hence, the null hypothesis is rejected. There is a significant difference between the reaction times.
- The matched pair test in (b) is more appropriate. By recording each driver’s reaction time before and after, and looking at the difference for each driver we are removing the driver effect. The driver effect says that some people are naturally slow both before and after alcohol, others are naturally quick. By working with the difference we have removed this factor.