32 Tests for comparing odds
So far, you have learnt to ask a RQ, identify different ways of obtaining data, design the study, collect the data, describe the data, summarise data graphically and numerically, understand the tools of inference, and form confidence intervals.
In this chapter, you will learn about hypothesis tests for odds ratios. You will learn to:
 conduct hypothesis tests for an OR (i.e., comparing two proportions, or comparing two odds), using chi-square tests and jamovi and SPSS output.
 determine whether the conditions for using these methods apply in a given situation.
32.1 Introduction: Meals on-campus
In Sect. 25.1 a study^{483} was introduced to examine the eating habits of university students.
Researchers cross-classified \(n = 183\) students into groups according to two qualitative variables:
 Where they live: With their parents, or not with their parents;
 Whether they eat most of their meals off-campus, or most of their meals on-campus.
Notice that the groups being compared (students who live with their parents and students who do not; or students who eat most meals off-campus and students who eat most meals on-campus) contain different students.
Hence, the comparison here is between individuals.
|  | Lives with parents | Doesn't live with parents | Total |
|---|---|---|---|
| Most off-campus | 52 | 105 | 157 |
| Most on-campus | 2 | 24 | 26 |
| Total | 54 | 129 | 183 |
Since both variables observed on each student (the unit of analysis) are qualitative, means are not appropriate. However, the data can be compiled into a two-way table of counts (Table 32.1).
Since both qualitative variables have two levels, the table is a \(2\times 2\) table. A graphical summary is shown in Fig. 25.1, and a numerical summary in Table 32.2. (The details of the computations appear in Sect. 25.1).
|  | Odds of having most meals off-campus | Percentage having most meals off-campus | Sample size |
|---|---|---|---|
| Living with parents | 26 | 96.3 | 54 |
| Not living with parents | 4.375 | 81.4 | 129 |
| Odds ratio | 5.943 |  |  |
The parameter is the population OR, comparing the odds of eating most meals off-campus for students living with their parents to students not living with their parents.
Understanding how software computes the odds ratio is very important for understanding the output.
In jamovi and SPSS, the odds ratio can be interpreted in either of these two ways (i.e., both ways are correct):
 The odds are the odds of eating most meals off-campus (Row 1 of Table 32.1). Then, the odds ratio compares these odds for students living with their parents (Column 1 of Table 32.1) to those not living with their parents (Column 2 of Table 32.1). That is, the odds are \(52/2 = 26\) (for those living with parents) and \(105/24 = 4.375\) (for those not living with parents), so the OR is then \(26/4.375 = 5.943\), as in the output (jamovi: Fig. 32.1; SPSS: Fig. 32.2).
 The odds are the odds of living with parents (Column 1 of Table 32.1). Then, the odds ratio compares these odds for students eating most meals off-campus (Row 1 of Table 32.1) to the odds for students eating most meals on-campus (Row 2 of Table 32.1). That is, the odds of living with parents are \(52/105 = 0.49524\) (for those eating most meals off-campus) and \(2/24 = 0.083333\) (for those eating most meals on-campus), so the OR is then \(0.49524/0.083333 = 5.943\), as in the output (jamovi: Fig. 32.1; SPSS: Fig. 32.2).
In other words, the odds and odds ratios are relative to the first row or first column.
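The two readings of the odds ratio can be checked with a few lines of arithmetic. The following Python sketch is not part of the jamovi or SPSS output; the variable names are made up for illustration, and the counts are those in Table 32.1.

```python
# Observed counts from Table 32.1.
off_with, off_without = 52, 105   # most meals off-campus
on_with, on_without = 2, 24       # most meals on-campus

# Way 1: odds of eating most meals off-campus, within each column.
odds_off_with = off_with / on_with            # 52/2 = 26
odds_off_without = off_without / on_without   # 105/24 = 4.375
or_by_columns = odds_off_with / odds_off_without

# Way 2: odds of living with parents, within each row.
odds_with_off = off_with / off_without        # 52/105
odds_with_on = on_with / on_without           # 2/24
or_by_rows = odds_with_off / odds_with_on

print(round(or_by_columns, 3), round(or_by_rows, 3))   # 5.943 5.943
```

Both routes reduce to \((52\times 24)/(105\times 2)\), which is why the two interpretations give the same odds ratio.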
Unlike the previous decision-making RQs, this RQ does not concern means. Instead, the RQ can be written in terms of comparing proportions, odds, or odds ratios.
For reasons that we can't delve into, usually the odds ratio (OR) is used as the parameter. One important reason is that software produces output related to testing the OR. Using the OR, the RQ could be written as
Is the population odds ratio of eating most meals off-campus, comparing students who live with their parents to students not living with their parents, equal to one?
An alternative, probably easier to understand, is to write the RQ in terms of comparing the odds in the two groups explicitly:
Are the population odds of students eating most meals off-campus the same for students living with their parents and for students not living with their parents?
The RQ can also be worded as comparing the percentages (or proportions) of students eating most meals off-campus in each group. This is equivalent to the RQs above, but is not directly related to the software output, which works with odds ratios.
Another alternative, which sounds less direct but is useful for two-way tables larger than \(2\times 2\) (see Sect. 32.10), is worded in terms of relationships or associations between the variables:
Is there a relationship (or association) between where students eat most of their meals and whether or not the student lives with their parents?
All of these are equivalent. Usually, for \(2\times2\) tables, working with odds or odds ratios is best, because most software (including jamovi and SPSS) readily produces CIs for the odds ratio.
32.2 Hypotheses and notation: Comparing odds
For two-way tables of counts, the parameter is the population odds ratio. As usual, the null hypothesis is the 'no difference, no change, no relationship' position. So in this context:

\(H_0\): The population OR is one;
or (equivalently):
The population odds are the same in each group.
This hypothesis proposes that the sample odds differ between the groups only because of sampling variation. This is the initial assumption.
The alternative hypothesis is

\(H_1\): The population OR is not one;
or (equivalently):
The population odds are not the same in each group.
For analysing two-way tables of counts, the alternative hypothesis is always two-tailed.
The hypotheses can also be written in terms of differences in percentages (or proportions), though the software output is usually expressed in terms of odds. The hypotheses can also be written in terms of associations:
 \(H_0\): In the population, there is no association between the two variables
 \(H_1\): In the population, there is an association between the two variables
The RQ and hypotheses only need to be given in one of these ways. The RQ and hypotheses should be consistent (for example, if the RQ is written in terms of odds, the hypotheses should be written in terms of odds).
As usual, following the decision-making process, start by assuming that the null hypothesis is true: that the population odds ratio is one.
32.3 Expected values: Comparing odds
Assuming that the odds of having most meals off-campus are the same for both groups (that is, the population OR is one), how would the sample OR be expected to vary from sample to sample just because of sampling variation?
If the population OR was one, the odds are the same in both groups; equivalently, the percentages are the same in both groups. That is, the percentage of students eating most meals off-campus is the same for students living with and not living with their parents.
Let's consider the implication. From Table 32.1, 157 students out of 183 ate most meals off-campus; that is,
\[ \frac{157}{183} \times 100 = 85.79\% \] of the students in the entire sample ate most of their meals off-campus.
If the percentage of students who eat most of their meals off-campus is the same for those who live with their parents and those who don't, then we'd expect 85.79% of the students in each group to eat most of their meals off-campus. That is, we would expect
 85.79% of the 54 students (that is, 46.33) who live with their parents to eat most meals off-campus; and
 85.79% of the 129 students (that is, 110.67) who don't live with their parents to eat most meals off-campus.
That is, the percentage (and hence the odds) is the same in each group. Those are the counts expected to appear if the percentage was exactly the same in each group (Table 32.3); that is, if the null hypothesis (the assumption) was true.
Consider the expected counts in Table 32.3.
Confirm that the odds of having most meals off-campus are the same for students living with their parents, and for students not living with their parents.
How do those expected values compare to what was observed? For example:
 46.33 of the 54 students who live with their parents are expected to eat most meals off-campus; yet we observed 52.
 110.67 of the 129 students who don't live with their parents are expected to eat most meals off-campus; yet we observed 105.
The observed and expected counts are similar, but not exactly the same. This is no surprise: each sample will produce slightly different observed counts (sampling variation). The difference between the observed and expected counts may be explained by sampling variation (that is, the null hypothesis explanation).
You do not have to compute the expected values when you answer one of these types of RQs (software does it for you).
However, seeing how the decision-making process works in this context is helpful.
When discussing previous hypothesis tests, the sampling distribution of the sample statistic (in this case, the sampling distribution of the sample odds ratio) was described, and this sampling distribution had an approximate normal distribution (whose standard deviation is called the standard error). However, the sampling distribution of the odds ratio is more involved^{484} so will not be presented.
|  | Lives with parents | Doesn't live with parents | Total |
|---|---|---|---|
| Most off-campus | 46.33 | 110.67 | 157 |
| Most on-campus | 7.67 | 18.33 | 26 |
| Total | 54.00 | 129.00 | 183 |
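Although software computes the expected counts, the calculation is simple enough to sketch. The Python below is an illustration only (the function name `expected_counts` is invented here): each expected count is the row total multiplied by the column total, divided by the overall total, which is the same as applying the overall 85.79% to each group.

```python
observed = [[52, 105],   # most meals off-campus
            [2, 24]]     # most meals on-campus

def expected_counts(table):
    """Expected counts if the row percentages were identical in every column."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print([round(cell, 2) for cell in row])
# [46.33, 110.67]
# [7.67, 18.33]
```

In the expected table, the odds of having most meals off-campus are \(157/26\) in both columns, as the null hypothesis requires.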
32.4 The test statistic: Comparing odds
The decision-making process compares what is expected from the sample statistic if the null hypothesis about the parameter is true (Table 32.3) to what is observed in the sample (Table 32.1).
Previously, when the summary statistics were means, \(t\)-tests were used. However, these data are not summarised by means, and a different test statistic is used.
Rather than using a \(t\)-score as the test statistic, the test statistic here is a 'chi-squared' statistic, written \(\chi^2\). A \(\chi^2\) statistic measures the overall size of the differences between the expected counts and observed counts, over the entire table.
The Greek letter \(\chi\) is pronounced 'ki', as in kite.
The test statistic \(\chi^2\) is pronounced as 'chi-squared'.
From the software (jamovi: Fig. 32.1; SPSS: Fig. 32.2), \(\chi^2=6.934\).
In a \(2\times 2\) table of counts (when the 'degrees of freedom', or df, is equal to 1, as shown in the computer output), the square root of the \(\chi^2\) value is approximately equivalent to a \(z\)-score.
So here, the equivalent \(z\)-score is about \(\sqrt{6.934} = 2.63\), which is fairly large: a small \(P\)-value is expected.
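The chapter does not spell out how \(\chi^2\) is computed. As a sketch, assuming the software reports the usual Pearson form of the statistic (the sum, over all cells, of \((\text{observed} - \text{expected})^2/\text{expected}\), with no continuity correction), the reported value can be reproduced from Table 32.1:

```python
import math

observed = [[52, 105], [2, 24]]                 # Table 32.1
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, r in enumerate(row_totals):
    for j, c in enumerate(col_totals):
        expected = r * c / n                    # expected count for this cell
        chi2 += (observed[i][j] - expected) ** 2 / expected

z_equiv = math.sqrt(chi2 / 1)   # df = 1 for a 2x2 table
print(round(chi2, 3), round(z_equiv, 2))        # 6.934 2.63
```

Cells where the observed count is far from the expected count (relative to the expected count) contribute most to \(\chi^2\); here, the largest contribution comes from the 'most on-campus, lives with parents' cell.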
More generally, for two-way tables of any size,
\[ \sqrt{\frac{\chi^2}{\text{df}}} \] is like a \(z\)-score, where df is the 'degrees of freedom' (related to the size of the table^{485}), as shown in the software output.
This allows a \(P\)-value to be estimated using the 68-95-99.7 rule from the value of the \(\chi^2\) statistic.
In a chi-squared test, with a given number of 'degrees of freedom' (written df in the software output), the value of
\[ \sqrt{\frac{\chi^2}{\text{df}}} \]
is like a \(z\)-score. This allows the \(P\)-value to be estimated using the 68-95-99.7 rule.
32.5 \(P\)-values: Comparing odds
The difference between the observed sample statistic (the sample OR) and the hypothesised population parameter (the population OR of one) is summarised by \(\chi^2=6.934\) (approximately equivalent to \(z=2.63\)). Using the 68-95-99.7 rule, a small \(P\)-value is expected.
The corresponding two-tailed \(P\)-value reported by jamovi (Fig. 32.1, under the p column) and SPSS (Fig. 32.2, in the Asymptotic Significance (2-sided) column and Pearson Chi-Square row) is very small (\(0.008\) to three decimals).
Recall that, for two-way tables of counts, the alternative hypotheses are always two-tailed, so a two-tailed \(P\)-value is always reported.
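For readers curious about where the reported \(P\)-value comes from, here is a minimal sketch for the special case df = 1 only. It relies on the fact that a \(\chi^2\) statistic with one degree of freedom behaves like a squared \(z\)-score, so the \(P\)-value is the two-tailed normal area beyond \(\sqrt{\chi^2}\). The function name `p_value_df1` is invented for this illustration; it is not a jamovi or SPSS function.

```python
import math

def p_value_df1(chi2):
    """Two-tailed P-value for a chi-squared statistic with df = 1."""
    z = math.sqrt(chi2)
    # erfc(z / sqrt(2)) is the area in both tails of a standard normal beyond |z|.
    return math.erfc(z / math.sqrt(2))

p = p_value_df1(6.934)
print(round(p, 3))   # 0.008
```

The result agrees with the software output to three decimals. For tables larger than \(2\times 2\) (where df is greater than 1) this shortcut does not apply.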
32.6 Conclusions: Comparing odds
As usual, a very small \(P\)-value (\(0.008\) to three decimals) means there is very strong evidence supporting \(H_1\): the evidence suggests a difference in the population odds in the two groups. We write:
The sample provides strong evidence (\(\chi^2=6.934\); two-tailed \(P=0.008\)) that the odds in the population of having most meals off-campus is different for students living with their parents (odds: 26) and students not living with their parents (odds: 4.375; OR: \(5.94\); 95% CI from \(1.35\) to \(26.1\)).
Again, as seen in Sect. 29.7, the conclusion includes three components: The answer to the RQ; the evidence used to reach that conclusion ('\(\chi^2=6.934\); two-tailed \(P=0.008\)'); and some sample summary statistics (including the 95% CI for the odds ratio).
The conclusion also makes clear what the odds and the odds ratio mean. The odds are described as the 'odds... of having most meals off-campus', and the OR as then comparing these odds between 'students living with their parents... and students not living with their parents'.
For two-way tables, RQs are best framed in terms of ORs or odds (but can be framed in terms of proportions or percentages, or associations or relationships).
For consistency: if the RQ is about the odds ratio, the hypotheses and conclusion should be about the odds ratio; if the RQ is about odds, the hypotheses and conclusion should be about the odds; and so on.
32.7 Statistical validity conditions
As usual, these results hold under certain conditions. The test above is statistically valid if:
 All expected counts are at least five.
Some books may give other (but similar) conditions.
In addition to the statistical validity condition, the test will be
 internally valid if the study was well designed; and
 externally valid if the sample is a simple random sample and is internally valid.
The statistical validity condition refers to the expected (not the observed) counts. SPSS tells us if a problem exists with the expected count condition, underneath the first output table in Fig. 25.3. In jamovi, the expected counts must be explicitly requested to see if this condition is satisfied (Fig. 32.3).
For the student-eating data, the smallest observed count is 2 (living with parents; most meals on-campus), but the smallest expected count is 7.67, which is greater than five. The size of the expected counts (not the observed counts) is what matters for the statistical validity condition.
Example 32.1 (Statistical validity) For the university-student eating data, all the cells have an expected count of at least five so the statistical validity condition is satisfied.
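The check in Example 32.1 can be sketched in Python as follows. This is an illustration only (the variable names are made up); in practice, the software output is used.

```python
observed = [[52, 105], [2, 24]]   # Table 32.1
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# The condition concerns the expected counts, not the observed counts.
expected = [r * c / n for r in row_totals for c in col_totals]
smallest_observed = min(min(row) for row in observed)
smallest_expected = min(expected)
condition_met = smallest_expected >= 5

print(smallest_observed, round(smallest_expected, 2), condition_met)
# 2 7.67 True
```

Even though one observed count is only 2, the condition is met because every expected count is at least five.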
32.8 Example: Pet birds
A study examined people with lung cancer, and a matched set of similar controls who did not have lung cancer, and compared the proportion in each group that had pet birds.^{486}
These data were studied in Sect. 25.6; the data are shown again in Table 32.4, and the numerical summary in Table 32.5 (the computations are shown in Sect. 25.6).
|  | Adults with lung cancer | Adults without lung cancer | Total |
|---|---|---|---|
| Kept pet birds | 98 | 101 | 199 |
| Did not keep pet birds | 141 | 328 | 469 |
| Total | 239 | 429 | 668 |
One RQ in the study was:
Are the odds of having a pet bird the same for people with lung cancer (cases) and for people without lung cancer (controls)?
The parameter is the population OR, comparing the odds of keeping a pet bird, for adults with lung cancer to adults who do not have lung cancer.
The RQ could also be written as:
 Is the percentage of people having a pet bird the same for people with lung cancer (cases) and for people without lung cancer (controls)?
 Is the odds ratio of people having a pet bird, comparing people with lung cancer (cases) to people without lung cancer (controls), equal to one?
 Is there a relationship between having a pet bird and having lung cancer?
Of these, the first is probably the easiest to understand.
From this RQ (which is written in terms of odds), the hypotheses could be written as:
 \(H_0\): The odds of having a pet bird is the same for people with lung cancer (cases) and for people without lung cancer (controls).
 \(H_1\): The odds of having a pet bird is not the same for people with lung cancer (cases) and for people without lung cancer (controls).
The null hypothesis could also be written as:
 The percentage of people having a pet bird is the same for people with lung cancer (cases) and for people without lung cancer (controls).
 The odds ratio of people having a pet bird, comparing people with lung cancer (cases) to people without lung cancer (controls), is equal to one.
 There is no relationship between having a pet bird and having lung cancer.
Of these, the first is probably the easiest to understand.
Begin by assuming the null hypothesis is true: no difference exists between the odds in the population. Based on this assumption, the expected counts can be found.
From the data (Table 32.4), overall \(199\div 668 \times 100 = 29.79\)% of people own a pet bird.
If there really was no difference in the odds (or the percentages) of owning a pet bird between those with and without lung cancer, about \(29.79\)% of the people in both lung cancer groups are expected to own a pet bird.
|  | Odds of keeping pet bird | Percentage keeping pet bird | Sample size |
|---|---|---|---|
| With lung cancer: | 0.6950 | 41.0% | 239 |
| Without lung cancer: | 0.3079 | 23.5% | 429 |
| Odds ratio: | 2.26 |  |  |
About 29.79% of the 239 lung-cancer cases (or 71.20) would be expected to have a pet bird, and about 29.79% of the 429 people without lung cancer (or 127.80) would be expected to have a pet bird.
A table of these expected counts (Table 32.6) shows that all expected counts are greater than five. In practice, you do not need to compute the expected counts; software does this automatically.
|  | Adults with lung cancer | Adults without lung cancer | Total |
|---|---|---|---|
| Kept pet birds | 71.2 | 127.8 | 199 |
| Did not keep pet birds | 167.8 | 301.2 | 469 |
| Total | 239.0 | 429.0 | 668 |
The numbers in Table 32.6 are what is expected if the percentage of people owning a pet bird is the same for people with and without lung cancer. How close are the expected and observed counts (in Table 32.4)?
To compare the sample statistic (what we observed) with the hypothesised population parameter, software is used to compute the value of \(\chi^2\) (jamovi: Fig. 32.4; SPSS: Fig. 32.5): \(\chi^2=22.374\), approximately equivalent to a \(z\)-score of
\[ \sqrt{22.374/1} = 4.730, \] which is very large. Hence, a small \(P\)-value is expected.
The software shows that the \(P\)-value is very small (\(P<0.001\)). As usual, a small \(P\)-value means that the observed data would be unlikely if \(H_0\) were true, so there is very strong evidence supporting \(H_1\). That is, the evidence suggests there is a difference in the odds in the population. We write:
The sample provides very strong evidence (\(\chi^2=22.374\); two-tailed \(P<0.001\)) that the odds in the population of having a pet bird is not the same for people with lung cancer (odds: 0.695) and for people without lung cancer (odds: 0.308; OR: \(2.26\); 95% CI from \(1.6\) to \(3.2\)).
This doesn't imply that owning a pet bird causes lung cancer. Why not?
Because the study is observational.
Confounders may explain the relationship (can you think of one?). In addition, maybe having lung cancer means that people seek companionship in the form of a pet.
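The odds ratio and its 95% CI quoted in the conclusion are taken from the software output. As a hedged sketch of where such an interval can come from, the Python below uses a common large-sample method (an interval computed on the log-odds-ratio scale, with multiplier 1.96). That the software uses exactly this method is an assumption made here, although the result does agree with the reported interval to the precision shown.

```python
import math

# Observed counts from Table 32.4.
a, b = 98, 101    # kept pet birds: with, without lung cancer
c, d = 141, 328   # did not keep pet birds: with, without lung cancer

odds_ratio = (a / c) / (b / d)
se_log_or = math.sqrt(1/a + 1/b + 1/c + 1/d)   # standard error of log(OR)
z_mult = 1.96                                   # approximate 95% multiplier
lower = math.exp(math.log(odds_ratio) - z_mult * se_log_or)
upper = math.exp(math.log(odds_ratio) + z_mult * se_log_or)

print(round(odds_ratio, 2), round(lower, 1), round(upper, 1))
# 2.26 1.6 3.2
```

The interval does not contain one, which is consistent with the small \(P\)-value from the \(\chi^2\) test.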
32.9 Example: B12 deficiency
A study in New Zealand^{487} asked:
Among a certain group of women, are the odds of being vitamin B12 deficient different for women on a vegetarian diet compared to women on a non-vegetarian diet?
The population was 'predominantly overweight/obese women of South Asian origin living in Auckland'. The RQ could be worded in terms of odds ratios or proportions, too.
To test the claim, the hypotheses are:
 \(H_0\): \(\text{population odds for vegetarians} = \text{population odds for non-vegetarians}\): The odds of B12 deficiency are the same in both groups.
 \(H_1\): \(\text{population odds for vegetarians} \ne \text{population odds for non-vegetarians}\): The odds of B12 deficiency are not the same in both groups.
The parameter is the population OR, comparing the odds of being B12 deficient, for vegetarians to non-vegetarians.
Here, the odds refer to the odds of a woman being B12 deficient. As with the RQ, the hypotheses could be worded in terms of odds ratios, proportions (or percentages), or relationships.
The data are shown in Table 32.7, and the numerical summary in Table 32.8.
Since the RQ is about odds, a side-by-side bar chart is produced (Fig. 32.6) as the graphical summary.
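|  | B12 deficient | Not B12 deficient | Total |
|---|---|---|---|
| Vegetarians | 8 | 26 | 34 |
| Non-vegetarians | 8 | 82 | 90 |
| Total | 16 | 108 | 124 |

|  | Odds B12 deficient | Percentage B12 deficient | Sample size |
|---|---|---|---|
| Vegetarians: | 0.3077 | 23.5% | 34 |
| Non-vegetarians: | 0.0976 | 8.9% | 90 |
| Odds ratio: | 3.15 |  |  |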
The software output (jamovi: Fig. 32.7; SPSS: Fig. 32.8) shows that the OR (and 95% CI) is \(3.154\) (\(1.077\) to \(9.238\)). The chi-square value is \(4.707\), approximately equivalent to a \(z\)-score of
\[ \sqrt{\frac{4.707}{1}} = 2.17; \] a small \(P\)-value is expected using the 68-95-99.7 rule.
The software output shows that the two-tailed \(P\)-value is \(0.030\), which is indeed 'small'.
We conclude:
The sample provides moderate evidence (\(\chi^2 = 4.707\); \(P=0.030\)) that the odds in the population of being vitamin B12 deficient is different for vegetarian women (odds: 0.3077) compared to non-vegetarian women (odds: 0.0976; OR: \(3.2\); 95% CI: \(1.1\) to \(9.2\)).
The statistical validity condition should be checked. The jamovi output (Fig. 32.7) shows that the smallest expected count is 4.39. Likewise, the text under the first table of SPSS output in Fig. 32.8 says that
1 cells (25.0%) have expected count less than 5. The minimum expected count is 4.39.
The smallest expected count is smaller than five, so the results may be statistically invalid. Nonetheless, only one cell has an expected count less than five, and only just under 5, so we shouldn't be too concerned (but it should be noted).
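As a check on the software output, the smallest expected count can be found by hand. It belongs to the cell with the smallest row total and the smallest column total in Table 32.7 (vegetarians who are B12 deficient), using the same 'row total times column total, divided by overall total' calculation as in Sect. 32.3:

\[ \text{smallest expected count} = \frac{34 \times 16}{124} \approx 4.39. \]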
32.10 Example: Kerbside dumping
A study of dumping household goods on the kerbside in Brisbane^{488} asked people about their opinions on the dumping. All participants were from Brisbane suburbs where a high level of kerbside dumping occurred.
The data are summarised in Table 32.9. Notice that this is a \(2\times 3\) table of counts, so it is more difficult to define a parameter.
|  | Acceptable | Not acceptable | Conditionally acceptable |
|---|---|---|---|
| Reusable | 22 | 18 | 15 |
| Non-reusable | 6 | 36 | 1 |
The software output is shown in Fig. 32.9 (for jamovi), and Fig. 32.10 (for SPSS); a graphical summary in Fig. 32.11.
Most of the numerical summary must be produced manually (Table 32.10), since software only produces odds ratios for \(2\times 2\) tables.
|  | Odds | Odds ratio | Percentage | Sample size |
|---|---|---|---|---|
| Acceptable: | 3.667 | (Reference) | 78.6% | 28 |
| Not acceptable: | 0.5 | 0.136 | 33.3% | 54 |
| Conditionally acceptable: | 15 | 4.09 | 93.8% | 16 |
In Table 32.10, the odds are the odds that a given opinion refers to reusable goods. Here are some of the details of these calculations:
 For Acceptable goods: the odds that these are reusable goods is \(22/6 = 3.667\).
 For Not acceptable goods: the odds that these are reusable goods is \(18/36 = 0.5\).
 For Conditionally acceptable goods: the odds that these are reusable goods is \(15/1 = 15\).
Then the odds ratios can be computed:
 Comparing the odds of Not acceptable to Acceptable: \(0.5/3.667 = 0.136\).
 Comparing the odds of Conditionally acceptable to Acceptable: \(15/3.667 = 4.09\).
Note that Table 32.10 has three groups to compare, so three odds calculations.
However, the summary has \(3 - 1 = 2\) odds ratios, since odds ratios compare two odds. The level to which the other two are compared is called the Reference level. In Table 32.10, the reference level is 'Acceptable'.
(In a \(2\times 2\) table, with two groups to compare, the summary has only \(2 - 1 = 1\) odds ratio.)
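Since this part of the numerical summary must be produced manually, a short script can help avoid arithmetic slips. The following Python sketch (with made-up variable names) reproduces the odds and odds ratios in Table 32.10 from the counts in Table 32.9, using 'Acceptable' as the reference level.

```python
# Observed counts from Table 32.9: (reusable, non-reusable) for each opinion.
counts = {
    "Acceptable": (22, 6),
    "Not acceptable": (18, 36),
    "Conditionally acceptable": (15, 1),
}
reference = "Acceptable"

# Odds that a given opinion refers to reusable goods.
odds = {opinion: reusable / non_reusable
        for opinion, (reusable, non_reusable) in counts.items()}

# Each odds ratio compares one level to the reference level.
odds_ratios = {opinion: odds[opinion] / odds[reference]
               for opinion in counts if opinion != reference}

for opinion, value in odds.items():
    print(opinion, round(value, 3))
for opinion, value in odds_ratios.items():
    print(opinion, "vs", reference, round(value, 3))
```

Changing `reference` changes the values of the odds ratios, but there are always one fewer odds ratios than groups.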
The hypotheses can be expressed in many ways (in terms of odds, odds ratios, or percentages), but perhaps the easiest approach with two-way tables larger than \(2\times 2\) is to word them in terms of relationships or associations between the two variables:
 \(H_0\): There is no association between the type of rubbish and the opinion of kerbside dumping.
 \(H_1\): There is an association between the type of rubbish and the opinion of kerbside dumping.
From the software output, the \(\chi^2\) value is 26.318 and the degrees of freedom is two, so this \(\chi^2\) value is approximately equivalent to a \(z\)-score of
\[ \sqrt{\frac{26.318}{2}} = 3.63. \] This is a large \(z\)-score so, using the 68-95-99.7 rule, a very small \(P\)-value is expected; indeed, the software output reports \(P<0.001\). This suggests very strong evidence in the sample that opinions are not the same for reusable and non-reusable rubbish.
The conclusion could be written as
The sample provides very strong evidence (\(\chi^2=26.318\); \(\text{df}=2\); \(P<0.001\)) that there is a relationship in the population between opinions about kerbside dumping and the type of rubbish.
While sample summary information could be added, the conclusion statements then become cumbersome. Instead, pointing readers to the numerical summary (Table 32.10) is probably better. Furthermore, CIs are not reported since the software does not produce CIs for tables larger than \(2\times 2\).
All expected counts exceed five (as shown in the jamovi output (Fig. 32.9) and the SPSS output (Fig. 32.10)), even though one observed count is less than five. The results are statistically valid.
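The same calculations used for \(2\times 2\) tables extend to larger tables. The sketch below is an illustration under two assumptions not stated in this chapter: that the software reports the usual Pearson \(\chi^2\) statistic, and that the degrees of freedom are (number of rows minus one) times (number of columns minus one). Both reproduce the reported values for Table 32.9.

```python
import math

observed = [[22, 18, 15],   # reusable
            [6, 36, 1]]     # non-reusable
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all six cells.
chi2 = sum((observed[i][j] - r * c / n) ** 2 / (r * c / n)
           for i, r in enumerate(row_totals)
           for j, c in enumerate(col_totals))
df = (len(row_totals) - 1) * (len(col_totals) - 1)
z_equiv = math.sqrt(chi2 / df)
p_value = math.exp(-chi2 / 2)   # exact upper-tail area only when df = 2
smallest_expected = min(r * c / n for r in row_totals for c in col_totals)

print(round(chi2, 3), df, round(z_equiv, 2), p_value < 0.001,
      round(smallest_expected, 2))
# 26.318 2 3.63 True 7.02
```

The smallest expected count (7.02) belongs to the cell whose observed count is 1, which again shows that the validity condition is about expected counts rather than observed counts.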
32.11 Summary
To test a hypothesis about a population odds ratio, based on the value of the sample odds ratio, initially assume the value of the population odds ratio in the null hypothesis (usually one) to be true. Then, expected counts (Step 2) can be computed. Since the sample odds ratio varies from sample to sample, under certain statistical validity conditions a quantity closely related to the sample odds ratio varies with an approximate normal distribution. This distribution describes what values of the sample odds ratio could be expected in the sample if the value of the population odds ratio in the null hypothesis was true. The test statistic is a \(\chi^2\) statistic, which compares the expected and observed counts. (The value of \(\sqrt{\chi^2/\text{df}}\) is like a \(z\)-score, where 'df' is the 'degrees of freedom' reported by software, and so an approximate \(P\)-value can be estimated using the 68-95-99.7 rule.) Software reports the \(P\)-value to assess whether the data are consistent (Step 4) with the assumption.
32.12 Quick review questions
A study^{489} of the adoption of electric vehicles (EVs) by a certain group of professional Americans (Example 5.14) compiled the data in Table 32.11. Output from using jamovi is shown in Fig. 32.12.
|  | Yes | No |
|---|---|---|
| No postgrad | 24 | 8 |
| Postgrad study | 51 | 29 |
The \(\chi^2\) value is:
The approximately-equivalent \(z\)-score (to two decimal places) is:

Using the 68-95-99.7 rule, the approximate \(P\)-value is:
From the software output, the \(P\)-value is:

The alternative hypothesis will be:
True or false: There is no evidence of a difference in the odds of buying a car in the next 10 years, between those with and without postgraduate study.
32.13 Exercises
Selected answers are available in Sect. D.30.
Exercise 32.1 Researchers^{490} studied the number of sandflies caught in light traps set at 3 and 35 feet above ground in eastern Panama. They asked:
In eastern Panama, are the odds of finding a male sandfly the same at 3 feet above ground as at 35 feet above ground?
The data are compiled into a table (Table 32.12), and summarised numerically (Table 32.13; partially edited) and graphically (Fig. 32.13).
Use the jamovi output (Fig. 32.14) to evaluate the evidence, complete Table 32.13, and write a conclusion.
|  | 3 feet above ground | 35 feet above ground |
|---|---|---|
| Males | 173 | 125 |
| Females | 150 | 73 |

|  | Odds | Percentage | Sample size |
|---|---|---|---|
| 3 feet: | ?? | ?? | 298 |
| 35 feet: | 1.71 | 67.3% | 223 |
| Odds ratio: | 0.67 |  |  |
Exercise 32.2 A prospective observational study in Western Australia compared the heights of scars from burns received.^{491} The data are shown in Table 32.14. SPSS was used to analyse the data (Fig. 32.15). (This study also appeared in Exercise 25.1, where the odds ratio, and the CI for the odds ratio, were computed.)
 Perform a hypothesis test to determine if the odds of having a smooth scar are the same for women and men.
 Write down the conclusion.
 Is the test statistically valid?
|  | Women | Men |
|---|---|---|
| Scar height 0mm (smooth) | 99 | 216 |
| Scar height more than 0mm, less than 1mm | 62 | 115 |
Exercise 32.3 In a study of turbine failures,^{492} 73 turbines were run for around 1800 hours, and seven developed fissures (small cracks). Forty-two different turbines were run for about 3000 hours, and nine developed fissures.
Exercise 32.4 The Southern Oscillation Index (SOI) is a standardised measure of the air pressure difference between Tahiti and Darwin, and has been shown to be related to rainfall in some parts of the world,^{493} and especially Queensland.^{494}
As an example,^{495} the rainfall at Emerald (Queensland) was recorded for Augusts from 1889 to 2002 inclusive, in Augusts when the monthly average SOI was positive, and when the SOI was non-positive (that is, zero or negative), as shown in Table 32.15. (This study also appeared in Exercise 25.4.)
 Using the jamovi output in Fig. 32.18, perform a hypothesis test to determine if the odds of having no rain are the same in Augusts with non-positive and positive SOI.
 Write down the conclusion.
 Is the test statistically valid?
|  | Non-positive SOI | Positive SOI |
|---|---|---|
| No rainfall recorded | 14 | 7 |
| Rainfall recorded | 40 | 53 |
Exercise 32.5 A research study conducted in Brisbane^{496} recorded the number of people at the foot of the Goodwill Bridge, Southbank, who wore sunglasses and hats.
The data were recorded between 11:30am and 12:30pm. Table 32.16 records the number of females and males wearing hats.
 Compute the percentage of females wearing a hat.
 Compute the percentage of males wearing a hat.
 Compute the odds of a female wearing a hat.
 Compute the odds of a male wearing a hat.
 Compute the odds ratio of wearing a hat, comparing females to males.
 Compute the odds ratio of wearing a hat, comparing males to females.
 Find the 95% CI for the appropriate OR.
 Using the SPSS output in Fig. 32.19, perform a hypothesis test to determine if the odds of wearing a hat is the same for females and males.
 Write down the conclusion.
 Is the test statistically valid?
|  | Not wearing hat | Wearing hat |
|---|---|---|
| Male | 307 | 79 |
| Female | 344 | 22 |
Exercise 32.6 A study^{497} asked people about their mobile-phone interactions while crossing the road as pedestrians. Part of the data are summarised in Table 32.17.
 Compute the column percentages.
 Compute the odds of low exposure to each behaviour.
 Write the hypotheses for conducting a hypothesis test.
 Compute the expected counts.
 After analysis in jamovi, the value of \(\chi^2\) is 20.923 with two degrees of freedom. What is the approximately-equivalent \(z\)-score? Would you expect a large or small \(P\)-value?
 The \(P\)-value is given as \(P<0.000\). Write a conclusion.
|  | Answer call | Respond to text | Reply to email |
|---|---|---|---|
| Low exposure | 263 | 259 | 302 |
| High exposure | 94 | 98 | 51 |