36 Tests for comparing odds
So far, you have learnt to ask a RQ, design a study, describe and summarise the data, understand the decision-making process and to work with probabilities. You have been introduced to the construction of confidence intervals, and to hypothesis testing. In this chapter, you will learn to:
 conduct hypothesis tests for an OR (i.e., comparing two proportions, or comparing two odds), using chi-square tests in software output.
 determine whether the conditions for using these methods apply in a given situation.
36.1 Introduction: meals on-campus
In Sect. 29.1, a study was introduced that examined the eating habits of university students (Mann and Blotnicky 2017). Researchers classified \(n = 183\) students into groups according to two qualitative variables (Table 36.1): where they lived, and where they ate most of their meals.
Every cell in the \(2\times 2\) table contains different students, so the comparison is between individuals.
Lives with parents  Doesn't live with parents  Total  

Most meals off-campus  \(52\)  \(105\)  \(157\) 
Most meals on-campus  \(\phantom{0}2\)  \(\phantom{0}24\)  \(\phantom{0}26\) 
Total  \(54\)  \(129\)  \(183\) 
Since both qualitative variables have two levels, the table is a \(2\times 2\) table. A graphical summary is shown in Fig. 29.1 (left panel), and a numerical summary in Table 36.2. (The details of the computations appear in Sect. 29.1).
Odds of having most meals off-campus  Percentage having most meals off-campus  Sample size  

Living with parents  \(26.000\)  \(96.3\)  \(\phantom{0}54\) 
Not living with parents  \(\phantom{0}4.375\)  \(81.4\)  \(129\) 
Odds ratio  \(\phantom{0}5.943\) 
The parameter is the population OR of eating most meals off-campus, comparing students living with their parents to students not living with their parents.
Understanding how software computes the odds ratio is important for understanding the output. In a \(2\times 2\) table, the jamovi output can be interpreted in either of these ways (i.e., both are correct):
 The odds compare Row 1 counts to Row 2 counts, for both columns. Then the odds ratio compares the odds for Column 1 to the odds for Column 2.
 The odds compare Column 1 counts to Column 2 counts, for both rows. Then the odds ratio compares the odds for Row 1 to the odds for Row 2.
Odds and odds ratios are computed with the last row and last column values on the bottom of the fraction.
Example 36.1 (Odds and odds ratio in software) For the data in Table 36.1, the software output can be interpreted in either of these ways (i.e., both are correct):

The odds are the odds of eating most meals off-campus (Row 1) compared to on-campus (Row 2; on the bottom of the fraction):
 for students living with their parents (Column 1): \(52/2 = 26\);
 for students not living with their parents (Column 2): \(105/24 = 4.375\).
So the OR is \(26/4.375 = 5.943\) (Column 2 on bottom of the fraction), as in the output (Fig. 36.1).

The odds are the odds of living with parents (Column 1) compared to not living with parents (Column 2; on the bottom of the fraction):
 for those eating most meals off-campus: \(52/105 = 0.49524\);
 for those eating most meals on-campus: \(2/24 = 0.083333\).
So the OR is \(0.49524/0.083333 = 5.943\) (Row 2 on bottom of the fraction), as in the output (Fig. 36.1).
In other words, the odds and odds ratios use the last row or last column on the bottom of the fraction.
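The two interpretations in Example 36.1 can be checked with a few lines of arithmetic. The following is a minimal sketch in Python (the variable names are illustrative only, and are not part of the jamovi output), using the counts from Table 36.1:

```python
# Observed counts from Table 36.1
# Each list is [lives with parents, doesn't live with parents]
off_campus = [52, 105]   # Row 1: most meals off-campus
on_campus = [2, 24]      # Row 2: most meals on-campus

# Interpretation 1: odds of off-campus meals within each column,
# then compare Column 1 to Column 2
odds_with_parents = off_campus[0] / on_campus[0]        # 52/2
odds_not_with_parents = off_campus[1] / on_campus[1]    # 105/24
or_by_columns = odds_with_parents / odds_not_with_parents

# Interpretation 2: odds of living with parents within each row,
# then compare Row 1 to Row 2
odds_off = off_campus[0] / off_campus[1]   # 52/105
odds_on = on_campus[0] / on_campus[1]      # 2/24
or_by_rows = odds_off / odds_on

print(round(or_by_columns, 3), round(or_by_rows, 3))
```

Both routes give the same odds ratio because each reduces to \((52\times 24)/(2\times 105)\).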
The RQ can be written using proportions, odds, or odds ratios. Means are not appropriate (the data contain two qualitative variables.) Using the OR, the RQ could be written as
Is the population odds ratio of eating most meals off-campus, comparing students who live with their parents to students not living with their parents, equal to one?
Alternatively, and probably easier to understand, the RQ can be written in terms of comparing the odds in the two groups:
Are the population odds of students eating most meals off-campus the same for students living with their parents and for students not living with their parents?
Equivalently, the RQ can also be worded as comparing the percentage (or proportion) of students eating meals off-campus in each group, though this is less common. However, these are not directly related to the software output (which works with odds ratios). Another alternative, which sounds less direct but is useful for two-way tables larger than \(2\times 2\) (see Sect. 36.9), is worded in terms of relationships or associations (but not correlations) between the variables:
Is there a relationship (or association) between where students eat most of their meals and whether or not the student lives with their parents?
All of these are equivalent. Usually, for \(2\times 2\) tables, working with odds or odds ratios is best, because most software (including jamovi) readily produces output for the OR.
36.2 Statistical hypotheses and notation
For \(2\times 2\) tables of counts, the parameter is the population odds ratio. As usual, the null hypothesis is the 'no difference, no change, no relationship' position. So, in this context:

\(H_0\): The population OR is one; or (equivalently):
The population odds are the same in each group.
This hypothesis proposes that the sample odds are not the same only due to sampling variation. This is the initial assumption. The alternative hypothesis is

\(H_1\): The population OR is not one; or (equivalently):
The population odds are not the same in each group.
For analysing two-way tables of counts, the alternative hypotheses are always two-tailed.
The hypotheses can also be written in terms of differences in percentages (or proportions), though the software output is usually expressed in terms of odds. The hypotheses can also be written in terms of relationships or associations:
 \(H_0\): In the population, there is no association between the two variables
 \(H_1\): In the population, there is an association between the two variables
The RQ and hypotheses only need to be given in one of these ways. The RQ and hypotheses should be consistent; for example, if the RQ is written in terms of odds, the hypotheses should be written in terms of odds.
As usual, the decision-making process starts by assuming the null hypothesis is true: that the population odds ratio is one (i.e., the population odds in each group are equal).
36.3 Finding expected counts
Assuming that the odds of having most meals off-campus are the same for both groups (that is, the population OR is one), how would the sample OR be expected to vary from sample to sample just because of sampling variation? If the null hypothesis is true, the odds are the same in both groups (and the percentages are the same in both groups). That is, the percentage of students eating most meals off-campus is the same for students living with and not living with their parents.
Let's consider the implication. From Table 36.1, \(157\) students out of \(183\) ate most meals off-campus, so that \(157\div 183 \times 100 = 85.79\)% of the students in the entire sample ate most of their meals off-campus.
If the percentage of students who eat most of their meals off-campus is the same for those who live with their parents and those who don't, then we'd expect \(85.79\)% of students in both groups to be eating most meals off-campus. (These were also found in Sect. 29.5.) That is, we would expect:
 \(85.79\)% of the \(54\) students who live with their parents (i.e., \(46.33\)) to eat most meals off-campus; and
 \(85.79\)% of the \(129\) students who don't live with their parents (i.e., \(110.67\)) to eat most meals off-campus.
In other words, the percentage (and hence the odds) is the same in each group. Those are the expected counts if the percentage was exactly the same in each group (Table 36.3), if the null hypothesis (the assumption) was true.
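Applying the overall percentage to each group is the same as computing (row total \(\times\) column total) \(\div\) \(n\) for every cell. As a rough sketch of what the software does in the background (the variable names here are illustrative assumptions), the expected counts can be reproduced in Python:

```python
# Observed counts from Table 36.1
# Rows: most meals off-campus, most meals on-campus
# Columns: lives with parents, doesn't live with parents
observed = [[52, 105],
            [2, 24]]

row_totals = [sum(row) for row in observed]          # 157 and 26
col_totals = [sum(col) for col in zip(*observed)]    # 54 and 129
n = sum(row_totals)                                  # 183

# Expected count for each cell, assuming the null hypothesis is true
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print([round(value, 3) for value in row])
```

The rounded values agree with the expected counts shown in Table 36.3.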
Consider the expected counts in Table 36.3. Confirm that the odds of having most meals off-campus are the same for students living with their parents, and for students not living with their parents.
 Living with parents: \(46.328/7.672 = 6.039\).
 Not living with parents: \(110.672/18.328 = 6.038\).
The odds are the same (the small difference is because the expected counts are only given to three decimal places).
How close are the observed counts (Table 36.1) to the expected counts (Table 36.3)?
 \(46.328\) of the \(54\) students who live with their parents are expected to eat most meals off-campus; yet we observed \(52\).
 \(110.672\) of the \(129\) students who don't live with their parents are expected to eat most meals off-campus; yet we observed \(105\).
The observed and expected counts are similar, but not exactly the same. The difference between the observed and expected counts may be explained by sampling variation (that is, the null hypothesis explanation).
You do not have to compute the expected values when you answer one of these types of RQs (software does it in the background). However, seeing how the decision-making process works in this context is helpful.
In previous hypothesis tests, the sampling distribution had an approximate normal distribution (whose standard deviation is called the standard error). However, the sampling distribution of the odds ratio is more complicated^{13} so will not be presented. We will use software output instead.
Lives with parents  Doesn't live with parents  Total  

Most meals off-campus  \(46.328\)  \(110.672\)  \(157\) 
Most meals on-campus  \(\phantom{0}7.672\)  \(\phantom{0}18.328\)  \(\phantom{0}26\) 
Total  \(54.000\)  \(129.000\)  \(183\) 
36.4 Computing the value of the test statistic
The decision-making process compares what is expected if the null hypothesis about the parameter is true (Table 36.3) to what is observed in the sample (Table 36.1). Previously, when the summary statistics were means, the sampling distribution was a normal distribution, and a \(t\)-score was the test statistic. However, the data here are not summarised by means, the sampling distribution is not a normal distribution (but is related to a normal distribution), and a different test statistic is needed.
Here, the test statistic is a 'chi-squared' statistic, written \(\chi^2\). A \(\chi^2\) statistic measures the overall size of the differences between the expected counts and observed counts, over the entire \(2\times 2\) table.
The Greek letter \(\chi\) is pronounced 'ki', as in kite (not "chi" as in China).
The test statistic \(\chi^2\) is pronounced as 'chi-squared'.
From the software (Fig. 36.1), \(\chi^2 = 6.934\).
What does this value mean?
The \(\chi^2\)-value is better understood by finding the equivalent \(z\)-score, which allows a \(P\)-value to be estimated using the \(68\)-\(95\)-\(99.7\) rule.
In a \(2\times 2\) table of counts (when the 'degrees of freedom'^{14}, or df, is equal to 1, as in the computer output), the square root of the \(\chi^2\) value is equivalent to a \(z\)-score of about \(\sqrt{6.934} = 2.63\).
This is a large \(z\)-score, so expect a small \(P\)-value.
For two-way tables of any size, a more general (but simple) calculation is needed.
In a chi-squared test, with a given number of 'degrees of freedom' (df in the software output), the value of
\[
\sqrt{ \chi^2 \div {\text{df}}}
\]
is like a \(z\)-score.
This allows the \(P\)-value to be estimated using the \(68\)-\(95\)-\(99.7\) rule.
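To see where a value such as \(\chi^2 = 6.934\) comes from, the following sketch sums \((\text{observed} - \text{expected})^2/\text{expected}\) over every cell of Table 36.1, then converts the result to the \(z\)-like value described above. Two assumptions are made here that are not stated in the text: that the software reports this uncorrected (Pearson) form of the statistic, and that the degrees of freedom are (rows \(- 1\)) \(\times\) (columns \(- 1\)), which is the usual rule for two-way tables.

```python
import math

# Observed counts from Table 36.1
observed = [[52, 105],
            [2, 24]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts if the null hypothesis is true
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-squared: sum over all cells of (observed - expected)^2 / expected
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# Assumed rule: df = (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# The z-like value from the text: sqrt(chi2 / df)
z_like = math.sqrt(chi2 / df)

print(round(chi2, 3), df, round(z_like, 2))
```

For these data the sketch reproduces the values quoted in the text: \(\chi^2 = 6.934\) with \(\text{df} = 1\), equivalent to a \(z\)-score of about \(2.63\).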
36.5 Determining \(P\)-values
The difference between the observed sample statistic (the sample OR) and the hypothesised population parameter (the population OR of one) is summarised by \(\chi^2 = 6.934\), approximately equivalent to \(z = 2.63\). Using the \(68\)-\(95\)-\(99.7\) rule, a small \(P\)-value is expected.
The two-tailed \(P\)-value reported by jamovi (Fig. 36.1, under the column p) is indeed small: \(0.008\) to three decimals.
Recall that, for two-way tables of counts, the alternative hypotheses are always two-tailed, so a two-tailed \(P\)-value is always reported.
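The reported \(P\)-value can be checked directly when \(\text{df} = 1\). In that case, a \(\chi^2\) value behaves like the square of a \(z\)-score, so the two-tailed \(P\)-value is the area in both tails of the standard normal distribution beyond \(\sqrt{\chi^2}\). The sketch below is an illustration only (it is not how the text asks you to find \(P\)-values, and it applies only when \(\text{df} = 1\)); it uses the complementary error function from Python's standard library:

```python
import math

chi2 = 6.934   # chi-squared value from the jamovi output (df = 1)

# For df = 1 only: P(chi-squared > x) = erfc(sqrt(x / 2)),
# which is the two-tailed normal area beyond z = sqrt(x)
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(p_value, 3))
```

The result rounds to \(0.008\), in agreement with the output.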
36.6 Writing conclusions
As usual, a very small \(P\)-value (\(0.008\) to three decimals) means very strong evidence exists to support \(H_1\): the evidence suggests a difference in the population odds in the two groups. We write:
The sample provides strong evidence (\(\chi^2 = 6.934\); two-tailed \(P = 0.008\)) that the odds in the population of having most meals off-campus are different for students living with their parents (odds: \(26\)) and students not living with their parents (odds: \(4.375\); OR: \(5.94\); \(95\)% CI from \(1.35\) to \(26.1\)).
The conclusion includes three components (Sect. 33.8): The answer to the RQ; the evidence used to reach that conclusion ('\(\chi^2 = 6.934\); two-tailed \(P = 0.008\)'); and some sample summary statistics (including the \(95\)% CI for the odds ratio).
The conclusion also makes clear what the odds and the odds ratio mean. The odds are described as the 'odds... of having most meals off-campus', and the OR as then comparing these odds between 'students living with their parents... and students not living with their parents'.
For two-way tables, RQs are best framed in terms of ORs or odds (but can be framed in terms of proportions or percentages, or associations or relationships).
For consistency: if the RQ is about the odds ratio, the hypotheses and conclusion should be about the odds ratio; if the RQ is about odds, the hypotheses and conclusion should be about the odds; and so on.
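The \(95\)% CI quoted in the conclusion comes from the software output. As a hedged illustration, the sketch below uses one common method (a normal approximation on the log-odds-ratio scale, with a multiplier of \(1.96\)); this choice of method is an assumption rather than something stated in the text, although it does reproduce the reported interval for these data:

```python
import math

# Observed counts from Table 36.1
a, b = 52, 105   # most meals off-campus: with parents, not with parents
c, d = 2, 24     # most meals on-campus:  with parents, not with parents

odds_ratio = (a / c) / (b / d)

# Assumed method: normal approximation for the log of the odds ratio
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
z_star = 1.96    # multiplier for an approximate 95% interval

lower = math.exp(math.log(odds_ratio) - z_star * se_log_or)
upper = math.exp(math.log(odds_ratio) + z_star * se_log_or)

print(round(odds_ratio, 2), round(lower, 2), round(upper, 1))
```

The interval is wide because one cell contains only two students, which makes the standard error of the log odds ratio large.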
36.7 Statistical validity conditions
As usual, these results hold under certain conditions. The test above is statistically valid if:
 All expected counts are at least five.
Some books may give other (but similar) conditions.
The statistical validity condition refers to the expected (not the observed) counts. In jamovi, the expected counts must be explicitly requested to see if this condition is satisfied (Fig. 36.2).
For the student-eating data, the smallest observed count is \(2\) (living with parents; most meals on-campus), but the smallest expected count is \(7.67\), which is greater than five. The size of the expected counts is important for the statistical validity condition.
Example 36.2 (Statistical validity) For the university-student eating data, all the cells have an expected count of at least five so the statistical validity condition is satisfied.
36.8 Example: turtle nests
(This study was seen in Sect. 29.6.) The hatching success of loggerhead turtles on Mediterranean beaches is often compromised by fungi and bacteria. A study (Candan, Katılmış, and Ergin 2021) compared the proportion of infected nests between nests relocated due to the risk of tidal inundation, and non-relocated (natural) nests (Table 36.4). The researchers were interested in knowing:
For Mediterranean loggerhead turtles, are the odds of infection the same for natural and relocated nests?
Non-infected  Infected  

Natural  \(29\)  \(10\) 
Relocated  \(14\)  \(\phantom{0}8\) 
The parameter is the population odds ratio, comparing natural to relocated nests. A graphical summary is shown in Fig. 29.3. Because the software places the last column (Infected) on the bottom of the fraction, the odds in the output are the odds of a nest being non-infected: \(29/10 = 2.90\) for natural nests and \(14/8 = 1.75\) for relocated nests. A numerical summary table (Table 29.3, right table) shows that the odds of a natural nest being non-infected is \(1.657\) times the odds of a relocated nest being non-infected. (Testing whether the odds of non-infection are equal is equivalent to testing whether the odds of infection are equal.) From the jamovi output (Fig. 36.3), the \(\chi^2\)-value is \(0.777\); this is like a \(z\)-score of \(z = \sqrt{0.777/1} = 0.88\), which is very small, so expect a large \(P\)-value. Indeed, the \(P\)-value is \(0.378\) on the output. The smallest expected count is \(6.49\) (Fig. 36.3), so this test is statistically valid. We write:
There is no evidence of a difference in the odds of infection (\(\chi^2\): \(0.777\); \(P\)-value: \(0.378\); odds ratio of non-infection: \(1.657\); \(95\)% CI: \(0.537\) to \(5.12\)) between natural nests (odds of non-infection: \(2.90\); \(n = 39\)) and relocated nests (odds of non-infection: \(1.75\); \(n = 22\)).
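The quantities used in this example can be reproduced from the counts in Table 36.4. The sketch below is illustrative only: it assumes the uncorrected (Pearson) form of the \(\chi^2\) statistic, and the \(P\)-value shortcut it uses applies only when \(\text{df} = 1\).

```python
import math

# Observed counts from Table 36.4: columns are [non-infected, infected]
observed = [[29, 10],   # natural nests
            [14, 8]]    # relocated nests

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts if the null hypothesis is true
expected = [[r * c / n for c in col_totals] for r in row_totals]
smallest_expected = min(min(row) for row in expected)

# Chi-squared: sum of (observed - expected)^2 / expected over all cells
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

z_like = math.sqrt(chi2 / 1)                 # df = 1 for a 2x2 table
p_value = math.erfc(math.sqrt(chi2 / 2))     # valid for df = 1 only

print(round(chi2, 3), round(z_like, 2), round(p_value, 3),
      round(smallest_expected, 2))
```

The smallest expected count exceeds five, so the statistical validity condition is met, and the small \(\chi^2\) value corresponds to a large \(P\)-value.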
36.9 Example: shopping bags
A study of \(400\) residents of Klang Valley, Malaysia, examined residents' approach to waste management (Choon, Tan, and Chong 2017). One RQ was:
For residents of Klang Valley, is age associated with whether people bring their own bags when shopping?
The data (Table 36.5) are given in a \(3\times 2\) table of counts. The software output is shown in Fig. 36.4; a graphical summary in Fig. 36.5. Most of the numerical summary must be produced manually (Table 36.6), since jamovi only produces odds ratios for \(2\times 2\) tables. Here are the details of the calculations (notice that the last row is on the bottom of the fraction when computing the odds ratios):
Brings own bags  Does not bring own bags  

30 and under  \(126\)  \(138\) 
31 to 40  \(\phantom{0}50\)  \(\phantom{0}32\) 
Over 40  \(\phantom{0}41\)  \(\phantom{0}13\) 
Odds  Odds ratio  Percentage  Sample size  

30 and under  \(0.913\)  \(0.289\)  \(47.7\)  \(264\) 
31 to 40  \(1.563\)  \(0.496\)  \(61.0\)  \(\phantom{0}82\) 
Over 40  \(3.154\)  \(75.9\)  \(\phantom{0}54\) 
 For those '\(30\) and under': the odds of bringing a shopping bag is \(126/138 = 0.913\).
 For those '\(31\) to \(40\)': the odds of bringing a shopping bag is \(50/32 = 1.563\).
 For those 'Over \(40\)': the odds of bringing a shopping bag is \(41/13 = 3.154\).
Then the odds ratios can be computed:
 The OR of bringing a shopping bag, comparing people '\(30\) and under' to people 'Over \(40\)': \(0.913/3.154 = 0.289\).
 The OR of bringing a shopping bag, comparing people '\(31\) to \(40\)' to people 'Over \(40\)': \(1.563/3.154 = 0.496\).
In Table 36.6, the odds of bringing a shopping bag are relative to those 'Over \(40\)' (the last row). Since Table 36.6 has three groups to compare, three odds are needed. However, the summary has \(3 - 1 = 2\) odds ratios, since odds ratios compare pairs of odds. The level to which the other two are compared is called the reference level. In Table 36.6, the reference level is 'Over \(40\)' (i.e., on the bottom of the fraction when computing the odds ratios). (In a \(2\times 2\) table, with two groups to compare, the summary has only \(2 - 1 = 1\) odds ratio.)
These odds ratios mean:
 The odds of bringing a shopping bag for those '\(30\) and under' is \(0.289\) times the odds of those 'Over \(40\)'; and
 The odds of bringing a shopping bag for those '\(31\) to \(40\)' is \(0.496\) times the odds of those 'Over \(40\)'.
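Since these odds ratios must be produced manually, a short script can help avoid arithmetic slips. The following sketch (the dictionary layout and names are illustrative assumptions) computes the odds for each age group from Table 36.5, then the odds ratios relative to the reference level:

```python
# Observed counts from Table 36.5:
# (brings own bags, does not bring own bags)
counts = {
    "30 and under": (126, 138),
    "31 to 40": (50, 32),
    "Over 40": (41, 13),
}
reference = "Over 40"   # reference level: bottom of the fraction

# Odds of bringing a shopping bag, for each age group
odds = {group: yes / no for group, (yes, no) in counts.items()}

# Odds ratios: each non-reference group compared to the reference level
odds_ratios = {group: odds[group] / odds[reference]
               for group in counts if group != reference}

for group, value in odds.items():
    print("Odds,", group, ":", round(value, 3))
for group, value in odds_ratios.items():
    print("OR,", group, "vs", reference, ":", round(value, 3))
```

Because this sketch divides the unrounded odds, its odds ratios (about \(0.290\) and \(0.495\)) differ in the third decimal place from the values found by dividing the rounded odds (\(0.289\) and \(0.496\)). The difference is rounding only.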
The hypothesis can be worded in terms of odds:
 \(H_0\): The odds of bringing a shopping bag is the same for all age groups.
 \(H_1\): The odds of bringing a shopping bag is not the same for all age groups.
Alternatively, the hypotheses can be worded in terms of relationships or associations (but not correlations) between the two variables:
 \(H_0\): No association exists between bringing a shopping bag and age group.
 \(H_1\): An association exists between bringing a shopping bag and age group.
For a \(2\times 2\) table, the parameter is the odds ratio. For two-way tables larger than \(2\times 2\), defining a parameter is difficult; it requires a single number to measure the association between the variables, but we need two ORs to summarise the data. Effectively, the \(\chi^2\) statistic becomes the parameter that measures the size of the difference between all three odds. When no relationship exists in the population, \(\chi^2 = 0\); hence \(H_0\): \(\chi^2 = 0\), which proposes that the value of \(\chi^2\) in the sample is not zero only because of sampling variation. The alternative hypothesis is \(H_1\): \(\chi^2 > 0\).
From the software output, \(\chi^2 = 16.24\) and \(\text{df} = 2\), so this \(\chi^2\) value is approximately equivalent to a \(z\)-score of \(\sqrt{16.24\div 2} = 2.85\). This is a large \(z\)-score so, using the \(68\)-\(95\)-\(99.7\) rule, a small \(P\)-value is expected; indeed, jamovi reports \(P < 0.001\). This suggests very strong evidence in the sample that bringing a shopping bag is associated with age.
The conclusion could be written as
The sample provides very strong evidence (\(\chi^2 = 16.24\); \(\text{df} = 2\)) that a relationship exists in the population between bringing a shopping bag and age.
While sample summary information could be added to this conclusion, the statements may then become cumbersome. Instead, pointing readers to the numerical summary (Table 36.6) is probably better. Furthermore, CIs are not reported since jamovi does not produce CIs for tables larger than \(2\times 2\).
All expected values exceed \(5\) (Fig. 36.4), so the results are statistically valid.
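The same calculations used for \(2\times 2\) tables extend to this \(3\times 2\) table. The sketch below is an illustration under stated assumptions: the degrees of freedom are taken as (rows \(- 1\)) \(\times\) (columns \(- 1\)), and the \(P\)-value uses a shortcut that holds only when \(\text{df} = 2\) (where the area above a \(\chi^2\) value \(x\) is \(e^{-x/2}\)).

```python
import math

# Observed counts from Table 36.5:
# columns are [brings own bags, does not bring own bags]
observed = [[126, 138],   # 30 and under
            [50, 32],     # 31 to 40
            [41, 13]]     # Over 40

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected counts if no association exists in the population
expected = [[r * c / n for c in col_totals] for r in row_totals]
smallest_expected = min(min(row) for row in expected)

# Chi-squared: sum of (observed - expected)^2 / expected over all cells
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# Assumed rule: df = (rows - 1) x (columns - 1)
df = (len(observed) - 1) * (len(observed[0]) - 1)
z_like = math.sqrt(chi2 / df)

# Shortcut valid for df = 2 only
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2), df, round(z_like, 2), p_value < 0.001,
      round(smallest_expected, 1))
```

All six expected counts are well above five, and the \(P\)-value is below \(0.001\), in line with the software output.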
36.10 Chapter summary
To test a hypothesis about a population odds ratio, based on the value of the sample odds ratio, initially assume the value of the population odds ratio in the null hypothesis (usually one) to be true. Then, expected counts can be computed. Since the sample odds ratio varies from sample to sample, under certain statistical validity conditions, a quantity closely related to the sample odds ratio varies with an approximate normal distribution. This distribution describes what values of the sample odds ratio could be expected in the sample if the value of the population odds ratio in the null hypothesis was true. The test statistic is a \(\chi^2\) statistic, which compares the expected and observed counts.
The value of \(\sqrt{\chi^2/\text{df}}\) is like a \(z\)-score, where 'df' is the 'degrees of freedom' reported by software, and so an approximate \(P\)-value can be estimated using the \(68\)-\(95\)-\(99.7\) rule. Software reports the \(P\)-value to assess whether the data are consistent with the assumption.
36.11 Quick review questions
A study (Egbue, Long, and Samaranayake 2017) of the adoption of electric vehicles (EVs) by a certain group of professional Americans (Example 5.14) compiled the data in Table 36.7. Output from using jamovi is shown in Fig. 36.6.
Yes  No  

No postgrad  \(24\)  \(\phantom{0}8\) 
Postgrad study  \(51\)  \(29\) 
 What is the \(\chi^2\) value?
 What is the equivalent \(z\)-score (to two decimal places)?
 Using the \(68\)-\(95\)-\(99.7\) rule, what is the approximate \(P\)-value?
 From the software output, what is the \(P\)-value?
 What is the alternative hypothesis?
 True or false: There is no evidence of a difference in the odds of buying a car in the next 10 years, between those with and without postgraduate study.
36.12 Exercises
Selected answers are available in App. E.
Exercise 36.1 Researchers (Christensen, Herrer, and Telford 1972) studied the number of sandflies caught in light traps set at 3 and 35 feet above ground in eastern Panama. They asked:
In eastern Panama, are the odds of finding a male sandfly the same at 3 feet above ground as at 35 feet above ground?
The data are compiled into a table (Table 36.8), and summarised numerically (Table 36.9; partially edited) and graphically (Fig. 36.7). Use the jamovi output (Fig. 36.8) to evaluate the evidence, complete Table 36.9, and write a conclusion.
\(3\) feet above ground  \(35\) feet above ground  

Males  \(173\)  \(125\) 
Females  \(150\)  \(\phantom{0}73\) 
Odds  Percentage  Sample size  

3 feet:  \(298\)  
35 feet:  \(1.71\)  \(67.3\)  \(223\) 
Odds ratio:  \(0.67\) 
Exercise 36.2 [Dataset: ForwardFall]
(This study also appeared in Exercise 29.2, where the odds ratio, and the CI for the odds ratio, were computed.)
A forward-direction observational study in Western Australia compared the heights of scars from burns received (Wallace et al. 2017).
The data are shown in Table 36.10.
jamovi was used to analyse the data (Fig. 36.9).
 Perform a hypothesis test to determine if the odds of having a smooth scar are the same for women and men.
 Write down the conclusion.
 Is the test statistically valid?
Women  Men  

Scar height \(0\) mm (smooth)  \(99\)  \(216\) 
Scar height more than \(0\) mm, less than \(1\) mm  \(62\)  \(115\) 
Exercise 36.3 In a study of turbine failures (Myers, Montgomery, and Vining 2002; Nelson 1982), \(73\) turbines were run for around \(1800\) hrs, and seven developed fissures (small cracks). Forty-two different turbines were run for about \(3000\) hrs, and nine developed fissures.
Exercise 36.4 (This study also appeared in Exercise 29.5.) The Southern Oscillation Index (SOI) is a standardised measure of the air pressure difference between Tahiti and Darwin, and has been shown to be related to rainfall in some parts of the world (Stone, Hammer, and Marcussen 1996), and especially Queensland (Stone and Auliciems 1992; P. K. Dunn 2001).
As an example (P. K. Dunn and Smyth 2018), the rainfall at Emerald (Queensland) was recorded for Augusts from 1889 to 2002 inclusive, in Augusts when the monthly average SOI was positive, and when the SOI was non-positive (that is, zero or negative), as shown in Table 36.11.
 Using the jamovi output in Fig. 36.11, perform a hypothesis test to determine if the odds of having no rain are the same in Augusts with non-positive and positive SOI.
 Write down the conclusion.
 Is the test statistically valid?
Non-positive SOI  Positive SOI  

No rainfall recorded  \(14\)  \(\phantom{0}7\) 
Rainfall recorded  \(40\)  \(53\) 
Exercise 36.5 [Dataset: HatSunglasses]
(This study also appeared in Exercise 29.6.)
A research study conducted in Brisbane (B. Dexter et al. 2019) recorded the number of people at the foot of the Goodwill Bridge, Southbank, who wore sunglasses and hats.
The data were recorded between \(11\):\(30\)am and \(12\):\(30\)pm.
Of the \(386\) males observed, \(79\) wore hats; of the \(366\) females observed, \(22\) wore hats.
 Compute the percentage of females wearing a hat.
 Compute the percentage of males wearing a hat.
 Compute the odds of a female wearing a hat.
 Compute the odds of a male wearing a hat.
 Compute the odds ratio of wearing a hat, comparing females to males.
 Compute the odds ratio of wearing a hat, comparing males to females.
 Find the \(95\)% CI for the appropriate OR.
 Using the jamovi output in Fig. 36.12, perform a hypothesis test to determine if the odds of wearing a hat is the same for females and males.
 Write down the conclusion.
 Is the test statistically valid?
Exercise 36.6 A study (Lennon, Oviedo-Trespalacios, and Matthews 2017) asked people about their mobile-phone interactions while crossing the road as pedestrians. Part of the data are summarised in Table 36.12.
 Compute the column percentages.
 Compute the odds of low exposure to each behaviour.
 Write the hypothesis for conducting a hypothesis test.
 Compute the expected counts.
 After analysis in jamovi, the value of \(\chi^2\) is \(20.923\) with two degrees of freedom. What is the approximately equivalent \(z\)-score? Would you expect a large or small \(P\)-value?
 The \(P\)-value is given as \(P < 0.001\). Write a conclusion.
Answer call  Respond to text  Reply to email  

Low exposure  \(263\)  \(259\)  \(302\) 
High exposure  \(\phantom{0}94\)  \(\phantom{0}98\)  \(\phantom{0}51\) 
Exercise 36.7 [Dataset: PetBirds]
(This study also appeared in Exercise 29.7.)
A study examined people with lung cancer, and a matched set of similar controls who did not have lung cancer, and compared the proportion in each group that had pet birds (Kohlmeier et al. 1992).
The data are shown again in Table 36.13.
Consider this RQ:
Are the odds of having a pet bird the same for people with lung cancer (cases) and for people without lung cancer (controls)?
 Carefully describe the parameter.
 Write the hypotheses in terms of odds.
 Determine the value of \(z\) that is approximately the same as this \(\chi^2\)-value.
 Use the software output to conduct a hypothesis test.
Adults with lung cancer  Adults without lung cancer  Total  

Did not keep pet birds  \(141\)  \(328\)  \(469\) 
Kept pet birds  \(\phantom{0}98\)  \(101\)  \(199\) 
Total  \(239\)  \(429\)  \(668\) 
Exercise 36.8 [Dataset: B12Long]
(This study was seen in Exercise 29.8.)
A study in New Zealand (Gammon et al. 2012) asked:
Among a certain group of women, are the odds of being vitamin B12 deficient different for women on a vegetarian diet compared to women on a nonvegetarian diet?
The population was 'predominantly overweight/obese women of South Asian origin living in Auckland'. The data are shown in Table 29.11.
 Write down the hypotheses in terms of odds.
 Write down the parameter.
 Determine the \(\chi^2\) value and perform a hypothesis test to answer the RQ, using the output in Fig. 36.14.
 Compute the equivalent \(z\)-score for this \(\chi^2\)-value.
 Write down the conclusion.
 Is the test statistically valid?