## A.4 Answer: TW 4 tutorial

1. $$43\div (43 + 67) \times 100 = 39.1$$%.
2. $$39\div (97 + 39) \times 100 = 28.7$$%.
3. $$43\div 67 = 0.642$$.
4. $$39\div 97 = 0.402$$.
5. $$0.642 \div 0.402 = 1.6$$.
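
The five answers above all come from the same four counts, so they can be checked in a few lines. This is only a minimal sketch: the group names and the helper functions `percentage` and `odds` are placeholders introduced here, not part of the original question.

```python
def percentage(count, total):
    """Return count as a percentage of total."""
    return count / total * 100


def odds(count_event, count_no_event):
    """Return the odds: events per non-event."""
    return count_event / count_no_event


# Counts taken from the answers above; the group labels are placeholders.
group_1 = {"event": 43, "no_event": 67}
group_2 = {"event": 39, "no_event": 97}

pct_1 = percentage(group_1["event"], group_1["event"] + group_1["no_event"])
pct_2 = percentage(group_2["event"], group_2["event"] + group_2["no_event"])
odds_1 = odds(group_1["event"], group_1["no_event"])
odds_2 = odds(group_2["event"], group_2["no_event"])
odds_ratio = odds_1 / odds_2

print(f"{pct_1:.1f}%  {pct_2:.1f}%  {odds_1:.3f}  {odds_2:.3f}  {odds_ratio:.1f}")
```

Note that the percentage divides by the group total, while the odds divide by the count in the *other* category; mixing up the two denominators is the most common slip.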

1. $$45\div 240 \times 100 = 18.75$$% of all college students in the sample have received one concussion; the percentage is the same for all student types.
2. Not sure which is best; they do different things. I'd probably prefer (c) (bottom left graph).
3. Odds (non-athlete receiving 2+ concussions) is $$21\div(81-21) = 21\div 60 = 0.35$$. Non-athletes are 0.35 times as likely to receive two or more concussions as fewer than two. Or: For every 100 non-athletes with fewer than two concussions, there are 35 with two or more.
4. Odds (soccer player receiving 2+ concussions) is $$13\div(63-13) = 13\div 50 = 0.26$$. Soccer players are 0.26 times as likely to receive two or more concussions as fewer than two concussions.
5. OR: $$0.35 \div 0.26$$, or about $$1.35$$. The odds of a non-athlete receiving two or more concussions are about 1.35 times the odds of a soccer player receiving two or more concussions (that is, smaller odds for soccer players).
6. Not given.
7. Not given.
8. The better table is probably row percentages: the percentage of each type of student with a given number of concussions.

1. The smallest jellyfish has a breadth of (approx.) 6mm. A median of 4mm is silly! The median is somewhere between 8 and 10mm.

2. A typical Dangar Island jellyfish has a breadth of about 10mm, but the variation is from about 6 to 16mm. The data are slightly skewed to the right (most jellyfish have smaller breadths, but some have larger breadths), but the shape is a bit funny (more data would smooth it out).

3. Site A is Dangar Island.

Jellyfish at Salamander Bay generally have a larger breadth: The median breadth at Salamander Bay (about 16mm) is greater than almost all the jellyfish at Dangar Island.

Jellyfish at Salamander Bay are a little less variable in terms of breadth. The distributions look slightly skewed right at both sites.

1. Descriptive RQ.
2. $$\bar{x}=102.62$$MPa; $$s=5.356$$MPa.
3. The median is $$101.1$$MPa. The range is from $$97.6$$ to $$111.2$$, or $$13.6$$MPa.
4. Report none based on five values! Too few data points! In any case, both the median and the mean have very similar values.
5. If all five measurements came from one board, we would not have a good representation of 'bamboo floorboards' in general. We would have just one unit of analysis.

1. Five variables. ('Participants' would not be summarised; it is technically an identifier and not a variable, as each person has a unique value.)
2. Age, Height and Weight: all quantitative continuous.
3. Gender (nominal; two levels); GMFCS (ordinal; three levels)
4. As follows:
• 'Gender': Percentages (or number) F and M
• 'Age': Mean/median; standard deviation/IQR
• 'Height': Mean/median; standard deviation/IQR
• 'Weight': Mean/median; standard deviation/IQR
• 'GMFCS': Percentages (or numbers) in each group
5. As follows:
• 'Gender': Barchart (not really needed)
• 'Age': Histogram/stemplot
• 'Height': Histogram/stemplot
• 'Weight': Histogram/stemplot
• 'GMFCS': Barchart/piechart
6. As follows:
• Between Gender and Height: Boxplot
• Between Gender and GMFCS: Side-by-side or stacked bar chart.

1. Median largest: Class D
2. Median smallest: Class C
3. Standard deviation largest: Class A
4. Standard deviation smallest: Class D

1. A few issues...
• Five decimal places is to the nearest 0.01 of a mm!
• The standard deviation of the difference is not the difference between the individual standard deviations.
• A standard deviation cannot be negative. (Same applies to standard errors, but we aren't there yet.)
• Note that there is a sample size of 0 for the difference!
2. A few issues...
• Five decimal places: That's accuracy to 0.00001 of a millimetre per second. I don't think so...
• There are no numerical measures of the most important thing, and the thing the RQ (presumably) concerns: the differences between the two brands.

1. Observational: The 'treatment' (brand of battery) is not assigned; we simply take measurements from the batteries that exist.
2. Units of observation and units of analysis: The individual batteries.
3. Energizer: mean: $$7.36$$ hours; std dev: $$0.289$$. Ultracell: mean: $$7.41$$ hours; std dev: $$0.172$$. So Ultracell batteries are slightly better (last longer) on average, and more consistent performers.
4. Energizer: median: 7.46 hours. Ultracell: median: 7.48 hours. So Ultracell batteries are slightly better (last longer) on average.
5. Probably median (outliers?), but mean and median are similar in any case.
6. Values are close; there is certainly no practically important difference. Energizer batteries take, on average, less time to reach 1.0 volts, so are 'worse' in this regard. They also have a lot more variation. But in practical terms, the difference is minimal. (Based on means, the difference is 0.05 hrs, or 3 mins in over 7 hrs use! Based on medians, the difference is 0.02 hours, or 1.2 mins!)
7. Quite possibly, some very low values for both brands.
8. Yes: At the time of the study, the Ultracell batteries were substantially cheaper, and hence much better value.

1. See Table A.1.
2. $$60/320 = 0.1875$$. A worker with SMND is 0.1875 times as likely to have worked with metals as not.
3. $$33/344 = 0.096$$. A worker without SMND (a control) is 0.096 times as likely to have worked with metals as not.
4. $$0.1875/0.096 = 1.95$$; the odds are about twice as great.
5. $$60 \div 380 \times 100 = 15.8$$%.
6. $$33\div 377 \times 100 = 8.8$$%.
7. Probably not: a big difference is seen in a large sample.

TABLE A.1: SMND cases, and whether they had worked with metals

| | Worked with metals | Did not work with metals | Total |
|---|---|---|---|
| SMND cases | 60 | 320 | 380 |
| Controls | 33 | 344 | 377 |
| Total | 93 | 664 | 757 |
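
As a rough check, the odds, percentages and odds ratio for Table A.1 can be recomputed directly from the four cell counts. The dictionary layout and key names below are illustrative choices, not something the original question prescribes.

```python
# 2x2 counts from Table A.1 (rows: SMND status; columns: metal exposure).
table = {
    "SMND cases": {"metals": 60, "no_metals": 320},
    "Controls": {"metals": 33, "no_metals": 344},
}

results = {}
for group, counts in table.items():
    total = counts["metals"] + counts["no_metals"]
    results[group] = {
        "total": total,
        # Percentage of the row that worked with metals.
        "percent_metals": counts["metals"] / total * 100,
        # Odds of having worked with metals, within the row.
        "odds_metals": counts["metals"] / counts["no_metals"],
    }

# Odds ratio: odds for SMND cases relative to odds for controls.
odds_ratio = (
    results["SMND cases"]["odds_metals"] / results["Controls"]["odds_metals"]
)

for group, r in results.items():
    print(f"{group}: {r['percent_metals']:.1f}%, odds {r['odds_metals']:.4f}")
print(f"Odds ratio: {odds_ratio:.2f}")
```

Working from the unrounded odds gives an odds ratio that rounds to the same 1.95 as dividing the rounded values 0.1875 and 0.096.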

1. 70% directly from the table.
2. $$281/(404 - 281) = 2.28$$.
3. For patients treated with ETI, about 278 die for every 100 that survive.
4. $$2.78/2.28 = 1.22$$.
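
The link between odds and proportions used in these answers can be sketched as follows. The two conversion helpers are standard identities rather than anything specific to this question, and the variable names are made up for illustration.

```python
def odds_to_proportion(odds):
    """Convert odds to a proportion: p = odds / (1 + odds)."""
    return odds / (1 + odds)


def proportion_to_odds(p):
    """Convert a proportion to odds: odds = p / (1 - p)."""
    return p / (1 - p)


# Answer 2: odds computed from the counts 281 and 404 - 281.
odds_answer_2 = 281 / (404 - 281)
prop_answer_2 = odds_to_proportion(odds_answer_2)  # same as 281 / 404

# Answer 3: odds of 2.78 read as "about 278 for every 100".
odds_eti = 2.78
per_100 = odds_eti * 100

# Answer 4: ratio of the two (rounded) odds.
odds_ratio = odds_eti / 2.28

print(f"{odds_answer_2:.2f}  {prop_answer_2:.3f}  {per_100:.0f}  {odds_ratio:.2f}")
```

Converting odds back to a proportion recovers the usual "count over total" fraction, which is a handy sanity check on the denominator.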

1. Odds less than one.
2. Odds less than one.
3. Odds greater than one.
4. Odds greater than one.

FAS here is about 20--25 in general, ranging from 10 to 50 (which are actually the smallest and largest possible scores). Here is a detailed explanation; you are not expected to go to this detail! The FAS is about 25 on average, ranging from about 10 to 50, slightly skewed right with no outliers. The distribution is unimodal. (The heights of the bars are, as best as I can figure, 3; 14; 44; 34; 22; 17; 5; 3.) With $$n=142$$, the median is at position 71.5 in the ordered data (midway between observations 71 and 72), which is in the 25--30 bar. Each quarter of the data contains about 35 observations, so $$Q_1$$ will be in the 20--25 bar, and $$Q_3$$ will be in the 30--35 bar. The histogram is pretty good, though the bars really should be touching.
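
The bar-counting argument can be sketched in code by accumulating the bar heights until the running total reaches the position of interest. The bar edges are an assumption (eight bars of width 5 covering scores 10 to 50), the function name `bin_containing` is made up for this sketch, and the quartile positions use the simple $$n/4$$ and $$3n/4$$ rule, which is only one of several conventions.

```python
from itertools import accumulate

# Bar heights as read (approximately) from the histogram.
counts = [3, 14, 44, 34, 22, 17, 5, 3]
# ASSUMPTION: eight bars of width 5 covering scores from 10 to 50.
edges = list(range(10, 55, 5))  # 10, 15, ..., 50


def bin_containing(position, counts, edges):
    """Return (low, high) for the bar holding the given ordered position."""
    for i, running_total in enumerate(accumulate(counts)):
        if position <= running_total:
            return edges[i], edges[i + 1]
    raise ValueError("position exceeds the number of observations")


n = sum(counts)
median_bin = bin_containing((n + 1) / 2, counts, edges)
q1_bin = bin_containing(n / 4, counts, edges)
q3_bin = bin_containing(3 * n / 4, counts, edges)

print(n, median_bin, q1_bin, q3_bin)
```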

1. Not shown.
2. The median temperatures are similar, with a slight increase from Office A to C. The IQR is similar for each office, and the range is similar for Offices A and C (slightly larger for Office B). In summary: Office A is a bit different (cooler) on average.
3. Perhaps Office A.

1. Stacked or side-by-side barcharts; whether bird injured or not, and whether upper or lower are both qualitative variables.
2. Boxplots would be OK; one quantitative (length-of-stay) and one qualitative (therapy type) variable.
3. Scatterplots; two quantitative variables (frequency; amount wheat consumed).
4. Two-way table or stacked/side-by-side barchart; two qualitative variables (30 mins of physical activity or not; vigorous physical activity or not).

1. The percentage of bridges collapsing due to deterioration is 36 divided by 433, or 8.3%.
2. The odds of a bridge collapsing due to deterioration is 36 divided by 397, or 0.091.
3. The odds of a bridge collapsing due to deterioration is 0.091; this means: b. For every 100 bridges that do not collapse due to deterioration, about 9.1 bridges do collapse.
4. The percentage of bridges not collapsing due to deterioration is 397 divided by 433, or 91.7%.
5. The odds of a bridge not collapsing due to deterioration is 397 divided by 36, or 11.0.
6. The percentage of bridge collapses due to collisions is 82 divided by 433, or 18.9%.
7. The odds of a bridge collapsing due to collisions is 82 divided by 351, or 0.234.
8. The odds of an event cannot be larger than one: FALSE.
9. The odds of an event cannot be smaller than one: FALSE.
10. The following statement is true: c. Odds cannot be negative
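
A short sketch of the bridge calculations shows why odds can fall on either side of one but never below zero: the odds of an event and the odds of its complement are reciprocals. The helper name `pct_and_odds` is illustrative only.

```python
total = 433          # total count used as the denominator in the answers above
deterioration = 36   # count attributed to deterioration
collisions = 82      # count attributed to collisions


def pct_and_odds(count, total):
    """Return (percentage, odds) for `count` out of `total`."""
    return count / total * 100, count / (total - count)


pct_det, odds_det = pct_and_odds(deterioration, total)
pct_not, odds_not = pct_and_odds(total - deterioration, total)
pct_col, odds_col = pct_and_odds(collisions, total)

print(f"Deterioration:     {pct_det:.1f}%, odds {odds_det:.3f}")
print(f"Not deterioration: {pct_not:.1f}%, odds {odds_not:.1f}")
print(f"Collisions:        {pct_col:.1f}%, odds {odds_col:.3f}")

# Complementary percentages sum to 100; complementary odds multiply to 1.
print(pct_det + pct_not, odds_det * odds_not)
```

Since counts cannot be negative, neither can a ratio of counts, which is why odds are bounded below by zero but have no upper bound.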

1. Observational.
2. The percentage of residents in each area that detected the smell at the given frequency. So, for example, the first number in the first column would become $$48\div 97\times 100 = 49.5$$. This means that $$49.5$$% of the sample that lived in Area I detected the odour at least once a week.
3. For example, the percentage of residents who detected an odour at least once a week, who lived in Area I.
4. Probably column percentages.
5. $$77\div 291 \times 100 = 26.46$$%.

1. $$(381 + 345)/1347 = 0.53898$$, or about 53.9%.
2. $$(381 + 345)/(295 + 326) = 1.1691$$, or about 1.17.
3. $$345/326 = 1.05828$$, or about 1.06.
4. $$381/295 = 1.29153$$, or about 1.29.
5. $$1.05828 / 1.29153 = 0.81940$$, or about 0.819.

1. $$(240 + 138)/960 = 0.39375$$, or 39.4%.
2. $$(240 + 342)/960 = 0.60625$$, or 60.6%.
3. $$(240 + 240)/960 = 0.5000$$, or 50%.
4. $$(240 + 138)/(240 + 342) = 0.64948$$, or about 0.649.
5. $$240/240 = 1.000$$.
6. $$138/342 = 0.40351$$, or about 0.404.
7. $$0.40351/ 1.000 = 0.40351$$, or about 0.404 (i.e., the answer to part 6 divided by the answer to part 5).