# Review 1

## Measures of central tendency and measures of dispersion

### Mean, median, mode

**Median:** The median is the middle value of the ordered sample values: half of the reported values lie at or below it. When the sample has an even number of values, the median is the average of the two middle values.

**Mode:** The mode is the value with the highest frequency in the sample.

**Mean:**

\[\overline{x} = \frac{1}{n}\sum x_i\]
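As a quick check of the three definitions, here is a minimal sketch using Python's standard `statistics` module. The ten-value sample is chosen only for illustration.

```python
import statistics

# Illustrative sample (already sorted, n = 10)
sample = [0, 0, 1, 3, 3, 4, 4, 4, 5, 8]

mean = statistics.mean(sample)      # sum of the values divided by n
median = statistics.median(sample)  # n is even, so the average of the two middle values
mode = statistics.mode(sample)      # the most frequent value

print(mean, median, mode)  # 3.2 3.5 4
```

Here the two middle values are 3 and 4, so the median is 3.5, and 4 appears three times, which makes it the mode.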

### Easier way to calculate the mean

Sample: 0, 0, 1, 3, 3, 4, 4, 4, 5, 8.

x | f | fx |
---|---|---|
0 | 2 | 0 |
1 | 1 | 1 |
3 | 2 | 6 |
4 | 3 | 12 |
5 | 1 | 5 |
8 | 1 | 8 |
\(\sum x = 21\) | \(\sum f=10\) | \(\sum fx=32\) |

\[\overline{x} = \frac{\sum fx}{\sum f} = \frac{32}{10} = 3.2\]
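The frequency-table shortcut can be sketched in a few lines of Python. The dictionary name `freq` is an assumption for this example; it maps each distinct value \(x\) to its frequency \(f\).

```python
# Frequency table for the sample 0, 0, 1, 3, 3, 4, 4, 4, 5, 8: value x -> frequency f
freq = {0: 2, 1: 1, 3: 2, 4: 3, 5: 1, 8: 1}

n = sum(freq.values())                        # sum of f
sum_fx = sum(x * f for x, f in freq.items())  # sum of f * x
mean = sum_fx / n

print(n, sum_fx, mean)  # 10 32 3.2
```

The result matches the mean computed directly from the ten raw values, because each value is simply counted \(f\) times.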

### Skewness

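It is important to understand how the skewness of a distribution influences the relationship between the mean, median and mode. As a rule of thumb, in a positively (right) skewed distribution mode < median < mean, in a negatively (left) skewed distribution mean < median < mode, and in a symmetric distribution the three coincide. The figure below is from **here**.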

### Standard deviation

\[sd = \sqrt{\frac{\sum f(x - \overline{x})^2}{n-1} }\]

\(x\) | f | fx | \(x - \overline{x}\) | \((x - \overline{x})^2\) | \(f(x - \overline{x})^2\) |
---|---|---|---|---|---|
… | … | … | … | … | … |
. | \(\sum f\) | \(\sum fx\) | | | \(\sum f(x - \overline{x})^2 = A\) |

Calculate the standard deviation: \[sd = \sqrt{\frac{A}{n-1} }\]
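The table-based calculation can be sketched in Python as follows. The frequency table is the illustrative sample 0, 0, 1, 3, 3, 4, 4, 4, 5, 8, and the names `freq` and `A` are assumptions for this example, with `A` standing for the total of the last column.

```python
import math

# Frequency table: value x -> frequency f
freq = {0: 2, 1: 1, 3: 2, 4: 3, 5: 1, 8: 1}

n = sum(freq.values())                           # n = sum of f
mean = sum(x * f for x, f in freq.items()) / n   # mean = (sum of f * x) / n

# A = sum of f * (x - mean)^2, the total of the last column of the table
A = sum(f * (x - mean) ** 2 for x, f in freq.items())

variance = A / (n - 1)     # sample variance (divides by n - 1)
sd = math.sqrt(variance)   # sample standard deviation

print(round(A, 2), round(variance, 2), round(sd, 2))
```

For this sample, A is about 53.6, so the variance is about 5.96 and the standard deviation about 2.44.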

### Variance

\[\text{variance} = sd^2\]

## Z-scores

\[z=\frac{x_i-\overline{x}}{sd}\]

where \(sd\) is the sample standard deviation.

Recall that, for the z-distribution:

- Mean = 0
- SD = 1
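A minimal sketch of standardizing a sample, using an illustrative ten-value sample, shows that the resulting z-scores have mean 0 and standard deviation 1 (up to floating-point rounding).

```python
import statistics

sample = [0, 0, 1, 3, 3, 4, 4, 4, 5, 8]  # illustrative data

mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 in the denominator)

# z-score of each observation: distance from the mean in units of sd
z_scores = [(x - mean) / sd for x in sample]

z_mean = statistics.mean(z_scores)
z_sd = statistics.stdev(z_scores)
print(round(z_mean, 10), round(z_sd, 10))
```

This holds for any sample with a nonzero standard deviation, not just this one.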

## Confidence intervals

**Standard error:**
\[se = \frac{sd}{\sqrt{n}}\]

**CIs:**
\[\text{Lower bound} = \bar x - z_{(\frac{\alpha}{2})}se\]

\[\text{Upper bound} = \bar x + z_{(\frac{\alpha}{2})}se\]

- For 95% CI: \(z_{(\frac{\alpha}{2})}= z_{0.025} \approx 1.96\) (often rounded to 2)
- For 99% CI: \(z_{(\frac{\alpha}{2})}= z_{0.005} \approx 2.58\) (sometimes rounded up to 3)
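The bounds above can be sketched with Python's standard library, where `statistics.NormalDist().inv_cdf` plays the role of the z-table. The function name `z_confidence_interval` and the sample are assumptions for illustration. With a sample this small a t critical value would normally be preferred, so treat the numbers as a demonstration of the formula only.

```python
import math
from statistics import NormalDist, mean, stdev

def z_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds of a z-based CI for the mean."""
    n = len(sample)
    x_bar = mean(sample)
    se = stdev(sample) / math.sqrt(n)       # standard error: sd / sqrt(n)
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_(alpha/2)
    return x_bar - z * se, x_bar + z * se

sample = [0, 0, 1, 3, 3, 4, 4, 4, 5, 8]  # illustrative data
ci_95 = z_confidence_interval(sample, 0.95)
ci_99 = z_confidence_interval(sample, 0.99)
print(ci_95)
print(ci_99)
```

Both intervals are centred on the sample mean; the 99% interval is wider because its critical value is larger.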

## Hypothesis testing: one-sample t-test

\[t = \frac {\bar x-\mu_0}{se}\]

- We *reject the null hypothesis* only when the absolute value of the calculated \(t\) is higher than the critical value (\(c_\alpha\)).
- We *reject the null hypothesis* only when \(p \leq \alpha\).
- We *reject the null hypothesis* if the mean difference CI does not include 0.
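The critical-value rule can be sketched as below. The sample, the null value `mu_0 = 2`, and the function name `one_sample_t` are hypothetical choices for illustration. The critical value 2.262 is the two-tailed t-table entry for \(\alpha = 0.05\) with \(n - 1 = 9\) degrees of freedom.

```python
import math
import statistics

def one_sample_t(sample, mu_0):
    """t = (x_bar - mu_0) / se, with se = sd / sqrt(n)."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu_0) / se

sample = [0, 0, 1, 3, 3, 4, 4, 4, 5, 8]  # illustrative data, n = 10
mu_0 = 2             # hypothetical null value, chosen for illustration
t_critical = 2.262   # t-table: two-tailed, alpha = 0.05, df = 9

t = one_sample_t(sample, mu_0)
reject_null = abs(t) > t_critical  # two-tailed decision rule

print(round(t, 2), reject_null)
```

Here \(t\) is about 1.55, which is below the critical value, so the null hypothesis is not rejected at \(\alpha = 0.05\).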

## Practice questions

- Suppose \(\overline{x} = 10\) for a negatively skewed distribution. What could be the median and mode of this distribution?
    - median = 11 and mode = 12
    - median = 12 and mode = 11

- Suppose you created a z-score distribution from a given sample. What is the mean of this distribution?
    - 1
    - 0
    - There is not enough information to calculate the mean.

- The p-value equals:
    - \(\alpha\)
    - The significance level
    - 0.05

- What is the percentile of a z-score \(z = 1.18\)?
    - 45%
    - 17%
    - 88%

- Using the z-table, find the values of z:
    - That is just above 97.5% of all values in the distribution.
    - That is just above 99.5% of all values in the distribution.

- Using the t-table, find the value of t for:
    - Two-tail, \(\alpha=0.05\) (n=31)
    - One-tail, \(\alpha=0.05\) (n=31)
    - Two-tail, \(\alpha=0.01\) (n=31)
    - One-tail, \(\alpha=0.01\) (n=31)

- Compared to a 95% confidence interval, a 99% confidence interval:
    - Is more precise
    - Is more accurate

- If \(p = 0.03\):
    - We reject the null when \(\alpha = 0.05\)
    - We reject the null when \(\alpha = 0.01\)
    - We never reject the null

- As the absolute value of \(t_{calculated}\) increases, we are:
    - More likely to reject the null
    - Less likely to reject the null

- Given a 95% mean difference CI = (-0.12;1.2), do we reject the null that the population mean = 15 (assuming \(\alpha = 0.05\))?
    - Yes
    - No

- As sample size increases, a 95% CI becomes:
    - Narrower, more precise
    - Less accurate