# C Symbols, formulas, statistics and parameters

## C.1 Symbols and standard errors

Quantity | Parameter | Statistic | Standard error | Reference
---|---|---|---|---
Proportion (CI) | \(p\) | \(\hat{p}\) | \(\displaystyle\text{s.e.}(\hat{p}) = \sqrt{\frac{ \hat{p} \times (1 - \hat{p})}{n}}\) | Def. 20.2
Proportion (test) | \(p\) | \(\hat{p}\) | \(\displaystyle\text{s.e.}(\hat{p}) = \sqrt{\frac{ p \times (1 - p)}{n}}\) | Def. 20.1
Mean | \(\mu\) | \(\bar{x}\) | \(\displaystyle\text{s.e.}(\bar{x}) = \frac{s}{\sqrt{n}}\) | Def. 22.1
Standard deviation | \(\sigma\) | \(s\) | |
Mean difference | \(\mu_d\) | \(\bar{d}\) | \(\displaystyle\text{s.e.}(\bar{d}) = \frac{s_d}{\sqrt{n}}\) | Def. 23.2
Diff. between means | \(\mu_1 - \mu_2\) | \(\bar{x}_1 - \bar{x}_2\) | \(\text{s.e.}(\bar{x}_1 - \bar{x}_2)\) | Value given
Odds ratio | Pop. OR | Sample OR | Value given |
Correlation | \(\rho\) | \(r\) | |
Slope of regression line | \(\beta_1\) | \(b_1\) | \(\text{s.e.}(b_1)\) | Value given
Intercept of regression line | \(\beta_0\) | \(b_0\) | \(\text{s.e.}(b_0)\) | Value given
R-squared | \(R^2\) | | |
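
As a minimal sketch (with hypothetical sample values), the standard-error formulas in the table can be computed directly:

```python
import math

# Hypothetical sample: 54 successes out of n = 120 observations.
n = 120
phat = 54 / n

# Standard error of a sample proportion (CI version, Def. 20.2).
se_phat = math.sqrt(phat * (1 - phat) / n)

# Standard error of a sample mean (Def. 22.1), using a
# hypothetical sample standard deviation s.
s = 4.2
se_xbar = s / math.sqrt(n)

print(round(se_phat, 4))
print(round(se_xbar, 4))
```

The same pattern applies to the mean difference, replacing \(s\) with \(s_d\).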

## C.2 Confidence intervals

**Confidence intervals** have the form

\[
\text{statistic} \pm ( \text{multiplier} \times \text{s.e.}(\text{statistic}))
\]

when the sampling distribution has an approximate normal distribution.

**Notes:**

- The multiplier is
*approximately*2 for an*approximate*95% CI (based on the 68--95--99.7 rule). -
\(\text{multiplier} \times \text{s.e.}(\text{statistic})\) is called the
*margin of error*. - When the sampling distribution for the statistic does not have an approximate normal distribution (e.g., for
*odds ratios*and*correlation coefficients*),**this formula does not apply**.
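
A short sketch of the recipe above, using hypothetical data (54 successes in a sample of 120) and the approximate multiplier of 2:

```python
import math

# Approximate 95% CI for a proportion, using multiplier 2
# (from the 68--95--99.7 rule). Hypothetical data: 54 of 120.
n = 120
phat = 54 / n
se = math.sqrt(phat * (1 - phat) / n)   # s.e. of the sample proportion

margin_of_error = 2 * se                # multiplier x s.e.
ci = (phat - margin_of_error, phat + margin_of_error)
print(ci)
```

Software typically uses a slightly different multiplier (e.g., 1.96 for an exact 95% level), so its interval will differ a little from this approximation.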

## C.3 Hypothesis testing

For **hypothesis tests**, the *test statistic* is a \(t\)-score, which has the form

\[
t = \frac{\text{statistic} - \text{parameter}}{\text{s.e.}(\text{statistic})}
\]

when the sampling distribution has an approximate normal distribution.

**Notes:**

- Since \(t\)-scores are a little like \(z\)-scores, the 68--95--99.7 rule can be used to *approximate* \(P\)-values.
- Tests involving *odds ratios* do not use \(t\)-scores, so **this formula does not apply for tests involving odds ratios**.
- When the sampling distribution for the statistic does not have an approximate normal distribution (e.g., for *odds ratios* and *correlation coefficients*), **this formula does not apply**.
- A hypothesis test about **odds ratios** uses a \(\chi^2\) test statistic, whose value is approximately like a \(z\)-score with a value of \(\displaystyle\sqrt{\frac{\chi^2}{\text{df}}}\), where \(\text{df}\) is the 'degrees of freedom' given in the software output.
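
Both calculations above can be sketched with hypothetical numbers (the sample values and software output here are made up for illustration):

```python
import math

# t-score for a one-sample test about a mean:
# H0: mu = 100, with sample mean 103, s = 12, n = 36 (hypothetical).
xbar, mu0, s, n = 103, 100, 12, 36
se_xbar = s / math.sqrt(n)          # s.e. of the sample mean
t = (xbar - mu0) / se_xbar          # (statistic - parameter) / s.e.
print(t)

# For an odds-ratio test, software reports a chi-square value;
# sqrt(chi2/df) behaves roughly like a z-score (hypothetical output).
chi2, df = 9.61, 1
z_like = math.sqrt(chi2 / df)
print(z_like)
```

A \(t\)-score of 1.5 is unremarkable by the 68--95--99.7 rule, while a \(z\)-like value above 3 suggests strong evidence against the null hypothesis.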

## C.4 Other formulas

To calculate \(z\)-scores (Sect. 17.4): \(\displaystyle z = \frac{x - \mu}{\sigma}\) or, more generally, \[ z = \frac{\text{value of variable} - \text{mean of distribution}}{\text{standard deviation of distribution}}. \]
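
As a one-line sketch with made-up numbers:

```python
# z-score: how many standard deviations a value lies from the mean.
# Hypothetical distribution: mean 170, standard deviation 8.
mu, sigma = 170, 8
x = 182
z = (x - mu) / sigma
print(z)  # 1.5
```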

The **unstandardizing formula** (Sect. 17.9): \(x = \mu + (z\times \sigma)\).

To estimate the sample size needed (Sect. 26.2) for **estimating a proportion**:

\[ n = \frac{1}{(\text{Margin of error})^2}. \]

To estimate the sample size needed (Sect. 26.3) for **estimating a mean**:

\[ n = \left( \frac{2\times s}{\text{Margin of error}}\right)^2. \]

**Notes:**

- In **sample size calculations**, always **round up** the sample size found from the above formulas.
- \(t\)-scores are like \(z\)-scores, except that the standard deviation of the distribution includes some values estimated from the sample.
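
The two sample-size formulas, including the round-up rule, can be sketched with hypothetical targets (a margin of error of 0.05 for the proportion, and 0.5 units with an estimated \(s = 3\) for the mean):

```python
import math

# Sample size for estimating a proportion to within a margin of
# error of 0.05 (Sect. 26.2): n = 1 / MoE^2, rounded UP.
moe = 0.05
n_prop = math.ceil(1 / moe**2)

# Sample size for estimating a mean to within 0.5 units, using a
# hypothetical estimate s = 3 (Sect. 26.3): n = (2s / MoE)^2.
s, moe_mean = 3, 0.5
n_mean = math.ceil((2 * s / moe_mean) ** 2)

print(n_prop, n_mean)
```

`math.ceil` implements the "always round up" rule: a fractional result such as 240.3 requires 241 subjects, not 240.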