C Symbols, formulas, statistics and parameters

C.1 Symbols and standard errors

TABLE C.1: Some sample statistics used to estimate population parameters. Empty table cells means that these are not studied in this textbook. The dashes means that no formula is given in this textbook.
Parameter Statistic Standard error S.E. formula reference
Proportion \(p\) \(\hat{p}\) \(\displaystyle\text{s.e.}(\hat{p}) = \sqrt{\frac{ \hat{p} \times (1 - \hat{p})}{n}}\) Def. 20.2
Mean \(\mu\) \(\bar{x}\) \(\displaystyle\text{s.e.}(\bar{x}) = \frac{s}{\sqrt{n}}\) Def. 22.1
Standard deviation \(\sigma\) \(s\)
Mean difference \(\mu_d\) \(\bar{d}\) \(\displaystyle\text{s.e.}(\bar{d}) = \frac{s_d}{\sqrt{n}}\) Def. 23.2
Diff. between means \(\mu_1 - \mu_2\) \(\bar{x}_1 - \bar{x}_2\) \(\displaystyle\text{s.e.}(\bar{x}_1 - \bar{x}_2)\) --
Odds ratio Pop. OR Sample OR \(\displaystyle\text{s.e.}(\text{sample OR})\) --
Correlation \(\rho\) \(r\)
Slope of regression line \(\beta_1\) \(b_1\) \(\text{s.e.}(b_1)\) --
Intercept of regression line \(\beta_0\) \(b_0\) \(\text{s.e.}(b_0)\) --
R-squared \(R^2\)

C.2 Confidence intervals

Almost all confidence intervals have the form

\[ \text{statistic} \pm ( \text{multiplier} \times \text{s.e.}(\text{statistic})). \]

Notes:

  • The multiplier is approximately 2 for an approximate 95% CI (based on the 68--95--99.7 rule).
  • \(\text{multiplier} \times \text{s.e.}(\text{statistic})\) is called the margin of error.
  • Confidence intervals for odds ratios are slightly different, so this formula does not apply for odds ratios. For the same reason, a standard error for ORs is not given.

C.3 Hypothesis testing

For many hypothesis tests, the test statistic is a \(t\)-score, which has the form:

\[ t = \frac{\text{statistic} - \text{parameter}}{\text{s.e.}(\text{statistic})}. \]

Notes:

  • Since \(t\)-scores are a little like \(z\)-scores, the 68--95--99.7 rule can be used to approximate \(P\)-values.
  • Tests involving odds ratios do not use \(t\)-scores, so this formula does not apply for tests involving odds ratios.
  • For tests involving odds ratios, the test statistic is a \(\chi^2\) score and not \(t\)-score. For the same reason, a standard error for ORs is not given.
  • The \(\chi^2\) statistic is approximately like a \(z\)-score with a value of (where \(\text{df}\) is the 'degrees of freedom' given in the software output):

\[ \sqrt{\frac{\chi^2}{\text{df}}}. \]

C.4 Other formulas

  • To estimate the sample size needed when estimating a proportion: \(\displaystyle n = \frac{1}{(\text{Margin of error})^2}\).
  • To estimate the sample size needed when estimating a mean: \(\displaystyle n = \left( \frac{2\times s}{\text{Margin of error}}\right)^2\).
  • To calculate \(z\)-scores: \(\displaystyle z = \frac{x - \mu}{\sigma}\) or, more generally, \(\displaystyle z = \frac{\text{specific value of variable} - \text{mean of variable}}{\text{measure of variable's variation}}\).
  • The unstandardizing formula: \(x = \mu + (z\times \sigma)\).

Notes:

  • In sample size calculations, always round up the sample size found from the above formulas.

C.5 Other symbols used

TABLE C.2: Some symbols used
Symbol Meaning Reference
\(H_0\) Null hypothesis Sect. 29.2
\(H_1\) Alternative hypothesis Sect. 29.2
df Degrees of freedom Sect. 32.4
CI Confidence interval Chap. 21
s.e. Standard error Def. 18.3
\(n\) Sample size
\(\chi^2\) The chi-squared test statistic Sect. 32.4