Chapter 2 Notation Summary

This chapter provides a summary of notation used throughout STM1001, designed to be used as a reference. For convenience, the chapter is divided into sections based on the topic for which the notation is most relevant, starting with a section containing general notation used throughout STM1001. The final section contains a consolidated table of all notation used throughout STM1001.

2.1 General notation used throughout STM1001

The table below provides a summary of general notation used throughout STM1001. Most of this notation is introduced early on in the subject, during Topics 1-4, and is then used throughout the subject.

Notation Meaning Comments
n Sample size
$x_i$ The ith x value, usually from a list of n x values, e.g. $(x_1, x_2, \ldots, x_n)$, where i can take any value from 1 to n
$\sum$ Summation sign
$\sum_{i=1}^{n}$ The sum from i=1 to n See the previous chapter for an example
μ Population mean This is a Greek letter pronounced 'm-yoo'
$\bar{x}$ Sample mean The line on top of the x is referred to as a 'bar', so that $\bar{x}$ is pronounced 'x bar'
$\sigma^2$ Population variance σ is a Greek letter pronounced 'sigma'
$s^2$ Sample variance
σ Population standard deviation σ is a Greek letter pronounced 'sigma'
s Sample standard deviation
Q1 Quartile 1: 25% quantile
Q2 Quartile 2: 50% quantile, and also the median
Q3 Quartile 3: 75% quantile
ρ Population correlation This is a Greek letter pronounced 'rho'
r Sample correlation
|x| The 'absolute value' of some number x See the previous chapter for an example
X A random variable (discrete or continuous)
E(X) Expected value (or mean) of X
Var(X) Variance of X
SD(X) Standard deviation of X
$\bar{X}$ Sample mean (random)
± Plus or minus. For example, $x \pm a = (x - a, x + a)$
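
As a rough illustration of how several of the symbols in this table fit together, the following Python sketch computes them for a small made-up data set. The data values are hypothetical, and the sample variance is assumed to use the usual n − 1 divisor, which is what Python's `statistics.variance` does.

```python
import math
import statistics

# Hypothetical data, chosen only so that the results are round numbers.
x = [2, 4, 4, 5, 10]

n = len(x)                    # n: sample size
total = sum(x)                # sum of x_i for i = 1, ..., n
x_bar = total / n             # x-bar: sample mean
s2 = statistics.variance(x)   # s^2: sample variance (assumes the n - 1 divisor)
s = math.sqrt(s2)             # s: sample standard deviation
q2 = statistics.median(x)     # Q2: the median, i.e. the 50% quantile

# "x-bar ± s" written out as the pair (x-bar - s, x-bar + s)
interval = (x_bar - s, x_bar + s)

print(n, total, x_bar, s2, s, q2, interval)
```

Quartiles Q1 and Q3 are left out on purpose: software packages use slightly different conventions for them, so the values may not match those produced in the subject.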

2.2 Topics 3 and 4: Probability, Distributions and Sampling Distributions

The following table provides a summary of notation used in Topics 3 and 4.

The lecture slides for Topic 3 can be found here.

The readings for Topic 3 can be found here.

The lecture slides for Topic 4 can be found here.

The readings for Topic 4 can be found here.

Notation Meaning Comments
Ω Sample space This is a Greek letter pronounced 'omega'
$\emptyset$ Null event (or null set) Also sometimes referred to as 'empty set'
P(A) The probability of event A
$A^C$ The complement of A, or "not A" Sometimes denoted in other ways, e.g. $A'$ or $\bar{A}$
X A random variable (discrete or continuous)
P(X=x) The probability that the random variable X takes the value x. For example, let X denote the number of times you check this document. Then P(X=2) denotes the probability that you check this document exactly 2 times.
E(X) Expected value (or mean) of X
Var(X) Variance of X
SD(X) Standard deviation of X
$\bar{X}$ Sample mean (random)
$X \sim N(\mu, \sigma^2)$ X follows a Normal distribution with mean μ and variance $\sigma^2$ To define a Normal distribution, we need to know the mean and variance
$Z \sim N(0, 1)$ Z follows the "Standard Normal distribution", that is, a Normal distribution with $\mu = 0$ and $\sigma^2 = 1$. The usual convention is to use Z instead of X when using the standard Normal distribution. The variance of Z is $\sigma^2 = 1^2 = 1$. The standard deviation is also 1, since $\sqrt{\sigma^2} = \sqrt{1^2} = 1$
z z-score, defined as $z = \frac{x - \mu}{\sigma}$ For a given value of x, the corresponding z-score can be thought of as its "standardised" value
± Plus or minus. For example, $x \pm a = (x - a, x + a)$
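
The z-score and Normal distribution notation above can be tried out with Python's built-in `statistics.NormalDist`. This is only a sketch: the mean, variance and x value are hypothetical. Note that `NormalDist` is parameterised by the standard deviation σ, whereas the notation N(μ, σ²) in this subject uses the variance.

```python
from statistics import NormalDist

mu, sigma2 = 100.0, 225.0            # hypothetical population mean and variance
sigma = sigma2 ** 0.5                # NormalDist expects sigma, not sigma^2

X = NormalDist(mu=mu, sigma=sigma)   # X ~ N(mu, sigma^2)
Z = NormalDist()                     # Z ~ N(0, 1), the standard Normal

x = 130.0
z = (x - mu) / sigma                 # z-score: the standardised value of x

p_x = X.cdf(x)                       # P(X <= x)
p_z = Z.cdf(z)                       # P(Z <= z), which equals P(X <= x)
p_complement = 1 - p_x               # P(A^C) = 1 - P(A), with A the event X <= x

print(z, p_x, p_z, p_complement)
```

Standardising does not change the probability: P(X ≤ x) and P(Z ≤ z) agree, which is why the standard Normal distribution can be used for any Normal random variable.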

2.3 Topics 5 and 6: Hypothesis testing and t-tests

The following table provides a summary of notation used in Topics 5 and 6.

The lecture slides for Topic 5 can be found here.

The readings for Topic 5 can be found here.

The lecture slides for Topic 6 can be found here.

The readings for Topic 6 can be found here.

Notation Meaning Comments
± Plus or minus. For example, $x \pm a = (x - a, x + a)$
t-distribution The distribution used for t-tests To define a t-distribution, we need to know the degrees of freedom
df Degrees of freedom
$T \sim t_{df}$ T follows a t-distribution with degrees of freedom equal to df. For example, if df=1, then $T \sim t_1$
H0 Null hypothesis
H1 Alternative hypothesis Sometimes denoted Ha
μ0 The population mean under the null hypothesis
T Random test statistic
$\bar{X}$ Sample mean (random)
S Estimator of the standard deviation of X
SE Estimator of the standard error, i.e. standard deviation of the sample mean Equal to $S/\sqrt{n}$
t Observed test statistic Sometimes called the 't value'
$\bar{x}$ Observed sample mean
s Observed standard deviation
se Observed standard error, i.e. observed standard deviation of the sample mean Equal to $s/\sqrt{n}$
p p-value
α Significance level, such that if p<α, we reject H0 Usually α=0.05, but different values for α can be chosen
Type I error The error that occurs when we reject H0 when H0 is true
Type II error The error that occurs when we fail to reject H0 when H0 is false
$t_{df,1-\alpha/2}$ The value from the $t_{df}$ distribution such that $P(T \le t_{df,1-\alpha/2}) = 1 - \alpha/2$, i.e. the $(1-\alpha/2)$th quantile. For example, if α=0.05, $1 - \alpha/2 = 1 - 0.05/2 = 1 - 0.025 = 0.975$. We then have that $P(T \le t_{df,0.975}) = 0.975$
d Effect size, Cohen's d We use Cohen's d for t-tests
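
To show how the observed quantities in this table relate to each other, here is a minimal Python sketch for a one-sample setting. The data and the value of μ0 are hypothetical. The formulas t = (x̄ − μ0)/se, df = n − 1 and d = (x̄ − μ0)/s are the usual one-sample versions and are assumed here rather than taken from the table; only se = s/√n is stated above.

```python
import math
import statistics

x = [2, 4, 4, 5, 10]          # hypothetical sample
mu_0 = 3.0                    # hypothetical population mean under H0

n = len(x)                    # sample size
df = n - 1                    # degrees of freedom (assumed one-sample t-test)
x_bar = statistics.mean(x)    # observed sample mean
s = statistics.stdev(x)       # observed standard deviation
se = s / math.sqrt(n)         # observed standard error, s / sqrt(n)

t = (x_bar - mu_0) / se       # observed test statistic (assumed formula)
d = (x_bar - mu_0) / s        # Cohen's d for one sample (assumed formula)

print(df, x_bar, s, se, t, d)
```

The p-value and the quantile $t_{df,1-\alpha/2}$ are not computed here because the t-distribution is not part of Python's standard library; statistical software would normally supply them.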

2.4 Topic 7: One-way ANOVA

The following table provides a summary of notation used in Topic 7.

The lecture slides for Topic 7 can be found here.

The readings for Topic 7 can be found here.

Notation Meaning Comments
H0 Null hypothesis
H1 Alternative hypothesis Sometimes denoted Ha
p p-value
α Significance level, such that if p<α, we reject H0 Usually α=0.05, but different values for α can be chosen
Type I error The error that occurs when we reject H0 when H0 is true
Type II error The error that occurs when we fail to reject H0 when H0 is false
ANOVA ANalysis Of VAriance
F-distribution The distribution used for ANOVA Hypothesis tests To define the F-distribution, we need to know d1 and d2
d1 Degrees of freedom 1
d2 Degrees of freedom 2
N Total sample size (in the context of ANOVA)
k Number of groups (in the context of ANOVA)
$F_{d_1,d_2}$ F-distribution with degrees of freedom 1 equal to d1 and degrees of freedom 2 equal to d2. For example, if d1=3 and d2=45, then our distribution is $F_{3,45}$
$\eta^2$ Effect size, 'eta squared' η is a Greek letter pronounced 'eta'. We use $\eta^2$ for One-way ANOVA tests
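
The following Python sketch shows one way the ANOVA symbols above could be computed by hand for a tiny made-up data set. It assumes the standard one-way ANOVA relationships d1 = k − 1 and d2 = N − k, and the standard definition of η² as the between-groups sum of squares divided by the total sum of squares; these formulas are not spelled out in the table, and the variable names are illustrative only.

```python
# Hypothetical data: k = 3 groups with 3 observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

k = len(groups)                                  # number of groups
N = sum(len(g) for g in groups)                  # total sample size
grand_mean = sum(sum(g) for g in groups) / N     # mean of all N observations

# Between-groups and within-groups sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
ss_total = ss_between + ss_within

d1 = k - 1                                       # degrees of freedom 1 (assumed)
d2 = N - k                                       # degrees of freedom 2 (assumed)
F = (ss_between / d1) / (ss_within / d2)         # observed F statistic
eta2 = ss_between / ss_total                     # effect size, eta squared

print(d1, d2, F, eta2)
```

With these numbers the test statistic would be compared against the $F_{2,6}$ distribution.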

2.5 Topic 8: Correlation and Simple Linear Regression

The following table provides a summary of notation used in Topic 8.

The lecture slides for Topic 8 can be found here.

The readings for Topic 8 can be found here.

Notation Meaning Comments
ρ Population correlation This is a Greek letter pronounced 'rho'
r Sample correlation
H0 Null hypothesis
H1 Alternative hypothesis Sometimes denoted Ha
p p-value
α Significance level, such that if p<α, we reject H0 Usually α=0.05, but different values for α can be chosen
Type I error The error that occurs when we reject H0 when H0 is true
Type II error The error that occurs when we fail to reject H0 when H0 is false
x The explanatory variable, also referred to as the independent variable or predictor variable
y The response variable, also referred to as the dependent variable
β0 Intercept coefficient in the simple linear regression model β is a Greek letter pronounced 'beta'. β0 is pronounced 'beta nought'
β1 Slope coefficient in the simple linear regression model β is a Greek letter pronounced 'beta'. β1 is pronounced 'beta 1'
ϵ Random error term in the simple linear regression model
$\hat{y}$ The estimated value for y based on a simple linear regression model The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{y}$ is pronounced 'y-hat'
$\hat{\beta}_0$ The estimated value for β0 The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{\beta}_0$ is pronounced 'β0-hat'
$\hat{\beta}_1$ The estimated value for β1 Following the same convention, $\hat{\beta}_1$ is pronounced 'β1-hat'
$R^2$ Coefficient of Determination. This value can be used to evaluate the fit of a simple linear regression model and, in simple linear regression, is equal to the square of the sample correlation r.
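
As a hedged illustration of the regression notation, the Python sketch below fits a simple linear regression to five made-up points using the standard least-squares formulas (assumed here, since the table does not state them) and checks that R² matches the squared sample correlation. All variable names are illustrative.

```python
import math

# Hypothetical data: explanatory variable x and response variable y.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

beta1_hat = s_xy / s_xx                  # estimated slope
beta0_hat = y_bar - beta1_hat * x_bar    # estimated intercept
y_hat = [beta0_hat + beta1_hat * xi for xi in x]   # estimated values of y

r = s_xy / math.sqrt(s_xx * s_yy)        # sample correlation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
r_squared = 1 - sse / s_yy               # coefficient of determination

print(beta0_hat, beta1_hat, r, r_squared)
```

Here `r_squared` and `r ** 2` agree up to rounding, which is the relationship noted in the last row of the table.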

2.6 Topic 9: Hypothesis testing for one and two sample proportions

The following table provides a summary of notation used in Topic 9.

The lecture slides for Topic 9 can be found here.

The readings for Topic 9 can be found here.

Notation Meaning Comments
H0 Null hypothesis
H1 Alternative hypothesis Sometimes denoted Ha
p p-value
α Significance level, such that if p<α, we reject H0 Usually α=0.05, but different values for α can be chosen
Type I error The error that occurs when we reject H0 when H0 is true
Type II error The error that occurs when we fail to reject H0 when H0 is false
p In the context of a one-sample test of proportions, p is the proportion of a population with a certain characteristic Not to be confused with the p-value in the context of hypothesis testing
n In the context of a one-sample test of proportions, n is either the number of observations in a random sample, or the number of independent trials
x In the context of a one-sample test of proportions, x is either the number of observations in the sample that have a certain characteristic, or the number of successes in n trials
p0 The population proportion under the null hypothesis
$\hat{p}$ The estimate of p
p1 The proportion of Population 1 (or Group 1) with a certain characteristic
p2 The proportion of Population 2 (or Group 2) with a certain characteristic
n1 The sample size from Population 1 (or Group 1)
n2 The sample size from Population 2 (or Group 2)
x1 The number of individuals in the sample from Population 1 (or Group 1) exhibiting the trait of interest
x2 The number of individuals in the sample from Population 2 (or Group 2) exhibiting the trait of interest
$\hat{p}_1$ The estimate of p1
$\hat{p}_2$ The estimate of p2
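
The estimates in this table are sample proportions. The Python sketch below uses hypothetical counts and assumes the usual formula p̂ = x/n. It also computes a one-sample z statistic and two-sided p-value based on the Normal approximation; that particular test statistic is an assumption for illustration and may differ from the version used in the Topic 9 materials.

```python
import math
from statistics import NormalDist

# One-sample case (hypothetical counts).
x, n = 60, 100          # x individuals with the characteristic out of n
p0 = 0.5                # population proportion under H0
p_hat = x / n           # estimate of p

# Assumed test statistic: Normal approximation with the standard error under H0.
se0 = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value

# Two-sample case (hypothetical counts).
x1, n1 = 45, 150
x2, n2 = 30, 120
p1_hat = x1 / n1        # estimate of p1
p2_hat = x2 / n2        # estimate of p2

print(p_hat, z, p_value, p1_hat, p2_hat)
```

Note the two different uses of the letter p here: `p_hat` and `p0` are proportions, while `p_value` is the p-value of the hypothesis test.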

2.7 Topic 10: Chi-squared tests for categorical data

The following table provides a summary of notation used in Topic 10.

The lecture slides for Topic 10 can be found here.

The readings for Topic 10 can be found here.

Notation Meaning Comments
$\sum$ Summation sign
$\sum_{i=1}^{n}$ The sum from i=1 to n See the previous chapter for an example
H0 Null hypothesis
H1 Alternative hypothesis Sometimes denoted Ha
p p-value
α Significance level, such that if p<α, we reject H0 Usually α=0.05, but different values for α can be chosen
Type I error The error that occurs when we reject H0 when H0 is true
Type II error The error that occurs when we fail to reject H0 when H0 is false
$\chi^2$-distribution The distribution used for Chi-squared tests χ is a Greek letter pronounced 'ky'
$\chi^2_{df}$ $\chi^2$-distribution with degrees of freedom equal to df. For example, if df=5, then our distribution is $\chi^2_5$
$X^2$ The random test statistic for a Chi-squared test. $X^2 \sim \chi^2_{df}$ under H0
$\chi^2$ The observed test statistic for a Chi-squared test
$O_i$ The observed frequency in the ith category in a Chi-squared goodness of fit test
$E_i$ The expected frequency for the ith category in a Chi-squared goodness of fit test
k The number of categories in a Chi-squared goodness of fit test
$O_{ij}$ The observed frequency in the ith row and the jth column in a Chi-squared test of independence
$E_{ij}$ The expected frequency of the ith row and the jth column in a Chi-squared test of independence
r The number of rows in a Chi-squared test of independence Not to be confused with the sample correlation r in the context of correlation
c The number of columns in a Chi-squared test of independence
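
The observed and expected frequencies above combine into the observed test statistic. The Python sketch below assumes the standard Pearson formula, the sum of (O − E)²/E over all categories or cells, with df = k − 1 for a goodness of fit test and df = (r − 1)(c − 1) for a test of independence. The counts are hypothetical, and the goodness of fit example assumes equal expected frequencies under H0.

```python
# Goodness of fit test (hypothetical observed frequencies O_i).
observed = [18, 22, 20, 40]
k = len(observed)                              # number of categories
expected = [sum(observed) / k] * k             # E_i, assuming equal proportions under H0
chi2_gof = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df_gof = k - 1

# Test of independence (hypothetical r x c table of O_ij).
table = [[10, 20],
         [30, 40]]
r, c = len(table), len(table[0])               # number of rows and columns
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

# E_ij = (row i total) * (column j total) / (overall total)
expected_ij = [[row_totals[i] * col_totals[j] / total for j in range(c)]
               for i in range(r)]
chi2_ind = sum((table[i][j] - expected_ij[i][j]) ** 2 / expected_ij[i][j]
               for i in range(r) for j in range(c))
df_ind = (r - 1) * (c - 1)

print(chi2_gof, df_gof, chi2_ind, df_ind)
```

In this sketch `r` is the number of rows, not the sample correlation, matching the warning in the table.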

2.8 Topic 11: Statistical Power and Sample Size Calculation

The following table provides a summary of notation used in Topic 11.

The lecture slides for Topic 11 can be found here.

The readings for Topic 11 can be found here.

Notation Meaning Comments
H0 Null hypothesis
H1 Alternative hypothesis Sometimes denoted Ha
p p-value
α Significance level, such that if p<α, we reject H0 Usually α=0.05, but different values for α can be chosen
Type I error The error that occurs when we reject H0 when H0 is true
Type II error The error that occurs when we fail to reject H0 when H0 is false
α The probability of a Type I Error This is also the significance level
β The probability of a Type II Error Not to be confused with β0 or β1 in the context of simple linear regression
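
To make the link between α, β and statistical power concrete, here is a small Python sketch. It assumes a one-sided z-test with known standard deviation, which is a simplification chosen so that only the standard library is needed and may not be the test used in Topic 11. The effect size and sample size are hypothetical, and power is taken to be 1 − β.

```python
from statistics import NormalDist

alpha = 0.05      # probability of a Type I error (significance level)
n = 25            # hypothetical sample size
effect = 0.5      # hypothetical true effect, (mu - mu_0) / sigma

Z = NormalDist()                      # standard Normal distribution
z_crit = Z.inv_cdf(1 - alpha)         # reject H0 when the z statistic exceeds z_crit

# Under the alternative, the z statistic is Normal with mean effect * sqrt(n)
# and standard deviation 1, so the chance of rejecting H0 is:
power = 1 - Z.cdf(z_crit - effect * n ** 0.5)
beta = 1 - power                      # probability of a Type II error

print(z_crit, power, beta)
```

Increasing `n` or `effect` in this sketch raises `power` and lowers `beta`, while `alpha` stays fixed at the chosen significance level.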

2.9 Complete table of all notation used throughout STM1001

The following table provides a summary of all notation used in STM1001. All notation summarised in the previous sections of this chapter is provided in the consolidated table below.

Notation Meaning Comments
n Sample size
$x_i$ The ith x value, usually from a list of n x values, e.g. $(x_1, x_2, \ldots, x_n)$, where i can take any value from 1 to n
$\sum$ Summation sign
$\sum_{i=1}^{n}$ The sum from i=1 to n See the previous chapter for an example
μ Population mean This is a Greek letter pronounced 'm-yoo'
$\bar{x}$ Sample mean The line on top of the x is referred to as a 'bar', so that $\bar{x}$ is pronounced 'x bar'
$\sigma^2$ Population variance σ is a Greek letter pronounced 'sigma'
$s^2$ Sample variance
σ Population standard deviation σ is a Greek letter pronounced 'sigma'
s Sample standard deviation
Q1 Quartile 1: 25% quantile
Q2 Quartile 2: 50% quantile, and also the median
Q3 Quartile 3: 75% quantile
ρ Population correlation This is a Greek letter pronounced 'rho'
r Sample correlation
|x| The 'absolute value' of some number x See the previous chapter for an example
Ω Sample space This is a Greek letter pronounced 'omega'
$\emptyset$ Null event (or null set) Also sometimes referred to as 'empty set'
P(A) The probability of event A
$A^C$ The complement of A, or "not A" Sometimes denoted in other ways, e.g. $A'$ or $\bar{A}$
X A random variable (discrete or continuous)
P(X=x) The probability that the random variable X takes the value x. For example, let X denote the number of times you check this document. Then P(X=2) denotes the probability that you check this document exactly 2 times.
E(X) Expected value (or mean) of X
Var(X) Variance of X
SD(X) Standard deviation of X
$\bar{X}$ Sample mean (random)
$X \sim N(\mu, \sigma^2)$ X follows a Normal distribution with mean μ and variance $\sigma^2$ To define a Normal distribution, we need to know the mean and variance
$Z \sim N(0, 1)$ Z follows the "Standard Normal distribution", that is, a Normal distribution with $\mu = 0$ and $\sigma^2 = 1$. The usual convention is to use Z instead of X when using the standard Normal distribution. The variance of Z is $\sigma^2 = 1^2 = 1$. The standard deviation is also 1, since $\sqrt{\sigma^2} = \sqrt{1^2} = 1$
z z-score, defined as $z = \frac{x - \mu}{\sigma}$ For a given value of x, the corresponding z-score can be thought of as its "standardised" value
± Plus or minus. For example, $x \pm a = (x - a, x + a)$
t-distribution The distribution used for t-tests To define a t-distribution, we need to know the degrees of freedom
df Degrees of freedom
$T \sim t_{df}$ T follows a t-distribution with degrees of freedom equal to df. For example, if df=1, then $T \sim t_1$
H0 Null hypothesis
H1 Alternative hypothesis Sometimes denoted Ha
μ0 The population mean under the null hypothesis
T Random test statistic
S Estimator of the standard deviation of X
SE Estimator of the standard error, i.e. standard deviation of the sample mean Equal to $S/\sqrt{n}$
t Observed test statistic Sometimes called the 't value'
$\bar{x}$ Observed sample mean
s Observed standard deviation
se Observed standard error, i.e. observed standard deviation of the sample mean Equal to $s/\sqrt{n}$
p p-value
α Significance level, such that if p<α, we reject H0 Usually α=0.05, but different values for α can be chosen
Type I error The error that occurs when we reject H0 when H0 is true
Type II error The error that occurs when we fail to reject H0 when H0 is false
$t_{df,1-\alpha/2}$ The value from the $t_{df}$ distribution such that $P(T \le t_{df,1-\alpha/2}) = 1 - \alpha/2$, i.e. the $(1-\alpha/2)$th quantile. For example, if α=0.05, $1 - \alpha/2 = 1 - 0.05/2 = 1 - 0.025 = 0.975$. We then have that $P(T \le t_{df,0.975}) = 0.975$
d Effect size, Cohen's d We use Cohen's d for t-tests
ANOVA ANalysis Of VAriance
F-distribution The distribution used for ANOVA Hypothesis tests To define the F-distribution, we need to know d1 and d2
d1 Degrees of freedom 1
d2 Degrees of freedom 2
N Total sample size (in the context of ANOVA)
k Number of groups (in the context of ANOVA)
$F_{d_1,d_2}$ F-distribution with degrees of freedom 1 equal to d1 and degrees of freedom 2 equal to d2. For example, if d1=3 and d2=45, then our distribution is $F_{3,45}$
$\eta^2$ Effect size, 'eta squared' η is a Greek letter pronounced 'eta'. We use $\eta^2$ for One-way ANOVA tests
x In the context of simple linear regression, x is the explanatory variable, also referred to as the independent variable or predictor variable
y In the context of simple linear regression, y is the response variable, also referred to as the dependent variable
β0 Intercept coefficient in the simple linear regression model β is a Greek letter pronounced 'beta'. β0 is pronounced 'beta nought'
β1 Slope coefficient in the simple linear regression model β is a Greek letter pronounced 'beta'. β1 is pronounced 'beta 1'
ϵ Random error term in the simple linear regression model
$\hat{y}$ The estimated value for y based on a simple linear regression model The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{y}$ is pronounced 'y-hat'
$\hat{\beta}_0$ The estimated value for β0 The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{\beta}_0$ is pronounced 'β0-hat'
$\hat{\beta}_1$ The estimated value for β1 Following the same convention, $\hat{\beta}_1$ is pronounced 'β1-hat'
$R^2$ Coefficient of Determination. This value can be used to evaluate the fit of a simple linear regression model and, in simple linear regression, is equal to the square of the sample correlation r.
p In the context of a one-sample test of proportions, p is the proportion of a population with a certain characteristic Not to be confused with the p-value in the context of hypothesis testing
n In the context of a one-sample test of proportions, n is either the number of observations in a random sample, or the number of independent trials
x In the context of a one-sample test of proportions, x is either the number of observations in the sample that have a certain characteristic, or the number of successes in n trials
p0 The population proportion under the null hypothesis
$\hat{p}$ The estimate of p
p1 In the context of a two-sample test of proportions, p1 is the proportion of Population 1 (or Group 1) with a certain characteristic
p2 In the context of a two-sample test of proportions, p2 is the proportion of Population 2 (or Group 2) with a certain characteristic
n1 In the context of a two-sample test of proportions, n1 is the sample size from Population 1 (or Group 1)
n2 In the context of a two-sample test of proportions, n2 is the sample size from Population 2 (or Group 2)
x1 In the context of a two-sample test of proportions, x1 is the number of individuals in the sample from Population 1 (or Group 1) exhibiting the trait of interest
x2 In the context of a two-sample test of proportions, x2 is the number of individuals in the sample from Population 2 (or Group 2) exhibiting the trait of interest
$\hat{p}_1$ The estimate of p1
$\hat{p}_2$ The estimate of p2
$\chi^2$-distribution The distribution used for Chi-squared tests χ is a Greek letter pronounced 'ky'
$\chi^2_{df}$ $\chi^2$-distribution with degrees of freedom equal to df. For example, if df=5, then our distribution is $\chi^2_5$
$X^2$ The random test statistic for a Chi-squared test. $X^2 \sim \chi^2_{df}$ under H0
$\chi^2$ The observed test statistic for a Chi-squared test
$O_i$ The observed frequency in the ith category in a Chi-squared goodness of fit test
$E_i$ The expected frequency for the ith category in a Chi-squared goodness of fit test
k The number of categories in a Chi-squared goodness of fit test
$O_{ij}$ The observed frequency in the ith row and the jth column in a Chi-squared test of independence
$E_{ij}$ The expected frequency of the ith row and the jth column in a Chi-squared test of independence
r The number of rows in a Chi-squared test of independence Not to be confused with the sample correlation r in the context of correlation
c The number of columns in a Chi-squared test of independence
α The probability of a Type I Error This is also the significance level
β The probability of a Type II Error Not to be confused with β0 or β1 in the context of simple linear regression