Chapter 2 Notation Summary
This chapter provides a summary of notation used throughout STM1001, designed to be used as a reference. For convenience, the chapter is divided into sections based on the topic for which the notation is most relevant, starting with a section containing general notation used throughout STM1001. The final section contains a consolidated table of all notation used throughout STM1001.
2.1 General notation used throughout STM1001
The table below provides a summary of general notation used throughout STM1001. Most of this notation is introduced early on in the subject, during Topics 1-4, and is then used throughout the subject.
Notation | Meaning | Comments |
---|---|---|
n | Sample size | |
$x_i$ | The $i$th $x$ value, usually from a list of $n$ $x$ values, e.g. $(x_1, x_2, \ldots, x_n)$, where $i$ can take any value from 1 to $n$ | |
∑ | Summation sign | |
$\sum_{i=1}^{n}$ | The sum from $i=1$ to $n$ | See the previous chapter for an example |
μ | Population mean | This is a Greek letter pronounced 'm-yoo' |
$\bar{x}$ | Sample mean | The line on top of the $x$ is referred to as a 'bar', so that $\bar{x}$ is pronounced 'x bar' |
$\sigma^2$ | Population variance | $\sigma$ is a Greek letter pronounced 'sigma' |
$s^2$ | Sample variance | |
σ | Population standard deviation | σ is a Greek letter pronounced 'sigma' |
s | Sample standard deviation | |
Q1 | Quartile 1: 25% quantile | |
Q2 | Quartile 2: 50% quantile, and also the median | |
Q3 | Quartile 3: 75% quantile | |
ρ | Population correlation | This is a Greek letter pronounced 'rho' |
r | Sample correlation | |
$\lvert x \rvert$ | The 'absolute value' of some number $x$ | See the previous chapter for an example |
X | A random variable (discrete or continuous) | |
E(X) | Expected value (or mean) of X | |
Var(X) | Variance of X | |
SD(X) | Standard deviation of X | |
$\bar{X}$ | Sample mean (random) | |
± | Plus or minus. For example, $x \pm a = (x-a, x+a)$ | |
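
As a rough illustration of how several of the symbols in the table above relate to one another, the following Python sketch computes $n$, $\bar{x}$, $s^2$, $s$ and $Q_2$ for a small made-up sample. The data values and the variable names are assumptions chosen for illustration only. The sketch also assumes the usual definition of the sample variance, which divides by $n-1$.

```python
import math
import statistics

# Hypothetical sample x_1, ..., x_n (illustrative values only).
x = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(x)                      # sample size n

# Sample mean: x-bar is the sum of x_i for i = 1..n, divided by n.
x_bar = sum(x) / n              # 5.0

# Sample variance s^2 (dividing by n - 1) and sample standard deviation s.
s2 = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)   # 9.0
s = math.sqrt(s2)                                   # 3.0

# Q2 is the 50% quantile, i.e. the median.
q2 = statistics.median(x)       # 4.0

# "x plus or minus a" written out as the pair (x - a, x + a).
a = 1.5
interval = (x_bar - a, x_bar + a)   # (3.5, 6.5)

print(n, x_bar, s2, s, q2, interval)
```

$Q_1$ and $Q_3$ are left out of the sketch on purpose, because different software packages use slightly different rules for computing the 25% and 75% quantiles.
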
2.2 Topics 3 and 4: Probability, Distributions and Sampling Distributions
The following table provides a summary of notation used in Topics 3 and 4.
The lecture slides for Topic 3 can be found here.
The readings for Topic 3 can be found here.
The lecture slides for Topic 4 can be found here.
The readings for Topic 4 can be found here.
Notation | Meaning | Comments |
---|---|---|
Ω | Sample space | This is a Greek letter pronounced 'omega' |
∅ | Null event (or null set) | Also sometimes referred to as 'empty set' |
P(A) | The probability of event A | |
$A^C$ | The complement of $A$, or "not $A$" | Sometimes denoted as $A'$ |
X | A random variable (discrete or continuous) | |
P(X=x) | The probability that the random variable X takes the value x. For example, let X denote the number of times you check this document. Then P(X=2) denotes the probability that you check this document exactly 2 times. | |
E(X) | Expected value (or mean) of X | |
Var(X) | Variance of X | |
SD(X) | Standard deviation of X | |
$\bar{X}$ | Sample mean (random) | |
$X \sim N(\mu, \sigma^2)$ | $X$ follows a Normal distribution with mean $\mu$ and variance $\sigma^2$ | To define a Normal distribution, we need to know the mean and variance |
$Z \sim N(0, 1)$ | $Z$ follows the "Standard Normal distribution", that is, a Normal distribution with $\mu = 0$ and $\sigma^2 = 1$. The usual convention is to use $Z$ instead of $X$ when using the standard Normal distribution. | The variance of $Z$ is $\sigma^2 = 1^2 = 1$. The standard deviation is also 1, since $\sqrt{\sigma^2} = \sqrt{1^2} = 1$ |
$z$ | z-score, defined as $z = \frac{x - \mu}{\sigma}$ | For a given value of $x$, the corresponding z-score can be thought of as its "standardised" value |
± | Plus or minus. For example, $x \pm a = (x-a, x+a)$ | |
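
To make the z-score row above concrete, here is a minimal Python sketch using the standard library's `statistics.NormalDist`. The values of $\mu$, $\sigma$ and $x$ are hypothetical. Note that `NormalDist` is parameterised by the standard deviation $\sigma$, whereas the notation $N(\mu, \sigma^2)$ in the table uses the variance.

```python
from statistics import NormalDist

# Hypothetical parameters: X ~ N(mu, sigma^2) with mu = 100 and sigma = 15.
mu, sigma = 100.0, 15.0
X = NormalDist(mu=mu, sigma=sigma)   # takes sigma, not sigma^2
Z = NormalDist(mu=0.0, sigma=1.0)    # the Standard Normal distribution

# z-score of a hypothetical observed value x.
x = 130.0
z = (x - mu) / sigma                 # 2.0

# Standardising does not change the probability: P(X <= x) equals P(Z <= z).
p_x = X.cdf(x)
p_z = Z.cdf(z)

print(z, p_x, p_z)                   # both probabilities are about 0.977
```
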
2.3 Topics 5 and 6: Hypothesis testing and t-tests
The following table provides a summary of notation used in Topics 5 and 6.
The lecture slides for Topic 5 can be found here.
The readings for Topic 5 can be found here.
The lecture slides for Topic 6 can be found here.
The readings for Topic 6 can be found here.
Notation | Meaning | Comments |
---|---|---|
± | Plus or minus. For example, x±a=(x−a,x+a) | |
t-distribution | The distribution used for t-tests | To define a t-distribution, we need to know the degrees of freedom |
df | Degrees of freedom | |
$T \sim t_{df}$ | $T$ follows a t-distribution with degrees of freedom equal to $df$. For example, if $df = 1$, then $T \sim t_1$ | |
H0 | Null hypothesis | |
H1 | Alternative hypothesis | Sometimes denoted Ha |
μ0 | The population mean under the null hypothesis | |
T | Random test statistic | |
$\bar{X}$ | Sample mean (random) | |
$S$ | Estimator of the standard deviation of $X$ | |
SE | Estimator of the standard error, i.e. standard deviation of the sample mean | Equal to $S/\sqrt{n}$ |
$t$ | Observed test statistic | Sometimes called the 't value' |
$\bar{x}$ | Observed sample mean | |
$s$ | Observed standard deviation | |
se | Observed standard error, i.e. observed standard deviation of the sample mean | Equal to $s/\sqrt{n}$ |
p | p-value | |
α | Significance level, such that if p<α, we reject H0 | Usually α=0.05, but different values for α can be chosen |
Type I error | The error that occurs when we reject H0 when H0 is true | |
Type II error | The error that occurs when we fail to reject H0 when H0 is false | |
$t_{df,\,1-\alpha/2}$ | The value from the $t_{df}$ distribution such that $P(T \le t_{df,\,1-\alpha/2}) = 1-\alpha/2$, i.e. the $(1-\alpha/2)$th quantile. For example, if $\alpha = 0.05$, then $1-\alpha/2 = 1-0.05/2 = 1-0.025 = 0.975$. We then have that $P(T \le t_{df,\,0.975}) = 0.975$ | |
d | Effect size, Cohen's d | We use Cohen's d for t-tests |
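
The observed quantities in the table above can be tied together with a short Python sketch. It assumes a one-sample t-test, for which the usual observed test statistic is $t = (\bar{x} - \mu_0)/\text{se}$ with $df = n - 1$, and the usual one-sample form of Cohen's $d$, namely $(\bar{x} - \mu_0)/s$. The data and the value of $\mu_0$ are made up for illustration.

```python
import math
import statistics

# Hypothetical observed sample and null-hypothesis mean mu_0 (illustrative only).
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2]
mu_0 = 5.0

n = len(sample)
x_bar = statistics.mean(sample)   # observed sample mean
s = statistics.stdev(sample)      # observed standard deviation (n - 1 divisor)
se = s / math.sqrt(n)             # observed standard error, s / sqrt(n)

# Observed test statistic and degrees of freedom (one-sample t-test assumed).
t = (x_bar - mu_0) / se
df = n - 1

# Cohen's d in its one-sample form (an assumption about which form is meant).
d = (x_bar - mu_0) / s

print(round(t, 3), df, round(d, 3))
```

The p-value and the quantile $t_{df,\,1-\alpha/2}$ require the t-distribution, which is not part of the Python standard library, so they are not computed in this sketch.
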
2.4 Topic 7: One-way ANOVA
The following table provides a summary of notation used in Topic 7.
The lecture slides for Topic 7 can be found here.
The readings for Topic 7 can be found here.
Notation | Meaning | Comments |
---|---|---|
H0 | Null hypothesis | |
H1 | Alternative hypothesis | Sometimes denoted Ha |
p | p-value | |
α | Significance level, such that if p<α, we reject H0 | Usually α=0.05, but different values for α can be chosen |
Type I error | The error that occurs when we reject H0 when H0 is true | |
Type II error | The error that occurs when we fail to reject H0 when H0 is false | |
ANOVA | ANalysis Of VAriance | |
F-distribution | The distribution used for ANOVA Hypothesis tests | To define the F-distribution, we need to know d1 and d2 |
d1 | Degrees of freedom 1 | |
d2 | Degrees of freedom 2 | |
N | Total sample size (in the context of ANOVA) | |
k | Number of groups (in the context of ANOVA) | |
$F_{d_1,d_2}$ | F-distribution with degrees of freedom 1 equal to $d_1$ and degrees of freedom 2 equal to $d_2$. For example, if $d_1 = 3$ and $d_2 = 45$, then our distribution is $F_{3,45}$ | |
$\eta^2$ | Effect size, 'eta squared' | $\eta$ is a Greek letter pronounced 'eta'. We use $\eta^2$ for One-way ANOVA tests |
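
The following Python sketch shows how $N$, $k$, $d_1$, $d_2$ and $\eta^2$ fit together in a One-way ANOVA. It assumes the standard definitions $d_1 = k - 1$ and $d_2 = N - k$, an F statistic equal to the between-group mean square divided by the within-group mean square, and $\eta^2$ equal to the between-group sum of squares divided by the total sum of squares. The group data and variable names are hypothetical.

```python
# Hypothetical data: k = 3 groups of 3 observations each (illustrative only).
groups = [
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
    [8.0, 9.0, 10.0],
]

k = len(groups)                          # number of groups
N = sum(len(g) for g in groups)          # total sample size
grand_mean = sum(sum(g) for g in groups) / N

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
ss_total = ss_between + ss_within

# Degrees of freedom for the F-distribution.
d1 = k - 1
d2 = N - k

# Observed F statistic and effect size eta squared.
F = (ss_between / d1) / (ss_within / d2)
eta_squared = ss_between / ss_total

print(d1, d2, F, eta_squared)            # 2 6 12.0 0.8
```
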
2.5 Topic 8: Correlation and Simple Linear Regression
The following table provides a summary of notation used in Topic 8.
The lecture slides for Topic 8 can be found here.
The readings for Topic 8 can be found here.
Notation | Meaning | Comments |
---|---|---|
ρ | Population correlation | This is a Greek letter pronounced 'rho' |
r | Sample correlation | |
H0 | Null hypothesis | |
H1 | Alternative hypothesis | Sometimes denoted Ha |
p | p-value | |
α | Significance level, such that if p<α, we reject H0 | Usually α=0.05, but different values for α can be chosen |
Type I error | The error that occurs when we reject H0 when H0 is true | |
Type II error | The error that occurs when we fail to reject H0 when H0 is false | |
x | The explanatory variable, also referred to as the independent variable or predictor variable | |
y | The response variable, also referred to as the dependent variable | |
β0 | Intercept coefficient in the simple linear regression model | β is a Greek letter pronounced 'beta'. β0 is pronounced 'beta nought' |
β1 | Slope coefficient in the simple linear regression model | β is a Greek letter pronounced 'beta'. β1 is pronounced 'beta 1' |
ϵ | Random error term in the simple linear regression model | |
$\hat{y}$ | The estimated value for $y$ based on a simple linear regression model | The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{y}$ is pronounced 'y-hat' |
$\hat{\beta}_0$ | The estimated value for $\beta_0$ | The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{\beta}_0$ is pronounced '$\beta_0$-hat' |
$\hat{\beta}_1$ | The estimated value for $\beta_1$ | $\hat{\beta}_1$ is pronounced '$\beta_1$-hat' |
$R^2$ | Coefficient of Determination. This value can be used to evaluate the fit of a simple linear regression model and is also the correlation squared, i.e. $R^2 = r^2$. | |
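
A small Python sketch can illustrate how $\hat{\beta}_0$, $\hat{\beta}_1$, $\hat{y}$, $r$ and $R^2$ are related. It assumes the usual least-squares estimates for the simple linear regression model, and the data are made up. The final lines check numerically that $R^2$ equals the squared sample correlation.

```python
import math

# Hypothetical (x, y) data: x is the explanatory variable, y the response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

# Least-squares estimates of the slope and intercept.
beta1_hat = sxy / sxx                      # 0.6
beta0_hat = y_bar - beta1_hat * x_bar      # 2.2

# Estimated (fitted) values y-hat for each x.
y_hat = [beta0_hat + beta1_hat * xi for xi in x]

# Sample correlation r, and R^2 computed from the residual sum of squares.
r = sxy / math.sqrt(sxx * syy)
ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
R2 = 1 - ss_res / syy

print(round(beta0_hat, 3), round(beta1_hat, 3), round(r ** 2, 3), round(R2, 3))
```
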
2.6 Topic 9: Hypothesis testing for one and two sample proportions
The following table provides a summary of notation used in Topic 9.
The lecture slides for Topic 9 can be found here.
The readings for Topic 9 can be found here.
Notation | Meaning | Comments |
---|---|---|
H0 | Null hypothesis | |
H1 | Alternative hypothesis | Sometimes denoted Ha |
p | p-value | |
α | Significance level, such that if p<α, we reject H0 | Usually α=0.05, but different values for α can be chosen |
Type I error | The error that occurs when we reject H0 when H0 is true | |
Type II error | The error that occurs when we fail to reject H0 when H0 is false | |
p | In the context of a one-sample test of proportions, p is the proportion of a population with a certain characteristic | Not to be confused with the p-value in the context of hypothesis testing |
n | In the context of a one-sample test of proportions, n is either the number of observations in a random sample, or the number of independent trials | |
x | In the context of a one-sample test of proportions, x is either the number of observations in the sample that have a certain characteristic, or the number of successes in n trials | |
p0 | The population proportion under the null hypothesis | |
$\hat{p}$ | The estimate of $p$ | |
p1 | The proportion of Population 1 (or Group 1) with a certain characteristic | |
p2 | The proportion of Population 2 (or Group 2) with a certain characteristic | |
n1 | The sample size from Population 1 (or Group 1) | |
n2 | The sample size from Population 2 (or Group 2) | |
x1 | The number of individuals in the sample from Population 1 (or Group 1) exhibiting the trait of interest | |
x2 | The number of individuals in the sample from Population 2 (or Group 2) exhibiting the trait of interest | |
$\hat{p}_1$ | The estimate of $p_1$ | |
$\hat{p}_2$ | The estimate of $p_2$ | |
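
As a hedged illustration of the notation above, the Python sketch below computes $\hat{p}$, $\hat{p}_1$ and $\hat{p}_2$ as sample proportions ($x/n$, $x_1/n_1$ and $x_2/n_2$). It also computes one common large-sample z statistic for each test. These z statistics, including the pooled proportion used in the two-sample case, are an assumption on my part and may not match the exact form used in Topic 9. All counts are hypothetical.

```python
import math

# One-sample case: x "successes" out of n observations, null value p_0.
x, n, p_0 = 60, 100, 0.5
p_hat = x / n                                          # estimate of p
z_one = (p_hat - p_0) / math.sqrt(p_0 * (1 - p_0) / n)

# Two-sample case: counts and sample sizes for Groups 1 and 2.
x1, n1 = 45, 100
x2, n2 = 30, 100
p1_hat = x1 / n1                                       # estimate of p_1
p2_hat = x2 / n2                                       # estimate of p_2

# Pooled proportion under H0: p_1 = p_2 (assumed form of the test).
p_pooled = (x1 + x2) / (n1 + n2)
se_diff = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z_two = (p1_hat - p2_hat) / se_diff

print(p_hat, round(z_one, 3), p1_hat, p2_hat, round(z_two, 3))
```
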
2.7 Topic 10: Chi-squared tests for categorical data
The following table provides a summary of notation used in Topic 10.
The lecture slides for Topic 10 can be found here.
The readings for Topic 10 can be found here.
Notation | Meaning | Comments |
---|---|---|
∑ | Summation sign | |
$\sum_{i=1}^{n}$ | The sum from $i=1$ to $n$ | See the previous chapter for an example |
H0 | Null hypothesis | |
H1 | Alternative hypothesis | Sometimes denoted Ha |
p | p-value | |
α | Significance level, such that if p<α, we reject H0 | Usually α=0.05, but different values for α can be chosen |
Type I error | The error that occurs when we reject H0 when H0 is true | |
Type II error | The error that occurs when we fail to reject H0 when H0 is false | |
$\chi^2$-distribution | The distribution used for Chi-squared tests | $\chi$ is a Greek letter pronounced 'ky' |
$\chi^2_{df}$ | $\chi^2$-distribution with degrees of freedom equal to $df$. For example, if $df = 5$, then our distribution is $\chi^2_5$ | |
$X^2$ | The random test statistic for a Chi-squared test. $X^2 \sim \chi^2_{df}$ under $H_0$ | |
$\chi^2$ | The observed test statistic for a Chi-squared test | |
Oi | The observed frequency in the ith category in a Chi-squared goodness of fit test | |
Ei | The expected frequency for the ith category in a Chi-squared goodness of fit test | |
k | The number of categories in a Chi-squared goodness of fit test | |
Oij | The observed frequency in the ith row and the jth column in a Chi-squared test of independence | |
Eij | The expected frequency of the ith row and the jth column in a Chi-squared test of independence | |
r | The number of rows in a Chi-squared test of independence | Not to be confused with the sample correlation r in the context of correlation |
c | The number of columns in a Chi-squared test of independence | |
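
The sketch below shows, in Python, how the observed and expected frequencies in the table above combine into an observed $\chi^2$ statistic. It assumes the standard Pearson form, $\sum (O - E)^2 / E$, with $df = k - 1$ for the goodness of fit test and $df = (r-1)(c-1)$ for the test of independence, where $E_{ij}$ is the row total times the column total divided by the overall total. The counts are hypothetical.

```python
# Goodness of fit: observed O_i and expected E_i over k categories.
observed = [18, 22, 20, 40]
expected = [25.0, 25.0, 25.0, 25.0]
k = len(observed)
chi2_gof = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df_gof = k - 1

# Test of independence: observed O_ij in a table with r rows and c columns.
table = [
    [10, 20],
    [30, 40],
]
r = len(table)
c = len(table[0])
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

# Expected frequency E_ij = (row i total) * (column j total) / overall total.
E = [[row_totals[i] * col_totals[j] / total for j in range(c)] for i in range(r)]

chi2_ind = sum(
    (table[i][j] - E[i][j]) ** 2 / E[i][j] for i in range(r) for j in range(c)
)
df_ind = (r - 1) * (c - 1)

print(round(chi2_gof, 2), df_gof, E, round(chi2_ind, 4), df_ind)
```
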
2.8 Topic 11: Statistical Power and Sample Size Calculation
The following table provides a summary of notation used in Topic 11.
The lecture slides for Topic 11 can be found here.
The readings for Topic 11 can be found here.
Notation | Meaning | Comments |
---|---|---|
H0 | Null hypothesis | |
H1 | Alternative hypothesis | Sometimes denoted Ha |
p | p-value | |
α | Significance level, such that if p<α, we reject H0 | Usually α=0.05, but different values for α can be chosen |
Type I error | The error that occurs when we reject H0 when H0 is true | |
Type II error | The error that occurs when we fail to reject H0 when H0 is false | |
α | The probability of a Type I Error | This is also the significance level |
β | The probability of a Type II Error | Not to be confused with β0 or β1 in the context of simple linear regression |
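
To show how $\alpha$ and $\beta$ relate to statistical power, here is a minimal Python sketch. It assumes a one-sided z-test of $H_0: \mu = \mu_0$ against $H_1: \mu > \mu_0$ with a known population standard deviation, and uses the standard identity that power equals $1 - \beta$. The z-test setting and all numerical values are assumptions chosen so that the calculation needs only the standard library. They are not taken from the subject materials.

```python
import math
from statistics import NormalDist

Z = NormalDist()                 # Standard Normal distribution

alpha = 0.05                     # probability of a Type I error
mu_0 = 100.0                     # mean under H0 (hypothetical)
mu_1 = 105.0                     # assumed true mean under H1 (hypothetical)
sigma = 15.0                     # assumed known population standard deviation
n = 36                           # sample size

# Reject H0 when the z statistic exceeds this critical value.
z_crit = Z.inv_cdf(1 - alpha)

# How far the true mean is from mu_0, in standard-error units.
shift = (mu_1 - mu_0) / (sigma / math.sqrt(n))     # 2.0

# beta: probability of failing to reject H0 when mu = mu_1 (Type II error).
beta = Z.cdf(z_crit - shift)
power = 1 - beta

print(round(z_crit, 3), round(beta, 3), round(power, 3))
```

Increasing `n` in the sketch increases `shift`, which lowers `beta` and raises `power`. This is the basic idea behind a sample size calculation.
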
2.9 Complete table of all notation used throughout STM1001
The following table provides a summary of all notation used in STM1001. All notation summarised in the previous sections of this chapter is included in the consolidated table below.
Notation | Meaning | Comments |
---|---|---|
n | Sample size | |
$x_i$ | The $i$th $x$ value, usually from a list of $n$ $x$ values, e.g. $(x_1, x_2, \ldots, x_n)$, where $i$ can take any value from 1 to $n$ | |
∑ | Summation sign | |
$\sum_{i=1}^{n}$ | The sum from $i=1$ to $n$ | See the previous chapter for an example |
μ | Population mean | This is a Greek letter pronounced 'm-yoo' |
$\bar{x}$ | Sample mean | The line on top of the $x$ is referred to as a 'bar', so that $\bar{x}$ is pronounced 'x bar' |
$\sigma^2$ | Population variance | $\sigma$ is a Greek letter pronounced 'sigma' |
$s^2$ | Sample variance | |
σ | Population standard deviation | σ is a Greek letter pronounced 'sigma' |
s | Sample standard deviation | |
Q1 | Quartile 1: 25% quantile | |
Q2 | Quartile 2: 50% quantile, and also the median | |
Q3 | Quartile 3: 75% quantile | |
ρ | Population correlation | This is a Greek letter pronounced 'rho' |
r | Sample correlation | |
$\lvert x \rvert$ | The 'absolute value' of some number $x$ | See the previous chapter for an example |
Ω | Sample space | This is a Greek letter pronounced 'omega' |
∅ | Null event (or null set) | Also sometimes referred to as 'empty set' |
P(A) | The probability of event A | |
$A^C$ | The complement of $A$, or "not $A$" | Sometimes denoted as $A'$ |
X | A random variable (discrete or continuous) | |
P(X=x) | The probability that the random variable X takes the value x. For example, let X denote the number of times you check this document. Then P(X=2) denotes the probability that you check this document exactly 2 times. | |
E(X) | Expected value (or mean) of X | |
Var(X) | Variance of X | |
SD(X) | Standard deviation of X | |
$\bar{X}$ | Sample mean (random) | |
$X \sim N(\mu, \sigma^2)$ | $X$ follows a Normal distribution with mean $\mu$ and variance $\sigma^2$ | To define a Normal distribution, we need to know the mean and variance |
$Z \sim N(0, 1)$ | $Z$ follows the "Standard Normal distribution", that is, a Normal distribution with $\mu = 0$ and $\sigma^2 = 1$. The usual convention is to use $Z$ instead of $X$ when using the standard Normal distribution. | The variance of $Z$ is $\sigma^2 = 1^2 = 1$. The standard deviation is also 1, since $\sqrt{\sigma^2} = \sqrt{1^2} = 1$ |
$z$ | z-score, defined as $z = \frac{x - \mu}{\sigma}$ | For a given value of $x$, the corresponding z-score can be thought of as its "standardised" value |
± | Plus or minus. For example, $x \pm a = (x-a, x+a)$ | |
t-distribution | The distribution used for t-tests | To define a t-distribution, we need to know the degrees of freedom |
df | Degrees of freedom | |
$T \sim t_{df}$ | $T$ follows a t-distribution with degrees of freedom equal to $df$. For example, if $df = 1$, then $T \sim t_1$ | |
H0 | Null hypothesis | |
H1 | Alternative hypothesis | Sometimes denoted Ha |
μ0 | The population mean under the null hypothesis | |
T | Random test statistic | |
S | Estimator of the standard deviation of X | |
SE | Estimator of the standard error, i.e. standard deviation of the sample mean | Equal to $S/\sqrt{n}$ |
$t$ | Observed test statistic | Sometimes called the 't value' |
$\bar{x}$ | Observed sample mean | |
$s$ | Observed standard deviation | |
se | Observed standard error, i.e. observed standard deviation of the sample mean | Equal to $s/\sqrt{n}$ |
p | p-value | |
α | Significance level, such that if p<α, we reject H0 | Usually α=0.05, but different values for α can be chosen |
Type I error | The error that occurs when we reject H0 when H0 is true | |
Type II error | The error that occurs when we fail to reject H0 when H0 is false | |
$t_{df,\,1-\alpha/2}$ | The value from the $t_{df}$ distribution such that $P(T \le t_{df,\,1-\alpha/2}) = 1-\alpha/2$, i.e. the $(1-\alpha/2)$th quantile. For example, if $\alpha = 0.05$, then $1-\alpha/2 = 1-0.05/2 = 1-0.025 = 0.975$. We then have that $P(T \le t_{df,\,0.975}) = 0.975$ | |
d | Effect size, Cohen's d | We use Cohen's d for t-tests |
ANOVA | ANalysis Of VAriance | |
F-distribution | The distribution used for ANOVA Hypothesis tests | To define the F-distribution, we need to know d1 and d2 |
d1 | Degrees of freedom 1 | |
d2 | Degrees of freedom 2 | |
N | Total sample size (in the context of ANOVA) | |
k | Number of groups (in the context of ANOVA) | |
$F_{d_1,d_2}$ | F-distribution with degrees of freedom 1 equal to $d_1$ and degrees of freedom 2 equal to $d_2$. For example, if $d_1 = 3$ and $d_2 = 45$, then our distribution is $F_{3,45}$ | |
$\eta^2$ | Effect size, 'eta squared' | $\eta$ is a Greek letter pronounced 'eta'. We use $\eta^2$ for One-way ANOVA tests |
x | In the context of simple linear regression, x is the explanatory variable, also referred to as the independent variable or predictor variable | |
y | In the context of simple linear regression, y is the response variable, also referred to as the dependent variable | |
β0 | Intercept coefficient in the simple linear regression model | β is a Greek letter pronounced 'beta'. β0 is pronounced 'beta nought' |
β1 | Slope coefficient in the simple linear regression model | β is a Greek letter pronounced 'beta'. β1 is pronounced 'beta 1' |
ϵ | Random error term in the simple linear regression model | |
$\hat{y}$ | The estimated value for $y$ based on a simple linear regression model | The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{y}$ is pronounced 'y-hat' |
$\hat{\beta}_0$ | The estimated value for $\beta_0$ | The "^" symbol is referred to as a 'hat' and is normally used to denote an estimate, so that $\hat{\beta}_0$ is pronounced '$\beta_0$-hat' |
$\hat{\beta}_1$ | The estimated value for $\beta_1$ | $\hat{\beta}_1$ is pronounced '$\beta_1$-hat' |
$R^2$ | Coefficient of Determination. This value can be used to evaluate the fit of a simple linear regression model and is also the correlation squared, i.e. $R^2 = r^2$. | |
p | In the context of a one-sample test of proportions, p is the proportion of a population with a certain characteristic | Not to be confused with the p-value in the context of hypothesis testing |
n | In the context of a one-sample test of proportions, n is either the number of observations in a random sample, or the number of independent trials | |
x | In the context of a one-sample test of proportions, x is either the number of observations in the sample that have a certain characteristic, or the number of successes in n trials | |
p0 | The population proportion under the null hypothesis | |
$\hat{p}$ | The estimate of $p$ | |
p1 | In the context of a two-sample test of proportions, p1 is the proportion of Population 1 (or Group 1) with a certain characteristic | |
p2 | In the context of a two-sample test of proportions, p2 is the proportion of Population 2 (or Group 2) with a certain characteristic | |
n1 | In the context of a two-sample test of proportions, n1 is the sample size from Population 1 (or Group 1) | |
n2 | In the context of a two-sample test of proportions, n2 is the sample size from Population 2 (or Group 2) | |
x1 | In the context of a two-sample test of proportions, x1 is the number of individuals in the sample from Population 1 (or Group 1) exhibiting the trait of interest | |
x2 | In the context of a two-sample test of proportions, x2 is the number of individuals in the sample from Population 2 (or Group 2) exhibiting the trait of interest | |
$\hat{p}_1$ | The estimate of $p_1$ | |
$\hat{p}_2$ | The estimate of $p_2$ | |
$\chi^2$-distribution | The distribution used for Chi-squared tests | $\chi$ is a Greek letter pronounced 'ky' |
$\chi^2_{df}$ | $\chi^2$-distribution with degrees of freedom equal to $df$. For example, if $df = 5$, then our distribution is $\chi^2_5$ | |
$X^2$ | The random test statistic for a Chi-squared test. $X^2 \sim \chi^2_{df}$ under $H_0$ | |
$\chi^2$ | The observed test statistic for a Chi-squared test | |
Oi | The observed frequency in the ith category in a Chi-squared goodness of fit test | |
Ei | The expected frequency for the ith category in a Chi-squared goodness of fit test | |
k | The number of categories in a Chi-squared goodness of fit test | |
Oij | The observed frequency in the ith row and the jth column in a Chi-squared test of independence | |
Eij | The expected frequency of the ith row and the jth column in a Chi-squared test of independence | |
r | The number of rows in a Chi-squared test of independence | Not to be confused with the sample correlation r in the context of correlation |
c | The number of columns in a Chi-squared test of independence | |
α | The probability of a Type I Error | This is also the significance level |
β | The probability of a Type II Error | Not to be confused with β0 or β1 in the context of simple linear regression |