Statistics 1
Preface
Statistics as the science of data
Branches of statistics
1 Statistical data
1.1 Observational studies and experiments
1.2 Population and sample
1.3 Sources of statistical data
1.4 Quantitative and qualitative variables
1.5 Measurement scales
1.5.1 Four main scales of measurement
1.5.2 Why measurement scales matter
1.5.3 Other scales
1.6 Other data classifications
1.7 Numbers
1.7.1 Names: short and long scale
1.7.2 Decimal symbol and thousands separator
1.7.3 Scientific/engineering notation
1.7.4 Percentages
1.8 Links
1.9 Questions
1.9.1 Discussion questions
1.9.2 Test questions
One variable distribution
2 Empirical distribution
2.1 Presenting empirical distribution in tables
2.1.1 Individual series (raw data)
2.1.2 Discrete frequency distribution table
2.1.3 Grouped (interval) frequency distribution table
2.2 Visualization of qualitative variable distributions
2.2.1 Bar charts
2.2.2 Stacked bar charts
2.2.3 Pie charts
2.3 Histogram – visualisation of quantitative variable distributions
2.3.1 What is on the Y-axis?
2.3.2 Shapes of histograms
2.3.3 Number of Class Intervals
2.3.4 Kernel density estimator
2.3.5 Violin plot
2.4 Empirical cumulative distribution function
2.5 Links
2.6 Exercises
3 Central tendency and positional measures
3.1 Mean
3.1.1 Arithmetic mean
3.1.2 Weighted arithmetic mean
3.1.3 Harmonic mean
3.1.4 Geometric mean
3.2 Median
3.2.1 Approximating the median from a grouped frequency distribution
3.3 Mode
3.3.1 Determining the mode from an interval distribution series
3.3.2 Relationship between mode, median, and mean
3.4 Quantiles
3.4.1 Quartiles
3.4.2 Two meanings of the word quartile
3.4.3 Quintiles
3.4.4 Deciles
3.4.5 Percentiles
3.4.6 Determining Quantiles in Practice
3.5 Links
3.6 Discussion questions
3.7 Exercises
4 Measures of dispersion
4.1 Standard deviation
4.1.1 Variance
4.1.2 Coefficient of variation
4.1.3 Using the standard deviation
4.1.4 The standard deviation is not the mean deviation
4.2 Interquartile range
4.2.1 Interquartile deviation and positional coefficient of variation
4.2.2 Decile range
4.2.3 Range
4.3 Boxplot
4.4 Links
4.5 Discussion questions
4.6 Test questions
4.7 Exercises
5 Standardization and the normal distribution
5.1 Data standardization (z-score)
5.2 Normal distribution
5.3 Empirical rule
5.4 Chebyshev’s Inequality
5.5 Test questions
6 Distribution shape measures
6.1 Skewness
6.1.1 Fisher’s moment coefficient of skewness
6.1.2 Other measures of skewness
6.2 Kurtosis
6.3 Outliers
6.3.1 Identifying outliers using position measures
6.3.2 Identifying outliers using z-scores
6.4 Links
7 The Gini index
7.1 Graphical interpretation
7.2 Formula
Analysis of association
8 Correlation
8.1 Scatter plot
8.2 Pearson correlation coefficient
8.2.1 Correlation coefficient – formula
8.2.2 Pearson correlation coefficient – properties
8.2.3 What counts as a strong correlation?
8.2.4 Covariance
8.3 Association and causation
8.4 Spearman rank correlation coefficient
8.4.1 Converting values to ranks
8.4.2 Properties of the Spearman correlation coefficient
8.5 Kendall’s Tau (\(\tau\))
8.6 Correlation test
8.7 Links
8.8 Questions
8.8.1 Discussion questions
8.8.2 Test questions
8.9 Exercises
9 Simple regression
9.1 Standard deviation line on a scatter plot
9.2 Conditional averages on a scatter plot
9.3 Simple linear regression
9.3.1 Variable names
9.3.2 Regression line on a scatter plot
9.3.3 Formula
9.3.4 Residuals and least squares
9.3.5 R-squared
9.4 Using and interpreting regression
9.4.1 Purpose of regression models
9.4.2 Interpretation of the fitted model
9.4.3 Prediction
9.5 Regression to the mean
9.6 Regression vs correlation
9.7 Variable transformation
9.7.1 Prediction Using a Log–Log Linear Model
9.8 Limitations of linear regression
9.9 Links
9.10 Questions
9.10.1 Discussion questions
9.10.2 Test questions
10 Multiple regression
10.1 Formula
10.2 Interpretation
10.3 Binary variables in linear regression
10.4 Matrix notation
10.5 Links
11 Association in categorical data
11.1 Contingency table
11.2 Cramér’s V
11.3 Chi-squared independence test
11.4 Correlation ratio and eta-squared
11.5 Links
11.6 Exercises
12 Association in binary data
12.1 2x2 tables
12.1.1 Phi coefficient
12.1.2 Odds ratio
12.1.3 Confusion matrix
12.2 Binary and quantitative variables
12.2.1 Point-biserial correlation
12.2.2 Cohen’s d
12.3 Binary and ordinal variables
12.3.1 AUC
12.3.2 Somers’ D
12.4 Links
12.5 Exercises
Measuring change over time
13 Time series and index numbers
13.1 Time series
13.1.1 Flow and stock
13.2 Absolute and relative change
13.3 Index numbers
13.3.1 Chain index
13.3.2 Fixed-base index
13.4 Average Rate of Change
13.5 Aggregate Price and Quantity Indices
13.5.1 Laspeyres Index
13.5.2 Paasche Index
13.5.3 Fisher Index
13.6 Exercises
14 Trend and seasonal fluctuations
14.1 Components of a time series
14.2 Moving average
14.3 Linear trend model
14.4 Additive and multiplicative seasonal fluctuations
Literature
Chapter 14 Trend and seasonal fluctuations
14.1 Components of a time series
14.2 Moving average
14.3 Linear trend model
14.4 Additive and multiplicative seasonal fluctuations