Chapter 8 Motivation for Statistical Inference

8.1 Introduction

In Section 1, we introduced the statistical paradigm and the concepts of a population and a sample (see Section 1.2). We have since seen a range of statistics derived from a sample, in particular those measuring location (mean, median, mode) and those measuring spread (variance, interquartile range, range). In the preceding sections we introduced probability as a means of measuring and encapsulating uncertainty. We now begin to combine these ideas to link sample statistics to population parameters.

8.2 Motivating example

From a sample of \(52\) university students, four individuals were found to be left-handed. We can easily summarise the sample information as the proportion \(\frac{4}{52} = \frac{1}{13}\). However:

  • What are we able to say about the population?
  • What proportion of all university students are left-handed?
  • Is \(\frac{1}{13}\) a good estimate and what do we mean by ‘good’?

This example exhibits the important features of statistical inference. Here, the population is all university students. The population has some parameter, or characteristic, \(\theta\), which we wish to estimate. In this example, \(\theta\) is the probability of an individual being left-handed.

From the population we take a random sample, which means that each member of the population has an equal chance of being chosen. The sample gives rise to data \(x_1, x_2, \ldots, x_n\). We estimate the parameter \(\theta\) by means of a statistic \(T(x_1, x_2, \ldots, x_n)\), as the sketch below illustrates.
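
To make this concrete, here is a minimal Python sketch of the left-handedness example, in which the statistic \(T\) is taken to be the sample proportion; the 0/1 encoding of the data is an illustrative choice, not notation from the text:

```python
# Hypothetical encoding of the sample of n = 52 students:
# 1 = left-handed, 0 = not. Four observations equal 1, as in the example.
data = [1] * 4 + [0] * 48

def T(sample):
    """The statistic T(x_1, ..., x_n): here, the sample proportion."""
    return sum(sample) / len(sample)

theta_hat = T(data)
print(theta_hat)  # 0.0769... (= 4/52 = 1/13)
```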

8.3 Modelling assumptions

  1. Identically distributed assumption: Every sample observation (data point) \(x\) is the outcome of a random variable \(X\) whose distribution (discrete or continuous) is the same for every member of the population.

  2. Independence assumption: The random variables \(X_1, X_2, \ldots, X_n\) which give rise to the data points \(x_1, x_2, \ldots, x_n\) are independent.

Note that we defined a random sample to be a set of i.i.d. random variables; see Section 6.4 for further details on independent and identically distributed random variables.

The subtle point here is that we are treating the observed data as just one possible outcome among the many different outcomes that could have occurred.
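
A short simulation makes this point concrete. Assuming, purely for illustration, a true left-handedness probability of \(\theta = 0.10\), each replicate below is one data set that could have occurred, and the statistic varies accordingly:

```python
import random

random.seed(1)
theta, n = 0.10, 52   # assumed 'true' parameter and sample size, for illustration

def draw_sample(n, theta):
    """Draw n i.i.d. Bernoulli(theta) observations, one per sampled student."""
    return [1 if random.random() < theta else 0 for _ in range(n)]

# Five possible data sets; our observed sample is just one such outcome.
proportions = [sum(draw_sample(n, theta)) / n for _ in range(5)]
print([round(p, 3) for p in proportions])  # five different sample proportions
```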

8.4 Parametric models

In the parametric approach to statistical inference, we assume that the random sample we collect was generated by some specific probability distribution which is completely known, except for a small number of parameters. For example:

  • we could assume that the annual income in the U.K. is normally distributed but we don’t know its mean, \(\mu\), or its variance, \(\sigma^2\);
  • in studying the effectiveness of a certain drug's ability to decrease the size of tumours in laboratory rats, we could assume that the number of rats whose tumours decrease in size has a Binomial distribution, where \(n\) is the known sample size and \(p\) is the unknown probability of a successful treatment of one tumour (see the sketch after this list).
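
As a sketch of what "completely known except for a parameter" means, the code below (with a hypothetical count of 13 successes out of \(n = 20\) rats) evaluates the Binomial probability of the observed count at several candidate values of \(p\); once \(p\) is fixed, the model assigns a definite probability to every outcome:

```python
from math import comb

n, k = 20, 13   # hypothetical data: 13 of 20 tumours decreased in size

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): fully determined once p is known."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of the observed count under several candidate values of p.
for p in (0.3, 0.5, 0.65, 0.8):
    print(p, round(binom_pmf(k, n, p), 4))
# Of these candidates, 0.65 = 13/20 gives the observed count highest probability.
```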

There are a number of approaches to determining the underlying model that should be used:

  • Physical argument, e.g. counts of events from a Poisson process follow a Poisson distribution;
  • Mathematical argument, e.g. the central limit theorem leading to a normal distribution (see the sketch after this list);
  • A flexible model, chosen fairly arbitrarily, which covers a wide range of possibilities.
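
To illustrate the mathematical-argument route, the following sketch (the exponential source distribution, the sample size of 50, and the 10,000 replicates are all arbitrary choices) shows that means of samples from a skewed distribution behave approximately as the central limit theorem predicts:

```python
import random
import statistics

random.seed(2)

# Means of n = 50 draws from Exponential(1), a skewed, non-normal distribution.
# The CLT suggests these means are approximately Normal(1, 1/50).
means = [statistics.mean(random.expovariate(1.0) for _ in range(50))
         for _ in range(10_000)]

print(round(statistics.mean(means), 3))   # close to 1, the population mean
print(round(statistics.stdev(means), 3))  # close to (1/50) ** 0.5 ≈ 0.141
```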