## 7.6 Prediction: Mean

• Model = Mathematical equation(s)
• Underlying a model is always a (joint) distribution
• Model summarizes (joint) distribution with fewer parameters
• e.g. intercept/coefficients in linear model
• Simple model: Mean of the distribution of a variable
$$\bar{y} = \frac{y_{1}+y_{2}+\cdots +y_{n}}{n}=\frac{\sum_{i=1}^{n} trust2006_{i}}{n} = \frac{40668}{6633} = 6.13$$

$$y_{i} = \underbrace{\color{blue}{\overline{y}}}_{\color{green}{\widehat{y}}_{i}} \pm \color{red}{\varepsilon}_{i}$$

$$\begin{aligned} trust2006_{Anna} = 3 = \underbrace{\color{blue}{\overline{y}}}_{\color{green}{\widehat{y}}_{Anna}} \pm \color{red}{\varepsilon}_{Anna} = \color{blue}{6.13} - \color{red}{3.13} \end{aligned}$$
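
The arithmetic above can be checked with a short Python sketch. It uses only the totals given in the text (sum 40668, n = 6633) and Anna's value of 3. The variable names are illustrative assumptions, and the error is written as observed minus predicted, which is one common sign convention.

```python
# Totals taken from the text: sum of trust2006 and number of respondents.
total_trust = 40668
n = 6633

# The "model" has a single parameter: the mean.
y_bar = total_trust / n

# Anna's observed value (from the text) and her prediction error,
# defined here as observed minus predicted (an assumed sign convention).
y_anna = 3
error_anna = y_anna - y_bar

print(round(y_bar, 2))       # 6.13
print(round(error_anna, 2))  # -3.13
```

The negative sign says that the mean overpredicts Anna's trust value by about 3.13 points.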

• Mean (= model) predicts Anna’s value with a certain error
• Q: How well does the model (mean = 6.13) predict persons who have values of 0, 3 or 4?
• Important: We could use this model – this mean – to predict trust values of another group of people
• First train model (= calculate mean) on this data (training dataset), then use it to predict outcome in other data (validation dataset)
• Sometimes this is called out-of-sample prediction
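
A minimal sketch of the train-then-predict idea, assuming two small made-up lists of trust scores (these are hypothetical values, not the survey data used above). The mean is estimated on the training data only and then used as the prediction for every person in the validation data. Mean absolute error is one possible way to summarize how far off the predictions are.

```python
from statistics import mean

# Hypothetical trust scores, invented for illustration only.
train = [3, 7, 6, 8, 5, 7]
validation = [0, 3, 4, 9]

# "Training" the mean model: estimate its single parameter on the training data.
y_hat = mean(train)

# The model predicts the same value (the training mean) for every new person.
errors = [y - y_hat for y in validation]

# Summarize out-of-sample accuracy with the mean absolute error.
mae = mean([abs(e) for e in errors])

print(y_hat)   # 6
print(errors)  # [-6, -3, -2, 3]
print(mae)     # 3.5
```

Note that the validation values never enter the calculation of `y_hat`. That separation is what makes the resulting errors out-of-sample.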