3.2 Invariant estimators

Let us introduce the concept of invariance with an example.

Example 3.7 The manufacturer of a given product claims that the product packages contain at least $\theta$ grams of product. If this claim is true, then the product content within a package is distributed as $\mathcal{U}(\theta, \theta+100).$ To check the manufacturer's claim, a srs of package contents was measured. The realization of this srs is $(x_1,\ldots,x_n)$ and is used to compute an estimate $\hat{\theta}(x_1,\ldots,x_n).$ After computing the estimate, however, it is discovered that the balance was systematically reading $c$ grams too low. Can we simply correct the estimate to $\hat{\theta}(x_1,\ldots,x_n)+c$?

The answer depends on whether the estimator satisfies

$\begin{align*} \hat{\theta}(x_1+c,\ldots,x_n+c)=\hat{\theta}(x_1,\ldots,x_n)+c. \end{align*}$

If this is not the case, then we will need to compute $\hat{\theta}(x_1+c,\ldots,x_n+c)$ without being able to reuse $\hat{\theta}(x_1,\ldots,x_n).$

Definition 3.3 (Translation-invariant estimator) An estimator $\hat{\theta}$ is translation-invariant if, for any sample realization $(x_1,\ldots,x_n)$ and any $c\in\mathbb{R},$

$\begin{align*} \hat{\theta}(x_1+c,\ldots,x_n+c)=\hat{\theta}(x_1,\ldots,x_n)+c. \end{align*}$
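The identity in Definition 3.3 can be probed numerically. The following Python sketch is only an illustration: the helper name `is_translation_invariant`, the tolerance, and the package weights are assumptions introduced here, not part of the text. A passing check on one sample does not prove invariance, whereas a failing check is a valid counterexample.

```python
import math

def is_translation_invariant(estimator, sample, c, tol=1e-9):
    """Check estimator(x + c) == estimator(x) + c for ONE sample and ONE shift c.

    Hypothetical helper: a True result is evidence, not a proof.
    """
    shifted = [x + c for x in sample]
    return math.isclose(estimator(shifted), estimator(sample) + c, abs_tol=tol)

def sample_mean(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

# Hypothetical package weights (grams) and a hypothetical balance bias of 5 grams.
weights = [231.4, 250.0, 212.7, 298.1]
bias = 5.0

print(is_translation_invariant(min, weights, bias))
print(is_translation_invariant(sample_mean, weights, bias))
```

If both checks pass, the estimates computed from the biased readings could plausibly be corrected by adding the bias afterwards; the algebraic verification is still what settles the matter.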

Example 3.8 Check that $X_{(1)},$ $\bar{X},$ and $(X_{(1)}+X_{(n)})/2$ are statistics invariant to translations, but that the geometric mean $(\prod_{i=1}^n X_i)^{1/n}$ and the harmonic mean $n/\sum_{i=1}^n X_i^{-1}$ are not.

For $X_{(1)}=\min_{1\leq i\leq n} X_i,$ we have

$\begin{align*} \min_{1\leq i\leq n} (X_i+c)=\min_{1\leq i\leq n} (X_i)+c. \end{align*}$

Therefore, $X_{(1)}$ is translation-invariant. For $\bar{X}=(1/n)\sum_{i=1}^n X_i,$

$\begin{align*} \frac{1}{n}\sum_{i=1}^n (X_i+c)=\frac{1}{n}\left[\sum_{i=1}^n X_i+nc\right]=\frac{1}{n}\sum_{i=1}^n X_i+c=\bar{X}+c, \end{align*}$

so $\bar{X}$ is translation-invariant too. We now check $(X_{(1)}+X_{(n)})/2$:

$\begin{align*} \frac{1}{2}\left[\min_{1\leq i\leq n}(X_i+c)+\max_{1\leq i\leq n}(X_i+c)\right]&=\frac{1}{2}\left[\min_{1\leq i\leq n}X_i+\max_{1\leq i\leq n}X_i+2c\right]\\ &=\frac{1}{2}\left(X_{(1)}+X_{(n)}\right)+c. \end{align*}$

To see that neither the geometric nor the harmonic mean is invariant to translations, it suffices to find a counterexample for each. Consider the sample realization $(x_1,x_2,x_3)=(1,2,3).$ For these data, the geometric and harmonic means are, respectively,

$\begin{align*} \left[\prod_{i=1}^n x_i\right]^{1/n}&=(1\times 2\times 3)^{1/3}=6^{1/3}\approx 1.82,\\ \frac{n}{\sum_{i=1}^n x_i^{-1}}&=\frac{3}{1+\frac{1}{2}+\frac{1}{3}}=\frac{18}{11}\approx 1.64. \end{align*}$

However, if we take $c=1$:

$\begin{align*} \left[\prod_{i=1}^n (x_i+c)\right]^{1/n}&=(2\times 3\times 4)^{1/3}=24^{1/3}\approx 2.88\neq 1.82+1,\\ \frac{n}{\sum_{i=1}^n (x_i+c)^{-1}}&=\frac{3}{\frac{1}{2}+\frac{1}{3}+\frac{1}{4}}=\frac{36}{13}\approx 2.77\neq 1.64+1, \end{align*}$

and we see that neither of these statistics is translation-invariant.
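The arithmetic of the two counterexamples can be reproduced with a short Python sketch. The function names `geometric_mean` and `harmonic_mean` are chosen here for illustration; the data and the shift $c=1$ are those used in Example 3.8.

```python
import math

def geometric_mean(xs):
    """Geometric mean: n-th root of the product of the observations."""
    return math.prod(xs) ** (1 / len(xs))

def harmonic_mean(xs):
    """Harmonic mean: n divided by the sum of reciprocals."""
    return len(xs) / sum(1 / x for x in xs)

x = [1, 2, 3]                    # sample realization from Example 3.8
c = 1                            # shift from Example 3.8
shifted = [xi + c for xi in x]

for name, stat in [("geometric", geometric_mean), ("harmonic", harmonic_mean)]:
    before, after = stat(x), stat(shifted)
    # Translation invariance would require after == before + c.
    print(f"{name}: T(x) = {before:.2f}, T(x + c) = {after:.2f}, T(x) + c = {before + c:.2f}")
```

In both cases the value on the shifted sample differs from the original value plus $c,$ in agreement with the computations above.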

Example 3.9 A woman always arrives at the bus stop at the same time. She wishes to estimate the maximum waiting time for the bus, knowing that the waiting time is distributed as $\mathcal{U}(0,\theta).$ For that purpose, she records the waiting times over $n$ days and obtains a realization of a srs, $(x_1,\ldots,x_n),$ measured in seconds. Based on that sample, she obtains an estimate of the maximum waiting time $\hat{\theta}(x_1,\ldots,x_n)$ in seconds. If she wants to convert the result to minutes, can she just compute $\hat{\theta}(x_1,\ldots,x_n)/60$?

The answer depends on whether the estimator satisfies

$\begin{align*} \hat{\theta}(x_1/60,\ldots,x_n/60)=\hat{\theta}(x_1,\ldots,x_n)/60. \end{align*}$

If this is not the case, then she will need to compute $\hat{\theta}(x_1/60,\ldots,x_n/60).$

Definition 3.4 (Scale-invariant estimators) An estimator $\hat{\theta}$ is scale-invariant if, for any sample realization $(x_1,\ldots,x_n)$ and any $c>0,$

$\begin{align*} \hat{\theta}(cx_1,\ldots,cx_n)=c\,\hat{\theta}(x_1,\ldots,x_n). \end{align*}$

Example 3.10 Check that $\bar{X}$ and $X_{(n)}$ are scale-invariant estimators and that $\log((1/n)\sum_{i=1}^n\exp({X_i}))$ and $X_{(n)}/X_{(1)}$ are not.
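A numerical exploration of Example 3.10 can guide the algebra. The Python sketch below is an illustration under stated assumptions: the helper `is_scale_invariant`, the sample $(1,2,3),$ and the factor $c=2$ are choices made here, not taken from the text. A passing check is only evidence for one sample, while a failing check is a genuine counterexample.

```python
import math

def is_scale_invariant(estimator, sample, c, tol=1e-9):
    """Check estimator(c * x) == c * estimator(x) for ONE sample and ONE c > 0.

    Hypothetical helper: a True result is evidence, not a proof.
    """
    scaled = [c * x for x in sample]
    return math.isclose(estimator(scaled), c * estimator(sample), abs_tol=tol)

def sample_mean(xs):
    """Arithmetic mean."""
    return sum(xs) / len(xs)

def log_mean_exp(xs):
    """log of the average of exp(x_i)."""
    return math.log(sum(math.exp(x) for x in xs) / len(xs))

def max_over_min(xs):
    """Ratio of the largest to the smallest observation."""
    return max(xs) / min(xs)

x = [1.0, 2.0, 3.0]   # illustrative sample (assumption)
c = 2.0               # illustrative scale factor (assumption)

for name, stat in [("mean", sample_mean), ("max", max),
                   ("log-mean-exp", log_mean_exp), ("max/min", max_over_min)]:
    print(name, is_scale_invariant(stat, x, c))
```

For this sample, the ratio $X_{(n)}/X_{(1)}$ is unchanged by scaling (it equals $3$ before and after), so it cannot equal $c$ times its original value when $c\neq 1.$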