# 6 Statistics

This note comes from a TP (practical assignment) I wrote as part of my Statistics program. Unfortunately, for legal reasons, I can only publish a small part of it.

```r
times <- c(75, 265, 225, 402, 35, 105, 411, 346, 159, 229,
           62, 256, 431, 177, 56, 144, 354, 178, 386, 294)
```

## 6.1 Determine a plausible probability law for these data.

```r
h1 <- hist(times, breaks = quantile(times, seq(0, 1, 0.1)),
           main = "Histogram")
```

```r
cumfreq1 <- cumsum(h1$counts) / length(times)
plot(h1$breaks, c(0, cumfreq1), type = "l",
     main = "Distribution function",
     xlab = "times", ylab = "cumul")
lines(h1$breaks, punif(h1$breaks, min(times), max(times)), col = "red")
```

```r
ks.test(times, "punif")
##
##  One-sample Kolmogorov-Smirnov test
##
## data:  times
## D = 1, p-value < 2.2e-16
## alternative hypothesis: two-sided
```
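
Note that with no extra arguments, `ks.test(times, "punif")` compares against the standard uniform $U(0,1)$; every observation exceeds 1, so the statistic is trivially $D = 1$ and this result says nothing useful. A sketch of the test against a uniform law fitted to the data (the bounds are estimated from the sample, so the resulting p-value is only approximate):

```r
# Compare against U(0, max(times)) instead of the default U(0, 1).
# Estimating the bounds from the data makes the p-value approximate.
ks.test(times, "punif", min = 0, max = max(times))
```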

The data appear to follow a uniform law on $[0, \theta]$, with density:

$f(x) = \left\{ \begin{array}{rl} \frac{1}{\theta} & \mbox{if } 0 \leq x \leq \theta, \\ 0 & \mbox{otherwise} \end{array} \right.$
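
As a quick visual check (a sketch; `max(times)` is used as a rough stand-in for $\theta$, which is only estimated in the next sections):

```r
# Density-scale histogram with the flat uniform density 1/theta overlaid,
# taking max(times) as a rough value for theta.
hist(times, freq = FALSE, main = "Histogram with uniform density")
abline(h = 1 / max(times), col = "red")
```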

## 6.2 Estimate the parameter of this law by the method of moments.

The expectation of the uniform law is: $\mathbb{E}(X) = \frac{\theta}{2}$ $\Rightarrow \hat{\theta}^{MM} = 2 \bar{X}_n$

```r
theta_mm <- 2 * mean(times); theta_mm
## [1] 459
```

The method-of-moments estimator is therefore twice the empirical mean.

## 6.3 Estimate this parameter by the maximum likelihood method.

The likelihood is given by: $\mathcal{L}(x_1, x_2,...,x_n, \theta) = \prod_{i=1}^n f(x_i)$

$\mathcal{L}(x_1, x_2,...,x_n, \theta) = \frac{1}{\theta^n}$ whenever $\theta \geq \max_i x_i$, and $0$ otherwise.

The function $\theta \rightarrow \mathcal{L}(x_1, x_2,...,x_n, \theta)$ is strictly decreasing on its support, so it reaches its maximum when $\theta$ is as small as possible. Since the likelihood is zero unless $\theta \geq x_i$ for all $i \in \{1,...,n\}$,

the smallest admissible value is $\theta = \max(x_1,...,x_n)$.

So the maximum likelihood estimator of $\theta$ for an $n$-sample from the uniform law is: $\hat{\theta} = \max(X_1,...,X_n)$
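
To visualize this (a sketch; the plotting range is an arbitrary choice), the likelihood of our sample is zero below $\max(x_i)$ and strictly decreasing above it:

```r
# Likelihood of a uniform(0, theta) sample: 0 for theta < max(times),
# theta^(-n) (strictly decreasing) for theta >= max(times).
lik <- function(theta) ifelse(theta >= max(times), theta^(-length(times)), 0)
curve(lik(x), from = 400, to = 700, xlab = "theta", ylab = "likelihood")
abline(v = max(times), lty = 2)  # maximum reached at theta = max(x_i)
```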

To determine the probability density of $\hat{\theta}$, let's first calculate its distribution function. Since the $X_i$ are independent:

$F_{\hat{\theta}}(x) = P[\max(X_1,...,X_n) \leq x] = \prod_{i=1}^n P[X_i \leq x]$

$F_{\hat{\theta}}(x) = \left\{ \begin{array}{rl} 0 & \mbox{if } x \leq 0, \\ (\frac{x}{\theta})^n & \mbox{if } 0 \leq x \leq \theta, \\ 1 & \mbox{if } x \geq \theta \end{array} \right.$

The probability density of $\hat{\theta}$ is obtained by differentiating: $f_{\hat{\theta}}(x) = \frac{d F_{\hat{\theta}}(x)}{dx}$

$f_{\hat{\theta}}(x) = \left\{ \begin{array}{rl} \frac{n x^{n-1}}{\theta^n} & \mbox{if } x \in ]0,\theta[, \\ 0 & \mbox{if } x \notin ]0,\theta[ \end{array} \right.$

The expectation of $\hat{\theta}$ is: $\mathbb{E}(\hat{\theta}) = \int_{0}^\theta x f_{\hat{\theta}}(x) dx$

$\mathbb{E}[\hat{\theta}] = n \int_{0}^\theta \frac{x^n}{\theta^n} dx = \frac{n}{n+1}\theta$

This estimator is biased but asymptotically unbiased: its bias $-\frac{\theta}{n+1}$ vanishes as $n \to \infty$.
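
This bias can be checked quickly by Monte Carlo (a sketch; the true $\theta = 500$, the seed, and the number of replications are arbitrary choices, while $n = 20$ matches our sample):

```r
set.seed(42)       # arbitrary seed, for reproducibility
theta_true <- 500  # hypothetical true parameter
n_obs <- 20        # same sample size as our data
# The average of max(X_1, ..., X_n) over many uniform samples should sit
# near n/(n+1) * theta = (20/21) * 500, i.e. about 476, not 500.
mle_sims <- replicate(10000, max(runif(n_obs, 0, theta_true)))
mean(mle_sims)
```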

```r
n <- length(times)
# Bias correction for the MLE: multiply the sample maximum by (n + 1)/n
theta_ll <- function(teta_n) {
  res <- ((n + 1) / n) * teta_n
  res
}

theta_ll(100)
## [1] 105
max(times)
## [1] 431
2 * mean(times)
## [1] 459
mean(times)
## [1] 229.5
```
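
Applying the correction to the observed maximum gives the bias-corrected estimate (simple arithmetic: $\frac{21}{20} \times 431 = 452.55$):

```r
theta_ll(max(times))  # (21/20) * 431 = 452.55
```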

## 6.4 Compare the quality of these estimators.

The method-of-moments estimator is in fact unbiased: $\mathbb{E}(\hat{\theta}^{MM}) = 2\mathbb{E}(\bar{X}_n) = 2\frac{\theta}{2} = \theta$

The maximum likelihood estimator, on the other hand, is biased: $\mathbb{E}[\hat{\theta}] = \frac{n}{n+1}\theta \neq \theta$

We will therefore use the bias-corrected estimator, which is unbiased: $T_n = \frac{n+1}{n} \hat{\theta}$ $\mathbb{E}(T_n) = \frac{n+1}{n}\mathbb{E}(\hat{\theta}) = \frac{n+1}{n}\frac{n}{n+1}\theta = \theta$

Both $T_n$ and $\hat{\theta}^{MM}$ are unbiased, but $T_n$ has the smaller variance ($\frac{\theta^2}{n(n+2)}$ versus $\frac{\theta^2}{3n}$ for $n > 1$), so it is the better estimator.
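
A sketch of this comparison by simulation (the true $\theta = 500$, the seed, and the number of replications are arbitrary choices; $n = 20$ matches our sample):

```r
set.seed(1)                                       # arbitrary seed
n_sim <- 10000; n_obs <- 20; theta_true <- 500    # hypothetical settings
samples <- matrix(runif(n_sim * n_obs, 0, theta_true), nrow = n_sim)
mm <- 2 * rowMeans(samples)                         # method of moments
tn <- (n_obs + 1) / n_obs * apply(samples, 1, max)  # bias-corrected MLE
# Both means should land near theta = 500; T_n should show the
# smaller variance (theta^2 / (n(n+2)) vs theta^2 / (3n)).
c(mean_mm = mean(mm), mean_tn = mean(tn),
  var_mm = var(mm), var_tn = var(tn))
```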

## 6.5 When, on average, will the next particle arrive?

```r
mean(times)
## [1] 229.5
```
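
Equivalently, under the fitted uniform model the mean waiting time is $\theta/2$; with the bias-corrected estimate from 6.4 this gives a close but slightly different value (simple arithmetic: $452.55 / 2 = 226.275$):

```r
theta_hat <- (length(times) + 1) / length(times) * max(times)  # 452.55, from 6.4
theta_hat / 2  # model-based mean waiting time: 226.275 hours
```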

### 6.5.1 What is the probability that the next particle arrives within the next 100 hours?

```r
punif(100, min = 0, max = max(times))
## [1] 0.2320186
```

### 6.5.2 How many particles can we expect to receive in a year?

```r
# A year is 365 * 24 = 8760 hours; divide by the mean waiting time in hours.
(365 * 24) / mean(times)
## [1] 38.16993
```