Exercises

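Exercise 2.1 (Normal additive property) Prove the additive property of the normal distribution, that is, prove that if $X_i$, $i=1,\ldots,n$, are independent rv’s with respective distributions $\mathcal{N}(\mu_i,\sigma_i^2)$, $i=1,\ldots,n$, then

$$\sum_{i=1}^n X_i\sim\mathcal{N}\left(\sum_{i=1}^n\mu_i,\sum_{i=1}^n\sigma_i^2\right).$$

You can use that $M_{\mathcal{N}(\mu,\sigma^2)}(s)=e^{s\mu+\frac{1}{2}\sigma^2s^2}$.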

Exercise 2.2 (Sampling distribution of the maximum) Let $(X_1,\ldots,X_n)$ be a srs of a rv $X$ with cdf $F_X$. Prove that the sampling distribution of the statistic $T(X_1,\ldots,X_n)=X_{(n)}$ has cdf $[F_X(t)]^n$.
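A quick Monte Carlo sanity check of this claim (not a proof) can be sketched in R. The $\mathrm{Exp}(1)$ population, the sample size, and the evaluation point are arbitrary assumptions chosen only for illustration.

```r
# Monte Carlo check that P(X_(n) <= t) is close to F_X(t)^n.
# Illustrative assumptions: X ~ Exp(1), n = 5, t = 2.
set.seed(42)
M <- 1e4   # number of simulated samples
n <- 5     # sample size
t0 <- 2    # point at which the cdf is evaluated

# Each row is one srs of size n; take the maximum of each row
samples <- matrix(rexp(M * n, rate = 1), nrow = M, ncol = n)
maxima <- apply(samples, 1, max)

emp_cdf <- mean(maxima <= t0)      # empirical cdf of the maximum at t0
theo_cdf <- pexp(t0, rate = 1)^n   # [F_X(t0)]^n
c(empirical = emp_cdf, theoretical = theo_cdf)
```

The two values should agree up to Monte Carlo error, which is of order $1/\sqrt{M}$.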

Exercise 2.3 Consider a srs $(X_1,\ldots,X_n)$ of $\mathcal{U}(-\theta,\theta)$, $\theta>0$. Derive the pdf of

  1. $X_{(1)}$ (use Example 2.6).
  2. $X_{(n)}$ (use Exercise 2.2).

Then, plot the pdf’s for $\theta=2$ and $n=2,5,10,30$. What do you observe?

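Exercise 2.4 Define the distribution $\mathrm{Cauchy}(\mu,\sigma)$ as

$$\mathrm{Cauchy}(\mu,\sigma)\stackrel{d}{=}\mu+\sigma t_1,$$

where $t_1$ is the Student’s $t$ distribution (Definition 2.4) with one degree of freedom, $\mu\in\mathbb{R}$ is the location, and $\sigma\in\mathbb{R}^+$ is the scale (note these parameters are not the mean and standard deviation).

  1. Show that the pdf of $\mathrm{Cauchy}(\mu,\sigma)$ is $f(x;\mu,\sigma)=\frac{1}{\pi\sigma\left[1+\left(\frac{x-\mu}{\sigma}\right)^2\right]}$, $x\in\mathbb{R}$.
  2. Show that the cdf of $\mathrm{Cauchy}(\mu,\sigma)$ is $F(x;\mu,\sigma)=\frac{1}{2}+\frac{1}{\pi}\arctan\left(\frac{x-\mu}{\sigma}\right)$, $x\in\mathbb{R}$.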
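Before attempting the proofs, both formulas can be checked numerically against the location-scale transformation of a $t_1$ variable. The sketch below assumes the arbitrary values $\mu=1$ and $\sigma=2$, and uses only the base R functions dt(), pt(), dcauchy(), and pcauchy().

```r
# Numerical check of the Cauchy(mu, sigma) pdf and cdf formulas.
# mu = 1 and sigma = 2 are arbitrary illustrative values.
mu <- 1
sigma <- 2
x <- seq(-10, 10, by = 0.5)

# Closed-form expressions stated in the exercise
f_formula <- 1 / (pi * sigma * (1 + ((x - mu) / sigma)^2))
F_formula <- 1 / 2 + atan((x - mu) / sigma) / pi

# Same quantities obtained from t_1 through the change of variable
# x = mu + sigma * t, which rescales the density by 1 / sigma
f_t1 <- dt((x - mu) / sigma, df = 1) / sigma
F_t1 <- pt((x - mu) / sigma, df = 1)

err_pdf <- max(abs(f_formula - f_t1))
err_cdf <- max(abs(F_formula - F_t1))
c(pdf = err_pdf, cdf = err_cdf)

# R's built-in Cauchy functions use the same location-scale parametrization
err_builtin <- max(abs(f_formula - dcauchy(x, location = mu, scale = sigma)),
                   abs(F_formula - pcauchy(x, location = mu, scale = sigma)))
```

All reported discrepancies should be at the level of floating-point rounding.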

Exercise 2.5 Consider a srs $(X_1,\ldots,X_n)$ of $\mathrm{Cauchy}(\mu,\sigma)$. Derive the pdf of

  1. $X_{(1)}$ (use Example 2.6).
  2. $X_{(n)}$ (use Exercise 2.2).

Then, plot the pdf’s for $\mu=0$, $\sigma=1,2$, and $n=2,5,10,30$. What do you observe?
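One possible way to organize the plotting part in R is sketched below. It assumes the standard order-statistic results that the maximum has cdf $F(x)^n$ and the minimum has cdf $1-(1-F(x))^n$, so the pdf's follow by differentiation. The function names d_max_cauchy() and d_min_cauchy() are hypothetical helpers, not part of any package.

```r
# pdf of the maximum of a srs of size n: derivative of F(x)^n
d_max_cauchy <- function(x, n, mu = 0, sigma = 1) {
  n * pcauchy(x, location = mu, scale = sigma)^(n - 1) *
    dcauchy(x, location = mu, scale = sigma)
}

# pdf of the minimum of a srs of size n: derivative of 1 - (1 - F(x))^n
d_min_cauchy <- function(x, n, mu = 0, sigma = 1) {
  n * pcauchy(x, location = mu, scale = sigma, lower.tail = FALSE)^(n - 1) *
    dcauchy(x, location = mu, scale = sigma)
}

# Sanity check: a pdf must integrate to (approximately) one
total_mass <- integrate(d_max_cauchy, lower = -Inf, upper = Inf, n = 5)$value
total_mass

# Plot the pdf's of the maximum for mu = 0, sigma = 1 (interactive use only)
if (interactive()) {
  x <- seq(-20, 60, length.out = 500)
  ns <- c(2, 5, 10, 30)
  plot(x, d_max_cauchy(x, n = ns[1]), type = "l", ylab = "pdf of the maximum")
  for (k in 2:4) lines(x, d_max_cauchy(x, n = ns[k]), col = k)
  legend("topright", legend = paste("n =", ns), col = 1:4, lty = 1)
}
```

Replacing d_max_cauchy() with d_min_cauchy() and mirroring the x range gives the analogous plot for the minimum; changing sigma to 2 covers the remaining case.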

Exercise 2.6 Consider a srs $(X_1,\ldots,X_n)$ of $\mathrm{Pois}(\lambda)$. Compute the cdf and the pmf of

  1. $X_{(1)}$ (use Example 2.6).
  2. $X_{(n)}$ (use Exercise 2.2).

Then, plot the pmf’s for $\lambda=1$ and $n=2,5,10,30$. What do you observe?
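For a discrete rv the pmf of an order statistic can be obtained as the difference of its cdf at consecutive integers. The following R sketch takes that route, assuming the standard cdf's $F(k)^n$ for the maximum and $1-(1-F(k))^n$ for the minimum. The helper names are hypothetical.

```r
# cdf's of the maximum and the minimum of a srs of size n from Pois(lambda)
p_max_pois <- function(k, n, lambda) ppois(k, lambda)^n
p_min_pois <- function(k, n, lambda) {
  1 - ppois(k, lambda, lower.tail = FALSE)^n
}

# pmf's as differences of the cdf at consecutive integers
d_max_pois <- function(k, n, lambda) {
  p_max_pois(k, n, lambda) - p_max_pois(k - 1, n, lambda)
}
d_min_pois <- function(k, n, lambda) {
  p_min_pois(k, n, lambda) - p_min_pois(k - 1, n, lambda)
}

# Values of the pmf of the maximum for lambda = 1 and n = 5
k <- 0:15
pmf_max <- d_max_pois(k, n = 5, lambda = 1)
round(pmf_max, 4)

# Bar plots for the four sample sizes (interactive use only)
if (interactive()) {
  old_par <- par(mfrow = c(2, 2))
  for (n in c(2, 5, 10, 30)) {
    barplot(d_max_pois(k, n = n, lambda = 1), names.arg = k,
            main = paste("Maximum, n =", n))
  }
  par(old_par)
}
```

With n = 1 both pmf's reduce to dpois(), which is a convenient check of the implementation.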

Exercise 2.7 Let X be the rv that describes the number of days a patient is in an intensive care unit after an operation. It is known that the distribution of X is

| $r$ | 1 | 2 | 3 |
|---|---|---|---|
| $\mathbb{P}(X=r)$ | 0.3 | 0.4 | 0.3 |

Find:

  1. The mean of the population.
  2. The standard deviation of the population.
  3. The distribution of the sample mean of a srs $(X_1,X_2)$ of two patients, obtained from the joint distribution of $X_1$ and $X_2$.
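The three computations can be organized in R as sketched below. The approach enumerates all pairs of values of two independent patients; the variable names are illustrative choices.

```r
# Population distribution of the number of days in intensive care
r <- c(1, 2, 3)
p <- c(0.3, 0.4, 0.3)

# Population mean and standard deviation
mu <- sum(r * p)
sigma <- sqrt(sum((r - mu)^2 * p))

# Joint pmf of (X1, X2) under independence: products of the marginals
joint <- outer(p, p)

# Sample mean for each pair of values (x1, x2)
xbar <- outer(r, r, "+") / 2

# pmf of the sample mean: add the joint probabilities of equal means
pmf_xbar <- tapply(as.vector(joint), as.vector(xbar), sum)

c(mean = mu, sd = sigma)
pmf_xbar
```

The names of pmf_xbar are the attainable values of the sample mean, and its entries add up to one.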

Exercise 2.8 The monthly savings (in euros) of a student is a normal rv with mean $\mu=100$ and standard deviation $\sigma=50$. Sixteen students were selected at random, with $\bar{X}$ being the sample mean of the measured savings.

  1. What is the distribution of $\bar{X}$?
  2. Compute the probability that $\bar{X}$ is larger than $125$.
  3. Compute the probability that $\bar{X}$ is between $90$ and $130$.
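Once the distribution of the sample mean is identified, the probabilities reduce to evaluations of the normal cdf. A minimal R sketch, assuming the standard result that the mean of a normal srs is normal with standard deviation $\sigma/\sqrt{n}$:

```r
# Sampling distribution of the mean of n normal observations
mu <- 100
sigma <- 50
n <- 16
se <- sigma / sqrt(n)  # standard deviation of the sample mean

# P(Xbar > 125): upper tail of N(mu, se^2)
p_larger_125 <- pnorm(125, mean = mu, sd = se, lower.tail = FALSE)

# P(90 < Xbar < 130): difference of the cdf at both endpoints
p_between <- pnorm(130, mean = mu, sd = se) - pnorm(90, mean = mu, sd = se)

c(larger_125 = p_larger_125, between_90_130 = p_between)
```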

Exercise 2.9 Several government posts believe that the salary increment (in percentage) of the employees in the banking sector follows a normal distribution with standard deviation $3.37$. A sample of $n=16$ employees from the sector is taken. Find the probability that the sample standard deviation is:

  1. Smaller than 1.99;
  2. Larger than 2.89.

Exercise 2.10 Assuming that the births of boys and girls are equally likely, find the probability that in the next 200 births:

  1. Less than 40% of them are boys;
  2. Between 43% and 57% are girls;
  3. More than 54% are boys.
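These probabilities can be computed exactly with the binomial distribution and compared with the normal approximation of the sample proportion. The R sketch below translates the percentages into counts out of 200 births; treating the endpoints 43% and 57% as included is an assumption about the intended reading.

```r
n <- 200
p <- 0.5

# 1. Less than 40% boys: at most 79 boys
p1 <- pbinom(79, size = n, prob = p)

# 2. Between 43% and 57% girls: from 86 to 114 girls (endpoints included)
p2 <- pbinom(114, size = n, prob = p) - pbinom(85, size = n, prob = p)

# 3. More than 54% boys: at least 109 boys
p3 <- pbinom(108, size = n, prob = p, lower.tail = FALSE)

# Normal approximation with continuity correction for the third probability
p3_normal <- pnorm(108.5, mean = n * p, sd = sqrt(n * p * (1 - p)),
                   lower.tail = FALSE)

c(p1 = p1, p2 = p2, p3 = p3, p3_normal = p3_normal)
```

The closeness of p3 and p3_normal illustrates how accurate the central limit theorem approximation is for this sample size.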

Exercise 2.11 A tobacco manufacturer claims that the mean nicotine content of its cigarettes is $\mu=0.6$ mg per cigarette. The nicotine content is assumed to be a $\mathcal{N}(\mu,\sigma^2)$ rv. An independent organization measures the nicotine content of a sample of $n=16$ of the company's cigarettes and finds that the average nicotine content in that batch is $\bar{X}=0.72$ and that the quasistandard deviation is $S'=0.1$. What was the probability of observing $|\bar{X}-\mu|\geq 0.12$ if $\mu$ is actually equal to $0.6$, as the company claims? What does this probability suggest?

Exercise 2.12 The daily heating expenses of two similar-sized company departments follow normal distributions with an average expense of 10 euros for both departments, and a standard deviation of 1 for the first and 1.5 for the second. In order to audit the expenses, they are measured at both departments for 10 days chosen at random. Compute:

  1. The probability that in the 10 days, the average expense of the first department is above the average expense of the second by at least 10 euros.
  2. The probability that the sample variance of the first department is smaller than two times the sample variance of the second.

Exercise 2.13 The lifetime of certain electronic components follows a normal distribution with mean 1600 hours and standard deviation 400 hours.

  1. Given a srs of 16 components, find the probability that $\bar{X}\geq 1500$ hours.
  2. Given a srs of 16 components, find the number of hours $h$ such that the probability that $\bar{X}\geq h$ is $0.15$.
  3. Given a srs of 16 components, find the number of hours $h$ such that the probability that $S\geq h$ is $0.10$.
  4. Given a srs of 121 components, find the probability that at least half of the sample components have a lifetime longer than 1500 hours.
  5. Find the number of components for a sample that is required for ensuring that, with probability 0.92, the average lifetime of the sample is larger than 1500 hours.

Exercise 2.14 Given a srs of size 10 from a normal distribution with standard deviation 2, compute the probability that the sample and the population means differ by more than 0.5 units. Compute the size of the sample required for ensuring that, with probability 0.9, the sample and the population means differ by less than 0.1 units.

Exercise 2.15 The effectiveness (measured in days) of a certain drug is distributed as $\mathcal{N}(14,\sigma^2)$. The drug is given to 16 patients and the observed quasistandard deviation in the sample is 1.4 days. The minimum average effectiveness required for its commercialization is 13 days. Determine:

  1. The probability that the average effectiveness does not attain the required minimum.
  2. The probability that the variance is underestimated by more than 20%.
  3. Does the previous probability increase or decrease with the sample size?
  4. The sample size such that the probability in part 2 is 0.05.
  5. A reason why there is so much concern about variance estimation.

Exercise 2.16 The bearing balls of a given manufacturer weigh 0.5 grams on average and have a standard deviation of 0.02 grams. Find the probability that two batches of 1000 balls differ in weight by more than 2 grams.

Exercise 2.17 A factory produces a certain chemical product, whose amount of impurities has to be controlled. For that aim, 20 batches of the product are examined. If the standard deviation of the percentage of impurities is above 2.5%, then the production chain will have to be carefully examined. It is assumed that the percentage of impurities is normally distributed.

  1. What is the probability that the production chain will have to be examined if the population standard deviation is 2%?
  2. What is the probability that the average percentage of impurities in the sample is above 5% if the average population percentage is 1%?

Exercise 2.18 Let $(X_1,X_2)$ be a srs of an $\mathrm{Exp}(\lambda)$ rv. What is the distribution of the ratio statistic $X_1/X_2$?

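Exercise 2.19 Let $(X_1,\ldots,X_n)$ be a srs of a rv distributed as $\mathcal{N}(\mu,\sigma^2)$. Show that:

$$\mathbb{E}[S^2]=\frac{n-1}{n}\sigma^2,\quad\mathrm{Var}[S^2]=\frac{2(n-1)}{n^2}\sigma^4,\quad\mathbb{E}[S'^2]=\sigma^2,\quad\mathrm{Var}[S'^2]=\frac{2}{n-1}\sigma^4.$$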

Exercise 2.20 An environmental protection agency is interested in establishing norms for the amount of permissible chemical products in lakes and rivers. A commonly employed toxicity metric is the quantity of any pollutant that will kill half of the test specimens in a given time interval (usually 96 hours for fish). This metric is denoted as LC50 (Lethal Concentration that kills 50% of the test specimens). It has been observed in previous studies that $\log(\mathrm{LC50})$ follows a normal distribution. Let $S_1'^2$ be the sample quasivariance of a srs of 10 values of $\log(\mathrm{LC50})$ for copper and let $S_2'^2$ be the sample quasivariance of a srs of 8 values of $\log(\mathrm{LC50})$ for lead. Both samples were obtained from the same fish species. Assume that the population variance for the copper measurements is twice the one for lead. Assuming that $S_1'^2$ and $S_2'^2$ are independent, find two numbers $a$ and $b$ such that

$$\mathbb{P}\left(a\leq\frac{S_1'^2}{S_2'^2}\leq b\right)=0.90.$$
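A possible computation in R is sketched below. It assumes the standard result that, for independent normal samples, the ratio of quasivariances divided by the ratio of population variances follows a Snedecor's $\mathcal{F}$ distribution with $n_1-1$ and $n_2-1$ degrees of freedom. The equal-tailed choice of 5% in each tail is only one of infinitely many valid pairs $(a,b)$.

```r
n1 <- 10        # copper sample size
n2 <- 8         # lead sample size
var_ratio <- 2  # population variance of copper over that of lead

# Ratio of quasivariances = var_ratio * F, with F ~ F(n1 - 1, n2 - 1).
# Equal-tailed choice: 5% of probability left in each tail.
a <- var_ratio * qf(0.05, df1 = n1 - 1, df2 = n2 - 1)
b <- var_ratio * qf(0.95, df1 = n1 - 1, df2 = n2 - 1)

# Probability covered by [a, b], which should be 0.90
coverage <- pf(b / var_ratio, df1 = n1 - 1, df2 = n2 - 1) -
  pf(a / var_ratio, df1 = n1 - 1, df2 = n2 - 1)

c(a = a, b = b, coverage = coverage)
```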

Exercise 2.21 Fifty people have simulated samples of size three from a rv $\mathcal{N}(\mu,\sigma^2)$. Each of the samples gave the value of the statistic

$$\hat{\mu}=\frac{X_1+3X_2-X_3}{5},$$

that is going to be used as an estimator of $\mu$. The fifty values of $\hat{\mu}$ are represented in a histogram, and it turns out that the normal distribution that best fits the data has mean $1.68$ and standard deviation $1.59$. With this information, is it possible to estimate the distribution $\mathcal{N}(\mu,\sigma^2)$ from which the data came?

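Exercise 2.22 Let $X\sim\mathcal{N}(0,1)$ and $Y\sim\chi^2_\nu$ be two independent rv’s. Show that the density of the rv defined as

$$T=\frac{X}{\sqrt{Y/\nu}}$$

is

$$f_T(t)=\frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\left(\frac{\nu}{2}\right)}\left(1+\frac{t^2}{\nu}\right)^{-(\nu+1)/2},\quad t\in\mathbb{R}.$$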

Exercise 2.23 Using R, verify by simulations that the following equalities in distribution hold:

  1. $\mathcal{F}_{1,\nu}\stackrel{d}{=}t_\nu^2$, for all $\nu>0$.
  2. $1/\mathcal{F}_{\nu_1,\nu_2}\stackrel{d}{=}\mathcal{F}_{\nu_2,\nu_1}$, for all $\nu_1,\nu_2>0$.

To do so, simulate $M=10^4$ values from Snedecor’s $\mathcal{F}$ distribution using rf() and from Student’s $t$ distribution using rt(), and compare the histograms of the simulated values for several degrees of freedom.
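A starting point for the simulation could look as follows. The degrees of freedom $\nu=5$, $\nu_1=4$, and $\nu_2=7$ are arbitrary choices that should be varied, and the quantile comparison is an addition to the histograms requested in the exercise.

```r
set.seed(1)
M <- 1e4

# 1. F_{1, nu} versus the square of a t_nu
nu <- 5
f_sample <- rf(M, df1 = 1, df2 = nu)
t2_sample <- rt(M, df = nu)^2

# 2. 1 / F_{nu1, nu2} versus F_{nu2, nu1}
nu1 <- 4
nu2 <- 7
inv_f_sample <- 1 / rf(M, df1 = nu1, df2 = nu2)
f_swapped_sample <- rf(M, df1 = nu2, df2 = nu1)

# Numerical comparison through a few sample quantiles
probs <- c(0.25, 0.5, 0.75, 0.9)
rbind(F_1_nu = quantile(f_sample, probs),
      t_nu_squared = quantile(t2_sample, probs))
rbind(inverse_F = quantile(inv_f_sample, probs),
      F_swapped = quantile(f_swapped_sample, probs))

# Histograms side by side (interactive use only); the heavy right tails
# are truncated at the 95% quantile to keep the bins comparable
if (interactive()) {
  old_par <- par(mfrow = c(1, 2))
  cut_off <- quantile(f_sample, 0.95)
  hist(f_sample[f_sample < cut_off], breaks = 50, main = "F(1, nu)")
  hist(t2_sample[t2_sample < cut_off], breaks = 50, main = "t(nu)^2")
  par(old_par)
}
```

Matching rows in each quantile table, up to Monte Carlo error, are consistent with the claimed equalities.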

Exercise 2.24 Using R, verify by simulations that the claim of Fisher’s Theorem on the independence between $\bar{X}$ and $S^2$ on a normal population is true. To do so, simulate $M=10^4$ values of $(\bar{X},S^2)$ for $n=10$ and the choices of $\mu$ and $\sigma$ that you want. Then, inspect if there is dependence on the sample $\{(\bar{X}_i,S_i^2)\}_{i=1}^M$. Use rnorm() for simulating values from a normal distribution.
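A minimal simulation sketch, assuming $\mu=0$ and $\sigma=1$ and taking $S^2$ as the sample variance with divisor $n$ (the conclusion does not depend on that scaling):

```r
set.seed(1)
M <- 1e4
n <- 10
mu <- 0
sigma <- 1

# Each row is one srs of size n from N(mu, sigma^2)
samples <- matrix(rnorm(M * n, mean = mu, sd = sigma), nrow = M, ncol = n)

# Sample means and sample variances; var() divides by n - 1, so it is
# rescaled to obtain the variance with divisor n
xbar <- rowMeans(samples)
s2 <- apply(samples, 1, var) * (n - 1) / n

# Linear and nonlinear association measures; values close to zero are
# compatible with independence, although they do not prove it
cor_linear <- cor(xbar, s2)
cor_abs <- cor(abs(xbar - mu), s2)
c(linear = cor_linear, abs_dev = cor_abs)

# Scatterplot of the M pairs (interactive use only)
if (interactive()) {
  plot(xbar, s2, pch = 16, cex = 0.3, xlab = "Sample mean",
       ylab = "Sample variance")
}
```

Repeating the experiment with rexp() in place of rnorm() is a useful contrast, since the independence is specific to normal populations.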

Exercise 2.25 Let $X\sim F$ be a continuous random variable with a monotone non-decreasing cdf $F$ and pdf $f$. Let $U\sim\mathcal{U}(0,1)$. Using transformations, show that:

  1. $F(X)\sim\mathcal{U}(0,1)$.
  2. $F^{-1}(U)\sim F$.

The second result is very useful for simulation purposes: a srs from $F$ can be simulated with a srs $(U_1,\ldots,U_n)$ from $\mathcal{U}(0,1)$ to which $F^{-1}$ is applied. That is, $(F^{-1}(U_1),\ldots,F^{-1}(U_n))$ is a srs from $F$. Derive $F^{-1}$ for

  1. $\mathrm{Exp}(\lambda)$ and
  2. $\mathrm{Cauchy}(\mu,\sigma)$,

and code R functions to simulate samples from each distribution. Check that the simulated samples follow the desired distributions by comparing the histograms of the samples with the theoretical pdf’s.
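One way the two simulators could be written is sketched below. It assumes the inverses $F^{-1}(u)=-\log(1-u)/\lambda$ for $\mathrm{Exp}(\lambda)$ (with $\lambda$ a rate) and $F^{-1}(u)=\mu+\sigma\tan(\pi(u-1/2))$ for $\mathrm{Cauchy}(\mu,\sigma)$, which should be verified as part of the exercise. The function names r_exp_inv() and r_cauchy_inv() are hypothetical.

```r
# Inverse transform samplers: apply F^{-1} to a srs from U(0, 1)
r_exp_inv <- function(n, lambda) {
  -log(1 - runif(n)) / lambda
}
r_cauchy_inv <- function(n, mu, sigma) {
  mu + sigma * tan(pi * (runif(n) - 0.5))
}

set.seed(123)
x_exp <- r_exp_inv(1e4, lambda = 2)
x_cauchy <- r_cauchy_inv(1e4, mu = 1, sigma = 2)

# Quick numerical checks against known theoretical values
c(mean_exp = mean(x_exp), theoretical = 1 / 2)
c(median_cauchy = median(x_cauchy), theoretical = 1)

# Histograms with the theoretical pdf's overlaid (interactive use only)
if (interactive()) {
  hist(x_exp, breaks = 50, freq = FALSE, main = "Exp(2)")
  curve(dexp(x, rate = 2), add = TRUE, col = 2)

  # The Cauchy sample is restricted to a window, and the pdf is divided by
  # the probability of that window so that both are on the same scale
  inside <- abs(x_cauchy - 1) < 20
  window_prob <- pcauchy(21, location = 1, scale = 2) -
    pcauchy(-19, location = 1, scale = 2)
  hist(x_cauchy[inside], breaks = 100, freq = FALSE, main = "Cauchy(1, 2)")
  curve(dcauchy(x, location = 1, scale = 2) / window_prob, add = TRUE, col = 2)
}
```

The Cauchy check uses the median rather than the mean because the Cauchy distribution has no mean, so sample averages do not stabilize.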