Chapter 5 Transforms

A patient was given a medication during an earlier ward round. Before providing any further treatment, the doctor would like to understand the amount of medication still in the patient's system. The amount of medication that was administered is modelled by a known random variable \(M_0\), and the time from the ward round until the doctor next sees the patient is modelled by a known random variable \(T\). The amount of medication still in the patient's system is given by \(M_{now} = M_0 e^{-\frac{T}{4}}\). What is the distribution of \(M_{now}\)?

How do we solve this type of problem mathematically?

5.1 One-dimensional Transformations

Consider a continuous random variable \(X\) whose PDF is \(f_X(x)\) and CDF is \(F_X(x)\). Let \(g: \mathbb{R} \rightarrow \mathbb{R}\) be a continuous function. Define a new continuous random variable by \(Y=g(X)\). The aim is to find the probability density function \(f_Y(y)\) of \(Y\).

In general there are two steps to do this, as illustrated in the sketch after this list:

  1. Compute the CDF of \(Y\), that is \[F_Y(y)=P(Y \leq y),\] by substituting \(g(X)\) for \(Y\), rearranging to write the event \(g(X) \leq y\) in terms of \(X\) and \(y\), and finally using the known expressions for \(F_X(x)\) and \(f_X(x)\).

  2. Derive the PDF of \(Y\), \(f_Y(y)\), from the CDF \(F_Y(y)\) using the fact that \[f_Y(y)=\dfrac{dF_Y}{dy}(y).\]
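
Both steps can be carried out symbolically. Below is a minimal sketch using Python with sympy; the particular transform \(Y = 2X\) with \(X \sim Exp(\lambda)\) is our own illustrative choice, not one from the text.

```python
# A sketch of the two-step CDF method for the illustrative transform
# Y = 2*X with X ~ Exp(lmbda).
import sympy as sp

y, x, lmbda = sp.symbols("y x lmbda", positive=True)

# The known CDF of X ~ Exp(lmbda), valid for x >= 0.
F_X = 1 - sp.exp(-lmbda * x)

# Step 1: F_Y(y) = P(Y <= y) = P(2X <= y) = P(X <= y/2) = F_X(y/2).
F_Y = F_X.subs(x, y / 2)

# Step 2: differentiate the CDF to obtain the PDF.
f_Y = sp.simplify(sp.diff(F_Y, y))
print(f_Y)  # lmbda*exp(-lmbda*y/2)/2, i.e. Y ~ Exp(lmbda/2)
```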

Dunder Mifflin Paper Company is investigating employee efficiency in their Scranton branch. In particular, they are concerned with employee time being taken up by conference room meetings. The length of time a meeting takes is distributed by \(T_{meet} \sim Exp(\lambda)\). There are \(13\) employees in the branch, all of whom attend each meeting. Calculate the CDF and PDF of the total amount of employee time \(T_{total}\) taken up by a meeting, and identify its distribution.

Since \(T_{meet} \geq 0\), it follows that \(T_{total}=13 T_{meet} \geq 0\). Therefore \(F_{T_{total}}(t)=0\) for \(t<0\). Now assuming \(t\geq 0\), we have \[\begin{align*} &T_{total} \leq t \\[3pt] \iff \qquad &13 T_{meet} \leq t \\[3pt] \iff \qquad & T_{meet} \leq \frac{t}{13}. \end{align*}\] Using the known CDF of the exponential distribution, that is \(F_{T_{meet}}(x) = 1-e^{-\lambda x}\) for \(x \geq 0\), we can calculate the CDF of \(T_{total}\): \[\begin{align*} F_{T_{total}}(t) &= P(T_{total} \leq t) \\[3pt] &= P\left( T_{meet} \leq \frac{t}{13} \right) \\[3pt] &= F_{T_{meet}} \left( \frac{t}{13} \right) \\[3pt] &= 1 - e^{-\frac{\lambda t}{13}}. \end{align*}\] So for \(t \geq 0\) the PDF of \(T_{total}\) is: \[f_{T_{total}}(t) = \frac{d}{dt} F_{T_{total}}(t) = \frac{\lambda}{13} e^{-\frac{\lambda t}{13}}.\] Therefore \[ f_{T_{total}}(t) = \begin{cases} \frac{\lambda}{13} e^{-\frac{\lambda t}{13}}, & \text{if } t \geq 0 \\[5pt] 0, & \text{otherwise,} \end{cases}\] that is, \(T_{total} \sim Exp \left( \frac{\lambda}{13} \right)\).
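
A quick Monte Carlo check of this conclusion is sketched below; the rate \(\lambda = 0.1\) (per minute) and the sample size are arbitrary choices for illustration.

```python
# Simulate T_total = 13 * T_meet and compare against Exp(lam/13).
import numpy as np

rng = np.random.default_rng(0)
lam = 0.1  # illustrative rate; Exp(lam) has mean 1/lam = 10 minutes
t_meet = rng.exponential(scale=1 / lam, size=100_000)
t_total = 13 * t_meet

# If T_total ~ Exp(lam/13), its mean should be 13/lam = 130.
print(t_total.mean())
# Empirical CDF at t = 100 versus 1 - exp(-lam*100/13).
print((t_total <= 100).mean(), 1 - np.exp(-lam * 100 / 13))
```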
Consider \(X \sim N(0,1)\). Let \(g(x)=x^2\), and define \(Y=g(X)=X^2\). Find the PDF of \(Y\).



Clearly \(Y \geq 0\), so \(F_Y(y)=0\) for \(y<0\), and it is enough to assume \(y\) is non-negative when finding \(F_Y(y)\) and \(f_Y(y)\). Calculate

\[\begin{align*} F_Y(y) &= P(Y \leq y) \\[3pt] &= P \left( X^2 \leq y \right) \\[3pt] &= P \left( -\sqrt{y} \leq X \leq \sqrt{y} \right) \\[3pt] &= P \left( X \leq \sqrt{y} \right) - P \left(X \leq -\sqrt{y} \right) \\[3pt] &= F_X \left(\sqrt{y} \right) - F_X \left(-\sqrt{y} \right). \end{align*}\]

Using that \(F_X(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-\frac{t^2}{2}}\,dt\), it follows that

\[F_Y(y) = F_X \left(\sqrt{y} \right) - F_X \left(-\sqrt{y} \right) = \int_{-\sqrt{y}}^{\sqrt{y}} \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \,dx.\]

Now if \(y>0\),

\[\begin{align*} f_Y(y) &= \frac{dF_Y}{dy}(y) \\[3pt] &= \frac{d}{dy} \left( \int_{-\sqrt{y}}^{\sqrt{y}} \frac{1}{\sqrt{2 \pi}} e^{-\frac{x^2}{2}} \,dx \right) \\[3pt] &= \frac{d}{dy} \left( F_X(\sqrt{y}) - F_X(-\sqrt{y}) \right) \\[3pt] &= \frac{d}{dy} \left( \sqrt{y} \right) F_{X}' \left(\sqrt{y} \right) - \frac{d}{dy} \left( -\sqrt{y} \right) F_{X}' \left(-\sqrt{y} \right) \\[3pt] &= \frac{1}{2\sqrt{y}} \cdot \frac{1}{\sqrt{2\pi}} e^{- \frac{(\sqrt{y})^2}{2}} - \left( - \frac{1}{2\sqrt{y}} \right) \cdot \frac{1}{\sqrt{2\pi}} e^{- \frac{(-\sqrt{y})^2}{2}} \\[3pt] &= \frac{1}{2\sqrt{2\pi y}} e^{- \frac{y}{2}} + \frac{1}{2\sqrt{2\pi y}} e^{- \frac{y}{2}} \\[3pt] &= \frac{1}{\sqrt{2\pi y}} e^{-\frac{y}{2}}. \end{align*}\]
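
As a sanity check on this derivation, one can compare simulated values of \(X^2\) against the derived density; the interval \([0.5, 1.5]\) below is an arbitrary choice.

```python
# Compare the empirical probability that X**2 lands in [0.5, 1.5]
# with the integral of the derived PDF over the same interval.
import numpy as np
from scipy.integrate import quad

rng = np.random.default_rng(1)
y_samples = rng.standard_normal(1_000_000) ** 2

emp = ((y_samples >= 0.5) & (y_samples <= 1.5)).mean()
integral, _ = quad(lambda t: np.exp(-t / 2) / np.sqrt(2 * np.pi * t), 0.5, 1.5)
print(emp, integral)  # the two values should agree closely
```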

5.2 Two-dimensional Transformations

The discussion of Section 5.1 only considers functions of one variable. The medication example outlined at the beginning of the chapter involves two random variables, \(M_{0}\) and \(T\), that both act as inputs to a function. In this section we develop the theory to address this situation.

Let \(T: A \rightarrow B\), where \(A,B \subseteq \mathbb{R}^{2}\), be a one-to-one transformation, that is, every pair \((y_1,y_2) \in B\) is obtained by applying \(T\) to exactly one \((x_1,x_2) \in A\). Recall the definition of the Jacobian of such a transformation.

The Jacobian of a one-to-one transformation \(T: A \rightarrow B\) that maps \((x_1,x_2)\) to \((y_1,y_2) = \big( y_1(x_1,x_2), y_2(x_1,x_2) \big)\), denoted \(\frac{\partial (y_1,y_2)}{\partial (x_1,x_2)}\), is \[\frac{\partial (y_1,y_2)}{\partial (x_1,x_2)} = \det \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} \\[4pt] \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} \end{pmatrix}.\]

A key characteristic of a one-to-one transformation is that it is invertible. Specifically, there exists a one-to-one transformation \(T^{-1}:B \rightarrow A\) such that \(T^{-1} \circ T\) and \(T \circ T^{-1}\) are the identity mappings on \(A\) and \(B\) respectively. Typically, if \(T(x_1,x_2) = \big( y_1(x_1,x_2),y_2(x_1,x_2) \big)\), then the inverse is written \[T^{-1}(y_1,y_2) = \big( x_1(y_1,y_2),x_2(y_1,y_2) \big).\] Whenever the Jacobian of \(T\) is non-zero, the Jacobian of \(T^{-1}\) is its reciprocal: \[\frac{\partial (x_1,x_2)}{\partial (y_1,y_2)} = \frac{1}{\frac{\partial (y_1,y_2)}{\partial (x_1,x_2)}}.\]
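
The reciprocal relationship can be checked symbolically. The sketch below uses Python with sympy and, for concreteness, the transformation \(T(x_1,x_2) = (x_1+x_2,\, x_1-x_2)\) that appears in the example which follows.

```python
# Verify that the Jacobians of T and T^{-1} are reciprocals.
import sympy as sp

x1, x2, y1, y2 = sp.symbols("x1 x2 y1 y2")

J_T = sp.Matrix([x1 + x2, x1 - x2]).jacobian([x1, x2]).det()
J_Tinv = sp.Matrix([(y1 + y2) / 2, (y1 - y2) / 2]).jacobian([y1, y2]).det()
print(J_T, J_Tinv, sp.simplify(J_T * J_Tinv))  # -2, -1/2, 1
```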

Suppose \(X_1, X_2\) are continuous random variables with joint PDF \(f_{X_1,X_2}(x_1,x_2)\). Define two new random variables by \((Y_1,Y_2)=T(X_1,X_2)\). The aim is to find the joint PDF \(f_{Y_1,Y_2}(y_1,y_2)\) of \(Y_1, Y_2\).

Define \((Y_1,Y_2)=T(X_1,X_2) = \big( Y_1(X_1,X_2), Y_2(X_1,X_2) \big)\) for some transformation \(T\). Set \[A= \left\{ (x_1,x_2): f_{X_1,X_2}(x_1, x_2)>0 \right\}.\] If \(T\) is one-to-one and the Jacobian of \(T^{-1}\), denoted \(\frac{\partial (x_1,x_2)}{\partial (y_1,y_2)}\), is non-zero on \(T(A)\), then the joint PDF of \(Y_1,Y_2\) is \[f_{Y_1,Y_2}(y_1,y_2) = \begin{cases} f_{X_1,X_2} \big(x_1(y_1,y_2),x_2(y_1,y_2) \big) \left|\frac{\partial (x_1,x_2)}{\partial (y_1,y_2)}\right|, & \text{if } (y_1, y_2) \in T(A), \\[5pt] 0, & \text{otherwise.} \end{cases}\]

Let \(X_1, X_2\) be IID as \(U(0,1)\). Define \[ Y_1=X_1+X_2, \qquad \text{and} \qquad Y_2=X_1-X_2.\] Find the joint PDF of \(Y_1\) and \(Y_2\).



The joint PDF of \(X_1\) and \(X_2\) is

\[\begin{align*} f_{X_1,X_2}(x_1,x_2) &= f_{X_1}(x_1)f_{X_2}(x_2) \\[5pt] &= \begin{cases} 1, & \text{if } 0 \leq x_1, x_2 \leq 1,\\[3pt] 0, & \text{otherwise.}\end{cases} \end{align*}\]

Now \(T:(x_1,x_2) \mapsto (y_1,y_2)\) is defined by

\[ y_1(x_1,x_2)=x_1+x_2,\qquad \text{and} \qquad y_2(x_1,x_2)=x_1-x_2.\]

The inverse \(T^{-1}: (y_1,y_2) \mapsto (x_1,x_2)\) is given by

\[x_1(y_1,y_2) = \frac{y_1+y_2}{2}, \qquad \text{and} \qquad x_2(y_1,y_2) = \frac{y_1-y_2}{2}.\]

The Jacobian of \(T^{-1}\) is

\[ \frac{\partial (x_1,x_2)}{\partial (y_1,y_2)} = \det\begin{pmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} \\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} \end{pmatrix} =\det \begin{pmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} \end{pmatrix} = -\frac{1}{2}.\]

The set of points at which \(f_{X_1,X_2}\) is positive is

\[A= \big\{ (x_1,x_2): f_{X_1,X_2}(x_1,x_2) >0 \big\} = \big\{ (x_1,x_2): 0 \leq x_1, x_2 \leq 1 \big\}.\]

Rewriting the bounding lines of \(A\) in terms of \(y_1,y_2\), we obtain

\[\begin{align*} x_1 = 0 \qquad &\iff \qquad \frac{y_1+y_2}{2}=0 \qquad \iff \qquad y_1 + y_2=0, \\[5pt] x_1 = 1 \qquad &\iff \qquad \frac{y_1+y_2}{2}=1 \qquad \iff \qquad y_1 + y_2=2, \\[5pt] x_2 = 0 \qquad &\iff \qquad \frac{y_1-y_2}{2}=0 \qquad \iff \qquad y_1 - y_2=0, \\[5pt] x_2 = 1 \qquad &\iff \qquad \frac{y_1-y_2}{2}=1 \qquad \iff \qquad y_1 - y_2=2. \end{align*}\]

It follows that

\[ T(A)= \big\{ (y_1,y_2): 0\leq y_1+y_2 \leq 2 \text{ and } 0 \leq y_1-y_2 \leq 2 \big\}.\]
Therefore by Theorem 5.2.2
\[\begin{align*} f_{Y_1,Y_2}(y_1,y_2) &= \begin{cases} f_{X_1,X_2} \left( \frac{y_1 +y_2}{2},\frac{y_1 -y_2}{2} \right) \cdot \left\vert - \frac{1}{2} \right\vert, & \text{if } (y_1,y_2) \in T(A),\\[3pt] 0, & \text{otherwise,}\end{cases} \\[9pt] &= \begin{cases} \frac{1}{2}, & \text{if } 0 \leq y_1+y_2 \leq 2 \text{ and } 0 \leq y_1-y_2 \leq 2, \\[3pt] 0, & \text{otherwise.}\end{cases} \end{align*}\]
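
A simulation makes this answer plausible: \((Y_1,Y_2)\) should be uniform with density \(\frac{1}{2}\) on the rotated square \(T(A)\). The sub-square chosen below is an arbitrary region inside \(T(A)\).

```python
# For X1, X2 IID U(0,1), check that (X1+X2, X1-X2) has density 1/2 on T(A).
import numpy as np

rng = np.random.default_rng(2)
x1, x2 = rng.random(500_000), rng.random(500_000)
y1, y2 = x1 + x2, x1 - x2

# The sub-square 0.9 <= y1 <= 1.1, -0.1 <= y2 <= 0.1 lies inside T(A)
# and has area 0.2 * 0.2 = 0.04, so its probability should be 0.04/2 = 0.02.
inside = (np.abs(y1 - 1) <= 0.1) & (np.abs(y2) <= 0.1)
print(inside.mean())  # ~0.02
```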
Let \(X_1, X_2\) be IID as the exponential distribution \(Exp(\lambda)\). Define \[Y_1 = \frac{X_1}{X_2} \qquad \text{and} \qquad Y_2=X_1+X_2.\] Find the joint PDF of \(Y_1\) and \(Y_2\), and the PDF of \(Y_1\).



The joint PDF of \(X_1\) and \(X_2\) is given by

\[\begin{align*} f_{X_1,X_2}(x_1,x_2) &= f_{X_1}(x_1)f_{X_2}(x_2) \\[5pt] &= \begin{cases} \lambda e^{-\lambda x_1} \lambda e^{-\lambda x_2}, & \text{if } x_1,x_2 > 0, \\[3pt] 0, & \text{otherwise,} \end{cases} \\[5pt] &= \begin{cases} \lambda^2 e^{-\lambda(x_1+x_2)}, & \text{if } x_1,x_2 > 0, \\[3pt] 0, & \text{otherwise.} \end{cases} \end{align*}\]

Now \(T:(x_1,x_2) \mapsto (y_1,y_2)\) is defined by

\[y_1 = \frac{x_1}{x_2} \qquad \text{and} \qquad y_2=x_1+x_2.\]

The inverse \(T^{-1}: (y_1,y_2) \mapsto (x_1,x_2)\) is found by solving these two equations for \(x_1\) and \(x_2\): substituting \(x_1 = y_1 x_2\) into \(y_2 = x_1 + x_2\) gives \(y_2 = (y_1+1)x_2\), so

\[x_1 = \frac{y_1y_2}{y_1+1} \qquad \text{and} \qquad x_2 = \frac{y_2}{y_1+1}.\]

The Jacobian of \(T^{-1}\) is

\[\begin{align*} \frac{\partial (x_1,x_2)}{\partial (y_1,y_2)} &= \det \begin{pmatrix} \frac{\partial x_1}{\partial y_1} & \frac{\partial x_1}{\partial y_2} \\ \frac{\partial x_2}{\partial y_1} & \frac{\partial x_2}{\partial y_2} \end{pmatrix} \\[5pt] &= \det \begin{pmatrix} \frac{y_2}{(y_1+1)^2} & \frac{y_1}{y_1+1} \\ -\frac{y_2}{(y_1+1)^2} & \frac{1}{y_1+1} \end{pmatrix} \\[5pt] &= \frac{y_2}{(y_1+1)^3} + \frac{y_1y_2}{(y_1+1)^3} \\[5pt] &= \frac{y_2}{(y_1+1)^2}. \end{align*}\]
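
This determinant can be double-checked symbolically, for instance with sympy:

```python
# Symbolic check of the Jacobian of T^{-1} in this example.
import sympy as sp

y1, y2 = sp.symbols("y1 y2", positive=True)
x = sp.Matrix([y1 * y2 / (y1 + 1), y2 / (y1 + 1)])
print(sp.simplify(x.jacobian([y1, y2]).det()))  # y2/(y1 + 1)**2
```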

The set of points at which \(f_{X_1,X_2}\) is positive is

\[A = \big\{ (x_1,x_2):f_{X_1,X_2}(x_1,x_2)>0 \big\} = \big\{ (x_1,x_2):x_1,x_2>0 \big\}.\]

Since \(x_1, x_2>0\), it follows that \(y_1=\frac{x_1}{x_2}>0.\) Furthermore, since \(x_2=\frac{y_2}{y_1+1}>0\) and \(y_1+1>0\), it follows that \(y_2>0\). Therefore,

\[T(A)=\big\{ (y_1,y_2):y_1>0,y_2>0 \big\}.\]

Consequently by Theorem 5.2.2

\[\begin{align*} f_{Y_1,Y_2}(y_1,y_2) &= \begin{cases} f_{X_1,X_2} \left( \frac{y_1y_2}{1+y_1},\frac{y_2}{1+y_1} \right) \left| \frac{\partial (x_1,x_2)}{\partial (y_1,y_2)} \right|, & \text{if } (y_1,y_2) \in T(A), \\[3pt] 0, & \text{otherwise,} \end{cases} \\[9pt] &= \begin{cases} \lambda^2 e^{-\lambda \left( \frac{y_1y_2}{1+y_1} + \frac{y_2}{1+y_1} \right)} \frac{y_2}{(1+y_1)^2}, & \text{if } y_1,y_2>0, \\[3pt] 0, & \text{otherwise,} \end{cases} \\[9pt] &= \begin{cases} \lambda^2 e^{-\lambda y_2} \frac{y_2}{(1+y_1)^2}, & \text{if } y_1,y_2>0, \\[3pt] 0, & \text{otherwise.} \end{cases} \end{align*}\]


The PDF of \(Y_1\) is the marginal PDF of \(Y_1\) coming from the joint PDF \(f_{Y_1,Y_2}(y_1,y_2)\). Therefore if \(y_1>0\):

\[\begin{align*} f_{Y_1}(y_1) &= \int_0^{\infty} \lambda^2 e^{-\lambda y_2} \frac{y_2}{(1+y_1)^2} \,dy_2 \\[3pt] &= \frac{\lambda}{(1+y_1)^2} \int_0^{\infty} y_2 \lambda e^{-\lambda y_2} \,dy_2 \\[3pt] &= \frac{\lambda}{(1+y_1)^2} \cdot \frac{1}{\lambda} \\[3pt] &= \frac{1}{(1+y_1)^2}, \end{align*}\] where the final integral is the mean of the \(Exp(\lambda)\) distribution, namely \(\frac{1}{\lambda}\).

So,

\[ f_{Y_1}(y_1) = \begin{cases} \frac{1}{(1+y_1)^2}, & \text{if } y_1>0, \\[3pt] 0, & \text{otherwise.} \end{cases} \]
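
Note that the answer does not depend on \(\lambda\). A simulation check is sketched below; the rate \(\lambda = 2\) is an arbitrary choice, and we compare against the CDF \(F_{Y_1}(y) = \frac{y}{1+y}\) implied by the derived PDF.

```python
# Check that X1/X2, for X1, X2 IID Exp(lam), has CDF y/(1+y) for y > 0.
import numpy as np

rng = np.random.default_rng(3)
lam = 2.0  # illustrative; the distribution of the ratio is lam-free
x1 = rng.exponential(scale=1 / lam, size=1_000_000)
x2 = rng.exponential(scale=1 / lam, size=1_000_000)
ratio = x1 / x2

for y in (0.5, 1.0, 3.0):
    print((ratio <= y).mean(), y / (1 + y))
```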

This method for understanding how probability density functions behave under transformations extends to the case of \(n\) random variables.

5.3 Maxima and Minima

Consider a machine that has a number of components. The lifetime of the machine is determined by how long all of the individual components continue to work: if one component breaks, then the machine breaks. How does one describe the lifetime of the machine in terms of the lifetimes of the components?

In mathematical terms: suppose one can model the lifetimes of the individual components by random variables \(X_1, X_2, \ldots, X_n\). The lifetime of the machine is then given by \(\min \left( X_1, X_2, \ldots, X_n \right)\). Given the distributions of \(X_1, X_2, \ldots, X_n\), can one calculate the CDF or PDF of \(\min \left( X_1, X_2, \ldots, X_n \right)\), or for that matter \(\max \left( X_1, X_2, \ldots, X_n \right)\)?

Note that the functions \(\max\) and \(\min\) are not of the form of the functions considered in Section 5.2, so the theory there does not apply.

Let \(X_1, X_2, \ldots, X_n\) be IID random variables each with CDF given by \(F_X(x)\). Then

  • The random variable \(Y= \min \left( X_1, X_2, \ldots, X_n \right)\) has CDF given by \[F_{Y}(y) = 1 - \big( 1-F_{X}(y) \big)^n\] for any real number \(y\).

  • The random variable \(Z= \max \left( X_1, X_2, \ldots, X_n \right)\) has CDF given by \[F_{Z}(z) = F_{X}(z)^n\] for any real number \(z\).
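
Before attempting a justification, it can be reassuring to test the formulas numerically. The sketch below uses \(n = 5\) IID \(U(0,1)\) variables, an arbitrary illustrative choice for which \(F_X(y) = y\) on \([0,1]\).

```python
# Numerical check of the min and max CDF formulas for n = 5 IID U(0,1).
import numpy as np

rng = np.random.default_rng(4)
samples = rng.random((200_000, 5))
mins, maxs = samples.min(axis=1), samples.max(axis=1)

y = 0.3
print((mins <= y).mean(), 1 - (1 - y) ** 5)  # CDF of the minimum
print((maxs <= y).mean(), y ** 5)            # CDF of the maximum
```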

Can you justify either of the formulas that appear in Theorem 5.3.1?

What are the PDFs of the random variables \(Y\) and \(Z\) that appear in Theorem 5.3.1? Give your answer in terms of the CDF and the PDF of the \(X_1, \ldots, X_n\) random variables.

Typically F1 cars will change to fresh tires once or twice during a race using a pit stop. Obviously teams want to make this change as quickly as possible. From when the car pulls up, the four tires are changed simultaneously: on average it takes \(3\) seconds to change a tire, and the time taken is exponentially distributed. Lewis Hamilton needs his complete pit stop to take less than \(2.5\) seconds. What is the probability that he gets his wish?



Interpreting the question mathematically: the times taken to change the four tires are given by random variables \(T_1, T_2, T_3, T_4\) respectively. These random variables are IID and distributed by \(Exp \left( \frac{1}{3} \right)\); the \(\frac{1}{3}\) comes from the fact that the average tire change time is \(3\) seconds, and an exponential distribution \(Exp(\lambda)\) has mean \(\frac{1}{\lambda}\). Lewis Hamilton wants a pit stop of less than \(2.5\) seconds, and for this to happen every tire change must be completed in time, that is, we need \(T_{total} = \max \left(T_1,T_2,T_3,T_4 \right)\) to be less than \(2.5\). Therefore we calculate \(P \left( T_{total} < 2.5 \right)\).

Since \(T_1,T_2,T_3,T_4\) are distributed by \(Exp \left( \frac{1}{3} \right)\), we have for \(i=1,2,3,4\) that \[F_{T_i}(t) = \begin{cases} 1-e^{-\frac{t}{3}}, & \text{if } t>0, \\[3pt] 0, & \text{if } t \leq 0. \end{cases}\] Since \(T_{total} = \max \left( T_1, T_2, T_3, T_4 \right)\), by Theorem 5.3.1 the CDF of \(T_{total}\) is \[F_{T_{total}} (t) = F_{T_1}(t)^4 = \begin{cases} \left( 1-e^{-\frac{t}{3}} \right)^4, & \text{if } t>0, \\[3pt] 0, & \text{if } t \leq 0. \end{cases}\] Therefore \(P \left( T_{total} < 2.5 \right) = F_{T_{total}} (2.5) = \left( 1-e^{-\frac{2.5}{3}} \right)^4 = \left( 1-e^{-\frac{5}{6}} \right)^4 \approx 0.102.\)
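
The final number is easy to verify, both directly and by simulating the pit stop; the seed and sample size below are arbitrary.

```python
# Evaluate (1 - exp(-5/6))**4 and cross-check by simulation.
import numpy as np

print((1 - np.exp(-2.5 / 3)) ** 4)  # ~0.102

rng = np.random.default_rng(5)
stops = rng.exponential(scale=3.0, size=(1_000_000, 4)).max(axis=1)
print((stops < 2.5).mean())  # should also be ~0.102
```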