Chapter 59: multivariate normal distribution

Definition 40.1 probability density function (PDF) of normal distribution (= Gaussian distribution)

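$$
\mathcal{N}\left(x\middle|\mu,\sigma^{2}\right)=\dfrac{1}{\sqrt{2\pi}\sigma}\exp\left\{ -\dfrac{\left(x-\mu\right)^{2}}{2\sigma^{2}}\right\}
$$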

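If a continuous random variable $X$ follows a normal distribution with mean $\mu$ and variance $\sigma^{2}$, then

$$
\begin{aligned}
 & X\sim\mathrm{n}\left(\mu,\sigma^{2}\right)=\mathcal{N}\left(\mu,\sigma^{2}\right)\\
\Leftrightarrow\  & f_{X}\left(x\right)=\dfrac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}=\dfrac{e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}}{\sigma\sqrt{2\pi}}=\mathrm{n}\left(x\middle|\mu,\sigma^{2}\right)\\
 & \phantom{f_{X}\left(x\right)}=\dfrac{1}{\sqrt{2\pi}\sigma}\exp\left\{ -\dfrac{\left(x-\mu\right)^{2}}{2\sigma^{2}}\right\} =\mathcal{N}\left(x\middle|\mu,\sigma^{2}\right)
\end{aligned}
$$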

A random variable X can be standardized by subtracting the mean μ and dividing by the standard deviation σ, resulting in the standardized random variable Z

$$
Z=\dfrac{X-\mu}{\sigma}\quad\text{or}\quad z=\dfrac{x-\mu}{\sigma}
$$

The standardized random variable Z follows the standard normal distribution

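$$
\begin{aligned}
 & Z\sim\mathrm{n}\left(0,1^{2}\right)=\mathcal{N}\left(0,1^{2}\right)\\
\Leftrightarrow\  & f_{Z}\left(z\right)=\dfrac{e^{-\frac{1}{2}\left(\frac{z-0}{1}\right)^{2}}}{1\cdot\sqrt{2\pi}}=\dfrac{1}{\sqrt{2\pi}}\exp\left\{ -\dfrac{z^{2}}{2}\right\} =\mathrm{n}\left(z\middle|0,1^{2}\right)=\mathcal{N}\left(z\middle|0,1^{2}\right)
\end{aligned}
$$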
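The standardization step can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `normal_pdf` and `standardize` and the sample values are illustrative assumptions. It relies on the standard change-of-variable fact that $f_{X}(x)=f_{Z}(z)/\sigma$ when $z=(x-\mu)/\sigma$.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density N(x | mu, sigma^2) of the univariate normal distribution."""
    z = (x - mu) / sigma  # standardized value
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def standardize(x, mu, sigma):
    """Map x to z = (x - mu) / sigma."""
    return (x - mu) / sigma

# Illustrative values (assumptions, not taken from the text).
x, mu, sigma = 3.0, 1.0, 2.0
z = standardize(x, mu, sigma)

# The two printed numbers should agree: f_X(x) = f_Z(z) / sigma.
print(normal_pdf(x, mu, sigma), normal_pdf(z) / sigma)
```

The factor $1/\sigma$ is the one-dimensional analogue of the Jacobian factor that appears in the multivariate derivation later in this chapter.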

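To generalize from univariate random variables to multivariate random vectors, a random vector [15]

$$
\boldsymbol{Z}=\left\langle Z_{1},Z_{2},\cdots,Z_{p}\right\rangle =\begin{bmatrix}Z_{1} & Z_{2} & \cdots & Z_{p}\end{bmatrix}^{\intercal}=\begin{bmatrix}Z_{1}\\ Z_{2}\\ \vdots\\ Z_{p} \end{bmatrix}
$$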

with p random variable components is said to follow the standard multivariate normal distribution if and only if its joint PDF is given by

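$$
f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)=\dfrac{1}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{\boldsymbol{z}^{\intercal}\boldsymbol{z}}{2}\right\} =\dfrac{1}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{\boldsymbol{z}\cdot\boldsymbol{z}}{2}\right\} \tag{59.1}
$$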

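Equation (59.1) can be rewritten as a product of univariate densities:

$$
\begin{aligned}
f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)= & \underbrace{\dfrac{1}{\sqrt{2\pi}}\dfrac{1}{\sqrt{2\pi}}\cdots\dfrac{1}{\sqrt{2\pi}}}_{p\text{ times}}\exp\left\{ -\dfrac{z_{1}^{2}}{2}-\dfrac{z_{2}^{2}}{2}-\cdots-\dfrac{z_{p}^{2}}{2}\right\} \\
= & \dfrac{1}{\sqrt{2\pi}}\exp\left\{ -\dfrac{z_{1}^{2}}{2}\right\} \dfrac{1}{\sqrt{2\pi}}\exp\left\{ -\dfrac{z_{2}^{2}}{2}\right\} \cdots\dfrac{1}{\sqrt{2\pi}}\exp\left\{ -\dfrac{z_{p}^{2}}{2}\right\} \\
= & f\left(z_{1}\right)f\left(z_{2}\right)\cdots f\left(z_{p}\right)
\end{aligned}
$$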

where

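$$
f_{Z_{i}}\left(z_{i}\right)=\dfrac{1}{\sqrt{2\pi}}\exp\left\{ -\dfrac{z_{i}^{2}}{2}\right\} =f\left(z_{i}\right)\Leftrightarrow Z_{i}\sim\mathcal{N}\left(0,1^{2}\right)=\mathrm{n}\left(0,1^{2}\right)
$$

This is indeed the marginal density of $Z_{i}$, because integrating the joint PDF over all other components leaves only $f\left(z_{i}\right)$ (each remaining integral is the total probability of a univariate density, which is 1):

$$
\begin{aligned}
f_{Z_{i}}\left(z_{i}\right)= & \int\cdots\int f_{\boldsymbol{Z}}\left(z_{1},\cdots,z_{i-1},z_{i},z_{i+1},\cdots,z_{p}\right)\mathrm{d}z_{1}\cdots\mathrm{d}z_{i-1}\mathrm{d}z_{i+1}\cdots\mathrm{d}z_{p}\\
= & \int\cdots\int f\left(z_{1}\right)\cdots f\left(z_{i-1}\right)f\left(z_{i}\right)f\left(z_{i+1}\right)\cdots f\left(z_{p}\right)\mathrm{d}z_{1}\cdots\mathrm{d}z_{i-1}\mathrm{d}z_{i+1}\cdots\mathrm{d}z_{p}\\
= & f\left(z_{i}\right)\underbrace{\int f\left(z_{1}\right)\mathrm{d}z_{1}}_{=1}\cdots\underbrace{\int f\left(z_{i-1}\right)\mathrm{d}z_{i-1}}_{=1}\underbrace{\int f\left(z_{i+1}\right)\mathrm{d}z_{i+1}}_{=1}\cdots\underbrace{\int f\left(z_{p}\right)\mathrm{d}z_{p}}_{=1}\\
= & f\left(z_{i}\right)=\dfrac{1}{\sqrt{2\pi}}\exp\left\{ -\dfrac{z_{i}^{2}}{2}\right\} 
\end{aligned}
$$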
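The factorization of the standard multivariate normal PDF into univariate standard normal densities can be confirmed at any point. The sketch below is illustrative only; the function names and the test point are assumptions introduced here.

```python
import math

def std_normal_pdf(z):
    """Univariate standard normal density f(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_mvn_pdf(z):
    """Joint density of the standard multivariate normal at the point z."""
    p = len(z)
    quad = sum(zi * zi for zi in z)  # z^T z
    return math.exp(-0.5 * quad) / (2.0 * math.pi) ** (p / 2.0)

def product_of_marginals(z):
    """f(z_1) f(z_2) ... f(z_p)."""
    return math.prod(std_normal_pdf(zi) for zi in z)

z = [0.3, -1.2, 2.0]  # arbitrary illustrative point with p = 3
print(std_mvn_pdf(z), product_of_marginals(z))
```

Both expressions agree up to floating-point rounding, which is the numerical counterpart of the independence of the components $Z_{i}$.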

covariance matrix[19]

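Definition 18.1 covariance matrix of a random vector [8]

$$
\mathrm{C}\left[\boldsymbol{X}\right]=\mathrm{Cov}\left[\boldsymbol{X}\right]=\mathrm{V}\left[\boldsymbol{X}\right]=\mathrm{E}\left[\left[\boldsymbol{X}-\mathrm{E}\left(\boldsymbol{X}\right)\right]\left[\boldsymbol{X}-\mathrm{E}\left(\boldsymbol{X}\right)\right]^{\intercal}\right]
$$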

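$$
\boldsymbol{X}=\left\langle X_{1},X_{2},\cdots,X_{p}\right\rangle =\begin{bmatrix}X_{1} & X_{2} & \cdots & X_{p}\end{bmatrix}^{\intercal}=\begin{bmatrix}X_{1}\\ X_{2}\\ \vdots\\ X_{p} \end{bmatrix}
$$

$$
\mathrm{E}\left[\boldsymbol{X}\right]=\left\langle \mathrm{E}\left[X_{1}\right],\mathrm{E}\left[X_{2}\right],\cdots,\mathrm{E}\left[X_{p}\right]\right\rangle =\begin{bmatrix}\mathrm{E}\left[X_{1}\right] & \mathrm{E}\left[X_{2}\right] & \cdots & \mathrm{E}\left[X_{p}\right]\end{bmatrix}^{\intercal}=\begin{bmatrix}\mathrm{E}\left[X_{1}\right]\\ \mathrm{E}\left[X_{2}\right]\\ \vdots\\ \mathrm{E}\left[X_{p}\right] \end{bmatrix}
$$

$$
\boldsymbol{X}-\mathrm{E}\left[\boldsymbol{X}\right]=\begin{bmatrix}X_{1}-\mathrm{E}\left[X_{1}\right]\\ X_{2}-\mathrm{E}\left[X_{2}\right]\\ \vdots\\ X_{p}-\mathrm{E}\left[X_{p}\right] \end{bmatrix}
$$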

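Writing $D_{i}=X_{i}-\mathrm{E}\left[X_{i}\right]$ for brevity,

$$
\begin{aligned}
\left[\boldsymbol{X}-\mathrm{E}\left(\boldsymbol{X}\right)\right]\left[\boldsymbol{X}-\mathrm{E}\left(\boldsymbol{X}\right)\right]^{\intercal}= & \begin{bmatrix}D_{1}\\ D_{2}\\ \vdots\\ D_{p} \end{bmatrix}\begin{bmatrix}D_{1} & D_{2} & \cdots & D_{p}\end{bmatrix}\\
= & \begin{bmatrix}D_{1}D_{1} & D_{1}D_{2} & \cdots & D_{1}D_{p}\\ D_{2}D_{1} & D_{2}D_{2} & \cdots & D_{2}D_{p}\\ \vdots & \vdots & \ddots & \vdots\\ D_{p}D_{1} & D_{p}D_{2} & \cdots & D_{p}D_{p} \end{bmatrix}\\
= & \begin{bmatrix}\left(X_{1}-\mathrm{E}\left[X_{1}\right]\right)^{2} & \cdots & \left(X_{1}-\mathrm{E}\left[X_{1}\right]\right)\left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)\\ \vdots & \ddots & \vdots\\ \left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)\left(X_{1}-\mathrm{E}\left[X_{1}\right]\right) & \cdots & \left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)^{2} \end{bmatrix}
\end{aligned}
$$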

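$$
\begin{aligned}
 & \mathrm{E}\left[\left[\boldsymbol{X}-\mathrm{E}\left(\boldsymbol{X}\right)\right]\left[\boldsymbol{X}-\mathrm{E}\left(\boldsymbol{X}\right)\right]^{\intercal}\right]\\
= & \mathrm{E}\begin{bmatrix}\left(X_{1}-\mathrm{E}\left[X_{1}\right]\right)^{2} & \cdots & \left(X_{1}-\mathrm{E}\left[X_{1}\right]\right)\left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)\\ \vdots & \ddots & \vdots\\ \left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)\left(X_{1}-\mathrm{E}\left[X_{1}\right]\right) & \cdots & \left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)^{2} \end{bmatrix}\\
= & \begin{bmatrix}\mathrm{E}\left[\left(X_{1}-\mathrm{E}\left[X_{1}\right]\right)^{2}\right] & \cdots & \mathrm{E}\left[\left(X_{1}-\mathrm{E}\left[X_{1}\right]\right)\left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)\right]\\ \vdots & \ddots & \vdots\\ \mathrm{E}\left[\left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)\left(X_{1}-\mathrm{E}\left[X_{1}\right]\right)\right] & \cdots & \mathrm{E}\left[\left(X_{p}-\mathrm{E}\left[X_{p}\right]\right)^{2}\right] \end{bmatrix}\\
= & \begin{bmatrix}\mathrm{V}\left(X_{1},X_{1}\right) & \cdots & \mathrm{V}\left(X_{1},X_{p}\right)\\ \vdots & \ddots & \vdots\\ \mathrm{V}\left(X_{p},X_{1}\right) & \cdots & \mathrm{V}\left(X_{p},X_{p}\right) \end{bmatrix}=\begin{bmatrix}\mathrm{V}\left(X_{1},X_{1}\right) & \mathrm{V}\left(X_{1},X_{2}\right) & \cdots & \mathrm{V}\left(X_{1},X_{p}\right)\\ \mathrm{V}\left(X_{2},X_{1}\right) & \mathrm{V}\left(X_{2},X_{2}\right) & \cdots & \mathrm{V}\left(X_{2},X_{p}\right)\\ \vdots & \vdots & \ddots & \vdots\\ \mathrm{V}\left(X_{p},X_{1}\right) & \mathrm{V}\left(X_{p},X_{2}\right) & \cdots & \mathrm{V}\left(X_{p},X_{p}\right) \end{bmatrix}\\
= & \begin{bmatrix}\mathrm{V}\left(X_{1}\right) & \cdots & \mathrm{V}\left(X_{1},X_{p}\right)\\ \vdots & \ddots & \vdots\\ \mathrm{V}\left(X_{p},X_{1}\right) & \cdots & \mathrm{V}\left(X_{p}\right) \end{bmatrix}=\begin{bmatrix}\mathrm{V}\left(X_{1}\right) & \mathrm{V}\left(X_{1},X_{2}\right) & \cdots & \mathrm{V}\left(X_{1},X_{p}\right)\\ \mathrm{V}\left(X_{2},X_{1}\right) & \mathrm{V}\left(X_{2}\right) & \cdots & \mathrm{V}\left(X_{2},X_{p}\right)\\ \vdots & \vdots & \ddots & \vdots\\ \mathrm{V}\left(X_{p},X_{1}\right) & \mathrm{V}\left(X_{p},X_{2}\right) & \cdots & \mathrm{V}\left(X_{p}\right) \end{bmatrix}\\
= & \begin{bmatrix}\mathrm{V}\left(X_{1}\right) & \cdots & \mathrm{C}\left(X_{1},X_{p}\right)\\ \vdots & \ddots & \vdots\\ \mathrm{C}\left(X_{p},X_{1}\right) & \cdots & \mathrm{V}\left(X_{p}\right) \end{bmatrix}=\begin{bmatrix}\mathrm{V}\left(X_{1}\right) & \mathrm{C}\left(X_{1},X_{2}\right) & \cdots & \mathrm{C}\left(X_{1},X_{p}\right)\\ \mathrm{C}\left(X_{2},X_{1}\right) & \mathrm{V}\left(X_{2}\right) & \cdots & \mathrm{C}\left(X_{2},X_{p}\right)\\ \vdots & \vdots & \ddots & \vdots\\ \mathrm{C}\left(X_{p},X_{1}\right) & \mathrm{C}\left(X_{p},X_{2}\right) & \cdots & \mathrm{V}\left(X_{p}\right) \end{bmatrix}\\
= & \begin{bmatrix}\sigma_{1}^{2} & \cdots & \sigma_{1p}\\ \vdots & \ddots & \vdots\\ \sigma_{p1} & \cdots & \sigma_{p}^{2} \end{bmatrix}=\begin{bmatrix}\sigma_{1}^{2} & \sigma_{12} & \cdots & \sigma_{1p}\\ \sigma_{21} & \sigma_{2}^{2} & \cdots & \sigma_{2p}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{p}^{2} \end{bmatrix}\\
= & \begin{bmatrix}\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p}\\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix}=\left[\sigma_{ij}\right]_{p\times p}=\boldsymbol{\Sigma}
\end{aligned}
$$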
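To make the outer-product form of the covariance matrix concrete, the sketch below replaces the expectation with an equal-weight average over a small finite data set. That replacement, the function names, and the data are assumptions made for illustration; the estimate divides by $n$, not $n-1$.

```python
def mean_vector(samples):
    """Componentwise average, standing in for E[X]."""
    n, p = len(samples), len(samples[0])
    return [sum(s[i] for s in samples) / n for i in range(p)]

def covariance_matrix(samples):
    """Average of [x - mean][x - mean]^T over the samples (divides by n)."""
    n, p = len(samples), len(samples[0])
    m = mean_vector(samples)
    cov = [[0.0] * p for _ in range(p)]
    for s in samples:
        d = [s[i] - m[i] for i in range(p)]      # centred vector
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / n     # entry (i, j) of the outer product
    return cov

# Illustrative data (assumption): three observations of a 2-dimensional vector.
data = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
sigma = covariance_matrix(data)
print(sigma)
```

The diagonal entries are the variances $\sigma_{ii}=\sigma_{i}^{2}$ and the off-diagonal entries are the covariances $\sigma_{ij}=\sigma_{ji}$, so the result is symmetric.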

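A random vector $\boldsymbol{X}$ with an arbitrary distribution $\mathcal{D}$ is summarized by its mean vector and covariance matrix:

$$
\boldsymbol{X}\sim\mathcal{D}\left(\boldsymbol{\mu},\boldsymbol{\Sigma}\right)=\mathrm{d}\left(\boldsymbol{\mu}_{\boldsymbol{X}},\boldsymbol{\Sigma}_{\boldsymbol{X}}\right)=\mathrm{d}\left(\mathrm{E}\left[\boldsymbol{X}\right],\mathrm{C}\left[\boldsymbol{X}\right]\right)=\mathrm{d}\left(\mathrm{E}\left[\boldsymbol{X}\right],\mathrm{V}\left[\boldsymbol{X}\right]\right)
$$

For the standard multivariate normal vector $\boldsymbol{Z}$,

$$
\boldsymbol{Z}\sim\mathcal{N}\left(\boldsymbol{\mu}_{\boldsymbol{Z}},\boldsymbol{\Sigma}_{\boldsymbol{Z}}\right)=\mathrm{n}\left(\mathrm{E}\left[\boldsymbol{Z}\right],\mathrm{V}\left[\boldsymbol{Z}\right]\right)
$$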

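$$
\begin{aligned}
 & \mathrm{E}\left[\boldsymbol{Z}\right]=\begin{bmatrix}\mathrm{E}\left[Z_{1}\right]\\ \mathrm{E}\left[Z_{2}\right]\\ \vdots\\ \mathrm{E}\left[Z_{p}\right] \end{bmatrix}=\left[\mathrm{E}\left[Z_{i}\right]\right]_{p\times1}\\
 & \mathrm{E}\left[Z_{i}\right]=\int_{-\infty}^{\infty}z_{i}f_{Z_{i}}\left(z_{i}\right)\mathrm{d}z_{i}=\int_{-\infty}^{\infty}z_{i}\dfrac{e^{-\frac{1}{2}z_{i}^{2}}}{\sqrt{2\pi}}\mathrm{d}z_{i}=0\\
\Rightarrow\  & \mathrm{E}\left[\boldsymbol{Z}\right]=\boldsymbol{0}\\
\Rightarrow\  & \boldsymbol{Z}\sim\mathcal{N}\left(\boldsymbol{\mu}_{\boldsymbol{Z}}=\boldsymbol{0},\boldsymbol{\Sigma}_{\boldsymbol{Z}}\right)=\mathrm{n}\left(\boldsymbol{0},\mathrm{V}\left[\boldsymbol{Z}\right]\right)
\end{aligned}
$$

$$
\mathrm{V}\left(Z_{i}\right)=\int_{-\infty}^{\infty}\left(z_{i}-\mu_{Z_{i}}\right)^{2}f_{Z_{i}}\left(z_{i}\right)\mathrm{d}z_{i}=\int_{-\infty}^{\infty}\left(z_{i}-0\right)^{2}\dfrac{e^{-\frac{1}{2}z_{i}^{2}}}{\sqrt{2\pi}}\mathrm{d}z_{i}=1
$$

$$
\mathrm{V}\left(Z_{i},Z_{j}\right)\overset{i\ne j,\ Z_{i},Z_{j}\text{ are independent}}{=}0
$$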
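The three integrals used above (total probability 1, mean 0, variance 1) can be approximated numerically. The following sketch uses a composite trapezoidal rule on the truncated range $[-10,10]$; the truncation, the step count, and the function names are assumptions chosen for illustration.

```python
import math

def std_normal_pdf(z):
    """Univariate standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def integrate(g, a=-10.0, b=10.0, n=20000):
    """Composite trapezoidal rule for the integral of g over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for k in range(1, n):
        total += g(a + k * h)
    return total * h

mass = integrate(std_normal_pdf)                                   # approx. 1
mean = integrate(lambda z: z * std_normal_pdf(z))                  # approx. 0
var = integrate(lambda z: (z - mean) ** 2 * std_normal_pdf(z))     # approx. 1
print(mass, mean, var)
```

The density beyond $\pm10$ is negligibly small, so truncating the infinite range does not visibly affect the results.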

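$$
\begin{aligned}
\mathrm{V}\left[\boldsymbol{Z}\right]= & \begin{bmatrix}\mathrm{V}\left(Z_{1}\right) & \mathrm{V}\left(Z_{1},Z_{2}\right) & \cdots & \mathrm{V}\left(Z_{1},Z_{p}\right)\\ \mathrm{V}\left(Z_{2},Z_{1}\right) & \mathrm{V}\left(Z_{2}\right) & \cdots & \mathrm{V}\left(Z_{2},Z_{p}\right)\\ \vdots & \vdots & \ddots & \vdots\\ \mathrm{V}\left(Z_{p},Z_{1}\right) & \mathrm{V}\left(Z_{p},Z_{2}\right) & \cdots & \mathrm{V}\left(Z_{p}\right) \end{bmatrix}=\begin{bmatrix}\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1p}\\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2p}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{pp} \end{bmatrix}\\
= & \begin{bmatrix}\sigma_{1}^{2} & \sigma_{12} & \cdots & \sigma_{1p}\\ \sigma_{21} & \sigma_{2}^{2} & \cdots & \sigma_{2p}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{p1} & \sigma_{p2} & \cdots & \sigma_{p}^{2} \end{bmatrix}=\begin{bmatrix}1 & 0 & \cdots & 0\\ 0 & 1 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{bmatrix}=\boldsymbol{I}_{p\times p}=\boldsymbol{I}_{p}=\boldsymbol{I}
\end{aligned}
$$

$$
\boldsymbol{Z}\sim\mathcal{N}\left(\boldsymbol{\mu}_{\boldsymbol{Z}},\boldsymbol{\Sigma}_{\boldsymbol{Z}}\right)=\mathrm{n}\left(\mathrm{E}\left[\boldsymbol{Z}\right],\mathrm{V}\left[\boldsymbol{Z}\right]\right)=\mathcal{N}\left(\boldsymbol{0},\boldsymbol{I}\right),\quad\begin{cases}
\boldsymbol{\mu}_{\boldsymbol{Z}}=\mathrm{E}\left[\boldsymbol{Z}\right]=\boldsymbol{0}=\left[0\right]_{p}=\left[0\right]_{p\times1}\\
\boldsymbol{\Sigma}_{\boldsymbol{Z}}=\mathrm{V}\left[\boldsymbol{Z}\right]=\boldsymbol{I}=\boldsymbol{I}_{p}=\boldsymbol{I}_{p\times p}
\end{cases}
$$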

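$$
\begin{aligned}
\boldsymbol{Z}= & \begin{bmatrix}Z_{1}\\ Z_{2}\\ \vdots\\ Z_{p} \end{bmatrix}=\begin{bmatrix}\dfrac{X_{1}-\mu_{1}}{\sigma_{1}}\\ \dfrac{X_{2}-\mu_{2}}{\sigma_{2}}\\ \vdots\\ \dfrac{X_{p}-\mu_{p}}{\sigma_{p}} \end{bmatrix}=\begin{bmatrix}\dfrac{1}{\sigma_{1}} & 0 & \cdots & 0\\ 0 & \dfrac{1}{\sigma_{2}} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \dfrac{1}{\sigma_{p}} \end{bmatrix}\begin{bmatrix}X_{1}-\mu_{1}\\ X_{2}-\mu_{2}\\ \vdots\\ X_{p}-\mu_{p} \end{bmatrix}\\
= & \begin{bmatrix}\dfrac{1}{\sigma_{1}} & 0 & \cdots & 0\\ 0 & \dfrac{1}{\sigma_{2}} & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \dfrac{1}{\sigma_{p}} \end{bmatrix}\left(\begin{bmatrix}X_{1}\\ X_{2}\\ \vdots\\ X_{p} \end{bmatrix}-\begin{bmatrix}\mu_{1}\\ \mu_{2}\\ \vdots\\ \mu_{p} \end{bmatrix}\right)=\boldsymbol{B}^{-1}\left(\boldsymbol{X}-\boldsymbol{\mu}\right)\\
\Leftrightarrow\boldsymbol{X}= & \boldsymbol{B}\boldsymbol{Z}+\boldsymbol{\mu}
\end{aligned}
$$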

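This is a linear (affine) transformation $T$ of $\boldsymbol{Z}$:

$$
\boldsymbol{X}=\boldsymbol{B}\boldsymbol{Z}+\boldsymbol{\mu}=T\left(\boldsymbol{Z}\right)
$$

The componentwise standardization above gives a diagonal $\boldsymbol{B}$, but the rest of the derivation uses only that $\boldsymbol{B}$ is an invertible $p\times p$ matrix, so it applies to any such $\boldsymbol{B}$.

By $\mathrm{V}\left[\boldsymbol{A}\boldsymbol{X}+\boldsymbol{b}\right]=\boldsymbol{A}\mathrm{V}\left[\boldsymbol{X}\right]\boldsymbol{A}^{\intercal}$,

$$
\boldsymbol{\Sigma}=\boldsymbol{\Sigma}_{\boldsymbol{X}}=\mathrm{V}\left[\boldsymbol{X}\right]=\mathrm{V}\left[\boldsymbol{B}\boldsymbol{Z}+\boldsymbol{\mu}\right]=\boldsymbol{B}\mathrm{V}\left[\boldsymbol{Z}\right]\boldsymbol{B}^{\intercal}=\boldsymbol{B}\boldsymbol{I}\boldsymbol{B}^{\intercal}=\boldsymbol{B}\boldsymbol{B}^{\intercal}
$$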
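The identity $\boldsymbol{\Sigma}=\boldsymbol{B}\boldsymbol{B}^{\intercal}$ can be illustrated by simulation: draw independent standard normal components, apply $\boldsymbol{X}=\boldsymbol{B}\boldsymbol{Z}+\boldsymbol{\mu}$, and compare the sample covariance with $\boldsymbol{B}\boldsymbol{B}^{\intercal}$. This is a rough sketch; the particular $\boldsymbol{B}$ (invertible and deliberately non-diagonal), $\boldsymbol{\mu}$, seed, sample size, and function names are all assumptions.

```python
import random

def b_bt(B):
    """Return B B^T for a square matrix B given as nested lists."""
    p = len(B)
    return [[sum(B[i][k] * B[j][k] for k in range(p)) for j in range(p)]
            for i in range(p)]

def sample_x(B, mu, rng):
    """Draw one X = B Z + mu, with Z a vector of independent N(0, 1) draws."""
    p = len(mu)
    z = [rng.gauss(0.0, 1.0) for _ in range(p)]
    return [sum(B[i][k] * z[k] for k in range(p)) + mu[i] for i in range(p)]

def sample_covariance(samples):
    """Average of outer products of centred samples (divides by n)."""
    n, p = len(samples), len(samples[0])
    m = [sum(s[i] for s in samples) / n for i in range(p)]
    return [[sum((s[i] - m[i]) * (s[j] - m[j]) for s in samples) / n
             for j in range(p)] for i in range(p)]

B = [[2.0, 0.0], [1.0, 1.0]]   # hypothetical invertible, non-diagonal B
mu = [1.0, -1.0]               # hypothetical mean vector
rng = random.Random(0)         # fixed seed for repeatability
xs = [sample_x(B, mu, rng) for _ in range(20000)]

print(b_bt(B))                 # exact Sigma = B B^T
print(sample_covariance(xs))   # close to Sigma, up to sampling noise
```

The agreement is only approximate because the sample covariance is a random estimate; it tightens as the number of draws grows.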

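Consider two infinitesimal $p$-dimensional parallelepiped volumes, one in each of the two $\mathbb{R}^{p}$ spaces [16]:

$$
V_{\boldsymbol{x}}=\left[x_{1},x_{1}+\mathrm{d}x_{1}\right]\times\left[x_{2},x_{2}+\mathrm{d}x_{2}\right]\times\cdots\times\left[x_{p},x_{p}+\mathrm{d}x_{p}\right]
$$

and

$$
V_{\boldsymbol{z}}=\left[z_{1},z_{1}+\mathrm{d}z_{1}\right]\times\left[z_{2},z_{2}+\mathrm{d}z_{2}\right]\times\cdots\times\left[z_{p},z_{p}+\mathrm{d}z_{p}\right]
$$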

Their relationship under linear transformation is

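$$
V_{\boldsymbol{x}}=T\left(V_{\boldsymbol{z}}\right)=\left[T\left(z_{1}\right),T\left(z_{1}\right)+T\left(\mathrm{d}z_{1}\right)\right]\times\left[T\left(z_{2}\right),T\left(z_{2}\right)+T\left(\mathrm{d}z_{2}\right)\right]\times\cdots\times\left[T\left(z_{p}\right),T\left(z_{p}\right)+T\left(\mathrm{d}z_{p}\right)\right]
$$

and

$$
\mathrm{d}x_{i}=\sum_{j}\dfrac{\partial x_{i}}{\partial z_{j}}\mathrm{d}z_{j}
$$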

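For example, in two dimensions,

$$
\begin{bmatrix}\mathrm{d}x_{1}\\ \mathrm{d}x_{2} \end{bmatrix}=\begin{bmatrix}\dfrac{\partial x_{1}}{\partial z_{1}} & \dfrac{\partial x_{1}}{\partial z_{2}}\\ \dfrac{\partial x_{2}}{\partial z_{1}} & \dfrac{\partial x_{2}}{\partial z_{2}} \end{bmatrix}\begin{bmatrix}\mathrm{d}z_{1}\\ \mathrm{d}z_{2} \end{bmatrix}
$$

The two infinitesimal vectors along the coordinate directions of the $\boldsymbol{Z}$ space are transformed into the $\boldsymbol{X}$ space as

$$
T\left(\mathrm{d}z_{1}\right)=\begin{bmatrix}\dfrac{\partial x_{1}}{\partial z_{1}} & \dfrac{\partial x_{1}}{\partial z_{2}}\\ \dfrac{\partial x_{2}}{\partial z_{1}} & \dfrac{\partial x_{2}}{\partial z_{2}} \end{bmatrix}\begin{bmatrix}\mathrm{d}z_{1}\\ 0 \end{bmatrix}=\begin{bmatrix}\dfrac{\partial x_{1}}{\partial z_{1}}\mathrm{d}z_{1}\\ \dfrac{\partial x_{2}}{\partial z_{1}}\mathrm{d}z_{1} \end{bmatrix}
$$

and

$$
T\left(\mathrm{d}z_{2}\right)=\begin{bmatrix}\dfrac{\partial x_{1}}{\partial z_{1}} & \dfrac{\partial x_{1}}{\partial z_{2}}\\ \dfrac{\partial x_{2}}{\partial z_{1}} & \dfrac{\partial x_{2}}{\partial z_{2}} \end{bmatrix}\begin{bmatrix}0\\ \mathrm{d}z_{2} \end{bmatrix}=\begin{bmatrix}\dfrac{\partial x_{1}}{\partial z_{2}}\mathrm{d}z_{2}\\ \dfrac{\partial x_{2}}{\partial z_{2}}\mathrm{d}z_{2} \end{bmatrix}
$$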

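The corresponding area (volume) in the space of $\boldsymbol{X}$ is

$$
\begin{aligned}
\int_{A_{\boldsymbol{x}}}\mathrm{d}A_{\boldsymbol{x}}= & \int_{A_{\boldsymbol{x}}}\mathrm{d}x_{1}\mathrm{d}x_{2}=\int_{T\left(A_{\boldsymbol{z}}\right)}\mathrm{d}x_{1}\mathrm{d}x_{2}=\int_{A_{\boldsymbol{z}}}\left|\begin{bmatrix}T\left(\mathrm{d}z_{1}\right) & T\left(\mathrm{d}z_{2}\right)\end{bmatrix}\right|\\
= & \int_{A_{\boldsymbol{z}}}\begin{vmatrix}\dfrac{\partial x_{1}}{\partial z_{1}}\mathrm{d}z_{1} & \dfrac{\partial x_{1}}{\partial z_{2}}\mathrm{d}z_{2}\\ \dfrac{\partial x_{2}}{\partial z_{1}}\mathrm{d}z_{1} & \dfrac{\partial x_{2}}{\partial z_{2}}\mathrm{d}z_{2} \end{vmatrix}=\int_{A_{\boldsymbol{z}}}\begin{vmatrix}\dfrac{\partial x_{1}}{\partial z_{1}} & \dfrac{\partial x_{1}}{\partial z_{2}}\\ \dfrac{\partial x_{2}}{\partial z_{1}} & \dfrac{\partial x_{2}}{\partial z_{2}} \end{vmatrix}\mathrm{d}z_{1}\mathrm{d}z_{2}=\int_{A_{\boldsymbol{z}}}\left|\boldsymbol{J}\right|\mathrm{d}A_{\boldsymbol{z}}
\end{aligned}
$$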

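Generalizing to volumes in $p$ dimensions,

$$
\begin{aligned}
\int_{V_{\boldsymbol{x}}}\mathrm{d}V_{\boldsymbol{x}}= & \int_{V_{\boldsymbol{x}}}\mathrm{d}x_{1}\mathrm{d}x_{2}\cdots\mathrm{d}x_{p}=\int_{T\left(V_{\boldsymbol{z}}\right)}\mathrm{d}x_{1}\mathrm{d}x_{2}\cdots\mathrm{d}x_{p}=\int_{V_{\boldsymbol{z}}}\left|\begin{bmatrix}T\left(\mathrm{d}z_{1}\right) & T\left(\mathrm{d}z_{2}\right) & \cdots & T\left(\mathrm{d}z_{p}\right)\end{bmatrix}\right|\\
= & \int_{V_{\boldsymbol{z}}}\left|\left[\dfrac{\partial x_{i}}{\partial z_{j}}\mathrm{d}z_{j}\right]_{p\times p}\right|=\int_{V_{\boldsymbol{z}}}\left|\left[\dfrac{\partial x_{i}}{\partial z_{j}}\right]_{p\times p}\right|\mathrm{d}z_{1}\mathrm{d}z_{2}\cdots\mathrm{d}z_{p}=\int_{V_{\boldsymbol{z}}}\left|\boldsymbol{J}\right|\mathrm{d}V_{\boldsymbol{z}}
\end{aligned}
$$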

i.e.

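$$
\int_{V_{\boldsymbol{x}}}\mathrm{d}V_{\boldsymbol{x}}=\int_{V_{\boldsymbol{z}}}\left|\boldsymbol{J}\right|\mathrm{d}V_{\boldsymbol{z}}
$$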

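where $\boldsymbol{J}$ is the Jacobian matrix

$$
\boldsymbol{J}=\left[\dfrac{\partial x_{i}}{\partial z_{j}}\right]_{p\times p}=\dfrac{\partial\boldsymbol{x}}{\partial\boldsymbol{z}}
$$

and $\left|\boldsymbol{J}\right|$ is the Jacobian determinant, or simply the Jacobian,

$$
\left|\boldsymbol{J}\right|=\left|\dfrac{\partial x_{i}}{\partial z_{j}}\right|_{p\times p}=\left|\dfrac{\partial\boldsymbol{x}}{\partial\boldsymbol{z}}\right|
$$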
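For a linear map the Jacobian is constant, so the scaling of area can be seen directly on a finite shape. The sketch below maps the unit square through a hypothetical 2-by-2 matrix `B` and offset `mu` (both assumptions, as are the function names) and compares the area of the image parallelogram, computed with the shoelace formula, against the absolute value of the determinant.

```python
def det2(B):
    """Determinant of a 2-by-2 matrix."""
    return B[0][0] * B[1][1] - B[0][1] * B[1][0]

def transform(B, mu, z):
    """Apply x = B z + mu to a 2-dimensional point z."""
    return (B[0][0] * z[0] + B[0][1] * z[1] + mu[0],
            B[1][0] * z[0] + B[1][1] * z[1] + mu[1])

def shoelace_area(points):
    """Area of a simple polygon from its vertices listed in order."""
    n = len(points)
    s = 0.0
    for k in range(n):
        x1, y1 = points[k]
        x2, y2 = points[(k + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

B = [[2.0, 1.0], [0.5, 3.0]]                  # hypothetical Jacobian matrix
mu = (1.0, -1.0)                              # hypothetical shift; it does not change area
unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]  # area 1 in z-space
image = [transform(B, mu, z) for z in unit_square]

print(shoelace_area(image), abs(det2(B)))     # the two areas should match
```

Because the unit square has area 1, the area of its image equals the scaling factor itself.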

The probability of the same event should be invariant under transformation.

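$$
\int_{V_{\boldsymbol{x}}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)\mathrm{d}V_{\boldsymbol{x}}=\int_{V_{\boldsymbol{x}}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)\mathrm{d}x_{1}\mathrm{d}x_{2}\cdots\mathrm{d}x_{p}=\int_{T\left(V_{\boldsymbol{z}}\right)}f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)\mathrm{d}x_{1}\mathrm{d}x_{2}\cdots\mathrm{d}x_{p}=\int_{V_{\boldsymbol{z}}}f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)\mathrm{d}V_{\boldsymbol{z}}=\int_{V_{\boldsymbol{z}}}f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)\mathrm{d}z_{1}\mathrm{d}z_{2}\cdots\mathrm{d}z_{p}
$$

i.e.

$$
\int_{V_{\boldsymbol{x}}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)\mathrm{d}V_{\boldsymbol{x}}=\int_{V_{\boldsymbol{z}}}f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)\mathrm{d}V_{\boldsymbol{z}}
$$

Together with the volume relation, this gives the pair of conditions

$$
\begin{cases}
\displaystyle\int_{V_{\boldsymbol{x}}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)\mathrm{d}V_{\boldsymbol{x}}=\int_{V_{\boldsymbol{z}}}f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)\mathrm{d}V_{\boldsymbol{z}} & \text{(probability invariance)}\\
\displaystyle\int_{V_{\boldsymbol{x}}}\mathrm{d}V_{\boldsymbol{x}}=\int_{V_{\boldsymbol{z}}}\left|\boldsymbol{J}\right|\mathrm{d}V_{\boldsymbol{z}} & \text{(volume change)}
\end{cases}
$$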

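$$
\begin{aligned}
 & \boldsymbol{Z}=\boldsymbol{B}^{-1}\left(\boldsymbol{X}-\boldsymbol{\mu}\right)\Leftrightarrow\boldsymbol{z}=\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\\
 & \boldsymbol{X}=\boldsymbol{B}\boldsymbol{Z}+\boldsymbol{\mu}\Leftrightarrow\boldsymbol{x}=\boldsymbol{B}\boldsymbol{z}+\boldsymbol{\mu}\\
\Rightarrow\  & \boldsymbol{J}=\left[\dfrac{\partial x_{i}}{\partial z_{j}}\right]_{p\times p}=\dfrac{\partial\boldsymbol{x}}{\partial\boldsymbol{z}}=\boldsymbol{B}\\
\Rightarrow\  & \left|\boldsymbol{J}\right|=\left|\dfrac{\partial x_{i}}{\partial z_{j}}\right|_{p\times p}=\left|\dfrac{\partial\boldsymbol{x}}{\partial\boldsymbol{z}}\right|=\left|\boldsymbol{B}\right|
\end{aligned}
$$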

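$$
\begin{aligned}
 & \int_{V_{\boldsymbol{z}}}f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)\mathrm{d}V_{\boldsymbol{z}}\overset{\text{probability invariance}}{=}\int_{V_{\boldsymbol{x}}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)\mathrm{d}V_{\boldsymbol{x}}=\int_{V_{\boldsymbol{x}}}\mathrm{d}V_{\boldsymbol{x}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)\\
 & \qquad\overset{\text{volume change}}{=}\int_{V_{\boldsymbol{z}}}\left|\boldsymbol{J}\right|\mathrm{d}V_{\boldsymbol{z}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\left(\boldsymbol{z}\right)\right)=\int_{V_{\boldsymbol{z}}}f_{\boldsymbol{X}}\left(\boldsymbol{x}\left(\boldsymbol{z}\right)\right)\left|\boldsymbol{J}\right|\mathrm{d}V_{\boldsymbol{z}}\\
\Rightarrow\  & f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)=f_{\boldsymbol{X}}\left(\boldsymbol{x}\left(\boldsymbol{z}\right)\right)\left|\boldsymbol{J}\right|\\
\Rightarrow\  & f_{\boldsymbol{X}}\left(\boldsymbol{x}\left(\boldsymbol{z}\right)\right)=\left|\boldsymbol{J}\right|^{-1}f_{\boldsymbol{Z}}\left(\boldsymbol{z}\right)\overset{(59.1)}{=}\left|\boldsymbol{J}\right|^{-1}\dfrac{1}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{\boldsymbol{z}^{\intercal}\boldsymbol{z}}{2}\right\} \\
\Rightarrow\  & f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)=\left|\boldsymbol{J}\right|^{-1}f_{\boldsymbol{Z}}\left(\boldsymbol{z}\left(\boldsymbol{x}\right)\right)\overset{\boldsymbol{z}=\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right),\ \left|\boldsymbol{J}\right|=\left|\boldsymbol{B}\right|}{=}\left|\boldsymbol{B}\right|^{-1}f_{\boldsymbol{Z}}\left(\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right)\\
 & =\dfrac{\left|\boldsymbol{B}\right|^{-1}}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{1}{2}\left[\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right]^{\intercal}\left[\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right]\right\} \\
 & =\dfrac{\left|\boldsymbol{B}\right|^{-1/2}\left|\boldsymbol{B}^{\intercal}\right|^{-1/2}}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{1}{2}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)^{\intercal}\left(\boldsymbol{B}^{-1}\right)^{\intercal}\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right\} \\
 & =\dfrac{\left|\boldsymbol{B}\right|^{-1/2}\left|\boldsymbol{B}^{\intercal}\right|^{-1/2}}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{1}{2}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)^{\intercal}\left(\boldsymbol{B}^{\intercal}\right)^{-1}\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right\} \\
 & =\dfrac{\left|\boldsymbol{B}\boldsymbol{B}^{\intercal}\right|^{-1/2}}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{1}{2}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)^{\intercal}\left(\boldsymbol{B}\boldsymbol{B}^{\intercal}\right)^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right\} \\
 & \overset{\boldsymbol{\Sigma}=\boldsymbol{B}\boldsymbol{B}^{\intercal}}{=}\dfrac{\left|\boldsymbol{\Sigma}\right|^{-1/2}}{\left(2\pi\right)^{p/2}}\exp\left\{ -\dfrac{1}{2}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)^{\intercal}\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right\} \\
 & =\left(\left|\boldsymbol{\Sigma}\right|\left(2\pi\right)^{p}\right)^{-1/2}\exp\left\{ -\dfrac{1}{2}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)^{\intercal}\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right\} 
\end{aligned}
$$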

Definition 18.2 probability density function (PDF) of multivariate normal distribution (= multivariate Gaussian distribution)

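$$
\mathcal{N}\left(\boldsymbol{x}\middle|\boldsymbol{\mu},\boldsymbol{\Sigma}\right)=f_{\boldsymbol{X}}\left(\boldsymbol{x}\right)=\left(\left|\boldsymbol{\Sigma}\right|\left(2\pi\right)^{p}\right)^{-1/2}\exp\left\{ -\dfrac{1}{2}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)^{\intercal}\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right\} 
$$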
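One way to evaluate this density without forming $\boldsymbol{\Sigma}^{-1}$ explicitly is to pick a particular $\boldsymbol{B}$ with $\boldsymbol{B}\boldsymbol{B}^{\intercal}=\boldsymbol{\Sigma}$, namely the lower-triangular Cholesky factor. The sketch below takes that route; the choice of the Cholesky factor is an assumption of this example (the derivation above does not prescribe a specific $\boldsymbol{B}$), as are the function names, and $\boldsymbol{\Sigma}$ is assumed symmetric positive definite.

```python
import math

def cholesky(S):
    """Lower-triangular L with L L^T = S (S symmetric positive definite)."""
    p = len(S)
    L = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def mvn_pdf(x, mu, S):
    """Density N(x | mu, S) of the multivariate normal distribution."""
    p = len(x)
    L = cholesky(S)
    d = [x[i] - mu[i] for i in range(p)]
    # Forward substitution solves L y = x - mu, i.e. y = L^{-1}(x - mu).
    y = [0.0] * p
    for i in range(p):
        y[i] = (d[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    quad = sum(v * v for v in y)                        # (x-mu)^T S^{-1} (x-mu)
    det = math.prod(L[i][i] for i in range(p)) ** 2     # |S| = |L|^2
    return math.exp(-0.5 * quad) / math.sqrt(det * (2.0 * math.pi) ** p)

S = [[4.0, 2.0], [2.0, 2.0]]   # illustrative covariance matrix
mu = [1.0, -1.0]               # illustrative mean vector
print(mvn_pdf([1.5, 0.0], mu, S))
```

Here `y` plays the role of $\boldsymbol{z}=\boldsymbol{B}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)$, so the quadratic form reduces to $\boldsymbol{z}^{\intercal}\boldsymbol{z}$ exactly as in the derivation.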

Definition 18.3 correlation coefficient

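$$
\rho_{ij}=\dfrac{\sigma_{ij}}{\sqrt{\sigma_{ii}\sigma_{jj}}}=\dfrac{\sigma_{ij}}{\sqrt{\sigma_{i}^{2}\sigma_{j}^{2}}}=\dfrac{\sigma_{ij}}{\sigma_{i}\sigma_{j}}=\dfrac{\mathrm{V}\left(X_{i},X_{j}\right)}{\sqrt{\mathrm{V}\left(X_{i}\right)\mathrm{V}\left(X_{j}\right)}}=\mathrm{R}\left(X_{i},X_{j}\right)
$$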
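The definition translates directly into code: divide each covariance by the product of the two standard deviations. The following is a small sketch with an illustrative covariance matrix; the function name and the numbers are assumptions.

```python
import math

def correlation_matrix(S):
    """Matrix of rho_ij = sigma_ij / (sigma_i * sigma_j) from a covariance matrix S."""
    p = len(S)
    sd = [math.sqrt(S[i][i]) for i in range(p)]   # standard deviations sigma_i
    return [[S[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]

S = [[4.0, 2.0], [2.0, 2.0]]   # illustrative covariance matrix
R = correlation_matrix(S)
print(R)
```

The diagonal of the result is 1 (up to rounding), since every variable is perfectly correlated with itself.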

59.1 bivariate normal distribution

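The case $p=2$ is the bivariate normal distribution.

$$
\boldsymbol{\Sigma}=\left[\sigma_{ij}\right]_{2\times2}=\begin{bmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22} \end{bmatrix}=\begin{bmatrix}\sigma_{1}^{2} & \sigma_{12}\\ \sigma_{21} & \sigma_{2}^{2} \end{bmatrix}=\begin{bmatrix}\sigma_{1}^{2} & \sigma_{1}\sigma_{2}\rho_{12}\\ \sigma_{2}\sigma_{1}\rho_{21} & \sigma_{2}^{2} \end{bmatrix}=\begin{bmatrix}\sigma_{1}^{2} & \sigma_{1}\sigma_{2}\rho\\ \sigma_{2}\sigma_{1}\rho & \sigma_{2}^{2} \end{bmatrix}
$$

where, by the symmetry of $\boldsymbol{\Sigma}$,

$$
\rho_{12}=\rho=\rho_{21}
$$

$$
\left|\boldsymbol{\Sigma}\right|=\begin{vmatrix}\sigma_{1}^{2} & \sigma_{1}\sigma_{2}\rho_{12}\\ \sigma_{2}\sigma_{1}\rho_{21} & \sigma_{2}^{2} \end{vmatrix}=\sigma_{1}^{2}\sigma_{2}^{2}\left(1-\rho_{12}\rho_{21}\right)=\sigma_{1}^{2}\sigma_{2}^{2}\left(1-\rho^{2}\right)
$$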

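Using the inverse of a $2\times2$ matrix,

$$
\begin{bmatrix}a & b\\ c & d \end{bmatrix}^{-1}=\dfrac{1}{\begin{vmatrix}a & b\\ c & d \end{vmatrix}}\begin{bmatrix}d & -b\\ -c & a \end{bmatrix}
$$

$$
\boldsymbol{\Sigma}^{-1}=\dfrac{1}{\left|\boldsymbol{\Sigma}\right|}\begin{bmatrix}\sigma_{2}^{2} & -\sigma_{1}\sigma_{2}\rho\\ -\sigma_{2}\sigma_{1}\rho & \sigma_{1}^{2} \end{bmatrix}=\dfrac{1}{\sigma_{1}^{2}\sigma_{2}^{2}\left(1-\rho^{2}\right)}\begin{bmatrix}\sigma_{2}^{2} & -\sigma_{1}\sigma_{2}\rho\\ -\sigma_{2}\sigma_{1}\rho & \sigma_{1}^{2} \end{bmatrix}=\dfrac{1}{\left(1-\rho^{2}\right)}\begin{bmatrix}\dfrac{1}{\sigma_{1}^{2}} & \dfrac{-\rho}{\sigma_{1}\sigma_{2}}\\ \dfrac{-\rho}{\sigma_{2}\sigma_{1}} & \dfrac{1}{\sigma_{2}^{2}} \end{bmatrix}
$$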
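The closed-form inverse can be checked by multiplying it with $\boldsymbol{\Sigma}$ and confirming that the product is the identity. The sketch below does this for one illustrative parameter choice; the function names and the values of $\sigma_{1}$, $\sigma_{2}$, $\rho$ are assumptions, with $\left|\rho\right|<1$ required for the inverse to exist.

```python
def bivariate_cov(s1, s2, rho):
    """Covariance matrix of a bivariate normal from sigma_1, sigma_2, rho."""
    return [[s1 * s1, s1 * s2 * rho],
            [s2 * s1 * rho, s2 * s2]]

def bivariate_cov_inverse(s1, s2, rho):
    """Closed-form inverse of the bivariate covariance matrix (needs |rho| < 1)."""
    c = 1.0 / (1.0 - rho * rho)
    return [[c / (s1 * s1), -c * rho / (s1 * s2)],
            [-c * rho / (s2 * s1), c / (s2 * s2)]]

def matmul2(A, B):
    """Product of two 2-by-2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

s1, s2, rho = 2.0, 0.5, 0.3    # illustrative parameters
product = matmul2(bivariate_cov(s1, s2, rho), bivariate_cov_inverse(s1, s2, rho))
print(product)                 # should be the identity up to rounding
```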

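$$
\begin{aligned}
 & \mathcal{N}\left(\boldsymbol{x}=\begin{bmatrix}x_{1}\\ x_{2} \end{bmatrix}\middle|\boldsymbol{\mu}=\begin{bmatrix}\mu_{1}\\ \mu_{2} \end{bmatrix},\boldsymbol{\Sigma}=\begin{bmatrix}\sigma_{1}^{2} & \sigma_{1}\sigma_{2}\rho\\ \sigma_{2}\sigma_{1}\rho & \sigma_{2}^{2} \end{bmatrix}\right)\\
= & \left(\left|\boldsymbol{\Sigma}\right|\left(2\pi\right)^{p=2}\right)^{-1/2}\exp\left\{ -\dfrac{1}{2}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)^{\intercal}\boldsymbol{\Sigma}^{-1}\left(\boldsymbol{x}-\boldsymbol{\mu}\right)\right\} \\
= & \left(\sigma_{1}^{2}\sigma_{2}^{2}\left(1-\rho^{2}\right)\left(2\pi\right)^{2}\right)^{-1/2}\exp\left\{ -\dfrac{1}{2}\begin{bmatrix}x_{1}-\mu_{1} & x_{2}-\mu_{2}\end{bmatrix}\boldsymbol{\Sigma}^{-1}\begin{bmatrix}x_{1}-\mu_{1}\\ x_{2}-\mu_{2} \end{bmatrix}\right\} \\
= & \dfrac{1}{2\pi\sigma_{1}\sigma_{2}\sqrt{1-\rho^{2}}}\exp\left\{ -\dfrac{1}{2}\begin{bmatrix}x_{1}-\mu_{1} & x_{2}-\mu_{2}\end{bmatrix}\dfrac{1}{\left(1-\rho^{2}\right)}\begin{bmatrix}\dfrac{1}{\sigma_{1}^{2}} & \dfrac{-\rho}{\sigma_{1}\sigma_{2}}\\ \dfrac{-\rho}{\sigma_{2}\sigma_{1}} & \dfrac{1}{\sigma_{2}^{2}} \end{bmatrix}\begin{bmatrix}x_{1}-\mu_{1}\\ x_{2}-\mu_{2} \end{bmatrix}\right\} \\
= & \dfrac{1}{2\pi\sigma_{1}\sigma_{2}\sqrt{1-\rho^{2}}}\exp\left\{ \dfrac{-1}{2\left(1-\rho^{2}\right)}\begin{bmatrix}x_{1}-\mu_{1} & x_{2}-\mu_{2}\end{bmatrix}\begin{bmatrix}\dfrac{1}{\sigma_{1}^{2}} & \dfrac{-\rho}{\sigma_{1}\sigma_{2}}\\ \dfrac{-\rho}{\sigma_{2}\sigma_{1}} & \dfrac{1}{\sigma_{2}^{2}} \end{bmatrix}\begin{bmatrix}x_{1}-\mu_{1}\\ x_{2}-\mu_{2} \end{bmatrix}\right\} \\
= & \dfrac{1}{2\pi\sigma_{1}\sigma_{2}\sqrt{1-\rho^{2}}}\exp\left\{ \dfrac{-1}{2\left(1-\rho^{2}\right)}\left[\left(\dfrac{x_{1}-\mu_{1}}{\sigma_{1}}\right)^{2}-2\rho\left(\dfrac{x_{1}-\mu_{1}}{\sigma_{1}}\right)\left(\dfrac{x_{2}-\mu_{2}}{\sigma_{2}}\right)+\left(\dfrac{x_{2}-\mu_{2}}{\sigma_{2}}\right)^{2}\right]\right\} 
\end{aligned}
$$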

Definition 18.4 probability density function (PDF) of bivariate normal distribution (= bivariate Gaussian distribution)

$$
\begin{aligned} & \mathcal{N}\left(\begin{bmatrix}x_{1}\\ x_{2} \end{bmatrix} \middle| \begin{bmatrix}\mu_{1}\\ \mu_{2} \end{bmatrix},\begin{bmatrix}\sigma_{1}^{2} & \sigma_{1}\sigma_{2}\rho\\ \sigma_{2}\sigma_{1}\rho & \sigma_{2}^{2} \end{bmatrix}\right)\\ = & \dfrac{1}{2\pi\sigma_{1}\sigma_{2}\sqrt{1-\rho^{2}}}\exp\left\{ \frac{-1}{2\left(1-\rho^{2}\right)}\left[\left(\dfrac{x_{1}-\mu_{1}}{\sigma_{1}}\right)^{2}-2\rho\left(\dfrac{x_{1}-\mu_{1}}{\sigma_{1}}\right)\left(\dfrac{x_{2}-\mu_{2}}{\sigma_{2}}\right)+\left(\dfrac{x_{2}-\mu_{2}}{\sigma_{2}}\right)^{2}\right]\right\} \end{aligned}
$$
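The closed form of the bivariate density can be coded in a few lines. The sketch below is illustrative: the function names and sample values are assumptions, and $\left|\rho\right|<1$ is required. With $\rho=0$ the density should reduce to the product of two univariate normal densities, which gives a simple sanity check.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density N(x | mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def bivariate_normal_pdf(x1, x2, mu1, mu2, s1, s2, rho):
    """Closed-form bivariate normal density (requires |rho| < 1)."""
    u = (x1 - mu1) / s1                     # standardized first coordinate
    v = (x2 - mu2) / s2                     # standardized second coordinate
    q = (u * u - 2.0 * rho * u * v + v * v) / (1.0 - rho * rho)
    norm = 2.0 * math.pi * s1 * s2 * math.sqrt(1.0 - rho * rho)
    return math.exp(-0.5 * q) / norm

# With rho = 0 the two printed values should agree.
print(bivariate_normal_pdf(0.5, 1.5, 0.0, 1.0, 2.0, 3.0, 0.0),
      normal_pdf(0.5, 0.0, 2.0) * normal_pdf(1.5, 1.0, 3.0))
```

At the mean, the exponent vanishes and the density equals the normalizing constant $1/\left(2\pi\sigma_{1}\sigma_{2}\sqrt{1-\rho^{2}}\right)$.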

references

8.
15. ccjou. 多變量常態分布 [Multivariate normal distribution]. (2014).
16.