5.3 Bivariate Descriptive Statistics
In this section, we consider graphical and numerical descriptive statistics for summarizing two or more data series.
5.3.1 Scatterplots
The contemporaneous dependence properties between two data series $\{x_t\}_{t=1}^{T}$ and $\{y_t\}_{t=1}^{T}$ can be displayed graphically in a scatterplot, which is simply an xy-plot of the bivariate data.
Figure 5.25 shows the scatterplots between the Microsoft and S&P 500 monthly and daily returns created using:
par(mfrow=c(1,2))
plot(coredata(sp500MonthlyRetS),coredata(msftMonthlyRetS),
main="Monthly returns", xlab="S&P500", ylab="MSFT", lwd=2,
pch=16, cex=1.25, col="blue")
abline(v=mean(sp500MonthlyRetS))
abline(h=mean(msftMonthlyRetS))
plot(coredata(sp500DailyRetS),coredata(msftDailyRetS),
main="Daily returns", xlab="S&P500", ylab="MSFT", lwd=2,
pch=16, cex=1.25, col="blue")
abline(v=mean(sp500DailyRetS))
abline(h=mean(msftDailyRetS))

Figure 5.25: Scatterplots of monthly and daily returns on Microsoft and the S&P 500 index.
The S&P 500 returns are put on the x-axis and the Microsoft returns on the y-axis because the “market”, as proxied by the S&P 500, is often thought of as an independent variable driving individual asset returns. The upward sloping orientation of the scatterplots indicates a positive linear dependence between Microsoft and S&P 500 returns at both the monthly and daily frequencies.
◼
The pairs() function plots all pair-wise scatterplots in a single plot. For example, to plot all pair-wise scatterplots for the GWN, Microsoft returns and S&P 500 returns use:
dataToPlot = merge(gwnMonthly,msftMonthlyRetS,sp500MonthlyRetS)
pairs(coredata(dataToPlot), col="blue", pch=16, cex=1.25, cex.axis=1.25)

Figure 5.26: Pair-wise scatterplots between simulated GWN, Microsoft returns and S&P 500 returns.
The top row of Figure 5.26 shows the scatterplots between the pairs (MSFT, GWN) and (SP500, GWN), the second row shows the scatterplots between the pairs (GWN, MSFT) and (SP500, MSFT), the third row shows the scatterplots between the pairs (GWN, SP500) and (MSFT, SP500). The plots in the lower triangle are the same as the plots in the upper triangle except the axes are reversed.
◼
5.3.2 Sample covariance and correlation
For two random variables $X$ and $Y$, the direction of linear dependence is captured by the covariance, $\sigma_{XY}=E[(X-\mu_X)(Y-\mu_Y)]$, and the direction and strength of linear dependence is captured by the correlation, $\rho_{XY}=\sigma_{XY}/(\sigma_X\sigma_Y)$. For two data series $\{x_t\}_{t=1}^{T}$ and $\{y_t\}_{t=1}^{T}$, the sample covariance,
$$\hat{\sigma}_{xy}=\frac{1}{T-1}\sum_{t=1}^{T}(x_t-\bar{x})(y_t-\bar{y}),$$
measures the direction of linear dependence, and the sample correlation,
$$\hat{\rho}_{xy}=\frac{\hat{\sigma}_{xy}}{\hat{\sigma}_x\hat{\sigma}_y}, \tag{5.10}$$
measures the direction and strength of linear dependence. In (5.10), $\hat{\sigma}_x$ and $\hat{\sigma}_y$ are the sample standard deviations of $\{x_t\}_{t=1}^{T}$ and $\{y_t\}_{t=1}^{T}$, respectively, defined by (5.4).
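As a rough check on these formulas, the following sketch computes the sample covariance and correlation by hand and compares them with cov() and cor(). The simulated series x and y, the seed, and the object names are illustrative assumptions; they are not the return data used in this chapter.

```r
# Simulated data (assumption: stands in for two return series)
set.seed(123)
n.obs = 100
x = rnorm(n.obs, mean = 0.01, sd = 0.05)
y = 0.5 * x + rnorm(n.obs, mean = 0, sd = 0.05)

# Sample covariance with the 1/(T-1) divisor
sigmahat.xy = sum((x - mean(x)) * (y - mean(y))) / (n.obs - 1)
# Sample correlation: covariance scaled by both sample standard deviations
rhohat.xy = sigmahat.xy / (sd(x) * sd(y))

# Both comparisons should agree up to floating point error
all.equal(sigmahat.xy, cov(x, y))
all.equal(rhohat.xy, cor(x, y))
```

Note that the denominator of the correlation uses the standard deviation of each series, which is what keeps the result between -1 and 1.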
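When more than two data series are being analyzed, it is often convenient to compute all pair-wise covariances and correlations at once using matrix algebra. Recall, for a vector of $N$ random variables $\mathbf{X}=(X_1,\ldots,X_N)^{\prime}$ with mean vector $\mu=(\mu_1,\ldots,\mu_N)^{\prime}$ the $N\times N$ variance-covariance matrix is defined as:
$$\Sigma=\mathrm{var}(\mathbf{X})=\mathrm{cov}(\mathbf{X})=E[(\mathbf{X}-\mu)(\mathbf{X}-\mu)^{\prime}]=\begin{pmatrix}\sigma_1^2 & \sigma_{12} & \cdots & \sigma_{1N}\\ \sigma_{12} & \sigma_2^2 & \cdots & \sigma_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ \sigma_{1N} & \sigma_{2N} & \cdots & \sigma_N^2\end{pmatrix}.$$
For $N$ data series $\{\mathbf{x}_t\}_{t=1}^{T}$, where $\mathbf{x}_t=(x_{1t},\ldots,x_{Nt})^{\prime}$, the sample covariance matrix is computed using:
$$\hat{\Sigma}=\frac{1}{T-1}\sum_{t=1}^{T}(\mathbf{x}_t-\hat{\mu})(\mathbf{x}_t-\hat{\mu})^{\prime}=\begin{pmatrix}\hat{\sigma}_1^2 & \hat{\sigma}_{12} & \cdots & \hat{\sigma}_{1N}\\ \hat{\sigma}_{12} & \hat{\sigma}_2^2 & \cdots & \hat{\sigma}_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ \hat{\sigma}_{1N} & \hat{\sigma}_{2N} & \cdots & \hat{\sigma}_N^2\end{pmatrix},$$
where $\hat{\mu}$ is the $N\times 1$ sample mean vector. Define the $N\times N$ diagonal matrix:
$$\hat{\mathbf{D}}=\begin{pmatrix}\hat{\sigma}_1 & 0 & \cdots & 0\\ 0 & \hat{\sigma}_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \hat{\sigma}_N\end{pmatrix}. \tag{5.12}$$
Then the $N\times N$ sample correlation matrix $\hat{\mathbf{C}}$ is computed as:
$$\hat{\mathbf{C}}=\hat{\mathbf{D}}^{-1}\hat{\Sigma}\hat{\mathbf{D}}^{-1}=\begin{pmatrix}1 & \hat{\rho}_{12} & \cdots & \hat{\rho}_{1N}\\ \hat{\rho}_{12} & 1 & \cdots & \hat{\rho}_{2N}\\ \vdots & \vdots & \ddots & \vdots\\ \hat{\rho}_{1N} & \hat{\rho}_{2N} & \cdots & 1\end{pmatrix}. \tag{5.13}$$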
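The matrix formulas can be sketched in a few lines of R. The example below builds the sample covariance matrix from demeaned data, forms the inverse of the diagonal matrix of standard deviations, and checks the result against cov() and cov2cor(). The simulated matrix X with hypothetical columns A, B and C is an assumption made only for illustration.

```r
set.seed(123)
n.obs = 200
# Simulated T x N data matrix (assumption: hypothetical series A, B, C)
X = matrix(rnorm(n.obs * 3), n.obs, 3, dimnames = list(NULL, c("A", "B", "C")))
X[, "B"] = X[, "B"] + 0.5 * X[, "A"]

# Sample covariance matrix: demean each column, then form cross products
muhat = colMeans(X)
Xc = sweep(X, 2, muhat)
Sigmahat = crossprod(Xc) / (n.obs - 1)

# Inverse of the diagonal matrix of sample standard deviations
Dhat.inv = diag(1 / sqrt(diag(Sigmahat)))
# Sample correlation matrix: pre- and post-multiply by the inverse
Chat = Dhat.inv %*% Sigmahat %*% Dhat.inv
dimnames(Chat) = dimnames(Sigmahat)

# Both comparisons should agree up to floating point error
all.equal(Sigmahat, cov(X))
all.equal(Chat, cov2cor(Sigmahat))
```

Pre- and post-multiplying by the inverse diagonal matrix divides element (i, j) of the covariance matrix by the product of the two standard deviations, which is the pair-wise correlation formula applied to every pair at once.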
The scatterplots of Microsoft and S&P 500 returns in Figure 5.25 suggest positive linear relationships in the data. We can confirm this by computing the sample covariance and correlation using the R functions cov() and cor(). For the monthly returns, we have
cov(sp500MonthlyRetS, msftMonthlyRetS)
## MSFT
## SP500 0.00298
cor(sp500MonthlyRetS, msftMonthlyRetS)
## MSFT
## SP500 0.614
Indeed, the sample covariance is positive and the sample correlation shows a moderately strong linear relationship. For the daily returns we have
cov(sp500DailyRetS, msftDailyRetS)
## MSFT
## SP500 0.000196
cor(sp500DailyRetS, msftDailyRetS)
## MSFT
## SP500 0.671
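Here, the daily sample covariance is about fifteen times smaller than the monthly covariance (0.00298/0.000196 ≈ 15). This is broadly in line with covariances, like variances, growing roughly in proportion to the return horizon, with about 21 trading days in a month; the square-root-of-time rule applies to standard deviations. The daily sample correlation, however, is similar to the monthly sample correlation.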
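The effect of the return horizon on covariances and correlations can be illustrated with a small simulation. The sketch below assumes independent and identically distributed daily continuously compounded returns, 21 trading days per month, and hypothetical object names; under these assumptions the monthly covariance should be roughly 21 times the daily covariance, the monthly standard deviation roughly the square root of 21 times the daily one, and the correlation roughly unchanged.

```r
set.seed(123)
n.days = 21        # assumed number of trading days per month
n.months = 5000    # long sample so the ratios are estimated precisely
n.total = n.days * n.months

# Simulated iid daily returns sharing a common component (assumption)
common = rnorm(n.total, sd = 0.01)
x.d = common + rnorm(n.total, sd = 0.01)
y.d = common + rnorm(n.total, sd = 0.01)

# Monthly cc returns are sums of the daily cc returns within each month
x.m = colSums(matrix(x.d, nrow = n.days))
y.m = colSums(matrix(y.d, nrow = n.days))

# Covariance ratio: expected to be near 21
cov.ratio = cov(x.m, y.m) / cov(x.d, y.d)
# Standard deviation ratio: expected to be near sqrt(21)
sd.ratio = sd(x.m) / sd(x.d)
# Correlations at the two frequencies: expected to be similar
cor.both = c(daily = cor(x.d, y.d), monthly = cor(x.m, y.m))

cov.ratio
sd.ratio
cor.both
```

Real returns are not exactly iid, so ratios computed from actual data can differ noticeably from these benchmarks.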
When passed a matrix of data, the cov() and cor() functions can also be used to compute the sample covariance and correlation matrices $\hat{\Sigma}$ and $\hat{\mathbf{C}}$, respectively. For example,
cov(msftSp500MonthlyRetS)
## MSFT SP500
## MSFT 0.01030 0.00298
## SP500 0.00298 0.00228
cor(msftSp500MonthlyRetS)
## MSFT SP500
## MSFT 1.000 0.614
## SP500 0.614 1.000
The function cov2cor() transforms a sample covariance matrix to a sample correlation matrix using (5.13):
cov2cor(cov(msftSp500MonthlyRetS))
## MSFT SP500
## MSFT 1.000 0.614
## SP500 0.614 1.000
◼
The R package corrplot contains functions for visualizing correlation matrices. This is particularly useful for summarizing the linear dependencies among many data series. For example, Figure 5.27 shows the correlation plot from the corrplot package function corrplot.mixed() created using:
library(corrplot)  # provides corrplot.mixed()
dataToPlot = merge(gwnMonthly, msftMonthlyRetS, sp500MonthlyRetS)
cor.mat = cor(dataToPlot)
corrplot.mixed(cor.mat, lower="number", upper="ellipse")

Figure 5.27: Correlation plot created with corrplot.mixed().
The color scheme shows the sign and magnitude of the correlations (blue for positive and red for negative), and the shape and orientation of the ellipses show the magnitude and direction of the linear associations.
◼
5.3.3 Sample cross-lag covariances and correlations
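The dynamic interactions between two observed time series $\{x_t\}_{t=1}^{T}$ and $\{y_t\}_{t=1}^{T}$ can be measured using the sample cross-lag covariances and correlations
$$\hat{\gamma}_{xy}^{k}=\widehat{\mathrm{cov}}(X_t,Y_{t-k}),\qquad \hat{\rho}_{xy}^{k}=\widehat{\mathrm{corr}}(X_t,Y_{t-k})=\frac{\hat{\gamma}_{xy}^{k}}{\sqrt{\hat{\sigma}_x^2\hat{\sigma}_y^2}}.$$
When more than two data series are being analyzed, all pairwise cross-lag sample covariances and sample correlations can be computed at once using matrix algebra. For $N$ data series $\{\mathbf{x}_t\}_{t=1}^{T}$, where $\mathbf{x}_t=(x_{1t},\ldots,x_{Nt})^{\prime}$, the sample lag $k$ cross-lag covariance and correlation matrices are computed using:
$$\hat{\Gamma}_k=\frac{1}{T-1}\sum_{t=k+1}^{T}(\mathbf{x}_t-\hat{\mu})(\mathbf{x}_{t-k}-\hat{\mu})^{\prime}, \tag{5.14}$$
$$\hat{\mathbf{C}}_k=\hat{\mathbf{D}}^{-1}\hat{\Gamma}_k\hat{\mathbf{D}}^{-1}, \tag{5.15}$$
where $\hat{\mathbf{D}}$ is defined in (5.12).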
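A minimal sketch of the cross-lag matrix calculations, under stated assumptions, is given below. The simulated series x and y (with y depending on lagged x), the helper crossLagCov() and all object names are hypothetical. One detail to keep in mind when comparing with acf(): it divides by T rather than T-1, so the covariance comparison below uses the divisor T, while the correlations agree with either divisor because the scaling cancels.

```r
set.seed(123)
n.obs = 1000
# Simulated bivariate series in which y depends on lagged x (assumption)
x = rnorm(n.obs)
y = 0.5 * c(0, x[-n.obs]) + rnorm(n.obs)
X = cbind(x = x, y = y)

# Demean each column with its full-sample mean
Xc = sweep(X, 2, colMeans(X))

# Lag-k cross-lag covariance matrix: element (i,j) pairs series i at time t
# with series j at time t-k
crossLagCov = function(Xc, k, divisor) {
  n = nrow(Xc)
  crossprod(Xc[(k + 1):n, , drop = FALSE],
            Xc[1:(n - k), , drop = FALSE]) / divisor
}

# Lag-1 matrices using the T-1 divisor
Gammahat1 = crossLagCov(Xc, k = 1, divisor = n.obs - 1)
Dhat.inv = diag(1 / apply(X, 2, sd))
Chat1 = Dhat.inv %*% Gammahat1 %*% Dhat.inv

# Compare with acf(), matching its divisor T for the covariances
G.acf = acf(X, type = "covariance", lag.max = 1, plot = FALSE)
C.acf = acf(X, type = "correlation", lag.max = 1, plot = FALSE)
all.equal(crossLagCov(Xc, 1, n.obs), G.acf$acf[2, , ], check.attributes = FALSE)
all.equal(Chat1, C.acf$acf[2, , ], check.attributes = FALSE)
```

In this simulation the element Chat1[2,1], which pairs y at time t with x at time t-1, picks up the lagged dependence built into the data, while Chat1[1,2] should be close to zero. The cross-lag matrices are therefore not symmetric in general.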
Consider computing the cross-lag covariance and correlation matrices (5.14) and (5.15) for $k=0,1,\ldots,5$ between Microsoft and S&P 500 monthly returns. These matrices may be computed using the R function acf() as follows:
Ghat = acf(coredata(msftSp500MonthlyRetS), type="covariance",
           lag.max=5, plot=FALSE)
Chat = acf(coredata(msftSp500MonthlyRetS), type="correlation",
           lag.max=5, plot=FALSE)
names(Ghat)
## [1] "acf" "type" "n.used" "lag" "series" "snames"
Here, Ghat and Chat are objects of class "acf" for which there are print and plot methods. The acf components of Ghat and Chat are 3-dimensional arrays containing the cross-lag matrices (5.14) and (5.15), respectively. For example, to extract $\hat{\mathbf{C}}_0$ and $\hat{\mathbf{C}}_1$ use:
# Chat0
Chat$acf[1,,]
## [,1] [,2]
## [1,] 1.000 0.614
## [2,] 0.614 1.000
# Chat1
Chat$acf[2,,]
## [,1] [,2]
## [1,] -0.1931 -0.0135
## [2,] -0.0307 0.1232
The print method shows the sample autocorrelations of each variable as well as the pairwise cross-lag correlations:
Chat
##
## Autocorrelations of series 'coredata(msftSp500MonthlyRetS)', by lag
##
## , , MSFT
##
## MSFT SP500
## 1.000 ( 0) 0.614 ( 0)
## -0.193 ( 1) -0.031 (-1)
## -0.114 ( 2) -0.070 (-2)
## 0.193 ( 3) 0.137 (-3)
## -0.139 ( 4) -0.030 (-4)
## -0.023 ( 5) 0.032 (-5)
##
## , , SP500
##
## MSFT SP500
## 0.614 ( 0) 1.000 ( 0)
## -0.014 ( 1) 0.123 ( 1)
## -0.043 ( 2) -0.074 ( 2)
## 0.112 ( 3) 0.101 ( 3)
## -0.105 ( 4) 0.073 ( 4)
## -0.055 ( 5) -0.010 ( 5)
These values can also be visualized using the plot method:
plot(Chat, lwd=2)

Figure 5.28: Sample cross-lag correlations between Microsoft and S&P 500 returns.
Figure 5.28 shows the resulting four-panel plot. The top-left and bottom-right (diagonal) panels give the SACFs for Microsoft and S&P 500 returns. The top-right panel gives the cross-lag correlations $\hat{\rho}_{msft,sp500}^{k}$ for $k=0,1,\ldots,5$ and the bottom-left panel gives the cross-lag correlations $\hat{\rho}_{sp500,msft}^{k}$ for $k=0,1,\ldots,5$. The plots show no evidence of any dynamic feedback between the return series.
◼
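One informal way to judge cross-lag correlations numerically is to compare them with the approximate 95% bands that the default plot method for "acf" objects draws at plus and minus qnorm(0.975)/sqrt(T). The sketch below does this for two simulated independent white noise series; the data, the seed and the object names are assumptions for illustration, not the Microsoft and S&P 500 returns.

```r
set.seed(123)
n.obs = 150
# Two independent white noise series (assumption: no true dynamic feedback)
X = cbind(a = rnorm(n.obs), b = rnorm(n.obs))
Chat.sim = acf(X, type = "correlation", lag.max = 5, plot = FALSE)

# Approximate 95% band drawn by the default plot method
band = qnorm(0.975) / sqrt(Chat.sim$n.used)

# Cross-lag correlations at lags 1 to 5 (the first row is lag 0)
rho.ab = Chat.sim$acf[-1, 1, 2]   # a at time t with b at time t-k
rho.ba = Chat.sim$acf[-1, 2, 1]   # b at time t with a at time t-k

# Flag estimates that fall outside the band
data.frame(lag = 1:5, rho.ab, rho.ba,
           outside = abs(rho.ab) > band | abs(rho.ba) > band)
```

Even with no true dependence, about 5% of the estimates are expected to fall outside the band by chance, so an isolated exceedance is weak evidence of dynamic feedback.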