Sec 1 Examples of Time Series Data

library(astsa)

1.1 Characteristics of Time Series

1.1.1 Practical data

Determine why each of the following examples is not (weakly) stationary.

Example 1.1 (p.2)

plot(jj, type="o", ylab="Quarterly Earnings per Share")





Example 1.2 (p.3)

plot(globtemp, type="o", ylab="Global Temperature Deviations")





Example 1.3 (p.3) Voice record

plot(speech)





Example 1.5 (p.5)

par(mfrow = c(2,1)) # set up the graphics
plot(soi, ylab="", xlab="", main="Southern Oscillation Index\nindex of ocean current")
plot(rec, ylab="", xlab="", main="Recruitment\nindex of fish population")





Example 1.6 (p.5) MRI data

par(mfrow=c(2,1))
ts.plot(fmri1[,2:5], col=1:4, ylab="BOLD", main="Cortex")
ts.plot(fmri1[,6:9], col=1:4, ylab="BOLD", main="Thalamus & Cerebellum")
mtext("Time (1 pt = 2 sec)",side = 1,line = 2)





Example 1.7 (p.6)

par(mfrow=c(2,1))
plot(EQ5, main="Earthquake")
plot(EXP6, main="Explosion")





1.1.2 White Noise (WN)

Example 1.9 (p.10)

set.seed(0921)
w = rnorm(500,0,1) # 500 N(0,1) variates
v = filter(w, sides=2, filter=rep(1/3,3)) # moving average
par(mfrow=c(2,1))
plot.ts(w, main="white noise")
plot.ts(v, ylim=c(-3,3), main="moving average")

W_t\sim N(0,1)

  • sides=2: the window is centered at lag 0
V_t=\frac13(W_{t-1}+W_t+W_{t+1})

  • sides=1: the window uses past values only
V_t=\frac13(W_t+W_{t-1}+W_{t-2})
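A quick way to see the two alignments (a toy input, not part of the original notes):

w0 = 1:6 # hypothetical toy series
filter(w0, rep(1/3,3), sides=2) # V_t = (W_{t-1}+W_t+W_{t+1})/3: NA 2 3 4 5 NA
filter(w0, rep(1/3,3), sides=1) # V_t = (W_t+W_{t-1}+W_{t-2})/3: NA NA 2 3 4 5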

Example 1.10 (p.11)

set.seed(0921)
w = rnorm(550,0,1) # 50 extra to avoid startup problems
x = filter(w, filter=c(1,-.9), method="recursive")[-(1:50)] # remove first 50
plot.ts(x, main="autoregression")

W_t\sim N(0,1),\quad X_t=X_{t-1}-0.9X_{t-2}+W_t (method="recursive" builds each value from past outputs)

The coefficient -0.9 on X_{t-2} pulls the series back toward the opposite direction every few steps, so the path oscillates quasi-periodically:
  • X_t\uparrow\Rightarrow a few steps later the series tends to swing back down
  • X_t\downarrow\Rightarrow a few steps later the series tends to swing back up

1.1.3 Random Walk

Example 1.11 (p.11)

set.seed(154) # so you can reproduce the results
w = rnorm(200,0,1); x = cumsum(w) # two commands in one line
wd = w +.2; xd = cumsum(wd)
plot.ts(xd, ylim=c(-5,55), main="random walk")
lines(x, col=2)
lines(.2*1:200,lty="dashed")
legend('topleft', col=c(1,2,1), lty=c(1,1,2), 
       legend = c(expression(X[d]),expression(X[t]),expression(E(X[d]))))

X_t is not stationary, \because Var(X_t)=t depends on t.

X_d is not stationary, \because E(X_d)=0.2\cdot t depends on t.

  • W_t\sim N(0,1)
  • X_t=W_1+W_2+\cdots+W_t
    • Var(X_t)=Var(W_1)+Var(W_2)+\cdots+Var(W_t)=1\cdot t=t
  • W_d=W_t+0.2
  • X_d=\sum W_d=\sum W_t+0.2\cdot t
    • E(X_d)=0.2\cdot t
E(aX+b)=aE(X)+b
Var(aX+b)=a^2Var(X)
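A quick Monte Carlo check of these moment calculations (a sketch, not from the book; the replication count nrep is arbitrary):

set.seed(1)
nrep = 2000; t0 = 100
W  = matrix(rnorm(nrep*200), nrep, 200)
X  = t(apply(W, 1, cumsum))       # one random walk per row
Xd = t(apply(W + .2, 1, cumsum))  # random walks with drift .2
c(var(X[,t0]), t0)                # Var(X_t) = t, so both are near 100
c(mean(Xd[,t0]), .2*t0)           # E(X_d) = .2 t, so both are near 20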

1.1.4 Noise obscuring a periodic signal

Example 1.12 (p.12)

cs = 2*cos(2*pi*1:500/50 + .6*pi); w = rnorm(500,0,1)
par(mfrow=c(3,1), mar=c(3,2,2,1), cex.main=1.5)         # help(par) for info
plot.ts(cs, main=expression(x[t]==2*cos(2*pi*t/50+.6*pi)))
plot.ts(cs+w, main=expression(x[t]==2*cos(2*pi*t/50+.6*pi) + N(0,1)))
plot.ts(cs+5*w, main=expression(x[t]==2*cos(2*pi*t/50+.6*pi) + N(0,25)))

The periodic behavior becomes less obvious as the noise level grows.


1.1.5 How lag influences the cross-covariance

Example 1.24 (p.24) y_t=Ax_{t-l}+w_t

  • l>0: x_t leads y_t
    • e.g. l=1: y_t=x_{t-1}+w_t
    • x leads y
  • l<0: x_t lags y_t
    • e.g. l=-1: y_t=x_{t+1}+w_t
    • x lags y

Assume w_t is uncorrelated with x_t, the cross-covariance function is \gamma_{yx}(h)=cov(y_{t+h},x_t)=cov(Ax_{t+h-l}+w_{t+h},x_t)\\=cov(Ax_{t+h-l},x_t)=A\gamma_x(h-l)

When h=l, \gamma_{yx}(h)=A\gamma_x(0), the maximum.

x = rnorm(100)
y = lag(x, -5) + rnorm(100)
par(mfrow=c(1,1))
ccovf <- ccf(y, x, type='covariance',
             ylab='CCovF', xlab='lag h', main=expression(paste('l=5 with ', hat(gamma)[yx](h))))

  • X\sim N(0,1)
  • y_t=X_{t-5}+W_t, W_t\sim N(0,1)
  • \require{enclose} \enclose{horizontalstrike}{Corr(y_t, X_{t+h})=\big\{\substack{1,\:h=5\\0,\:o.w.}}
  • l=5\to\gamma_{yx}(h)=\gamma_x(h-5)
    • lag h=5 \to\gamma_{yx}(5)=\gamma_x(0)=Var(X)=1
    • y_{t+h}=Ax_{t+h-5}+w_{t+h}
      • let i=|h-l|=|h-5|
      • h>5\to\gamma_{yx}(h)=cov(Ax_{t+i},x_t)\to x leads
      • h<5\to\gamma_{yx}(h)=cov(Ax_{t-i},x_t)\to y leads

Example 1.25 (p.28)

par(mfrow=c(1,2))
r = round(acf(soi, 6, plot=FALSE)$acf[-1], 3)
plot(lag(soi,-1), soi, ylab=expression(X[t]), xlab=expression(X[t-1]))
arrows(-1,-1,1,1, col=2, angle = 15, lwd=5)
legend('topleft', legend=r[1])
plot(lag(soi,-6), soi, ylab=expression(X[t]), xlab=expression(X[t-6]))
arrows(-1,1,1,-1, col=2, angle = 15, lwd=5)
legend('topleft', legend=r[6])


1.2 Exploring the Data

1.2.1 Estimate a linear trend

Example 2.1 (p.45) x_t=a+b\cdot t+W_t

summary(fit <- lm(chicken~time(chicken)))
## 
## Call:
## lm(formula = chicken ~ time(chicken))
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -8.7411 -3.4730  0.8251  2.7738 11.5804 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -7.131e+03  1.624e+02  -43.91   <2e-16 ***
## time(chicken)  3.592e+00  8.084e-02   44.43   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.696 on 178 degrees of freedom
## Multiple R-squared:  0.9173, Adjusted R-squared:  0.9168 
## F-statistic:  1974 on 1 and 178 DF,  p-value: < 2.2e-16
plot(chicken, type="o",ylab="cents per pound")
abline(fit, lwd=2, col=2) # add regression line to the plot
legend('topleft', expression(a+bt), lwd=2, col=2)


1.2.2 Detrending and Differencing

Example 2.4 and 2.5 (p.54, 58) x_t=a+b\cdot t+W_t

  • detrended: X_t-a-b\cdot t
  • first difference: \nabla X_t=X_t-X_{t-1}
fit = lm(chicken~time(chicken), na.action=NULL) # regress chicken on time
par(mfrow=c(2,1), mar=c(3,4,2,1))
plot(resid(fit), type="o", main="detrended")
plot(diff(chicken), type="o", main="first difference")

par(mfrow=c(3,1), mar=c(4,3,3,1)) # plot ACFs
acf(chicken, 48, main="chicken")
acf(resid(fit), 48, main="detrended")
acf(diff(chicken), 48, main="first difference")

  • First differencing does a better job here than detrending by regression.
  • The ACF of a stationary process decays quickly to zero.

Example 2.6 (p.58)

par(mfrow=c(2,1), mar=c(4,4,3,1))
plot(diff(globtemp), type="o")
mean(diff(globtemp)) # drift estimate = .008
## [1] 0.007925926
acf(diff(globtemp), 48)

  • The series looks stationary after first differencing.

1.2.3 Smoothing

Example 2.13 (p.67) lowess

  • lowess(x, y = NULL, f = 2/3)
    • f: the smoother span (bandwidth)
plot(soi)
lines(lowess(soi, f=.05), lwd=2, col=4) # El Nino cycle
lines(lowess(soi), lty=2, lwd=2, col=2) # trend (with default span)
legend('topright', c('seasonality','trend'), col=c(4,2), lty=c(1,2), lwd=2)

  • seasonality: small bandwidth
  • trend: large bandwidth

Example 2.14 (p.68) smooth.spline

  • smooth.spline(x, y, spar = NULL)
    • x: predictor
    • y: response
    • spar: smoothing parameter, typically (but not necessarily) in (0,1]
plot(soi)
lines(smooth.spline(time(soi), soi, spar=.5), lwd=2, col=4)
lines(smooth.spline(time(soi), soi, spar= 1), lty=2, lwd=2, col=2)
legend('topright', c('seasonality','trend'), col=c(4,2), lty=c(1,2), lwd=2)


1.3 Time Series Models

Example 3.2 (p.78) AR(1)

  • arima.sim: simulate from an ARIMA model.
    • order: c(p, d, q) \to AR order, degree of differencing, MA order.
    • ARIMA(0,0,0) is white noise.
  • For an AR(2): order=c(2,0,0), ar=c(.5, .1) (a sketch follows the comparison table below)
par(mfrow=c(2,1), mar=c(3,4,2,1))
plot(arima.sim(list(order=c(1,0,0), ar=.9), n=100), ylab="x",
     main=(expression(AR(1)~~~phi==+.9)))
plot(arima.sim(list(order=c(1,0,0), ar=-.9), n=100), ylab="x",
     main=(expression(AR(1)~~~phi==-.9)))

                      top                          bottom
\rho(h)               (0.9)^h,\ h\geq0             (-0.9)^h,\ h\geq0
lag-1 correlation     positive                     negative
appearance            looks like a trend           fluctuates rapidly
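For the AR(2) case mentioned in the bullets above, a minimal sketch (the coefficients ar=c(.5,.1) are the hypothetical values from the bullet, not a book example):

plot(arima.sim(list(order=c(2,0,0), ar=c(.5,.1)), n=100), ylab="x",
     main=(expression(AR(2)~~~phi[1]==.5~~phi[2]==.1)))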

Example 3.5 (p.82) MA(1) model

x_t=w_t+\theta w_{t-1},\quad E(x_t)=0

\gamma(h)=\begin{cases} (1+\theta^2)\sigma^2_w, & h=0\\ \theta\sigma^2_w, & h=1\\ 0, & h>1 \end{cases}

\rho(h)=\begin{cases} \frac{\theta}{1+\theta^2}, & h=1\\ 0, & h>1 \end{cases}

par(mfrow = c(2,1), mar=c(3,4,2,1))
plot(arima.sim(list(order=c(0,0,1), ma=.5), n=100), ylab="x",
     main=(expression(MA(1)~~~theta==+.5)))
plot(arima.sim(list(order=c(0,0,1), ma=-.5), n=100), ylab="x",
     main=(expression(MA(1)~~~theta==-.5)))

                      top (\theta=0.5)                        bottom (\theta=-0.5)
                      x_t, x_{t-1} positively correlated      x_t, x_{t-1} negatively correlated

\rho(1)=\frac{0.5}{1+0.5^2}=0.4\Rightarrow weak dependence
  • The series with \theta=0.5 is smoother than the series with \theta=-0.5.
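The value \rho(1)=0.4 can be checked against the theoretical ACF:

ARMAacf(ma=.5, lag.max=2) # lags 0,1,2: expect 1, 0.4, 0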

Example 3.7 (p.84) Parameter redundancy. If we were unaware of the parameter redundancy, we might claim the data are correlated when in fact they are not.

X_t\sim N(5,1),\text{ fit ARMA(1,1) to }X:\quad X_t=\phi X_{t-1}+W_t+\theta W_{t-1}

set.seed(8675309)
x = rnorm(150, mean=5) # generate iid N(5,1)s
out <- arima(x, order=c(1,0,1)); out # estimation
## 
## Call:
## arima(x = x, order = c(1, 0, 1))
## 
## Coefficients:
##           ar1     ma1  intercept
##       -0.9595  0.9527     5.0462
## s.e.   0.1688  0.1750     0.0727
## 
## sigma^2 estimated as 0.7986:  log likelihood = -195.98,  aic = 399.96
  • \hat\phi=-0.96, \hat\theta=0.95
  • (1+0.96B)(X_t-5.05)=(1+0.95B)W_t: the AR and MA polynomials nearly cancel, so X_t is essentially white noise around its mean.
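A check of the near-cancellation (using the estimates above): the implied \psi-weights of the fitted ARMA(1,1) are essentially zero, i.e., the fitted model is white noise in disguise.

round(ARMAtoMA(ar=-.9595, ma=.9527, 10), 4) # all psi_j near 0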

Example 3.8 (p.86) ARMA to MA

ARMAtoMA(ar = .9, ma = .5, 10) # for a list
##  [1] 1.4000000 1.2600000 1.1340000 1.0206000 0.9185400 0.8266860 0.7440174
##  [8] 0.6696157 0.6026541 0.5423887
plot(ARMAtoMA(ar = .9, ma = .5, 10), 
     ylab = expression(psi [j]))

ARMAtoMA(ar = -.5, ma = -.9, 10) # first 10 pi-weights (AR and MA sides swapped, signs flipped)
##  [1] -1.400000000  0.700000000 -0.350000000  0.175000000 -0.087500000
##  [6]  0.043750000 -0.021875000  0.010937500 -0.005468750  0.002734375
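As a sanity check (not in the book's code), the \psi- and \pi-weights should be mutually inverse, \psi(B)\pi(B)=1:

psi = c(1, ARMAtoMA(ar =  .9, ma =  .5, 20)) # psi(B) = theta(B)/phi(B)
pii = c(1, ARMAtoMA(ar = -.5, ma = -.9, 20)) # pi(B)  = phi(B)/theta(B)
round(convolve(psi, rev(pii), type="open")[1:10], 10) # polynomial product: 1, 0, 0, ...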

Example 3.11 (p.92) AR(2)

z = c(1,-1.5,.75) # coefficients of the polynomial
(a = polyroot(z)[1]) # = 1+0.57735i, one root printed: 1 + i/sqrt(3)
## [1] 1+0.57735i
  • polyroot: finds the roots of a polynomial

\phi(z)=1-1.5z+0.75z^2 (the AR polynomial whose roots polyroot computes)

arg = Arg(a)/(2*pi) # arg in cycles/pt
1/arg # = 12, the period
## [1] 12

z=x+iy,\quad r=\sqrt{x^2+y^2},\quad \phi=Arg(z),\quad x=r\cos\phi,\; y=r\sin\phi
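The modulus and argument of the root give the stationarity check and the period directly (uses a from the chunk above):

Mod(a)      # = 1.1547 > 1: the root lies outside the unit circle, so this AR(2) is stationary
2*pi/Arg(a) # = 12, the period in observations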

set.seed(90210)
ar2 = arima.sim(list(order=c(2,0,0), ar=c(1.5,-.75)), n = 144)  # simulated data
plot(1:144/12,ar2,type="l",xlab="Time (one unit = 12 obs)")
abline(v=0:12, lty="dotted",lwd=2)

X_t=1.5X_{t-1}-0.75X_{t-2}+W_t

  • exhibits quasi-periodic behavior with period d=12
  • an AR(p) process is not automatically stationary: it depends on the roots of \phi(z) (here the root lies outside the unit circle, so this AR(2) is stationary)
  • an MA(q) process is always stationary
ACF = ARMAacf(ar=c(1.5,-.75), ma=0, 50) # theoretical
par(mfrow = c(2,1), mar=c(3,4,3,1))
plot(ACF, type="h", xlab="lag", main="theoretical ACF")
abline(h=0)
acf(ar2, lag.max = 50, main="sample ACF")

  • ARMAacf: theoretical ACF (when the model is given)
  • acf: sample ACF (when the data are given)
  • the ACF shows a periodic pattern

Example 3.16 (p.98) PACF of AR(p)

ar2.acf = ARMAacf(ar=c(1.5,-.75), ma=0, 24)[-1]
ar2.pacf = ARMAacf(ar=c(1.5,-.75), ma=0, 24, pacf=TRUE)
par(mfrow=c(1,2), mar=c(4,4,1,1))
plot(ar2.acf, type="h", xlab="lag"); abline(h=0)
plot(ar2.pacf, type="h", xlab="lag")
abline(h=0); axis(1, at=c(1:5))

  • PACF: cuts off after lag 2
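The cut-off can be confirmed numerically from the theoretical PACF:

round(ARMAacf(ar=c(1.5,-.75), lag.max=5, pacf=TRUE), 3) # 0.857, -0.75, then 0: cuts off after lag 2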

Example 3.29 (p.115) Simulate 50 obs. from MA(1) with \theta_1=0.9

set.seed(2)
ma1 = arima.sim(list(order = c(0,0,1), ma = 0.9), n = 50)
acf(ma1, plot=FALSE)[1] # = .507 (lag 1 sample ACF)
## 
## Autocorrelations of series 'ma1', by lag
## 
##     1 
## 0.507
  • True ACF: \rho(1)=\frac{\theta_1}{1+\theta_1^2}=\frac{0.9}{1.81}=0.4972376
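For comparison, the theoretical value from ARMAacf:

ARMAacf(ma=.9, lag.max=1) # rho(1) = .9/1.81 = 0.497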

1.3.1 Behavior of the ACF and PACF for ARMA models

        AR(p)                   MA(q)                   ARMA(p,q)
ACF     tails off               cuts off after lag q    tails off
PACF    cuts off after lag p    tails off               tails off

1.4 Fitting Models with Different Methods

Before modeling the data, use the ACF and PACF as a preliminary analysis to judge how large p and q should be.

1.4.1 Preliminary Analysis

Example 3.18 (p.99) Preliminary Analysis of the Recruitment Series

value <- acf2(rec, 48) # will produce values and a graphic

  • The ACF decays quickly \to the series looks stationary.
  • The PACF cuts off after lag 2 (PACF \approx 0 beyond lag 2) \to AR(2) is suitable for the data.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
ACF 0.92 0.78 0.63 0.48 0.36 0.26 0.18 0.13 0.09 0.07 0.06 0.02 -0.04 -0.12 -0.19 -0.24
PACF 0.92 -0.44 -0.05 -0.02 0.07 -0.03 -0.03 0.04 0.05 -0.02 -0.05 -0.14 -0.15 -0.05 0.05 0.01
17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
ACF -0.27 -0.27 -0.24 -0.19 -0.11 -0.03 0.03 0.06 0.06 0.02 -0.02 -0.06 -0.09 -0.12 -0.13 -0.11
PACF 0.01 0.02 0.09 0.11 0.03 -0.03 -0.01 -0.07 -0.12 -0.03 0.05 -0.08 -0.04 -0.03 0.06 0.05
33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48
ACF -0.05 0.02 0.08 0.12 0.10 0.06 0.01 -0.02 -0.03 -0.03 -0.02 0.01 0.06 0.12 0.17 0.20
PACF 0.15 0.09 -0.04 -0.10 -0.09 -0.02 0.05 0.08 -0.02 -0.01 -0.02 0.05 0.01 0.05 0.08 -0.04

1.4.2 Fit AR(2) model to Recruitment Series

Example 3.25 (p.110) Fit AR(2) to rec data by OLS

regr = ar.ols(rec, order=2, demean=FALSE, intercept=TRUE)
fore = predict(regr, n.ahead=24)
ts.plot(rec, fore$pred, col=1:2, xlim=c(1980,1990), ylab="Recruitment")

# 95% CI
U = fore$pred+1.96*fore$se; L = fore$pred-1.96*fore$se
xx = c(time(U), rev(time(U))); yy = c(L, rev(U))
polygon(xx, yy, border = 8, col = gray(.6, alpha = .2))
lines(fore$pred, type="p", col=2)

Example 3.28 (p.115) Fit AR(2) to rec data by Yule-Walker

rec.yw = ar.yw(rec, order=2)
rec.yw$x.mean # = 62.26278 (mean estimate)
rec.yw$ar # = 1.3315874, -.4445447 (coefficient estimates)
sqrt(diag(rec.yw$asy.var.coef)) # = .04222637, .04222637 (standard errors)
rec.yw$var.pred # = 94.79912 (error variance estimate)
rec.pr = predict(rec.yw, n.ahead=24)
# 95% CI
U=rec.pr$pred+1.96*rec.pr$se
L=rec.pr$pred-1.96*rec.pr$se
minx=min(rec,L);maxx=max(rec,U)
ts.plot(rec, rec.pr$pred, xlim=c(1980,1990),ylim=c(minx,maxx))
lines(rec.pr$pred,col="red",type="o")
lines(U, col="blue", lty="dashed")
lines(L, col="blue", lty="dashed")

Example 3.31 (p.120) Fit AR(2) to rec data by MLE

rec.mle = ar.mle(rec, order=2)
rec.mle$x.mean # 62.26
rec.mle$ar # 1.35, -.46
sqrt(diag(rec.mle$asy.var.coef)) # .04, .04
rec.mle$var.pred # 89.34

1.4.3 Comparing the AR(2) coefficients estimated by OLS, YW, and MLE

            OLS              YW               MLE
\hat\phi    1.354, -0.463    1.332, -0.445    1.351, -0.461

  • The three methods give similar coefficient estimates.
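The comparison can be assembled directly from the three fits (a sketch; assumes regr, rec.yw, and rec.mle from the chunks above):

rbind(OLS = c(regr$ar), YW = rec.yw$ar, MLE = rec.mle$ar)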

1.5 Analysis of GNP Data

Example 3.39, 3.40 (p.136, 139)

plot(gnp)

val <- acf2(gnp, 50)

  • The ACF decays very slowly \to gnp is not stationary.
gnpgr = diff(log(gnp)) # growth rate
  • diff(log(X_t))=\log X_t-\log X_{t-1}=\log\frac{X_t}{X_{t-1}}\approx the growth rate
    • diff(log(X_t))>0: growing
    • diff(log(X_t))<0: shrinking
plot(gnpgr, ylab="GNP growth rate", lwd=2)
abline(h=mean(gnpgr), lty=2, col=2, lwd=2)
legend("topright", legend = "average growth rate", lty=2, lwd=2, col=2)

val <- acf2(gnpgr, 24)

  • ACF cuts off after lag 2 \Rightarrow MA(2)
  • PACF cuts off after lag 1 \Rightarrow AR(1)
1 2 3 4 5 6 7 8 9 10 11 12
ACF 0.35 0.19 -0.01 -0.12 -0.17 -0.11 -0.09 -0.04 0.04 0.05 0.03 -0.12
PACF 0.35 0.08 -0.11 -0.12 -0.09 0.01 -0.03 -0.02 0.05 0.01 -0.03 -0.17
13 14 15 16 17 18 19 20 21 22 23 24
ACF -0.13 -0.10 -0.11 0.05 0.07 0.10 0.06 0.07 -0.09 -0.05 -0.10 -0.05
PACF -0.06 0.02 -0.06 0.10 0.00 0.02 -0.04 0.01 -0.11 0.03 -0.03 0.00

1.5.1 Diagnostics

out1 <- sarima(gnpgr, 1, 0, 0) # AR(1)

out2 <- sarima(gnpgr, 0, 0, 2) # MA(2)

  • For the Ljung-Box statistic, both models' p-values are above 0.05, \therefore both AR(1) and MA(2) are adequate for gnpgr. The \psi-weights below show why the two fits are so close: the fitted AR(1)'s \psi_j die out after two or three lags, so it behaves almost like a short MA model.
ARMAtoMA(ar=.35, ma=0, 10) # prints psi-weights
##  [1] 3.500000e-01 1.225000e-01 4.287500e-02 1.500625e-02 5.252187e-03
##  [6] 1.838266e-03 6.433930e-04 2.251875e-04 7.881564e-05 2.758547e-05

1.5.2 Model selection

Both AR(1) and MA(2) are suitable. Then, which one is better?

  • Use AIC, AICc or BIC; the smaller the better.

model   AIC          AICc         BIC
AR(1)   -6.44694     -6.4466932   -6.4009579
MA(2)   -6.4501328   -6.4496369   -6.3888233

  • MA(2) has the smaller AIC and AICc, but AR(1) has the smaller BIC: AIC and AICc prefer MA(2), while BIC, which penalizes extra parameters more heavily, prefers the simpler AR(1).
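The criteria can be pulled from the sarima fits programmatically (a sketch; assumes out1 and out2 from the diagnostics above, and that your astsa version stores the criteria in $ICs; older versions expose $AIC, $AICc and $BIC instead):

rbind(AR1 = out1$ICs, MA2 = out2$ICs)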

1.6 Model Diagnostics

Example 3.41 (p.140) Diagnostics for the Glacial Varve Series

sarima(log(varve), 0, 1, 1, no.constant=TRUE) # ARIMA(0,1,1)

  • Some residual ACF values fall outside the CI and some Ljung-Box p-values are below 0.05, \therefore ARIMA(0,1,1) is not suitable for log(varve).
sarima(log(varve), 1, 1, 1, no.constant=TRUE) # ARIMA(1,1,1)

  • All residual ACF values are inside the CI and all p-values are above 0.05, \therefore ARIMA(1,1,1) is suitable for log(varve).

1.7 Regression with Autocorrelated Errors

Example 3.44 (p.144) Mortality, Temperature and Pollution

  • Fit a model to examine how temperature, particulate levels, and cardiovascular mortality are related.

M_t=\beta_1+\beta_2t+\beta_3T_t+\beta_4T^2_t+\beta_5P_t+x_t

trend = time(cmort); temp = tempr - mean(tempr); temp2 = temp^2
fit <- lm(cmort~trend + temp + temp2 + part, na.action=NULL)
round(summary(fit)$coef, 3) |> knitr::kable()
Estimate Std. Error t value Pr(>|t|)
(Intercept) 2831.490 199.557 14.189 0
trend -1.396 0.101 -13.820 0
temp -0.472 0.032 -14.941 0
temp2 0.023 0.003 7.990 0
part 0.255 0.019 13.541 0
acf2(resid(fit), 52) # implies AR2

sarima(cmort, 2,0,0, xreg=cbind(trend,temp,temp2,part))

  • A few residual ACF values fall outside the CI even though the p-values are above 0.05, \therefore the AR(2) error model is not entirely adequate for cmort.

Example 3.45 (p.145) Regression with Lagged Variables

R_t=\beta_0+\beta_1S_{t-6}+\beta_2D_{t-6}+\beta_3D_{t-6}S_{t-6}+w_t

dummy = ifelse(soi<0, 0, 1)
fish = ts.intersect(rec, soiL6=lag(soi,-6), dL6=lag(dummy,-6), dframe=TRUE)
fit <- lm(rec ~ soiL6*dL6, data=fish, na.action=NULL)
round(summary(fit)$coef, 3) |> knitr::kable()
Estimate Std. Error t value Pr(>|t|)
(Intercept) 74.479 2.865 25.998 0.000
soiL6 -15.358 7.401 -2.075 0.039
dL6 -1.139 3.711 -0.307 0.759
soiL6:dL6 -51.244 9.523 -5.381 0.000
attach(fish)
## The following object is masked from package:astsa:
## 
##     rec
plot(resid(fit))

acf2(resid(fit)) # indicates AR(2)

intract = soiL6*dL6 # interaction term
sarima(rec,2,0,0, xreg = cbind(soiL6, dL6, intract))

  • Some residual ACF values fall outside the CI and some p-values are below 0.05, \therefore this AR(2) error model is not suitable.

1.8 Multiplicative Seasonal ARIMA Models

  • Take a first difference before looking at seasonality.
  • For seasonal data, handle the seasonality first, then decide whether differencing is still needed.

Example 3.46 (p.146) A Seasonal AR Series

(1-0.9B^{12})x_t = w_t

set.seed(666)
phi = c(rep(0,11),.9)
sAR = arima.sim(list(order=c(12,0,0), ar=phi), n=37)
sAR = ts(sAR, freq=12)
layout(matrix(c(1,1,2, 1,1,3), nc=2))
par(mar=c(3,3,2,1), mgp=c(1.6,.6,0))
plot(sAR, axes=FALSE, main='seasonal AR(1)', xlab="year", type='c')
Months = c("J","F","M","A","M","J","J","A","S","O","N","D")
points(sAR, pch=Months, cex=1.25, font=4, col=1:4)
axis(1, 1:4); abline(v=1:4, lty=2, col=gray(.7))
axis(2); box()
ACF = ARMAacf(ar=phi, ma=0, 100)
PACF = ARMAacf(ar=phi, ma=0, 100, pacf=TRUE)
plot(ACF,type="h", xlab="LAG", ylim=c(-.1,1)); abline(h=0)
plot(PACF, type="h", xlab="LAG", ylim=c(-.1,1)); abline(h=0)

  • The theoretical ACF is nonzero only at the seasonal lags 12, 24, 36, …, where it tails off.
  • The theoretical PACF cuts off after lag 12.
acf2(sAR, max.lag = 35)

  • The same patterns show up in the sample ACF and PACF.

Example 3.47 (p.145) seasonal ARMA

X_t-.8X_{t-12}=W_t-.5W_{t-1}

phi = c(rep(0,11),.8)
ACF = ARMAacf(ar=phi, ma=-.5, 50)[-1] # [-1] removes 0 lag
PACF = ARMAacf(ar=phi, ma=-.5, 50, pacf=TRUE)
par(mfrow=c(1,2))
plot(ACF, type="h", xlab="LAG", ylim=c(-.4,.8)); abline(h=0)
axis(1, at=0:50, label=F); axis(1, at=seq(0,50,by=10), lwd=2)
plot(PACF, type="h", xlab="LAG", ylim=c(-.4,.8)); abline(h=0)
axis(1, at=0:50, label=F); axis(1, at=seq(0,50,by=10), lwd=2)

  • The PACF cuts off at lag 12, so a seasonal AR(1)_{12} term is suitable.
  • The ACF cuts off at lag 1, so an MA(1) term is suitable.

1.9 Periodogram

Example 4.1 (p.167) A Periodic Series

x1 = 2*cos(2*pi*1:100*6/100) + 3*sin(2*pi*1:100*6/100)
x2 = 4*cos(2*pi*1:100*10/100) + 5*sin(2*pi*1:100*10/100)
x3 = 6*cos(2*pi*1:100*40/100) + 7*sin(2*pi*1:100*40/100)
x = x1 + x2 + x3

par(mfrow=c(2,2), mar=c(4,4,2,1))
plot.ts(x1, ylim=c(-10,10), main=expression(omega==6/100~~~A^2==13))
plot.ts(x2, ylim=c(-10,10), main=expression(omega==10/100~~~A^2==41))
plot.ts(x3, ylim=c(-10,10), main=expression(omega==40/100~~~A^2==85))
plot.ts(x, ylim=c(-16,16), main="sum")

When several periodic components are mixed together, the individual periods are hard to see in the sum.

  • A: amplitude of the component
    • for x_1, A^2=2^2+3^2=13, so the maximum and minimum values x_1 attains are \pm\sqrt{13}=\pm3.61
  • \omega=\frac jn: j = the number of cycles completed in the n observations

Example 4.2 (p.169) Estimation and the Periodogram

P = Mod(2*fft(x)/100)^2; Fr = 0:99/100
plot(Fr, P, type="o", xlab="frequency", ylab="scaled periodogram")
abline(v=0.5, col="red", lty=2)
points(x=Fr[7], y=P[7], col="red", pch=1, cex=3);text(x=Fr[7], y=P[7], labels = "6/100", pos=3, col="red")
points(x=Fr[11], y=P[11], col="red", pch=1, cex=3);text(x=Fr[11], y=P[11], labels = "10/100", pos=3, col="red")
points(x=Fr[41], y=P[41], col="red", pch=1, cex=3);text(x=Fr[41], y=P[41], labels = "40/100", pos=1, col="red")

  • the plot is symmetric: I(\omega_j)=I(\omega_{n-j})
  • Fr starts at frequency 0, so Fr[j+1]=\frac jn; this is why P[7] marks \omega=6/100
  • period: T=\frac nj
    • T_1=\frac {100}6, T_2=\frac{100}{10}, T_3=\frac{100}{40}
  • \omega_j=\frac jn
    • \omega_1=\frac 6{100}, \omega_2=\frac {10}{100}, \omega_3=\frac {40}{100}

Example 4.7 (p.176) spectral density

par(mfrow=c(3,1), mar=c(4,4,1,1))
arma.spec(main="White Noise")
arma.spec(ma=.5, main="Moving Average")
arma.spec(ar=c(1,-.9), main="Autoregression")

  • Top: White Noise
  • Middle: MA(1), \theta_1=0.5
  • Bottom: AR(2), \phi_1=1, \phi_2=-0.9

Example 4.10 (p.182) Spectral ANOVA

periodogram is symmetric

\omega_1=\frac15,\omega_2=\frac25

SS=2\cdot I(\omega_j)

x = c(1, 2, 3, 2, 1)
c1 = cos(2*pi*1:5*1/5); s1 = sin(2*pi*1:5*1/5)
c2 = cos(2*pi*1:5*2/5); s2 = sin(2*pi*1:5*2/5)
omega1 = cbind(c1, s1); omega2 = cbind(c2, s2)
anova(lm(x~omega1+omega2)) # ANOVA Table
## Analysis of Variance Table
## 
## Response: x
##           Df  Sum Sq Mean Sq F value Pr(>F)
## omega1     2 2.74164 1.37082     NaN    NaN
## omega2     2 0.05836 0.02918     NaN    NaN
## Residuals  0 0.00000     NaN
  • 2.74=2\cdot I(\frac15)=2\cdot1.37

  • The output below gives I(0), I(\frac15), I(\frac25), I(\frac35), I(\frac45):

Mod(fft(x))^2/5 # the periodogram (as a check)
## [1] 16.20000000  1.37082039  0.02917961  0.02917961  1.37082039

Example 4.13 (p.187) CI

  • n=480; \because the periodogram is symmetric, only half of it needs to be shown
  • 1 unit on the frequency axis = 40 ordinates (soi is monthly, so frequency 1 cycle per year corresponds to j=40, since 40/480=1/12)
par(mfrow=c(2,1), mar=c(4,4,1,1))
soi.per = mvspec(soi, log="no")
abline(v=1/4, lty=2)
rec.per = mvspec(rec, log="no")
abline(v=1/4, lty=2)

  • peaks appear at j=40 (\omega=1/12, the yearly cycle) and j=10 (\omega=1/48, the El Niño cycle)

\frac{2I(\omega_j)}{f(\omega_j)}\sim\chi^2(2)

\text{95% CI for }f(\omega_j):\ \Big[\frac{2I(\omega_j)}{\chi^2_{.975}(2)}, \frac{2I(\omega_j)}{\chi^2_{.025}(2)}\Big]

  • The periodogram ordinates I(\omega_j):
soi.per$spec[40] # 0.97223; soi pgram at freq 1/12 = 40/480
soi.per$spec[10] # 0.05372; soi pgram at freq 1/48 = 10/480
## [1] 0.9722312
## [1] 0.05372962
  • The CIs below: an interval this wide is essentially uninformative.
U = qchisq(.025,2) # 0.05063
L = qchisq(.975,2) # 7.37775

2*soi.per$spec[10]/L # 0.01456
2*soi.per$spec[10]/U # 2.12220

2*soi.per$spec[40]/L # 0.26355
2*soi.per$spec[40]/U # 38.40108
## [1] 0.0145653
## [1] 2.122207
## [1] 0.2635573
## [1] 38.40108