6 General ARMA models

6.1 MA(q) process: Definition and properties

A moving average (MA) process of order q is a linear combination of the current white noise term and the q most recent past white noise terms, and is defined by

$$x_t = w_t + \beta_1 w_{t-1} + \ldots + \beta_q w_{t-q}$$

Here $w_t$ is white noise with zero mean and variance $\sigma_w^2$.

6.1.1 MA equation with backshift operator

The above equation can be rewritten in terms of the backward shift operator B

$$x_t = (1 + \beta_1 B + \beta_2 B^2 + \ldots + \beta_q B^q)\,w_t = \phi_q(B)\,w_t$$

Here $\phi_q(B)$ is a polynomial of order $q$ in the backshift operator.

MA processes consist of a finite sum of stationary white noise terms

They are stationary and hence have a time-invariant mean and autocovariance

6.1.2 Mean and variance of MA process

The mean and variance for xt are easy to derive

The mean is just zero because it is a sum of terms that all have a mean of zero

The variance is $\sigma_w^2\,(1 + \beta_1^2 + \ldots + \beta_q^2)$

Each of the white noise terms has the same variance and the terms are mutually independent.

6.1.3 Autocorrelation function and MA process

The autocorrelation function for an MA(q) process can readily be implemented in R

Note that the function takes the lag $k$ and the model parameters $\beta_i$ for $i = 0, 1, \ldots, q$, with $\beta_0 = 1$
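
For reference, the theoretical result that this function implements is the standard MA(q) autocorrelation, written here with $\beta_0 = 1$:

$$\rho(k) =
\begin{cases}
1 & k = 0 \\
\dfrac{\sum_{i=0}^{q-k}\beta_i\,\beta_{i+k}}{\sum_{i=0}^{q}\beta_i^2} & k = 1, \ldots, q \\
0 & k > q
\end{cases}$$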

6.1.4 R codes

The autocorrelation function is computed in two stages using a for loop

rho <- function(k, beta) {
  # Theoretical ACF at lag k for an MA(q) process;
  # beta = c(1, beta_1, ..., beta_q), i.e. it includes beta_0 = 1
  q <- length(beta) - 1
  if (k > q) ACF <- 0 else {
    s1 <- 0; s2 <- 0
    for (i in 1:(q - k + 1)) s1 <- s1 + beta[i] * beta[i + k]  # autocovariance sum
    for (i in 1:(q + 1)) s2 <- s2 + beta[i]^2                  # variance sum
    ACF <- s1 / s2
  }
  ACF
}

The first loop generates a sum (s1) for the autocovariance

The second loop generates a sum (s2) for the variance with the division of the two sums giving the returned autocorrelation (ACF)
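
As a quick sanity check (a sketch using the built-in stats function ARMAacf, not part of the original code), the output of rho can be compared with R's theoretical ACF:

beta <- c(1, 0.7, 0.5, 0.2)
sapply(1:5, function(k) rho(k, beta))        # rho() from above; lags 1 to 5
ARMAacf(ma = c(0.7, 0.5, 0.2), lag.max = 5)  # includes lag 0; lags 1-3 should match, 4-5 are zero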

Using the code

Using the code above for the autocorrelation function, correlograms for a range of MA(q) processes can be plotted against lag

The code in the next slide provides an example for an MA(3) process with parameters $\beta_1 = 0.7$, $\beta_2 = 0.5$, and $\beta_3 = 0.2$

6.1.5 Plot of the autocorrelation function

beta <- c(1, 0.7, 0.5, 0.2)
rho.k <- rep(1, 10)
for (k in 1:10) rho.k[k] <- rho(k, beta)
plot(0:10, c(1, rho.k), pch = 4, ylab = expression(rho[k]))
abline(0,0)

The plot above shows the autocorrelation function for an MA(3) process with parameters $\beta_1 = 0.7$, $\beta_2 = 0.5$, $\beta_3 = 0.2$

For these (all positive) parameters the correlations at lags 1 to 3 are positive; with a parameter set containing negative coefficients (for example $\beta_1 = -0.7$, $\beta_2 = 0.5$, $\beta_3 = -0.2$) the correlogram instead shows negative correlations at lags 1 and 3

The code below can be used to simulate the MA(3) process

Plot the correlogram of the simulated series

An example time plot and correlogram are shown in the figure

As expected, the first three autocorrelations are significantly different from 0

Other statistically significant correlations are attributable to random sampling variation

In the correlogram plot, 1 in 20 (5%) of the sample correlations at lags greater than 3, for which the underlying population correlation is zero, are expected to be statistically significantly different from zero at the 5% level, because multiple t-test results are being shown on the plot

set.seed(1)
b <- c(0.8, 0.6, 0.4)
x <- w <- rnorm(1000)
# x starts equal to w, so after the loop x[t] = w[t] + b[1] w[t-1] + b[2] w[t-2] + b[3] w[t-3]
for (t in 4:1000) {
  for (j in 1:3) x[t] <- x[t] + b[j] * w[t - j]
}
plot(x, type = "l")

acf(x)
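
As an aside (a sketch, not part of the original code), an MA(3) series with the same coefficients can be simulated more directly with the built-in arima.sim function, which is introduced formally later in this topic:

set.seed(1)
x2 <- arima.sim(n = 1000, list(ma = c(0.8, 0.6, 0.4)))  # MA(3) with the same coefficients
acf(x2)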

6.1.6 Fitted MA models

Model fitted to simulated series

An MA(q) model can be fitted to data in R using the arima function with the order argument set to c(0, 0, q)

Unlike the function ar, the function arima does not subtract the mean by default and estimates an intercept

Because MA models cannot be expressed in a multiple regression form, the parameters are estimated with a numerical algorithm

The function arima minimises the conditional sum of squares to estimate values of the parameters and will either return these if method=c("CSS") is specified or use them as initial values for MLE

In the following code, a moving average model, x.ma, is fitted to the simulated series of the last section

Looking at the parameter estimates, it can be seen that the 95% confidence intervals (approximated by the coefficient ± 2 s.e. of the coefficient) contain the underlying parameter values (0.8, 0.6 and 0.4) that were used in the simulations.

Furthermore, the intercept is not significantly different from its underlying parameter value of zero

x.ma <- arima(x, order = c(0, 0, 3))
x.ma

Call:
arima(x = x, order = c(0, 0, 3))

Coefficients:
         ma1     ma2     ma3  intercept
      0.7898  0.5665  0.3959    -0.0322
s.e.  0.0307  0.0351  0.0320     0.0898

sigma^2 estimated as 1.068:  log likelihood = -1452.41,  aic = 2914.83
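
To make the confidence-interval check concrete, rough 95% intervals for the coefficients can be computed from the fitted object (a sketch; var.coef is the estimated covariance matrix of the coefficients returned by arima):

est <- coef(x.ma)
se <- sqrt(diag(x.ma$var.coef))                    # standard errors of the estimates
cbind(lower = est - 2 * se, upper = est + 2 * se)  # approximate 95% confidence intervals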

Here is the acf

acf(x.ma$res[-1])

6.1.7 Stationarity

A stochastic process is said to be stationary if its properties are unaffected by a change of time origin. The time origin is simply the reference point on the time axis, for example the starting period or the centre of the time range. In other words, the joint probability distribution at any set of times does not change shape, or change in any way, if the time axis is shifted. Why do we talk about a joint probability distribution here? Because the value in every time period is a realisation of a random variable: $Y_t$ is a random variable, and so is $Y_{t-1}$. When a time series consists of these variables, the distribution of the series is a joint distribution.

Strict stationarity

A time series model $x_t$ is strictly stationary if the joint distribution of $x_{t_1}, \ldots, x_{t_n}$ is the same as the joint distribution of $x_{t_1+m}, \ldots, x_{t_n+m}$ for all $t_1, \ldots, t_n$ and $m$, so that the distribution is unchanged after an arbitrary time shift.

Strict stationarity implies that the mean and variance are constant in time

The autocovariance $\mathrm{Cov}(x_t, x_s)$ depends only on the lag $k = |t - s|$ and can be written $\gamma(k)$

Weak stationarity

If a series is not strictly stationary but its mean and variance are constant in time and its autocovariance depends only on the lag, then the series is called second-order stationary.

Strict stationarity is a very demanding condition: the whole distribution must remain unchanged under any shift of the time axis, which means that moments of every order must be equal. This is usually impractical to verify, so we impose a softer version, which we call weak stationarity. Weak stationarity requires only that the mean and variance are constant, and that the covariance is the same whenever the time difference between the variables is the same. This means:

$$\mathrm{Cov}(Y_1, Y_5) = \mathrm{Cov}(Y_{17}, Y_{21})$$

This is also called covariance stationarity. We can express covariance stationarity formally in the following way:

$$E(Y_t) = \mu < \infty$$
$$\mathrm{Var}(Y_t) = \sigma^2 < \infty$$
$$\mathrm{Cov}(Y_t, Y_{t-k}) = E\big[(Y_t - E(Y_t))(Y_{t-k} - E(Y_{t-k}))\big] = E\big[(Y_t - \mu)(Y_{t-k} - \mu)\big] = \gamma_k, \qquad k = 1, 2, 3, \ldots$$

Here, the first two conditions imply that a weakly stationary process is characterised by a constant, finite mean and variance. The third implies that the covariance between any two time periods does not depend on which periods they are; it depends only on the time lag between them.

If the white noise is Gaussian, the stochastic process is completely defined by its mean and covariance structure, in the same way as any normal distribution is defined by its mean and variance-covariance matrix.

The first step in any analysis should be to check whether there is any evidence of a trend or seasonal effects

If there is, we will have to remove them

It is often reasonable to treat the time series of residuals as a realisation of a stationary error series.

Therefore, these models in this topic are often fitted to residual series arising from regression analyses

6.1.8 Autocorrelation function (ACF)

Autocovariance

Since we are trying to find the covariance between a variable and a lagged version of the same variable, we call it the autocovariance. We define the autocovariance at lag $k$ as:

$$\gamma_k = \mathrm{Cov}(Y_t, Y_{t-k}) = \mathrm{Cov}(Y_{t-k}, Y_t)$$

When $k = 0$, $\gamma_k$ reduces to:

$$\gamma_0 = \mathrm{Cov}(Y_t, Y_t) = \mathrm{Var}(Y_t)$$

Autocorrelation

One problem with covariance is that its value is not independent of the units in which the variables are measured. For example, $\mathrm{Cov}(Y_1, Y_5)$ and $\mathrm{Cov}(Y_{12}, Y_{16})$ might not be equal simply because the scales differ. For example, let's say $Y_t$ measures the CPI: in early time periods the CPI usually had lower values, but it shifted upwards in later periods. It is better to use a correlation measure, which is unit free and standardises the covariance so that we can compare across different series. Here we will derive $\rho_k$, the $k$-th order autocorrelation:

$$\rho_k = \frac{\mathrm{Cov}(Y_t, Y_{t-k})}{\sqrt{\mathrm{Var}(Y_t)\,\mathrm{Var}(Y_{t-k})}} = \frac{\gamma_k}{\gamma_0}$$

(the last equality uses stationarity, so that $\mathrm{Var}(Y_t) = \mathrm{Var}(Y_{t-k}) = \gamma_0$)

Obviously, here $\rho_0 = \frac{\mathrm{Cov}(Y_t, Y_t)}{\gamma_0} = \frac{\mathrm{Var}(Y_t)}{\gamma_0} = \frac{\gamma_0}{\gamma_0} = 1$. And also, as usual, $-1 \le \rho_k \le 1$, which is standard for autocorrelation measures.

Autocorrelation Function (ACF)

The autocorrelation considered as a function of the lag order $k$ is called the autocorrelation function, or ACF. The ACF plays a major role in understanding a time series: it tells us how deep the memory of the underlying stochastic process is. If the process has long memory, as in the case of AR(1), the value of $\rho_k$ does not diminish to zero with increasing $k$ as quickly as it does for MA(1). We have already seen that the covariance between two periods of an MA(1) process drops to zero as soon as the lag exceeds one period. Therefore, by looking at the ACF we can get a fair idea about the underlying stochastic process.

ACF of AR(1) and MA(1)

We have just seen that we can express the autocorrelation of order $k$ as $\rho_k = \frac{\gamma_k}{\gamma_0}$. In the case of AR(1), we have seen in one of the previous equations that

$$\mathrm{Cov}(Y_t, Y_{t-k}) = \gamma_k = \theta^k\frac{\sigma^2}{1-\theta^2} = \theta^k\gamma_0$$

and we have also seen that

$$\mathrm{Var}(Y_t) = \gamma_0 = \frac{\sigma^2}{1-\theta^2}$$

So we can say that, in the case of AR(1),

$$\rho_k = \frac{\gamma_k}{\gamma_0} = \frac{\theta^k\gamma_0}{\gamma_0} = \theta^k$$

In the case of MA(1), we have also seen previously that

$$\gamma_1 = \mathrm{Cov}(Y_t, Y_{t-1}) = \alpha\sigma^2, \qquad \gamma_0 = \mathrm{Var}(Y_t) = \sigma^2(1+\alpha^2)$$

Therefore, from the above two we can write

$$\rho_1 = \frac{\gamma_1}{\gamma_0} = \frac{\alpha\sigma^2}{\sigma^2(1+\alpha^2)} = \frac{\alpha}{1+\alpha^2}$$

And for $k > 1$, we have also seen that $\gamma_k = 0$. Therefore,

$$\rho_k = 0 \quad \text{for } k > 1$$

The implication of the above derivation is that a shock in MA(1) will only last two periods, affecting $Y_t$ and $Y_{t+1}$, while a shock in AR(1) will affect all future observations with a decreasing effect, since $|\theta| < 1$.

Graphical Representation

Graphs of simulated AR(1) and MA(1) processes go in here.
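
A minimal sketch of such graphs, using arima.sim with illustrative parameters ($\theta = 0.7$ for the AR(1) and $\alpha = 0.7$ for the MA(1); the values are my choice, not from the original):

set.seed(123)
y.ar <- arima.sim(n = 500, list(ar = 0.7))   # AR(1) with theta = 0.7
y.ma <- arima.sim(n = 500, list(ma = 0.7))   # MA(1) with alpha = 0.7
layout(matrix(1:4, 2, 2))
plot(y.ar, main = "AR(1)"); acf(y.ar, main = "ACF of AR(1)")
plot(y.ma, main = "MA(1)"); acf(y.ma, main = "ACF of MA(1)")

The AR(1) correlogram should tail off geometrically, while the MA(1) correlogram should cut off after lag 1.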

6.2 Formulating ARMA process

In this section, we start to define more general autoregressive and moving average processes. First we define a moving average of order $q$:

$$y_t = \varepsilon_t + \alpha_1\varepsilon_{t-1} + \alpha_2\varepsilon_{t-2} + \ldots + \alpha_q\varepsilon_{t-q}$$

Here $y_t = Y_t - \mu$ and $\varepsilon_t$ is a white noise process. This means that the demeaned series $y_t$ is a weighted combination (can we also say weighted average?) of $q+1$ white noise terms; there are $q+1$ terms because the $q$ lagged terms start from the second term. On the other hand, an autoregressive process of order $p$, denoted AR(p), can be expressed as:

$$y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \theta_3 y_{t-3} + \ldots + \theta_p y_{t-p} + \varepsilon_t$$

Obviously it is very much possible to combine the MA(q) and AR(p) and come up with an ARMA(p,q) specification of the following form:

$$y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \ldots + \theta_p y_{t-p} + \varepsilon_t + \alpha_1\varepsilon_{t-1} + \ldots + \alpha_q\varepsilon_{t-q}$$

Now, the next million dollar question is when to choose an AR, MA or ARMA model. Verbeek remains quite vague about this at this point. He says AR and MA are basically the same type of series; it is just a matter of parsimony. What is the meaning of this? We have previously seen that AR(1) can be expressed as an MA series of infinite lag order. So what does parsimony mean here? Does it mean that if we want a smaller model we choose AR(1) over an infinite-order MA? Verbeek says it will become clear later; at this point it is difficult to know, so we will postpone the discussion and come back to it later. Now we will talk briefly about the lag operator.

6.2.1 Lag operator

In time series notation it is often convenient to use the lag operator ($L$), also called the backshift operator ($B$) by some authors; we will stick to $L$ here. Let's see how it is used:

$$L y_t = y_{t-1}, \qquad L^2 y_t = L \cdot L y_t = L \cdot y_{t-1} = y_{t-2}, \qquad \ldots, \qquad L^q y_t = y_{t-q}$$

There are other relationships involving $L$, such as $L^0 y_t = y_t$ and $L^{-1} y_t = y_{t+1}$. Use of this notation makes it much simpler to specify a long time series model, such as an ARMA model, quite concisely. For example, let's start with an AR(1) model:

$$y_t = \theta y_{t-1} + \varepsilon_t = \theta L y_t + \varepsilon_t \;\Longrightarrow\; y_t - \theta L y_t = \varepsilon_t \;\Longrightarrow\; (1-\theta L)\,y_t = \varepsilon_t$$

We can read this as follows: $y_t$ and its one-period lag $y_{t-1}$ are combined with weights $1$ and $-\theta$, and the combination equals a white noise process ($\varepsilon_t$). In general, we can write an AR(p) as $\theta(L)\,y_t = \varepsilon_t$, where $\theta(L)$ is a polynomial of order $p$ in $L$. I think of $\theta(L)$ as a kind of function of $L$; we could write it as $f(L)$. Writing it as $\theta(L)$ creates some confusion because $\theta$ also appears inside the function. Now, $\theta(L)$ can be expressed as:

$$\theta(L) = 1 - \theta_1 L - \theta_2 L^2 - \ldots - \theta_p L^p$$

We can interpret a lag polynomial as a filter: when we apply it to a time series, it produces a new series.
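
A small sketch of this filter idea (using base R's stats::filter; the value $\theta = 0.6$ is just for illustration): applying the lag polynomial $(1 - \theta L)$ to an AR(1) series with parameter $\theta$ should produce something that looks like white noise.

theta <- 0.6
set.seed(123)
y <- arima.sim(n = 200, list(ar = theta))                                      # AR(1) series
eps.hat <- stats::filter(y, c(1, -theta), method = "convolution", sides = 1)   # applies (1 - theta L) to y
acf(na.omit(eps.hat))                                                          # should resemble white noise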

6.2.2 Characteristics of Lag Polynomial

Suppose we apply a lag polynomial to a time series and then apply another lag polynomial on top of the resulting series. This is equivalent to applying the product of the two lag polynomials to the original series. We can also define the inverse of a filter, which is the inverse of the polynomial: the inverse of $\theta(L)$ is $\theta^{-1}(L)$, and we can write $\theta(L)\theta^{-1}(L) = 1$. Now, the following statement I really don't understand.

If $\theta(L)$ is a finite-order polynomial, then $\theta^{-1}(L)$ is an infinite-order polynomial in $L$. It also goes on to claim that

$$(1-\theta L)^{-1} = \sum_{j=0}^{\infty}\theta^j L^j$$

provided that $|\theta| < 1$. Now the question is how to prove that claim. Let's start simply with $\sum_{j=0}^{\infty}\theta^j$.

$$\sum_{j=0}^{\infty}\theta^j = \theta^0 + \theta^1 + \theta^2 + \ldots = \frac{1}{1-\theta}, \quad \text{provided that } |\theta| < 1$$

Now let's see this version:

$$\sum_{j=0}^{\infty}\theta^j L^j = \theta^0 L^0 + \theta^1 L^1 + \theta^2 L^2 + \ldots = 1 + \theta L + (\theta L)^2 + \ldots = \frac{1}{1-\theta L} = (1-\theta L)^{-1}, \quad \text{provided that } |\theta| < 1$$

So we find that

$$\sum_{j=0}^{\infty}\theta^j L^j = \frac{1}{1-\theta L} = (1-\theta L)^{-1}$$

Now, we have seen above that $(1-\theta L)y_t = \varepsilon_t$. From this it follows that:

$$(1-\theta L)^{-1}(1-\theta L)y_t = (1-\theta L)^{-1}\varepsilon_t \;\Longrightarrow\; y_t = (1-\theta L)^{-1}\varepsilon_t = \sum_{j=0}^{\infty}\theta^j L^j\varepsilon_t$$

From the previous definition of the lag operator, $L\varepsilon_t = \varepsilon_{t-1}$, $L^2\varepsilon_t = \varepsilon_{t-2}$, ..., $L^j\varepsilon_t = \varepsilon_{t-j}$. Therefore, we can write

$$y_t = \sum_{j=0}^{\infty}\theta^j\varepsilon_{t-j}$$
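
This MA($\infty$) representation can be checked numerically with the built-in ARMAtoMA function, which returns the MA coefficients of an ARMA model; for an AR(1) with an illustrative $\theta = 0.6$ they should equal $\theta^j$ (a sketch):

theta <- 0.6
ARMAtoMA(ar = theta, lag.max = 8)  # MA(infinity) weights psi_1, ..., psi_8
theta^(1:8)                        # should match theta^j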

6.3 From MA(1) to AR($\infty$)

This corresponds to the same kind of derivation in which we showed that an AR(1) corresponds to an MA($\infty$) series. We said previously that an MA(1) can also be transformed into an AR series, but could not quite show it mathematically. Armed with the lag operator ($L$), we can try to show it here. We know that for an MA(1)

$$y_t = \varepsilon_t + \alpha\varepsilon_{t-1} = \varepsilon_t + \alpha L\varepsilon_t = (1+\alpha L)\varepsilon_t \;\Longrightarrow\; (1+\alpha L)^{-1} y_t = \varepsilon_t$$

Now, let's see how we can define $(1+\alpha L)^{-1}$.

$$(1+\alpha L)^{-1} = \frac{1}{1+\alpha L} = \frac{1}{1-(-\alpha L)} = 1 + (-\alpha L) + (-\alpha L)^2 + (-\alpha L)^3 + \ldots = \sum_{j=0}^{\infty}(-\alpha L)^j$$

Therefore, we can write $(1+\alpha L)^{-1} = \sum_{j=0}^{\infty}(-\alpha L)^j$. Basically, we see that

$$\varepsilon_t = \sum_{j=0}^{\infty}(-\alpha)^j L^j y_t = \sum_{j=0}^{\infty}(-\alpha)^j y_{t-j}$$

Now, let's figure out $\varepsilon_{t-1}$. This is important since we write the MA(1) as $y_t = \alpha\varepsilon_{t-1} + \varepsilon_t$; what we will do is substitute the value of $\varepsilon_{t-1}$ and keep $\varepsilon_t$ as it is.

$$y_{t-1} = \varepsilon_{t-1} + \alpha\varepsilon_{t-2} = \varepsilon_{t-1} + \alpha L\varepsilon_{t-1} = (1+\alpha L)\varepsilon_{t-1}$$

or,

$$\varepsilon_{t-1} = (1+\alpha L)^{-1}y_{t-1} = \sum_{j=0}^{\infty}(-\alpha L)^j y_{t-1} = \sum_{j=0}^{\infty}(-\alpha)^j L^j y_{t-1} = \sum_{j=0}^{\infty}(-\alpha)^j y_{t-j-1}$$

So we can write $\varepsilon_{t-1} = \sum_{j=0}^{\infty}(-\alpha)^j y_{t-j-1}$. Now, getting back to the MA(1),

$$y_t = \alpha\varepsilon_{t-1} + \varepsilon_t = \alpha\sum_{j=0}^{\infty}(-\alpha)^j y_{t-j-1} + \varepsilon_t$$

Here we require that the lag polynomial in the MA(1), which is $(1+\alpha L)$, is invertible. It can be shown that it is invertible only if $|\alpha| < 1$. The AR representation is very convenient when we think that current behaviour is determined by past behaviour. The MA representation is quite useful for determining the variance and covariance of the process. It is not quite clear how, but it might become clear in the subsequent analysis.
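
A small numerical sketch of this inversion (my own construction, with an illustrative $\alpha = 0.5$): simulate an MA(1), reconstruct the innovations from the truncated AR($\infty$) sum $\sum_j(-\alpha)^j y_{t-j}$, and compare them with the true white noise.

set.seed(42)
n <- 500; alpha <- 0.5
w <- rnorm(n)
y <- w + alpha * c(0, head(w, -1))                 # MA(1): y_t = w_t + alpha * w_{t-1}
J <- 30                                            # truncation point for the infinite sum
pi.j <- (-alpha)^(0:J)                             # AR(infinity) weights (-alpha)^j
eps.hat <- stats::filter(y, pi.j, method = "convolution", sides = 1)
cor(eps.hat[-(1:J)], w[-(1:J)])                    # should be very close to 1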

6.4 Parsimonious representation of ARMA

We have seen previously the following ARMA representation:

$$y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \ldots + \theta_p y_{t-p} + \varepsilon_t + \alpha_1\varepsilon_{t-1} + \ldots + \alpha_q\varepsilon_{t-q}$$

or,

$$y_t - \theta_1 y_{t-1} - \ldots - \theta_p y_{t-p} = \alpha(L)\varepsilon_t$$

or,

$$\theta(L)\,y_t = \alpha(L)\,\varepsilon_t$$

Now, if the lag polynomial on the AR side is invertible, then

$$y_t = \theta^{-1}(L)\,\alpha(L)\,\varepsilon_t$$

On the other hand, if the lag polynomial on the MA side is invertible, then we can write

$$\alpha^{-1}(L)\,\theta(L)\,y_t = \varepsilon_t$$

6.5 Invertibility of Lag Polynomial

6.5.0.1 Generalization

We have seen that the first-order lag polynomial $1 - \theta L$ is invertible if $|\theta| < 1$. But it is not clear at this point what we mean by invertibility. Does it mean whether we can find the value of $\frac{1}{1-\theta L}$? We can find the value of that expression even when $|\theta| > 1$; it is just negative. Now, whatever invertibility is, let's move on and try to generalise this condition. We first start with a second-order polynomial and then move on to polynomials of higher degree.

6.5.1 Second Order Polynomial

Let’s consider this second order polynomial:

$$1 - \theta_1 L - \theta_2 L^2$$

How do we solve this type of expression? Well, this is essentially a quadratic of the following form:

$$a x^2 + b x + c$$

Here $a = -\theta_2$, $b = -\theta_1$, $c = 1$. Now the typical solution of this type of equation is:

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$$

Now let's think about the following numerical example:

$$x^2 - 8x + 15 = x^2 - 3x - 5x + 15 = x(x-3) - 5(x-3) = (x-3)(x-5) = (3-x)(5-x) = 15\,(1 - x/3)(1 - x/5)$$

Now let’s put the above numerical example into a general mathematical form:

$$1 - \theta_1 L - \theta_2 L^2 = (1 - \varphi_1 L)(1 - \varphi_2 L)$$

Now, if we compare this factorised form with the numerical example, we find that $\varphi_1 = 1/3$ and $\varphi_2 = 1/5$. But the presence of the factor 15 complicates the matter. Anyway, let's see whether we can resolve the issue. The author says that we can write $\varphi_1 + \varphi_2 = \theta_1$; it sounds like there is some relation to the usual quadratic formula.

Now relate this to the numerical example. If we take $\varphi_1 = 1/3$ and $\varphi_2 = 1/5$, then $\varphi_1 + \varphi_2 = 1/3 + 1/5 = \frac{5+3}{15} = 8/15$. Now we see what complicates the situation: it is the factor of 15. Comparing with the numerical example, $\theta_1 = 8$, and $\varphi_1 + \varphi_2 = 8/15$ does not quite equal 8; but if we multiply by 15, then $15 \times 8/15 = 8$.

The relationship is also such that $\varphi_1\varphi_2 = \theta_2$. Now let's check whether this holds. We have $1/3 \times 1/5 = 1/15$, but again we have to multiply by 15 to get the desired result. Hence we get 1, which corresponds to the coefficient of $x^2$ in the numerical example. OK, that is fine.

6.5.1.1 Condition for invertibility

But the part that I don't understand is the condition for invertibility of the polynomial. It says that the second-order polynomial above will be invertible if both of the first-order polynomials are invertible. The first-order polynomials are $(1-\varphi_1 L)$ and $(1-\varphi_2 L)$; these two have to be invertible for the whole quadratic polynomial to be invertible. This implies that both $\varphi_1$ and $\varphi_2$ have to be less than one in absolute value, that is:

$$|\varphi_1| < 1 \quad \text{and} \quad |\varphi_2| < 1$$

Why this is so is not clear; we will have to look into it in greater detail. Anyway, the author moves on to say that these requirements can also be formulated in terms of the so-called characteristic equation. Does that mean the characteristic equation is just another way of stating these invertibility requirements? Let's first specify the characteristic equation:

$$(1 - \varphi_1 z)(1 - \varphi_2 z) = 0$$

When we look at this equation, the first thing that comes to mind is why we suddenly use $z$ instead of $L$ as before. My guess is that it has probably been done to give a more generic look to the characteristic equation; that is the only explanation I can find here.

The solution of this equation can be expressed by two values of $z$: $z_1$ and $z_2$. These are called the characteristic roots. (Why are they called the characteristic equation and characteristic roots?) Now, the invertibility condition requires that $|\varphi_i| < 1$, which translates into $|z_i| > 1$. Any solution satisfying $|z_i| \le 1$ will result in a non-invertible polynomial. If $|z_i|$ is exactly equal to 1, that solution is referred to as a unit root.

Invertibility with lag polynomial coefficients: Here the author discusses an easier way to detect the presence of a unit root in a lag polynomial. Remember, here we are talking about the lag polynomial, not the equation itself. It may create some confusion in the beginning; it certainly did in my case. Now, the statement that confused me is the following:

We can detect the presence of a unit root by noting that the polynomial $\theta(z)$ evaluated at $z = 1$ is zero if $\sum_{j=1}^{p}\theta_j = 1$. Thus the presence of a first unit root can be verified by checking whether the sum of the polynomial coefficients equals one. If the sum exceeds one, the polynomial is not invertible.

Now have a look at the above statement and try to evaluate it with the following example. We are considering the following AR(2) model:

$$y_t = 1.2\,y_{t-1} - 0.32\,y_{t-2} + \varepsilon_t$$

Now, looking at this equation, we can compute $\sum_{j=1}^{2}\theta_j = 1.2 + (-0.32) = 0.88$, which is less than one, so this might work just fine. We can express this equation a little more elaborately in lag polynomial form:

$$y_t = 1.2\,L y_t - 0.32\,L^2 y_t + \varepsilon_t$$
$$y_t(1 - 1.2L + 0.32L^2) = \varepsilon_t$$

We can also write this as

$$y_t(1 - 0.8L - 0.4L + 0.32L^2) = \varepsilon_t$$
$$y_t\big[(1 - 0.8L) - 0.4L(1 - 0.8L)\big] = \varepsilon_t$$
$$y_t(1 - 0.8L)(1 - 0.4L) = \varepsilon_t$$

This corresponds to the characteristic equation of the following form:

$$1 - 1.2z + 0.32z^2 = (1 - 0.4z)(1 - 0.8z) = 0$$
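
The factorisation can be checked numerically with base R's polyroot, which takes the polynomial coefficients in increasing order of power (a quick sketch):

polyroot(c(1, -1.2, 0.32))       # roots of 1 - 1.2z + 0.32z^2: 1.25 and 2.5
Mod(polyroot(c(1, -1.2, 0.32)))  # both moduli exceed 1, so no unit root

The roots 1.25 and 2.5 are the reciprocals of 0.8 and 0.4, both greater than one in modulus, consistent with an invertible polynomial.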

Now let's revisit the statement and relate it to the example above: we can detect the presence of a unit root by noting that the polynomial $\theta(z)$ evaluated at $z = 1$ is zero if $\sum_{j=1}^{p}\theta_j = 1$. Thus the presence of a first unit root can be verified by checking whether the sum of the polynomial coefficients equals one; if the sum exceeds one, the polynomial is not invertible.

Now let's check what the value of the characteristic polynomial would be when evaluated at $z = 1$.

$$1 - 1.2z + 0.32z^2 \,\Big|_{z=1} = 1 - 1.2(1) + 0.32(1)^2 = 1 - 1.2 + 0.32 = 0.12$$

We find that the value of the characteristic polynomial is 0.12, not zero, when $z = 1$. When would this value become 0? To see that, let's consider an equation for which it might become zero. How about this one:

$$y_t = 1.2\,y_{t-1} - 0.2\,y_{t-2} + \varepsilon_t$$

Here the sum of the coefficients in the lag polynomial is $1.2 - 0.2 = 1$. Now what is wrong with that? The left-hand side of the characteristic equation with $z = 1$ is:

$$1 - 1.2z + 0.2z^2 \,\Big|_{z=1} = 1 - 1.2 + 0.2 = 0$$

Here we see that when the sum of the coefficients of the lag polynomial is 1, the value of the polynomial is zero at $z = 1$.
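
The same polyroot check makes the unit root visible (a sketch):

polyroot(c(1, -1.2, 0.2))  # roots of 1 - 1.2z + 0.2z^2: exactly 1 (a unit root) and 5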

Consequences of invertibility: Now, what is the consequence of this? It is still not clear to me. In the case of a simple example it is obvious. For example:

$$y_t = 1.2\,y_{t-1} + \varepsilon_t$$

Here the lag polynomial is much simpler: $1 - 1.2L$. The sum of the lag coefficients here is 1.2 (there is only a first-degree lag, so we have just one term). It is greater than 1, so the polynomial is not invertible. The real significance of non-invertibility is that the series becomes non-stationary; at least that is what I understood. The borderline case is when the value of the lag coefficient is 1:

$$y_t = y_{t-1} + \varepsilon_t$$

The issue of whether lag polynomials are invertible or not is important for several reasons. For moving average models, it is important for estimation and prediction. For autoregressive models, as we already mentioned, it ensures stationarity.

Common roots: Now here we will talk about common roots. Let's look at an ARMA(2,1) process of the following form:

$$y_t = \theta_1 y_{t-1} + \theta_2 y_{t-2} + \varepsilon_t + \alpha_1\varepsilon_{t-1}$$
$$y_t = \theta_1 L y_t + \theta_2 L^2 y_t + \varepsilon_t + \alpha_1 L\varepsilon_t$$
$$y_t\big[1 - \theta_1 L - \theta_2 L^2\big] = \varepsilon_t\big[1 + \alpha_1 L\big]$$
$$y_t(1 - \varphi_1 L)(1 - \varphi_2 L) = \varepsilon_t(1 + \alpha_1 L)$$

Now, if the factors $(1 - \varphi_1 L)$ and $(1 + \alpha_1 L)$ coincide (that is, $\varphi_1 = -\alpha_1$), then

$$y_t(1 + \alpha_1 L)(1 - \varphi_2 L) = \varepsilon_t(1 + \alpha_1 L)$$
$$y_t(1 - \varphi_2 L) = \varepsilon_t$$

Above we see that we start with an ARMA(2,1) process and end up with an ARMA(2-1, 1-1), that is an AR(1), process. Because of the cancelling root the two specifications are observationally equivalent. In general, because of one cancelling root an ARMA(p,q) can be written as an ARMA(p-1, q-1).

An example could illustrate the situation better, but the above explanation should suffice here. The issue is that with a common cancelling root it is difficult to identify which is the real underlying stochastic process.

A time series will often have well-defined components: trend and seasonal patterns

A well chosen linear regression may account for these non-stationary components

In this case the residuals from the fitted model should not contain noticeable trend or seasonal patterns

But unfortunately that does not happen always

Residuals will usually be correlated in time, and this is not accounted for in the fitted regression model

This happens because similar values may cluster together in time

Adjacent observations may be negatively correlated; for example, a high monthly sales figure may be followed by an unusually low value because customers have supplies left over from the previous month

In this topic, we consider stationary models that may be suitable for residual series that contain no obvious trends or seasonal cycles

The fitted stationary models may then be combined with the fitted regression model to improve forecasts

The autoregressive models that were introduced often provide satisfactory models for the residual time series.

6.6 Mixed models: The ARMA process

Definition

An AR(p) process can be described in the following manner:

$$x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \ldots + \alpha_p x_{t-p} + w_t$$

Here wt is white noise and the αi are the model parameters.

A useful class of models is obtained when AR and MA terms are added together in a single expression

A time series xt follows an autoregressive moving average (ARMA) process of order (p,q), denoted ARMA(p,q) when

$$x_t = \alpha_1 x_{t-1} + \alpha_2 x_{t-2} + \ldots + \alpha_p x_{t-p} + w_t + \beta_1 w_{t-1} + \beta_2 w_{t-2} + \ldots + \beta_q w_{t-q}$$

Here wt is white noise

The above equation can be represented in terms of the backward shift operator B and rearranged in the more concise polynomial form

$$\theta_p(B)\,x_t = \phi_q(B)\,w_t$$

The following points should be noted about an ARMA(p,q) process:

The process is stationary when the roots of θ all exceed unity in absolute value

The process is invertible when the roots of ϕ all exceed unity in absolute value

The AR(p) model is the special case ARMA(p,0)

The MA(q) model is the special case ARMA(0,q)

Parameter parsimony : When fitting to data, an ARMA model will often be more parameter efficient - require fewer parameters - than a single MA or AR model

Parameter redundancy: when $\theta$ and $\phi$ share a common factor, a stationary model can be simplified. For example, the model $(1-\frac{1}{2}B)(1-\frac{1}{3}B)x_t = (1-\frac{1}{2}B)w_t$ can be written as $(1-\frac{1}{3}B)x_t = w_t$
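
A quick check of this redundancy with the built-in ARMAacf function (a sketch; in R's parameterisation the AR polynomial $(1-\frac{1}{2}B)(1-\frac{1}{3}B) = 1 - \frac{5}{6}B + \frac{1}{6}B^2$ corresponds to ar = c(5/6, -1/6), and the MA factor $(1-\frac{1}{2}B)$ to ma = -1/2): the theoretical ACFs of the redundant ARMA(2,1) and the simplified AR(1) coincide.

ARMAacf(ar = c(5/6, -1/6), ma = -1/2, lag.max = 5)  # redundant ARMA(2,1)
ARMAacf(ar = 1/3, lag.max = 5)                      # simplified AR(1); should be identical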

ARMA models: Empirical analysis

Simulation and fitting

The ARMA process can be simulated using the R function arima.sim

This function takes a list of coefficients representing the AR and MA parameters.

An ARMA(p,q) model can be fitted using the arima function with the order argument set to c(p, 0, q)

The fitting algorithm proceeds similarly to that for an MA process

Below data from an ARMA(1,1) process are simulated for α=-0.6 and β=0.5

As expected, the sample estimates of α and β are close to the underlying model parameters

set.seed(1)
x <- arima.sim(n = 10000, list(ar = -0.6, ma = 0.5))
coef(arima(x, order = c(1, 0, 1)))
         ar1          ma1    intercept 
-0.596966371  0.502703368 -0.006571345 

Exchange rate series

Previously, we have found that a simple MA(1) model failed to provide an adequate fit to the exchange rate series.

In the code below, fitted MA(1), AR(1) and ARMA(1,1) models are compared using the AIC

The ARMA(1,1) model provides the best fit to the data, followed by the AR(1) model, with the MA(1) model providing the poorest fit

The figure indicates that the residuals of the fitted ARMA(1,1) model have small autocorrelations

This is consistent with a realisation of white noise and supports the use of the model

x.ma <- arima(x.ts, order = c(0, 0, 1))
x.ar <- arima(x.ts, order = c(1, 0, 0))
x.arma <- arima(x.ts, order = c(1, 0, 1))

Now let's find the AICs

AIC(x.ma)
AIC(x.ar)
AIC(x.arma) 
acf(resid(x.arma))

Electricity production series

Let’s consider the Australian electricity production series

The data exhibit a clear positive trend and a regular seasonal cycle

The variance increases with time

This suggests a log-transformation may be appropriate

A regression model is fitted to the logarithms of the original series in the code below

www <- "http://www.massey.ac.nz/~pscowper/ts/cbe.dat"
CBE <- read.table(www, header = T)
Elec.ts <- ts(CBE[, 3], start = 1958, freq = 12)
Time <- 1:length(Elec.ts)
Imth <- cycle(Elec.ts)
Elec.lm <- lm(log(Elec.ts) ~ Time + I(Time^2) + factor(Imth))
acf(resid(Elec.lm))

The correlogram of the residuals appears to cycle with a period of 12 months suggesting that the monthly indicator variables are not sufficient to account for the seasonality in the series

When we study ARIMA, we will find that this can be accounted for using a non-stationary model with a stochastic seasonal component.

In the meantime, the best fitting ARMA(p,q) model can be chosen using the smallest AIC either by trying a range of combinations of p and q in the arima function

Or using a for loop with upper bounds on p and q

In each step of the for loop, the AIC of the fitted model is compared with the currently stored smallest value

If the model is found to be an improvement, that is, it has a smaller AIC value, then the new value and model are stored

To start with, best.aic is initialised to infinity (Inf)

After the loop is complete, the order of the best model can be found in best.order and the fitted model itself in best.arma

In this case, the best model turns out to be an AR(2) model

best.order <- c(0, 0, 0)
best.aic <- Inf
for (i in 0:2) for (j in 0:2) {
  fit.aic <- AIC(arima(resid(Elec.lm), order = c(i, 0, j)))
  if (fit.aic < best.aic) {
    best.order <- c(i, 0, j)
    best.arma <- arima(resid(Elec.lm), order = best.order)
    best.aic <- fit.aic
  }
}

Now let’s see the best order

best.order 

Now time for the ACF

acf(resid(best.arma))

The predict function can be used both to forecast future values from the fitted regression model

And to forecast the future errors associated with the regression model, using the ARMA model fitted to the residuals from the regression

These two forecasts can then be summed to give a forecasted value of the logarithm for electricity production

This would need to be anti-logged and perhaps adjusted using a bias correction factor

predict is a generic R function; it works in different ways for different input objects and classes

For a fitted regression model of class lm, the predict function requires the new set of data to be in the form of a data frame

For a fitted ARMA model of class arima, the predict function requires just the number of time steps ahead for the desired forecast

In the latter case, predict produces an object that has both the predicted values and their standard errors

These two can be extracted as the pred and se components of the returned object

In the code below, the electricity production for each month of the next three years is predicted

new.time <- seq(length(Elec.ts), length = 36)
new.data <- data.frame(Time = new.time, Imth = rep(1:12, 3))
predict.lm <- predict(Elec.lm, new.data)
predict.arma <- predict(best.arma, n.ahead = 36)
elec.pred <- ts(exp(predict.lm + predict.arma$pred), start = 1991, freq = 12)
ts.plot(cbind(Elec.ts, elec.pred), lty = 1:2)

The plot of the forecasted values suggests that the predicted values for winter may be underestimated by the fitted model

This may be due to the remaining seasonal autocorrelation in the residuals

This problem will be tackled in ARIMA topic

Wave tank data

The data in the file wave.dat are the surface height of water (mm), relative to the still water level

This is measured using a capacitance probe positioned at the centre of a wave tank

The continuous voltage signal from this capacitance probe was sampled every 0.1 second over a 39.6 second period.

The objective is to fit a suitable ARMA(p,q) model that can be used to generate a realistic wave input to a mathematical model for an ocean-going tugboat in a computer simulation

The results of the computer simulation will be compared with tests using a physical model of the tugboat in the wave tank

The pacf suggests that p should be at least 2

The best fitting ARMA(p,q) model based on a minimum variance of residuals was obtained with both p and q equal to 4. The acf and pacf of the residuals from this model are consistent with the residuals being a realisation of white noise.

www <- "http://www.massey.ac.nz/~pscowper/ts/wave.dat"
wave.dat <- read.table(www, header = T)
attach (wave.dat)
layout(1:3)
plot (as.ts(waveht), ylab = 'Wave height')
acf (waveht)
pacf (waveht)
wave.arma <- arima(waveht, order = c(4,0,4))
acf (wave.arma$res[-(1:4)])
pacf (wave.arma$res[-(1:4)])
hist(wave.arma$res[-(1:4)], xlab='height / mm', main='')