Chapter 1 Toolbox
This section deals with fundamental concepts in time series analysis. Time series analysis typically uses different data structures than the standard data.frame
and tibble
of other analyses. The fable
package provides a framework for piped "tidy" modeling that improves on the conventional lm(formula, data)
interface. The subsequent modeling chapters make use of the benchmarking methods, prediction intervals, and accuracy evaluation methods presented here.
1.1 R Structures
Use a tsibble
object to work with time series data. A tsibble
, from the package of the same name, is a time-series tibble
. Unlike the older, more common ts
, zoo
, and xts
objects, a tsibble
preserves the time index and allows heterogeneous columns alongside it. For example, you can re-index a tsibble
from monthly to yearly analysis, or include one or more features per time element. Since the tsibble
is relatively new, you will encounter the other frameworks and should at least be familiar with them.
Let’s work with the prison_population.csv
file accompanying Hyndman’s text to create time series structures with each framework. The data set is quarterly prison population counts grouped by several features. In essence, it is several time series (State
x Gender
x Legal
x Indigenous
) within one file.
prison <- readr::read_csv("https://OTexts.com/fpp3/extrafiles/prison_population.csv")

head(prison)
## # A tibble: 6 x 6
## Date State Gender Legal Indigenous Count
## <date> <chr> <chr> <chr> <chr> <dbl>
## 1 2005-03-01 ACT Female Remanded ATSI 0
## 2 2005-03-01 ACT Female Remanded Non-ATSI 2
## 3 2005-03-01 ACT Female Sentenced ATSI 0
## 4 2005-03-01 ACT Female Sentenced Non-ATSI 5
## 5 2005-03-01 ACT Male Remanded ATSI 7
## 6 2005-03-01 ACT Male Remanded Non-ATSI 58
dim(prison)
## [1] 3072 6
ts, zoo, and xts
ts is the time series class from base R's stats package. The ts
object is essentially a matrix of observations indexed by a chronological identifier. Because it is a matrix, any descriptive attributes need to enter as numeric, perhaps by one-hot encoding or by pivoting the data (yuck). And since there is only one row per time observation, the descriptive attributes cannot really act as grouping variables. Another limitation of a ts
is that it does not recognize irregularly spaced time series.
Define a ts
object with ts(x, start, frequency)
where frequency
is the number of observations in the seasonal pattern: 7 for daily observations with a weekly cycle; 5 for weekday observations in a weekly cycle; 24 for hourly observations in a daily cycle; 24x7 for hourly observations in a weekly cycle; etc. prison
contains quarterly observations starting with 2005 Q1, so frequency = 4. Had the series started at 2005 Q2, you'd specify start = c(2005, 2)
. I’ll pull a single time series from the file with a filter()
statement.
prison_ts <- prison %>%
  filter(State == "ACT" & Gender == "Male" & Legal == "Remanded" & Indigenous == "ATSI") %>%
  arrange(Date) %>%
  select(Count) %>%
  ts(start = 2005, frequency = 4)
str(prison_ts)
## Time-Series [1:48, 1] from 2005 to 2017: 7 7 9 9 12 9 8 7 6 11 ...
## - attr(*, "dimnames")=List of 2
## ..$ : NULL
## ..$ : chr "Count"
zoo (Zeileis's ordered observations) provides methods similar to those in ts, but also supports irregular time series. A zoo
object contains an array of data values and an index attribute to provide information about the data ordering. zoo was introduced in 2004.
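For comparison, here is a minimal zoo sketch of the same single series used above, with its Date column as the index:

library(zoo)

# Sketch: one prison series as a zoo object, indexed by Date.
z <- prison %>%
  filter(State == "ACT" & Gender == "Male" & Legal == "Remanded" & Indigenous == "ATSI") %>%
  arrange(Date)
prison_zoo <- zoo(z$Count, order.by = z$Date)
str(prison_zoo)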
xts (extensible time series) extends zoo. xts
objects are more flexible than ts
objects while imposing reasonable constraints to make them truly time-based. An xts
object is essentially a matrix of observations indexed by a time object. Create an xts
object with xts(x, order.by)
where order.by
is a vector of dates/times to index the data. You can also add metadata to the xts
object by declaring name-value pairs such as born = as.POSIXct("1899-05-08").
library(xts)
x <- prison %>%
  filter(State == "ACT" & Gender == "Male" & Legal == "Remanded" & Indigenous == "ATSI") %>%
  arrange(Date)

prison_xts <- xts(x$Count, order.by = x$Date,
                  State = "ACT", Gender = "Male", Legal = "Remanded", Indigenous = "ATSI")
str(prison_xts)
## An 'xts' object on 2005-03-01/2016-12-01 containing:
## Data: num [1:48, 1] 7 7 9 9 12 9 8 7 6 11 ...
## Indexed by objects of class: [Date] TZ: UTC
## xts Attributes:
## List of 4
## $ State : chr "ACT"
## $ Gender : chr "Male"
## $ Legal : chr "Remanded"
## $ Indigenous: chr "ATSI"
tsibble
A tsibble
object is a tibble uniquely defined by key
columns plus a date index
column. This structure accommodates multiple series and descriptive attribute columns. The date index can be a Date
, period
, etc. (see tsibble() help file
). Express weekly time series with yearweek()
, monthly time series with yearmonth()
, or quarterly (like here) with yearquarter()
.
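For a quick feel for these helpers, here is a sketch applying each to a character string (one of several input formats tsibble parses):

# Each helper converts its input to a date index of the stated granularity.
yearquarter("2005 Q1")  # 2005 Q1
yearmonth("2005 Mar")   # 2005 Mar
yearweek("2005 W09")    # 2005 W09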
prison_tsibble <- prison %>%
  mutate(Date = yearquarter(Date)) %>%
  rename(Qtr = Date) %>%
  as_tsibble(key = c(State, Gender, Legal, Indigenous), index = Qtr)
head(prison_tsibble)
## # A tsibble: 6 x 6 [1Q]
## # Key: State, Gender, Legal, Indigenous [1]
## Qtr State Gender Legal Indigenous Count
## <qtr> <chr> <chr> <chr> <chr> <dbl>
## 1 2005 Q1 ACT Female Remanded ATSI 0
## 2 2005 Q2 ACT Female Remanded ATSI 1
## 3 2005 Q3 ACT Female Remanded ATSI 0
## 4 2005 Q4 ACT Female Remanded ATSI 0
## 5 2006 Q1 ACT Female Remanded ATSI 1
## 6 2006 Q2 ACT Female Remanded ATSI 1
A tsibble
behaves like a tibble
, so you can use tidyverse verbs. The only thing that will trip you up is that tsibble
objects are always grouped by their index - group_by()
operations only group non-index columns while retaining the index. Use index_by()
if you need to summarize at a new time level (e.g., year).
# Group by State retains the Qtr index column.
prison_tsibble %>%
  group_by(State) %>%
  summarize(sum_Count = sum(Count))
## # A tsibble: 384 x 3 [1Q]
## # Key: State [8]
## State Qtr sum_Count
## <chr> <qtr> <dbl>
## 1 ACT 2005 Q1 178
## 2 ACT 2005 Q2 183
## 3 ACT 2005 Q3 187
## 4 ACT 2005 Q4 204
## 5 ACT 2006 Q1 190
## 6 ACT 2006 Q2 190
## 7 ACT 2006 Q3 165
## 8 ACT 2006 Q4 179
## 9 ACT 2007 Q1 172
## 10 ACT 2007 Q2 160
## # ... with 374 more rows
# But you can change the Qtr index aggregation level with index_by()
prison_tsibble %>%
  index_by(Year = ~ year(.)) %>%
  group_by(State) %>%
  summarize(sum_Count = sum(Count))
## # A tsibble: 96 x 3 [1Y]
## # Key: State [8]
## State Year sum_Count
## <chr> <dbl> <dbl>
## 1 ACT 2005 752
## 2 ACT 2006 724
## 3 ACT 2007 673
## 4 ACT 2008 697
## 5 ACT 2009 668
## 6 ACT 2010 842
## 7 ACT 2011 946
## 8 ACT 2012 1085
## 9 ACT 2013 1146
## 10 ACT 2014 1336
## # ... with 86 more rows
# Add group_by_key() to retain the key
prison_tsibble %>%
  group_by_key() %>%
  index_by(Year = ~ year(.)) %>%
  summarize(sum_Count = sum(Count))
## # A tsibble: 768 x 6 [1Y]
## # Key: State, Gender, Legal, Indigenous [64]
## # Groups: State, Gender, Legal [32]
## State Gender Legal Indigenous Year sum_Count
## <chr> <chr> <chr> <chr> <dbl> <dbl>
## 1 ACT Female Remanded ATSI 2005 1
## 2 ACT Female Remanded ATSI 2006 3
## 3 ACT Female Remanded ATSI 2007 3
## 4 ACT Female Remanded ATSI 2008 9
## 5 ACT Female Remanded ATSI 2009 6
## 6 ACT Female Remanded ATSI 2010 5
## 7 ACT Female Remanded ATSI 2011 6
## 8 ACT Female Remanded ATSI 2012 2
## 9 ACT Female Remanded ATSI 2013 4
## 10 ACT Female Remanded ATSI 2014 4
## # ... with 758 more rows
# If you don't care about the time index, convert the tsibble back to a tibble.
prison_tsibble %>%
  as_tibble() %>%
  group_by(State) %>%
  summarize(sum_Count = sum(Count))
## # A tibble: 8 x 2
## State sum_Count
## <chr> <dbl>
## 1 ACT 12003
## 2 NSW 491924
## 3 NT 58193
## 4 QLD 290794
## 5 SA 101606
## 6 TAS 24367
## 7 VIC 234734
## 8 WA 220256
1.2 Fitting Models
Consider whether you are fitting a model for explanatory variable inference or for predictive purposes. If explanation is your goal, your workflow will be fitting a model, verifying the model assumptions related to inference, then summarizing the model parameters. If prediction is your goal, your workflow will be comparing multiple models by cross-validating the results against a hold-out data set, then making predictions.
Fit a model using fabletools::model()
. You can even fit multiple models at once. Let's fit a simple model using the GAFA stock prices data set in tsibbledata. gafa_stock contains daily stock prices from 2014-2018 for several companies. We'll work with Google. The data is indexed by date, but we'll re-index it to "trading day" since the trading days are irregularly spaced.
goog <- tsibbledata::gafa_stock %>%
  filter(Symbol == "GOOG", year(Date) >= 2015) %>%
  # re-index on trading day since markets are not open on weekends, holidays
  arrange(Date) %>%
  mutate(trading_day = row_number()) %>%
  update_tsibble(index = trading_day, regular = TRUE)
head(goog)
## # A tsibble: 6 x 9 [1]
## # Key: Symbol [1]
## Symbol Date Open High Low Close Adj_Close Volume trading_day
## <chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 GOOG 2015-01-02 526. 528. 521. 522. 522. 1447600 1
## 2 GOOG 2015-01-05 520. 521. 510. 511. 511. 2059800 2
## 3 GOOG 2015-01-06 512. 513. 498. 499. 499. 2899900 3
## 4 GOOG 2015-01-07 504. 504. 497. 498. 498. 2065100 4
## 5 GOOG 2015-01-08 495. 501. 488. 500. 500. 3353600 5
## 6 GOOG 2015-01-09 502. 502. 492. 493. 493. 2069400 6
dim(goog)
## [1] 1006 9
Let’s fit a naive model (projection of last value) to Google’s Close
from 2015, then predict values for Jan 2016. We'll create an 80:20 train-test split (the test
set should be ~20% of the data, or at least as long as the anticipated forecast horizon).
# Segment with `filter()`, or `group_by()` + `slice()`.
goog_train <- goog %>% filter(year(Date) == 2015)
goog_test <- goog %>% filter(yearmonth(Date) == yearmonth("2016 Jan"))

# Train model
goog_mdl <- goog_train %>%
  model(mdl_naive = NAIVE(Close))

# Generate predictions (forecast)
goog_fc <- goog_mdl %>%
  forecast(new_data = goog_test)
The autoplot()
and autolayer()
functions take a lot of the headache out of plotting the results, especially since forecast()
tucks away the prediction intervals in a distribution
column (a list of distribution objects).
# Consider using autoplot() + autolayer()...
# goog_fc %>%
# autoplot(color = "goldenrod") +
# autolayer(goog_train, Close, color = "goldenrod")
goog_fc_2 <- goog_fc %>%
  mutate(mu = map_dbl(Close, ~pluck(.x, "mu")),
         sigma = map_dbl(Close, ~pluck(.x, "sigma")),
         ci_025 = qnorm(.025, mu, sigma),
         ci_100 = qnorm(.100, mu, sigma),
         ci_900 = qnorm(.900, mu, sigma),
         ci_975 = qnorm(.975, mu, sigma)) %>%
  select(trading_day, Date, Close, mu, sigma, ci_025:ci_975)
bind_rows(goog_train, goog_test) %>%
ggplot(aes(x = Date)) +
geom_line(aes(y = Close), color = "goldenrod") +
geom_line(data = goog_fc_2, aes(y = mu), color = "goldenrod", size = 1) +
geom_ribbon(data = goog_fc_2,
aes(ymin = ci_025, ymax = ci_975),
fill = "goldenrod", alpha = .2) +
geom_ribbon(data = goog_fc_2,
aes(ymin = ci_100, ymax = ci_900),
fill = "goldenrod", alpha = .2) +
theme_light() +
theme(legend.position = "none") +
labs(title = "20d naive forecast from model fit to CY-2015",
caption = "Shaded area is 80%- and 95% confidence interval.",
x = "Trading Day", y = "Closing Price")
1.3 Evaluating Models
Evaluate the model fit with residuals diagnostics. broom::augment() adds three columns to the model's data: .fitted, .resid, and .innov. .innov is the residual from the transformed data (if there is no transformation, it simply equals .resid).
goog_mdl_aug <- goog_mdl %>% broom::augment()
Innovation residuals should be independent random variables normally distributed with mean zero and constant variance (the normality and variance conditions are only required for inference and prediction intervals). Happily, feasts
has just what you need.
goog_mdl %>%
  gg_tsresiduals() +
  labs(title = "Residuals Analysis")
The autocorrelation plot above supports the independence assumption. The histogram checks normality (it is pretty normal, but the right tail is long). The residuals plot checks for mean zero and constant variance. You can carry out a portmanteau test on the autocorrelation assumption. Two common tests are the Box-Pierce and the Ljung-Box. These tests check the likelihood of a combination of autocorrelations at once, without testing any one correlation - kind of like an ANOVA test. The Ljung-Box test statistic is a weighted sum of squared \(k\)-lagged autocorrelations, \(r_k^2\),
\[Q^* = T(T+2) \sum_{k=1}^l(T-k)^{-1}r_k^2.\]
The test statistic has a \(\chi^2\) distribution with \(l - K\) degrees of freedom (where \(K\) is the number of parameters in the model). Use \(l = 10\) for non-seasonal data and \(l = 2m\) for seasonal data. If your model has no explanatory variables, \(K = 0.\)
goog_mdl_aug %>% features(.var = .innov, features = ljung_box, lag = 10, dof = 0)
## # A tibble: 1 x 4
## Symbol .model lb_stat lb_pvalue
## <chr> <chr> <dbl> <dbl>
## 1 GOOG mdl_naive 7.91 0.637
The p-value is above .05, so do not reject the null hypothesis of no autocorrelation - i.e., the residuals are consistent with white noise.
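The Box-Pierce statistic, \(Q = T \sum_{k=1}^l r_k^2\), is the simpler, unweighted variant; feasts supplies it with the same interface:

# Box-Pierce portmanteau test on the same innovation residuals.
goog_mdl_aug %>% features(.var = .innov, features = box_pierce, lag = 10, dof = 0)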
1.4 Evaluating Accuracy
Some forecasting methods are extremely simple and surprisingly effective. The mean method projects the historical average, \(\hat{y}_{T+h|T} = \bar{y}.\) The naive method projects the last observation, \(\hat{y}_{T+h|T} = y_T.\) The seasonal naive method projects the last observation from the same season, \(\hat{y}_{T+h|T} = y_{T+h-m(k+1)}.\) The drift method projects the straight line through the first and last observations, \(\hat{y}_{T+h|T} = y_T + h\left(\frac{y_T - y_1}{T-1}\right).\) The plot below of tsibbledata::aus_production shows the four forecast benchmarks.
tsibbledata::aus_production %>%
  # same thing as
  # filter(Quarter >= yearquarter("1995 Q1") & Quarter <= yearquarter("2007 Q4")) %>%
  filter_index("1995 Q1" ~ "2007 Q4") %>%
model(Mean = MEAN(Beer),
Naive = NAIVE(Beer),
SNaive = SNAIVE(Beer),
Drift = RW(Beer ~ drift())) %>%
forecast(h = 8) %>%
ggplot(aes(x = Quarter)) +
geom_line(aes(y = .mean, color = .model), size = 1) +
geom_line(data = tsibbledata::aus_production %>% filter_index("1995 Q1" ~ "2009 Q4"),
aes(y = Beer),
color = "darkgrey", size = 1) +
theme_light() +
guides(color = guide_legend(title = "Forecast")) +
labs(title = "Simple forecast methods are useful benchmarks.",
x = NULL, y = NULL,
caption = "Source: Quarterly beer production (ML) from tsibbledata::aus_production.")
Evaluate the forecast accuracy with the test data (aka the "hold-out set" or "out-of-sample data"). The forecast error is the difference between the observed and forecast value, \(e_{T+h} = y_{T+h} - \hat{y}_{T+h|T}.\) Forecast errors differ from model residuals in two ways: they are computed on the test data rather than the training data, and forecast values are (usually) multi-step forecasts which include prior forecast values as inputs.
There are a few benchmark metrics to evaluate a fit based on the errors.
- MAE. Mean absolute error, \(mean(|e_t|)\)
- RMSE. Root mean squared error, \(\sqrt{mean(e_t^2)}\)
- MAPE. Mean absolute percentage error, \(mean(|e_t / y_t|) \times 100\)
- MASE. Mean absolute scaled error, \(MAE/Q\), where \(Q\) is a scaling constant: the in-sample MAE of a one-step naive forecast, i.e., the average absolute one-period change in the outcome variable.
The MAE and RMSE are on the same scale as the data, so they are only useful for comparing models fitted to the same series. MAPE is unitless, but does not work for \(y_t = 0\), and it assumes a meaningful zero (ratio data). The MASE is most useful for comparing data sets of different units.
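A hand-rolled sketch makes the definitions concrete; y, yhat, and y_train here are hypothetical vectors of test-set actuals, forecast values, and training-set actuals:

# Forecast errors on the test set.
e <- y - yhat
MAE  <- mean(abs(e))
RMSE <- sqrt(mean(e^2))
MAPE <- mean(abs(e / y)) * 100   # undefined when any y equals zero
# Q is the in-sample MAE of a one-step naive forecast.
Q    <- mean(abs(diff(y_train)))
MASE <- MAE / Q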
Use accuracy()
to evaluate a model. Comparing the naive, drift, and mean methods for forecasting the Google stock price, the naive model wins on all measures.
goog_mdl <- goog_train %>%
  model(Naive = NAIVE(Close),
        Drift = RW(Close ~ drift()),
        Mean = MEAN(Close))

goog_fc <- goog_mdl %>%
  forecast(new_data = goog_test)
ggplot() +
geom_line(data = goog_fc, aes(x = trading_day, y = .mean, color = .model)) +
geom_line(data = bind_rows(goog_train, goog_test), aes(x = trading_day, y = Close)) +
theme_light() +
scale_y_continuous(labels = scales::dollar_format()) +
labs(title = "Comparison of three models.",
subtitle = "Google stock price predicted over 19 days.",
x = "Trading Day", y = "Closing Price", color = "Forecast")
# Or just this.
# autoplot(goog_fc, bind_rows(goog_train, goog_test), level = NULL)
accuracy(goog_fc, goog) %>%
select(.model, RMSE, MAE, MAPE, MASE)
## # A tibble: 3 x 5
## .model RMSE MAE MAPE MASE
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Drift 53.1 49.8 6.99 6.99
## 2 Mean 118. 117. 16.2 16.4
## 3 Naive 43.4 40.4 5.67 5.67
There is a better way of evaluating a model than with a single test set. Time series cross-validation breaks the dataset into multiple training sets by moving the cutoff point forward through the series, then testing each training set against the observations that follow it. Function stretch_tsibble() creates the stacked training sets: the first contains the first .init observations, and each subsequent set grows by .step observations.
goog_train_cv <- goog %>%
  filter(year(Date) == 2015) %>%
  stretch_tsibble(.init = 3, .step = 1) %>%
  # move .id next to the other key col
  relocate(Date, Symbol, .id)

# 250 keys!
goog_train_cv
## # A tsibble: 31,875 x 10 [1]
## # Key: Symbol, .id [250]
## Date Symbol .id Open High Low Close Adj_Close Volume trading_day
## <date> <chr> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int>
## 1 2015-01-02 GOOG 1 526. 528. 521. 522. 522. 1447600 1
## 2 2015-01-05 GOOG 1 520. 521. 510. 511. 511. 2059800 2
## 3 2015-01-06 GOOG 1 512. 513. 498. 499. 499. 2899900 3
## 4 2015-01-02 GOOG 2 526. 528. 521. 522. 522. 1447600 1
## 5 2015-01-05 GOOG 2 520. 521. 510. 511. 511. 2059800 2
## 6 2015-01-06 GOOG 2 512. 513. 498. 499. 499. 2899900 3
## 7 2015-01-07 GOOG 2 504. 504. 497. 498. 498. 2065100 4
## 8 2015-01-02 GOOG 3 526. 528. 521. 522. 522. 1447600 1
## 9 2015-01-05 GOOG 3 520. 521. 510. 511. 511. 2059800 2
## 10 2015-01-06 GOOG 3 512. 513. 498. 499. 499. 2899900 3
## # ... with 31,865 more rows
goog_fc <- goog_train_cv %>%
  # Fit a model for each key
  model(RW(Close ~ drift())) %>%
  # 8 forecast rows per model: 250 * 8 = 2,000 rows
  forecast(h = 8) %>%
  # Capture the forecast period for comparison
  group_by(.id) %>%
  mutate(h = row_number()) %>%
  ungroup()
goog_fc %>%
  accuracy(goog %>% filter(year(Date) == 2015), by = c("h", ".model")) %>%
  select(h, RMSE)
## # A tibble: 8 x 2
## h RMSE
## <int> <dbl>
## 1 1 11.3
## 2 2 16.9
## 3 3 20.7
## 4 4 23.7
## 5 5 26.4
## 6 6 28.7
## 7 7 31.2
## 8 8 33.3
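Accuracy declines steadily as the horizon lengthens. A quick sketch of a plot (reusing goog_fc from above) makes the pattern plain:

goog_fc %>%
  accuracy(goog %>% filter(year(Date) == 2015), by = c("h", ".model")) %>%
  ggplot(aes(x = h, y = RMSE)) +
  geom_point() +
  geom_line() +
  theme_light() +
  labs(title = "Forecast error grows with the horizon.",
       x = "Horizon (h)", y = "RMSE")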