4.1 Non-Business-Day Gaps
4.1.1 Different Functions for Autoregressive Models
mods_fulton_test <- list()
mods_fulton_test[[1]] <- tsi_fulton %>%
{.$p.log} %>%
ar.ols(aic = FALSE, order.max = 1, demean = F, intercept = T)
mods_fulton_test[[2]] <- tsi_fulton %>%
{.$p.log} %>%
ar.mle(aic = FALSE, order.max = 1, demean = T, intercept = F)
# There is no way for `ar.mle` to estimate the intercept directly.
mods_fulton_test[[3]] <- tsi_fulton %>%
{.$p.log} %>%
arima(order = c(1, 0, 0), include.mean = T, method = "ML")
# `arima` cannot estimate the intercept directly: the coefficient it labels "intercept" is the process mean.
mods_fulton_test[[4]] <- tsi_fulton %>%
{.$p.log} %>%
gets::arx(mc = T, ar = 1, mxreg = NULL, qstat.options = c(1,1))
mods_fulton_test[[5]] <- tsi_fulton %>%
model(fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)))
mods_fulton_test[[6]] <- tsi_fulton %>%
{.$p.log} %>%
arima(order = c(1, 0, 0), include.mean = T, method = "CSS")

| hwi_mods | intercept | ar1 |
|---|---|---|
| 1 | -0.03854 | 0.7628 |
| 2 | -0.04351 | 0.7580 |
| 3 | -0.04351 | 0.7580 |
| 4 | -0.03854 | 0.7628 |
| 5 | -0.04351 | 0.7580 |
| 6 | -0.03855 | 0.7628 |
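
The comments above note that arima() does not estimate the intercept directly. The following is a minimal sketch, on simulated data rather than the Fulton series, of the AR(1) identity intercept = mean × (1 − ar1), which turns the mean reported by arima() into a value comparable with the intercept from ar.ols(). The seed, sample size, and parameter values are assumptions chosen only for illustration.

```r
# Simulated AR(1) series; all numbers here are illustrative assumptions.
set.seed(1)
phi   <- 0.75    # assumed AR(1) coefficient
const <- -0.04   # assumed intercept, so the process mean is const / (1 - phi)
y <- as.numeric(arima.sim(list(ar = phi), n = 500)) + const / (1 - phi)

# OLS regression of y_t on y_{t-1} with an explicit intercept.
fit_ols <- ar.ols(y, aic = FALSE, order.max = 1, demean = FALSE, intercept = TRUE)
intercept_ols <- as.numeric(fit_ols$x.intercept)
ar1_ols       <- as.numeric(fit_ols$ar)

# arima() reports the process mean under the label "intercept".
fit_ml  <- arima(y, order = c(1, 0, 0), include.mean = TRUE, method = "ML")
mean_ml <- unname(coef(fit_ml)["intercept"])
ar1_ml  <- unname(coef(fit_ml)["ar1"])

# Convert the mean into an intercept: c = mu * (1 - phi).
intercept_ml <- mean_ml * (1 - ar1_ml)

print(c(ols = intercept_ols, ml = intercept_ml))
print(c(ols = ar1_ols, ml = ar1_ml))
```

After the conversion, the remaining gap between the two intercepts comes from the estimation method (for example, how the first observation is treated), not from a difference in parameterisation.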
4.1.2 Fill Non-Business-Day Gaps with NA
tsi_fulton_t <- read_csv("~/GitHub/TidySimStat/data/Fulton.csv") %>%
# mutate(p = exp(LogPrice)) %>%
mutate(t = ymd(Date)) %>%
mutate(p.log = LogPrice) %>%
mutate(i = row_number()) %>%
select(t, i, p.log) %>%
as_tsibble(index = t)

When t is used as the index of the tsibble, the non-business-day (NBD) gaps are detected as implicit missing values. fill_gaps() can be used to turn them into explicit rows filled with NA. See the tsibble vignette "Handle implicit missingness with tsibble" for details.
| t | i | p.log |
|---|---|---|
| 1991-12-02 | 1 | -0.43078 |
| 1991-12-03 | 2 | 0.00000 |
| 1991-12-04 | 3 | 0.07232 |
| 1991-12-05 | 4 | 0.24714 |
| 1991-12-06 | 5 | 0.66433 |
| 1991-12-07 | NA | NA |
| 1991-12-08 | NA | NA |
| 1991-12-09 | 6 | -0.20651 |
| 1991-12-10 | 7 | -0.11583 |
| 1991-12-11 | 8 | -0.25987 |
| 1991-12-12 | 9 | -0.11713 |
| 1991-12-13 | 10 | -0.34208 |
| 1991-12-14 | NA | NA |
| 1991-12-15 | NA | NA |
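
A minimal sketch of how the gaps can be inspected and filled, using only the first ten observations from the table above and typed in by hand (the object names daily and filled are hypothetical). It assumes the tsibble package is installed.

```r
library(tsibble)

# First ten business days of the series, copied from the table above.
daily <- tsibble(
  t = as.Date("1991-12-02") + c(0:4, 7:11),
  p.log = c(-0.43078, 0.00000, 0.07232, 0.24714, 0.66433,
            -0.20651, -0.11583, -0.25987, -0.11713, -0.34208),
  index = t
)

# Report whether implicit gaps exist and where they are.
print(has_gaps(daily))
print(count_gaps(daily))

# Turn implicit gaps into explicit rows with NA in the measured column.
# By default only gaps inside the observed date range are filled.
filled <- fill_gaps(daily)
print(filled)
```

Here the only interior gap is the weekend of 7 and 8 December, so filled has two more rows than daily.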
If fill_gaps() is not used, fable::ARIMA fails with an error stating that the data contains implicit gaps in time and that you should check your data and convert implicit gaps into explicit missing values. With the gaps filled, the estimates are:
| .model | term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|---|
| fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | ar1 | 0.80098 | 0.05130 | 15.614 | 8.903e-30 |
| fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | constant | -0.03826 | 0.01836 | -2.083 | 3.951e-02 |
However, because of the NA values, the above result differs from the one obtained in Section 4.1.1, which is reproduced here:
| .model | term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|---|
| fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | ar1 | 0.75805 | 0.06287 | 12.058 | 6.601e-22 |
| fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | constant | -0.04351 | 0.02324 | -1.872 | 6.379e-02 |
Whether to fill the NBD gaps depends on the context; see "Forecasting time series with data on weekdays only" on CrossValidated. More advanced calendars may also be used; see "Calendarise self-defined date-times (e.g. business days and time) and respect structural missingness".
4.1.3 How to Visualize without NBD Gaps Filled
If we do not want to fill the NBD gaps, a new index must be added, like the one used in Section 4.1.1. With such an index it is hard to use autoplot() with t on the x axis, so ggplot() must be used instead.
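
One possible way to do this is sketched below, assuming the ggplot2 package is installed and using a plain data frame built from the first ten observations shown in Section 4.1.2 (the names df and p are hypothetical). The row index i positions the points so that weekends take up no space, while the dates in t are used only as axis labels.

```r
library(ggplot2)

# First ten business days, copied from the table in Section 4.1.2.
df <- data.frame(
  t = as.Date("1991-12-02") + c(0:4, 7:11),
  p.log = c(-0.43078, 0.00000, 0.07232, 0.24714, 0.66433,
            -0.20651, -0.11583, -0.25987, -0.11713, -0.34208)
)
df$i <- seq_len(nrow(df))  # gap-free index, one step per business day

# Plot against i, but label the ticks with the calendar dates.
p <- ggplot(df, aes(x = i, y = p.log)) +
  geom_line() +
  scale_x_continuous(breaks = df$i, labels = format(df$t, "%m-%d")) +
  labs(x = "t", y = "p.log")

print(p)
```

Because the x axis is driven by i, Friday and the following Monday are adjacent on the plot, which matches how the models in Section 4.1.1 treat the series.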