4.1 Non-Business-Day Gaps
4.1.1 Different Functions for Autoregressive Models
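# Fit the same AR(1) model to the log price with six different functions.
mods_fulton_test <- list()
# 1. OLS with the intercept estimated directly.
mods_fulton_test[[1]] <- tsi_fulton %>%
  {.$p.log} %>%
  ar.ols(aic = FALSE, order.max = 1, demean = FALSE, intercept = TRUE)
# 2. Maximum likelihood with `ar.mle`.
# `ar.mle` has no `intercept` argument, so it cannot estimate the intercept
# directly; it estimates the mean instead.
mods_fulton_test[[2]] <- tsi_fulton %>%
  {.$p.log} %>%
  ar.mle(aic = FALSE, order.max = 1, demean = TRUE)
# 3. Maximum likelihood with `arima`.
# `arima` cannot estimate the intercept directly either: with
# `include.mean = TRUE`, the coefficient it labels `intercept` is the mean.
mods_fulton_test[[3]] <- tsi_fulton %>%
  {.$p.log} %>%
  arima(order = c(1, 0, 0), include.mean = TRUE, method = "ML")
# 4. `gets::arx` with an intercept (`mc = TRUE`) and one AR lag.
mods_fulton_test[[4]] <- tsi_fulton %>%
  {.$p.log} %>%
  gets::arx(mc = TRUE, ar = 1, mxreg = NULL, qstat.options = c(1, 1))
# 5. `fable::ARIMA` with a constant.
mods_fulton_test[[5]] <- tsi_fulton %>%
  model(fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)))
# 6. Conditional sum of squares with `arima`.
mods_fulton_test[[6]] <- tsi_fulton %>%
  {.$p.log} %>%
  arima(order = c(1, 0, 0), include.mean = TRUE, method = "CSS")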
hwi_mods | intercept | ar1 |
---|---|---|
1 | -0.03854 | 0.7628 |
2 | -0.04351 | 0.7580 |
3 | -0.04351 | 0.7580 |
4 | -0.03854 | 0.7628 |
5 | -0.04351 | 0.7580 |
6 | -0.03855 | 0.7628 |
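Because `ar.mle` and `arima` estimate the mean rather than the intercept, an intercept comparable to the one from `ar.ols` has to be derived. For an AR(1) model with mean μ and coefficient φ, the intercept is c = μ(1 − φ). The following is a minimal sketch of that conversion on simulated data. The series, the seed, and the parameter values are illustrative assumptions, not the Fulton data, and this is not necessarily how the table above was produced.

```r
# Simulated AR(1) series with mean 0.5 and coefficient 0.7 (illustrative values).
set.seed(123)
x <- 0.5 + as.numeric(arima.sim(model = list(ar = 0.7), n = 500))

# `arima`: the coefficient named "intercept" is actually the mean.
fit_ml <- arima(x, order = c(1, 0, 0), include.mean = TRUE, method = "ML")
phi <- unname(coef(fit_ml)["ar1"])
mu <- unname(coef(fit_ml)["intercept"])
intercept_ml <- mu * (1 - phi)

# `ar.mle`: the estimated mean is stored in `x.mean`.
fit_mle <- ar.mle(x, aic = FALSE, order.max = 1, demean = TRUE)
intercept_mle <- fit_mle$x.mean * (1 - fit_mle$ar[1])

# `ar.ols`: the intercept is estimated directly and stored in `x.intercept`.
fit_ols <- ar.ols(x, aic = FALSE, order.max = 1, demean = FALSE, intercept = TRUE)
intercept_ols <- as.numeric(fit_ols$x.intercept)

# Compare the three intercepts side by side.
round(c(arima = intercept_ml, ar.mle = intercept_mle, ar.ols = intercept_ols), 4)
```

The three values should be close but not identical, since the estimation methods differ.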
4.1.2 Fill Non-Business-Day Gaps with NA
tsi_fulton_t <- read_csv("~/GitHub/TidySimStat/data/Fulton.csv") %>%
# mutate(p = exp(LogPrice)) %>%
mutate(t = ymd(Date)) %>%
mutate(p.log = LogPrice) %>%
mutate(i = row_number()) %>%
select(t, i, p.log) %>%
as_tsibble(index = t)
When `t` is used as the index of the tsibble, non-business-day (NBD) gaps are detected. `fill_gaps()` can be used to fill the NBD gaps with `NA` values. See "Handle implicit missingness with tsibble" for details.
t | i | p.log |
---|---|---|
1991-12-02 | 1 | -0.43078 |
1991-12-03 | 2 | 0.00000 |
1991-12-04 | 3 | 0.07232 |
1991-12-05 | 4 | 0.24714 |
1991-12-06 | 5 | 0.66433 |
1991-12-07 | NA | NA |
1991-12-08 | NA | NA |
1991-12-09 | 6 | -0.20651 |
1991-12-10 | 7 | -0.11583 |
1991-12-11 | 8 | -0.25987 |
1991-12-12 | 9 | -0.11713 |
1991-12-13 | 10 | -0.34208 |
1991-12-14 | NA | NA |
1991-12-15 | NA | NA |
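The gap filling shown in the table above can be sketched on three of its rows. This is a minimal example that assumes the `tsibble` package is installed; the object names `toy` and `toy_filled` are made up for illustration.

```r
library(tsibble)

# Three rows from the table above: a Thursday, a Friday, and the next Monday.
toy <- as_tsibble(
  data.frame(
    t = as.Date(c("1991-12-05", "1991-12-06", "1991-12-09")),
    i = 4:6,
    p.log = c(0.24714, 0.66433, -0.20651)
  ),
  index = t
)

# The weekend is an implicit gap in the daily index.
has_gaps(toy)

# `fill_gaps()` makes the gap explicit: Saturday and Sunday become rows of NA.
toy_filled <- fill_gaps(toy)
toy_filled
```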
If `fill_gaps()` is not used, applying `fable::ARIMA` raises an error stating that the data contains implicit gaps in time and that you should check your data and convert implicit gaps into explicit missing values. With the gaps filled, the model can be estimated:
.model | term | estimate | std.error | statistic | p.value |
---|---|---|---|---|---|
fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | ar1 | 0.80098 | 0.05130 | 15.614 | 8.903e-30 |
fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | constant | -0.03826 | 0.01836 | -2.083 | 3.951e-02 |
However, because of the `NA` values, the above result differs from the one obtained in Section 4.1.1, which is reproduced here:
.model | term | estimate | std.error | statistic | p.value |
---|---|---|---|---|---|
fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | ar1 | 0.75805 | 0.06287 | 12.058 | 6.601e-22 |
fable::ARIMA(p.log ~ 1 + pdq(1, 0, 0)) | constant | -0.04351 | 0.02324 | -1.872 | 6.379e-02 |
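The difference between the two tables above can be reproduced in outline on simulated data. The sketch below assumes the `dplyr`, `tsibble`, and `fable` packages are installed. The simulated series, the seed, and the helper `fit_ar1()` are hypothetical, and `PDQ(0, 0, 0)` is added as an assumption so that no seasonal terms are selected for the daily index.

```r
library(dplyr)
library(tsibble)
library(fable)

# Simulate an AR(1) series observed on business days only (illustrative values).
set.seed(42)
all_days <- seq(as.Date("1991-12-02"), by = "day", length.out = 140)
biz_days <- all_days[!(as.POSIXlt(all_days)$wday %in% c(0, 6))]  # drop Sun/Sat
sim <- tibble(
  t = biz_days,
  i = seq_along(biz_days),
  p.log = as.numeric(arima.sim(model = list(ar = 0.75), n = length(biz_days)))
)

# Hypothetical helper: fit an AR(1) with a constant and return the coefficients.
fit_ar1 <- function(data) {
  data %>%
    model(ar1 = ARIMA(p.log ~ 1 + pdq(1, 0, 0) + PDQ(0, 0, 0))) %>%
    tidy()
}

# Calendar index with weekends filled as NA.
sim_filled <- sim %>% as_tsibble(index = t) %>% fill_gaps()
est_filled <- fit_ar1(sim_filled)

# Gap-free sequential index, where Friday is directly followed by Monday.
est_seq <- sim %>% as_tsibble(index = i) %>% fit_ar1()

est_filled
est_seq
```

In the filled series, Monday is three steps after Friday, whereas in the sequential index it is one step after, so the two sets of estimates generally differ.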
Whether to fill the NBD gaps depends on the context; see "Forecasting time series with data on weekdays only" on CrossValidated. More advanced calendars may also be used; see "Calendarise self-defined date-times (e.g. business days and time) and respect structural missingness".
4.1.3 How to Visualize without NBD Gaps Filled
If we do not want to fill the NBD gaps, a new index must be added, as in Section 4.1.1. With such an index, it is hard to use `autoplot()` with `t` on the x axis, so `ggplot()` must be used instead.
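A minimal sketch of such a plot, assuming the `tsibble` and `ggplot2` packages are installed. It uses the first ten observations from the table in Section 4.1.2; the object names `fulton_head` and `p` are made up for illustration.

```r
library(tsibble)
library(ggplot2)

# First ten business-day observations, indexed by the gap-free counter `i`.
fulton_head <- as_tsibble(
  data.frame(
    t = as.Date(c("1991-12-02", "1991-12-03", "1991-12-04", "1991-12-05",
                  "1991-12-06", "1991-12-09", "1991-12-10", "1991-12-11",
                  "1991-12-12", "1991-12-13")),
    i = 1:10,
    p.log = c(-0.43078, 0.00000, 0.07232, 0.24714, 0.66433,
              -0.20651, -0.11583, -0.25987, -0.11713, -0.34208)
  ),
  index = i
)

# Map the date column `t` to the x axis explicitly; `geom_line()` connects
# Friday to Monday directly because no NA rows were inserted.
p <- ggplot(fulton_head, aes(x = t, y = p.log)) +
  geom_line() +
  labs(x = "t", y = "p.log")
p
```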