Chapter 12 Transform a data frame to a tsibble object

Readings: FPP3, Section 2.1

Convert the data frame into a time series tsibble object.

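# install.packages("tsibble")
library(dplyr)   # provides %>% and select()
library(tsibble) # Reference: https://tsibble.tidyverts.org/articles/intro-tsibble.html

# Read the raw data, keep the date column, and rename value to sales_GWh
esales <- arrow::read_feather("data/esales.feather")
esales_tbl <- esales %>%
  dplyr::select(date, sales_GWh = value)

# Declare date as the time index; a single series needs no key
elsales_tbl_ts <- esales_tbl %>% as_tsibble(index = date)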

print(elsales_tbl_ts)
## # A tsibble: 233 x 2 [1D]
##    date       sales_GWh
##    <date>         <dbl>
##  1 2001-01-01     9576.
##  2 2001-02-01     7820.
##  3 2001-03-01     8070.
##  4 2001-04-01     7153.
##  5 2001-05-01     7224.
##  6 2001-06-01     8264.
##  7 2001-07-01     8896.
##  8 2001-08-01     9404.
##  9 2001-09-01     7753.
## 10 2001-10-01     7272.
## # … with 223 more rows

12.1 Time indexing

See R4DS ch. 16.

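Depending on how dates and times are recorded in your raw data, you may need more or less work to put them into a form suitable for a tsibble index variable. For example, the monthly data above are indexed by Date values, which tsibble reads as a daily interval ([1D]); converting the index with tsibble::yearmonth(), as done below, yields a monthly interval ([1M]).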

The lubridate and hms packages may be valuable here: lubridate parses and manipulates dates and date-times, and hms stores times of day.
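As a rough illustration of that preparation step, the sketch below parses date strings with lubridate and collapses them to a monthly or quarterly index. The input strings and variable names are made up for the example; they are not taken from the esales data.

```r
library(lubridate)
library(tsibble)

# Hypothetical raw date strings, as they might arrive from two different sources
raw_us  <- c("01/15/2001", "02/15/2001", "03/15/2001")  # month/day/year
raw_iso <- c("2001-01-15", "2001-02-15", "2001-03-15")  # year-month-day

# Pick the lubridate parser whose name matches the order of the components
dates_us  <- mdy(raw_us)
dates_iso <- ymd(raw_iso)

# Both parsers should produce the same Date values
all(dates_us == dates_iso)

# Collapse each date to the period it represents before using it as an index
month_index   <- yearmonth(dates_us)    # for monthly data
quarter_index <- yearquarter(dates_us)  # for quarterly data
```

Choosing the index class that matches the sampling frequency is what lets tsibble infer the intended interval.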

# install.packages("feasts"), Reference: https://feasts.tidyverts.org/
library(feasts)

# Re-index by year-month so the interval is monthly ([1M]) rather than daily ([1D])
vaelsales_tbl_ts <- elsales_tbl_ts %>%
  dplyr::mutate(Month = tsibble::yearmonth(date)) %>%
  as_tsibble(index = Month) %>%
  dplyr::select(Month, sales_GWh)

print(vaelsales_tbl_ts)
## # A tsibble: 233 x 2 [1M]
##       Month sales_GWh
##       <mth>     <dbl>
##  1 2001 Jan     9576.
##  2 2001 Feb     7820.
##  3 2001 Mar     8070.
##  4 2001 Apr     7153.
##  5 2001 May     7224.
##  6 2001 Jun     8264.
##  7 2001 Jul     8896.
##  8 2001 Aug     9404.
##  9 2001 Sep     7753.
## 10 2001 Oct     7272.
## # … with 223 more rows
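The difference between the two printouts ([1D] versus [1M]) has practical consequences. The following minimal sketch uses a small made-up monthly series (not the esales data) to show that a Date index makes monthly data look like daily data with gaps, while a yearmonth index does not.

```r
library(tsibble)

# Hypothetical monthly series recorded on the first day of each month
toy <- tibble::tibble(
  date  = seq(as.Date("2001-01-01"), by = "month", length.out = 6),
  value = c(10, 12, 11, 13, 14, 12)
)

# Indexed by Date: tsibble infers a daily interval
daily_ts <- as_tsibble(toy, index = date)

# Indexed by yearmonth: tsibble infers a monthly interval
toy_monthly <- tibble::tibble(Month = yearmonth(toy$date), value = toy$value)
monthly_ts  <- as_tsibble(toy_monthly, index = Month)

# has_gaps() returns a tibble with a logical .gaps column
daily_has_gaps   <- has_gaps(daily_ts)$.gaps
monthly_has_gaps <- has_gaps(monthly_ts)$.gaps

daily_has_gaps    # the days between month starts count as implicit gaps
monthly_has_gaps  # consecutive months leave no gaps
```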

12.2 Running diagnostics on your tsibble

Ideally, a tsibble should have exactly one row (i.e., one vector of measured values) for each time interval (index) and each value of the key variables.

– It may not have any duplicates.
– It may have missing values.

12.2.1 Duplicate values
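
One possible way to check for duplicates before calling as_tsibble() is with the tsibble helpers is_duplicated(), are_duplicated(), and duplicates(). The sketch below uses a made-up two-region data set; the column names region, Month, and sales are assumptions for illustration, and keeping the first of each duplicated pair is only one of several reasonable policies.

```r
library(tsibble)

# Hypothetical data in which region "A" has two rows for February 2001
raw <- tibble::tibble(
  region = c("A", "A", "A", "B", "B"),
  Month  = yearmonth(as.Date(c("2001-01-01", "2001-02-01", "2001-02-01",
                               "2001-01-01", "2001-02-01"))),
  sales  = c(1, 2, 2.5, 3, 4)
)

# TRUE if any (key, index) pair occurs more than once
any_dup <- is_duplicated(raw, key = region, index = Month)

# List every row involved in a duplicated (key, index) pair
dup_rows <- duplicates(raw, key = region, index = Month)

# Flag only the repeats (the first occurrence is not flagged), then drop them
repeat_flag <- are_duplicated(raw, key = region, index = Month)
clean <- raw[!repeat_flag, ]

# With unique (key, index) pairs the conversion succeeds
clean_ts <- as_tsibble(clean, key = region, index = Month)
```

Inspecting dup_rows before dropping anything helps decide whether the repeats are true duplicates or conflicting measurements that should be averaged or corrected instead.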

12.2.2 Missing values
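
Missing values in a tsibble can be explicit (a row with NA) or implicit (a time point with no row at all). A minimal sketch of finding and exposing implicit gaps with has_gaps(), scan_gaps(), count_gaps(), and fill_gaps() follows; the series is made up, with March 2001 deliberately left out.

```r
library(tsibble)

# Hypothetical monthly series with no row for March 2001
gappy <- tibble::tibble(
  Month = yearmonth(as.Date(c("2001-01-01", "2001-02-01",
                              "2001-04-01", "2001-05-01"))),
  sales = c(1, 2, 4, 5)
)
gappy_ts <- as_tsibble(gappy, index = Month)

# Is any time point missing?
gap_flag <- has_gaps(gappy_ts)$.gaps

# Which time points are missing, and how long is each run of missing points?
missing_points <- scan_gaps(gappy_ts)
gap_summary    <- count_gaps(gappy_ts)

# Turn implicit gaps into explicit rows; measured columns become NA
filled_ts <- fill_gaps(gappy_ts)
```

After fill_gaps() the missing month appears as a row with sales equal to NA, which can then be left as is or imputed, depending on the analysis.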

12.2.3 Irregular time intervals
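
When observations arrive at irregular times (for example, event logs), one option is to declare the tsibble as irregular with regular = FALSE; another is to aggregate to a regular interval first. The sketch below shows both on made-up timestamps; aggregating by hourly sums is an assumption chosen for illustration, not a recommendation for every data set.

```r
library(dplyr)
library(tsibble)

# Hypothetical event log with unevenly spaced timestamps
events <- tibble::tibble(
  time  = as.POSIXct(c("2001-01-01 00:03:10", "2001-01-01 00:47:55",
                       "2001-01-01 01:15:00", "2001-01-01 02:20:30"),
                     tz = "UTC"),
  value = c(1, 2, 3, 4)
)

# Option 1: keep the raw timestamps and mark the series as irregular
irregular_ts <- as_tsibble(events, index = time, regular = FALSE)
is_regular(irregular_ts)

# Option 2: aggregate to hourly totals, then build a regular tsibble
hourly_ts <- events %>%
  mutate(hour = lubridate::floor_date(time, "hour")) %>%
  group_by(hour) %>%
  summarise(value = sum(value)) %>%
  as_tsibble(index = hour)
is_regular(hourly_ts)
```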