Chapter 12 Transform a data frame to a tsibble object
Readings: FPP3, Section 2.1
Convert the data frame into a tsibble (time series tibble) object, declaring the date column as the index.
# install.packages("tsibble")
library(tsibble) # Reference: https://tsibble.tidyverts.org/articles/intro-tsibble.html
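# Read the sales data (Feather format)
esales <- arrow::read_feather("data/esales.feather")
# Keep the date and rename the value column so its units (GWh) are explicit
esales %>%
  dplyr::select(date, sales_GWh = value) -> esales_tbl
# Declare `date` as the time index
esales_tbl %>% as_tsibble(index = date) -> elsales_tbl_ts
print(elsales_tbl_ts)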
## # A tsibble: 233 x 2 [1D]
## date sales_GWh
## <date> <dbl>
## 1 2001-01-01 9576.
## 2 2001-02-01 7820.
## 3 2001-03-01 8070.
## 4 2001-04-01 7153.
## 5 2001-05-01 7224.
## 6 2001-06-01 8264.
## 7 2001-07-01 8896.
## 8 2001-08-01 9404.
## 9 2001-09-01 7753.
## 10 2001-10-01 7272.
## # … with 223 more rows
12.1 Time indexing
See R4DS ch. 16.
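Depending on how dates and times are recorded in your raw data, you may face more or less work to organize them into a form suitable for use as a tsibble index variable. The lubridate and hms packages may be valuable.
For monthly data such as these, tsibble::yearmonth() converts a Date into a monthly index, so that the interval is recognized as [1M] rather than the [1D] shown above.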
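As a minimal sketch of that kind of preparation, suppose the raw dates arrive as day/month/year text. The data frame `raw`, its column `date_chr`, and the three values are hypothetical and only for illustration; `lubridate::dmy()` and `tsibble::yearmonth()` are the real functions doing the work.

```r
library(tsibble)

# Hypothetical raw data: dates recorded as day/month/year text
raw <- data.frame(
  date_chr  = c("01/01/2001", "01/02/2001", "01/03/2001"),
  sales_GWh = c(9576, 7820, 8070)
)

# dmy() parses day-month-year strings into Date values
raw$date <- lubridate::dmy(raw$date_chr)

# yearmonth() turns each Date into a monthly index value
raw$Month <- yearmonth(raw$date)

# Build the tsibble on the monthly index
raw_ts <- as_tsibble(raw[, c("Month", "sales_GWh")], index = Month)
print(raw_ts)
```

lubridate is called with the `lubridate::` prefix rather than attached, because attaching it after tsibble would mask `tsibble::interval()`.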
# install.packages("feasts") # Reference: https://feasts.tidyverts.org/
library(feasts)
# Replace the Date index with a monthly (yearmonth) index
elsales_tbl_ts %>%
  dplyr::mutate(Month = tsibble::yearmonth(date)) %>%
  as_tsibble(index = Month) %>%
  dplyr::select(Month, sales_GWh) -> vaelsales_tbl_ts
print(vaelsales_tbl_ts)
## # A tsibble: 233 x 2 [1M]
## Month sales_GWh
## <mth> <dbl>
## 1 2001 Jan 9576.
## 2 2001 Feb 7820.
## 3 2001 Mar 8070.
## 4 2001 Apr 7153.
## 5 2001 May 7224.
## 6 2001 Jun 8264.
## 7 2001 Jul 8896.
## 8 2001 Aug 9404.
## 9 2001 Sep 7753.
## 10 2001 Oct 7272.
## # … with 223 more rows
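The change from [1D] to [1M] in the two headers is more than cosmetic. The sketch below uses a hypothetical four-row toy data frame (`toy`, with values copied from the first rows printed above) to show one consequence: with a Date index, tsibble treats every day between observations as a missing row, whereas with a yearmonth index the same data have no gaps.

```r
library(tsibble)

# Hypothetical toy data: four first-of-month dates
toy <- data.frame(
  date      = seq(as.Date("2001-01-01"), by = "month", length.out = 4),
  sales_GWh = c(9576, 7820, 8070, 7153)
)

# Indexed by Date: the interval is inferred from the spacing in days
toy_daily <- as_tsibble(toy, index = date)
interval(toy_daily)
has_gaps(toy_daily)$.gaps    # TRUE: most days have no row

# Indexed by yearmonth: one row per month, so nothing is missing
toy$Month <- yearmonth(toy$date)
toy_monthly <- as_tsibble(toy[, c("Month", "sales_GWh")], index = Month)
interval(toy_monthly)
has_gaps(toy_monthly)$.gaps  # FALSE
```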
12.2 Running diagnostics on your tsibble
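Ideally, a tsibble should have exactly one row (i.e., one vector of measured values) for each time interval (index) and each value of the key variables.
– It must not contain duplicates: as_tsibble() raises an error if two rows share the same index and key.
– It may have missing values, including implicit gaps where a time interval has no row at all.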
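These checks can be sketched with tsibble's own helpers. The data frame `toy` below is hypothetical: it repeats February and omits March on purpose, so that both problems show up. `are_duplicated()`, `duplicates()`, `has_gaps()`, `scan_gaps()`, and `fill_gaps()` are tsibble functions; everything else is illustrative.

```r
library(tsibble)

# Hypothetical toy data: February appears twice, March is absent
toy <- data.frame(
  Month = yearmonth(as.Date(c("2001-01-01", "2001-02-01",
                              "2001-02-01", "2001-04-01"))),
  sales_GWh = c(9576, 7820, 7820, 7153)
)

# Duplicates: check the plain data frame before calling as_tsibble(),
# which would otherwise stop with an error
dup_flags <- are_duplicated(toy, index = Month)  # flags the repeated row
duplicates(toy, index = Month)                   # lists all rows involved

# Drop the flagged repeat, then build the tsibble
toy_ts <- as_tsibble(toy[!dup_flags, ], index = Month)

# Gaps: months with no row at all
has_gaps(toy_ts)             # is anything missing?
scan_gaps(toy_ts)            # which index values are missing?
toy_full <- fill_gaps(toy_ts)  # insert the missing rows with NA values
print(toy_full)
```

Whether to drop a duplicate or to fill a gap with NA is a judgment about the data; the functions only locate the problem rows.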