6  Dates with lubridate

In the script about data import, you learned how to parse dates with parse_date(): dates either have to follow a standard format or you need to specify the format yourself. This is often tedious. That’s where the lubridate package (Grolemund and Wickham 2011) comes in: it provides handier parsing functions. They all take a character vector, and each function’s name reflects the order of the date’s components (y for year, m for month, d for day). The functions recognize non-digit separators and are therefore, most of the time, a hassle-free way to parse dates.

library(lubridate)

Attaching package: 'lubridate'
The following objects are masked from 'package:base':

    date, intersect, setdiff, union
ymd("2000-02-29")
[1] "2000-02-29"
ymd("2000 02 29")
[1] "2000-02-29"
dmy("29.02.2000")
[1] "2000-02-29"
mdy("02-29-2000")
[1] "2000-02-29"
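Because the parsers are vectorized, one call can handle a whole column whose entries use different separators. The following is a minimal sketch; the vector raw_dates is made up for illustration, and quiet = TRUE is used only to hide the warning that a failed parse would otherwise raise.

```r
library(lubridate)

# Hypothetical input: same component order, inconsistent separators
raw_dates <- c("2000-02-29", "2000/03/01", "2000.03.02", "20000303")
parsed <- ymd(raw_dates)
parsed

# Strings that cannot be parsed become NA (quiet = TRUE hides the warning)
failed <- ymd("not a date", quiet = TRUE)
is.na(failed)
```

Checking for NA after parsing is a cheap way to spot entries that did not match the expected component order.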

There is also a function for quarters:

yq("2000: Q3")
[1] "2000-07-01"

6.1 Date-times

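The parsing functions also have date-time equivalents. Appending _h, _hm, or _hms to the function name tells it which time components to expect: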

ymd_hms("2000-02-29 14:00:00")
[1] "2000-02-29 14:00:00 UTC"
mdy_hm("02-29-2000 10.04")
[1] "2000-02-29 10:04:00 UTC"
dmy_h("29.02.2000 10")
[1] "2000-02-29 10:00:00 UTC"
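The outputs above are all in UTC. If the clock time in your data refers to a different time zone, the date-time parsers accept a tz argument. A minimal sketch, assuming the time was recorded in Berlin:

```r
library(lubridate)

# Interpret the clock time as local time in Berlin instead of UTC
berlin <- ymd_hms("2000-02-29 14:00:00", tz = "Europe/Berlin")
berlin
tz(berlin)
```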

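6.2 More handy things

today() returns the current date, and now() returns the current date-time in your computer’s time zone: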

today()
[1] "2022-11-23"
now()
[1] "2022-11-23 13:05:04 CET"

6.3 Manipulating dates

6.3.1 Components

You can also extract individual components from dates and date-times using the following functions:

example_datetime <- ymd_hms("2000-02-29 14:00:00")

date(example_datetime)
[1] "2000-02-29"
year(example_datetime)
[1] 2000
month(example_datetime)
[1] 2
day(example_datetime)
[1] 29
hour(example_datetime)
[1] 14
minute(example_datetime)
[1] 0
second(example_datetime)
[1] 0
week(example_datetime)
[1] 9
quarter(example_datetime)
[1] 1
semester(example_datetime)
[1] 1
am(example_datetime)
[1] FALSE
pm(example_datetime)
[1] TRUE
leap_year(example_datetime)
[1] TRUE
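Two further uses of the accessor functions may be worth knowing: some of them can return labels instead of numbers, and they can also be used to change a component. A short sketch (the object x is introduced only for this example; the printed labels depend on your locale):

```r
library(lubridate)

x <- ymd_hms("2000-02-29 14:00:00")

# Day of the week as a number; by default, 1 is Sunday
wday(x)

# Labelled versions return ordered factors
wday(x, label = TRUE)
month(x, label = TRUE)

# Accessors also work on the left-hand side of an assignment
hour(x) <- 9
x
```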

6.3.2 Rounding

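Sometimes you will also want to round dates, e.g., if you count observations per month. floor_date(), round_date(), and ceiling_date() round to a given unit; rollback() moves a date to the last day of the previous month or, with roll_to_first = TRUE, to the first day of its own month.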

floor_date(example_datetime, unit = "month")
[1] "2000-02-01 UTC"
floor_date(example_datetime, unit = "3 months")
[1] "2000-01-01 UTC"
round_date(example_datetime, unit = "year")
[1] "2000-01-01 UTC"
ceiling_date(example_datetime, unit = "day")
[1] "2000-03-01 UTC"
rollback(example_datetime, roll_to_first = FALSE, preserve_hms = TRUE)
[1] "2000-01-31 14:00:00 UTC"
rollback(example_datetime, roll_to_first = TRUE, preserve_hms = FALSE)
[1] "2000-02-01 UTC"
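Counting observations per month, as mentioned at the start of this section, could look as follows. This is only a sketch with made-up dates (obs); in practice you would probably do the same inside a data frame.

```r
library(lubridate)

# Hypothetical observation dates
obs <- ymd(c("2000-01-03", "2000-01-17", "2000-02-01", "2000-02-29", "2000-03-15"))

# Map every date to the first day of its month, then count
obs_month <- floor_date(obs, unit = "month")
counts <- table(obs_month)
counts
```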

6.4 Time zones

Dealing with time zones is tedious. By default, lubridate’s parsing functions set the time zone of every date-time you provide to UTC (Coordinated Universal Time). However, sometimes you need to change it, e.g., when you deal with flight data. lubridate provides some handy functions for doing so. Generally speaking, you will not work with them often.

First, you need to know the names of the time zones that these functions accept. OlsonNames() lists them:

head(OlsonNames()) # wrapped it with head() because it's 593 in total
[1] "Africa/Abidjan"     "Africa/Accra"       "Africa/Addis_Ababa"
[4] "Africa/Algiers"     "Africa/Asmara"      "Africa/Asmera"     

If you want to assign a new time zone to a date-time object while keeping its clock time (2 o’clock UTC becomes 2 o’clock CET, which is a different moment in time), use force_tz():

force_tz(example_datetime, tzone = "CET")
[1] "2000-02-29 14:00:00 CET"

If you want to display the same moment in time in a different time zone, so that the clock time changes but the instant does not (useful, for example, for appointments all around the world), use with_tz(). The now() function mentioned above uses your computer’s time zone:

with_tz(now(), tzone = "US/Eastern")
[1] "2022-11-23 07:05:04 EST"
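The difference between the two functions is easiest to see when both are applied to the same object and the results are compared with the original. A minimal sketch (the object names are chosen for this example only):

```r
library(lubridate)

utc_time <- ymd_hms("2000-02-29 14:00:00")

shown  <- with_tz(utc_time, tzone = "CET")   # same instant, new clock time
forced <- force_tz(utc_time, tzone = "CET")  # same clock time, new instant

shown == utc_time    # TRUE: still the same moment
forced == utc_time   # FALSE: 14:00 CET is 13:00 UTC
difftime(utc_time, forced, units = "hours")
```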

6.5 Periods, durations, intervals

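You will also want to do some calculations based on the dates and times you have parsed. lubridate offers three classes for this: periods (spans in clock time, such as “one month”), durations (exact numbers of seconds), and intervals (spans with a fixed start and end point).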

6.5.1 Periods

A period can be created by calling the function named after the pluralized time unit:

months(3) + days(5)
[1] "3m 5d 0H 0M 0S"

Another way of doing so, which is well suited for automation, is the period() constructor:

period(num = 5, units = "years")
[1] "5y 0m 0d 0H 0M 0S"

You can also pass vectors to num and units to combine several units in one period:

period(num = 1:5, units = c("years", "months", "days", "hours", "minutes"))
[1] "1y 2m 3d 4H 5M 0S"
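Periods are typically added to dates. One pitfall is month arithmetic at the end of a month, where the target day may not exist. The sketch below shows this and lubridate’s %m+% operator as one way around it; jan31 is just an example object.

```r
library(lubridate)

jan31 <- ymd("2000-01-31")

# February 31 does not exist, so plain period arithmetic yields NA
jan31 + months(1)

# %m+% rolls back to the last valid day of the target month
jan31 %m+% months(1)
```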

6.5.2 Durations

Durations can be used to model physical processes. They are stored in seconds and can be created by prefixing the name of a period function with d:

dweeks(x = 1)
[1] "604800s (~1 weeks)"

Again, there’s a constructor function. Given vectors, it returns one duration per element rather than combining them into one:

duration(num = 1:5, units = c("years", "months", "days", "hours", "minutes"))
[1] "31557600s (~1 years)"  "5259600s (~8.7 weeks)" "259200s (~3 days)"    
[4] "14400s (~4 hours)"     "300s (~5 minutes)"    

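How long do I have to wait until Christmas? Subtracting one date from another answers this, although the result is a base R difftime rather than a lubridate duration (as.duration() converts it):

ymd("2022-12-24") - today()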
Time difference of 31 days
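Where periods and durations differ becomes visible around a daylight saving time change. The following sketch assumes the switch to summer time in Germany on 27 March 2022; the object start is introduced for this example only.

```r
library(lubridate)

# Noon on the day before clocks move forward in Germany
start <- ymd_hms("2022-03-26 12:00:00", tz = "Europe/Berlin")

start + days(1)   # period: same clock time on the next day
start + ddays(1)  # duration: exactly 86400 seconds later

# A difftime from date subtraction can be converted to a duration
waiting <- as.duration(ymd("2022-12-24") - ymd("2022-11-23"))
waiting
```

Adding the period keeps the clock at 12:00, whereas adding the duration lands at 13:00 because that day is only 23 hours long.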

6.5.3 Intervals

Intervals can be created by using the interval() function or by using the %--% operator.

interval(today(), ymd("2022-12-24"))
[1] 2022-11-23 UTC--2022-12-24 UTC
today() %--% ymd("2022-12-24")
[1] 2022-11-23 UTC--2022-12-24 UTC

You can divide an interval by a duration to determine its physical length:

christmas <- today() %--% ymd("2022-12-24")
christmas/ddays(x = 1)
[1] 31

You can divide an interval by a period to determine its implied length in clock time:

christmas/days(x = 1)
[1] 31
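In the example above, both divisions give the same result. They can differ, though, for instance across a leap year. A small sketch with an interval chosen for illustration (leap):

```r
library(lubridate)

# The year 2000 is a leap year with 366 days
leap <- ymd("2000-01-01") %--% ymd("2001-01-01")

leap / years(1)   # period: exactly one calendar year
leap / dyears(1)  # duration: slightly more than one 365.25-day year
leap / ddays(1)   # 366 physical days
```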

If you want to know its length in seconds, use int_length():

int_length(christmas)
[1] 2678400

There are also some other things you can do with intervals.

Does a given date, here the start of the 2020 winter semester, fall within the period between now and Christmas?

ymd("2020-11-04") %within% interval(today(), ymd("2022-12-24"))
[1] FALSE

Reverse the direction of the interval:

int_flip(interval(today(), ymd("2022-12-24")))
[1] 2022-12-24 UTC--2022-11-23 UTC

You can also shift an interval, e.g., from “today until Christmas” to “tomorrow until December 25”:

int_shift(christmas, by = days(1))
[1] 2022-11-24 UTC--2022-12-25 UTC
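A few more interval helpers, sketched with two made-up intervals (project and vacation are illustrative names and dates, not taken from the text above): int_start() and int_end() return the end points, and int_overlaps() checks whether two intervals share any time.

```r
library(lubridate)

# Hypothetical intervals
project  <- ymd("2022-11-27") %--% ymd("2022-12-24")
vacation <- ymd("2022-12-01") %--% ymd("2022-12-31")

int_start(project)               # start point
int_end(project)                 # end point
int_overlaps(project, vacation)  # do the two intervals share any time?
```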