16 Dates and Times
Dates and times are not the most pleasant objects to deal with. At times they can be confusing and even annoying. For instance, time zones, and conventions such as daylight saving time, leap years and leap seconds, are tricky things to watch out for when working with dates and times.
In this chapter, we discuss the base R functions that handle the creation, conversion, extraction, and calculation of dates and times. Additionally, we touch upon features made available in the package lubridate, which makes manipulating dates and times more intuitive in terms of conversion, accessing components and calculating time spans. lubridate is also a member of tidyverse.
16.1 Date and time classes
There are three basic date and time classes in R: Date, POSIXct and POSIXlt. Date handles dates without times. POSIXct (calendar time) and POSIXlt (local time) represent dates and times.
Date, POSIXct
Date and POSIXct are internally stored as the number of days and the number of seconds, respectively, since January 1, 1970.
Below we create a new variable, now, that holds the current system time. It is a POSIXct object.
## [1] "POSIXct" "POSIXt"
If we remove its class, we see that it is a numeric value: the number of seconds since January 1, 1970.
## [1] 1716126509
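A minimal sketch that would produce output of this shape. The variable name now comes from the text; the Date example at the end is an added illustration, and the printed values depend on when the code is run.

```r
# Capture the current system time as a POSIXct object
now <- Sys.time()

# POSIXct objects carry the class "POSIXct" and the parent class "POSIXt"
class(now)

# Removing the class exposes the number of seconds since 1970-01-01 (UTC)
unclass(now)

# A Date is stored the same way, but counted in days
one_day_after_epoch <- unclass(as.Date("1970-01-02"))
one_day_after_epoch
```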
POSIXlt
POSIXlt stores dates and times as a list of components: second, minute, hour, day, month, year, time zone etc.
If we convert now to POSIXlt and remove its class, we see all of its components.
## [1] "2024-05-19 21:48:29 CST"
## $sec
## [1] 29.49559
##
## $min
## [1] 48
##
## $hour
## [1] 21
##
## $mday
## [1] 19
##
## $mon
## [1] 4
##
## $year
## [1] 124
##
## $wday
## [1] 0
##
## $yday
## [1] 139
##
## $isdst
## [1] 0
##
## $zone
## [1] "CST"
##
## $gmtoff
## [1] 28800
##
## attr(,"tzone")
## [1] "" "CST" "CDT"
## attr(,"balanced")
## [1] TRUE
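The listing above could come from a sketch like the following, assuming now was created with Sys.time(). The last two lines are an added illustration of why $year shows 124 and $mon shows 4 for May 2024: year counts from 1900 and mon counts from 0.

```r
now <- Sys.time()

# Convert to the list-based representation
now_lt <- as.POSIXlt(now)
now_lt

# Removing the class shows the underlying list of components
components <- unclass(now_lt)
components

# year is stored as years since 1900, mon as months since January
now_lt$year + 1900
now_lt$mon + 1
```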
16.2 Converting dates and times
There are a couple of situations where we want to convert dates and times.
as.Date(), as.POSIXct(), as.POSIXlt()
Most often, we want to convert characters to dates and times. For that we can use the base R functions as.Date(), as.POSIXct(), as.POSIXlt() and strptime(), and the group of lubridate functions ymd(), yq(), hm(), ymd_hms() etc.
For instance, we have a date string date_string, and we want to convert it to a date object.
## [1] "2019-01-14"
We can also convert it to a datetime object, either with as.POSIXct() or as.POSIXlt().
## [1] "2019-01-14 14:17:30 CST"
## [1] "2019-01-14 14:17:30 CST"
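A sketch of these conversions. The value of date_string is an assumption inferred from the printed results; the time zone shown when printing depends on the system setting.

```r
# Assumed input, inferred from the printed results
date_string <- "2019-01-14 14:17:30"

# Date: the time part is ignored
as.Date(date_string)

# Datetime stored as seconds since 1970-01-01
dt_ct <- as.POSIXct(date_string)
dt_ct

# Datetime stored as a list of components
dt_lt <- as.POSIXlt(date_string)
dt_lt
```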
Because our string comes in an unambiguous format, R is able to recognize its year, month, day, hour, minute, and second components.
In other words, for R to understand what we want, it is important to supply dates and times in a correct input format.
input formats
as.Date(), as.POSIXct() and as.POSIXlt() accept various input formats.
There are two default input formats: year-month-day hour:minutes:seconds or year/month/day hour:minutes:seconds.
## [1] "2019-01-14"
## [1] "2019-01-14 14:17:30 CST"
## [1] "2019-01-14 14:17:30 CST"
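For the slash-separated default format, a sketch along these lines should behave the same way; the input string is an assumption chosen to match the printed results.

```r
# Assumed input in the year/month/day default format
slash_string <- "2019/01/14 14:17:30"

# No format argument is needed for either default format
as.Date(slash_string)
slash_ct <- as.POSIXct(slash_string)
slash_ct
as.POSIXlt(slash_string)
```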
If the input format is not standard, we need to set the format argument to describe how the input is laid out.
- %b: abbreviated month name
- %m: month as decimal number (01–12)
- %c: date and time
- %d: day of the month as decimal number (01–31)
- %e: day of the month as decimal number (1–31)
- %H: hours as decimal number (00–23); strings such as 24:00:00 are accepted for input
- %I: hours as decimal number (01–12)
- %M: minute as decimal number (00–59)
- %S: second as integer (00–61)
- %OS: seconds including fractional seconds
- %Y: year with century
- %y: year without century (00–99)
- …
The full list of allowed formats can be found with ?strptime.
Let’s see some examples. Below we have a string that we want to convert to a date object. Its format is ambiguous to R, so we need to supply a format string that tells as.Date() how to read it.
To build the format string, we use the percent symbol followed by the relevant conversion codes.
- %d: day of the month as decimal number (01–31)
- %b: abbreviated month name
- %Y: year with century
- %H: hours as decimal number (00–23); strings such as 24:00:00 are accepted for input
- %M: minute as decimal number (00–59)
- %S: second as integer (00–61)
## [1] "2019-01-14"
## [1] "2019-01-14 14:17:30 CST"
## [1] "2019-01-14 14:17:30 CST"
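A sketch of such a conversion. The input string is an assumption built from the codes listed above; because %b depends on the locale, the sketch first switches month names to the "C" (English) locale.

```r
# %b matches month names in the current locale; use English names here
invisible(Sys.setlocale("LC_TIME", "C"))

# Assumed input: day, abbreviated month name, year, then the time
date_string <- "14jan2019 14:17:30"
fmt <- "%d%b%Y %H:%M:%S"

parsed_date <- as.Date(date_string, format = fmt)
parsed_date

parsed_ct <- as.POSIXct(date_string, format = fmt)
parsed_ct

as.POSIXlt(date_string, format = fmt)
```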
As another example, timestamps often come with a T between the date and the time.
To convert such a string to a datetime object, the format string must reproduce the separators literally: slashes between the day, month and year, and a “T” between the date and the time. We build the format string from the codes below.
- %d: day of the month as decimal number (01–31)
- %m: month as decimal number (01–12)
- %Y: year with century
- %H: hours as decimal number (00–23); strings such as 24:00:00 are accepted for input
- %M: minute as decimal number (00–59)
- %S: second as integer (00–61)
## [1] "2019-01-14"
## [1] "2019-01-14 14:17:30 CST"
## [1] "2019-01-14 14:17:30 CST"
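A sketch of the T-separated case. The input string is an assumption following the day/month/year order of the codes above; the literal slashes and the T appear unchanged in the format string.

```r
# Assumed input: slashes inside the date, a literal T before the time
stamp <- "14/01/2019T14:17:30"
fmt <- "%d/%m/%YT%H:%M:%S"

stamp_date <- as.Date(stamp, format = fmt)
stamp_date

stamp_ct <- as.POSIXct(stamp, format = fmt)
stamp_ct

as.POSIXlt(stamp, format = fmt)
```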
strptime()
strptime() converts characters to POSIXlt datetime objects.
## [1] "2019-01-14 14:17:30 CST"
## [1] "2019-01-14 14:17:30 CST"
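A sketch of strptime() on two differently formatted strings; both inputs are assumptions chosen to give the printed result.

```r
# Standard layout, with the format spelled out explicitly
a <- strptime("2019-01-14 14:17:30", format = "%Y-%m-%d %H:%M:%S")
a

# Non-standard layout with slashes and a literal T
b <- strptime("14/01/2019T14:17:30", format = "%d/%m/%YT%H:%M:%S")
b

# Both results are POSIXlt objects
class(a)
```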
lubridate
lubridate provides more intuitive ways to convert characters to dates and times.
ymd(), ydm(), mdy(), myd(), dmy() and dym() parse dates with year, month, and day components, while yq() parses a year and a quarter. The function names suggest the formatting. For instance, ymd() indicates that the first component is year, the second is month, and the third is day.
## [1] "2019-01-14"
## [1] "2019-01-14"
## [1] "2019-01-14"
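A sketch with lubridate; the three input strings are assumptions that all describe the same day in different component orders.

```r
library(lubridate)

# The function name states the order of the components in the input
d1 <- ymd("2019-01-14")
d2 <- mdy("01/14/2019")
d3 <- dmy("14.01.2019")

d1
d2
d3
```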
hm(), ms() and hms() parse periods with hour, minute, and second components.
## [1] "14H 17M 30S"
## [1] "14H 17M 0S"
## [1] "17M 30S"
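The three results above could come from calls like these; the input strings are assumptions inferred from the printed periods.

```r
library(lubridate)

# Each function reads the components named in its own name
p_hms <- hms("14:17:30")  # hours, minutes, seconds
p_hm  <- hm("14:17")      # hours, minutes
p_ms  <- ms("17:30")      # minutes, seconds

p_hms
p_hm
p_ms
```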
ymd_hms(), ymd_hm(), ymd_h(), dmy_hms(), dmy_hm(), dmy_h(), mdy_hms(), mdy_hm(), mdy_h(), ydm_hms(), ydm_hm() and ydm_h() parse datetimes with year, month, day, hour, minute, and second components.
## [1] "2019-01-14 14:17:30 UTC"
## [1] "2019-01-14 14:17:30 UTC"
## [1] "2019-01-14 14:17:30 UTC"
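A sketch of the datetime parsers; the inputs are assumptions. Note that these functions return UTC unless a tz argument is given, which matches the UTC shown in the output.

```r
library(lubridate)

# Same instant written in three component orders
t1 <- ymd_hms("2019-01-14 14:17:30")
t2 <- dmy_hms("14/01/2019 14:17:30")
t3 <- mdy_hms("01/14/2019 14:17:30")

t1
t2
t3
```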
converting from Unix timestamp
A Unix timestamp is the number of seconds that have elapsed since the Unix epoch, January 1, 1970 (UTC). To convert a Unix timestamp to a datetime object, we set the origin argument.
date <- c(1304362260, 1216256400, 1311344765, 1331309010, 1297437420, 1417795235)
date <- as.POSIXct(date, origin = "1970-01-01")
date
## [1] "2011-05-03 02:51:00 CST" "2008-07-17 09:00:00 CST" "2011-07-22 22:26:05 CST" "2012-03-10 00:03:30 CST"
## [5] "2011-02-11 23:17:00 CST" "2014-12-06 00:00:35 CST"
If we don’t specify the origin, versions of R before 4.3.0 stop with an error asking for it; from R 4.3.0 onwards, the origin defaults to 1970-01-01.
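The printed clock times depend on the system time zone. As a sketch, passing tz = "UTC" (an addition not used in the example above) makes the result reproducible, and as.numeric() recovers the original timestamp.

```r
stamp <- 1304362260

# Interpret the number as seconds since the epoch and print it in UTC
dt <- as.POSIXct(stamp, origin = "1970-01-01", tz = "UTC")
dt

# The conversion is lossless: the stored value is still the timestamp
as.numeric(dt)
```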
16.3 Creating dates and times
There are several methods in R to create dates and times.
creating sequences
First of all, we can generate date and datetime sequences using seq(), which we have already used to generate a sequence of numbers.
## [1] "2019-01-14" "2019-02-14" "2019-03-14" "2019-04-14" "2019-05-14" "2019-06-14" "2019-07-14" "2019-08-14"
## [9] "2019-09-14" "2019-10-14" "2019-11-14" "2019-12-14" "2020-01-14"
from specifies the starting date, to specifies the ending date, and by specifies the step, which can be month or another value, such as week.
## [1] "2019-01-14" "2019-01-21" "2019-01-28" "2019-02-04" "2019-02-11" "2019-02-18" "2019-02-25" "2019-03-04"
## [9] "2019-03-11" "2019-03-18" "2019-03-25" "2019-04-01" "2019-04-08" "2019-04-15" "2019-04-22" "2019-04-29"
## [17] "2019-05-06" "2019-05-13" "2019-05-20" "2019-05-27" "2019-06-03" "2019-06-10" "2019-06-17" "2019-06-24"
## [25] "2019-07-01" "2019-07-08"
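A sketch of calls that would give the two sequences shown. The start date is taken from the output; the end date of the weekly sequence is an assumption, since any end date from 2019-07-08 to 2019-07-14 gives the same result.

```r
# Monthly steps over one year
monthly <- seq(from = as.Date("2019-01-14"),
               to = as.Date("2020-01-14"),
               by = "month")
monthly

# Weekly steps; the sequence stops at the last step not beyond 'to'
weekly <- seq(from = as.Date("2019-01-14"),
              to = as.Date("2019-07-14"),
              by = "week")
weekly
```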
combining date and time components
We can also use paste() to concatenate characters and then convert them to date and time objects.
date <- c("14jan2019", "14feb2019", "14mar2019")
time <- c("14:17:30", "15:17:30", "16:17:30")
as.POSIXct(paste(date, time), format = "%d%b%Y %H:%M:%S")
## [1] "2019-01-14 14:17:30 CST" "2019-02-14 15:17:30 CST" "2019-03-14 16:17:30 CST"
Other options are the lubridate functions make_datetime() and make_date(), which create datetime and date objects from numeric components.
## [1] "2019-01-14 14:17:30 UTC"
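A sketch of both functions; the numeric components are assumptions chosen to match the printed datetime. make_datetime() uses UTC unless a tz argument is given.

```r
library(lubridate)

# Build a datetime from year, month, day, hour, minute, second
dt <- make_datetime(year = 2019, month = 1, day = 14,
                    hour = 14, min = 17, sec = 30)
dt

# Build a date from year, month, day
d <- make_date(year = 2019, month = 1, day = 14)
d
```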
16.4 Extracting date and time components
Another common task in date and time manipulation is to extract date and time components.
obtaining POSIXlt elements
For instance, to access the components of POSIXlt objects, we can use the $ operator to subset the elements we need.
## [1] "2024-05-19 21:48:29 CST"
## [1] "CST"
POSIXlt objects store date and time components such as second, minute, hour, day, month, year and time zone as a list.
strptime()
Additionally, strptime() returns POSIXlt objects, so we can also use $ to obtain components of its result. The outputs are numbers; note that year counts from 1900, while mon, wday and yday count from 0.
## [1] 119
## [1] 0
## [1] 14
## [1] 1
## [1] 13
## [1] 14
## [1] 17
## [1] 30
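The eight values above correspond to the following components, sketched here on the datetime used elsewhere in the chapter.

```r
x <- strptime("2019-01-14 14:17:30", format = "%Y-%m-%d %H:%M:%S")

x$year  # years since 1900
x$mon   # months since January (0-11)
x$mday  # day of the month
x$wday  # day of the week, 0 = Sunday
x$yday  # days since January 1 (0-365)
x$hour
x$min
x$sec
```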
strftime()
We can also extract components with strftime(), provided the string is in a standard unambiguous format. The outputs are character strings.
## [1] "2019"
## [1] "14:17:30"
## [1] "14:17"
## [1] "17:30"
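A sketch of strftime() calls that would give these four strings; the input string is an assumption.

```r
date_string <- "2019-01-14 14:17:30"

# The format codes select which parts to return as text
yr  <- strftime(date_string, format = "%Y")
hms <- strftime(date_string, format = "%H:%M:%S")
hm  <- strftime(date_string, format = "%H:%M")
ms  <- strftime(date_string, format = "%M:%S")

c(yr, hms, hm, ms)
```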
weekdays(), months(), quarters()
weekdays(), months(), and quarters() are base R functions to extract parts of a POSIXt or Date object.
## [1] "Sunday"
## [1] "May"
## [1] "Q2"
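A sketch using a fixed date instead of the current time, so the result is reproducible; the date is taken from the printed value of now earlier in the chapter. Day and month names depend on the locale, so the sketch switches to the "C" (English) locale first.

```r
# Day and month names follow the current locale; use English names here
invisible(Sys.setlocale("LC_TIME", "C"))

d <- as.Date("2024-05-19")

wd <- weekdays(d)
mo <- months(d)
qt <- quarters(d)

c(wd, mo, qt)
```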
lubridate accessor functions
lubridate offers a group of accessor functions to extract components from datetime objects. These include year(), month(), week(), date(), day(), mday() (day of the month), wday() (day of the week), hour(), minute(), second(), and tz() (time zone).
## [1] 2024
## [1] 5
## [1] Sun
## Levels: Sun < Mon < Tue < Wed < Thu < Fri < Sat
## [1] 29.82323
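A sketch of a few accessors on a fixed datetime; the value is an assumption based on the printed value of now, with the fractional seconds dropped.

```r
library(lubridate)

x <- ymd_hms("2024-05-19 21:48:29")

year(x)
month(x)
wday(x)                # numeric day of the week, 1 = Sunday by default
wday(x, label = TRUE)  # ordered factor with abbreviated day names
second(x)
tz(x)
```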
16.5 Difference between datetime objects
We can calculate the time difference between two datetime objects by subtracting one from the other.
x <- strptime("2019-01-14 14:17:30", "%Y-%m-%d %H:%M:%S")
y <- strptime("2018-12-14 18:10:12", "%Y-%m-%d %H:%M:%S")
x - y
## Time difference of 30.8384 days
The base R function difftime() calculates the difference between two datetime objects and returns a difftime object. It gives more control over how the difference is calculated, such as the units it is expressed in.
## Time difference of 30.8384 days
## Time difference of 740.1217 hours
## Time difference of 44407.3 mins
## Time difference of 2664438 secs
## Time difference of 30.8384 days
## Time difference of 4.405486 weeks
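The six results above correspond to the default units and the five explicit choices of the units argument. A sketch, with tz = "UTC" added as an assumption so that daylight saving changes cannot affect the result:

```r
x <- strptime("2019-01-14 14:17:30", "%Y-%m-%d %H:%M:%S", tz = "UTC")
y <- strptime("2018-12-14 18:10:12", "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Units are chosen automatically unless specified
difftime(x, y)

# Explicit units
difftime(x, y, units = "hours")
difftime(x, y, units = "mins")
secs <- difftime(x, y, units = "secs")
secs
difftime(x, y, units = "days")
difftime(x, y, units = "weeks")
```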
difftime objects can be converted to numeric objects with as.numeric().
## [1] 740.1217
## [1] 44407.3
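A sketch of the conversion. as.numeric() accepts a units argument for difftime objects, so the number can be requested in a unit other than the one the object was created with.

```r
x <- strptime("2019-01-14 14:17:30", "%Y-%m-%d %H:%M:%S", tz = "UTC")
y <- strptime("2018-12-14 18:10:12", "%Y-%m-%d %H:%M:%S", tz = "UTC")
d <- difftime(x, y)

# Plain numbers, in the requested units
in_hours <- as.numeric(d, units = "hours")
in_mins  <- as.numeric(d, units = "mins")

in_hours
in_mins
```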
lubridate also has its own ways of handling a time span, defined as a duration, a period or an interval.
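As a brief sketch of the three span types, using the two datetimes from the difftime() example: an interval is anchored to a start and an end, a duration is an exact number of seconds, and a period is expressed in calendar units.

```r
library(lubridate)

x <- ymd_hms("2019-01-14 14:17:30")
y <- ymd_hms("2018-12-14 18:10:12")

# Interval: the span between two specific instants
span <- interval(y, x)
span

# Duration: exact elapsed time in seconds
dur <- as.duration(span)
dur

# Period: the same span in calendar units (days, hours, ...)
as.period(span)

# Dividing by a one-day duration gives the length in days
span_days <- span / ddays(1)
span_days
```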
16.6 Working with time zones
In R, the time zone of a datetime object is stored as a character string in its tzone attribute. For POSIXct objects, this attribute only controls how the value is printed; the underlying number of seconds does not change.
R relies on the user’s operating system to interpret time zone names. We can get the list of time zone names known to the system with OlsonNames(), which draws on the time zone database originally compiled by Arthur Olson. Most names take the form “Area/Location”, such as “Africa/Abidjan”.
## [1] "Africa/Abidjan" "Africa/Accra" "Africa/Addis_Ababa" "Africa/Algiers" "Africa/Asmara"
## [6] "Africa/Asmera"
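The listing above is the start of the vector returned by OlsonNames(). A short sketch; the exact contents depend on the time zone database installed on the system.

```r
zones <- OlsonNames()

# First few names, in alphabetical order
head(zones)

# Check whether a particular name is available
"America/New_York" %in% zones
```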
POSIXct and POSIXlt classes contain the time zone attribute.
## [1] "2019-01-14 14:17:30 GMT"
## [1] "2019-01-14 14:17:30 GMT"
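A sketch that sets the time zone at creation with the tz argument. The final lines are an added illustration of the point that the attribute controls printing only; "Asia/Shanghai" is an arbitrary example zone.

```r
x_ct <- as.POSIXct("2019-01-14 14:17:30", tz = "GMT")
x_lt <- as.POSIXlt("2019-01-14 14:17:30", tz = "GMT")
x_ct
x_lt

# The time zone is kept in the tzone attribute
attr(x_ct, "tzone")

# Printing in another zone changes the clock time shown ...
shown <- format(x_ct, "%Y-%m-%d %H:%M:%S", tz = "Asia/Shanghai")
shown

# ... but the stored number of seconds is untouched
as.numeric(x_ct)
```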
with_tz(), force_tz()
To display a datetime object in a different time zone, or to assign a different time zone to it, we can use the two lubridate functions with_tz() and force_tz().
with_tz() displays the datetime in a different time zone; the instant it refers to is unchanged.
For instance, below x is a POSIXlt object whose time zone is CST. To display it in a different time zone, such as New York time, we use with_tz() and set tzone, the time zone to display.
## [1] "2019-01-14 14:17:30 CST"
## [1] "2019-01-14 01:17:30 EST"
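A sketch of with_tz(). The zone "Asia/Shanghai" is an assumption: the CST in the output is taken to be China Standard Time (UTC+8), which is consistent with the 13-hour gap to New York shown above.

```r
library(lubridate)

# Assumed original zone: China Standard Time (UTC+8)
x <- as.POSIXlt("2019-01-14 14:17:30", tz = "Asia/Shanghai")
x

# Same instant, displayed on a New York clock
x_ny <- with_tz(x, tzone = "America/New_York")
x_ny
```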
To assign a different time zone to a datetime object while keeping its clock time, we can use force_tz(). Unlike with_tz(), this changes the instant the object refers to. The values given to the tzone argument come from the Olson time zone database.
## [1] "2019-01-14 14:17:30 CST"
## [1] "2019-01-14 14:17:30 EST"
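A sketch of force_tz(), again assuming that the original zone is "Asia/Shanghai". The clock time stays at 14:17:30, so the underlying instant shifts by the 13-hour offset between the two zones.

```r
library(lubridate)

# Assumed original zone: China Standard Time (UTC+8)
x <- as.POSIXlt("2019-01-14 14:17:30", tz = "Asia/Shanghai")
x

# Same clock time, relabelled as New York time
x_forced <- force_tz(x, tzone = "America/New_York")
x_forced

# The instant has moved: the difference is 13 hours, in seconds
shift <- as.numeric(as.POSIXct(x_forced)) - as.numeric(as.POSIXct(x))
shift
```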