10.3 Dates and times with lubridate

The previous section has shown that base R provides basic support for handling dates and times, but as the corresponding classes and functions can be confusing, this remains challenging. To facilitate working with dates and times, the lubridate package (Spinu et al., 2020) provides a more coherent and user-friendly framework. This section illustrates key lubridate commands and concepts.

As we only need lubridate in those sessions in which we are dealing with dates and times, the package is not part of the core tidyverse. Hence, we need to load it in addition to the core tidyverse packages when we want to use it:

Just as base R provided two separate functions for obtaining the current date and time — specifically, Sys.Date() and Sys.time() — the lubridate package provides two corresponding functions:

To learn about the internal representation of both objects, we can inspect their class in R:

We see that the lubridate package uses the two key classes discussed in Section 10.2.2:

  • today() returns the current date (as a “Date” object).
  • now() returns the current calendar time (as a date-time, i.e., “POSIXct” object).
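As a minimal sketch of these two functions (the object names d_now and t_now are hypothetical and only serve this illustration), we can create both objects and inspect their classes:

```r
library(lubridate)

d_now <- today()  # current date
t_now <- now()    # current date-time (calendar time)

class(d_now)  # "Date"
class(t_now)  # "POSIXct" "POSIXt"
```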

Both the today() and now() functions also accept a tzone argument for specifying a time zone. To see what time zones are used by default, we can apply the tz() function to the results of both functions:
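The following sketch illustrates the tzone argument and the tz() function. The chosen time zone names are arbitrary examples, and the result of tz(now()) depends on the local system:

```r
library(lubridate)

tz(today())             # "UTC" (a Date carries no time zone of its own)
tz(now())               # depends on the local system setting
tz(now(tzone = "UTC"))  # "UTC"

# Current date as seen in another time zone (may differ from the local date):
today(tzone = "Pacific/Auckland")
```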

Thus, both today() and now() are convenient short-cuts, but should be handled with care when dealing with different time zones.

10.3.1 Parsing dates and times

When learning how to read and retrieve (elements of) dates and times in base R, we spent a lot of time and effort on conversion specifications (e.g., strings like "%Y-%m-%d" or "%H:%M:%S") that specified parsing and formatting instructions in the so-called POSIX standard (see Section 10.2.2). Although it is good to know POSIX, as it is widely used and powerful, it can also seem a bit cumbersome and clumsy. Thus, lubridate adopts a more intuitive approach to parsing dates and times.

To create new dates or times, lubridate provides functions that parse (i.e., read or scan) them from various other objects. Specifically, we can read dates or times

  1. from character strings (representing dates or times);
  2. from variables (denoting date or time components);
  3. from other types (i.e., date or time objects).

The next sections introduce the lubridate functions for each of these object types.

1. Read from character strings

In Chapter 6 on Importing data, we encountered some readr functions that parse character vectors into dates or times (see the parse_date(), parse_datetime() and parse_time() functions in Section 6.2.1).

The lubridate package provides even simpler tools for reading in dates and times. The function names are combinations of the initial letter of basic date and time components:

  • date components are: y year, m month, d day
  • time components are: h hour, m minute, s seconds

The order of these components in the function name determines how the arguments (provided as strings) are interpreted. Here are some examples:

  • Dates from strings: Without any further specification, a date-denoting string like "02 04 06" would be highly ambiguous (see Table 10.1 of Section 10.2.2). To read this string into a date, lubridate allows us to use a combination of d, m, and y to indicate which date-related element each numeric component describes:
  • Dates with times (i.e., date-times) from strings:
  • Times from strings:
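A minimal sketch of the three cases in the list above: the input strings are made-up examples, and the results shown in the comments assume that two-digit years such as "06" are read as years of the current century.

```r
library(lubridate)

# Dates from strings: the function name fixes the order of components
dmy("02 04 06")  # "2006-04-02"
mdy("02 04 06")  # "2006-02-04"
ymd("02 04 06")  # "2002-04-06"

# Dates with times from strings (UTC by default):
dt <- ymd_hms("2020-07-15 13:45:30")

# Times from strings (stored as a period of hours, minutes, and seconds):
tm <- hms("13:45:30")
```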

Note that the particular representation of date and time objects (e.g., as the columns of a tibble) varies with the functions that created the corresponding variable (column):

2. Read from date and time variables

Many datasets already contain variables that denote date components (i.e., values for years, months, weeks, or days) or time components (i.e., values for hours, minutes, or seconds). Given our skills in dealing with Text data (from Chapter 9) we could first paste these variables into a character string and then parse this string into a date or time variable. However, the lubridate package also provides more direct functions for converting such variables into dates or calendar times:

  • make_date() expects inputs to year, month, and day arguments to create an object of the “Date” class:

Note that make_date() accepts a variety of input types and fills in default values for missing elements.

  • make_datetime() expects the same inputs as make_date() (i.e., year, month, and day arguments), plus additional inputs to its time-related arguments (hour, min, sec, and a time zone tz, which defaults to UTC) to create a calendar time (i.e., “POSIXct”) object:

Note that make_datetime() is less flexible than make_date() in expecting that all its arguments are numeric. The lubridate package additionally includes a make_difftime() function for creating difftime objects in various units of time (see the section on durations below).
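Both functions can be sketched as follows (the particular dates and times are arbitrary examples):

```r
library(lubridate)

# Dates from numeric components (missing components use defaults):
d1 <- make_date(year = 2020, month = 2, day = 29)  # "2020-02-29"
d2 <- make_date(year = 2020)                       # "2020-01-01"

# Calendar times from numeric components (UTC unless tz is set):
t1 <- make_datetime(2020, 2, 29, 12, 30, 15)
t2 <- make_datetime(2020, 2, 29, 12, 30, 15, tz = "Europe/Berlin")
```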

To demonstrate these functions for creating dates or times from variables, we need some data that contains date and time variables. Lacking such a dataset, we can create one. The following code snippet does this by working backwards: We first use the sample_time() function from ds4psy (to draw random samples of calendar times within a specific range of time) and then use a dplyr pipe to extract its date- and time-related components.46

Just copy and run this code chunk and note that the resulting tibble dt_tb contains numeric columns that contain date and time components:

Table 10.1: Data containing typical date and time variables.

    yr  mt  dy  hr  mi  sc
  2020   1   1   5  22  30
  2020   2  10   3  15  39
  2020   3   7   8  17  17
  2020   3  31  19   5  35
  2020   6   4   6  35  19
  2020   6   4  22  25   1

Given this data, we can use make_date() for creating dates, or make_datetime() for creating calendar times:
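As a self-contained sketch, the following code enters the six rows of Table 10.1 by hand (as a base R data frame, which is an assumption made here for illustration, rather than the tibble created above) and adds a date and a datetime column:

```r
library(lubridate)

# Values copied from Table 10.1:
dt_tb <- data.frame(yr = rep(2020, 6),
                    mt = c(1, 2, 3, 3, 6, 6),
                    dy = c(1, 10, 7, 31, 4, 4),
                    hr = c(5, 3, 8, 19, 6, 22),
                    mi = c(22, 15, 17, 5, 35, 25),
                    sc = c(30, 39, 17, 35, 19, 1))

# Combine the components into dates and calendar times:
dt_tb$date     <- with(dt_tb, make_date(yr, mt, dy))
dt_tb$datetime <- with(dt_tb, make_datetime(yr, mt, dy, hr, mi, sc))
```

Note that this call to make_datetime() relies on its default time zone (UTC).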

To evaluate our success, we can verify whether the datetime column in dt_tb managed to re-construct the original vector dt_org created above:

Ooops — this is awkward: Why do we not get out the original times dt_org that we fed into the table? A first hypothesis could be that calendar times (of the “POSIXct” class) are numeric objects and hence some differences may be due to rounding. We can check this by using the round_date() function to round both our original vector (i.e., dt_org) and the newly created one (dt_tb$datetime) to the same units (e.g., “sec”):

So rounding did not diminish the difference, and the discrepancies observed here are far too large to be due to rounding differences. A clue to solving this puzzle is provided by computing the time difference between our newly created times dt_tb$datetime and the original times dt_org:

This shows that our new calendar times dt_tb$datetime are either 1 or 2 hours (i.e., 3600 seconds or 7200 seconds) later than our original times dt_org. To detect the source of this difference, let’s look more closely at both vectors again:

We can see that both vectors show the same dates and times, but for different time zones. Specifically, make_datetime() used “UTC” by default, whereas our original vector dt_org automatically used the current setting of our local system (here: “Europe/Berlin”):

Thus, to prevent such problems, we need to be explicit about the appropriate time zone when calling the make_datetime() function:

Note that setting tz = "" is a shortcut for using our local system time zone in make_datetime(), rather than its “UTC” default. We could have been even more explicit by stating tz = Sys.timezone() or tz = "Europe/Berlin".
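The effect of the tz argument can be sketched with two made-up dates (one in winter, one in summer). The same nominal clock time is created once in UTC and once in "Europe/Berlin":

```r
library(lubridate)

# Noon on January 4 and on June 4, 2020:
t_utc <- make_datetime(c(2020, 2020), c(1, 6), c(4, 4), c(12, 12), tz = "UTC")
t_ber <- make_datetime(c(2020, 2020), c(1, 6), c(4, 4), c(12, 12), tz = "Europe/Berlin")

# Noon in UTC occurs 1 hour (winter) or 2 hours (summer, DST) after noon in Berlin:
difftime(t_utc, t_ber, units = "hours")
```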

As both vectors now used the same time zone (i.e., Europe/Berlin), they should contain the same date-time points:

This is reassuring — and good that we compared our result to the original date-time vector. The important lesson to learn here is:

  • Always watch out for time zones when working with times.

We will reconsider this issue in Exercise 2 (see Section 10.6.2).

3. Read from dates or times

Given that R distinguishes between dates (e.g., of class “Date”, see Section 10.2.3) and dates with times (e.g., calendar times of class “POSIXct”, see Section 10.2.4), it is often necessary to switch between these formats. Let’s first re-create a date and a date-time object to work with:

The lubridate functions as_date() and as_datetime() facilitate such conversions:

  • as_date() converts date-times (i.e., calendar times) into dates.
  • as_datetime() converts dates (of class “Date”) into date-times (if possible).

Converting date-times into dates is straightforward, as it merely drops the time-related information:

Converting dates into date-times is trickier, as dates lack information about times:

We see that tnow_2 and feb_29 are calendar times (i.e., dates with times of class “POSIXct”), but still seem to lack time information. Nevertheless, the time information is there: The default time of the date was set to “00:00:00 UTC”. This becomes apparent when explicating the time object by supplying a more detailed format argument:
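Both directions of conversion can be sketched with made-up objects (the names dt_1, d_1, and dt_2 are hypothetical and not the objects used above):

```r
library(lubridate)

# Date-time to date: drops the time information
dt_1 <- ymd_hms("2020-02-29 13:45:30", tz = "Europe/Berlin")
as_date(dt_1)  # "2020-02-29"

# Date to date-time: adds a default time in UTC
d_1  <- ymd("2020-02-29")
dt_2 <- as_datetime(d_1)
format(dt_2, "%Y-%m-%d %H:%M:%S %Z")  # "2020-02-29 00:00:00 UTC"
```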

Again, we see: Always watch out for time zones when working with times.

When the as_date() and as_datetime() functions receive numeric inputs, they add a corresponding number of increments to the Unix epoch at “1970-01-01 00:00:00 UTC” (see Wikipedia: Unix_time for details). Note that date increments are interpreted as days, whereas time increments are interpreted as seconds:
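A brief sketch with arbitrary numbers shows the different units:

```r
library(lubridate)

as_date(1)          # "1970-01-02" (1 day after the epoch)
as_datetime(1)      # "1970-01-01 00:00:01 UTC" (1 second after the epoch)

as_date(18262)      # "2020-01-01"
as_datetime(86400)  # "1970-01-02 UTC" (86400 seconds are 1 day)
```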

10.3.2 Get and set date and time components

Having succeeded in creating date-time objects (from strings, other variables, or a date), we can ask additional questions:

  • How can we get or set individual date and time components (of date-time objects)?

To illustrate this, we can use our tnow scalar, which is an object of the “POSIXct” class:

Actually, we have already encountered a pretty nifty way of retrieving individual date and time components (see the conversion functions of the POSIX standard in Section 10.2.4 above, or evaluate ?strptime):

However, using the format() function with a format argument according to the POSIX standard is pretty geeky. If lubridate lives up to its name, its functions should flow a bit more fluently.
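As a sketch of the lubridate accessor functions, the following code uses a fixed date-time (the object t_x is a hypothetical stand-in for tnow, so that the results are reproducible):

```r
library(lubridate)

t_x <- ymd_hms("2020-07-15 13:45:30")

year(t_x)    # 2020
month(t_x)   # 7
mday(t_x)    # 15 (day of the month)
yday(t_x)    # 197 (day of the year)
hour(t_x)    # 13
minute(t_x)  # 45
second(t_x)  # 30

wday(t_x, label = TRUE)  # weekday as a labeled factor (locale-dependent)
```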

Setting date and time components

Interestingly, the same lubridate functions that get date and time components can also be used to set those components:

What if we re-set a date component that depends on the date?

Thus, we can use the same functions that get information from dates and date-times to set its elements. However, beware that setting date-time components can have unintended consequences. As date-time components are not independent of each other, setting some components typically affects other components.
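A minimal sketch of such side effects, using a made-up date-time t_a: Changing the year leaves the day of the month intact, but shifts the weekday.

```r
library(lubridate)

t_a <- ymd_hms("2020-07-15 13:45:30")
wd_before <- wday(t_a)  # weekday in 2020

hour(t_a) <- 8          # set the hour
year(t_a) <- 2021       # set the year

t_a                     # "2021-07-15 08:45:30 UTC"
wd_after <- wday(t_a)   # weekday in 2021 (differs from wd_before)
```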

10.3.3 Working with time spans

In Section 10.2.1, we distinguished between time points (often called instants or moments) and time spans (aka. durations, intervals, or time periods). However, different usages of these time span terms actually imply different concepts. To enable accurate computations, we need to distinguish more carefully between the different types of time spans.

In fact, lubridate implements its own ontology of time spans.
Beyond time points (i.e., a particular instant or moment in time), the package distinguishes between 3 types of time spans:

  1. durations are time spans in exact numbers of seconds
  2. periods are time spans in human units (e.g., days, months, years)
  3. intervals are time spans with a given start and end point in time

These different time span concepts were inspired by the Joda Time project (Colebourne & O’Neill, 2010) (see the original article by Grolemund & Wickham (2011) for background information).

Essentially, both durations and intervals express physical time spans (a specific number of seconds), whereas periods express time spans in human units that may vary based on context (e.g., not every day has the same number of hours and not every month or year has the same number of days). Intervals are durations that are anchored in calendar time (i.e., intervals have start and end points that are real date-times).

We will consider each type of time span to see how they are created and find out what we can do with them. To motivate our explorations, consider the following example:

On Tuesday, September 11, 2001, the terrorist group al-Qaeda attacked several targets in the United States in a coordinated fashion. At 08:46 a.m., five hijackers crashed an American Airlines plane into the northern facade of the World Trade Center in New York City. Many remember the vivid images of this particular event (a so-called dread risk event), even though it happened many years ago. This raises the question:

  • How long ago did the 911-attacks take place?

Please take a moment (or rather: some time span) to think about potential answers to this question: What would you accept as an informative answer? How does this answer depend on when or where the question is being asked? What kind of accuracy would you expect? And which temporal unit(s) would an answer be expressed in?

1. Durations

As a first approach for answering the question “How long ago did the 911-attacks take place?”, we can enter the particular time point of this event and subtract it from now() to compute a time difference object in R:

The time difference td represents a duration as an R object of class “difftime”, which is automatically displayed as a count of “days”. The corresponding difftime() function (see Section 10.2.4) offers a range of units varying from “secs” to “weeks”, which are all rather limited in this case:
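A sketch of this computation: The object name t_911 and the time zone "America/New_York" are assumptions made for this illustration, and the numeric results depend on when the code is run.

```r
library(lubridate)

# Time point of the event (in local New York time):
t_911 <- ymd_hm("2001-09-11 08:46", tz = "America/New_York")

td <- now() - t_911  # time difference
class(td)            # "difftime"

# The same difference in the largest available unit:
difftime(now(), t_911, units = "weeks")
```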

For time spans exceeding a few months, the duration class provided by lubridate is a better alternative: The lubridate notion of duration measures time spans as the number of elapsed seconds.

Durations are internally defined as a special class of object and record time spans in numeric form (as numbers of seconds):

There are several constructor functions (all starting with d) that facilitate defining durations:

Note that all these definitions internally create “Duration” objects that denote numbers of elapsed seconds, but are printed in a more human-readable fashion. The dmonths() function is flagged (with ?) as its underlying notion is a bit tricky. We just learned that durations are defined as time spans measuring an exact number of seconds. But how many seconds are there in a month? The answer clearly depends on the month in question (e.g., July is longer than June, and both are longer than February) and can only be determined when the particular month is known.47 Thus, dmonths(1) (evaluating to 2629800 seconds, i.e., one twelfth of an average year of 365.25 days) can only be an estimate and should be handled with care in practical applications.
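The constructor functions for durations can be sketched as follows. The value of dmonths(1) is the one stated above and may differ in older versions of lubridate:

```r
library(lubridate)

dseconds(30)  # 30 seconds
dminutes(1)   # 60 seconds
dhours(1)     # 3600 seconds
ddays(1)      # 86400 seconds
dweeks(1)     # 604800 seconds
dmonths(1)    # an estimate (see the text)

# Durations are stored as numbers of seconds:
as.numeric(ddays(1))    # 86400
as.numeric(dmonths(1))  # 2629800
```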

As they are numbers, durations can be used in arithmetic expressions:

However, we need to keep in mind that they represent abstract time spans (in numbers of seconds). Thus, adding durations to date-time objects (i.e., calendar times of the “POSIXct” class) can yield unexpected results:

We see that adding a duration of 10 hours or 1 day to t1 seemingly created a difference of 11 or 25 hours in calendar time (but note the switch of time zone, due to daylight saving time, DST). Similarly, adding a year’s worth of seconds to t2 moved the date back by a day (due to 2020 being a leap year). Both results are correct, of course, if we really meant to add time spans as a specific number of seconds (i.e., durations).
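The effect of daylight saving time can be sketched with a made-up time point t_sat (not the t1 used above) on the day before DST began in Germany in 2020:

```r
library(lubridate)

# Noon on the Saturday before the switch to DST (on 2020-03-29):
t_sat <- ymd_hms("2020-03-28 12:00:00", tz = "Europe/Berlin")

# Adding exactly 24 * 60 * 60 seconds:
t_sun <- t_sat + ddays(1)
t_sun  # 13:00 on 2020-03-29 (CEST), i.e., 25 hours later on the clock
```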

But as we often mean something else when thinking “ten hours later”, “tomorrow”, or “next year”, durations are rather limited when calculating time spans in human units. But that’s ok — for that’s exactly what periods are for.

2. Periods

When asking our original question:

  • How long ago did the 911-attacks take place?

receiving the number 6873.928 as its answer would be precise, but probably not satisfy us. This is because we typically do not think about longer periods of time in terms of an exact number of seconds. Instead, we tend to provide counts of various units of time so that their sum fills out the period of time we are dealing with.

In lubridate, periods are time spans that are expressed in human common-sense units of time (e.g., hours, days, months, years). Importantly, a period varies in its length (when expressed as durations, i.e., number of seconds, except periods defined in seconds) based on its context. For instance, the leap year 2020 is 366 days long (as it contains a February 29, 2020), whereas the year 2021 is only 365 days long. Flexible periods turn into fixed time spans (of various lengths) when added to a specific time point (date-time or calendar time).

As a consequence of their nature, periods are suited to set and track the change in the “clock time” between two events (date-times).

Periods are expressed and measured in common time units (ranging from seconds to years) and provide each unit as integer values (though seconds can be non-integers). Periods are created by simple constructor functions (that are all plural versions of the desired time unit):

As objects of class “Period” are numbers, they can be used in arithmetic expressions:
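A short sketch of the constructor functions for periods and of some simple arithmetic with them (the values are arbitrary examples):

```r
library(lubridate)

seconds(30)
minutes(10)
hours(5)
days(3)
weeks(2)    # stored as 14 days
months(6)
years(1)

# Arithmetic with periods keeps each unit separate:
p_sum <- years(1) + months(6) + days(3)  # 1 year, 6 months, 3 days
p_dbl <- 2 * hours(5)                    # 10 hours
```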

When computing with periods, each unit is applied separately. The distribution of periods among units is non-trivial (e.g., some days, months, or years are longer than others), but this complexity is hidden from us. In fact, as we tend to represent dates and times in terms of periods (at least as long as we think of calendar time), using periods in calculations typically yields more intuitive results than adding durations (see above):
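The contrast between periods and durations can be sketched with two made-up time points: one in a leap year and one just before a switch to daylight saving time.

```r
library(lubridate)

# (a) A leap year: 2020 has 366 days
t_0 <- ymd_hms("2020-01-20 00:00:00")
t_0 + years(1)    # "2021-01-20 UTC" (same date, one year later)
t_0 + ddays(365)  # "2021-01-19 UTC" (365 days of seconds fall one day short)

# (b) A switch to DST (on 2020-03-29 in Germany):
t_sat <- ymd_hms("2020-03-28 12:00:00", tz = "Europe/Berlin")
t_sat + days(1)   # 12:00 on the next day
t_sat + ddays(1)  # 13:00 on the next day
```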

Thus, when reckoning with times and dates in various human-based units (like days, weeks, or months), periods are most likely the type of time span that we want to use.

3. Intervals

In lubridate, intervals are time spans that are bound by two time points that are real date-times (or calendar times). Thus, intervals are durations anchored in date-times (or calendar times) and provide a bridge between durations (i.e., number of seconds) and periods (i.e., common-sense time units) when at least one point in calendar time is known.

A first way of defining an interval requires a time span x (which can be a time difference, duration, or period) and a start date (typically a date-time or “POSIXct” object):

An alternative way of defining an interval uses its start and end points (as date-time objects) and places a special operator %--% between them (using “infix” notation):
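Both ways of defining an interval can be sketched as follows. The start and end points are made-up examples, and the time zone "America/New_York" is an assumption:

```r
library(lubridate)

t_start <- ymd_hms("2001-09-11 08:46:00", tz = "America/New_York")
t_end   <- ymd_hms("2020-09-11 08:46:00", tz = "America/New_York")

# (a) From a time span and a start point:
i_1 <- as.interval(years(19), start = t_start)

# (b) From start and end points:
i_2 <- t_start %--% t_end
i_3 <- interval(t_start, t_end)  # same as (b), as a function call

int_start(i_2)   # start point
int_end(i_2)     # end point
int_length(i_2)  # length in seconds
```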

Internally, lubridate represents intervals as objects of class “Interval”, which is numeric in nature:

Since an interval is anchored firmly in calendar time, both the exact number of seconds that passed (i.e., a duration) and the number of variable length time units that occurred during the interval (i.e., a period) can be calculated from a given interval. For accurately converting intervals into durations or periods, we can use the as.duration() and as.period() functions:
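As a sketch, we can convert an interval that spans the leap year 2020 (the object name i_2020 is hypothetical):

```r
library(lubridate)

i_2020 <- ymd("2020-01-01") %--% ymd("2021-01-01")

d_2020 <- as.duration(i_2020)  # exact number of seconds (366 days)
p_2020 <- as.period(i_2020)    # human units (1 year)

as.numeric(d_2020)  # 31622400
```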

Multiple transformations between durations, periods, and intervals yield the expected results (except for rounding differences):

but returning from durations or periods to an interval requires specifying a start date (as an anchor):

Intervals can be thought of as lines with given start and end points on a linear axis of time. Thus, we can ask and answer a range of interesting questions when dealing with one or more intervals:

The infix operator x %within% y allows checking whether an interval or date-time x lies within an interval or list of intervals y:

The int_diff() function is similar to the base R function diff(), but returns the intervals that occur between the elements of a vector of date-times:
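Both the %within% operator and the int_diff() function can be sketched with some made-up dates:

```r
library(lubridate)

# Is a date within an interval?
i_2020 <- interval(ymd("2020-01-01"), ymd("2020-12-31"))
ymd("2020-06-15") %within% i_2020  # TRUE
ymd("2021-06-15") %within% i_2020  # FALSE

# Intervals between the elements of a vector of dates:
dates <- ymd(c("2020-01-01", "2020-02-01", "2020-03-01"))
i_btw <- int_diff(dates)  # 2 intervals (for January and February)
int_length(i_btw) / (60 * 60 * 24)  # 31 29 (in days)
```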

Dividing time spans

When asking “How long…” or “How old…” questions, we often are looking for answers that express a time span in terms of another one.

For instance, we can determine how many durations or periods fall into a given interval of time by dividing intervals by other time spans. This is straightforward for durations:

— yet may yield unexpected results — and also works for periods:

However, we cannot divide time differences or durations by periods, or periods by durations:

In practical contexts, we often do not care about exact durations, but are primarily interested in the number of completed time periods. These can be computed by dividing time intervals by periods (by using integer division):
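A sketch of these divisions, using an interval that spans the leap year 2020 (the object name i_2020 is hypothetical):

```r
library(lubridate)

i_2020 <- ymd("2020-01-01") %--% ymd("2021-01-01")

# Dividing an interval by a duration or by a period:
i_2020 / ddays(1)  # 366
i_2020 / days(1)   # 366

# Integer division yields the number of completed periods:
i_2020 %/% months(1)  # 12
i_2020 %/% years(1)   # 1
```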

Having learned about three different types of time spans and their combinations, we finally are in a position to answer our original question:

  • How long ago did the 911-attacks take place?

In most applied contexts, the following estimates — based on a duration, a period, or an interval — would count as informative answers:

We see once more, that — in R, as in life — many different ways can yield satisfactory results. Which way is best depends on many additional details, but it’s good to know what our options are.

Choosing the right time span

Given three different time spans, which one should we use? As always, this depends on the task that we want to do.

Chapter 16: Dates and times of r4ds recommends to always use the simplest type that solves our problem. When our primary concern is for amounts of time elapsed in terms of seconds, we use durations. When time spans are to be measured in common-sense units, periods typically provide the best solutions. And if we need to measure time spans that are bounded by calendar times, we use intervals, or combine several time spans.

Most everyday questions about time spans can be solved by either computing durations, periods, or intervals, or by dividing time intervals by durations or periods. Keep in mind that not all combinations of the different time span concepts and arithmetic operations make sense. (Figure 16.1 of r4ds provides an overview of the arithmetic operations that are allowed between pairs of date/time classes.)

10.3.4 Other reasons to lubridate

This section collects some additional examples of computing with dates and times with lubridate commands.

Checking date and time objects

Given a multiplicity of object types — a “Date” class and two different date-time classes (i.e., the “POSIXct” and “POSIXlt” classes) — it is easy to get confused about which type of date or time we are dealing with. Fortunately, the lubridate package provides convenient test functions that verify the class of a date or time object:

Whenever dealing with multiple date-time classes, these functions are very helpful.
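Some of these test functions can be sketched as follows (with made-up objects d_x and t_x):

```r
library(lubridate)

d_x <- ymd("2020-07-15")
t_x <- ymd_hms("2020-07-15 13:45:30")

is.Date(d_x)                 # TRUE
is.Date(t_x)                 # FALSE
is.POSIXct(t_x)              # TRUE
is.POSIXlt(as.POSIXlt(t_x))  # TRUE

# More general tests for time points vs. time spans:
is.instant(d_x)       # TRUE
is.timespan(days(1))  # TRUE
```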

Rounding dates and times

When computing with dates, times, and various time spans, our resolution of interest is rarely a specific number of seconds. As we have seen in Section 10.3.3, this issue can often be addressed by performing computations in terms of periods or intervals or by dividing intervals by durations or periods.

For date-times (i.e., objects of the “POSIXct” class), rounding often is an issue as well. To address this concern, lubridate provides a range of convenient rounding functions that allow setting the direction and the unit used for rounding:
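The three rounding functions can be sketched with a made-up date-time t_x:

```r
library(lubridate)

t_x <- ymd_hms("2020-07-15 13:45:30")

round_date(t_x, unit = "hour")     # "2020-07-15 14:00:00 UTC" (nearest hour)
floor_date(t_x, unit = "hour")     # "2020-07-15 13:00:00 UTC" (rounding down)
ceiling_date(t_x, unit = "month")  # "2020-08-01 UTC" (rounding up)
floor_date(t_x, unit = "day")      # "2020-07-15 UTC"
```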

Time zone conversions

In Section 10.2.4, we noted the importance of time zones and mentioned that many base R functions include a tz argument for setting them (see Sys.timezone() for your current system setting and OlsonNames() for available options).

When not explicitly specifying any time zone information, any date-times created in R either use our local system setting (here: “Europe/Berlin”, which may or may not include daylight saving time, DST) or default to “UTC” (Coordinated Universal Time). For instance, when scheduling dates for the next four quarters (starting now() in a time zone with DST), the summer dates will automatically include DST information:

The lubridate functions for parsing date-times also have a tz argument. For instance, here are three specific date-time definitions (with different time zones):

When computing their differences (as difftime objects), we realize that t1, t2, and t3, actually denote the same point (instant or moment) in time:
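As a self-contained sketch, the following definitions use three time zones that are assumptions made for this illustration (the original definitions of t1, t2, and t3 may have used other zones):

```r
library(lubridate)

t1 <- ymd_hms("2020-12-24 13:00:00", tz = "Europe/Berlin")     # UTC + 1 hour
t2 <- ymd_hms("2020-12-24 07:00:00", tz = "America/New_York")  # UTC - 5 hours
t3 <- ymd_hms("2020-12-25 01:00:00", tz = "Pacific/Auckland")  # UTC + 13 hours

# All three denote the same instant:
t1 - t2  # time difference of 0
t2 - t3  # time difference of 0
```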

However, when manipulating times (e.g., by creating a new vector), information regarding time zones is often lost (or unified, based on the initial time zone):

The vector t4 also shows that t1, t2, and t3 all denote the same moment in time, a fact that was only obscured by displaying this time for different locations. However, the vector only shows this time for one particular time zone (specifically, the time zone of its first element tz(t1)). Thus, we can still wonder: How should we best express this particular time?

The need to (re-)introduce time zone information to time objects creates two distinct tasks, with corresponding solutions:

  1. Change time zone information by keeping the actual time points the same, but changing their representation (i.e., display fixed time points for a different time zone).

  2. Change time zone information by keeping the representation the same, but changing the actual time points (i.e., display different time points that have the same nominal appearance for a different time zone).

The with_tz() function addresses the first task: It changes time zone information (and thus changes the nominal time display) without changing the underlying point in time that is being represented:

This shows that the three identical times (which were merely expressed differently by t1, t2, and t3) all denote noon on 2020-Dec-24 when expressed in terms of UTC (Coordinated Universal Time).

By contrast, the force_tz() function addresses the second task: It preserves the appearance of its input times (i.e., the nominal time displayed), but changes the actual time points that are being represented:

The difference between both tasks and functions is subtle, but important: with_tz() only changes the appearance of time points, but keeps the time points intact. By contrast, force_tz() preserves the appearance of time, but changes the time actually represented. When converting times into different time zones, we typically only want to change the appearance of time (aka. the “sense” of time, i.e., the particular way in which fixed time points are being displayed to us), rather than the actually denoted point in time (aka. the “referent” or “meaning” of the time displayed). Thus, we typically want to use with_tz(), rather than force_tz(), when converting some given times into a different time zone.
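The difference between both functions can be sketched with a single made-up time point (the time zone "Europe/Berlin" is an assumption):

```r
library(lubridate)

t1 <- ymd_hms("2020-12-24 13:00:00", tz = "Europe/Berlin")

# Same time point, different display:
t_with <- with_tz(t1, tzone = "UTC")    # "2020-12-24 12:00:00 UTC"
t_with == t1                            # TRUE

# Same display, different time point:
t_force <- force_tz(t1, tzone = "UTC")  # "2020-12-24 13:00:00 UTC"
t_force - t1                            # time difference of 1 hour
```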

Checking for leap years

A good question to ask is: Is some specific year y a leap year?

Many people can answer this question for the current year (e.g., “Yes, the year 2020 had a February, 29.”). But what about the year 2066? What about the year of Titanic_sinks (i.e., 1912)? And what about the year MCMLXXXIV?

The hard core solution to this problem consists in studying the definition of a leap year and then implementing it into a command or function. The corresponding definition (from Wikipedia: Leap year) reads:

…in the Gregorian calendar, each leap year has 366 days instead of 365,
by extending February to 29 days rather than the common 28.
These extra days occur in each year which is an integer multiple of 4
(except for years evenly divisible by 100, which are not leap years
unless evenly divisible by 400).

In R, we could implement this definition as follows:
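One possible sketch in base R translates the definition into a logical expression (the function name is_leap is hypothetical):

```r
# A year is a leap year if it is divisible by 4,
# unless it is divisible by 100 but not by 400:
is_leap <- function(y) {
  (y %% 4 == 0 & y %% 100 != 0) | (y %% 400 == 0)
}

is_leap(c(2020, 2066, 1912, 1900, 2000))
# TRUE FALSE TRUE FALSE TRUE
```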

However, since we have learned about time points and time spans, we can solve such tasks by using heuristics. For instance, we could define the interval from January 1st of year y to January 1st of year y+1 and determine the number of days (as durations or periods) that fit into this interval:

If the solution is 366, the year y is a leap year; if it is 365, it is not a leap year.
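This heuristic can be sketched as a small helper function (the name days_in_year is hypothetical):

```r
library(lubridate)

# Number of days in the interval from Jan. 1 of y to Jan. 1 of y + 1:
days_in_year <- function(y) {
  i_y <- make_date(y) %--% make_date(y + 1)
  i_y / days(1)
}

days_in_year(2020)  # 366
days_in_year(2021)  # 365
```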

Similar solutions can be achieved by rounding dates, measuring and comparing their duration in other time units, or by trying to define the date of February, 29, of year y and checking whether this succeeds:

All these solutions should yield the same result, as long as we can rely on R’s internal date-time definitions, any functions used in our derivation, and our ability to correctly use the corresponding commands and understand their results.

A much simpler solution is finding a function that solves the task. The lubridate package actually provides a leap_year() function that gets the job done:
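A brief sketch of leap_year(), applied to the years mentioned above:

```r
library(lubridate)

leap_year(2020)           # TRUE
leap_year(c(2066, 1912))  # FALSE TRUE

# The function also accepts dates:
leap_year(ymd("2021-07-15"))  # FALSE
```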

The benefits of using an existing R function are two-fold:

  1. it saves us effort and time, and

  2. it can be used flexibly with other features of our programming language:

Thus, functions are a pretty big deal, which is why we will learn more about them in the next chapter (Chapter 11 on writing functions).

The price of using existing functions is that we need to trust that their author(s) knew what they were doing. In the case of the lubridate package (Spinu et al., 2020), it’s very likely that the authors can be trusted, as the package is well-established and widely used (though it has also changed quite a bit over the years). Incidentally, the definition of the leap_year function contains a line:

which looks very much like our leap year definition and initial base R solution from above.

Other functions

The lubridate package defines many other nifty functions:

  • am(dt)/pm(dt): Does a date-time object dt occur am or pm?
  • days_in_month(dt): Get the number of days in the month of dt
  • dst(dt): Get daylight saving time indicator of dt
  • format_ISO8601(dt): Format in ISO8601 character format
  • rollback(dt): Roll back date to last day of previous month
  • date_decimal(n): Converts a decimal number n to the corresponding date

Here are some examples of their results:
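The functions listed above can be sketched with a made-up date-time dt:

```r
library(lubridate)

dt <- ymd_hms("2020-02-15 09:30:00", tz = "UTC")

am(dt)              # TRUE
pm(dt)              # FALSE
days_in_month(dt)   # 29 (February of a leap year)
dst(dt)             # FALSE (UTC has no daylight saving time)
format_ISO8601(dt)  # a character string in ISO8601 format
rollback(dt)        # last day of the previous month (2020-01-31)
date_decimal(2020)  # "2020-01-01 UTC"
```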

We conclude this section with some practice tasks that recapitulate the date and time functionality of the lubridate package.

Practice

Solve the following tasks by using lubridate functions:

  1. Local conventions and names of weekdays:
  • Predict, evaluate, and explain the results of the following commands:
  2. Full circle with date-time-dates:

We learned that the as_date() and as_datetime() functions allow us to convert between times and dates.

  • Predict, evaluate, and explain the results of the following commands:

Answer: As time_2 is created from date_1 (i.e., a “Date” object), it lacks the time information of time_1.

  • How can we repair time_2 to match time_1?

Solution

  3. Durations vs. periods:
  • Predict, evaluate, and explain the results of the following commands:
  • Predict, evaluate, and explain the results of the following two commands:

Answer: The command d <- ymd("2020-01-20") assigns d to a particular date (i.e., an instant in time). To this, we add a time span (of 1 year) in two different ways: + years(1) adds the period of 1 year (in human units), yielding the same date a year later. By contrast, + dyears(1) adds the duration of 1 year (as an exact number of seconds). As 2020 is a leap year (i.e., containing a date of “2020-02-29” and a total number of 366 days) both additions yield different results. Thus, when dealing with common-sense units of time, adding periods typically yields intuitively more plausible results.

Note also the default time zone settings to UTC (Coordinated Universal Time).

  • Explain the different results of the following two commands:

Answer: sat_noon is assigned to a particular date-time point (i.e., instant or moment) in time: Sat, 2020-03-28 12:00:00 CET (+0100 from UTC). The tz specification ensures that the time zone is set to CET (i.e., corresponds to a specific location). A difference between adding a duration of ddays(1) and adding a period of days(1) implies that a time shift has occurred. In this case, Germany introduced daylight saving time (DST) on “2020-03-29”: At 2am, the clocks are set forwards by 1 hour. Thus, adding the duration of 1 day (as in + ddays(1)) yields a later time than adding the period of 1 day (as in + days(1)). Again, adding periods yields more predictable results.

  4. Durations, periods, and intervals:
  • Predict, evaluate, and explain the different results of the following two commands:
  • Evaluate and explain the result of the following expression in terms of their notions of time and in common-sense terms:

Answer: We can re-construct the answer in 4 steps:

  • The day_before_yesterday was defined as an interval (see above).
  • Shifting this interval by a period of 2 days yields the interval of today (from 00:00:00 to 24:00:00).
  • Adding a period of 12 hours to the start of today marks a specific date-time point: noon today.
  • Subtracting a duration of 30 seconds yields a date-time point precisely 30 seconds before noon today.

Note that this example involves four different notions of time: Date-time points (i.e., instants, moments, or “POSIXct” objects) and three different types of time spans.

  • Predict, evaluate, and explain the results of the following expressions:

Hint: These examples are inspired by Section 16.4.3 Intervals (Wickham & Grolemund, 2017), which also provides a short explanation. However, note that some definitions seem to have changed.

  5. Leap years in Roman numerals:

In Section 10.3.4 above, we left the leap year question regarding MCMLXXXIV unresolved:

  • Was the year MCMLXXXIV (represented in Roman numerals) a leap year?

Answer: A bit of experimentation with the R utils function as.roman() reveals that the character sequence “MCMLXXXIV” represents the calendar year 1984 in Roman numerals. As it turns out, this happens to be a leap year:
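A minimal sketch of this solution (the object name y is hypothetical):

```r
library(lubridate)

# Convert the Roman numeral into an integer:
y <- as.integer(as.roman("MCMLXXXIV"))
y             # 1984

leap_year(y)  # TRUE
```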

References

Colebourne, S., & O’Neill, B. (2010). Joda-time: Java date and time API. Release, 1(2), 4–1. Retrieved from https://www.joda.org/joda-time/

Grolemund, G., & Wickham, H. (2011). Dates and times made easy with lubridate. Journal of Statistical Software, Articles, 40(3), 1–25. https://doi.org/10.18637/jss.v040.i03

Spinu, V., Grolemund, G., & Wickham, H. (2020). lubridate: Make dealing with dates a little easier. Retrieved from https://CRAN.R-project.org/package=lubridate

Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. Retrieved from http://r4ds.had.co.nz


  46. The advantage of this approach is that we start with a set of date-times dt that we later want to re-create from its components (i.e., by using the make_date() and make_datetime() functions).

  47. In Section 16.4.5 Exercises of r4ds, the first question asks: “Why is there months() but no dmonths()?” Thus, I can only guess that the function dmonths() was absent from earlier versions of lubridate, but then was added later. And since the value of dyears(1) suffers from the same problem (as leap years are a day longer than non-leap years), it seems ok to provide an average for estimation purposes.