10.4 Exercises

ds4psy: Exercises (10: Time data)

Here are some exercises on parsing and manipulating dates and times:

10.4.1 Exercise 1

  1. Use the appropriate lubridate function to parse each of the following dates:

Solution

  2. Use the appropriate lubridate function to parse each of the following date-times:

Hint: Note that t4 contains the time component before the date component. To handle this vector, consider creating a tibble, using tidyverse commands (e.g., separate() from tidyr and mutate() from dplyr) to separate its time and date components, and then pasting them in reversed order (date before time).

Solution

Table 10.1: A tibble with t4 separated and mutated into dt.

| t4 | t | d | ds | dt |
|---|---|---|---|---|
| 8:05 01/01/2020 | 8:05 | 01/01/2020 | 01/01/2020 8:05 | 2020-01-01 08:05:00 |
| 9:20 29/02/2020 | 9:20 | 29/02/2020 | 29/02/2020 9:20 | 2020-02-29 09:20:00 |
| 12:30 24/12/2020 | 12:30 | 24/12/2020 | 24/12/2020 12:30 | 2020-12-24 12:30:00 |
| 23:58 30/12/2020 | 23:58 | 30/12/2020 | 30/12/2020 23:58 | 2020-12-30 23:58:00 |
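
The steps described in the hint can be sketched as follows. This is one possible approach rather than the book's own solution: it assumes the four t4 values shown in Table 10.1, uses separate() from tidyr to split them, and the helper name tb is chosen here for illustration only.

```r
library(tibble)
library(dplyr)
library(tidyr)
library(lubridate)

# Values of t4 as shown in Table 10.1 (time before date)
t4 <- c("8:05 01/01/2020", "9:20 29/02/2020",
        "12:30 24/12/2020", "23:58 30/12/2020")

tb <- tibble(t4 = t4) %>%
  # split each string at the space into a time part t and a date part d
  separate(t4, into = c("t", "d"), sep = " ", remove = FALSE) %>%
  # paste in reversed order (date before time), then parse as day-month-year hour:minute
  mutate(ds = paste(d, t),
         dt = dmy_hm(ds))

tb
```

Because dmy_hm() is given no tz argument, the resulting date-times are in UTC.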
  3. Determine the weekdays of the 7 dates in d4 and t4.

Hint: Combine the 7 dates in a vector, before applying either the base R function weekdays() or the lubridate function wday() to it.
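
A minimal sketch of both options, using only the four dates of t4 from Table 10.1 (the three dates of d4 are not reproduced here and would be appended to the vector with c()). The names dates and wd_num are illustrative.

```r
library(lubridate)

# The 4 dates of t4 (Table 10.1), parsed in day-month-year order
dates <- dmy(c("01/01/2020", "29/02/2020", "24/12/2020", "30/12/2020"))

# Base R: weekday names in the current locale
weekdays(dates)

# lubridate: ordered factor of weekday names, with the week starting on Monday
wday(dates, label = TRUE, abbr = FALSE, week_start = 1)

# lubridate: numeric weekdays (1 = Monday, ..., 7 = Sunday)
wd_num <- wday(dates, week_start = 1)
wd_num
```

The numeric variant is independent of the locale, which can make results easier to compare across systems.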

10.4.2 Exercise 2

The data file dt_10.csv (available at rpository.com) contains the birth dates and times of 10 fictitious people. Read the data into a tibble dt_10:

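Table 10.2: Data in file dt_10.csv.

| name | day | month | year | hour | min | sec |
|---|---|---|---|---|---|---|
| Anna | 8 | 8 | 1994 | 11 | 47 | 57 |
| Beowulf | 1 | 6 | 1994 | 5 | 35 | 43 |
| Cassandra | 14 | 11 | 2000 | 5 | 58 | 6 |
| David | 17 | 1 | 1991 | 13 | 3 | 12 |
| Eva | 21 | 1 | 2001 | 21 | 33 | 55 |
| Frederic | 19 | 7 | 2000 | 13 | 47 | 12 |
| Gwendoline | 20 | 9 | 1996 | 8 | 28 | 37 |
| Hamlet | 5 | 5 | 1996 | 17 | 7 | 8 |
| Ian | 18 | 8 | 1996 | 8 | 27 | 17 |
| Joy | 18 | 12 | 1990 | 14 | 44 | 35 |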
  1. Use the appropriate lubridate functions to parse the date of birth dob and time of birth tob as new columns of dt_10.

Solution
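
One way to approach this, sketched under assumptions: the first three rows of Table 10.2 are entered by hand instead of reading dt_10.csv, and tob is stored as a full date-time (date plus time of birth), which is only one possible reading of the task.

```r
library(tibble)
library(dplyr)
library(lubridate)

# First 3 rows of Table 10.2, entered by hand (the full data would come from dt_10.csv)
dt_10 <- tribble(
  ~name,       ~day, ~month, ~year, ~hour, ~min, ~sec,
  "Anna",         8,      8,  1994,    11,   47,   57,
  "Beowulf",      1,      6,  1994,     5,   35,   43,
  "Cassandra",   14,     11,  2000,     5,   58,    6
)

dt_10 <- dt_10 %>%
  mutate(dob = make_date(year = year, month = month, day = day),    # Date
         tob = make_datetime(year, month, day, hour, min, sec))     # POSIXct (UTC by default)

dt_10
```

make_date() and make_datetime() build dates from numeric components, so no intermediate character strings are needed.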

  2. Use the appropriate lubridate functions to add 2 columns that specify – given each person’s DOB – the weekday dob_wd (from Monday to Sunday) of their birthday and their current age age_fy in full years (i.e., the numeric value of their age, as an integer).

Hint: Their current age can be computed by subtracting their DOB from today’s date today(). One way of computing their age in full years is by dividing the interval() of their current age by a duration() in the unit of “years”.
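
The hint can be illustrated with a small sketch. It assumes two dates of birth taken from Table 10.2 (Anna and Beowulf) and shows two variants: integer division of the interval by a period of one year, and division by a duration of one year followed by rounding down. The name age_dur is illustrative.

```r
library(lubridate)

# Two dates of birth from Table 10.2 (Anna and Beowulf)
dob <- make_date(year = c(1994, 1994), month = c(8, 6), day = c(8, 1))

# Weekday of birth as an ordered factor running from Monday to Sunday
dob_wd <- wday(dob, label = TRUE, abbr = FALSE, week_start = 1)

# Age in full years: interval from DOB to today, integer-divided by a 1-year period
age_fy <- interval(dob, today()) %/% years(1)

# Variant following the hint: divide by a duration of 1 year and round down
age_dur <- floor(interval(dob, today()) / dyears(1))
```

A duration of one year has a fixed length (365.25 days), so age_dur may differ from age_fy by one year on days very close to a birthday; the period-based variant follows the calendar.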

10.4.3 Exercise 3

  1. Pick at least 4 famous people (some of whom are still alive, some of whom have already died) and enter their name, area of occupation, date of birth (DOB), and date of death (DOD, if deceased) in a tibble fame, in analogy to the following:
Table 10.3: Basic info on some famous people.

| name | area | DOB | DOD |
|---|---|---|---|
| Napoleon Bonaparte | politics | August 15, 1769 | May 5, 1821 |
| Jimi Hendrix | music | November 27, 1942 | September 18, 1970 |
| Michael Jackson | music | August 29, 1958 | June 25, 2009 |
| Frida Kahlo | arts | July 6, 1907 | July 13, 1954 |
| Angela Merkel | politics | July 17, 1954 | NA |
| Kobe Bryant | sports | August 23, 1978 | January 26, 2020 |
| Lionel Messi | sports | June 24, 1987 | NA |
| Zinedine Zidane | sports | June 23, 1972 | NA |
  2. Use the appropriate lubridate functions to replace the DOB and DOD variables by corresponding dob and dod variables of type date.

Solution
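
A possible sketch, assuming three rows of Table 10.3 entered by hand and an English LC_TIME locale (parsing month names such as "August" depends on the locale):

```r
library(tibble)
library(dplyr)
library(lubridate)

# Three rows of Table 10.3, entered by hand (NA marks a person who is still alive)
fame <- tribble(
  ~name,                ~area,      ~DOB,              ~DOD,
  "Napoleon Bonaparte", "politics", "August 15, 1769", "May 5, 1821",
  "Frida Kahlo",        "arts",     "July 6, 1907",    "July 13, 1954",
  "Angela Merkel",      "politics", "July 17, 1954",   NA_character_
)

fame <- fame %>%
  mutate(dob = mdy(DOB),    # month-day-year order, e.g., "August 15, 1769"
         dod = mdy(DOD)) %>% # missing dates of death remain NA
  select(-DOB, -DOD)        # drop the original character columns

fame
```

Using mutate() followed by select() keeps the parsing step and the removal of the old columns separate, which makes it easy to inspect both versions side by side before dropping anything.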

  3. Add two variables to fame that specify the weekday (from Monday to Sunday) of their birth (dob_wd) and – if applicable – of their death (dod_wd).

10.4.4 Exercise 4

Examine the data file dt.csv (available at rpository.com). This file contains the birth dates and study participation times of 1000 fictitious people. Read the data into a tibble dt:

Table 10.4: Head of file dt.csv.

| name | gender | bday | bmonth | byear | height | score | t_1 | t_2 |
|---|---|---|---|---|---|---|---|---|
| I.G. | male | 14 | 12 | 1968 | 169 | 113 | 2020-01-16 11:00:58 | 2020-01-16 11:32:21 |
| O.B. | male | 10 | 4 | 1974 | 181 | 114 | 2020-01-17 14:11:07 | 2020-01-17 15:05:14 |
| M.M. | male | 28 | 9 | 1987 | 183 | 108 | 2020-01-16 10:06:06 | 2020-01-16 10:51:47 |
| V.J. | female | 15 | 2 | 1978 | 161 | 93 | 2020-01-10 10:06:04 | 2020-01-10 10:39:48 |
| O.E. | male | 18 | 5 | 1985 | 164 | 114 | 2020-01-20 09:23:51 | 2020-01-20 10:11:36 |
| Q.W. | male | 1 | 3 | 1968 | 172 | 103 | 2020-01-13 11:10:09 | 2020-01-13 11:54:07 |
  1. The time variables t_1 and t_2 indicate the start and end times of each person’s participation in a study. Compute the duration of each person’s participation (in minutes and seconds) and plot the distribution of the resulting durations (e.g., as a histogram).

Solution
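
A minimal sketch of the duration computation, assuming the time columns of the first three rows of Table 10.4 are entered by hand (the full file has 1000 rows). The column names dur_sec, dur_min, and dur_ms are illustrative, and a base R hist() stands in for whatever plotting approach is preferred.

```r
library(tibble)
library(dplyr)
library(lubridate)

# First 3 rows of Table 10.4 (only the name and time columns are needed here)
dt <- tibble(
  name = c("I.G.", "O.B.", "M.M."),
  t_1  = ymd_hms(c("2020-01-16 11:00:58", "2020-01-17 14:11:07", "2020-01-16 10:06:06")),
  t_2  = ymd_hms(c("2020-01-16 11:32:21", "2020-01-17 15:05:14", "2020-01-16 10:51:47"))
)

dt <- dt %>%
  mutate(dur_sec = as.numeric(difftime(t_2, t_1, units = "secs")),  # total seconds
         dur_min = dur_sec / 60,                                   # decimal minutes
         dur_ms  = seconds_to_period(dur_sec))                      # minutes and seconds as a Period

# Distribution of durations (only informative with the full data set)
hist(dt$dur_min, main = "Participation duration", xlab = "Minutes")
```

Keeping a plain numeric column (dur_min) alongside the Period column is convenient, since plotting and filtering functions generally expect numbers.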

  2. The study officially only ran for 5 days (from “2020-01-13” to “2020-01-18”) and should only include participants who responded in less than 1 hour (60 minutes). Add a filter variable that encodes these criteria (i.e., allows filtering out other dates and durations of 60 minutes or more).
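
The two criteria can be combined into a logical filter variable, as sketched below. Two assumptions are made here: the study window is read as five full days starting at midnight on 2020-01-13 and ending just before midnight on 2020-01-18, and the variable names in_study and valid are illustrative. Three rows of Table 10.4 are entered by hand.

```r
library(tibble)
library(dplyr)
library(lubridate)

# Start and end times of 3 rows from Table 10.4
dt <- tibble(
  name = c("I.G.", "V.J.", "Q.W."),
  t_1  = ymd_hms(c("2020-01-16 11:00:58", "2020-01-10 10:06:04", "2020-01-13 11:10:09")),
  t_2  = ymd_hms(c("2020-01-16 11:32:21", "2020-01-10 10:39:48", "2020-01-13 11:54:07"))
)

# Assumption: the study window covers 2020-01-13 00:00 up to (but excluding) 2020-01-18 00:00
study_start <- ymd_hms("2020-01-13 00:00:00")
study_end   <- ymd_hms("2020-01-18 00:00:00")

dt <- dt %>%
  mutate(dur_min  = as.numeric(difftime(t_2, t_1, units = "mins")),  # duration in minutes
         in_study = t_1 >= study_start & t_2 < study_end,           # within the study window?
         valid    = in_study & dur_min < 60)                        # both criteria met?

filter(dt, valid)
```

If the end date is meant to be included, study_end would need to be moved to midnight of the following day.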