6.4 Exercises

The following exercises practice skills in navigating local directories and using essential readr commands for importing and writing data.

6.4.1 Exercise 1

Navigating directories

Find out your current working directory and list all files and folders contained in it.
Change your working directory to a different directory (e.g., a parallel directory data that is located on the same level as your current working directory) and list all the files and folders in the other directory.
Return to your original working directory, but list all files in the other (data) directory.

Please note: If you are doing this exercise in an R Markdown file (.Rmd), compiling chunks that contain local paths may yield error messages (if R runs from a different location). If this happens, simply execute your commands in the Console and set the chunk option to eval = FALSE, so that the chunk is not evaluated when the document is compiled (see Section F.3.3 of Appendix F on Using R Markdown for details).
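
The navigation steps can be tried out safely in a throwaway folder. The following is only a minimal sketch: the folders nav_demo, project, and data and the file example.csv are hypothetical and are created in a temporary directory so that nothing in your own file system changes.

```r
# Hypothetical setup: a temporary "project" folder with a parallel "data" folder
base_dir <- file.path(tempdir(), "nav_demo")
dir.create(file.path(base_dir, "project"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(base_dir, "data"), showWarnings = FALSE)
file.create(file.path(base_dir, "data", "example.csv"))

# setwd() invisibly returns the previous working directory, so we can restore it later
old_wd <- setwd(file.path(base_dir, "project"))

getwd()       # show the current working directory
list.files()  # list files and folders in it

setwd("../data")  # move to the parallel directory (".." means "one level up")
list.files()      # list its contents

setwd("../project")                     # return to the original directory
files_in_data <- list.files("../data")  # list the other directory without moving there
files_in_data

setwd(old_wd)  # restore the working directory we started from
```

Relative paths such as "../data" are interpreted from the current working directory, which is why the same path can work in the Console but fail when a document is compiled from a different location.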

6.4.2 Exercise 2

Parsing dates and numbers

Look at your ID card and type your birthday as a string as it’s written on the card (including any spaces or punctuation symbols). For instance, if you were Erika Mustermann (see https://de.wikipedia.org/wiki/Personalausweis_(Deutschland)) you would write the character string “12.08.1964”.

Use an appropriate parse_ command to read this character string into R.
Now read out the date in German (i.e., “12. August 1964”) and use another command to parse this string into R.
Use Google Translate to translate this character string into French, Italian, and Spanish and use appropriate R commands to parse these strings into R.

Hint: Consult vignette("locales") for specifying languages.
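
As a starting point, here is a sketch of how the two German date strings from the text could be parsed with readr::parse_date(). The format strings are one possible choice, not the only one. The object names d_num and d_de are made up for illustration.

```r
library(readr)

# Numeric date as printed on the sample ID card: day.month.year
d_num <- parse_date("12.08.1964", format = "%d.%m.%Y")

# Date with a German month name: %B matches the full month name of the given locale
d_de <- parse_date("12. August 1964", format = "%d. %B %Y", locale = locale("de"))

d_num
d_de
```

The same pattern should carry over to other languages by changing the locale and adapting the format string to the wording of the translated date.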

Use a parse_ command (with an appropriate locale) to parse the following character strings into the desired data format:

"US$1,099.95" as a number;
"EUR1.099,95" as a number.
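
The role of the locale in number parsing can be sketched with two made-up strings (not the ones from the exercise). By default, parse_number() treats "." as the decimal mark and "," as the grouping mark; a locale can swap these roles.

```r
library(readr)

# Default locale: "," groups thousands, "." marks decimals
n_us <- parse_number("USD 2,500.75")

# Locale with swapped marks, as is common in many European countries
n_eu <- parse_number("2.500,75 EUR",
                     locale = locale(decimal_mark = ",", grouping_mark = "."))

n_us
n_eu
```

Both calls drop the non-numeric characters around the number and should yield the same numeric value.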

6.4.3 Exercise 3

A read-write-read cycle

Read the data in file http://rpository.com/ds4psy/data/data_2.dat into an R object data_2, using the command read_delim() rather than read_fwf() (as above).

Hint: The variable names should be the same as above, but inspect the file to see its delimiter.

Store the data as a file data_2.csv (a csv file that includes variable names) in a directory that is not your current working directory.
Now use a command to re-read the file data_2.csv back into an object data_2b and use the all.equal() function to verify that data_2 and data_2b are equal.
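
The write and re-read steps can be rehearsed on a small stand-in before working with the real file. This is a sketch under stated assumptions: the tibble toy, its columns, and the folder export are hypothetical, and a temporary directory stands in for "a directory that is not your current working directory".

```r
library(readr)
library(tibble)

# Hypothetical toy data (stands in for data_2)
toy <- tibble(id = c(1, 2, 3),
              score = c(2.5, 3.0, 4.5),
              group = c("a", "b", "a"))

# Write to a directory other than the working directory
out_dir <- file.path(tempdir(), "export")
dir.create(out_dir, showWarnings = FALSE)
out_file <- file.path(out_dir, "toy.csv")
write_csv(toy, out_file)  # variable names are written as the first line by default

# Read the file back and compare
toy_b <- read_csv(out_file)
all.equal(toy, toy_b, check.attributes = FALSE)  # compare values, ignoring attributes
```

Objects returned by read_csv() carry extra attributes (such as the column specification), so a plain all.equal() may report differences even when all values agree. Setting check.attributes = FALSE is one way to focus the comparison on the data themselves.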

6.4.4 Exercise 4

Reading odd data

The following data files are variants of the data at http://rpository.com/ds4psy/data/falsePosPsy_all.csv:

(See Section B.2 of Appendix B for details on the data and corresponding articles.)

Hint: Defining the file paths as R objects saves you from typing them repeatedly later.

Inspect file ex1.dat and read it in two ways (by using either the generic read.csv() or the appropriate variant of read_csv()). How do the resulting data objects differ from each other?
Inspect and import the dataset ex2.dat using appropriate command(s).
Inspect and import the dataset ex3.dat using appropriate command(s).
Inspect and import the dataset ex4.dat using appropriate command(s). Specifically, note the encoding of the age variable (aged365) and check whether you can compute participants’ average age (in years) after importing the data.
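
A general inspect-then-import routine might look as follows. This sketch does not use the exercise files: it writes a hypothetical toy file (toy_odd.dat, with made-up columns id and age) that happens to use semicolons as delimiters and commas as decimal marks. The real files may be formatted differently.

```r
library(readr)

# Create a hypothetical "odd" file in a temporary directory
toy_path <- file.path(tempdir(), "toy_odd.dat")
write_lines(c("id;age", "1;20,5", "2;30,5"), toy_path)

# Step 1: Inspect the raw text before choosing an import command
read_lines(toy_path, n_max = 3)

# Step 2: Pick a reader that matches what was seen.
# read_csv2() expects ";" as delimiter and "," as decimal mark.
toy_odd <- read_csv2(toy_path)

# Step 3: Check that numeric variables were parsed as numbers
is.numeric(toy_odd$age)
mean(toy_odd$age)
```

If a variable that should be numeric arrives as character, a summary such as mean() will not work as intended, which is usually a sign that the delimiter, decimal mark, or missing-value code needs to be specified explicitly.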

6.4.5 Exercise 5

Writing data

In Exercise 4 of the previous chapter on tibbles (see Section 5.4.4 of Chapter 5), we created the following summary tibble in different ways (either directly entering it by using tibble commands, or by using dplyr commands to obtain a summary table from the raw data):

Table 6.1: Age-related data from Simmons et al. (2011) (see Exercise 4 of the tibbles chapter).
cond	n	mn_ag	mi_ag	mx_ag	fl_vyng	fl_yng	fl_mid	fl_old
64	25	21.09	18.30	38.24	0	13	10	2
control	22	20.80	18.53	27.23	3	15	3	1
potato	31	20.60	18.18	27.37	1	17	11	2

(See Section B.2 of Appendix B for details on the data and corresponding articles.)

Imagine that you are trying to send this file to a friend who — due to excessive demand for our course — was unable to secure a spot in this course and ended up in a course on the “History of data science”, whose members are encouraged to experiment with software products like MS Excel and SPSS.

Assuming that your friend is currently located in Troy, NY (i.e., in the USA), export the summary as a file that your friend can read with her software.
Read back your file and verify that it contains the same information as your original summary.
Now repeat both steps (i.e., writing and re-reading the summary data) under the assumption that your friend is located in Berlin, Germany.
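
One way to approach both variants is sketched below. It rests on the assumption that spreadsheet software with US settings expects commas as delimiters and "." as the decimal mark, whereas software with German settings typically expects semicolons and "," as the decimal mark. The object and file names (smry, smry_us.csv, smry_de.csv) are made up, and the table values are typed in from Table 6.1.

```r
library(readr)
library(tibble)

# Summary table from Table 6.1, entered by hand
smry <- tribble(
  ~cond,     ~n, ~mn_ag, ~mi_ag, ~mx_ag, ~fl_vyng, ~fl_yng, ~fl_mid, ~fl_old,
  "64",      25, 21.09,  18.30,  38.24,  0,        13,      10,      2,
  "control", 22, 20.80,  18.53,  27.23,  3,        15,      3,       1,
  "potato",  31, 20.60,  18.18,  27.37,  1,        17,      11,      2
)

# A temporary directory stands in for the actual export location
file_us <- file.path(tempdir(), "smry_us.csv")
file_de <- file.path(tempdir(), "smry_de.csv")

# US convention (assumed): "," as delimiter, "." as decimal mark
write_csv(smry, file_us)
smry_us <- read_csv(file_us)

# German convention (assumed): ";" as delimiter, "," as decimal mark
write_csv2(smry, file_de)
smry_de <- read_csv2(file_de)

# Compare values with the original summary
all.equal(smry$mn_ag, smry_us$mn_ag)
all.equal(smry$mn_ag, smry_de$mn_ag)
```

Opening the two files in a text editor shows the difference between the formats directly. readr also provides write_excel_csv() and write_excel_csv2(), which may be worth trying if the target program has trouble with the character encoding.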

6.4.6 Exercise 6

Variants of p_info

In this exercise, we revisit the participant data on positive psychology interventions that we have analyzed before and try to parse some variants of these data. (See Section B.1 of Appendix B for details on the data.)

Load the data at http://rpository.com/ds4psy/data/posPsy_participants.csv into an R object p_info and compute participants’ mean age by intervention, by sex, and by level of education (educ).
Download the file p_info_2.dat (located at http://rpository.com/ds4psy/data/p_info_2.dat) into a local directory (called data) and import it from there into an R object p_info_2.
(Hint: Inspect the file prior to loading it: What is different in this file?)
Recompute the mean age by intervention, by sex, and by level of education (educ). Are they the same as before?
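
The grouped means can be computed with dplyr, as in earlier chapters. The sketch below uses a hypothetical toy tibble rather than the real data. The column names intervention and age are assumptions about the file, and the values are invented.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for p_info (invented values)
toy_info <- tibble(intervention = c(1, 1, 2, 2),
                   age = c(20, 30, 40, 50))

# Mean age by intervention
mn_by_iv <- toy_info %>%
  group_by(intervention) %>%
  summarise(mn_age = mean(age))

mn_by_iv
```

Replacing the grouping variable yields the corresponding means by sex or by educ. If a recomputed mean differs from the earlier result or turns out to be NA, checking the type of the age column after import is a good first step.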

This concludes our set of exercises on importing data.