Topics and terms covered in this chapter.
Text objects are called strings or text strings. We will use the terms text and strings interchangeably.
Text contains characters (i.e., letters, numbers, and other symbols). In R, its data type and mode are also called character.
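As a brief illustration of the terms above, the following base R sketch creates a text object and inspects its type and mode. The object names `txt` and `num_txt` are chosen here for illustration only.

```r
# A text object (string) is created by enclosing characters in quotes:
txt <- "Hello, world!"

typeof(txt)        # "character"
mode(txt)          # "character"
is.character(txt)  # TRUE

# Digits in quotes are text, not numbers:
num_txt <- "123"
is.character(num_txt)  # TRUE
is.numeric(num_txt)    # FALSE

# length() counts the elements of a vector,
# nchar() counts the characters of each string:
length(txt)  # 1
nchar(txt)   # 13
```

Note that a single string is a character vector of length 1, regardless of how many characters it contains.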
Larger units of text are often called documents. Loading, reading or searching through text strings is also referred to as parsing.
After working through this chapter, you should be able to:
- understand that text consists of strings,
- combine and dissect strings of text,
- read and write regular expressions,
- use stringr commands to find and replace text strings.
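To give a first impression of these skills, here is a minimal preview sketch using stringr. The vector `fruits` and the patterns are illustrative assumptions, not examples taken from the chapter.

```r
library(stringr)

fruits <- c("apple", "banana", "cherry")  # hypothetical example data

# Combine strings into one:
all_fruits <- str_c(fruits, collapse = ", ")
all_fruits  # "apple, banana, cherry"

# Dissect strings (extract characters 1 to 3 of each):
starts <- str_sub(fruits, 1, 3)
starts  # "app" "ban" "che"

# Regular expression: does a string contain a doubled character?
doubled <- str_detect(fruits, "(.)\\1")
doubled  # TRUE FALSE TRUE

# Find and replace: substitute every lowercase vowel with "_":
masked <- str_replace_all(fruits, "[aeiou]", "_")
masked  # "_ppl_" "b_n_n_" "ch_rry"
```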
Data used in this chapter.
This chapter assumes that you have read and worked through Chapter 14: Strings of the r4ds textbook (Wickham & Grolemund, 2017). Based on this background, we examine essential commands of base R and the stringr package (Wickham, 2019) in the context of examples and exercises.
Structure your document by inserting headings and empty lines between different parts. Here’s an example of how your initial file could look:
    ---
    title: "Chapter 09: Text data"
    author: "Your name"
    date: "2020 March 24"
    output: html_document
    ---

    Add text or code chunks here.

    # Exercises (09: Text data)

    ## Exercise 1

    ## Exercise 2

    etc.

    <!-- The end (eof). -->
Create an initial code chunk below the header of your .Rmd file that loads the R packages of the tidyverse (and see Section E.3.3 if you want to get rid of the messages and warnings of this chunk in your HTML output).
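The body of such an initial chunk could look like the sketch below. The chunk label `setup` and the suggested header are assumptions made for illustration; `message` and `warning` are standard knitr chunk options, but Section E.3.3 may recommend a different approach.

```r
# Body of an initial (setup) chunk.
# Assumption: in the .Rmd file, the chunk header could read
#   {r setup, message = FALSE, warning = FALSE}
# where message and warning are knitr chunk options that hide
# package startup messages and warnings in the knitted output.
library(tidyverse)  # loads the core tidyverse packages, including stringr
```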
Save your file (e.g., as nr_name.Rmd in the R folder of your current project) and remember to save and knit it regularly as you keep adding content to it.