Chapter 5 Data Wrangling

WE HAVE ARRIVED. Finally, we get to play with some data...pretty much. We’re really going to practice a little bit of the data cleaning process. The authors call this data wrangling and I totally love it. Frankly, the functions you learn here are the foundation of working with data in R. They are so important they have their own little title, the Wickham Six. Let’s get to it.

5.1 Goals

  • Open rmd stub

  • load in tidyverse and your data

  • check out the data with the functions we learned last unit

  • play with a line graph a little

  • Clean data with the dplyr package and the Wickham Six

    • select

    • filter

    • mutate

    • arrange

    • group_by

    • summarise

  • first exposure to pipes
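
To give a feel for the six verbs and the pipe before you start, here is a minimal sketch on a tiny made-up tibble. The `toy` data and its column names are hypothetical and only for illustration; the activities themselves use the babynames data, and the textbook's own examples may look different.

```r
library(dplyr)  # also loaded by library(tidyverse)

# Hypothetical toy data, made up for illustration (not the babynames dataset)
toy <- tibble(
  year = c(2000, 2000, 2001, 2001),
  name = c("Ada", "Ben", "Ada", "Ben"),
  n    = c(10, 20, 30, 40)
)

# select: keep only the columns you name
toy %>% select(year, n)

# filter: keep only the rows that pass a test
toy %>% filter(name == "Ada")

# mutate: add a new column computed from existing ones
toy %>% mutate(n_doubled = n * 2)

# arrange: sort the rows (desc() puts the largest values first)
toy %>% arrange(desc(n))

# group_by + summarise: collapse each group down to one summary row
totals <- toy %>%
  group_by(year) %>%
  summarise(total = sum(n))
```

Each `%>%` hands the result on its left to the function on its right, which is why the last example can chain two verbs together.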

5.2 Tasks

Task 1: Set-up

You should read section 5.1 and load in the babynames dataset described in 5.2. Throughout this unit PAY ATTENTION to the commands you are using and actively try to remember and implement the commands we have learned in the last few units.
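
A set-up chunk might look roughly like the sketch below. It assumes the data come from the babynames package and that the package is already installed; follow section 5.2 if it loads the data differently. The two inspection functions are just common choices, so swap in whichever ones we used last unit.

```r
# Load the packages (assumes both are already installed)
library(tidyverse)
library(babynames)

# Take a first look at the data
glimpse(babynames)  # column names, types, and the first few values
head(babynames)     # the first six rows
```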

Task 2: Trying out the Wickham Six

Complete 5.3-5.13. This will take you a while, so it’s OK to take a break and take your time.

Actually type all of the code out; you need the practice.

Task 3: Stuff to turn in on StackOverflow

In the rmd stub you use for this chapter, take notes using # comments in each activity you write code for. When you use a new function, like mutate, use a # comment to describe what the function does in your own words. Post your responses and use the ch3 tag for any questions you have throughout.

  1. Post code, similar to that in activity 3, to create a tibble that filters out all names except children named Margaret who were assigned female at birth.

  2. In your own words, tell me what a pipe does.

  3. In activity 4, suppose I only wanted a tibble with year, name, and prop. What would my code look like? Hint: use the select function.