7.3 row_number()
Using row_number() with mutate() creates a column of consecutive integers, starting at 1. This makes row_number() useful for creating an identification number (an ID variable). It is also useful for numbering observations within each level of a grouping variable, because after group_by() the count restarts at 1 in every group.
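As a minimal sketch of both uses, the example below numbers rows with and without grouping. The `toy` data frame and its column names are hypothetical and exist only for illustration; they are not part of the practice dataset.

```r
library(dplyr)

# Hypothetical toy data, made up for this illustration
toy <- tibble(
  group = c("a", "a", "b", "b", "b"),
  value = c(10, 20, 30, 40, 50)
)

# Ungrouped: one running ID across the whole data frame (1, 2, 3, 4, 5)
toy_id <- toy %>%
  mutate(id = row_number())

# Grouped: numbering restarts at 1 within each level of `group` (1, 2, 1, 2, 3)
toy_within <- toy %>%
  group_by(group) %>%
  mutate(within_id = row_number()) %>%
  ungroup()
```

In both cases the numbers follow the current row order, so sorting the data before calling row_number() changes which row receives which number.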
### Practice Dataset
library(tidyverse)  # provides tibble(), mutate(), group_by(), arrange(), and %>%
practice <- 
  tibble(Subject = rep(c(1,2,3),8),
         Date = c("2019-01-02", "2019-01-02", "2019-01-02", 
                  "2019-01-03", "2019-01-03", "2019-01-03",
                  "2019-01-04", "2019-01-04", "2019-01-04", 
                  "2019-01-05", "2019-01-05", "2019-01-05", 
                  "2019-01-06", "2019-01-06", "2019-01-06", 
                  "2019-01-07", "2019-01-07", "2019-01-07",
                  "2019-01-08", "2019-01-08", "2019-01-08",
                  "2019-01-01", "2019-01-01", "2019-01-01"),
         DV = sample(1:10, 24, replace = TRUE),  # random draws, so your DV values will differ from the output shown
         Inject = rep(c("Pos", "Neg", "Neg", "Neg", "Pos", "Pos"), 4))
Using the practice dataset, let’s add a variable called Session. Each session consists of one positive day and one negative day that are closest in date. For example, for each subject, the first observation where Inject = Pos and the first observation where Inject = Neg both get a Session value of 1; the second observation where Inject = Pos and the second observation where Inject = Neg both get a Session value of 2. The code below shows three methods for creating Session. Which method produces the result we need?
## Method1
practice %>% 
  mutate(Session = row_number())
## # A tibble: 24 x 5
##    Subject Date          DV Inject Session
##      <dbl> <chr>      <int> <chr>    <int>
##  1       1 2019-01-02     9 Pos          1
##  2       2 2019-01-02     4 Neg          2
##  3       3 2019-01-02     7 Neg          3
##  4       1 2019-01-03     8 Neg          4
##  5       2 2019-01-03     8 Pos          5
##  6       3 2019-01-03     3 Pos          6
##  7       1 2019-01-04     3 Pos          7
##  8       2 2019-01-04     3 Neg          8
##  9       3 2019-01-04     7 Neg          9
## 10       1 2019-01-05     6 Neg         10
## # ... with 14 more rows
## Method2
practice %>% 
  group_by(Subject, Inject) %>%
  mutate(Session = row_number())
## # A tibble: 24 x 5
## # Groups:   Subject, Inject [6]
##    Subject Date          DV Inject Session
##      <dbl> <chr>      <int> <chr>    <int>
##  1       1 2019-01-02     9 Pos          1
##  2       2 2019-01-02     4 Neg          1
##  3       3 2019-01-02     7 Neg          1
##  4       1 2019-01-03     8 Neg          1
##  5       2 2019-01-03     8 Pos          1
##  6       3 2019-01-03     3 Pos          1
##  7       1 2019-01-04     3 Pos          2
##  8       2 2019-01-04     3 Neg          2
##  9       3 2019-01-04     7 Neg          2
## 10       1 2019-01-05     6 Neg          2
## # ... with 14 more rows
## Method3
practice %>% 
  group_by(Subject, Inject) %>%
  arrange(Date) %>%  
  mutate(Session = row_number())
## # A tibble: 24 x 5
## # Groups:   Subject, Inject [6]
##    Subject Date          DV Inject Session
##      <dbl> <chr>      <int> <chr>    <int>
##  1       1 2019-01-01     7 Neg          1
##  2       2 2019-01-01     7 Pos          1
##  3       3 2019-01-01     1 Pos          1
##  4       1 2019-01-02     9 Pos          1
##  5       2 2019-01-02     4 Neg          1
##  6       3 2019-01-02     7 Neg          1
##  7       1 2019-01-03     8 Neg          2
##  8       2 2019-01-03     8 Pos          2
##  9       3 2019-01-03     3 Pos          2
## 10       1 2019-01-04     3 Pos          2
## # ... with 14 more rows
7.3.1 Exercises
- Create a row ID for diamonds where each row is unique and order doesn’t matter 
- Create an ID that relies on the clarity of diamonds where order doesn’t matter 
- Create an ID that represents the price rank of the diamond.
  - Which diamond is #1 (the highest-priced diamond in the dataset)?
  - Which diamond is ranked #2 in price?
- Create an ID that represents price rank within each clarity category.
  - Of the diamonds with the clarity IF, which is the highest-ranked (most expensive) diamond?
  - Of the diamonds with the clarity SI2, which is the 2nd most expensive diamond (rank = 2)?
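One possible way to approach these exercises is sketched below. It assumes the `diamonds` dataset that ships with ggplot2. The column names `id`, `clarity_id`, and `price_rank` are illustrative choices, not names the exercises require. Because row_number() assigns distinct numbers to tied prices, which of two equally priced diamonds ranks first depends on their row order.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Unique row ID; order doesn't matter, so number rows as they come
diamonds_id <- diamonds %>%
  mutate(id = row_number())

# ID that restarts at 1 within each clarity level
diamonds_clarity <- diamonds %>%
  group_by(clarity) %>%
  mutate(clarity_id = row_number()) %>%
  ungroup()

# Overall price rank: sort from most to least expensive, then number
diamonds_rank <- diamonds %>%
  arrange(desc(price)) %>%
  mutate(price_rank = row_number())

# Price rank within each clarity level: sort first, then number by group
diamonds_clarity_rank <- diamonds %>%
  arrange(desc(price)) %>%
  group_by(clarity) %>%
  mutate(price_rank = row_number()) %>%
  ungroup()

# Look up specific ranks
diamonds_rank %>% filter(price_rank <= 2)
diamonds_clarity_rank %>% filter(clarity == "IF", price_rank == 1)
diamonds_clarity_rank %>% filter(clarity == "SI2", price_rank == 2)
```

Sorting before numbering is what turns a plain row counter into a rank; without arrange(), the IDs would simply follow the dataset's stored order.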