17.1 Exploring
17.1.1 naniar
17.1.1.1 Shadow matrices
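As a minimal sketch of what a shadow matrix looks like, the example below uses naniar's as_shadow() and bind_shadow() on a small, made-up data frame (df_demo and its columns are hypothetical and not part of these notes). Each variable gets a companion column with the suffix _NA that records whether the value is missing.

```r
library(naniar)

# Hypothetical data frame with one missing value per column
df_demo <- data.frame(
  height = c(170, NA, 182),
  weight = c(65, 80, NA)
)

# One "_NA" column per variable: "!NA" means present, "NA" means missing
shadow <- as_shadow(df_demo)
shadow

# Attach the shadow columns to the original data
df_shadow <- bind_shadow(df_demo)
names(df_shadow)
```

Keeping the shadow columns next to the data makes it possible to group or colour by missingness of one variable while summarising another.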
17.1.1.2 Summaries
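A short sketch of naniar's summary helpers follows, again on a made-up data frame (df_demo is hypothetical). It assumes that n_miss(), miss_var_summary() and miss_case_summary() are the kind of summaries meant here.

```r
library(naniar)

# Hypothetical data frame: 2 missing heights, 1 missing weight
df_demo <- data.frame(
  height = c(170, NA, 182, NA),
  weight = c(65, 80, NA, 72)
)

# Total number of missing values in the data frame
total_missing <- n_miss(df_demo)

# Missing count and percentage for each variable
var_summary <- miss_var_summary(df_demo)
var_summary

# Missing count and percentage for each row (case)
case_summary <- miss_case_summary(df_demo)
case_summary
```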
17.1.2 Replace a value with NA
It is also worth mentioning a dplyr function that replaces a given value with NA: dplyr::na_if()
# na_if is particularly useful inside mutate,
# and is meant for use with vectors rather than entire data frames
starwars %>%
  select(name, eye_color) %>%
  mutate(eye_color = na_if(eye_color, "unknown"))
#> # A tibble: 87 x 2
#> name eye_color
#> <chr> <chr>
#> 1 Luke Skywalker blue
#> 2 C-3PO yellow
#> 3 R2-D2 red
#> 4 Darth Vader yellow
#> 5 Leia Organa brown
#> 6 Owen Lars blue
#> # ... with 81 more rows
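Because na_if() works on vectors, it can be called on a bare vector or mapped over several columns with across(). The sketch below uses a small made-up tibble (eyes and its columns are hypothetical) to show both uses.

```r
library(dplyr)

# On a bare vector: elements equal to the second argument become NA
colours <- na_if(c("blue", "unknown", "red"), "unknown")
colours

# Hypothetical tibble with "unknown" used as a missing-value marker
eyes <- tibble(
  name       = c("a", "b"),
  eye_color  = c("unknown", "blue"),
  hair_color = c("brown", "unknown")
)

# Apply na_if() column by column with across()
eyes_clean <- eyes %>%
  mutate(across(c(eye_color, hair_color), ~ na_if(.x, "unknown")))
eyes_clean
```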
For whole data frames, naniar provides replace_with_na() and its scoped variants. See the article: http://naniar.njtierney.com/articles/replace-with-na.html
df <- tibble::tribble(
  ~name,           ~x,  ~y,              ~z,
  "N/A",           1,   "N/A",           -100,
  "N A",           3,   "NOt available", -99,
  "N / A",         NA,  "29",            -98,
  "Not Available", -99, "25",            -101,
  "John Smith",    -98, "28",            -1)
df %>% replace_with_na(replace = list(x = -99))
#> # A tibble: 5 x 4
#> name x y z
#> <chr> <dbl> <chr> <dbl>
#> 1 N/A 1 N/A -100
#> 2 N A 3 NOt available -99
#> 3 N / A NA 29 -98
#> 4 Not Available NA 25 -101
#> 5 John Smith -98 28 -1
df %>% replace_with_na(replace = list(x = c(-99, -98)))
#> # A tibble: 5 x 4
#> name x y z
#> <chr> <dbl> <chr> <dbl>
#> 1 N/A 1 N/A -100
#> 2 N A 3 NOt available -99
#> 3 N / A NA 29 -98
#> 4 Not Available NA 25 -101
#> 5 John Smith NA 28 -1
df %>%
  replace_with_na(replace = list(x = c(-99, -98),
                                 z = c(-99, -98)))
#> # A tibble: 5 x 4
#> name x y z
#> <chr> <dbl> <chr> <dbl>
#> 1 N/A 1 N/A -100
#> 2 N A 3 NOt available NA
#> 3 N / A NA 29 NA
#> 4 Not Available NA 25 -101
#> 5 John Smith NA 28 -1
df %>% replace_with_na_all(condition = ~.x == -99)
#> # A tibble: 5 x 4
#> name x y z
#> <chr> <dbl> <chr> <dbl>
#> 1 N/A 1 N/A -100
#> 2 N A 3 NOt available NA
#> 3 N / A NA 29 -98
#> 4 Not Available NA 25 -101
#> 5 John Smith -98 28 -1
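The condition passed to replace_with_na_all() can also test membership in a set of strings, and naniar has scoped variants for selected columns. The sketch below follows the pattern in the naniar article. The data frame repeats the df defined above so that the block runs on its own, and the contents of na_strings are an illustrative choice.

```r
library(naniar)

df <- tibble::tribble(
  ~name,           ~x,  ~y,              ~z,
  "N/A",           1,   "N/A",           -100,
  "N A",           3,   "NOt available", -99,
  "N / A",         NA,  "29",            -98,
  "Not Available", -99, "25",            -101,
  "John Smith",    -98, "28",            -1)

# Strings that should be treated as missing (illustrative list)
na_strings <- c("NA", "N A", "N / A", "N/A", "N/ A",
                "Not Available", "NOt available")

# Replace any of these strings, in any column, with NA
out_all <- replace_with_na_all(df, condition = ~.x %in% na_strings)

# Replace -99 with NA only in the named columns
out_at <- replace_with_na_at(df, .vars = c("x", "z"),
                             condition = ~.x == -99)

# Replace "N/A" with NA only in columns that satisfy a predicate
out_if <- replace_with_na_if(df, .predicate = is.character,
                             condition = ~.x %in% "N/A")
```

The scoped variants are slower than replace_with_na() on large data but avoid listing every column by hand.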
17.1.3 janitor
janitor::tabyl() generates a frequency table and exposes missing values at the same time.
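A minimal sketch with a made-up vector (eye is hypothetical) shows how tabyl() reports missing values: by default NA gets its own row, and a valid_percent column gives percentages computed over the non-missing values only.

```r
library(janitor)

# Hypothetical vector with two missing values out of five
eye <- c("blue", "brown", NA, "blue", NA)

# Frequency table: NA appears as its own row,
# and valid_percent excludes the missing values
tab <- tabyl(eye)
tab

# Drop the NA row (and the valid_percent column) if it is not wanted
tab_no_na <- tabyl(eye, show_na = FALSE)
tab_no_na
```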
17.1.4 sjmisc
When the variable with missing values is a factor, sjmisc::frq() can be used.
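As a rough illustration, the sketch below calls frq() on a made-up factor (grp is hypothetical). The printed table lists each level along with a row for NA, with raw and valid percentages side by side.

```r
library(sjmisc)

# Hypothetical factor with two missing values out of five
grp <- factor(c("a", "b", NA, "a", NA))

# Frequency table including a row for the missing values
res <- frq(grp)
res
```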