3.5 Convert numeric to binary
Sometimes you have a numeric variable that takes on values over a range (e.g., BMI, age) and you would like to create a binary (0/1, yes/no) categorical variable with levels corresponding to specific ranges. For example, let’s convert the variable Age (years) into “Elderly”, corresponding to age at least 65 years. The result can be left as a 0/1 variable or converted to a factor.
# As a 0/1 variable
mydat$elderly <- as.numeric(mydat$Age >= 65)
# Examine new variable
table(mydat$elderly, useNA = "ifany")
##
##   0   1
## 375 155
# Check range of original variable at levels of new
tapply(mydat$Age, mydat$elderly, range)
## $`0`
## [1] 42 64
##
## $`1`
## [1] 65 90
# As a factor
mydat$elderly_fac <- factor(mydat$elderly,
                            levels = 0:1,
                            labels = c("Age < 65y", "Age 65y+"))
# Examine new variable
table(mydat$elderly_fac, useNA = "ifany")
##
## Age < 65y  Age 65y+
##       375       155
# Check range of original variable at levels of new
tapply(mydat$Age, mydat$elderly_fac, range)
## $`Age < 65y`
## [1] 42 64
##
## $`Age 65y+`
## [1] 65 90
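The base R steps above can be tried on a small made-up data frame. This is only an illustrative sketch: the data frame `toy` and its values are hypothetical and are not part of `mydat`. It also shows that a missing Age stays missing in the new variable, which is why `useNA = "ifany"` is worth including in the check.

```r
# Hypothetical toy data (not the mydat used in the text)
toy <- data.frame(Age = c(42, 64, 65, 90, NA))

# The comparison gives TRUE/FALSE/NA; as.numeric() maps these to 1/0/NA
toy$elderly <- as.numeric(toy$Age >= 65)

# Factor version with descriptive labels
toy$elderly_fac <- factor(toy$elderly,
                          levels = 0:1,
                          labels = c("Age < 65y", "Age 65y+"))

# Counts at each level, including any missing values
table(toy$elderly_fac, useNA = "ifany")

# Range of Age within each level of the new variable
tapply(toy$Age, toy$elderly_fac, range)
```

Checking the range of the original variable within each level is a quick way to confirm that the cutoff landed where intended (here, 64 should be the largest value in the lower group and 65 the smallest in the upper group).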
In tidyverse, you would use the following code. The code below uses a few new functions, summarize() and group_by().
# As a 0/1 variable
mydat_tibble <- mydat_tibble %>%
  mutate(elderly = case_when(Age < 65  ~ 0,
                             Age >= 65 ~ 1))
# Examine new variable
mydat_tibble %>%
  count(elderly)
# Check range of original variable at levels of new
mydat_tibble %>%
  group_by(elderly) %>%
  summarize(min = min(Age),
            max = max(Age))
# As a factor
mydat_tibble <- mydat_tibble %>%
  mutate(elderly = case_when(Age < 65  ~ 0,
                             Age >= 65 ~ 1)) %>%
  mutate(elderly_fac = factor(elderly,
                              levels = 0:1,
                              labels = c("Age < 65y", "Age 65y+")))
# Examine new variable
mydat_tibble %>%
  count(elderly_fac)
# Check range of original variable at levels of new
mydat_tibble %>%
  group_by(elderly_fac) %>%
  summarize(min = min(Age),
            max = max(Age))
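The same tidyverse steps can be run end to end on a small made-up tibble. This is a minimal sketch, assuming the dplyr package is installed; `toy_tibble` and its values are hypothetical and are not part of `mydat_tibble`.

```r
library(dplyr)

# Hypothetical toy data (not the mydat_tibble used in the text)
toy_tibble <- tibble(Age = c(42, 64, 65, 90, NA))

# Create the 0/1 variable and its factor version in one mutate() call
toy_tibble <- toy_tibble %>%
  mutate(elderly = case_when(Age < 65  ~ 0,
                             Age >= 65 ~ 1),
         elderly_fac = factor(elderly,
                              levels = 0:1,
                              labels = c("Age < 65y", "Age 65y+")))

# Counts at each level; the missing Age appears as its own NA row
toy_tibble %>%
  count(elderly_fac)

# Range of Age within each level of the new variable
toy_tibble %>%
  group_by(elderly_fac) %>%
  summarize(min = min(Age),
            max = max(Age))
```

A row with a missing Age matches neither condition in case_when(), so it receives NA in both new variables rather than being silently assigned to one of the two groups.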