1.9 Missing and infinite values

NOTE: Being aware of missing data is extremely important in any data management or statistical analysis task. By default, R often (but not always) lets missing values propagate into your results rather than silently dropping them, because handling them for you could leave you unaware that any values are missing. Be intentional about how you handle missing data.

R stores missing values as NA and you can check if a value is missing using is.na(). If you assign the single value NA to an object, R assumes it is of type logical.

x <- NA
x
## [1] NA
is.na(x)
## [1] TRUE
class(x)
## [1] "logical"
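
One reason to rely on is.na() is that comparing a value to NA with == does not return TRUE or FALSE. The short sketch below reuses the x from the example above to show the difference.

```r
x <- NA
x == NA      # a comparison with NA is itself NA
## [1] NA
is.na(x)     # is.na() gives a definite answer
## [1] TRUE
```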

You can use an as.*() function, such as as.numeric() or as.character(), to create a missing value of a specific type. Even for a character missing value, you do not put quotes around NA.

x <- as.numeric(NA)
x
## [1] NA
is.na(x)
## [1] TRUE
class(x)
## [1] "numeric"
x <- as.character(NA)
x
## [1] NA
is.na(x)
## [1] TRUE
class(x)
## [1] "character"
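
As an alternative to the as. functions, base R also provides typed missing-value constants. The sketch below is an illustration rather than a recommendation of one style over the other; it checks the class of each constant and confirms that NA_character_ matches as.character(NA).

```r
class(NA_real_)
## [1] "numeric"
class(NA_integer_)
## [1] "integer"
class(NA_character_)
## [1] "character"
identical(NA_character_, as.character(NA))
## [1] TRUE
```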

If you include an NA in a vector that has non-missing values, R assumes the NA is of the same type as the others; you do not have to use an as. function in this case.

x <- c(5, NA, 3)
x
## [1]  5 NA  3
is.na(x)
## [1] FALSE  TRUE FALSE
class(x)
## [1] "numeric"
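
To illustrate the note at the start of this section, here is a small sketch using the same vector. By default mean() returns NA when any value is missing, which alerts you to the problem. You can then count the missing values and decide whether removing them with na.rm = TRUE is appropriate.

```r
x <- c(5, NA, 3)
mean(x)                  # the NA propagates to the result
## [1] NA
sum(is.na(x))            # how many values are missing?
## [1] 1
mean(x, na.rm = TRUE)    # explicitly drop missing values
## [1] 4
```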

R represents infinite numbers using Inf or -Inf and you can check if a value is infinite using is.infinite().

x <- Inf
x
## [1] Inf
is.infinite(x)
## [1] TRUE
class(x)
## [1] "numeric"

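A calculation will produce Inf or -Inf when the result is too large in magnitude to be represented as a double-precision floating-point number (the limit is about 1.8e308 on typical systems), or when you divide a nonzero number by zero. In scientific notation, 1e308 is 1 followed by 308 zeros. On my computer, R can handle a number this large, but not 1e309.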

x <- 1e308
x
## [1] 1e+308
is.infinite(x)
## [1] FALSE
class(x)
## [1] "numeric"
x <- 1e309
x
## [1] Inf
is.infinite(x)
## [1] TRUE
class(x)
## [1] "numeric"
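
The exact threshold can be inspected directly. The sketch below assumes a typical platform that uses IEEE 754 double precision, so the printed limit may differ on unusual systems. It also shows that dividing by zero produces an infinite value, and that is.finite() is FALSE for both infinite and missing values.

```r
.Machine$double.xmax          # largest representable double
## [1] 1.797693e+308
.Machine$double.xmax * 10     # exceeding it overflows to Inf
## [1] Inf
1 / 0
## [1] Inf
-1 / 0
## [1] -Inf
is.finite(c(1, Inf, -Inf, NA))
## [1]  TRUE FALSE FALSE FALSE
```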

R uses NaN (“not a number”) for the result of an undefined numeric operation, such as 0/0. If you want to know more, see the article Difference between NA and NaN in R.
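
A brief sketch of how NaN behaves, following the pattern of the examples above. Note that is.na() also returns TRUE for NaN, whereas is.nan() returns FALSE for NA.

```r
x <- 0 / 0
x
## [1] NaN
is.nan(x)
## [1] TRUE
is.na(x)       # NaN also counts as missing for is.na()
## [1] TRUE
is.nan(NA)     # but NA is not NaN
## [1] FALSE
class(x)
## [1] "numeric"
```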