2.2 Data type conversion

One common operation on data is to convert from one data type to another.

One frequently used conversion is between character and numeric. It is needed because numeric data is often read in as text, for example when the values contain commas as thousands separators.

To convert a string to a number, we use as.numeric():

as.numeric("12000")
## [1] 12000
# When converting "12,000", we need to remove the comma first
as.numeric(gsub(",", "", "12,000"))
## [1] 12000

Here, the function gsub() finds every comma and replaces it with an empty string. The name gsub stands for global substitution: unlike sub(), which replaces only the first match, gsub() replaces all matches.
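The difference between sub() and gsub() matters when a value contains more than one comma. The following is a minimal sketch, assuming a hypothetical character vector `amounts` that is not part of the original example:

```r
# Hypothetical example values; the name `amounts` is an assumption for illustration
amounts <- c("1,200", "35,000", "1,250,000")

# sub() replaces only the first comma in each string
sub(",", "", amounts)
## [1] "1200"     "35000"    "1250,000"

# gsub() replaces every comma, so all values convert cleanly
cleaned <- as.numeric(gsub(",", "", amounts))
cleaned
## [1]    1200   35000 1250000
```

Because both gsub() and as.numeric() are vectorized, the same call works on a whole column of values at once.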

To convert a number to a character string, we may use as.character():

as.character(1200)
## [1] "1200"
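To check which type a value has after a conversion, class() can be used. This is a small sketch; the variable names `num_value` and `chr_value` are assumptions chosen for illustration:

```r
# Hypothetical variable names, used only for this illustration
num_value <- as.numeric("12000")
chr_value <- as.character(1200)

class(num_value)
## [1] "numeric"
class(chr_value)
## [1] "character"

# Arithmetic works on the numeric value
num_value + 1
## [1] 12001
```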

2.2.1 Date

The Date data type represents standard calendar dates. It is usually converted from the character data type. One complication is that we need to specify the format, since there are many possible ways to write a date.

For day, we use %d for day number (01-31).

For month, we use %m for month number (01-12), %b for abbreviated month (e.g., Jan), and %B for unabbreviated month (e.g., January).

For year, we use %y for two-digit year (e.g., 14) and %Y for four-digit year (e.g., 2014).

as.Date("21Jan2004", "%d%b%Y")
## [1] "2004-01-21"
as.Date("21/01/04", "%d/%m/%y")
## [1] "2004-01-21"
as.Date("21-01-04", "%d-%m-%y")
## [1] "2004-01-21"
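The same format codes can also be used in the other direction, to write a Date back out as text with format(). The sketch below also shows what happens when the format does not match the string; the variable name `d` is an assumption for illustration:

```r
# Hypothetical variable name `d`; the input string is taken from the examples above
d <- as.Date("21/01/04", "%d/%m/%y")

# format() turns a Date back into text using the same codes
format(d, "%d/%m/%Y")
## [1] "21/01/2004"
format(d, "%y%m%d")
## [1] "040121"

# If the format does not match the string, the result is NA
as.Date("21/01/04", "%d-%m-%y")
## [1] NA
```

Only numeric codes are used here because the month names matched by %b and %B depend on the locale of the R session.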