3.1 Importing data in base R

Base R provides three main functions for importing data:

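Function	Purpose
`read.csv()`	Reads comma-separated files (defaults: `sep = ",", header = TRUE`)
`read.delim()`	Reads tab-separated files (defaults: `sep = "\t", header = TRUE`)
`read.table()`	Reads other delimited text files. By default it splits fields on whitespace and does not treat the first line as a header (`sep = "", header = FALSE`), so columns are named V1, V2, and so on. `read.csv()` and `read.delim()` can be seen as wrappers around `read.table()` with defaults suited to those two cases.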
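To make the wrapper relationship concrete, here is a minimal sketch that reads the same file both ways. The sample rows and the `csv_path` variable are made up for illustration, and the `read.table()` call spells out the defaults that `read.csv()` sets.

```r
# A small comma-separated sample written to a temporary file (hypothetical data).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("type,calories,sodium",
             "Beef,186,495",
             "Poultry,129,430"), csv_path)

# read.csv() with its defaults ...
via_csv <- read.csv(csv_path)

# ... and read.table() with the same settings written out explicitly.
via_table <- read.table(csv_path, header = TRUE, sep = ",", quote = "\"",
                        dec = ".", fill = TRUE, comment.char = "")

identical(via_csv, via_table)
#> [1] TRUE
```

In practice the wrappers are mainly a convenience: they save typing and signal the expected file format to the reader of the code.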

Import functions usually take many arguments. For the three functions above, the most important ones are:

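dec: the character used as the decimal point in the file. Many files use a comma as the decimal point and a semicolon as the field separator, in which case both dec and sep need to be set. read.csv2() and read.delim2() are wrappers that preset dec = "," (with sep = ";" and sep = "\t" respectively).
col.names: column names to use when header = FALSE; by default the columns are named "V1", "V2", ...
row.names: the row names, given either directly as a vector or by picking one column of the data to serve as the row names (a number giving the column's position, or a string giving its name).
colClasses: the data type of each column. This is especially useful when some columns should be read as factors and others as character strings. Common classes are "numeric", "factor", "logical", and "character". A column whose class is given as "NULL" is not read in.
skip: the number of lines to skip before reading data.
nrows: the maximum number of rows to read.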
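The sketch below shows how several of these arguments combine. It assumes a made-up file in which ";" separates fields, "," is the decimal point, and the first line is a note to be skipped. The data, the `path` variable, and the `prices` name are all hypothetical.

```r
# Hypothetical export: one note line, then a header, then three data rows.
lines <- c("exported from a hypothetical spreadsheet",
           "id;price;qty",
           "a;1,5;10",
           "b;2,25;20",
           "c;3,0;30")
path <- tempfile(fileext = ".csv")
writeLines(lines, path)

# read.csv2() presets sep = ";" and dec = ",".
# skip = 1      drops the note line before the header
# nrows = 2     reads only the first two data rows
# row.names = 1 uses the first column ("id") as row names
prices <- read.csv2(path, skip = 1, nrows = 2, row.names = 1)

rownames(prices)
class(prices$price)
```

With these settings the price column should come back as numeric rather than character, since the comma is interpreted as a decimal point.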

We use the local dataset hotdogs.txt to briefly illustrate these functions. It has no header row and its fields are separated by tabs:

## Read the file with each of the three functions
hotdogs1 <- read.table("data/hotdogs.txt", sep = "\t")
head(hotdogs1)
#>     V1  V2  V3
#> 1 Beef 186 495
#> 2 Beef 181 477
#> 3 Beef 176 425
#> 4 Beef 149 322
#> 5 Beef 184 482
#> 6 Beef 190 587

hotdogs2 <- read.csv("data/hotdogs.txt", sep = "\t", header = FALSE)
head(hotdogs2)
#>     V1  V2  V3
#> 1 Beef 186 495
#> 2 Beef 181 477
#> 3 Beef 176 425
#> 4 Beef 149 322
#> 5 Beef 184 482
#> 6 Beef 190 587

hotdogs3 <- read.delim("data/hotdogs.txt", header = FALSE)
head(hotdogs3)
#>     V1  V2  V3
#> 1 Beef 186 495
#> 2 Beef 181 477
#> 3 Beef 176 425
#> 4 Beef 149 322
#> 5 Beef 184 482
#> 6 Beef 190 587

Setting col.names and colClasses:

hotdogs <- read.delim("data/hotdogs.txt",
                      header = FALSE,
                      col.names = c("type", "calories", "sodium"),
                      colClasses = c("factor", "NULL", "numeric"))
head(hotdogs)
#>   type sodium
#> 1 Beef    495
#> 2 Beef    477
#> 3 Beef    425
#> 4 Beef    322
#> 5 Beef    482
#> 6 Beef    587
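
For readers without data/hotdogs.txt at hand, the following self-contained sketch reproduces the idea on three rows copied from the output above and then checks the resulting column classes. The temporary file and the `tmp` and `demo` names are assumptions made for illustration.

```r
# Three tab-separated rows, written to a temporary file.
tmp <- tempfile(fileext = ".txt")
writeLines(c("Beef\t186\t495",
             "Beef\t181\t477",
             "Beef\t176\t425"), tmp)

# Same arguments as above: name all three columns, drop the second one.
demo <- read.delim(tmp,
                   header = FALSE,
                   col.names = c("type", "calories", "sodium"),
                   colClasses = c("factor", "NULL", "numeric"))

# Inspect the class of each remaining column.
sapply(demo, class)
```

The calories column is skipped because its class is "NULL", so only type (a factor) and sodium (numeric) remain.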