Importing Data File
To read in tabular data (e.g., a .txt file), use read.table().
# read.table(file, header = FALSE, sep = "", ...)
data <- read.table("path/datafile.txt", header = TRUE, sep = "\t") # a relative path is resolved against the working directory
Some important arguments are:
- header : logical. Does the first row contain column labels?
- sep : the field separator character, which can be:
  - "" : any whitespace (default)
  - "\t" : tab-delimited
  - "," : comma-separated
* Run ?read.table to see the full list of arguments for the function.
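To see how header and sep interact, the sketch below writes a small tab-delimited file to a temporary location and reads it back. The file contents and the names tmp, demo, and demo_nohdr are made up for illustration; they are not the dataset used in this section.

```r
# Hypothetical example data, written to a temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("ID\tgroup\tscore1",
             "1\t1\t35",
             "2\t1\t23",
             "3\t2\t39"), tmp)

# header = TRUE: the first row holds the column names
# sep = "\t": fields are separated by tabs
demo <- read.table(tmp, header = TRUE, sep = "\t")
str(demo)

# With header = FALSE the label row is read as data, so the
# columns are no longer numeric and are named V1, V2, V3
demo_nohdr <- read.table(tmp, header = FALSE, sep = "\t")
names(demo_nohdr)

unlink(tmp)  # remove the temporary file
```

Comparing str(demo) with str(demo_nohdr) is a quick way to notice that a header row was read as data by mistake.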
If the data is in .csv format, use read.csv().
# read.csv(file, header = TRUE, sep = ",")
data <- read.csv("exampledata.csv", header = TRUE)
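As the signature above suggests, read.csv() behaves like read.table() with header = TRUE and sep = "," preset. A minimal sketch, assuming a small made-up file (tmp, from_csv, and from_table are illustrative names):

```r
# Hypothetical comma-separated file in a temporary location
tmp <- tempfile(fileext = ".csv")
writeLines(c("ID,group,score1",
             "1,1,35",
             "2,2,39"), tmp)

# header = TRUE and sep = "," are the defaults of read.csv()
from_csv <- read.csv(tmp)

# The same file read with read.table() and explicit arguments
from_table <- read.table(tmp, header = TRUE, sep = ",")

# Compare the two results
all.equal(from_csv, from_table)

unlink(tmp)  # remove the temporary file
```

For simple files like this one the two calls should give the same data frame; the functions also differ in other defaults (for example quote, fill, and comment.char), which can matter for messier files.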
To import data in other formats (such as SPSS data files), use functions in the foreign package.
library(foreign)
data <- read.spss("path/datafile.sav", to.data.frame = TRUE)
After importing the data file, you can check whether you have correctly read in the data. Below are some example functions to use.
head(data) # first 6 rows of the data
## ID group score1 score2
## 1 1 1 35 45
## 2 2 1 23 14
## 3 3 1 14 26
## 4 4 1 17 25
## 5 5 1 23 27
## 6 6 1 35 47
data[1:10, ] # extract the first 10 rows
## ID group score1 score2
## 1 1 1 35 45
## 2 2 1 23 14
## 3 3 1 14 26
## 4 4 1 17 25
## 5 5 1 23 27
## 6 6 1 35 47
## 7 7 1 27 37
## 8 8 1 33 50
## 9 9 1 32 15
## 10 10 1 31 37
tail(data) # last 6 rows of the data
## ID group score1 score2
## 15 15 2 39 37
## 16 16 2 45 41
## 17 17 2 31 25
## 18 18 2 40 17
## 19 19 2 25 15
## 20 20 2 32 27
str(data) # structure of the data frame
## 'data.frame': 20 obs. of 4 variables:
## $ ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ group : int 1 1 1 1 1 1 1 1 1 1 ...
## $ score1: int 35 23 14 17 23 35 27 33 32 31 ...
## $ score2: int 45 14 26 25 27 47 37 50 15 37 ...
summary(data) # summary of data
## ID group score1 score2
## Min. : 1.00 Min. :1.0 Min. :14.00 Min. :14
## 1st Qu.: 5.75 1st Qu.:1.0 1st Qu.:26.50 1st Qu.:23
## Median :10.50 Median :1.5 Median :32.00 Median :27
## Mean :10.50 Mean :1.5 Mean :31.50 Mean :31
## 3rd Qu.:15.25 3rd Qu.:2.0 3rd Qu.:35.25 3rd Qu.:42
## Max. :20.00 Max. :2.0 Max. :51.00 Max. :50
attributes(data) # attributes of object
## $names
## [1] "ID" "group" "score1" "score2"
##
## $class
## [1] "data.frame"
##
## $row.names
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
names(data) # variable names
## [1] "ID" "group" "score1" "score2"
dim(data) # dimension of data (number of rows and columns)
## [1] 20 4
dimnames(data) # row and column names
## [[1]]
## [1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15"
## [16] "16" "17" "18" "19" "20"
##
## [[2]]
## [1] "ID" "group" "score1" "score2"
nrow(data) # number of rows
## [1] 20
ncol(data) # number of columns
## [1] 4
class(data) # class of object
## [1] "data.frame"
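The checks above can also be combined into an automated test that stops a script when an import does not look as expected. The following is only a sketch: check_import is a hypothetical helper, not a base R function, and demo stands in for an imported data frame.

```r
# Hypothetical helper: stop early if an imported data frame
# does not have the expected column names and number of rows
check_import <- function(df, expected_names, expected_rows) {
  stopifnot(
    is.data.frame(df),
    identical(names(df), expected_names),
    nrow(df) == expected_rows,
    !any(is.na(df))  # assumption: no missing values are expected
  )
  invisible(TRUE)
}

# Stand-in for an imported file (values copied from the first three rows above)
demo <- data.frame(ID     = 1:3,
                   group  = c(1L, 1L, 1L),
                   score1 = c(35L, 23L, 14L),
                   score2 = c(45L, 14L, 26L))

# Passes silently; a mismatch would raise an error instead
check_import(demo, c("ID", "group", "score1", "score2"), 3)
```

Whether missing values should count as an error depends on the dataset, so that line may need to be dropped or relaxed.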