3 Loading Data in R

A data set can be imported directly, or it can be entered manually in R and saved as an R data file. Let's see how to enter and save data manually and how to import different data formats in RStudio.

3.1 Entering Data in R

We can start working in R right away by typing the data in directly. To enter numerical data manually, use the c command (short for ‘combine’), which joins its arguments into a single vector.

  age <- c(45, 23, 36, 29)
  

Similarly, categorical (character) data can be entered by wrapping each value in quotation marks.

  gpa <- c("A+", "A", "B+", "B")
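
The introduction mentions saving manually entered data as an R data file. A minimal sketch of one way to do that is shown here, assuming the two vectors above describe the same four people. The names students and rds_path are illustrative, and the file is written to a temporary directory rather than to your project folder.

```r
# Re-create the two vectors entered above
age <- c(45, 23, 36, 29)
gpa <- c("A+", "A", "B+", "B")

# Combine the vectors into one data frame (one row per person)
students <- data.frame(age = age, gpa = gpa)

# Save the data frame as an R data file (.rds) in a temporary directory
rds_path <- file.path(tempdir(), "students.rds")
saveRDS(students, rds_path)

# Read the file back in and inspect its structure
restored <- readRDS(rds_path)
str(restored)
```

saveRDS stores a single object, which you can assign to any name when reading it back with readRDS.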
  

3.2 Importing CSV file

The read family of functions in R is used to read data files. To read a CSV file, place it in the working directory (or give its path relative to the working directory) and load it with the read.csv command. read.csv is part of base R, so no extra package is required; the readr package, loaded below, is only needed if you prefer its read_csv function.

  library(readr)  # optional: only needed for read_csv()
#> Warning: package 'readr' was built under R version 4.2.2
  csv1 <- read.csv("exampledata/survey1.csv")
  
  #To view the structure
  str(csv1)
#> 'data.frame':    100 obs. of  28 variables:
#>  $ sex         : int  0 0 0 0 0 0 0 0 1 1 ...
#>  $ height      : num  67 67 67 67 73 73 73 73 70 70 ...
#>  $ shoesize    : num  9.5 9.5 9.5 9.5 13 13 13 13 9 9 ...
#>  $ smoker      : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ handed      : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ mothand     : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ fathhand    : int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ hairappt    : int  13 13 13 13 10 10 10 10 20 20 ...
#>  $ songs       : int  20 20 20 20 10 10 10 10 25 25 ...
#>  $ gpa_grade   : int  4 4 4 4 1 1 1 1 2 2 ...
#>  $ gpa         : num  4 4 4 4 3 3 3 3 3.6 3.6 ...
#>  $ exercise    : int  3 3 3 3 6 6 6 6 4 4 ...
#>  $ exercise_cat: int  3 3 3 3 4 4 4 4 3 3 ...
#>  $ polview     : int  4 4 4 4 2 2 2 2 3 3 ...
#>  $ tv          : int  6 6 6 6 15 15 15 15 8 8 ...
#>  $ coffee      : int  0 0 0 0 0 0 0 0 0 0 ...
#>  $ sleep       : num  9 9 9 9 6 6 6 6 6 6 ...
#>  $ drinks      : num  0.5 0.5 0.5 0.5 0 0 0 0 5 5 ...
#>  $ pepsicok    : int  1 1 1 1 0 0 0 0 1 1 ...
#>  $ haircol     : int  3 3 3 3 1 1 1 1 3 3 ...
#>  $ eyecolor    : int  3 3 3 3 3 3 3 3 2 2 ...
#>  $ distance    : num  1 1 1 1 60 60 60 60 80 80 ...
#>  $ distance_cat: int  1 1 1 1 2 2 2 2 3 3 ...
#>  $ books       : int  0 0 0 0 0 0 0 0 1 1 ...
#>  $ studyhrs    : int  3 3 3 3 3 3 3 3 4 4 ...
#>  $ studyhrs_cat: int  1 1 1 1 1 1 1 1 1 1 ...
#>  $ mostint     : int  16 16 16 16 10 10 10 10 6 6 ...
#>  $ leastint    : int  8 8 8 8 13 13 13 13 3 3 ...

Here, csv1 is the name assigned to the data frame that holds the contents of the CSV file in the R environment. Use this same name whenever you want to work with the data you imported.
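
If you do not have survey1.csv at hand, the same workflow can be tried on a small file you create yourself. The sketch below is only an illustration: the data frame demo, its columns, and the file name demo_survey.csv are made up for this example, and the file is written to a temporary directory.

```r
# A tiny made-up data set (not the survey data used above)
demo <- data.frame(height = c(67, 73, 70), smoker = c(0L, 0L, 1L))

# Write it to a CSV file in a temporary directory
csv_path <- file.path(tempdir(), "demo_survey.csv")
write.csv(demo, csv_path, row.names = FALSE)

# Check that the file is where we expect before reading it
stopifnot(file.exists(csv_path))

# Import the CSV file with base R and inspect the result
csv_demo <- read.csv(csv_path)
str(csv_demo)
head(csv_demo)
```

Running getwd() shows the current working directory, which is where R looks for relative paths such as "exampledata/survey1.csv".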

3.3 Importing SPSS and STATA file

R also has a package called haven, which makes it easy to read SPSS and Stata data files. After installing the haven package, we use the read_sav command to import an SPSS file and the read_dta command to import a Stata file.

  #Install package
  install.packages('haven')
  
  #Load the package and read SPSS data file
  
  library(haven)
  savdata1 <- read_sav('ancova.sav')
  
  #To verify the file has been imported successfully.
  savdata1
  
  #Load the package and read STATA data file
  
  library(haven)
  dtadata1 <- read_dta('ancovastata.dta')
  
  #To verify the file has been imported successfully.
  dtadata1
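
The files ancova.sav and ancovastata.dta may not be available on your machine, so here is a self-contained sketch of the same read_sav and read_dta calls. It assumes the haven package is installed; the data frame demo and the file names are invented for illustration, and the files are first written with haven's write_sav and write_dta so that there is something to read.

```r
# Run only if haven is installed
if (requireNamespace("haven", quietly = TRUE)) {
  # A tiny made-up data set
  demo <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

  # Write temporary SPSS and Stata files to read back in
  sav_path <- file.path(tempdir(), "demo.sav")
  dta_path <- file.path(tempdir(), "demo.dta")
  haven::write_sav(demo, sav_path)
  haven::write_dta(demo, dta_path)

  # Import them the same way as in the example above
  sav_demo <- haven::read_sav(sav_path)
  dta_demo <- haven::read_dta(dta_path)

  # Printing the objects verifies that the import worked
  print(sav_demo)
  print(dta_demo)
}
```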
  

3.4 Importing Excel File

The readxl package is used to read Excel files (.xls and .xlsx) into the R environment.

  #Install package
  install.packages('readxl')
  
  #Load the package and read data
  
  library(readxl)
  xlsdata1 <- read_excel('C:\\Users\\para\\Downloads\\ancova.xls')
  
  #To verify the file has been imported successfully.
  xlsdata1
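
Excel workbooks often contain several sheets. As a rough sketch, assuming the readxl package is installed, the example workbook that ships with readxl can be used to list the sheets and read one of them by name. The object names xlsx_path, sheets, and cars_sheet are chosen here only for illustration.

```r
# Run only if readxl is installed
if (requireNamespace("readxl", quietly = TRUE)) {
  # Path to an example workbook bundled with the readxl package
  xlsx_path <- readxl::readxl_example("datasets.xlsx")

  # List the sheets available in the workbook
  sheets <- readxl::excel_sheets(xlsx_path)
  print(sheets)

  # Read one sheet by name instead of the default first sheet
  cars_sheet <- readxl::read_excel(xlsx_path, sheet = "mtcars")
  print(head(cars_sheet))
}
```

On Windows, a path can also be written with forward slashes, which avoids doubling every backslash.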
  

R has comprehensive packages for importing data from multiple statistical systems. Some packages include foreign, readdta1 etc. To learn more, see the R Data Import/Export manual.
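
As an illustration of the foreign package mentioned above, the sketch below writes a small Stata file and reads it back with read.dta. It assumes foreign is installed (it is normally bundled with R); the data frame demo and the file name are made up. Note that read.dta only supports older Stata file versions, which is one reason haven is often preferred for recent files.

```r
# Run only if foreign is installed
if (requireNamespace("foreign", quietly = TRUE)) {
  # A tiny made-up data set
  demo <- data.frame(id = 1:3, score = c(2.5, 3.0, 4.5))

  # Write an old-format Stata file to a temporary directory
  old_dta_path <- file.path(tempdir(), "demo_foreign.dta")
  foreign::write.dta(demo, old_dta_path)

  # Read it back; the result is an ordinary data frame
  dta_old <- foreign::read.dta(old_dta_path)
  str(dta_old)
}
```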