Chapter 4 Packages and Datasets
R packages are collections of functions and datasets developed by the R community. They extend what R can do.
4.1 Installing a Package
If the package you want is not listed under the “Packages” tab on the 4th panel, click “Install” on that tab. A new window will ask for the name of the package you want to install. Enter the package name and click “Install,” then wait for the installation to finish. Sometimes it takes a while.
Alternatively, you can install a package by running the function install.packages("package_name") in the Console or from a script in the Source panel. Note the plural in the function name and the quotation marks around the package name. Wait for the installation to finish.
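The install step can be wrapped so that it only runs when a package is actually missing. The sketch below is one possible way to do this; the helper name install_if_missing is hypothetical and not part of R, and "stats" is used only because it ships with every R installation.

```r
# Hypothetical helper (not part of R): install a package only if it is missing.
install_if_missing <- function(pkg) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  install.packages(pkg)       # downloads from CRAN; needs an internet connection
  invisible(TRUE)
}

# "stats" ships with R, so this call returns without downloading anything.
install_if_missing("stats")
```

Replacing "stats" with the name of the package you need would trigger a download only the first time.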
4.2 Loading a Package
If a package is already installed, there are two ways to load it. One way is to run the function library(package_name) in the Source panel. If you get an error message that says there is no package called ‘package_name’, the package is not installed. Install it first by following the instructions in “Installing a Package,” and then load it.
Alternatively, you can load a package directly from the “Packages” tab on the 4th panel. Scroll down until you see the package you want and check the box to its left. If you do not see the package, it has not been installed; follow the “Installing a Package” instructions. Once the package has been installed successfully, check the box beside it to load it.
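As a small illustration of the library() route, the sketch below loads a package that ships with R and then shows what happens when a package is not installed. The name aPackageThatIsNotInstalled is made up for the example and is assumed not to exist on your computer.

```r
# Load a package that ships with R and confirm that it is attached.
library(stats)
print("package:stats" %in% search())

# Try to load a package that is not installed and capture the error message.
# "aPackageThatIsNotInstalled" is a made-up name used only for illustration.
msg <- tryCatch(
  {
    library("aPackageThatIsNotInstalled", character.only = TRUE)
    "loaded"
  },
  error = function(e) conditionMessage(e)
)
print(msg)  # the "no package called" message described above
```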
4.3 Setting Working Directory
The working directory is the folder where R looks for the datasets you import and saves the files you write, unless you give a full path. It is your default file path. The function to set your working directory is setwd("PATH").
- Sample script if you have your files in Dropbox: setwd("/home/Deanna/Dropbox/projects/Math211")
- Sample script if you have your files on the C drive of your PC: setwd("C:/Documents and Settings/Deanna/Math211")
- Sample script if you have your files on the desktop of a Mac: setwd("~/Desktop")
To make sure you have the right working directory, run the function getwd(). Note that the function takes no arguments. Look at the result and make sure it is the directory you want.
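The following sketch shows setwd() and getwd() working together. It uses R's temporary folder, tempdir(), as a stand-in for your own project folder (an assumption made so the example runs on any computer), and it restores the original working directory at the end.

```r
old_wd <- getwd()       # remember the current working directory

demo_dir <- tempdir()   # stand-in for your own folder; it always exists
setwd(demo_dir)         # change the working directory

new_wd <- getwd()       # check where R is now working
print(new_wd)

setwd(old_wd)           # go back to the original working directory
```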
4.4 Datasets Built into R
R has many built-in datasets. On the 4th panel is a tab called “Packages.” It lists several packages that contain datasets, such as boot, datasets and MASS. Click any of those packages and you will be taken to a page under the “Help” tab that shows the datasets available. Click a particular dataset to view its documentation. To use any of the datasets, be sure there is a check mark beside its package.
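The same information is available from code. As a brief sketch, the lines below list the datasets in the datasets package and look at one of them; mtcars is chosen only as an example.

```r
# Names of the datasets that come with the "datasets" package.
available <- data(package = "datasets")$results[, "Item"]
print(head(available))

# The datasets package is loaded by default, so mtcars can be used directly.
print(dim(mtcars))   # number of rows and number of columns

# ?mtcars            # opens the documentation for mtcars in the Help tab
```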
4.5 Using “Import Dataset” in RStudio
RStudio lets you import a dataset in a few clicks. First, go to the 4th panel’s “Files” tab. Click “Upload” and upload the file you want. If the upload is successful, you should see the file appear under the “Files” tab.
Then go to the Environment (3rd) panel. You will see a dropdown button called “Import Dataset.” Its menu lets you choose the kind of file you want to import: a text file, an Excel file or a CSV file. Enter the file location and select the file; its contents will appear in the Data Preview area. If the preview looks correct, click “Import.”
You should now see the dataset in the Environment panel under the Data heading. To the right of its name is a note giving the number of observations and the number of variables in the dataset. In the Source (1st) panel, you will see a new tab with the name of the imported dataset. Click that tab to view the dataset.
4.6 Reading Files
CSV Files
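If you are importing a CSV (Comma Separated Values) file, use the function read.csv("filename.csv"). Assign the result to a name, for example mydata <- read.csv("filename.csv"), so that you can work with the data afterwards.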
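To try read.csv() without a file of your own, the sketch below first writes a tiny CSV file to R's temporary folder and then reads it back. The file name students.csv and its contents are invented for the example.

```r
# Create a small example file (invented data, for illustration only).
csv_path <- file.path(tempdir(), "students.csv")
students <- data.frame(
  name  = c("Ana", "Ben", "Cho"),
  score = c(90, 85, 77)
)
write.csv(students, csv_path, row.names = FALSE)

# Read the CSV file back into R and inspect its structure.
students_in <- read.csv(csv_path)
str(students_in)
```

With a real file in your working directory, the path can be replaced by the file name alone.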
Excel Files
Be sure that the package called readxl is loaded. Use the function read_excel("filename") if the file is in your working directory or appears in the Files tab on the 4th panel.
- To import xls files: read_excel("filename.xls")
- To import xlsx files: read_excel("filename.xlsx")
- To import a particular sheet of an xls file by its number (for example, sheet number 2): read_excel("filename.xls", sheet = 2)
- To import a particular sheet of an xlsx file by its name (for example, a sheet named students): read_excel("filename.xlsx", sheet = "students")
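The sheet argument can be tried out on the example workbook that comes with readxl. This is only a sketch: it assumes readxl is installed and skips itself otherwise, and it uses the bundled file datasets.xlsx in place of a file of your own.

```r
# Runs only if readxl is installed; otherwise nothing happens.
if (requireNamespace("readxl", quietly = TRUE)) {
  library(readxl)

  # Path to an example workbook that ships with readxl.
  path <- readxl_example("datasets.xlsx")

  # List the sheet names in the workbook.
  sheets <- excel_sheets(path)
  print(sheets)

  # Import the second sheet by number, then the same sheet by name.
  by_number <- read_excel(path, sheet = 2)
  by_name   <- read_excel(path, sheet = sheets[2])

  # Both calls read the same sheet, so the results match.
  print(identical(by_number, by_name))
}
```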
There are several ways to view your imported dataset. In the functions below, filename stands for the name you gave the imported dataset in R, not the name of the file on disk.
- View(filename) - shows the whole dataset in a separate tab in the Source panel
- head(filename) - shows the first 6 rows of the dataset in the Console panel
- tail(filename) - shows the last 6 rows of the dataset in the Console panel
- filename - prints the whole dataset in the Console panel
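These viewing functions can be tried on any dataset. The sketch below uses the built-in mtcars dataset as a stand-in for an imported file.

```r
first_rows <- head(mtcars)   # first 6 rows
last_rows  <- tail(mtcars)   # last 6 rows
print(first_rows)
print(last_rows)

# An optional second argument changes how many rows are shown.
print(head(mtcars, n = 3))

# View() opens a spreadsheet-style tab, so call it only in an interactive session.
if (interactive()) View(mtcars)
```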