Chapter 2 Prerequisites and Info

To get started, please note that each chapter of this Bookdown page can be run independently. Chapter 3 covers genotypic data filtering and generates one of the main files used throughout the workshop. Various sub-sections also cover data wrangling and subsetting from larger files; they are there mostly as a resource to give users ideas for formatting their own data. For ready-to-use files, please feel free to download the /data repo available at my GitHub page: https://github.com/jrod55/PGRP_mapping_workshop_data

These exercises assume R (R Core Team 2021) and RStudio (RStudio Team 2020) are already installed. If you have not done so, a site which gives quick and easy-to-follow instructions for installing R and RStudio can be found here

The R packages below are required to run all analyses in this workshop. In addition to the R packages, we use one other software tool, TASSEL (Bradbury et al. 2007). More details about downloading TASSEL can be found on the developers' site: https://www.maizegenetics.net/tassel. We will only use TASSEL for a filtering step, so it is not absolutely necessary to complete the workshop.

## These packages can be installed from CRAN
## (package names are case-sensitive)
install.packages(c("data.table", "tidyverse", "rrBLUP",
                   "simplePHENOTYPES", "rMVP", "lme4", "pheatmap",
                   "scales", "cowplot", "CMplot"))

## PCAtools and rtracklayer need to be installed from Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("PCAtools", "rtracklayer"))
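After running the installation commands, it can help to confirm that every package is actually available before starting the exercises. The following is a minimal sketch of one way to do that; the helper name `missing_packages` and the vector `workshop_pkgs` are hypothetical and are not part of the workshop code. It relies only on base R's `requireNamespace()`, which returns `FALSE` when a package cannot be loaded.

```r
## Hypothetical helper, not part of the original workshop code:
## returns the packages in `pkgs` whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unique(pkgs[!is_installed])
}

## Packages used in the workshop (names are case-sensitive)
workshop_pkgs <- c("data.table", "tidyverse", "rrBLUP", "simplePHENOTYPES",
                   "rMVP", "lme4", "pheatmap", "rtracklayer", "scales",
                   "cowplot", "CMplot", "PCAtools")

still_missing <- missing_packages(workshop_pkgs)

if (length(still_missing) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Still missing: ", paste(still_missing, collapse = ", "))
}
```

Any package reported as missing can then be installed individually, which is usually quicker than re-running the full installation.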

If any of the packages fail to install through CRAN, you can go to the package's CRAN web page, download the package source, and attempt to install from source using the command below, where ‘path_to_file’ represents the full path and file name of the downloaded package.

install.packages(path_to_file, repos = NULL, type="source")

## Example for rrBLUP
install.packages("~/Downloads/rrBLUP_4.6.1.tar.gz",repos = NULL, type = "source")

Additionally, many packages are available through both CRAN and Bioconductor. If the CRAN options fail, you can try installing the package through Bioconductor, using the same syntax as shown in the commands above for the PCAtools package.

Throughout the exercises, we will change the working directory in R to access the files needed. The path syntax differs depending on the operating system you use. The code below shows how to do this on a Linux/Mac machine and on a Windows machine. The exercises were put together on a machine running Ubuntu, so you may need to adapt the paths to the operating system you are using:

## On Linux/Mac OS
setwd("~/path_to_download/PGRP_mapping_workshop/")

## If you are on a Windows machine, it will look something like:
setwd("C:/path_to_download/PGRP_mapping_workshop/")

## If you have trouble, type:
getwd() # Shows you which directory you are currently in
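One way to reduce operating-system differences is to build the path with `file.path()` and check that the folder exists before calling `setwd()`. This is only a sketch: the variable name `workshop_dir` is hypothetical, and the folder location is the same placeholder used above, which you would need to edit for your own machine.

```r
## Hypothetical location of the downloaded workshop folder;
## edit the pieces of the path to match your machine.
workshop_dir <- file.path("~", "path_to_download", "PGRP_mapping_workshop")

## path.expand() replaces "~" with the full path of your home directory
workshop_dir <- path.expand(workshop_dir)

if (dir.exists(workshop_dir)) {
  setwd(workshop_dir)   # move into the workshop folder
  list.files()          # show the files it contains
} else {
  message("Directory not found: ", workshop_dir)
}
```

Because `file.path()` joins the pieces with forward slashes, which R accepts on Linux, Mac, and Windows, the same lines can be reused across machines once the placeholder folder names are replaced.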

References

Bradbury, Peter J., Zhiwu Zhang, Dallas E. Kroon, Terry M. Casstevens, Yogesh Ramdoss, and Edward S. Buckler. 2007. “TASSEL: software for association mapping of complex traits in diverse samples.” Bioinformatics 23 (19): 2633–35. https://doi.org/10.1093/bioinformatics/btm308.
R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
RStudio Team. 2020. RStudio: Integrated Development Environment for R. Boston, MA: RStudio, PBC. http://www.rstudio.com/.