4.4 R exercise Week 1
R Notebook: Create a new R Notebook for each weekly exercise. Watch the course video “Week 0: Intro to R Notebooks” as needed. At the end, “knit” it to an html file and view it in your browser.
Good news: if you can knit the file, the code can stand by itself (it does not depend on what you did in your R session before) and runs without errors. This is a good check. If there are error messages, check the ‘R Markdown’ tab for the code line number and try to fix the problem.
Task: Import the data set pulsatilla_genotypes, which we’ll use in a later lab (Week 14), into gstudio and convert it to a genind object.
Data: This file contains microsatellite data for adults and seeds of the herb Pulsatilla vulgaris sampled at seven sites. Reference: DiLeo et al. (2018), Journal of Ecology 106:2242-2255. https://besjournals.onlinelibrary.wiley.com/doi/10.1111/1365-2745.12992
The following code copies the file into the ‘downloads’ folder in your R project folder:
file.copy(system.file("extdata", "pulsatilla_genotypes.csv", package = "LandGenCourse"),
          paste0(here(), "/downloads/pulsatilla_genotypes.csv"), overwrite=FALSE)
## [1] FALSE
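A note on the FALSE in the output: file.copy returns FALSE whenever nothing was copied. With overwrite=FALSE this happens if the destination file already exists, and it also happens if the destination folder is missing. The following is a minimal sketch of both cases in a temporary folder; the file and folder names are made up for illustration and are not part of the course project.

```r
# Toy files in a temporary location (hypothetical names, not the course data)
src <- tempfile(fileext = ".csv")
writeLines("ID,OffID,Population", src)
dest_dir <- tempfile("downloads_demo")   # this folder does not exist yet
dest <- file.path(dest_dir, "copy.csv")

# Case 1: the destination folder is missing, so nothing is copied
first <- suppressWarnings(file.copy(src, dest, overwrite = FALSE))

# Case 2: after creating the folder, the copy succeeds
dir.create(dest_dir)
second <- file.copy(src, dest, overwrite = FALSE)

# Case 3: the file now exists and overwrite = FALSE, so nothing is copied
third <- file.copy(src, dest, overwrite = FALSE)

c(first = first, second = second, third = third)
```

In your own project, calling file.exists on the destination path tells the two cases apart. If the ‘downloads’ folder is missing, create it with dir.create before copying.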
Variables:
- ID: Family ID (i.e., mother and her offspring have the same ID)
- OffID: ‘0’ for adults, seeds from the same mom are numbered 1, 2, etc.
- Population: site ID
- Coordinates: X and Y coordinates (Projection info: EPSG Projection 31468)
- Loci: seven diploid microsatellites, each with two columns (1 allele per column)
Hints:
- Load packages: Make sure the following packages are loaded: gstudio, here, tibble and adegenet.
- View data file: Adapt the code from section 2.c to import the raw data set. The file has a header row. View it. How are the genetic data coded?
- Import data into gstudio: Adapt the code from section 5.b to import the genetic data with ‘gstudio’. The loci are in columns 6 - 19. What setting for type is appropriate for this data set? Check the help file for read_population.
- Check imported data: Use str or as_tibble to check the imported data. Does each variable have the correct data type? Note: there should be 7 variables of type locus.
- Check variable types: Create a bulleted list, like the one above, with the variables. For each variable, list its R data type (e.g., numeric, integer, character, logical, factor, locus). Check the cheatsheet “R markdown language” as needed.
- Convert to genind object: Modify the code from section 5.e to convert the data from gstudio to a genind object. Ignore the warning about duplicate labels. Print a summary of the genind object.
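The mechanics of the import and conversion steps can be tried on a small made-up data set before working with the real file. The sketch below is an illustration only, not the course solution: the toy data frame, its column names and the object names (toy, toy_path, toy.gstudio, toy.genind) are assumptions, and the argument values are chosen to match how the toy file is coded (two columns per locus, one allele per column). Check the help files for read_population and df2genind before adapting it.

```r
library(gstudio)
library(adegenet)

# Made-up toy data: two diploid loci, each coded in two columns
toy <- data.frame(
  ID         = c("A1", "A2", "B1", "B2"),
  Population = c("A", "A", "B", "B"),
  Loc1_a     = c(101, 101, 103, 105),
  Loc1_b     = c(103, 101, 105, 105),
  Loc2_a     = c(200, 202, 200, 204),
  Loc2_b     = c(202, 202, 204, 204)
)
toy_path <- tempfile(fileext = ".csv")
write.csv(toy, toy_path, row.names = FALSE)

# Import with gstudio: columns 3 to 6 hold the alleles of the two loci
toy.gstudio <- read_population(path = toy_path, type = "column",
                               locus.columns = 3:6, header = TRUE, sep = ",")
str(toy.gstudio)

# Identify the columns of class 'locus' instead of hard-coding positions
is_locus <- vapply(toy.gstudio, inherits, logical(1), what = "locus")

# Convert to a genind object; gstudio separates alleles with ':'
toy.genind <- df2genind(X = toy.gstudio[, is_locus], sep = ":",
                        ind.names = toy.gstudio$ID,
                        pop = toy.gstudio$Population,
                        ploidy = 2, NA.char = "", type = "codom")
summary(toy.genind)
```

Selecting the locus columns by class avoids depending on where read_population places them in the resulting data frame. Unlike the toy data, the real file repeats each family ID across a mother and her seeds, which is the source of the warning about duplicate labels.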
Question: What is the range of the number of genotyped individuals per population in this dataset?
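One way to approach this kind of question is to tabulate the population labels and take the range of the counts. The sketch below uses a made-up vector of population labels (an assumption for illustration, not the course data); for a genind object, the labels could be extracted with adegenet's pop function instead.

```r
# Made-up population labels for nine individuals (not the course data)
pop_labels <- c("A", "A", "A", "B", "B", "C", "C", "C", "C")

# Number of individuals per population
n_per_pop <- table(pop_labels)
n_per_pop

# Smallest and largest group size
size_range <- range(n_per_pop)
size_range
```

For the toy labels, the group sizes are 3, 2 and 4, so the range is 2 to 4.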