4.4 R exercise Week 1

R Notebook: Create a new R Notebook for each weekly exercise. Watch the course video “Week 0: Intro to R Notebooks” as needed. At the end, “knit” it to an HTML file and view it in your browser.

Good news: if you can knit the file, the code stands on its own (it does not depend on what you did earlier in your R session) and runs without errors. This is a good check. If there are error messages, check the ‘R Markdown’ tab for the code line number and try to fix the problem.

Task: Import the data set pulsatilla_genotypes, which we’ll use in a later lab (Week 14), into gstudio and convert it to a genind object.

Data: This file contains microsatellite data for adults and seeds of the herb Pulsatilla vulgaris sampled at seven sites. Reference: DiLeo et al. (2018), Journal of Ecology 106:2242-2255. https://besjournals.onlinelibrary.wiley.com/doi/10.1111/1365-2745.12992

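The following code copies the file into the ‘downloads’ folder in your R project folder. Because overwrite=FALSE, file.copy returns FALSE, as in the output below, when the file already exists in that folder. It also returns FALSE if the copy fails (e.g., if the ‘downloads’ folder does not exist):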

file.copy(system.file("extdata", "pulsatilla_genotypes.csv", package = "LandGenCourse"),
          paste0(here(), "/downloads/pulsatilla_genotypes.csv"), overwrite=FALSE)
## [1] FALSE
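
If you are unsure whether FALSE means "already copied" or "copy failed", it can help to make sure the target folder exists before copying. The following is a minimal sketch in base R, not part of the course code: `ensure_dir` is a hypothetical helper, and the example runs in a temporary folder so that it does not touch your project. In the exercise you would pass the path of your own ‘downloads’ folder instead.

```r
# Hypothetical helper (an assumption for illustration, not from LandGenCourse):
# create a folder if it is missing and report whether it exists afterwards.
ensure_dir <- function(path) {
  if (!dir.exists(path)) dir.create(path, recursive = TRUE)
  dir.exists(path)
}

# Demonstration in a unique temporary folder.
demo_dir <- tempfile("downloads_")
dir_ok <- ensure_dir(demo_dir)

# Write a small stand-in file and copy it twice with overwrite = FALSE.
src <- tempfile(fileext = ".csv")
writeLines("ID,OffID", src)
dest <- file.path(demo_dir, "example.csv")

first  <- file.copy(src, dest, overwrite = FALSE)  # file is copied
second <- file.copy(src, dest, overwrite = FALSE)  # file exists, not replaced

print(c(first = first, second = second))
```

The second call returns FALSE only because the file is already in place, which is harmless. `file.exists(dest)` tells the two cases apart.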

Variables:

  • ID: Family ID (i.e., mother and her offspring have the same ID)
  • OffID: ‘0’ for adults; seeds from the same mother are numbered 1, 2, etc.
  • Population: site ID
  • Coordinates: X and Y coordinates (Projection info: EPSG Projection 31468)
  • Loci: seven diploid microsatellites, each with two columns (1 allele per column)

Hints:

  1. Load packages: Make sure the following packages are loaded: gstudio, here, tibble and adegenet.
  2. View data file: Adapt the code from section 2.c to import the raw data set. The file has a header row. View it. How are the genetic data coded?
  3. Import data into gstudio: Adapt the code from section 5.b to import the genetic data with ‘gstudio’. The loci are in columns 6 - 19. What setting for type is appropriate for this data set? Check the help file for read_population.
  4. Check imported data: Use str or as_tibble to check the imported data. Does each variable have the correct data type? Note: there should be 7 variables of type locus.
  5. Check variable types: Create a bulleted list, like the one above, with the variables. For each variable, list its R data type (e.g., numeric, integer, character, logical, factor, locus). Check the cheatsheet “R markdown language” as needed.
  6. Convert to genind object: Modify the code from section 5.e to convert the data from gstudio to a genind object. Ignore the warning about duplicate labels. Print a summary of the genind object.
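
For hints 4 and 5, one way to get an overview of data types is to ask for the class of every column. The sketch below uses a small made-up data frame (the column names `Loc1_a` and `Loc1_b` and all values are invented for illustration and are not taken from the Pulsatilla file). It shows what two-column allele coding looks like before import, when each allele is still a plain integer.

```r
# Toy data, made up for illustration (not the Pulsatilla data set).
toy <- data.frame(
  ID         = c(1, 1, 2),            # family ID, stored as numeric
  OffID      = c(0L, 1L, 0L),         # 0 = adult, 1, 2, ... = seeds
  Population = c("A", "A", "B"),      # site ID, stored as character
  Loc1_a     = c(120L, 120L, 124L),   # first allele of a hypothetical locus
  Loc1_b     = c(124L, 120L, 124L),   # second allele of the same locus
  stringsAsFactors = FALSE
)

# One class name per column, returned as a named character vector.
types <- sapply(toy, class)
print(types)
```

Applied to the data imported with ‘gstudio’, the same call should report the class of each genetic column as locus rather than integer, which is a quick way to confirm that the import combined the allele columns as intended.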

Question: What is the range of the number of genotyped individuals per population in this dataset?
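
One possible approach to this question is to count individuals per population and then take the smallest and largest count. The following sketch illustrates the idea in base R with invented population labels. It does not use the Pulsatilla data, so it does not give away the answer.

```r
# Made-up population labels, one entry per genotyped individual.
toy_pop <- c("A", "A", "A", "B", "B", "C", "C", "C", "C")

# Count individuals per population.
n_per_pop <- table(toy_pop)
print(n_per_pop)

# Smallest and largest sample size across populations.
n_range <- range(as.vector(n_per_pop))
print(n_range)
```

The same two steps can be applied to the Population column of the imported data.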