## R Exercise Week 6

Task: Test whether observed heterozygosity of Pulsatilla vulgaris adults depends on census population size. Fit a model at the individual level where you include a random effect for population.

Hints:

1. Load packages:

• Please install the package inbreedR (to calculate individual measures of heterozygosity) from CRAN, if it is not yet installed.
• You may want to load the packages dplyr and ggplot2. Alternatively, you can use :: to call functions from packages.
2. Import data and extract adults:

• Use the code below to import the data.
• Use dplyr::filter to extract adults with OffID == 0.

```r
Pulsatilla <- read.csv(system.file("extdata", "pulsatilla_genotypes.csv",
                                   package = "LandGenCourse"))
```

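The filtering step can be tried on a small stand-in table first. This is only a sketch: the column OffID comes from the exercise, while the other column names and all values are made up for illustration.

```r
# Toy stand-in for the genotype table. Only OffID is taken from the exercise;
# ID, loc1_a and loc1_b are hypothetical column names with invented values.
toy <- data.frame(ID     = c("A1", "A1_s1", "A2", "A2_s1"),
                  OffID  = c(0, 1, 0, 1),
                  loc1_a = c(120, 120, 124, 124),
                  loc1_b = c(124, 120, 124, 128))

# Keep only the rows flagged as adults (OffID == 0).
adults_toy <- dplyr::filter(toy, OffID == 0)
adults_toy
```

The same call should work on the imported Pulsatilla data frame, provided it contains a column named OffID as the hint states.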
3. Calculate multilocus heterozygosity: Use package inbreedR to calculate multilocus heterozygosity for each adult.

• Use the function inbreedR::convert_raw(x), where x is the matrix of genotypes only (no ID or other non-genetic data), with two columns per locus. Check the help file of the function convert_raw.
• Use the function inbreedR::MLH to calculate observed heterozygosity for each individual.
• Add the result as a variable het to the adults dataset.

Example code from inbreedR::MLH help file:

```r
data(mouse_msats)
genotypes <- convert_raw(mouse_msats)
het <- MLH(genotypes)
```
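To see what these two functions are expected to compute, the following base R sketch scores a toy genotype matrix by hand. It assumes, as the inbreedR documentation describes, that each locus is coded 1 for a heterozygote and 0 for a homozygote, and that multilocus heterozygosity is the proportion of typed loci that are heterozygous. The allele values are invented, and this is not the package's own implementation.

```r
# Toy genotype matrix (invented values): 3 individuals, 2 loci,
# two allele columns per locus, NA = missing genotype.
geno <- matrix(c(120, 124, 200, 200,
                 120, 120, 200, 204,
                 124, 124,  NA,  NA),
               nrow = 3, byrow = TRUE)

# Split into first and second allele of each locus.
first  <- geno[, seq(1, ncol(geno), by = 2), drop = FALSE]
second <- geno[, seq(2, ncol(geno), by = 2), drop = FALSE]

# 1 = heterozygous, 0 = homozygous, NA = missing.
het_matrix <- (first != second) * 1

# Proportion of typed loci that are heterozygous, per individual.
het_by_hand <- rowMeans(het_matrix, na.rm = TRUE)
het_by_hand
```

Comparing such a hand calculation with the output of inbreedR::MLH on a few individuals is a quick way to check that only the genotype columns were passed to convert_raw.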

4. Add population-level data:

• Import the file “pulsatilla_population.csv” with the code below.
• Check the dataset.

```r
Pop.data <- read.csv(system.file("extdata", "pulsatilla_population.csv",
                                 package = "LandGenCourse"))
```

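The later steps need census population size for every individual, so the population-level table has to be merged into the adults dataset. A minimal sketch with toy data follows. It assumes that both tables share a key column, here hypothetically named Population; check the actual column names in your data before adapting it.

```r
# Toy individual-level table (invented values); 'Population' is an assumed key name.
adults_toy <- data.frame(Population = c("A", "A", "B"),
                         het        = c(0.5, 0.6, 0.4))

# Toy population-level table with one row per population.
pop_toy <- data.frame(Population      = c("A", "B"),
                      population.size = c(300, 40))

# Left join: keep every individual, add the matching population-level columns.
joined <- dplyr::left_join(adults_toy, pop_toy, by = "Population")
joined
```

A left join keeps one row per individual, so the number of rows should be unchanged after the join.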
5. Scatterplot with regression line: Use ggplot2 to create a scatterplot of adult heterozygosity against census population size (population.size), with a regression line.
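As a starting point, here is a sketch of such a plot on simulated data. The variable names het and population.size follow the exercise, while the values are invented and carry no biological meaning.

```r
library(ggplot2)

# Simulated stand-in data: 4 populations with 10 individuals each.
set.seed(1)
toy <- data.frame(population.size = rep(c(40, 120, 300, 900), each = 10))
toy$het <- 0.5 + 0.0001 * toy$population.size + rnorm(nrow(toy), sd = 0.05)

# Scatterplot with an ordinary least-squares regression line.
p <- ggplot(toy, aes(x = population.size, y = het)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x)
p
```

Note that the line drawn by geom_smooth(method = "lm") ignores the grouping of individuals within populations, so it is a visual aid rather than the mixed model fitted in the next step.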

6. Fit linear mixed model: Adapt code from section 3.c to perform a regression of individual-level observed heterozygosity (response variable) on census population size (predictor), including population as a random effect. Fit the model with REML and print a summary.
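The model structure described here can be sketched with lme4 on simulated data. The choice of lme4::lmer is an assumption about what the referenced section uses, the column name Population is hypothetical, and all values are invented.

```r
library(lme4)

# Simulated stand-in data: 8 populations with 10 individuals each.
set.seed(42)
pops <- data.frame(Population      = factor(paste0("P", 1:8)),
                   population.size = c(20, 40, 60, 100, 150, 250, 400, 800))
toy <- pops[rep(1:8, each = 10), ]
pop_effect <- rnorm(8, sd = 0.05)
toy$het <- 0.5 + 0.0002 * toy$population.size +
  pop_effect[rep(1:8, each = 10)] + rnorm(80, sd = 0.1)

# Individual-level response, population-level predictor,
# random intercept for population, fitted with REML.
mod <- lmer(het ~ population.size + (1 | Population), data = toy, REML = TRUE)
summary(mod)
```

The random intercept accounts for the fact that individuals from the same population share the same value of the predictor and are not independent observations.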

7. Test fixed effect: Adapt code from section 2.f to test the fixed effect with function car::Anova.
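A sketch of the test on a simulated model is shown below. The arguments type = "II" and test.statistic = "Chisq" are one reasonable choice for a model fitted with lmer, not necessarily the settings used in the referenced section. Data and the column name Population are invented.

```r
library(lme4)

# Simulated stand-in data: 8 populations with 10 individuals each.
set.seed(42)
pops <- data.frame(Population      = factor(paste0("P", 1:8)),
                   population.size = c(20, 40, 60, 100, 150, 250, 400, 800))
toy <- pops[rep(1:8, each = 10), ]
toy$het <- 0.5 + 0.0002 * toy$population.size +
  rnorm(8, sd = 0.05)[rep(1:8, each = 10)] + rnorm(80, sd = 0.1)

mod <- lmer(het ~ population.size + (1 | Population), data = toy, REML = TRUE)

# Wald chi-square test of the fixed effect (type II).
aov_tab <- car::Anova(mod, type = "II", test.statistic = "Chisq")
aov_tab
```

The resulting table has one row per fixed-effect term, here only population.size.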

8. Check residual plots: Adapt code from section 2.d to create residual plots.
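Two commonly used diagnostics are residuals against fitted values and a normal Q-Q plot of the residuals. The sketch below draws both with base graphics for a simulated model. The referenced section may use different plotting functions, and the data and the column name Population are invented.

```r
library(lme4)

# Simulated stand-in data: 8 populations with 10 individuals each.
set.seed(42)
pops <- data.frame(Population      = factor(paste0("P", 1:8)),
                   population.size = c(20, 40, 60, 100, 150, 250, 400, 800))
toy <- pops[rep(1:8, each = 10), ]
toy$het <- 0.5 + 0.0002 * toy$population.size +
  rnorm(8, sd = 0.05)[rep(1:8, each = 10)] + rnorm(80, sd = 0.1)

mod <- lmer(het ~ population.size + (1 | Population), data = toy, REML = TRUE)

# Left: residuals vs. fitted values. Right: normal Q-Q plot of residuals.
old_par <- par(mfrow = c(1, 2))
plot(fitted(mod), resid(mod), xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
qqnorm(resid(mod))
qqline(resid(mod))
par(old_par)
```

In the first panel, look for a roughly even scatter around zero. In the second, points far from the line suggest departures from normality.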

Questions: There is one influential point in the regression analysis:

• What was the direction of the relationship: did heterozygosity increase or decrease with census population size?
• Was the fixed effect statistically significant?
• Was the model valid, or was there a problem with the residual plots?
• What would be the main issue, and what remedy could you suggest?