5.4 R Exercise Week 2
Task: Create a bubble plot of the number of genotyped individuals in the dataset `pulsatilla_genotypes.csv`, using Latitude/Longitude coordinates.
Hints:
- Load libraries: Load the libraries `gstudio`, `dplyr`, `tibble` and `sf`.
- Import data: Re-use your code from the Week 1 exercise to import the dataset `pulsatilla_genotypes.csv` into `gstudio`. Recall that the resulting object is a data.frame. Check the variables with the function `str`. Which variables contain the sites and the spatial coordinates?
- Summarize by site: Use the function `group_by` from library `dplyr` to group individuals (rows) by site (using pipe notation: `%>%`), and add the function `summarize` to count the number of genotyped individuals per population (i.e., sampling site). Recall that this can be done by nesting the function `n` within `summarize`: `summarize(nIndiv = n())`. Write the result into a new object `Pulsatilla`.
- Add mean coordinates: You can nest multiple functions within `summarize` and separate them with a comma. E.g., to calculate both sample size and the mean of a variable `myVar`, you could write: `summarize(nIndiv = n(), myMean = mean(myVar))`. Modify your code to calculate the number of genotyped individuals for each site and their mean X and Y coordinates. Your object `Pulsatilla` should now have three new columns (in addition to the site column), one with the number of individuals and two with the mean coordinates. Display the dataset with `as_tibble` to check.
- Convert to sf object: Modify code from section 2.a to convert your data frame `Pulsatilla` to an `sf` object. Make sure to adjust the variable names for the coordinates (i.e., use the variable names that you assigned in the previous step for the mean X and Y coordinates).
- Specify known projection: The correct EPSG number for this dataset is 31468. You can specify the CRS with: `st_crs(Pulsatilla) <- 31468`.
- Transform projection: Adapt code from section 2.c to transform the projection to the “longlat” coordinate system, and write it into an object `Pulsatilla.longlat`.
- Create bubble plot: Adapt code from section 4.d to create a bubble plot of the number of individuals per population. Note: you may drop the argument `key.entries` as it has a default.
- Save data as R object: Save the object `Pulsatilla.longlat` as an R object using the following code: `saveRDS(Pulsatilla.longlat, file = here::here("output/Pulsatilla.longlat.rds"))`.
We will need it for a later R exercise.
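
The summarize, convert and transform steps from the hints can be sketched on a small toy data set. This is only an illustration, not the solution: the data frame `toy`, its column names (`Site`, `X`, `Y`), the made-up coordinate values and the object names `toy.sites`, `toy.sf` and `toy.longlat` are all assumptions, and the code in sections 2.a and 2.c may be written differently. The sketch assumes that `st_as_sf` with a `coords` argument is used for the conversion and that “longlat” refers to EPSG 4326 (WGS 84).

```r
library(dplyr)
library(sf)

# Hypothetical toy data: 5 individuals from 2 sites, with made-up
# projected coordinates (all names and values are assumptions).
toy <- data.frame(
  Site = c("A", "A", "A", "B", "B"),
  X = c(4426000, 4426010, 4426020, 4430000, 4430010),
  Y = c(5427000, 5427010, 5427020, 5431000, 5431010)
)

# Group rows by site, then count individuals and average the coordinates.
toy.sites <- toy %>%
  group_by(Site) %>%
  summarize(nIndiv = n(), meanX = mean(X), meanY = mean(Y))

# Convert to an sf object, using the column names assigned in summarize().
toy.sf <- st_as_sf(toy.sites, coords = c("meanX", "meanY"))

# Declare the known projection (here the EPSG code given in the hints).
st_crs(toy.sf) <- 31468

# Transform to longitude/latitude (assumed here to be EPSG 4326).
toy.longlat <- st_transform(toy.sf, crs = 4326)
toy.longlat
```

Note that `st_crs<-` only declares which coordinate system the existing numbers are in, whereas `st_transform` actually recomputes the coordinates. This is why the hints list them as two separate steps.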
Question: Where on earth are the sites in the Pulsatilla dataset located?