Chapter 6 Test

Put your practice to the test. Here are some excellent cheatsheets to consider for biostats in R, and this is a useful read on good enough practices in scientific computing (Wilson et al. 2017). The goal of this book was not to turn you into data scientists or biostatisticians but to encourage you to develop and refine your critical thinking skills in the context of evidence, data, and statistical reasoning.

Learning outcomes

  1. Complete fundamental exploratory data analysis on a representative dataset culminating with a fair and reasonable statistical model.
  2. Interpret a statistical analysis that you completed with a focus on relevance, significance, and logic.
  3. Communicate biostatistical work clearly and effectively to others.

Critical thinking

In many disciplines of biological research, we need to be open to experimentation that is fair, transparent, and replicable but that is implemented based on available data. This experimentation can also happen after we have data. It can be an exercise in fitting the most appropriate or parsimonious models (Cottingham, Lennon, and Brown 2005), applying experimental design principles (Ruxton and Colgrave 2018), and of course invoking critical thinking. This is not to say we are going on fishing expeditions, but that at times we have only certain data to describe a system and are tasked or obligated to use the best possible evidence we have to infer relevant processes. For instance, we might compile field data, data from online resources or data products for climate or landscapes, or reuse data on traits or genetics and link these different evidence streams together to explore a question. Critical thinking in statistics can be an important framework that we leverage not only to do the statistics and fit models but also to ensure that we are able to ask the questions we need to. In summary, we have data and need an answer but have to use open and transparent thinking with statistics to find the best question.
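One way to make the idea of a parsimonious model concrete is to compare a few candidate models with an information criterion. The sketch below is only an illustration on simulated data: the variable names (size, noise_var, height) and the effect sizes are invented for this example and do not come from any dataset in this chapter. AIC is one common choice of criterion, not the only defensible one.

```r
# Simulated stand-in data; names and effect sizes are assumptions for illustration.
set.seed(42)
n <- 200
size <- runif(n, 5, 80)      # hypothetical predictor with a real effect
noise_var <- rnorm(n)        # hypothetical predictor with no effect
height <- 2 + 0.25 * size + rnorm(n, sd = 2)
dat <- data.frame(height, size, noise_var)

# Three candidate models of increasing complexity
m0 <- lm(height ~ 1, data = dat)                 # intercept only
m1 <- lm(height ~ size, data = dat)              # one predictor
m2 <- lm(height ~ size + noise_var, data = dat)  # adds an uninformative predictor

# Lower AIC means better support after penalising extra parameters
aic_table <- AIC(m0, m1, m2)
aic_table$delta <- aic_table$AIC - min(aic_table$AIC)
print(aic_table[order(aic_table$AIC), ])
```

In a comparison like this, the extra predictor in m2 will usually not earn its added parameter, but the ranking should always be read alongside the biology of the question and not treated as an automatic answer.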

Test adventure time

York University's Keele Campus is a small urban forest mixed with grasslands and open space. The master gardeners measured nearly 7000 trees over the course of two years. These data were recently compiled and published. There are many fascinating and compelling questions to explore that can support evidence-informed decisions and valuation estimates for this place ecologically, environmentally, and from a trait or species-level perspective. As a summative test, this challenge is thus relatively open-ended. Given these data, collected and now published, what can we do to enhance our biological and social understanding of and appreciation for a university campus that supports people, other animals, and plants? Explore the data, define a relevant challenge or set of questions that would benefit the stakeholders or local community or inform our understanding of a biological theory, and demonstrate your mastery of critical thinking in statistics. Submit your work as a PDF including the code, annotation, rationale, interpretation, and outputs from the viz, EDA, and model(s) that supported your thinking.

library(readr)  # provides read_csv()
trees <- read_csv(url(""))
## # A tibble: 6,951 × 27
##      FID OBJECTID Date   Block Street_or_       Building_C Tree_Tag_N Species_Co
##    <dbl>    <dbl> <chr>  <chr> <chr>                 <dbl>      <dbl> <chr>     
##  1     0        1 9/7/12 A     Stedman Lecture…         22          1 lochon    
##  2     1        2 9/7/12 A     Stedman Lecture…         22          2 lochon    
##  3     2        3 9/7/12 A     Stedman Lecture…         22          3 lochon    
##  4     3        4 9/7/12 A     Stedman Lecture…         22          4 lochon    
##  5     4        5 9/7/12 A     Stedman Lecture…         22          5 lochon    
##  6     5        6 9/7/12 A     Stedman Lecture…         22          6 lochon    
##  7     6        7 9/7/12 A     Stedman Lecture…         22          7 lochon    
##  8     7        8 9/7/12 A     Stedman Lecture…         22          8 lochon    
##  9     8        9 9/7/12 A     Stedman Lecture…         22          9 lochon    
## 10     9       10 9/7/12 A     Stedman Lecture…         22         10 lochon    
## # … with 6,941 more rows, and 19 more variables: Common_Nam <chr>, Genus <chr>,
## #   Species <chr>, DBH <dbl>, Number_of_ <dbl>, Percentage <dbl>,
## #   Crown_Widt <dbl>, Total_Heig <dbl>, Latitude <dbl>, Longitude <dbl>,
## #   Height_to_ <dbl>, Unbalanced <dbl>, Reduced_Cr <dbl>, Weak_Yello <dbl>,
## #   Defoliatio <dbl>, Dead_Broke <dbl>, Poor_Branc <dbl>, Lean <dbl>,
## #   Trunk_Scar <dbl>
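A first pass at EDA on a table like this might look at structure, missing values, single-variable distributions, and one or two relationships. The sketch below is a hedged starting point, not a model answer. It uses a small simulated stand-in (trees_demo) that borrows three column names from the output above (Species_Co, DBH, Total_Heig); every value in it is invented, and the species codes "sp_b" and "sp_c" are made up for illustration.

```r
# Stand-in data with a few column names from the printed tibble; all values are simulated.
set.seed(1)
n <- 300
trees_demo <- data.frame(
  Species_Co = sample(c("lochon", "sp_b", "sp_c"), n, replace = TRUE),  # "sp_b", "sp_c" are hypothetical codes
  DBH = round(rlnorm(n, meanlog = 3, sdlog = 0.5), 1)
)
trees_demo$Total_Heig <- round(3 + 0.3 * trees_demo$DBH + rnorm(n, sd = 1.5), 1)

# 1. Structure and missing values
str(trees_demo)
missing_counts <- colSums(is.na(trees_demo))
print(missing_counts)

# 2. Distributions of single variables
print(summary(trees_demo$DBH))
species_counts <- table(trees_demo$Species_Co)
print(species_counts)

# 3. A relationship between two variables and a grouped summary
dbh_height_cor <- cor(trees_demo$DBH, trees_demo$Total_Heig, use = "complete.obs")
mean_dbh <- aggregate(DBH ~ Species_Co, data = trees_demo, FUN = mean)
print(dbh_height_cor)
print(mean_dbh)
```

With the real data, the same steps would be run on trees, and each summary should be paired with a figure before any model is chosen.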


| item | concept | description | value |
|------|---------|-------------|-------|
| 1 | effective data viz | Are there figures exploring the data, and is the final main figure publishable in terms of legends, labels, axes, and appropriateness? | 10 |
| 2 | effective EDA | Are the distributions of, and relationships between, variables explored? | 5 |
| 3 | final data model(s) | Do the final model(s) address the purpose of the study, are they appropriate, and are the assumptions, including model fit, explored? | 5 |
| 4 | annotation and reporting | Is there annotation in the R code chunks, reporting in the markdown, and an interpretation, even a brief one, of what you found and why? | 5 |
| 5 | total | sum of above | 25 |
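For the rubric item on model assumptions and fit, one possible minimal check for a simple linear model is sketched below. It runs on simulated data with invented names (dbh, height, fit), so it illustrates the kind of evidence to report and is not a prescribed workflow. Other model families call for different diagnostics.

```r
# Simulated data; variable names and parameters are assumptions for illustration.
set.seed(7)
n <- 150
dbh <- runif(n, 5, 60)
height <- 4 + 0.2 * dbh + rnorm(n, sd = 1.5)
fit <- lm(height ~ dbh)

# Residuals and fitted values are the raw material for most diagnostics
res <- residuals(fit)
fitted_vals <- fitted(fit)

# Normality of residuals (Shapiro-Wilk test)
sw <- shapiro.test(res)

# Rough check for non-constant variance: do absolute residuals trend with fitted values?
spread_cor <- cor(abs(res), fitted_vals)

# Overall fit
r2 <- summary(fit)$r.squared

print(sw)
print(spread_cor)
print(r2)
```

Calling plot(fit) on the fitted model gives the standard residual plots, which are often more informative than any single test statistic.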