Chapter 5 Hackathon
All models are wrong, but some are useful (Stouffer 2019; Box 1976). Critical thinking with statistics is thus essential to ensure that we effectively support evidence-informed decision making in society (Christopher J. Lortie and Owen 2020; Neelen and Kirschner 2020).
Learning outcomes
- Appreciate the challenge of working with data, and apply a critical thinking and creative design mindset to statistical solutions.
- Practice your workflow and literate coding before a summative test.
- Refine your thinking and coding for efficiency.
Critical thinking
Efficiency is a fascinating topic in statistics (Craycraft 1999; Kenett, Coleman, and Stewardson 2003; Norman 2003). Here, we can simplify it using the critical thinking criteria we have extensively refined and applied to numerous tidy challenges. Efficiency = sufficiency (provided it is logical, fair, and accurate). Your plots and statistical models should represent a reasonable and likely description of the data at hand. This section is a formative opportunity for you to evaluate your skills and strengths in logic, efficiency, fair adventuring, workflows, and literate coding prior to the final section, a test. You are provided with one or more general datasets. The adventure is to solve a very generalized challenge that is embodied in the evidence.
Adventure time
Candy. Candy. Candy. Take a peek at these sweet data. Contrast Canada and USA candy sales at Halloween. Consider including population density for each country in each year in your model, so that differences in population do not add unexplained variation and your estimates of meaningful differences are more accurate.
Canadian Candy
USA Candy & Halloween spending
Human populations
Deeper dive: contrast GLMM model performance, examine temporal effects, or explore GAMs.
library(tidyverse)
# Canadian candy data: read from figshare, then print
Canada <- read_csv(url("https://figshare.com/ndownloader/files/30990820"))
Canada
## # A tibble: 233 × 3
## month year candy
## <dbl> <dbl> <dbl>
## 1 1 1997 101014
## 2 2 1997 101938
## 3 3 1997 136057
## 4 4 1997 105601
## 5 5 1997 119123
## 6 6 1997 107689
## 7 7 1997 113399
## 8 8 1997 113934
## 9 9 1997 109441
## 10 10 1997 146876
## # … with 223 more rows
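Because the Canadian data are monthly while the USA data are annual, one option is to reduce the Canadian series to a single value per year before contrasting the countries. The following is a minimal sketch, not a prescribed solution. It assumes that October sales are a reasonable proxy for Halloween candy (an assumption, not something the data state), and it uses only the ten rows printed above, typed in by hand so that it runs offline. The names `canada_1997` and `canada_october` are hypothetical.

```r
library(dplyr)
library(tibble)

# The ten rows printed above, entered by hand (1997, months 1 to 10).
canada_1997 <- tibble(
  month = 1:10,
  year  = 1997,
  candy = c(101014, 101938, 136057, 105601, 119123,
            107689, 113399, 113934, 109441, 146876)
)

# Assumption: October (month 10) stands in for Halloween sales.
canada_october <- canada_1997 %>%
  filter(month == 10) %>%
  select(year, candy)

canada_october
```

Summing or averaging over several months is an equally defensible choice; whichever you pick, state it clearly in your workflow.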
# USA candy and Halloween spending data: read from figshare, then print
USA <- read_csv(url("https://figshare.com/ndownloader/files/25190510"))
USA
## # A tibble: 16 × 6
## year total costumes candy decorations cards
## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2005 3.3 1.2 1.2 0.8 0.1
## 2 2006 5 1.8 1.6 1.3 0.3
## 3 2007 5.1 1.8 1.6 1.4 0.3
## 4 2008 5.8 2.1 1.8 1.6 0.3
## 5 2009 4.7 1.7 1.5 1.2 0.3
## 6 2010 5.8 2 1.8 1.6 0.3
## 7 2011 6.9 2.5 2 1.9 0.5
## 8 2012 8 2.9 2.3 2.4 0.6
## 9 2013 7 2.6 2.1 2 0.4
## 10 2014 7.4 2.8 2.2 2 0.4
## 11 2015 6.9 2.5 2.1 1.9 0.3
## 12 2016 8.4 3.1 2.5 2.4 0.4
## 13 2017 9.1 3.3 2.7 2.7 0.4
## 14 2018 9 3.2 2.6 2.7 0.4
## 15 2019 8.8 3.2 2.6 2.6 0.4
## 16 2020 8 2.6 2.4 2.6 0.4
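One way to begin examining temporal effects is to fit a simple linear trend of candy spending on year. The sketch below uses the `year` and `candy` columns printed above, typed in by hand so that it runs offline. The names `usa_candy` and `trend` are hypothetical, and a straight line is only an assumed starting point, not a claim about the best model for these data.

```r
# The year and candy columns printed above, entered by hand.
usa_candy <- data.frame(
  year  = 2005:2020,
  candy = c(1.2, 1.6, 1.6, 1.8, 1.5, 1.8, 2.0, 2.3,
            2.1, 2.2, 2.1, 2.5, 2.7, 2.6, 2.6, 2.4)
)

# Ordinary least squares: candy spending as a linear function of year.
trend <- lm(candy ~ year, data = usa_candy)

# Estimated change in candy spending per year, in the units of the table.
coef(trend)["year"]
```

Inspecting the residuals from `trend` is a reasonable next step before moving on to GLMMs or GAMs.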
# Human population data: read from figshare, then print
humans <- read_csv(url("https://figshare.com/ndownloader/files/30993373"))
humans
## # A tibble: 249 × 72
## country `1950` `1951` `1952` `1953` `1954` `1955` `1956` `1957` `1958` `1959`
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 Burundi 2 309 2 360 2 406 2 449 2 492 2 537 2 585 2 636 2 689 2 743
## 2 Comoros 159 163 167 170 173 176 179 182 185 188
## 3 Djibou… 62 63 65 66 68 70 71 74 76 80
## 4 Eritrea 822 835 849 865 882 900 919 939 961 983
## 5 Ethiop… 18 128 18 467 18 820 19 184 19 560 19 947 20 348 20 764 21 201 21 662
## 6 Kenya 6 077 6 242 6 416 6 598 6 789 6 988 7 195 7 412 7 638 7 874
## 7 Madaga… 4 084 4 168 4 257 4 349 4 444 4 544 4 647 4 754 4 865 4 980
## 8 Malawi 2 954 3 012 3 072 3 136 3 202 3 271 3 342 3 417 3 495 3 576
## 9 Maurit… 493 506 521 537 554 571 588 605 623 641
## 10 Mayotte 15 16 16 17 18 19 20 21 22 23
## # … with 239 more rows, and 61 more variables: `1960` <chr>, `1961` <chr>,
## # `1962` <chr>, `1963` <chr>, `1964` <chr>, `1965` <chr>, `1966` <chr>,
## # `1967` <chr>, `1968` <chr>, `1969` <chr>, `1970` <chr>, `1971` <chr>,
## # `1972` <chr>, `1973` <chr>, `1974` <chr>, `1975` <chr>, `1976` <chr>,
## # `1977` <chr>, `1978` <chr>, `1979` <chr>, `1980` <chr>, `1981` <chr>,
## # `1982` <chr>, `1983` <chr>, `1984` <chr>, `1985` <chr>, `1986` <chr>,
## # `1987` <chr>, `1988` <chr>, `1989` <chr>, `1990` <chr>, `1991` <chr>, …
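The population data arrive in wide format, with one column per year, and the counts are stored as text because of the spaces used as thousands separators. Before they can be joined to the candy data they likely need to be reshaped and converted to numbers. The following is a minimal sketch on a two-country, two-year excerpt of the rows printed above; `humans_excerpt` and `humans_long` are hypothetical names, and stripping every non-digit character is an assumed cleaning rule that should be checked against the full file.

```r
library(dplyr)
library(tidyr)
library(tibble)

# A small excerpt of the printed output, entered by hand.
humans_excerpt <- tribble(
  ~country,  ~`1950`, ~`1951`,
  "Burundi", "2 309", "2 360",
  "Comoros", "159",   "163"
)

humans_long <- humans_excerpt %>%
  # One row per country and year instead of one column per year.
  pivot_longer(cols = -country, names_to = "year", values_to = "population") %>%
  mutate(
    year = as.integer(year),
    # Assumption: anything that is not a digit or a decimal point is a separator.
    population = as.numeric(gsub("[^0-9.]", "", population))
  )

humans_long
```

After this step, `year` and `population` are numeric, so the table can be filtered to the countries of interest and joined to the candy data by year.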
Reflection questions
- How does the veracity of data from different sources potentially influence your critical thinking?
- Can joining data introduce errors?
- How do the available data bias the inference and interpretation of the relative importance of variables on key outcomes?
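On the question of joins, one concrete way they can introduce errors is through mismatched keys: rows silently gain missing values or drop out. The sketch below is purely illustrative. The country labels and the `has_population` flag are hypothetical and are not taken from the datasets above; the point is only the checking pattern with `anti_join()`.

```r
library(dplyr)
library(tibble)

# Hypothetical keys: the two tables spell the same country differently.
candy_keys <- tibble(
  country = c("Canada", "USA"),
  year    = c(2005, 2005)
)
population_keys <- tibble(
  country        = c("Canada", "United States of America"),
  year           = c(2005, 2005),
  has_population = TRUE
)

# A left join keeps every candy row but fills unmatched ones with NA.
joined <- left_join(candy_keys, population_keys, by = c("country", "year"))

# An anti join lists the candy rows that found no partner.
unmatched <- anti_join(candy_keys, population_keys, by = c("country", "year"))

joined
unmatched
```

Running a check like this before and after every join, and comparing row counts, is a cheap way to catch such problems early.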
Book review
Instructions
- Read the key chapters that best support your learning from the text ‘The New Statistics with R’ (Hector 2017).
- Please use the ten simple rules for reviews (Christopher J. Lortie 2019) as your instructions for how to do a review.
- Write a short review of this text (fewer than 2000 words) and submit it to turnitin.com.
Rubric
item | concept | description | value |
---|---|---|---|
1 | rule 1 the topic | introduce topic, explain necessity, explain scope | 2 |
2 | rule 2 audience | explain audience-level of book and to what extent blend of expertise is needed | 2 |
3 | rule 3 editions | mention different editions or versions and what is changed | 0 |
4 | rule 4 pedagogy | describe pedagogy and structure of chapters | 4 |
5 | rule 5 content | provide a clear overview of what the text covers | 2 |
6 | rule 6 readability | critique the style and clarity of writing | 2 |
7 | rule 7 links | list and explain linkages to concepts and packages | 2 |
8 | rule 8 compare | briefly list what other resources are out there and compare | 2 |
9 | rule 9 commitment | comment on the commitment and effort needed to master the text | 2 |
10 | rule 10 benefits | list the main benefits of using this text to learn or solve | 2 |
11 | your writing | your writing and coherence are graded for clarity, balance, directness, and convincingness | 5 |
12 | total | sum of above concepts | 25 |