Supplemental Material

Everything in here is meant to help you succeed at the exams in the course. There’s no due date attached to any of it because you aren’t meant to turn any of it in.

If you do want feedback on the work you’ve done, feel free to send it to me as a picture or document; just don’t expect the usual \(\approx\) 72-hour response time.

The format of this material will mostly be long-form word problems with real or simulated data. If this is your last/only math class during your college career, this next concept isn’t super helpful to understand, but anyone with more than one math class in their future should pay attention.

Mathematics and computational courses operate on a system of exercises and problems, which are in turn built on axioms and algorithms.

The definitions below are gross oversimplifications; many proper mathematicians would argue that they lack rigor.

  • Axioms are statements accepted as true without proof; they are the logical starting points for concepts, operations, and theorems. They are meant to leverage common sense and intuition, meaning they need a core that is understandable in any context.

  • Algorithms are sequences of steps for solving a problem or yielding a result. Think of them as mathematical/computational recipes. If you follow them exactly, you should get the appropriate result.

  • Exercises are simplified, in-class examples led by the instructor. Their purpose is to introduce the axioms and algorithms that make up the foundation of the course topics.

  • Problems are higher-complexity, generally difficult, student-directed puzzles. The axioms will still apply; the algorithms may not. The student is expected to flex their critical thinking and logic skills to either piece together which axioms and algorithms resolve the puzzle, or construct a new algorithm out of pieces of previously learned ones. They are never unsolvable, but you should never have seen them before.

Household Income (Chapter 1-3)

The Census Bureau keeps track of a large variety of interesting U.S. household/citizen metrics and an even larger variety of uninteresting ones.

In order to calculate metrics from Census data, it’s a common practice to split states into counties, counties into districts, residents of the districts into groups based on similarities, and randomly select from those groups in even batches. The result is a sample that is ideally homogeneous.

Below is a table with a sample of \(n=50\) drawn from the original sample of \(n=320\). The sample was taken by evenly dividing the data by year and selecting 5 rows of data at random from each year:

Income (in 1000s of USD) | Year | Share Proportion | State
53                       | 2013 | 0.0476919        | Kansas
6232                     | 2013 | 0.0539027        | Kansas
5410                     | 2013 | 0.0467958        | Kansas
8380                     | 2013 | 0.0724881        | Kansas
6215                     | 2013 | 0.0537543        | Kansas
89                       | 2014 | 0.0796244        | Kansas
57                       | 2014 | 0.0509154        | Kansas
11540                    | 2014 | 0.0993029        | Kansas
5714                     | 2014 | 0.0491657        | Kansas
57                       | 2014 | 0.0513577        | Kansas
119                      | 2015 | 0.1068316        | Kansas
57                       | 2015 | 0.0513888        | Kansas
8421                     | 2015 | 0.0720238        | Kansas
61                       | 2015 | 0.0551635        | Kansas
91                       | 2015 | 0.0817416        | Kansas
9200                     | 2016 | 0.0781513        | Kansas
60                       | 2016 | 0.0536152        | Kansas
6029                     | 2016 | 0.0512156        | Kansas
45                       | 2016 | 0.0398922        | Kansas
5404                     | 2016 | 0.0459074        | Kansas
4725                     | 2017 | 0.0397657        | Kansas
6931                     | 2017 | 0.0583302        | Kansas
58                       | 2017 | 0.0521292        | Kansas
5619                     | 2017 | 0.0472877        | Kansas
49                       | 2017 | 0.0438035        | Kansas
48                       | 2018 | 0.0429701        | Kansas
6932                     | 2018 | 0.0578984        | Kansas
5558                     | 2018 | 0.0464243        | Kansas
59                       | 2018 | 0.0524877        | Kansas
5507                     | 2018 | 0.0459955        | Kansas
9090                     | 2019 | 0.0752758        | Kansas
63                       | 2019 | 0.0553547        | Kansas
48                       | 2019 | 0.0427709        | Kansas
15375                    | 2019 | 0.1273196        | Kansas
8174                     | 2019 | 0.0676866        | Kansas
7146                     | 2020 | 0.0584022        | Kansas
51                       | 2020 | 0.0444936        | Kansas
5030                     | 2020 | 0.0411120        | Kansas
5382                     | 2020 | 0.0439888        | Kansas
9123                     | 2020 | 0.0745613        | Kansas
8982                     | 2021 | 0.0724300        | Kansas
5097                     | 2021 | 0.0410990        | Kansas
9695                     | 2021 | 0.0781784        | Kansas
50                       | 2021 | 0.0437530        | Kansas
155                      | 2021 | 0.1363234        | Kansas
44                       | 2022 | 0.0378789        | Kansas
91                       | 2022 | 0.0795614        | Kansas
16085                    | 2022 | 0.1279288        | Kansas
48                       | 2022 | 0.0418941        | Kansas
44                       | 2022 | 0.0386058        | Kansas
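
The draw described above the table (split the rows by year, then pick 5 at random from each year) can be sketched in Python. This is a minimal sketch, not the code actually used to build the table. The function name `sample_per_year`, the `year` and `row_id` fields, and the stand-in data are all hypothetical, and the stand-in assumes the 320 original rows are spread evenly across the ten years 2013 to 2022.

```python
import random
from collections import defaultdict


def sample_per_year(rows, per_year=5, seed=None):
    """Draw `per_year` rows at random, without replacement, from each year."""
    rng = random.Random(seed)

    # Step 1: divide the rows into one group per year.
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row)

    # Step 2: draw the same number of rows from every group.
    sample = []
    for year in sorted(by_year):
        sample.extend(rng.sample(by_year[year], per_year))
    return sample


# Hypothetical stand-in for the original data: 10 years x 32 rows = 320 rows.
full_data = [
    {"year": year, "row_id": i}
    for year in range(2013, 2023)
    for i in range(32)
]

subset = sample_per_year(full_data, per_year=5, seed=1)
print(len(subset))  # 10 years x 5 rows each
```

Passing a `seed` makes the draw repeatable, which helps if you want to check your work against the same rows later.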

The original sampling method for the Census data was a multi-step process using multiple different sampling methods. Three to be exact. What are those three nested sampling methods?


What was the sampling method I used to create the second, smaller sample set in the table above?


Name the type and subtype of each variable in the table.


Compute the following for Household Income:

  • Mean

  • Median

  • Mode

  • Range

  • Variance

  • Standard Deviation
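
If you want to check your hand calculations, the six statistics above can be sketched with Python's built-in `statistics` module. The `values` list below is made up for illustration and is not the income data. The sketch also assumes the course uses the sample formulas (dividing by \(n-1\)); if your notes use the population formulas, swap in `statistics.pvariance` and `statistics.pstdev`.

```python
import statistics

# Made-up numbers for illustration only, NOT the income column.
values = [2, 4, 4, 5, 10]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)           # most frequent value
value_range = max(values) - min(values)  # largest minus smallest
variance = statistics.variance(values)   # sample variance, divides by n - 1
std_dev = statistics.stdev(values)       # square root of the sample variance

print("Mean:", mean)
print("Median:", median)
print("Mode:", mode)
print("Range:", value_range)
print("Variance:", variance)
print("Standard Deviation:", std_dev)
```

For these five made-up values the mean is 5, the median and mode are both 4, the range is 8, the sample variance is 9, and the standard deviation is 3. To use it on the table, replace `values` with the Income column.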


Justify which measure of center (Mean/Median/Mode) is most representative of the data set.


Why would someone prefer Standard Deviation over Variance as a measure of spread for this data?


The below histogram represents the original sample of \(n=320\):

  • Describe the shape of the histogram

  • Identify the median and mode

  • Identify where the mean you calculated falls on this histogram

    • Does your calculation feel reasonable?
  • Does this histogram follow our rules of histograms?