3.4 Measurement: Distribution(s) of measurements

  • Measurement process: Assign individuals to cells → distribute across cells → distribution
  • Any variable has a set of theoretically possible values but…
    • …for some values there might not be any observations
  • The same is true for value combinations of variables (e.g., recall the earlier table on education and income)
  • Q: For a sample of 1000 individuals, how many (theoretically possible) cells do we have (assuming the individuals are evenly distributed across them) for…
    • …a variable \(education = \{0,1,2,3,4,5,6,7,8,9,10\}\), with 11 values?
    • …three variables, education (11 values), trust (11 values), victimization (2 values)?
    • (…two variables, age (100 values) and body temperature (infinite)?)
  • Variables = Dimensions (be they categorical or quantitative)!
  • Always ask yourself: What could the distribution look like?
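
The cell counts asked for in the question above can be worked out as a product: the number of theoretically possible cells is the number of values of each variable multiplied together. The following is a minimal sketch of that arithmetic, not part of the original slides; the helper name `n_cells` and the variable names are assumptions made for illustration. The third (bracketed) case is left out because a continuous variable such as body temperature has infinitely many possible values.

```python
from math import prod

def n_cells(values_per_variable):
    """Number of theoretically possible cells: the product of the
    number of possible values of each variable (hypothetical helper)."""
    return prod(values_per_variable)

sample_size = 1000

# One variable: education with 11 values
one_var = n_cells([11])

# Three variables: education (11) x trust (11) x victimization (2)
three_vars = n_cells([11, 11, 2])

# Average number of individuals per cell if evenly distributed
per_cell_one = sample_size / one_var
per_cell_three = sample_size / three_vars

print(one_var, round(per_cell_one, 1))      # 11 cells, about 90.9 individuals each
print(three_vars, round(per_cell_three, 1)) # 242 cells, about 4.1 individuals each
```

Note how quickly the cells thin out: adding two more variables takes us from roughly 91 individuals per cell to roughly 4.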

Notes

  • Lessons to be learned
    • Learn to think in terms of cells into which we distribute our observations.
    • For the theoretically possible values of the variable(s) we deal with, there is a corresponding set of cells (possibly infinitely many). However, depending on our observations, some cells may remain empty (e.g., in the upper range of an age variable that goes to 100).
    • We normally deal with several variables. Then the cells simply represent value combinations.
    • We can also think of data as being distributed in an n-dimensional space, where n is the number of variables. Each variable represents one dimension.
    • If you encounter data in your research, it is generally a good strategy to first ask yourself what the theoretical space is (what values/value combinations are possible in principle) and then to contemplate how the actual empirical data are distributed in this space.
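
The strategy in the last note (first lay out the theoretical space, then see how the data fill it) can be sketched in a few lines. This is only an illustration under stated assumptions: the two value sets are deliberately small and the observations are made up, not taken from any real dataset.

```python
from collections import Counter
from itertools import product

# Hypothetical value sets, kept small for illustration
education_values = range(3)      # 0, 1, 2
victimization_values = range(2)  # 0, 1

# Theoretical space: every possible value combination (one cell each)
theoretical_cells = list(product(education_values, victimization_values))

# Made-up observations: one (education, victimization) tuple per individual
observations = [(0, 0), (0, 0), (1, 0), (1, 1), (2, 0), (0, 0)]

# Empirical distribution: number of individuals in each theoretical cell
counts = Counter(observations)
distribution = {cell: counts.get(cell, 0) for cell in theoretical_cells}

# Cells that are theoretically possible but have no observations
empty_cells = [cell for cell, n in distribution.items() if n == 0]

print(len(theoretical_cells), "theoretical cells")
print("empty cells:", empty_cells)
```

Here two of the six theoretically possible cells stay empty, which is exactly the gap between the theoretical space and the empirical distribution that the notes ask you to look for.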