Assessing the relationship between two variables

By variable we mean an unknown quantity that can take several values. If you are asked to assess the relationship between two variables, then

Crime

Fifty states in America were compared in terms of their crime rate and their percentage of high school dropouts. The crime rate, measured per 100,000 people, included murder, forcible rape, robbery, aggravated assault, burglary, larceny-theft and motor vehicle theft. The state high school dropout rate was the percentage of people currently aged 16-19 who were not in school and had not finished the 12th grade.

What can we say about the relationship between crime rate and %dropouts? Suppose we tried to fit a straight line through these data.

This line is of the form \[y=\alpha + \beta x\] where \(\alpha\) is the y-intercept and \(\beta\) is the slope, or gradient, of the line.

The black line above is \[\mbox{Crime}=2197+281.8 \times \mbox{Dropout}. \]

But what does this mean?
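One way to read the equation is to plug in a few values. The short Python sketch below evaluates the fitted line quoted above; the intercept and slope come from the text, while the dropout percentages (5, 10 and 11) are hypothetical values chosen only for illustration.

```python
# Coefficients taken from the fitted line quoted in the text.
ALPHA = 2197.0  # intercept: predicted crime rate when Dropout = 0
BETA = 281.8    # slope: change in predicted crime rate per 1-point rise in Dropout


def predicted_crime(dropout_pct):
    """Predicted crime rate per 100,000 people for a given dropout percentage."""
    return ALPHA + BETA * dropout_pct


# Hypothetical dropout percentages, not values from the data set.
for dropout in (5.0, 10.0, 11.0):
    print(f"Dropout = {dropout:4.1f}%  ->  predicted crime = {predicted_crime(dropout):.1f}")

# Raising Dropout by one percentage point changes the prediction by the slope.
difference = predicted_crime(11.0) - predicted_crime(10.0)
print(f"Change for a one-point rise in Dropout: {difference:.1f}")
```

So the slope says that a state whose dropout rate is one percentage point higher is predicted to have about 281.8 more crimes per 100,000 people. The intercept, 2197, is the predicted crime rate for a state with no dropouts at all. This describes an association in the data, not necessarily cause and effect, and if no state has a dropout rate near zero then the intercept is an extrapolation.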

The next question you can ask is how confident you are in these estimates. In other words, how well does the line fit the data points, and how close are the points to the line? Lastly, how steep is this line? Could we maybe conclude that \(\beta=0\)?
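These questions can be made concrete with a few summary quantities. The sketch below is one possible approach, not necessarily the method used later in the course: it fits a line by ordinary least squares, then reports \(R^2\) (how close the points are to the line) and the slope divided by its standard error (how far the estimated slope is from zero). The function name `fit_line` is our own, and the data are made up for illustration; they are not the state crime data.

```python
import math


def fit_line(x, y):
    """Least-squares fit of y = alpha + beta * x, with simple fit diagnostics."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)

    beta = sxy / sxx               # estimated slope
    alpha = y_bar - beta * x_bar   # estimated intercept

    # Residual sum of squares: squared vertical distances from points to line.
    rss = sum((yi - (alpha + beta * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - rss / syy                 # proportion of variation explained
    se_beta = math.sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    return {
        "alpha": alpha,
        "beta": beta,
        "r_squared": r_squared,
        "se_beta": se_beta,
        "t_beta": beta / se_beta,  # large |t| is evidence against beta = 0
    }


# Made-up illustrative data, NOT the state crime data.
dropout = [4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
crime = [3300.0, 3900.0, 4600.0, 4900.0, 5700.0, 6100.0]

fit = fit_line(dropout, crime)
for name, value in fit.items():
    print(f"{name:>10}: {value:.3f}")
```

An \(R^2\) close to 1 means the points lie close to the line. A slope that is many standard errors away from zero suggests that \(\beta=0\) is not plausible, whereas a slope within about two standard errors of zero would leave that possibility open.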

Potatoes

The glucose level in potatoes is dependent on the length of time for which they have been stored. We have data detailing the glucose level in potatoes over time.

The scatterplot of glucose against storage time in weeks shows a curved relationship which is non-monotonic, and so cannot be transformed into a straight line. In other words, we will not be able to find a suitable transformation as we did in the X-rays example.

Possibly a quadratic curve would describe this relationship adequately. Compare the two fitted curves below.


We could write out the quadratic curve shown in the right-hand plot as

\[y = \alpha + \beta x + \gamma x^2.\] This should look familiar to you. Here, \(\alpha\), \(\beta\) and \(\gamma\) are values that define the shape of the curve.
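To see how such a curve can be estimated, here is a minimal Python sketch that fits both a straight line and a quadratic by least squares and compares their residual sums of squares. The helper names (`solve_linear_system`, `fit_polynomial`, `rss`) are our own, and the `weeks` and `glucose` values are invented to give a non-monotonic pattern; they are not the potato data, whose shape may differ.

```python
def solve_linear_system(a, b):
    """Solve a * coeffs = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (m[r][n] - tail) / m[r][r]
    return coeffs


def fit_polynomial(x, y, degree):
    """Least-squares coefficients [alpha, beta, gamma, ...] via the normal equations."""
    k = degree + 1
    a = [[sum(xi ** (i + j) for xi in x) for j in range(k)] for i in range(k)]
    b = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(k)]
    return solve_linear_system(a, b)


def rss(x, y, coeffs):
    """Residual sum of squares of the fitted polynomial."""
    return sum(
        (yi - sum(c * xi ** p for p, c in enumerate(coeffs))) ** 2
        for xi, yi in zip(x, y)
    )


# Made-up illustrative data, NOT the potato data.
weeks = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
glucose = [10.2, 6.6, 5.3, 5.0, 7.0, 9.8, 15.0]

line = fit_polynomial(weeks, glucose, degree=1)       # [alpha, beta]
quadratic = fit_polynomial(weeks, glucose, degree=2)  # [alpha, beta, gamma]

print("line coefficients:     ", [round(c, 3) for c in line])
print("quadratic coefficients:", [round(c, 3) for c in quadratic])
print("RSS, straight line:", round(rss(weeks, glucose, line), 3))
print("RSS, quadratic:    ", round(rss(weeks, glucose, quadratic), 3))
```

Although the curve is not a straight line, the equation is still linear in \(\alpha\), \(\beta\) and \(\gamma\), which is why the same least-squares idea applies. For data like these, the quadratic leaves a much smaller residual sum of squares than the straight line.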

We will return to this example in later weeks.