# Week 7 Decision Table

In this chapter, students will learn to create a \(2 \times 2\) **decision table** or a contingency table based on a test score and a criterion score. From the contingency table, students will be able to understand and obtain the **hit rate**, **sensitivity**, **specificity**, and **base rate** in R.

Test scores are often used for making **screening decisions**. For example, a company can decide whether to hire an applicant or not based on the job screening test score \(X\) and a cut score. In this case, the test score \(X\) is used to predict a future criterion score \(Y\) such as the productivity of workers.

A future criterion score \(Y\) predicted by \(X\) can be dichotomized into an **observed criterion outcome** with two categories, successful/unsuccessful. For example, we can classify workers who produce more than 18 units/hour as successful.

Based on the screening decision (e.g., hire or do not hire) and the observed criterion outcome (e.g., successful or unsuccessful), we can construct the \(2 \times 2\) **decision table**.

## 7.1 Decision Table

Let’s construct a \(2 \times 2\) decision table using the same two data sets. First, import the two data sets and obtain the total scores.
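The data files themselves are not reproduced in this chapter, so the sketch below uses a tiny stand-in item-response data frame to show the total-score step; in the actual analysis the data frames would come from `read.csv()` on the two class data files.

```r
# Stand-in item responses (0/1) for three examinees; the real data
# would be loaded with read.csv() from the two class data files
test_items <- data.frame(item1 = c(1, 0, 1),
                         item2 = c(1, 1, 0),
                         item3 = c(1, 1, 1))

# Total score for each examinee: sum the item responses across columns
X <- rowSums(test_items)   # total scores 3, 2, 2 for this toy data
```

The same `rowSums()` step would be applied to the criterion data set to obtain \(Y\).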

Again, \(X\) is the test score used to predict the future criterion score \(Y\). Let’s assume that we predicted examinees with \(X\) greater than or equal to 13 to be successful in the future. \(Y\) is the actual outcome score and we can categorize examinees who got \(Y\) greater than or equal to 13 as successful.

- Note that the cut scores for the two tests are both 13 in this example. However, the cut-off scores do not always need to be the same for the two tests.

We can create logical vectors `predicted` and `actual` from \(X\) and \(Y\) using the cut-off score 13.
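A sketch of this step, using toy total-score vectors chosen to reproduce the first six values printed below (`X` and `Y` would be the actual total scores in the analysis):

```r
# Toy total scores for six examinees (stand-ins for the class data)
X <- c(10, 15, 9, 12, 8, 11)   # screening test scores
Y <- c(11, 9, 14, 10, 7, 12)   # criterion scores

cut <- 13
predicted <- X >= cut   # TRUE if predicted to be successful
actual    <- Y >= cut   # TRUE if actually successful
```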

```
## [1] FALSE TRUE FALSE FALSE
## [5] FALSE FALSE
```

```
## [1] FALSE FALSE TRUE FALSE
## [5] FALSE FALSE
```

The object `predicted` is a logical vector indicating whether each examinee is predicted to be successful in the future, and the object `actual` is a logical vector indicating whether each examinee is actually successful.

Applying the `sum()` function to a logical vector returns the number of `TRUE` values in the vector. The following code returns the number of examinees predicted to be successful.
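Since the full score vectors aren't reproduced here, the sketch below rebuilds a stand-in `predicted` with the same count as the text's data (44 of 100 examinees predicted successful):

```r
# Stand-in with the same count as the text's data
predicted <- rep(c(FALSE, TRUE), times = c(56, 44))

sum(predicted)   # number of examinees predicted to be successful
```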

`## [1] 44`

And the following returns the number of examinees that are actually successful.
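Again with a stand-in vector matching the text's count (33 of 100 actually successful):

```r
# Stand-in with the same count as the text's data
actual <- rep(c(FALSE, TRUE), times = c(67, 33))

sum(actual)   # number of examinees actually successful
```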

`## [1] 33`

Applying the `mean()` function to a logical vector returns the proportion of `TRUE` values. Therefore, the proportions of predicted success and actual success can be obtained by:
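A sketch with stand-in vectors matching the text's counts (44 and 33 successes out of 100):

```r
# Stand-ins with the same counts as the text's data
predicted <- rep(c(FALSE, TRUE), times = c(56, 44))
actual    <- rep(c(FALSE, TRUE), times = c(67, 33))

mean(predicted)   # proportion predicted to be successful
mean(actual)      # proportion actually successful
```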

`## [1] 0.44`

`## [1] 0.33`

We can combine the two vectors side by side to check whether the predicted outcome matches the actual outcome.
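The side-by-side view comes from `cbind()`, which binds the two logical vectors into a two-column matrix (shown here with toy vectors standing in for the full data):

```r
# Toy stand-ins for the full predicted/actual vectors
predicted <- c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE)
actual    <- c(FALSE, FALSE, TRUE, FALSE, FALSE, FALSE)

cbind(predicted, actual)   # one row per examinee
```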

```
## predicted actual
## [1,] FALSE FALSE
## [2,] TRUE FALSE
## [3,] FALSE TRUE
## [4,] FALSE FALSE
## [5,] FALSE FALSE
## [6,] FALSE FALSE
## [7,] TRUE TRUE
## [8,] FALSE TRUE
## [9,] FALSE FALSE
## [10,] FALSE FALSE
```

The first examinee was predicted to be unsuccessful, and the true outcome was also unsuccessful.

The second examinee was predicted to be successful but turned out to be unsuccessful.

The 8th examinee was predicted to be unsuccessful, but the actual outcome was successful.

Now, we can construct the \(2 \times 2\) **decision table** using the `table()` function.
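A sketch of this step; the vectors are reconstructed here from the cell counts in the table below (47 correct rejections, 20 false alarms, 9 misses, 24 hits) so the chunk runs on its own:

```r
# Reconstructed vectors with the same cell counts as the text's data
predicted <- rep(c(FALSE, TRUE, FALSE, TRUE), times = c(47, 20, 9, 24))
actual    <- rep(c(FALSE, FALSE, TRUE, TRUE), times = c(47, 20, 9, 24))

decision <- table(actual, predicted)   # rows: actual, columns: predicted
decision
```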

```
## predicted
## actual FALSE TRUE
## FALSE 47 20
## TRUE 9 24
```

There are four combinations:

```
## predicted
## actual FALSE
## FALSE Correct rejection
## TRUE Miss
## predicted
## actual TRUE
## FALSE False alarm
## TRUE Hit
```

- **Hit**: 24 examinees were predicted to be **successful**, and the true outcome was also **successful**.
- **Correct rejection**: 47 examinees were predicted to be **unsuccessful**, and the true outcome was also **unsuccessful**.
- **False alarm**: 20 examinees were predicted to be **successful**, but the true outcome was **unsuccessful**.
- **Miss**: 9 examinees were predicted to be **unsuccessful**, but the true outcome was **successful**.

## 7.2 Hit Rate

The **hit rate** is the **proportion of correct decisions**. In other words, the hit rate is the proportion of examinees whose predicted outcome matches their actual outcome: successful outcomes predicted as successful, and unsuccessful outcomes predicted as unsuccessful.

If our prediction or decision is correct, then the predicted outcome and the actual outcome should be equal. We can obtain the proportion of matches between the two vectors, `predicted` and `actual`.
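A sketch of this comparison, with the vectors reconstructed from the decision table's cell counts so the chunk is self-contained:

```r
# Reconstructed vectors with the same cell counts as the text's data
predicted <- rep(c(FALSE, TRUE, FALSE, TRUE), times = c(47, 20, 9, 24))
actual    <- rep(c(FALSE, FALSE, TRUE, TRUE), times = c(47, 20, 9, 24))

mean(predicted == actual)   # proportion of correct decisions (hit rate)
```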

`## [1] 0.71`

The hit rate is 0.71.

Equivalently, the hit rate is the proportion of `Hit` + `Correct rejection` decisions in the decision table. The decision table is a \(2 \times 2\) matrix, and we can subset each element of the matrix with `[]`. The following code chunk calculates the hit rate.
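A sketch of the subsetting, assuming `decision` holds the table (the vectors are reconstructed from the cell counts so the chunk runs on its own):

```r
# Reconstruct the decision table from the cell counts in the text
predicted <- rep(c(FALSE, TRUE, FALSE, TRUE), times = c(47, 20, 9, 24))
actual    <- rep(c(FALSE, FALSE, TRUE, TRUE), times = c(47, 20, 9, 24))
decision  <- table(actual, predicted)

decision["FALSE", "FALSE"]   # correct rejections
decision["TRUE", "TRUE"]     # hits
sum(decision)                # total number of examinees

# Hit rate: (hits + correct rejections) / total
(decision["FALSE", "FALSE"] + decision["TRUE", "TRUE"]) / sum(decision)
```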

`## [1] 47`

`## [1] 24`

`## [1] 100`

`## [1] 0.71`

## 7.3 Sensitivity and Specificity

The **sensitivity** is the proportion of truly successful outcomes that are correctly identified, and the **specificity** is the proportion of truly unsuccessful outcomes that are correctly identified. Therefore,

Sensitivity: \(\frac{\text{Hit}}{\text{Hit+Miss}}\)

Specificity: \(\frac{\text{Correct rejection}}{\text{False alarm + Correct rejection}}\)

The following code obtains the sensitivity and the specificity from the decision table.
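A sketch, again reconstructing `decision` from the cell counts so the chunk runs on its own:

```r
# Reconstruct the decision table from the cell counts in the text
predicted <- rep(c(FALSE, TRUE, FALSE, TRUE), times = c(47, 20, 9, 24))
actual    <- rep(c(FALSE, FALSE, TRUE, TRUE), times = c(47, 20, 9, 24))
decision  <- table(actual, predicted)

decision["TRUE", "TRUE"]     # hits
decision["TRUE", "FALSE"]    # misses
# Sensitivity: Hit / (Hit + Miss)
decision["TRUE", "TRUE"] / (decision["TRUE", "TRUE"] + decision["TRUE", "FALSE"])

decision["FALSE", "FALSE"]   # correct rejections
decision["FALSE", "TRUE"]    # false alarms
# Specificity: Correct rejection / (False alarm + Correct rejection)
decision["FALSE", "FALSE"] / (decision["FALSE", "FALSE"] + decision["FALSE", "TRUE"])
```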

`## [1] 24`

`## [1] 9`

`## [1] 0.7272727`

`## [1] 47`

`## [1] 20`

`## [1] 0.7014925`

The sensitivity is 0.7273 and the specificity is 0.7015.

We can use logical operators to calculate the sensitivity and the specificity.
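One way to do this, shown with vectors reconstructed from the cell counts: `&` picks out examinees in a single cell, and logical subsetting with `[` restricts to the truly successful (or unsuccessful) group.

```r
# Reconstructed vectors with the same cell counts as the text's data
predicted <- rep(c(FALSE, TRUE, FALSE, TRUE), times = c(47, 20, 9, 24))
actual    <- rep(c(FALSE, FALSE, TRUE, TRUE), times = c(47, 20, 9, 24))

sum(predicted & actual) / sum(actual)      # sensitivity
mean(predicted[actual])                    # sensitivity, equivalently
sum(!predicted & !actual) / sum(!actual)   # specificity
```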

`## [1] 0.7272727`

`## [1] 0.7272727`

`## [1] 0.7014925`