Application: Diagnostic Testing

In this application you will compute probabilities like those in Example 2.3. Choose a disease or medical condition that is interesting or relevant to you personally and for which a diagnostic test is available.

  1. Find an estimate of the probability that a person truly has the condition. (This is called the prevalence or incidence of the condition.[1]) Cite the source (and provide a link) and describe briefly the data upon which the estimate is based. Be sure to clarify what baseline group this probability applies to; e.g., people within a certain age range, with certain risk factors, etc. (like “pregnant women in their mid-to-late 30s” in Example 2.3). You will want to maintain a consistent baseline group throughout this problem. Write a sentence interpreting the numerical value of the probability in context. For example, fill in the bracketed words in the following with the specific details of your context: “Among many people with [these characteristics], [blank] percent have [the condition].”

  2. Find an estimate of the probability of testing positive for someone who truly has the condition. (This is called the sensitivity of the test. Careful: (100% - sensitivity), the false negative rate, might be reported instead.) Cite the source (and provide a link) and describe briefly the data upon which the estimate is based. Be sure the baseline group is consistent with the one from the previous problem. (For example, if your sensitivity is for people who have an identified risk factor but your prevalence is for people in general, then the baseline groups are inconsistent.) Write a sentence interpreting the numerical value of this probability in context.

  3. Find an estimate of the probability of testing negative for someone who truly does not have the condition. (This is called the specificity of the test. Careful: (100% - specificity), the false positive rate, might be reported instead.) Cite the source (and provide a link) and describe briefly the data upon which the estimate is based. Be sure the baseline group is consistent with the one from the previous problems. Write a sentence interpreting the numerical value of this probability in context.

  4. Construct a two-way table split by the test result (positive or negative) and status (truly has the condition or truly does not have the condition). Assume a nice round number for the total number of people in your baseline group. Fill in all the counts in the table.
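If you would like to check your table by computer, the construction in the last part can be sketched as follows. This is only an illustration under stated assumptions: the function name `two_way_table` and the input values (1% prevalence, 90% sensitivity, 95% specificity, 10,000 people) are hypothetical placeholders, not estimates for any real condition or test. Replace them with the values you found and cited.

```python
def two_way_table(prevalence, sensitivity, specificity, total=10_000):
    """Expected counts for each (true status, test result) cell.

    Counts are rounded to whole people, so choose `total` large enough
    that the rounding does not distort small cells.
    """
    has = round(total * prevalence)      # truly have the condition
    has_not = total - has                # truly do not have the condition
    true_pos = round(has * sensitivity)  # have it and test positive
    false_neg = has - true_pos           # have it but test negative
    true_neg = round(has_not * specificity)  # do not have it, test negative
    false_pos = has_not - true_neg           # do not have it, test positive
    return {
        "true_pos": true_pos,
        "false_neg": false_neg,
        "false_pos": false_pos,
        "true_neg": true_neg,
    }


# Hypothetical placeholder inputs, NOT real estimates for any condition.
table = two_way_table(prevalence=0.01, sensitivity=0.90, specificity=0.95)
for cell, count in table.items():
    print(f"{cell:>10}: {count}")
```

The row and column totals of your hand-built table should match the sums of the corresponding cells here.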

For the remaining parts, assume a person is selected at random from your baseline group and the test is administered.

  1. Use the table to estimate the probability that a person who tests positive truly has the condition. (This is called the precision or positive predictive value.) Write a sentence interpreting the numerical value of this probability in context.

  2. Use the table to estimate the probability that a person who tests positive truly does not have the condition. (This is called the false discovery rate.) Write a sentence interpreting the numerical value of this probability in context.

  3. Use the table to estimate the probability that the test result is correct. (This is called the accuracy.) Write a sentence interpreting the numerical value of this probability in context.

  4. Is a person who tests positive more likely to truly have the condition or not? How many times more likely?

  5. In which case is the person more likely to truly have the condition: if the test result is unknown, or if the test is positive? How many times more likely?

  6. Someone you know who satisfies the characteristics of your baseline group is given this test as part of a routine screening, and it comes back positive. The person knows that you have taken probability and asks for your advice regarding the chance that they actually have the condition. What would you tell them? Write a few sentences explaining, as non-technically as possible, what you would tell the person. What are the main numbers they should pay attention to, and how should they interpret them? Your goal is to give the person the most relevant assessment without overwhelming them with numbers and details. (Of course, your probability assessment should not replace expert medical opinion, but maybe your advice can help the person while they wait for a follow-up test.)
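The quantities in the parts above can all be read off the four cells of a two-way table. The sketch below shows the arithmetic, assuming hypothetical counts (90 true positives, 10 false negatives, 495 false positives, 9,405 true negatives out of 10,000 people). These counts and the variable names are placeholders for illustration only; substitute the counts from your own table.

```python
# Hypothetical cell counts, NOT data for any real condition or test.
true_pos, false_neg = 90, 10      # people who truly have the condition
false_pos, true_neg = 495, 9405   # people who truly do not
total = true_pos + false_neg + false_pos + true_neg
positives = true_pos + false_pos  # everyone who tests positive

ppv = true_pos / positives             # positive predictive value (precision)
fdr = false_pos / positives            # false discovery rate
accuracy = (true_pos + true_neg) / total
prevalence = (true_pos + false_neg) / total

# Among positives: how many times more likely "does not have" is than "has".
not_vs_has = false_pos / true_pos
# How many times more likely the condition is after a positive test
# than when the test result is unknown.
positive_vs_unknown = ppv / prevalence

print(f"PPV:                 {ppv:.3f}")
print(f"FDR:                 {fdr:.3f}")
print(f"Accuracy:            {accuracy:.4f}")
print(f"Not vs. has (pos.):  {not_vs_has:.1f} times")
print(f"Positive vs unknown: {positive_vs_unknown:.1f} times")
```

With these placeholder counts, only about 15% of people who test positive truly have the condition, even though the test is correct for about 95% of people overall. A positive result still makes the condition roughly 15 times more likely than it was before testing. Your own numbers may tell a different story.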


  1. Technically, prevalence and incidence are different, but you can treat them interchangeably for this exercise. Prevalence refers to people who have the condition right now, while incidence refers to people who will develop the condition over some period of time.