1 Probability fundamentals

Product data science interviews tend to allocate 15 minutes to an hour for probability theory problems. You will need to be proficient in the following:

  • Probability axioms, independence, conditional probability, Bayes rule
  • Combinatorics (permutations, combinations)
  • Discrete and continuous random variables, probability mass function, cumulative distribution function, probability density function
  • Distributions (Uniform, Bernoulli, Binomial, Poisson, Geometric)



For a deeper understanding of these topics I would suggest studying the excellent probability and stats book by Pishro-Nik (2014).

1.1 Trailing by two: should we go for two or three?

Easy Seen in a real interviewIndependence, Decision makingTry it


Question: Assume you are a basketball coach, and your team is losing by 2 points. You have the ball for the last shot. The probability of winning in overtime is roughly 50%. The probability of making a three-pointer is 36%. The probability of making a two-pointer is 52%.

  1. Should you go for a two-pointer and take the game to overtime, or should you go for a buzzer-beating three-pointer to win the game?

  2. Assuming everything else stays the same, adjust the probability of making a two-pointer to change the decision. What should that probability be?

Answer: We will use conditional probability and independence to solve this problem and make and data-driven decision.

  1. The probability of winning if you choose a two-pointer will be:

\[ \begin{align} \Pr \text{(win | 2pointer)} &= \Pr \text{(Overtime | 2pointer)} * Pr\text{(win in overtime)} \\ &= 0.52 * 0.5 = 0.26 \end{align} \]

Since 0.26 is less than 0.36, the coach should choose to go for a buzzer-beating three-pointer.

  1. To solve the second part, we need to solve for the two-pointer probability:

\[ \begin{align} \Pr \text{(win | 2pointer)} & \geq 0.36 \Leftrightarrow \\ \Pr \text{(Overtime | 2pointer)} & \geq 0.36/Pr\text{(win in overtime)} \\ & = 0.36/0.5 \\ & = 0.72 \end{align} \]


If you are interested in sports analytics and decision-making, check out the book by Winston, Nestler, and Pelechrinis (2022).


1.2 Red and blue balls

Easy Seen in a real interviewCounting, Combinations, Repetition, BinomialTry it


Question: Assume that there are 100 balls, 70 red and 30 blue.

  1. If you pick 5 with replacement, what’s the probability that all 5 are red?

  2. If you pick 5 without replacement, what’s the probability that all 5 are red?

  3. If you pick 10 without replacement, what’s the probability that 5 are red and 5 are blue?

Answer: This question tests combinations with and without permutations.

  1. Since we have repetition, we can essentially estimate the probability through a binomial distribution \(Bin(5,7/10)\):

\[ \Pr(5 \text{ red with repetition}) = \binom{5}{5} (7/10)^5 \approx 0.17 \]

  1. To get all 5 red, initially, we have 70 options, then 69, then 68, etc. To get any 5 balls, the combinations are 100, then 99, then 98. etc. So we can estimate this probability as:

\[ \Pr(5 \text{ red without repetition}) = \frac{70 * 69 * 68 * 67 * 66}{100 * 99 * 98 * 97 * 96} \]

Alternatively, we can use the binomial coefficient as follows:

\[ \Pr(5 \text{ red without repetition}) = \frac{\binom{70}{5}}{\binom{100}{5}} = \frac{\frac{70!}{5!65!}}{\frac{100!}{5! 95!}} = \frac{70 * 69 * 68 * 67 * 66}{100 * 99 * 98 * 97 * 96} \approx 0.16 \]

You can read the second equation as the number of ways I can choose 5 out of 70 reds over the number of ways that I can choose 5 out of 100 balls.

  1. Similar to b, we can estimate this as follows:

\[ \Pr(5 \text{ red without repitition}) = \frac{\binom{70}{5} \binom{30}{5}}{\binom{100}{10}} \approx 0.004 \]

1.3 Games between two players

Medium Seen in a real interviewRecursive relationshipTry it


Question: Consider a game where two players play a sequential game (e.g., rolling a dice), and at each round, the player who plays has probability \(p\) of winning. What is the probability that the player who plays first wins?

Answer: This is a very simple class of problems that we have seen repeatedly in interview settings. These problems are actually trivial once you solve a few examples, but for many, they might be tricky to solve if you haven’t seen them before.

The main thing to identify in this case is that the game goes to infinity the probability of one of the two players to win goes to 1. However, at each iteration, the playing participant has a probability \(p\) of winning. Keeping that in mind, we can come up with a recursive relationship that estimates the probability of the first player winning:

\[ \Pr\text{(win)} = p + (1-p)^2 * \Pr\text{(win)} \Leftrightarrow \Pr\text{(win)} = \frac{1}{2-p} \]

There are several variations of this sequential game, all of which are solved in the same recursive way. You can find more examples by searching for recursive relationship problems at https://www.madinterview.com.

1.4 Sum of normally distributed random variables

Medium Seen in a real interviewNormal, PDF, CDFTry it


Question: Assume 4 standard normal random variables \(X_1, ..., X_4\). What is the probability of \(2X_1 + 4X_2 > X_3 + X_4\)?

Answer: To solve these types of problems formulaically you need to know how to add normal random variables (https://online.stat.psu.edu/stat414/lesson/26/26.1):

If \(X_1,X_2,...,X_n\) are mutually independent normal rvs with means \(\mu_1,\mu_2,..., \mu_n\) and variances \(\sigma_1^2, \sigma_2^2 ..., \sigma_n^2\), then the linear combination \(Y = \sum_{i=1}^n c_i X_i\) follows the normal distribution with:

\[\begin{equation} \mu = \sum_i^n c_i \mu_i, \;\;\; \sigma^2 = \sum_i^n c_i^2 \sigma_i^2 \end{equation}\]


In this example, we are actually looking for the probability \(\Pr(Y>0)\) given that \(Y = 2X_1 + 4X_2 - X_3 - X_4\). Following the above formula, \(Y\) will be following a normal distribution with:

\[ \begin{align} \mu &= 2 * 0 + 4 * 0 - 0 - 0 = 0 \\ \sigma^2 &= 4 + 16 + 1 + 1 = 22 \end{align} \]

Hence \(Y \sim N(0,22)\). Since the normal distribution is symmetric, and since \(Y\)’s mean is zero:

\[ Pr(Y > 0) \approx Pr(Y\leq 0) = 0.5 \]


We did not need to estimate the variance to solve this question, but we did it for completeness. The interviewer will primarily be looking into a general understanding that summing normal random variables yields a new normal and that the normal is symmetric around its mean.


1.5 Additional Questions

The following probability questions are included in the complete book that defines the bar for product data science:

Question Topics
Paths to destination Counting, Combinations
Documents edited by AI Bayes rule, Conditional probability
Two children I Prior evidence
Consecutive tails Permutations, Repetition
Number of emails Poisson distribution
Two fair die rolls Independence, CDF, PMF
Repeated rolls until 4 Geometric distribution
Unfair coin probability Bayes rule, Conditional probability
Monty Hall {#monty} Bayes rule, Conditional independence, Prior evidence
Sample digits 1-10 Sample from samples, Uniform
Median probability Binomial, Uniform, CDF
Largest number rolled Counting, Permutations, Repetition

References

Pishro-Nik, Hossein. 2014. Introduction to Probability, Statistics, and Random Processes. Kappa Research, LLC Blue Bell, PA, USA.
Winston, Wayne L, Scott Nestler, and Konstantinos Pelechrinis. 2022. Mathletics: How Gamblers, Managers, and Fans Use Mathematics in Sports. Princeton University Press.