| \(H\) | 🔴 | 🔴 | 🔴 | 🔴 | 🔴 | 🔴 | 🔴 | 🔴 | 🔴 | 🔴 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(\neg H\) | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | |
| 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 | 🟢 |
The best way to memorize Bayes' Theorem is by rearranging terms from the definition of conditional probability.
Always remember:

\(P(A|B) = \frac{P(A \cap B)}{P(B)}\)

This definition allows us to "split" \(P(A \cap B)\) by conditioning on either \(A\) or \(B\):

\(P(A \cap B) = P(A|B)P(B)\)

\(P(A \cap B) = P(B|A)P(A)\)

Using the above properties on \(H\) and \(E\):

\( \begin{aligned} P(H|E) &= \frac{P(H \cap E)}{P(E)} \\ &= \frac{P(E|H) \cdot P(H)}{P(E)} \end{aligned} \)

Or:

\(P(H|E) = P(H) \cdot \frac{P(E|H)}{P(E)}\)

This theorem allows us to:
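A patient is about to be tested for a certain disease. Let \(H\) be the hypothesis that the patient has the disease, and \(E\) the evidence that the test result is positive.

Let's assume:

- \(P(H) = 1\%\): the probability of having the disease before taking the test
- \(P(E|H) = 98\%\): the probability of a positive result if the patient has the disease
- \(P(E|\neg H) = 5\%\): the probability of a positive result if the patient does not have the disease
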
The **posterior (probability)**, \(P(H|E)\), which is the probability of the hypothesis after observing the evidence, is given by:

\(P(H|E) = P(H) \cdot \frac{P(E|H)}{P(E)}\)

Note that to calculate \(P(E)\), the **marginal likelihood**, we need to split it into two (or more, depending on the question) cases, here \(H\) being true or false, and add up their probabilities:

\( \begin{aligned} P(H|E) &= P(H) \cdot \frac{P(E|H)}{P(E|H) \cdot P(H) + P(E|\neg H) \cdot P(\neg H)} \\ &= 1\% \cdot \frac{98\%}{98\% \cdot 1\% + 5\% \cdot 99\%} \\ &\approx 1\% \cdot 16.53 \\ &\approx 16.53\% \end{aligned} \)

So, given a positive test result, the probability of having the disease is approximately 16.53%.
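
The same calculation can be sketched in a few lines of Python. This is only an illustration: the function name `posterior` and its parameter names are made up for this example, and it assumes exactly two cases (\(H\) true or \(H\) false) when computing \(P(E)\).

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H|E) using Bayes' Theorem with a two-case marginal likelihood.

    Hypothetical helper for illustration; assumes H is either true or false.
    """
    # Marginal likelihood: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    marginal = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    # Bayes' Theorem: P(H|E) = P(H) * P(E|H) / P(E)
    return prior * p_e_given_h / marginal


# Numbers from the calculation above: 1% prior, 98% and 5% positive rates.
p = posterior(prior=0.01, p_e_given_h=0.98, p_e_given_not_h=0.05)
print(f"P(H|E) = {p:.4f}")  # P(H|E) = 0.1653
```

Writing the marginal likelihood as its own variable mirrors the "split into cases" step, so adding a third case would only change that one line.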
This is very counter-intuitive. You tested positive, but the chance of having the disease is only 16.53%! Why is that?
The reason is that the prior probability of the hypothesis, \(P(H) = 1\%\), also called the **base rate**, is very low here. The evidence multiplied it by about \(16.53\), but the result is still low.
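
One way to see the effect of the base rate is to count people instead of multiplying probabilities. The sketch below assumes a hypothetical population of 10,000 and reuses the 1%, 98% and 5% figures from the calculation; the function name `positive_test_counts` is invented for this illustration.

```python
def positive_test_counts(population: int, prior: float,
                         p_e_given_h: float, p_e_given_not_h: float):
    """Return the expected (true positives, false positives) in a population.

    Hypothetical helper: counts are expected values, not simulated outcomes.
    """
    sick = population * prior
    healthy = population - sick
    true_positives = sick * p_e_given_h          # sick people who test positive
    false_positives = healthy * p_e_given_not_h  # healthy people who test positive
    return true_positives, false_positives


tp, fp = positive_test_counts(10_000, 0.01, 0.98, 0.05)
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"share of positives who are sick: {tp / (tp + fp):.4f}")

# Same test, different base rates: the posterior rises with the prior.
for prior in (0.001, 0.01, 0.1, 0.5):
    tp, fp = positive_test_counts(10_000, prior, 0.98, 0.05)
    print(f"prior {prior:>5.1%} -> posterior {tp / (tp + fp):.1%}")
```

With a 1% base rate, the healthy group is so much larger than the sick group that its 5% of false positives outnumbers the true positives, which is why a positive result still leaves the disease unlikely.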
Failing to consider the base rate is called the **base rate fallacy**. This is a very common cognitive bias because, while the probabilities are hard to calculate, "how representative is this event of the hypothesis" is much easier to answer, and we humans are likely to switch to an easier question unconsciously. It is related to a concept called the **representativeness heuristic**.