Conditional Probabilities and Bayes

Conditional Probabilities

Let’s say we want the probability that someone in the Bienen School of Music knows R.
We know there are about 700 people in the Bienen School of Music, and it’s safe to say that maybe 8 of them know R.

\(P(knowingR) = \frac{8}{700}\)
- The probability of knowing R in Bienen is 8 out of 700.
\(P(knowingR|thisClass) = \frac{13}{700}\)
- The probability of knowing R in a quarter in which this class is offered is 13 out of 700.
\(\frac{P(knowingR|thisClass)}{P(knowingR)} = 1.625\)
- What does this tell us? In a quarter that this class is being offered, you’re 62.5% more likely to find someone who knows R.
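
The ratio above can be checked with a few lines of arithmetic. This is a minimal Python sketch; the variable names are made up for illustration, and the counts (8, 13, and roughly 700) are the ones given in the notes.

```python
from fractions import Fraction

# Counts from the notes; 700 is the approximate size of the school.
school_size = 700
p_knowing_r = Fraction(8, school_size)               # P(knowingR)
p_knowing_r_given_class = Fraction(13, school_size)  # P(knowingR | thisClass)

# How much more likely is it to find someone who knows R
# in a quarter when the class is offered?
ratio = p_knowing_r_given_class / p_knowing_r
percent_increase = (ratio - 1) * 100

print(float(ratio))             # 1.625
print(float(percent_increase))  # 62.5
```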

For any one individual, knowing R is still quite rare (13 out of 700 is under 2%), but relative to the baseline, the probability goes up quite a bit.

“Conditional probabilities are an essential part of statistics because they allow us to demonstrate how information changes our beliefs.” (Will Kurt)

\(P(colorBlind) = .0425\)
\(P(colorBlind|female) = .005\)
\(P(colorBlind|male) = .08\)
If we pick someone at random, what is the probability that they are male and color blind?

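A first attempt might simply multiply the two population-level probabilities:
- \(P(male, colorBlind) = P(male) * P(colorBlind)\)
  - .5 * .0425 = .02125
- But this doesn’t answer the question, because multiplying like this is only valid when the two events are independent.
These are dependent probabilities: color blindness is far more common in males (.08) than in females (.005).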
- The true probability of finding a male who is color blind is the probability of picking a male multiplied by the probability of being color blind given that the person is male.
- \(P(male, colorBlind) = P(male) * P(colorBlind | male)\)
- .5 * .08 = .04
- In general: \(P(A,B) = P(A) * P(B|A)\)
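
To make the difference between the two calculations concrete, here is a small Python sketch. The variable names are illustrative, and `p_male = 0.5` assumes an even male/female split, as the .5 in the notes implies. It also checks that the two joint probabilities add back up to the overall \(P(colorBlind)\).

```python
import math

# Values from the notes; p_male = 0.5 assumes an even male/female split.
p_male = 0.5
p_color_blind = 0.0425
p_color_blind_given_male = 0.08
p_color_blind_given_female = 0.005

# Naive version: treats sex and color blindness as independent.
naive_joint = p_male * p_color_blind

# Product rule for dependent events: P(A, B) = P(A) * P(B | A).
joint_male_cb = p_male * p_color_blind_given_male
joint_female_cb = (1 - p_male) * p_color_blind_given_female

# Sanity check: the two joint probabilities sum to P(colorBlind).
assert math.isclose(joint_male_cb + joint_female_cb, p_color_blind)

print(naive_joint)    # 0.02125
print(joint_male_cb)  # 0.04
```

The naive figure is roughly half the correct one because it ignores how much more common color blindness is among males.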

How can we determine \(P(male|colorBlind)\)?
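
One way to answer this is to rearrange the product rule into Bayes’ theorem: \(P(male|colorBlind) = \frac{P(male) * P(colorBlind|male)}{P(colorBlind)}\). The sketch below plugs in the numbers from the notes using exact fractions; the variable names are illustrative, and the even male/female split is an assumption carried over from the .5 used earlier.

```python
from fractions import Fraction

p_male = Fraction(1, 2)             # P(male), assuming an even split
p_cb_given_male = Fraction(8, 100)  # P(colorBlind | male) = .08
p_cb = Fraction(425, 10000)         # P(colorBlind) = .0425

# Bayes' theorem:
# P(male | colorBlind) = P(male) * P(colorBlind | male) / P(colorBlind)
p_male_given_cb = p_male * p_cb_given_male / p_cb

print(p_male_given_cb)                   # 16/17
print(round(float(p_male_given_cb), 3))  # 0.941
```

Under these numbers, about 94% of the color-blind people we might pick at random would be male.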

Kurt figure 8.1

\(P(robbed | brokenWindow, openDoor, missingLaptop)\)
What is the probability that, if you were robbed, you’d come home and find this evidence? Say 3 out of 10:
\(P(brokenWindow, openDoor, missingLaptop|robbed) = \frac{3}{10}\)
\(P(robbed) = \frac{1}{1000}\)
Writing \(D\) for the observed evidence, \(P(robbed|D) = \frac{P(robbed) * P(D|robbed)}{P(D)}\)
- = \(\frac{\frac{1}{1000} * \frac{3}{10}}{P(D)}\)

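Now consider an alternative hypothesis, \(H_2\), in which each piece of evidence has its own separate, unrelated cause:
\(P(kid) = \frac{1}{2000}\)
\(P(door) = \frac{1}{30}\)
\(P(laptop) = \frac{1}{365}\)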
\(P(H_2) = \frac{1}{2000} * \frac{1}{30} * \frac{1}{365}\)
- = 1 out of 21,900,000.
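
The product for \(H_2\) is easy to verify. The sketch below uses exact fractions, with illustrative variable names; multiplying the three probabilities treats the three events as independent of one another.

```python
from fractions import Fraction

# Probabilities from the notes for the three separate events.
p_kid = Fraction(1, 2000)
p_door = Fraction(1, 30)
p_laptop = Fraction(1, 365)

# Multiplying assumes the three events are independent.
p_h2 = p_kid * p_door * p_laptop

print(p_h2)  # 1/21900000
```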

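To compare the two hypotheses, take the ratio of their numerators. The \(P(D)\) terms cancel, \(H_1\) is the robbery hypothesis, and \(P(D | H_2)\) is taken to be 1:
\(\frac{P(H_1) * P(D | H_1)}{P(H_2) * P(D | H_2)}\)
\(\frac{\frac{3}{10,000}}{\frac{1}{21,900,000}}\)
= 6,570
\(H_1\) explains what we observed 6,570 times better than \(H_2\).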
Our original hypothesis explains the data much, much better than the second hypothesis.
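
The comparison can be reproduced as a short Python sketch. The variable names are illustrative, and `likelihood_h2 = 1` is an assumption (that \(H_2\), if true, fully accounts for the evidence), which is what the 1 out of 21,900,000 in the denominator implies.

```python
from fractions import Fraction

# H1: you were robbed.
prior_h1 = Fraction(1, 1000)     # P(H1)
likelihood_h1 = Fraction(3, 10)  # P(D | H1)

# H2: the three separate events, treated as independent.
prior_h2 = Fraction(1, 2000) * Fraction(1, 30) * Fraction(1, 365)  # P(H2)
likelihood_h2 = Fraction(1)      # P(D | H2), assumed to be 1

# P(D) appears in both posteriors, so it cancels in the ratio.
ratio = (prior_h1 * likelihood_h1) / (prior_h2 * likelihood_h2)

print(prior_h1 * likelihood_h1)  # 3/10000
print(ratio)                     # 6570
```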

|  | Found Schema | Didn’t Find Schema | Accuracy Rate |
| --- | --- | --- | --- |
| Orchestral | 36 | 114 | .24 |
| Piano | 50 | 100 | .33 |
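
The accuracy rates in the table can be recomputed from the counts. This sketch assumes that "Accuracy Rate" means found / (found + didn’t find), which matches the values shown; the dictionary layout and names are illustrative.

```python
# (found, did not find) counts from the table.
counts = {"Orchestral": (36, 114), "Piano": (50, 100)}

# Assumed definition: accuracy = found / (found + did not find).
accuracy = {
    name: found / (found + missed)
    for name, (found, missed) in counts.items()
}

for name, rate in accuracy.items():
    print(f"{name}: {rate:.2f}")
# Orchestral: 0.24
# Piano: 0.33
```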

[1] 0.95875