Conditional Probability #
Conditional probability updates the probability of an event when new information is available.
It shows up whenever a question says:
- “given that…”
- “among those who…”
- “out of the items that…”
- “if it does not fail immediately…”
Key takeaway: Conditional probability is always:
joint probability ÷ probability of the condition.
The condition must not be an impossible event.
Prior vs posterior #
Prior probability: probability with no condition (before new information)
Posterior probability: probability of the same event after new information (with a condition)
1) Conditioning changes the sample space #
Before you condition, your “universe” is the sample space \( S \) .
After you condition on \( B \) , your “universe” becomes: only the outcomes where \( B \) is true.
So:
- \( P(A) \) : probability of \( A \) in the full sample space
- \( P(A\mid B) \) : probability of \( A \) inside \( B \)
Conditioning “shrinks the universe”.
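To make the shrinking concrete, here is a minimal enumeration sketch with two fair dice (the specific events are chosen for illustration, not taken from the text): conditioning on "the first die is even" cuts the universe from 36 outcomes to 18, and the probability of "sum is at least 10" changes accordingly.

```python
from fractions import Fraction

# Full sample space: all ordered rolls of two fair dice (36 outcomes).
space = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]

A = {(d1, d2) for d1, d2 in space if d1 + d2 >= 10}  # event A: sum is at least 10
B = {(d1, d2) for d1, d2 in space if d1 % 2 == 0}    # condition B: first die is even

# P(A) in the full universe of 36 outcomes.
p_A = Fraction(len(A), len(space))

# Conditioning on B shrinks the universe to the 18 outcomes in B.
p_A_given_B = Fraction(len(A & B), len(B))

print(p_A)          # 1/6
print(p_A_given_B)  # 2/9
```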
2) Definition of conditional probability #
For events \( A \) and \( B \) , with \( P(B)>0 \) :
\[ P(A\mid B)=\frac{P(A\cap B)}{P(B)} \]
Meaning:
- Numerator \( P(A\cap B) \) : joint probability (“A and B together”)
- Denominator \( P(B) \) : probability of the condition (“B happened”)
Important: you cannot condition on an impossible event. If \( P(B)=0 \) , then \( P(A\mid B) \) is not defined.
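As a minimal sketch of the definition (the helper name `conditional` is mine, not from any library), including the guard for the undefined case:

```python
from fractions import Fraction

def conditional(p_joint, p_condition):
    """Return P(A|B) = P(A and B) / P(B); undefined when P(B) = 0."""
    if p_condition == 0:
        raise ValueError("cannot condition on an impossible event: P(B) = 0")
    return Fraction(p_joint) / Fraction(p_condition)

print(conditional(Fraction(1, 6), Fraction(1, 2)))  # 1/3
```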
3) Multiplication rule (joint probability) #
Start from the definition and rearrange:
\[ P(A\cap B)=P(B)\,P(A\mid B) \]
Swapping the roles of \( A \) and \( B \) gives the other form:
\[ P(A\cap B)=P(A)\,P(B\mid A) \]
The multiplication rule is valid for any two events, independent or dependent.
4) Chain rule for three events #
For \( A,B,C \) :
\[ P(A\cap B\cap C)=P(A)\,P(B\mid A)\,P(C\mid A\cap B) \]
Rule of thumb: each new event is conditioned on everything that happened before it.
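A standard illustration of the chain rule (a sketch using a deck-of-cards setup, not taken from the text): the probability that three cards drawn without replacement are all aces.

```python
from fractions import Fraction
from math import comb

# P(ace, ace, ace) = P(A) * P(B|A) * P(C|A and B):
# each factor conditions on the aces already removed from the deck.
p = Fraction(4, 52) * Fraction(3, 51) * Fraction(2, 50)

# Cross-check by direct counting over all 3-card hands.
assert p == Fraction(comb(4, 3), comb(52, 3))
print(p)  # 1/5525
```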
5) Independence as a special case #
Two events are independent if knowing one does not change the probability of the other.
Equivalent tests:
\[ P(A\mid B)=P(A) \]
\[ P(B\mid A)=P(B) \]
\[ P(A\cap B)=P(A)\,P(B) \]
6) Do not confuse: mutually exclusive vs independent #
Mutually exclusive means they cannot happen together:
\[ A\cap B=\varnothing \]
So:
\[ P(A\cap B)=0 \]
Mutually exclusive events are almost never independent.
If \( A\cap B=\varnothing \) and both events are possible, then \( P(A)P(B)>0 \) but \( P(A\cap B)=0 \), so \( P(A\cap B) \) and \( P(A)P(B) \) cannot be equal.
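A quick numeric contrast on one fair die (a sketch; the events are chosen for illustration): "even" and "at most 4" pass the product test, while the mutually exclusive pair {1} and {2} fail it.

```python
from fractions import Fraction

space = set(range(1, 7))                       # one fair die
p = lambda E: Fraction(len(E & space), len(space))

# Independent pair: "even" and "at most 4".
A, B = {2, 4, 6}, {1, 2, 3, 4}
assert p(A & B) == p(A) * p(B)                 # 1/3 == 1/2 * 2/3

# Mutually exclusive pair: "roll a 1" and "roll a 2".
C, D = {1}, {2}
print(p(C & D))      # 0
print(p(C) * p(D))   # 1/36 -- positive, so the product test fails
assert p(C & D) != p(C) * p(D)
```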
7) Visual guide: what relationship do events have? #
```mermaid
%%{init: {'theme':'base','themeVariables': {
  'fontFamily':'Inter, ui-sans-serif, system-ui',
  'primaryColor':'#E8F1FF',
  'primaryTextColor':'#1F2937',
  'primaryBorderColor':'#A7C7FF',
  'lineColor':'#94A3B8',
  'tertiaryColor':'#F8FAFC'
}}}%%
flowchart LR
  A["Two events A and B"] --> Q{"Can A and B happen together?"}
  Q -->|No| M["Mutually exclusive<br/>A ∩ B = ∅"]
  Q -->|Yes| R{"Does knowing A change B?"}
  R -->|No| I["Independent<br/>P(A ∩ B)=P(A)P(B)"]
  R -->|Yes| D["Dependent<br/>Use conditional probability"]
```
8) Worked patterns #
Pattern A: “fraction of those who passed first also passed second” #
Let:
- \( A \) : passed exam 1
- \( B \) : passed exam 2
If you are given \( P(A\cap B) \) and \( P(A) \) :
\[ P(B\mid A)=\frac{P(A\cap B)}{P(A)} \]
Example values: if \( P(A\cap B)=0.35 \) and \( P(A)=0.42 \),
\[ P(B\mid A)=\frac{0.35}{0.42}=\frac{35}{42}=\frac{5}{6} \]
Pattern B: “given it does not fail immediately” #
Conditioning removes some outcomes.
Example structure: if “fails immediately” bulbs are excluded by the condition, then the sample space becomes only: (partially defective + acceptable)
\[ P(\text{acceptable}\mid \text{not immediate fail}) =\frac{\text{acceptable}}{\text{acceptable}+\text{partial}} \]
Pattern C: table-based conditional probability (counts) #
If you have a two-way table (like Age group vs Default Yes/No), then:
- joint count: a cell in the table
- marginal count: row total or column total
Example structure:
\[ P(\text{No default}\mid \text{Middle-aged}) =\frac{\text{No default and Middle-aged}}{\text{Middle-aged total}} \]
Reverse conditioning changes the denominator:
\[ P(\text{Middle-aged}\mid \text{No default}) =\frac{\text{No default and Middle-aged}}{\text{No default total}} \]
9) Common traps #
Trap 1: mixing up “A given B” and “B given A” #
They are usually different.
\[ P(A\mid B)=\frac{P(A\cap B)}{P(B)} \qquad P(B\mid A)=\frac{P(A\cap B)}{P(A)} \]
Same joint probability, different denominators.
Trap 2: adding conditional probabilities #
In general: \( P(C\mid B)+P(C\mid B^c) \) is not equal to 1.
What is true is the weighted version:
\[ P(C)=P(C\mid B)P(B)+P(C\mid B^c)P(B^c) \]
Trap 3: complements of independent events #
If \( A \) and \( B \) are independent, then complements remain independent: \( A^c \) and \( B \) are also independent.
A quick check form (assuming \( 0<P(A)<1 \)):
\[ P(B\mid A)=P(B)\ \Rightarrow\ P(B\mid A^c)=P(B) \]
Mini-check (self-test) #
- What must be true for \( P(A\mid B) \) to be defined?
- Rewrite \( P(A\cap B) \) using conditional probability.
- If \( A \) and \( B \) are independent, what is \( P(A\mid B) \) ?
- If \( A \) and \( B \) are mutually exclusive and both possible, are they independent?
Answers:
- \( P(B)>0 \)
- \( P(A\cap B)=P(B)P(A\mid B) \)
- \( P(A\mid B)=P(A) \)
- No
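Beyond the self-test, the worked patterns and Trap 2 above can be checked numerically. A sketch follows; the two-way table counts are illustrative stand-ins (the section does not give concrete numbers), and the Trap 2 probabilities are likewise made up for the check.

```python
from fractions import Fraction

# Pattern A: P(B|A) = P(A and B) / P(A) with the values from the text.
assert Fraction(35, 100) / Fraction(42, 100) == Fraction(5, 6)

# Pattern C: hypothetical two-way counts (Age group vs Default: Yes, No).
counts = {"Young":       (30,  70),
          "Middle-aged": (20, 130),
          "Senior":      (10,  40)}
joint = counts["Middle-aged"][1]                  # No default AND Middle-aged = 130
row_total = sum(counts["Middle-aged"])            # Middle-aged total = 150
col_total = sum(no for _, no in counts.values())  # No default total  = 240
print(Fraction(joint, row_total))   # P(No default | Middle-aged) = 13/15
print(Fraction(joint, col_total))   # P(Middle-aged | No default) = 13/24

# Trap 2: the two conditionals need not sum to 1, but the weighted sum is P(C).
p_B, p_C_given_B, p_C_given_Bc = Fraction(1, 4), Fraction(1, 2), Fraction(1, 5)
assert p_C_given_B + p_C_given_Bc != 1
p_C = p_C_given_B * p_B + p_C_given_Bc * (1 - p_B)
print(p_C)  # 11/40
```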
What’s next #
- Total probability and Bayes’ theorem
- This is where you combine conditional probability with a partition of the sample space and “reverse” the conditioning.