Conditional Probability #

Conditional probability updates the probability of an event when new information is available.

It shows up whenever a question says:

  • “given that…”
  • “among those who…”
  • “out of the items that…”
  • “if it does not fail immediately…”

Key takeaway: Conditional probability is always:

joint probability ÷ probability of the condition.

The condition must not be an impossible event.


Prior vs posterior #

  • Prior probability: probability with no condition (before new information)

  • Posterior probability: probability of the same event after new information (with a condition)


1) Conditioning changes the sample space #

Before you condition, your “universe” is the sample space \( S \) .

After you condition on \( B \) , your “universe” becomes: only the outcomes where \( B \) is true.

So:

  • \( P(A) \) : probability of \( A \) in the full sample space
  • \( P(A\mid B) \) : probability of \( A \) inside \( B \)

Conditioning “shrinks the universe”.


2) Definition of conditional probability #

For events \( A \) and \( B \) , with \( P(B)>0 \) :

\[ P(A\mid B)=\frac{P(A\cap B)}{P(B)} \]

Meaning:

  • Numerator \( P(A\cap B) \) : joint probability (“A and B together”)
  • Denominator \( P(B) \) : probability of the condition (“B happened”)

Important: you cannot condition on an impossible event. If \( P(B)=0 \) , then \( P(A\mid B) \) is not defined.
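The definition is easy to check with exact fractions. Below is a minimal sketch using one fair die roll as the sample space; the events `A` (even) and `B` (greater than 3) are illustrative choices, not from the text.

```python
from fractions import Fraction

# Sample space: one fair die roll, all outcomes equally likely.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # roll is even
B = {4, 5, 6}   # roll is greater than 3

def prob(event):
    """P(event) under equally likely outcomes."""
    return Fraction(len(event), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a ∩ b) / P(b); undefined when P(b) = 0."""
    if prob(b) == 0:
        raise ValueError("cannot condition on an impossible event")
    return prob(a & b) / prob(b)

print(prob(A))          # P(A) = 1/2 in the full sample space
print(cond_prob(A, B))  # P(A | B) = 2/3: the universe shrinks to {4, 5, 6}
```

Note how conditioning raises the probability here: inside `B`, two of the three remaining outcomes are even.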


3) Multiplication rule (joint probability) #

Rearranging the definition gives the multiplication rule, in two equivalent forms:

\[ P(A\cap B)=P(B)\,P(A\mid B) \]

\[ P(A\cap B)=P(A)\,P(B\mid A) \]

The multiplication rule is valid for any two events, independent or dependent.
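Both factorizations recover the same joint probability. A quick sketch, reusing the die-roll events from above (illustrative, not from the text):

```python
from fractions import Fraction

# One fair die roll; A = even, B = greater than 3 (illustrative events).
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}
B = {4, 5, 6}

def prob(e):
    return Fraction(len(e), len(S))

def cond(a, b):
    return prob(a & b) / prob(b)

# Both factorizations agree with the joint probability.
print(prob(B) * cond(A, B))  # P(B) P(A|B) = 1/3
print(prob(A) * cond(B, A))  # P(A) P(B|A) = 1/3
print(prob(A & B))           # P(A ∩ B)    = 1/3
```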


4) Chain rule for three events #

For \( A,B,C \) :

\[ P(A\cap B\cap C)=P(A)\,P(B\mid A)\,P(C\mid A\cap B) \]

Rule of thumb: each new event is conditioned on everything that happened before it.
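The chain rule can be sketched with a classic example (my choice, not from the text): drawing three cards without replacement and asking for the probability that all three are aces. Each factor conditions on everything drawn so far.

```python
from fractions import Fraction

# Draw three cards without replacement from a standard 52-card deck.
p_first  = Fraction(4, 52)  # P(A): first card is an ace
p_second = Fraction(3, 51)  # P(B | A): second is an ace, given the first was
p_third  = Fraction(2, 50)  # P(C | A ∩ B): third is an ace, given the first two were

p_all_three = p_first * p_second * p_third
print(p_all_three)  # 1/5525
```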


5) Independence as a special case #

Two events are independent if knowing one does not change the probability of the other.

Equivalent tests:

\[ P(A\mid B)=P(A) \] \[ P(B\mid A)=P(B) \] \[ P(A\cap B)=P(A)\,P(B) \]
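The product test is easy to run on small finite examples. A sketch with illustrative events on one die roll (chosen so that the test passes):

```python
from fractions import Fraction

# One fair die roll; these events happen to be independent.
S = {1, 2, 3, 4, 6, 5}
A = {1, 2}       # P(A) = 1/3
B = {2, 4, 6}    # P(B) = 1/2

def prob(e):
    return Fraction(len(e), len(S))

# Product test: P(A ∩ B) == P(A) P(B)?
print(prob(A & B) == prob(A) * prob(B))  # True: 1/6 == 1/3 * 1/2
```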

6) Do not confuse: mutually exclusive vs independent #

Mutually exclusive means they cannot happen together:

\[ A\cap B=\varnothing \]

So:

\[ P(A\cap B)=0 \]

Mutually exclusive events with positive probability are never independent.

If \( A\cap B=\varnothing \) and both events are possible, then \( P(A)P(B)>0 \) but \( P(A\cap B)=0 \), so the two sides of the independence test cannot be equal.
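A numeric check with two illustrative, non-overlapping events on a die roll:

```python
from fractions import Fraction

# One fair die roll; A and B are mutually exclusive but both possible.
S = {1, 2, 3, 4, 5, 6}
A = {1, 2}   # P(A) = 1/3
B = {3, 4}   # P(B) = 1/3, and A ∩ B = ∅

def prob(e):
    return Fraction(len(e), len(S))

print(prob(A & B))        # 0: they never happen together
print(prob(A) * prob(B))  # 1/9: positive, so the independence test fails
```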


7) Visual guide: what relationship do events have? #

```mermaid
%%{init: {'theme':'base','themeVariables': {
  'fontFamily':'Inter, ui-sans-serif, system-ui',
  'primaryColor':'#E8F1FF',
  'primaryTextColor':'#1F2937',
  'primaryBorderColor':'#A7C7FF',
  'lineColor':'#94A3B8',
  'tertiaryColor':'#F8FAFC'
}}}%%
flowchart LR
  A["Two events A and B"] --> Q{"Can A and B happen together?"}
  Q -->|No| M["Mutually exclusive<br/>A ∩ B = ∅"]
  Q -->|Yes| R{"Does knowing A change B?"}
  R -->|No| I["Independent<br/>P(A ∩ B)=P(A)P(B)"]
  R -->|Yes| D["Dependent<br/>Use conditional probability"]
```

8) Worked patterns #

Pattern A: “fraction of those who passed first also passed second” #

Let:

  • \( A \) : passed exam 1
  • \( B \) : passed exam 2

If you are given \( P(A\cap B) \) and \( P(A) \) :

\[ P(B\mid A)=\frac{P(A\cap B)}{P(A)} \]

Example values: if \( P(A\cap B)=0.35 \) and \( P(A)=0.42 \) ,

\[ P(B\mid A)=\frac{0.35}{0.42}=\frac{35}{42}=\frac{5}{6} \]
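Verifying the arithmetic above with exact fractions:

```python
from fractions import Fraction

# Pattern A with the example values from the text.
p_joint = Fraction(35, 100)  # P(A ∩ B) = 0.35: passed both exams
p_a     = Fraction(42, 100)  # P(A)     = 0.42: passed exam 1

print(p_joint / p_a)  # P(B | A) = 5/6
```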

Pattern B: “given it does not fail immediately” #

Conditioning removes some outcomes.

Example structure: if the condition excludes the bulbs that fail immediately, the sample space shrinks to the remaining bulbs (partially defective + acceptable):

\[ P(\text{acceptable}\mid \text{not immediate fail}) =\frac{\text{acceptable}}{\text{acceptable}+\text{partial}} \]
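A sketch with hypothetical counts (not from the text): out of 100 bulbs, 5 fail immediately, 10 are partially defective, 85 are acceptable.

```python
from fractions import Fraction

# Hypothetical bulb counts for Pattern B.
immediate, partial, acceptable = 5, 10, 85

# Conditioning on "not immediate fail" removes the 5 immediate failures,
# leaving acceptable + partial = 95 bulbs in the shrunken sample space.
p = Fraction(acceptable, acceptable + partial)
print(p)  # 17/19
```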

Pattern C: table-based conditional probability (counts) #

If you have a two-way table (like Age group vs Default Yes/No), then:

  • joint count: a cell in the table
  • marginal count: row total or column total

Example structure:

\[ P(\text{No default}\mid \text{Middle-aged}) =\frac{\text{No default and Middle-aged}}{\text{Middle-aged total}} \]

Reverse conditioning changes the denominator:

\[ P(\text{Middle-aged}\mid \text{No default}) =\frac{\text{No default and Middle-aged}}{\text{No default total}} \]
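A sketch of the table pattern with hypothetical counts (the age groups and numbers below are my own, not from the text). The key point is which total goes in the denominator.

```python
from fractions import Fraction

# Hypothetical two-way table of counts: (age group, default status) -> count.
table = {
    ("young",  "default"): 30, ("young",  "no_default"): 70,
    ("middle", "default"): 20, ("middle", "no_default"): 130,
    ("old",    "default"): 10, ("old",    "no_default"): 40,
}

def marginal(key, axis):
    """Row/column total: axis 0 matches age group, axis 1 matches default status."""
    return sum(n for cell, n in table.items() if cell[axis] == key)

# P(no default | middle-aged): joint count over the middle-aged row total.
p1 = Fraction(table[("middle", "no_default")], marginal("middle", 0))
# Reverse conditioning: same joint count over the no-default column total.
p2 = Fraction(table[("middle", "no_default")], marginal("no_default", 1))

print(p1)  # 13/15  (130 / 150)
print(p2)  # 13/24  (130 / 240)
```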

9) Common traps #

Trap 1: mixing up “A given B” and “B given A” #

They are usually different.

\[ P(A\mid B)=\frac{P(A\cap B)}{P(B)} \qquad P(B\mid A)=\frac{P(A\cap B)}{P(A)} \]

Same joint probability: different denominators.


Trap 2: adding conditional probabilities #

In general: \( P(C\mid B)+P(C\mid B^c) \) is not equal to 1.

What is true is the weighted version:

\[ P(C)=P(C\mid B)P(B)+P(C\mid B^c)P(B^c) \]
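A numeric check with hypothetical values (my own): the unweighted sum of conditionals can exceed 1, while the weighted version always gives the marginal \( P(C) \).

```python
from fractions import Fraction

# Hypothetical values: P(C|B) = 0.9, P(C|B^c) = 0.2, P(B) = 0.5.
p_c_given_b  = Fraction(9, 10)
p_c_given_bc = Fraction(2, 10)
p_b          = Fraction(1, 2)

# Naive sum: not a probability identity (here it exceeds 1).
print(p_c_given_b + p_c_given_bc)                    # 11/10
# Weighted version: the actual marginal P(C).
print(p_c_given_b * p_b + p_c_given_bc * (1 - p_b))  # 11/20
```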

Trap 3: complements of independent events #

If \( A \) and \( B \) are independent, then complements remain independent: \( A^c \) and \( B \) are also independent.

A quick check form:

\[ P(B\mid A)=P(B)\ \Rightarrow\ P(B\mid A^c)=P(B) \]
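A quick check on an illustrative pair of independent die-roll events: the product test holds for \( A \) and \( B \), and it still holds after swapping \( A \) for its complement.

```python
from fractions import Fraction

# One fair die roll; A = {1, 2} and B = {2, 4, 6} are independent.
S = {1, 2, 3, 4, 5, 6}
A, B = {1, 2}, {2, 4, 6}
Ac = S - A  # complement of A

def prob(e):
    return Fraction(len(e), len(S))

print(prob(A & B)  == prob(A)  * prob(B))   # True
print(prob(Ac & B) == prob(Ac) * prob(B))   # True: independence survives complementing
```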

Mini-check (self-test) #

  1. What must be true for \( P(A\mid B) \) to be defined?
  2. Rewrite \( P(A\cap B) \) using conditional probability.
  3. If \( A \) and \( B \) are independent, what is \( P(A\mid B) \) ?
  4. If \( A \) and \( B \) are mutually exclusive and both possible, are they independent?

Answers:

  1. \( P(B)>0 \)
  2. \( P(A\cap B)=P(B)P(A\mid B) \)
  3. \( P(A\mid B)=P(A) \)
  4. No

What’s next #

  • Total probability and Bayes’ theorem
  • This is where you combine conditional probability with a partition of the sample space and “reverse” the conditioning.
