Random Variables #
A random variable is a way to attach numbers to outcomes of a random experiment.
It lets us move from “what happened?” to “what number should we analyse?”
Key takeaway: A random variable is a function from the sample space to real numbers. Once you define the random variable clearly, the rest (pmf/pdf/cdf, mean, variance) becomes systematic.
1) Definition #
Random variable: a rule that assigns a number to each outcome.
\[ X: S \to \mathbb{R} \]
Where:
- \( S \) is the sample space
- \( \mathbb{R} \) is the set of real numbers
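As a concrete sketch (using the two-coin-toss experiment that appears later in these notes), a random variable is literally a function on outcomes:

```python
from itertools import product

# Sample space S: all outcomes of two coin tosses.
S = list(product("HT", repeat=2))  # ('H','H'), ('H','T'), ('T','H'), ('T','T')

# X: S -> R, here "number of heads" -- a function that attaches
# a number to each outcome.
def X(outcome):
    return outcome.count("H")

print([X(s) for s in S])  # each outcome mapped to a number
```

Once `X` is defined this way, every later object (pmf, cdf, mean, variance) is derived from it.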
2) Types of random variables #
Discrete random variable #
A discrete random variable takes values in a countable set (e.g., 0, 1, 2, 3, …).
Examples:
- number of heads in 2 coin tosses
- number of customers arriving in an hour
- number of cars in a drive-thru queue
Continuous random variable #
A continuous random variable can take any value in an interval.
Examples:
- time to finish a task
- height of a person
- temperature
Rule of thumb: Counts → usually discrete. Measurements → usually continuous.
3) Probability functions: pmf, pdf, cdf #
3.1 Discrete: probability mass function (pmf) #
The pmf gives the probability of each possible value.
\[ p(x)=P(X=x) \]
A valid pmf must satisfy:
\[ p(x)\ge 0 \quad \text{for all }x \]
\[ \sum_x p(x)=1 \]
Example: number of heads in 2 tosses #
Let \( X \) = number of heads. Outcomes: HH, HT, TH, TT, each equally likely (probability 1/4) for a fair coin.
So:
- \( P(X=0)=1/4 \)
- \( P(X=1)=2/4 \)
- \( P(X=2)=1/4 \)
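This tally can be reproduced with a short script; the sketch below enumerates the sample space and counts how many outcomes map to each value of X:

```python
from itertools import product
from collections import Counter
from fractions import Fraction

# Enumerate the sample space of two fair coin tosses.
outcomes = list(product("HT", repeat=2))  # HH, HT, TH, TT

# X = number of heads; count how many outcomes give each value of X.
counts = Counter(o.count("H") for o in outcomes)
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P(X={x}) = {p}")  # 1/4, 1/2, 1/4
```

Note that the probabilities sum to 1, as a valid pmf must.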
3.2 Continuous: probability density function (pdf) #
For a continuous random variable, any single point carries zero probability: \( P(X=c)=0 \) for every \( c \). So instead of individual values, we talk about intervals.
\[ P(a\le X\le b)=\int_a^b f(x)\,dx \]
A valid pdf must satisfy:
\[ f(x)\ge 0 \]
\[ \int_{-\infty}^{\infty} f(x)\,dx=1 \]
3.3 Cumulative distribution function (cdf) #
The cdf accumulates probability up to a point.
\[ F(x)=P(X\le x) \]
Discrete case: it is the running total of pmf values.
Continuous case: it is the area under the pdf to the left of x.
Discrete: the cdf jumps at each possible value (step-like). Continuous: the cdf is continuous, and smooth wherever the pdf is.
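Both cases can be sketched numerically. Below, the discrete cdf is the running total of the two-toss pmf; for the continuous case, an assumed pdf \( f(t)=2t \) on \([0,1]\) (whose exact cdf is \( F(x)=x^2 \)) is integrated with a midpoint sum:

```python
from itertools import accumulate

# Discrete: running total of pmf values (heads in 2 fair tosses).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}
xs = sorted(pmf)
cdf = dict(zip(xs, accumulate(pmf[x] for x in xs)))
print(cdf)  # {0: 0.25, 1: 0.75, 2: 1.0}

# Continuous: F(x) = area under the pdf to the left of x,
# approximated with a midpoint Riemann sum of f(t) = 2t.
def F(x, n=100_000):
    step = x / n
    return sum(2 * ((i + 0.5) * step) * step for i in range(n))

print(round(F(0.5), 6))  # close to the exact value 0.5**2 = 0.25
```

The printed discrete cdf shows the step-like jumps; the continuous approximation converges to the smooth curve \( x^2 \).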
4) Mean (Expectation) and Variance #
4.1 Expectation #
Discrete:
\[ E(X)=\sum_x x\,p(x) \]
Continuous:
\[ E(X)=\int_{-\infty}^{\infty} x\,f(x)\,dx \]
Meaning: the long-run average value of X if you repeat the experiment many times.
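A sketch of both formulas: the discrete sum for the two-toss pmf, and a midpoint-sum approximation of the integral for an assumed pdf \( f(t)=2t \) on \([0,1]\), whose exact mean is \( 2/3 \):

```python
# Discrete: E(X) = sum of x * p(x).
pmf = {0: 0.25, 1: 0.5, 2: 0.25}   # heads in 2 fair tosses
mean = sum(x * p for x, p in pmf.items())
print(mean)  # 1.0

# Continuous: E(X) = integral of t * f(t) dt, approximated numerically
# for the assumed pdf f(t) = 2t on [0, 1] (exact answer: 2/3).
n = 100_000
step = 1.0 / n
ev = sum(((i + 0.5) * step) * 2 * ((i + 0.5) * step) * step for i in range(n))
print(round(ev, 4))  # 0.6667
```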
4.2 Variance and standard deviation #
Variance measures spread around the mean.
Discrete:
\[ V(X)=\sum_x (x-\mu)^2\,p(x) \]
Continuous:
\[ V(X)=\int_{-\infty}^{\infty} (x-\mu)^2\,f(x)\,dx \]
Shortcut (both discrete and continuous):
\[ V(X)=E(X^2)-[E(X)]^2 \]
Standard deviation:
\[ \sigma_X = \sqrt{V(X)} \]
5) Rules to memorise #
Expected value rules:
\[ E(aX+b)=aE(X)+b \]
Variance rules:
\[ V(aX+b)=a^2V(X) \]
6) Two random variables: joint, marginal, conditional #
6.1 Joint distribution (discrete) #
\[ p(x,y)=P(X=x,\,Y=y) \]
6.2 Marginal distributions #
Sum over the other variable:
\[ p_X(x)=\sum_y p(x,y) \]
\[ p_Y(y)=\sum_x p(x,y) \]
6.3 Conditional distribution #
\[ p_{Y\mid X}(y\mid x)=\frac{p(x,y)}{p_X(x)}\quad (p_X(x)>0) \]
7) Covariance (relationship between X and Y) #
Covariance measures whether two variables move together.
\[ \operatorname{Cov}(X,Y)=E(XY)-E(X)E(Y) \]
- Positive covariance: large X tends to come with large Y.
- Negative covariance: large X tends to come with small Y.
- Zero covariance: no linear relationship (but they could still be dependent).
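A sketch with a small assumed joint pmf (the numbers in the table below are made up purely for illustration):

```python
from fractions import Fraction as F

# Assumed joint pmf p(x, y) over x, y in {0, 1} (illustrative numbers).
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8),
         (1, 0): F(2, 8), (1, 1): F(2, 8)}
assert sum(joint.values()) == 1  # valid joint pmf

ex  = sum(x * p for (x, _), p in joint.items())      # E(X)
ey  = sum(y * p for (_, y), p in joint.items())      # E(Y)
exy = sum(x * y * p for (x, y), p in joint.items())  # E(XY)

cov = exy - ex * ey  # Cov(X, Y) = E(XY) - E(X)E(Y)
print(cov)  # negative here: large X tends to come with small Y
```

The same loop over `joint.items()` also computes the marginals implicitly: summing `p` over one coordinate is exactly the marginal-distribution formula from section 6.2.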
8) Transformation of random variables (basic idea) #
Sometimes we define a new random variable \( Y=g(X) \).
Discrete: compute \( P(Y=y) \) by summing probabilities of all \( x \) values that map to y.
Continuous (monotone g): use a change-of-variables approach.
You will use transformations later for: standardisation, log transforms, and turning data into “nice” distributions.
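A discrete sketch: for \( Y=g(X) \) with a non-monotone \( g \) (here \( g(x)=(x-1)^2 \), chosen just for illustration), sum the probabilities of all \( x \) values that map to each \( y \):

```python
from collections import defaultdict
from fractions import Fraction

# pmf of X = number of heads in 2 fair tosses.
pmf_x = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def g(x):
    return (x - 1) ** 2  # illustrative transformation, not monotone

# P(Y = y) = sum of p(x) over all x with g(x) = y.
pmf_y = defaultdict(Fraction)
for x, p in pmf_x.items():
    pmf_y[g(x)] += p

for y, p in sorted(pmf_y.items()):
    print(f"P(Y={y}) = {p}")  # y=0 gets 1/2; y=1 gets 1/4 + 1/4 = 1/2
```

Both x = 0 and x = 2 map to y = 1, so their probabilities are pooled; this is exactly why non-monotone transformations need the summing step.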
Mini-check #
- If X is continuous, what is \( P(X=c) \) ?
- For a pmf, what must \( \sum_x p(x) \) equal?
- State the shortcut formula for variance.
Answers:
- 0
- 1
- \( V(X)=E(X^2)-[E(X)]^2 \)
References #
- Devore: Ch. 3 (discrete random variables) and Ch. 4 (continuous random variables)