Formula Sheet #

This page is a quick reference of definitions and formulas, grouped by module.


Notation #

  • Size: \( n \) (sample), \( N \) (population)
  • Mean: \( \bar{x} \) (sample), \( \mu \) (population)
  • Variance: \( s^2 \) (sample), \( \sigma^2 \) (population)
  • Standard deviation: \( s \) (sample), \( \sigma \) (population)
  • Complement: \( A^c \)
  • Intersection (“and”): \( A\cap B \); union (“or”): \( A\cup B \)
  • Conditional probability: \( P(A\mid B) \)

1. Basic Probability & Statistics #

1.1 Measures of Central Tendency #

Arithmetic mean #

Sample mean (ungrouped):

\[ \bar{x}=\frac{1}{n}\sum_{i=1}^{n} x_i \]

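As a quick check, the sample mean can be computed with Python’s standard `statistics` module (the scores below are made-up illustration data):

```python
import statistics

# Hypothetical sample of n = 5 scores (illustration only)
scores = [4, 8, 6, 5, 7]
n = len(scores)

# Sample mean: sum of the observations divided by n
mean_manual = sum(scores) / n           # (4 + 8 + 6 + 5 + 7) / 5 = 6.0
mean_builtin = statistics.mean(scores)  # same value via the stdlib
```
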

Basic Statistics #

Statistics: describes data (what you see).
Probability: models uncertainty (what you don’t know yet).

By the end of this topic you should be able to:

  • Summarise a dataset using central tendency and variability
  • Explain core probability ideas using simple examples
  • Apply the axioms of probability
  • Distinguish mutually exclusive vs independent events

flowchart TD
    A[Dataset] --> B[Central Tendency]
    A --> C[Variability]
    B --> B1[Mean]
    B --> B2[Median]
    B --> B3[Mode]
    C --> C1[Range]
    C --> C2[Variance]
    C --> C3[Standard Deviation]
    C --> C4[IQR]

Measures of Central Tendency #

Central tendency tells you where the “middle” of the data is: it summarises a set of scores with a single number that represents the performance of the group as a whole.
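As a minimal sketch (made-up data; the variable names are mine), every measure in the diagram above can be computed with Python’s standard `statistics` module:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores

# Central tendency
mean_ = statistics.mean(data)      # 40 / 8 = 5.0
median_ = statistics.median(data)  # middle of the sorted data
mode_ = statistics.mode(data)      # most frequent value: 4

# Variability
range_ = max(data) - min(data)     # 9 - 2 = 7
var_ = statistics.pvariance(data)  # population variance
sd_ = statistics.pstdev(data)      # population standard deviation

# Interquartile range from the quartiles
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1
```
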


Basic Probability #

Probability models uncertainty: what you don’t know yet, but want to reason about.

Key takeaway: Probability is a number between 0 and 1 that measures how likely an event is. The whole topic is about defining events clearly and applying a few core rules consistently.

Probability quantifies uncertainty: a number between 0 and 1.

  • 0 means: impossible
  • 1 means: certain

Terminology #

Random experiment #

A random experiment is an action whose outcome is not known in advance.
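For example, rolling a die is a random experiment: we can simulate it and watch the relative frequency of an event settle near its probability (a sketch; the seed and trial count are arbitrary):

```python
import random

random.seed(0)  # fixed seed so the "experiment" is reproducible

# Random experiment: roll a fair six-sided die; the outcome is not known in advance.
trials = 10_000
even_count = sum(1 for _ in range(trials) if random.randint(1, 6) % 2 == 0)

# Relative frequency approximates P(even number) = 3/6 = 0.5
estimate = even_count / trials
```
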


Conditional Probability & Bayes’ Theorem #

Probability often changes when we learn new information.

Conditional probability and Bayes’ theorem give a structured way to update beliefs using evidence.

Conditional probability updates probabilities after observing an event.

Bayes’ theorem lets you estimate a hidden cause from observed evidence.

Naïve Bayes turns Bayes’ theorem into a practical classifier by assuming conditional independence of features given the class.


flowchart TD

A[Conditional<br/>probability] -->|foundation| B[Bayes<br/>theorem]
D[Independent<br/>events] -->|implies| C[Independence]
C -->|simplifies| A

E[Prior] -->|with likelihood| B
F[Likelihood] -->|updates| H[Posterior]
G[Evidence] -->|normalises| B
B -->|yields| H

I[Naïve<br/>Bayes] -->|uses| B
J[Naïve<br/>assumption] -->|assumes| C
K[Features] -->|given class| J
L[Class] -->|conditions| J
I -->|predicts| M[Classification]
M -->|selects| L

style A fill:#90CAF9,stroke:#1E88E5,color:#000
style B fill:#90CAF9,stroke:#1E88E5,color:#000
style C fill:#90CAF9,stroke:#1E88E5,color:#000

style D fill:#CE93D8,stroke:#8E24AA,color:#000
style E fill:#CE93D8,stroke:#8E24AA,color:#000
style F fill:#CE93D8,stroke:#8E24AA,color:#000
style G fill:#CE93D8,stroke:#8E24AA,color:#000
style J fill:#CE93D8,stroke:#8E24AA,color:#000
style K fill:#CE93D8,stroke:#8E24AA,color:#000
style L fill:#CE93D8,stroke:#8E24AA,color:#000

style H fill:#C8E6C9,stroke:#2E7D32,color:#000
style I fill:#C8E6C9,stroke:#2E7D32,color:#000
style M fill:#C8E6C9,stroke:#2E7D32,color:#000


Quick summary #

  • Conditional probability: updates probability after an event is known.
  • Multiplication rule: computes joint probability from conditional parts.
  • Independence: tested using \( P(A\cap B)=P(A)P(B) \) .
  • Total probability: breaks a probability into weighted cases.
  • Bayes’ theorem: reverses conditioning to infer causes from evidence.
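All five rules can be checked on one concrete example, drawing a single card from a standard 52-card deck, using exact arithmetic with `fractions` (the event names are mine):

```python
from fractions import Fraction

# One card from a standard 52-card deck
P_ace = Fraction(4, 52)            # P(A): card is an ace
P_spade = Fraction(13, 52)         # P(B): card is a spade
P_ace_and_spade = Fraction(1, 52)  # P(A ∩ B): the ace of spades

# Conditional probability: P(A | B) = P(A ∩ B) / P(B)
P_ace_given_spade = P_ace_and_spade / P_spade  # 1/13

# Independence test: is P(A ∩ B) == P(A) * P(B)?
independent = P_ace_and_spade == P_ace * P_spade  # True: suit says nothing about rank

# Total probability: split on the cases "spade" vs "not spade"
P_ace_given_not_spade = Fraction(3, 39)
P_total = P_ace_given_spade * P_spade + P_ace_given_not_spade * (1 - P_spade)

# Bayes' theorem: P(B | A) = P(A | B) * P(B) / P(A)
P_spade_given_ace = P_ace_given_spade * P_spade / P_ace  # 1/4
```
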

What’s next #

Probability Distributions
Move from events to random variables and distributions.


Conditional Probability #

Conditional probability updates the probability of an event when new information is available.

It shows up whenever a question says:

  • “given that…”
  • “among those who…”
  • “out of the items that…”
  • “if it does not fail immediately…”

Key takeaway: Conditional probability is always

\[ P(A\mid B)=\frac{P(A\cap B)}{P(B)} \]

i.e. joint probability ÷ probability of the condition. The condition must not be an impossible event: \( P(B)>0 \) is required.
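A tiny “among those who…” example (the survey counts are invented) shows the joint-over-condition pattern directly:

```python
# Hypothetical survey of 100 people
total = 100
owns_laptop = 60             # the condition: "given that they own a laptop"
owns_laptop_and_tablet = 24  # the joint event

p_condition = owns_laptop / total         # P(laptop) = 0.60
p_joint = owns_laptop_and_tablet / total  # P(laptop and tablet) = 0.24

# P(tablet | laptop) = joint probability / probability of the condition
p_tablet_given_laptop = p_joint / p_condition  # 0.24 / 0.60 = 0.4
```
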


Prior vs posterior #

  • Prior probability: probability with no condition (before new information)


Bayes’ Theorem #

2.1 Total probability (needed for Bayes) #

Often we split the world into cases \( E_1,E_2,\dots,E_k \) that:

  • are mutually exclusive
  • cover the whole sample space

Then for any event \( A \) :

\[ P(A)=\sum_{i=1}^{k} P(A\mid E_i)\,P(E_i) \]

Tree intuition: each branch \( E_i \) carries weight \( P(E_i) \); along branch \( E_i \), event \( A \) occurs with probability \( P(A\mid E_i) \). Multiplying along each branch and summing over all branches gives \( P(A) \).

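Numerically, total probability is just a weighted average. A classic made-up setup: three machines produce 50%, 30%, and 20% of all items, with defect rates 1%, 2%, and 3%:

```python
# Cases E1, E2, E3: which machine produced the item (mutually exclusive, exhaustive)
priors = [0.50, 0.30, 0.20]        # P(E_i)
defect_rates = [0.01, 0.02, 0.03]  # P(A | E_i), where A = "item is defective"

# P(A) = sum over i of P(A | E_i) * P(E_i): multiply along each branch, then add
p_defect = sum(pr * d for pr, d in zip(priors, defect_rates))  # 0.005 + 0.006 + 0.006
```
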

Naïve Bayes #

Naïve Bayes is a probabilistic classifier.

  • It addresses a supervised learning problem.
  • In binary classification, the target variable takes one of two classes.
  • The hypothesis is the class label you want to assign.
  • The prior (total) probabilities of Yes and No are computed first, from the class frequencies.
  • The posterior probabilities are computed once you condition on the observed data.
  • The instance is classified into the hypothesis with the maximum posterior probability.

It predicts a class label by computing, for each candidate class \( c \),

\[ P(c\mid x_1,\dots,x_m)\ \propto\ P(c)\prod_{i=1}^{m} P(x_i\mid c) \]

and choosing the class with the highest posterior.
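A minimal from-scratch sketch of this decision rule (the toy weather/play data and helper names are mine, not a standard API):

```python
from collections import Counter

# Toy training set: (weather, play?) pairs, invented for illustration
data = [("sunny", "no"), ("sunny", "no"), ("overcast", "yes"),
        ("rain", "yes"), ("rain", "yes"), ("sunny", "yes"),
        ("overcast", "yes"), ("rain", "no")]

class_counts = Counter(label for _, label in data)
n = len(data)

def posterior_score(feature, label):
    """Unnormalised posterior: P(class) * P(feature | class)."""
    prior = class_counts[label] / n
    likelihood = sum(1 for f, l in data
                     if f == feature and l == label) / class_counts[label]
    return prior * likelihood

def predict(feature):
    # Classify into the class with the maximum posterior score
    return max(class_counts, key=lambda label: posterior_score(feature, label))
```
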


Probability Distributions #

Probability distributions are the bridge between real-world randomness and mathematical modelling.

A random experiment produces outcomes. A random variable turns those outcomes into numbers. A probability distribution tells you how likely each number (or range of numbers) is.

Key takeaway: A distribution is a complete “story” about uncertainty: what values are possible, how likely they are, and how we summarise them (mean, variance).


flowchart TD
	PD["Probability<br/>distributions"] --> RV["Random<br/>variables"]
	PD["Probability<br/>distributions"] --> DS["Common<br/>distributions"]

	style PD fill:#90CAF9,stroke:#1E88E5,color:#000
	style RV fill:#90CAF9,stroke:#1E88E5,color:#000
	style DS fill:#90CAF9,stroke:#1E88E5,color:#000

AI/ML Connection #

  • Many ML models are probabilistic: they assume data (or errors) follow a distribution.
  • Loss functions often come from distribution assumptions: squared loss aligns with Gaussian noise.
  • Naïve Bayes (from the previous module) becomes practical once you can model: \( P(X\mid Y) \) using suitable distributions.

In practice, choosing a distribution is a modelling decision. It affects prediction, uncertainty estimates, and what “rare” or “typical” means in your data.
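For instance, modelling “defective items in a batch of 10” as Binomial(n=10, p=0.1) (parameters chosen purely for illustration) immediately pins down what typical and rare counts look like:

```python
import math

def binom_pmf(k, n=10, p=0.1):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# The pmf sums to 1 over all possible counts 0..10
total = sum(binom_pmf(k) for k in range(11))

# Under this modelling choice, "no defects" has probability 0.9**10,
# so batches with many defects count as genuinely rare.
p_zero = binom_pmf(0)
```
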


Random Variables #

A random variable is a way to attach numbers to outcomes of a random experiment.

It lets us move from “what happened?” to “what number should we analyse?”

Key takeaway: A random variable is a function from the sample space to real numbers. Once you define the random variable clearly, the rest (pmf/pdf/cdf, mean, variance) becomes systematic.


flowchart TD
PD["Probability<br/>distributions"] --> RV["Random<br/>variables"]

RV --> T["Types"]
T --> RV1["Discrete<br/>RVs"]
T --> RV2["Continuous<br/>RVs"]

RV --> F["PMF / PDF / CDF"]
RV --> S["Mean / Variance<br/>Covariance"]
RV --> J["Joint & Marginal<br/>distributions"]
RV --> X["Transformations"]

style PD fill:#90CAF9,stroke:#1E88E5,color:#000
style RV fill:#90CAF9,stroke:#1E88E5,color:#000

style T fill:#CE93D8,stroke:#8E24AA,color:#000
style F fill:#CE93D8,stroke:#8E24AA,color:#000
style S fill:#CE93D8,stroke:#8E24AA,color:#000
style J fill:#CE93D8,stroke:#8E24AA,color:#000
style X fill:#CE93D8,stroke:#8E24AA,color:#000
style RV1 fill:#CE93D8,stroke:#8E24AA,color:#000
style RV2 fill:#CE93D8,stroke:#8E24AA,color:#000

1) Definition #

Random variable: a rule that assigns a number to each outcome.
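This definition can be made concrete in a few lines: toss a fair coin twice, and let \( X \) be the number of heads (a sketch; the names are mine):

```python
from itertools import product
from fractions import Fraction
from collections import defaultdict

# Sample space: all outcomes of two fair coin tosses
omega = list(product("HT", repeat=2))  # ('H','H'), ('H','T'), ('T','H'), ('T','T')

def X(outcome):
    """Random variable: the rule assigning a number (count of heads) to each outcome."""
    return outcome.count("H")

# Each outcome has probability 1/4; grouping them by the value of X gives the pmf
pmf = defaultdict(Fraction)
for outcome in omega:
    pmf[X(outcome)] += Fraction(1, 4)
# pmf: {0: 1/4, 1: 1/2, 2: 1/4}
```
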