<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Gradient Descent on Arshad Siddiqui</title><link>https://arshadhs.github.io/tags/gradient-descent/</link><description>Recent content in Gradient Descent on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 21 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://arshadhs.github.io/tags/gradient-descent/index.xml" rel="self" type="application/rss+xml"/><item><title>Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</guid><description>&lt;h1 id="gradient-descent-for-linear-regression">
 Gradient Descent for Linear Regression
 
 &lt;a class="anchor" href="#gradient-descent-for-linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an iterative optimisation method used to minimise the regression cost function by repeatedly updating parameters in the direction that reduces error.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Iterative method&lt;/strong>&lt;/li>
&lt;li>Types: batch / stochastic / mini-batch&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Gradient descent starts with initial parameter values and repeatedly updates them using the gradient until the cost stops decreasing.&lt;/p>
&lt;/blockquote>
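&lt;p>The update loop above can be sketched in a few lines of NumPy. This is a minimal illustration, not a library implementation: the data, the learning rate, and the iteration count are all invented for the example.&lt;/p>

```python
import numpy as np

# Hypothetical 1-D data: y = 3x + 2 plus noise (illustrative values only).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, 200)

w, b = 0.0, 0.0   # initial parameter values
lr = 0.5          # learning rate (a hyperparameter)

for _ in range(1000):          # repeat until the cost stops decreasing
    err = w * x + b - y        # prediction error
    # Gradients of the mean-squared-error cost with respect to w and b
    dw = 2 * np.mean(err * x)
    db = 2 * np.mean(err)
    w -= lr * dw               # step in the direction that reduces error
    b -= lr * db

print(w, b)   # should land near the true values 3 and 2
```

Each iteration moves the parameters a small step opposite the gradient; with a sensible learning rate the cost shrinks until the estimates settle near the values that generated the data.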


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
GD[&amp;#34;Gradient&amp;lt;br/&amp;gt;Descent&amp;#34;] --&amp;gt;|minimises| CF[&amp;#34;Cost&amp;lt;br/&amp;gt;function&amp;#34;]
GD --&amp;gt;|updates| W[&amp;#34;Parameters&amp;lt;br/&amp;gt;(weights)&amp;#34;]
GD --&amp;gt;|uses| GR[&amp;#34;Gradient&amp;lt;br/&amp;gt;(slope)&amp;#34;]

GD --&amp;gt; H[&amp;#34;Hyperparameters&amp;#34;]
H --&amp;gt; LR[&amp;#34;Learning&amp;lt;br/&amp;gt;rate&amp;#34;]
H --&amp;gt; BS[&amp;#34;Batch&amp;lt;br/&amp;gt;size&amp;#34;]
H --&amp;gt; EP[&amp;#34;Epochs&amp;#34;]

style GD fill:#90CAF9,stroke:#1E88E5,color:#000

style CF fill:#CE93D8,stroke:#8E24AA,color:#000
style W fill:#CE93D8,stroke:#8E24AA,color:#000
style GR fill:#CE93D8,stroke:#8E24AA,color:#000
style H fill:#CE93D8,stroke:#8E24AA,color:#000
style LR fill:#CE93D8,stroke:#8E24AA,color:#000
style BS fill:#CE93D8,stroke:#8E24AA,color:#000
style EP fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>
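&lt;p>Of the hyperparameters in the diagram, the learning rate is the most delicate. A toy sketch on the one-dimensional cost f(w) = w&amp;#178; (whose gradient is 2w) shows why: a small rate converges, while a rate that overshoots the minimum makes the parameter diverge. The function and values here are illustrative only.&lt;/p>

```python
def run_gd(lr, steps=50):
    """Minimise f(w) = w**2, whose gradient is 2*w, starting from w = 1.0."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w   # standard gradient-descent update
    return w

print(abs(run_gd(0.1)))   # small learning rate: w shrinks towards the minimum at 0
print(abs(run_gd(1.5)))   # too-large learning rate: each step overshoots, so w blows up
```

Each update multiplies w by (1 - 2*lr), so the iterates contract only when that factor has magnitude below 1; with lr = 1.5 the factor is -2 and the parameter doubles in size every step.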

&lt;hr>
&lt;h2 id="types-of-gd">
 Types of GD
 
 &lt;a class="anchor" href="#types-of-gd">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Gradient Descent&amp;lt;br/&amp;gt;types&amp;#34;] --&amp;gt; BGD[&amp;#34;Batch&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; SGD[&amp;#34;Stochastic&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; MGD[&amp;#34;Mini-batch&amp;lt;br/&amp;gt;GD&amp;#34;]

BGD --&amp;gt; ALL[&amp;#34;All data&amp;lt;br/&amp;gt;per step&amp;#34;]
BGD --&amp;gt; STB[&amp;#34;Smooth&amp;lt;br/&amp;gt;updates&amp;#34;]

SGD --&amp;gt; ONE[&amp;#34;1 sample&amp;lt;br/&amp;gt;per step&amp;#34;]
SGD --&amp;gt; FAST[&amp;#34;Quick&amp;lt;br/&amp;gt;progress&amp;#34;]
SGD --&amp;gt; NOISE[&amp;#34;Noisy&amp;lt;br/&amp;gt;updates&amp;#34;]

MGD --&amp;gt; MB[&amp;#34;Small batch&amp;lt;br/&amp;gt;per step&amp;#34;]
MGD --&amp;gt; PRACT[&amp;#34;Practical&amp;lt;br/&amp;gt;default&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style BGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style SGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style MGD fill:#C8E6C9,stroke:#2E7D32,color:#000

style ALL fill:#CE93D8,stroke:#8E24AA,color:#000
style STB fill:#CE93D8,stroke:#8E24AA,color:#000
style ONE fill:#CE93D8,stroke:#8E24AA,color:#000
style FAST fill:#CE93D8,stroke:#8E24AA,color:#000
style NOISE fill:#CE93D8,stroke:#8E24AA,color:#000
style MB fill:#CE93D8,stroke:#8E24AA,color:#000
style PRACT fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>
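&lt;p>The three variants differ only in how much data feeds each gradient step, so one loop can express all of them by varying the batch size. A sketch, with invented data and hyperparameters: batch_size equal to the dataset size gives batch GD, 1 gives stochastic GD, and anything in between is mini-batch.&lt;/p>

```python
import numpy as np

# Hypothetical data: y = 3x + 2 plus noise (illustrative values only).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 1000)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, 1000)

def gradient(w, b, xb, yb):
    """MSE gradient computed on an arbitrary subset (xb, yb) of the data."""
    err = w * xb + b - yb
    return 2 * np.mean(err * xb), 2 * np.mean(err)

w, b, lr = 0.0, 0.0, 0.5
batch_size = 32   # mini-batch; len(x) would be batch GD, 1 would be SGD

for epoch in range(20):
    idx = rng.permutation(len(x))            # shuffle once per epoch
    for start in range(0, len(x), batch_size):
        sel = idx[start:start + batch_size]  # the samples for this step
        dw, db = gradient(w, b, x[sel], y[sel])
        w -= lr * dw
        b -= lr * db
```

Smaller batches make each step cheaper but noisier; the mini-batch setting shown here is the practical middle ground the diagram labels the default.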

&lt;h3 id="batch">
 Batch
 
 &lt;a class="anchor" href="#batch">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>Computes the gradient over the entire dataset each step, so updates are smooth but every step is expensive; use only when the dataset is small enough, or when you have ample compute and training time&lt;/li>
&lt;/ul>
&lt;h3 id="sgd">
 SGD
 
 &lt;a class="anchor" href="#sgd">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>A common go-to choice: each step uses a single sample, so updates are cheap and initial progress is fast, at the cost of noisy convergence&lt;/p>
&lt;/li>
&lt;/ul></description></item></channel></rss>