<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Optimisation on Arshad Siddiqui</title><link>https://arshadhs.github.io/tags/optimisation/</link><description>Recent content in Optimisation on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Thu, 26 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://arshadhs.github.io/tags/optimisation/index.xml" rel="self" type="application/rss+xml"/><item><title>LNN for Regression</title><link>https://arshadhs.github.io/docs/ai/deep-learning/030-linear-neural-networks-for-regression/</link><pubDate>Sun, 15 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/030-linear-neural-networks-for-regression/</guid><description>&lt;h1 id="linear-neural-networks-for-regression">
 Linear Neural Networks for Regression
 
 &lt;a class="anchor" href="#linear-neural-networks-for-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>linear neural network for regression&lt;/strong> is a model that predicts a &lt;strong>continuous&lt;/strong> target by taking a weighted sum of input features and applying the &lt;strong>identity activation&lt;/strong> (so the output can be any real number).&lt;/p>
&lt;ul>
&lt;li>Single neuron for regression (predicting &lt;em>how much&lt;/em> / &lt;em>how many&lt;/em>)&lt;/li>
&lt;li>Data + linear model (single neuron, no hidden layers) + squared loss&lt;/li>
&lt;li>Training using the &lt;strong>batch gradient descent&lt;/strong> algorithm&lt;/li>
&lt;li>Prediction (inference)&lt;/li>
&lt;li>E.g. Auto MPG (UCI)-style prediction with a single neuron (from-scratch code)&lt;/li>
&lt;/ul>
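&lt;p>The pipeline above can be sketched from scratch. This is a minimal illustration on &lt;em>synthetic&lt;/em> data (not the actual Auto MPG dataset): a single neuron with identity activation, squared loss, and full-batch gradient descent.&lt;/p>

```python
# Minimal from-scratch sketch: single neuron y_hat = X w + b,
# identity activation, MSE loss, full-batch gradient descent.
# Synthetic data stands in for a real dataset such as Auto MPG.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                 # 100 samples, 3 features
true_w, true_b = np.array([2.0, -1.0, 0.5]), 3.0
y = X @ true_w + true_b + 0.01 * rng.normal(size=100)

w, b = np.zeros(3), 0.0
lr = 0.1                                      # learning rate
for _ in range(500):                          # one update per full batch
    y_hat = X @ w + b                         # linear model, identity activation
    err = y_hat - y
    grad_w = 2 * X.T @ err / len(y)           # d(MSE)/dw
    grad_b = 2 * err.mean()                   # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b
```

&lt;p>After training, &lt;code>X_new @ w + b&lt;/code> is the inference step: it returns a real number for each new input row.&lt;/p>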
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 D[&amp;#34;Data&amp;lt;br/&amp;gt;X, y&amp;#34;] --&amp;gt; M[&amp;#34;Linear model&amp;lt;br/&amp;gt;w, b&amp;lt;br/&amp;gt;Single neuron&amp;#34;]
 M --&amp;gt; A[&amp;#34;Activation&amp;lt;br/&amp;gt;Identity&amp;#34;]
 A --&amp;gt; L[&amp;#34;Loss&amp;lt;br/&amp;gt;MSE (Squared error)&amp;#34;]
 L --&amp;gt; O[&amp;#34;Optimiser&amp;lt;br/&amp;gt;Batch GD / Mini-batch GD&amp;#34;]
 O --&amp;gt; P[&amp;#34;Parameters&amp;lt;br/&amp;gt;w, b&amp;#34;]
 P --&amp;gt; I[&amp;#34;Inference&amp;lt;br/&amp;gt;Predict ŷ (number) for new x&amp;#34;]

 %% Pastel colour scheme
 style D fill:#E3F2FD,stroke:#1E88E5,stroke-width:1px
 style M fill:#E8F5E9,stroke:#43A047,stroke-width:1px
 style A fill:#FFF3E0,stroke:#FB8C00,stroke-width:1px
 style L fill:#FCE4EC,stroke:#D81B60,stroke-width:1px
 style O fill:#F3E5F5,stroke:#8E24AA,stroke-width:1px
 style P fill:#E0F7FA,stroke:#00838F,stroke-width:1px
 style I fill:#F1F8E9,stroke:#558B2F,stroke-width:1px
&lt;/pre>

&lt;hr>
&lt;h2 id="regression">
 Regression
 
 &lt;a class="anchor" href="#regression">#&lt;/a>
 
&lt;/h2>
&lt;p>Regression is a supervised learning task that predicts a continuous-valued output based on input features.&lt;/p></description></item><item><title>Gradient Descent Algorithm</title><link>https://arshadhs.github.io/docs/ai/deep-learning/035-gradient-descent-algorithm/</link><pubDate>Thu, 26 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/035-gradient-descent-algorithm/</guid><description>&lt;h1 id="gradient-descent-algorithm">
 Gradient Descent Algorithm
 
 &lt;a class="anchor" href="#gradient-descent-algorithm">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient Descent Algorithm (GDA) is&lt;/p>
&lt;ul>
&lt;li>an &lt;strong>optimisation method&lt;/strong>&lt;/li>
&lt;li>used to &lt;strong>train models&lt;/strong>&lt;/li>
&lt;li>by repeatedly updating parameters (weights and biases) to &lt;strong>reduce the loss&lt;/strong>&lt;/li>
&lt;/ul>
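&lt;p>The update rule behind those bullets is &lt;code>theta = theta - lr * grad&lt;/code>. A minimal sketch on an assumed toy loss, L(theta) = (theta - 4)^2, whose gradient is 2(theta - 4):&lt;/p>

```python
# Generic gradient-descent loop on a toy one-parameter loss.
# The minimum of L(theta) = (theta - 4)**2 is at theta = 4.
theta = 0.0
lr = 0.1                        # learning rate (step size)
for _ in range(100):
    grad = 2 * (theta - 4)      # dL/dtheta
    theta -= lr * grad          # update: move against the gradient
```

&lt;p>Each step shrinks the distance to the minimum by a constant factor (here 0.8), so &lt;code>theta&lt;/code> converges to 4. Real training replaces the toy gradient with one computed by backpropagation over a batch of data.&lt;/p>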
&lt;blockquote class="book-hint info">
&lt;p>In deep learning, the default training approach is almost always &lt;strong>mini-batch gradient descent&lt;/strong>, usually with &lt;strong>Adam&lt;/strong> or &lt;strong>SGD + momentum&lt;/strong>.&lt;/p>
&lt;/blockquote>
&lt;p>Gradient Descent is &lt;strong>used in both regression and classification&lt;/strong>.&lt;/p>
&lt;p>It’s not tied to the task type — it’s tied to the fact you have:&lt;/p></description></item><item><title>LNN for Classification</title><link>https://arshadhs.github.io/docs/ai/deep-learning/040-linear-neural-networks-for-classification/</link><pubDate>Sun, 15 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/040-linear-neural-networks-for-classification/</guid><description>&lt;h1 id="linear-nn-for-classification">
 Linear NN for Classification
 
 &lt;a class="anchor" href="#linear-nn-for-classification">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>Linear Neural Network (LNN) for classification&lt;/strong> uses &lt;strong>no hidden layers&lt;/strong>.&lt;br>
It learns a &lt;strong>linear decision boundary&lt;/strong> and outputs &lt;strong>class probabilities&lt;/strong>, then converts them into predicted classes.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Neural-network view:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Binary classification&lt;/strong> → logistic regression (single neuron + sigmoid)&lt;/li>
&lt;li>&lt;strong>Multi-class classification&lt;/strong> → softmax regression (K output neurons + softmax)&lt;/li>
&lt;/ul>
&lt;/blockquote>
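&lt;p>The multi-class case can be sketched as a single forward pass: a linear layer with K output neurons followed by softmax. Weights here are random placeholders, so the predictions are illustrative only.&lt;/p>

```python
# Softmax-regression forward pass: linear scores, then softmax,
# then argmax to convert probabilities into predicted classes.
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                # 5 samples, 4 features
W = rng.normal(size=(4, 3))                # K = 3 output neurons
b = np.zeros(3)

probs = softmax(X @ W + b)                 # each row sums to 1
preds = probs.argmax(axis=1)               # probabilities to class labels
```

&lt;p>For binary classification the same structure reduces to one output neuron with a sigmoid instead of the softmax.&lt;/p>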
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 D[&amp;#34;Data&amp;lt;br/&amp;gt;X, y&amp;#34;] --&amp;gt; M[&amp;#34;Linear model&amp;lt;br/&amp;gt;w, b&amp;#34;]
 M --&amp;gt; A[&amp;#34;Activation&amp;lt;br/&amp;gt;Sigmoid / Softmax&amp;#34;]
 A --&amp;gt; L[&amp;#34;Loss&amp;lt;br/&amp;gt;Cross-entropy&amp;#34;]
 L --&amp;gt; O[&amp;#34;Optimiser&amp;lt;br/&amp;gt;Mini-batch GD / Adam&amp;#34;]
 O --&amp;gt; P[&amp;#34;Updated parameters&amp;lt;br/&amp;gt;w, b&amp;#34;]
 P --&amp;gt; I[&amp;#34;Inference&amp;lt;br/&amp;gt;Probabilities → class&amp;#34;]

 %% Pastel colour scheme
 style D fill:#E3F2FD,stroke:#1E88E5,stroke-width:1px
 style M fill:#E8F5E9,stroke:#43A047,stroke-width:1px
 style A fill:#FFF3E0,stroke:#FB8C00,stroke-width:1px
 style L fill:#FCE4EC,stroke:#D81B60,stroke-width:1px
 style O fill:#F3E5F5,stroke:#8E24AA,stroke-width:1px
 style P fill:#E0F7FA,stroke:#00838F,stroke-width:1px
 style I fill:#F1F8E9,stroke:#558B2F,stroke-width:1px
&lt;/pre>

&lt;hr>
&lt;h2 id="classification">
 Classification
 
 &lt;a class="anchor" href="#classification">#&lt;/a>
 
&lt;/h2>
&lt;p>Classification predicts a &lt;strong>discrete class label&lt;/strong>.&lt;br>
Common settings:&lt;/p></description></item></channel></rss>