February 15, 2026
Linear Neural Networks for Regression
A linear neural network for regression is a model that predicts a continuous target by taking a weighted sum of input features and applying the identity activation (so the output can be any real number).
- Single neuron for regression (predicting how much / how many)
- Data + linear model (single neuron, no hidden layers) + squared loss
- Training with the batch gradient descent algorithm
- Prediction (inference)
- E.g.: Auto MPG (UCI)-style prediction with a single neuron (from-scratch code)
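The pipeline above can be sketched from scratch in a few lines. This is a minimal sketch with synthetic data standing in for the Auto MPG dataset (the feature count, learning rate, and epoch count are illustrative assumptions): a single neuron `ŷ = Xw + b` with identity activation, squared loss, and batch gradient descent.

```python
import numpy as np

# Synthetic stand-in for Auto MPG-style data: 3 features, continuous target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w, true_b = np.array([2.0, -1.0, 0.5]), 4.0
y = X @ true_w + true_b + rng.normal(scale=0.1, size=200)

# Single neuron, no hidden layers: y_hat = X w + b (identity activation).
w = np.zeros(3)
b = 0.0
lr = 0.1

for epoch in range(500):
    y_hat = X @ w + b                  # forward pass
    err = y_hat - y
    loss = (err ** 2).mean()           # MSE (squared loss)
    grad_w = 2 * X.T @ err / len(y)    # dL/dw over the full batch
    grad_b = 2 * err.mean()            # dL/db
    w -= lr * grad_w                   # batch gradient descent step
    b -= lr * grad_b

# Inference: predict a number for a new x.
x_new = np.array([1.0, 0.0, -1.0])
y_pred = x_new @ w + b
```

Because the loss is quadratic in `(w, b)`, full-batch gradient descent with a small enough learning rate converges to parameters close to the true ones.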
```mermaid
flowchart LR
D["Data<br/>X, y"] --> M["Linear model<br/>w, b<br/>Single neuron"]
M --> A["Activation<br/>Identity"]
A --> L["Loss<br/>MSE (Squared error)"]
L --> O["Optimiser<br/>Batch GD / Mini-batch GD"]
O --> P["Parameters<br/>w, b"]
P --> I["Inference<br/>Predict ŷ (number) for new x"]
%% Pastel colour scheme
style D fill:#E3F2FD,stroke:#1E88E5,stroke-width:1px
style M fill:#E8F5E9,stroke:#43A047,stroke-width:1px
style A fill:#FFF3E0,stroke:#FB8C00,stroke-width:1px
style L fill:#FCE4EC,stroke:#D81B60,stroke-width:1px
style O fill:#F3E5F5,stroke:#8E24AA,stroke-width:1px
style P fill:#E0F7FA,stroke:#00838F,stroke-width:1px
style I fill:#F1F8E9,stroke:#558B2F,stroke-width:1px
```
Regression
Regression is a supervised learning task that predicts a continuous-valued output based on input features.
February 26, 2026
Gradient Descent Algorithm
The Gradient Descent Algorithm (GDA) is
- an optimisation method
- used to train models
- by repeatedly updating parameters (weights and biases) to reduce the loss
In deep learning, the default training approach is almost always mini-batch gradient descent, usually with Adam or SGD + momentum.
Gradient Descent is used in both regression and classification.
It is not tied to the task type; it applies whenever you have a differentiable loss and parameters (weights and biases) to update.
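The mini-batch variant mentioned above can be sketched generically. This is a minimal sketch, not a production optimiser: `minibatch_gd` and `mse_grad` are hypothetical names, and plain SGD is shown (no momentum or Adam) with illustrative hyperparameters.

```python
import numpy as np

def minibatch_gd(X, y, loss_grad, lr=0.05, batch_size=32, epochs=50, seed=0):
    """Mini-batch gradient descent: shuffle each epoch, update per batch.

    loss_grad(w, Xb, yb) returns the gradient of the loss on batch (Xb, yb).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        idx = rng.permutation(n)                 # shuffle each epoch
        for start in range(0, n, batch_size):
            batch = idx[start:start + batch_size]
            w -= lr * loss_grad(w, X[batch], y[batch])  # one update step
    return w

def mse_grad(w, Xb, yb):
    # Gradient of mean squared error for a linear model on one mini-batch.
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

# Works for regression here; the same loop trains a classifier if you
# swap in the gradient of cross-entropy instead.
rng = np.random.default_rng(1)
X = rng.normal(size=(256, 2))
y = X @ np.array([1.5, -2.0])
w = minibatch_gd(X, y, mse_grad)
```

The loop only ever touches the data through `loss_grad`, which is why the same algorithm trains both regressors and classifiers.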
February 15, 2026
Linear NN for Classification
A Linear Neural Network (LNN) for classification uses no hidden layers.
It learns a linear decision boundary and outputs class probabilities, then converts them into predicted classes.
Neural-network view:
- Binary classification → logistic regression (single neuron + sigmoid)
- Multi-class classification → softmax regression (K output neurons + softmax)
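The multi-class case above can be sketched as a forward pass. This is a minimal sketch with made-up data (4 samples, 3 features, K = 3 classes); `softmax` and `cross_entropy` are small helper functions written here, not a specific library's API.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y):
    # Mean negative log-probability of the true class (y: integer labels).
    return -np.log(probs[np.arange(len(y)), y]).mean()

# Softmax regression: linear layer (no hidden layers) + softmax activation.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
y = np.array([0, 2, 1, 0])
W = np.zeros((3, 3))                       # 3 features -> K = 3 output neurons
b = np.zeros(3)

probs = softmax(X @ W + b)                 # class probabilities, rows sum to 1
loss = cross_entropy(probs, y)             # with zero weights, probs are uniform,
                                           # so loss = log(3) ≈ 1.0986
pred = probs.argmax(axis=1)                # inference: probabilities -> class
```

Binary classification is the K = 2 special case; a single neuron with a sigmoid gives the same model as logistic regression.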
```mermaid
flowchart LR
D["Data<br/>X, y"] --> M["Linear model<br/>w, b"]
M --> A["Activation<br/>Sigmoid / Softmax"]
A --> L["Loss<br/>Cross-entropy"]
L --> O["Optimiser<br/>Mini-batch GD / Adam"]
O --> P["Updated parameters<br/>w, b"]
P --> I["Inference<br/>Probabilities → class"]
%% Pastel colour scheme
style D fill:#E3F2FD,stroke:#1E88E5,stroke-width:1px
style M fill:#E8F5E9,stroke:#43A047,stroke-width:1px
style A fill:#FFF3E0,stroke:#FB8C00,stroke-width:1px
style L fill:#FCE4EC,stroke:#D81B60,stroke-width:1px
style O fill:#F3E5F5,stroke:#8E24AA,stroke-width:1px
style P fill:#E0F7FA,stroke:#00838F,stroke-width:1px
style I fill:#F1F8E9,stroke:#558B2F,stroke-width:1px
```
Classification
Classification predicts a discrete class label.
Common settings:
- Binary classification (two classes)
- Multi-class classification (K classes)