Artificial Neuron and Perceptron #
Knowledge in neural networks is stored in connection weights, and learning means modifying those weights.
Biological Neuron #
A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.
Core components:
- Dendrites: receive signals from other neurons
- Cell body (soma): processes incoming signals
- Axon: transmits the output signal
- Synapses: connection points between neurons
Biological intuition:
- many inputs arrive at one neuron
- one neuron can connect out to many neurons
- massive parallelism enables fast perception and recognition
Artificial Neuron #
An artificial neuron is a simplified computational model inspired by biological neurons.
What it does:
- receives multiple input features
- assigns an importance (weight) to each feature
- computes a weighted sum + bias
- applies an activation function
- produces an output
In symbols:
\[ y = f\left(\sum_{i=1}^{n} w_i x_i + b\right) \]
Where:
- inputs: \( x_1, x_2, \dots, x_n \)
- weights: \( w_1, w_2, \dots, w_n \)
- bias: \( b \)
- activation: \( f(\cdot) \)
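As a minimal sketch of this computation (the inputs, weights, bias, and sigmoid activation below are illustrative assumptions, not values from this module):

```python
import math

def neuron(x, w, b, activation):
    """Single artificial neuron: weighted sum + bias, then activation."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b   # net input
    return activation(z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative (not learned) parameters for a 3-input neuron
x = [0.5, 1.0, -1.0]      # input features
w = [0.8, -0.2, 0.3]      # weights (importance of each feature)
b = 0.1                   # bias

print(neuron(x, w, b, sigmoid))   # a value in (0, 1)
```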
Connectionism Model #
Connectionism is the view that intelligence emerges from:
- many simple processing units (neurons)
- connected together in a network
- where learning changes the strength of connections (weights)
Connectionist principle:
- intelligence emerges from simple units
- knowledge is stored in weights
- learning modifies weights
- computation can be parallel and distributed
Perceptron #
A Perceptron is the simplest form of an artificial neural network (ANN).
It is an algorithm for supervised learning used mainly for binary classification problems.
It forms the basic building block of modern neural networks and deep learning models.
Typical use-cases:
- learning simple decision boundaries (like AND/OR logic)
- understanding why deeper networks are needed for harder patterns
Think of a perceptron as:
- a single neuron
- doing a linear decision (a hyperplane boundary)
- trained using the Perceptron Learning Algorithm (PLA) when data is linearly separable
What a Perceptron Does #
A perceptron:
- takes multiple real-valued inputs
- assigns a weight to each input
- computes a weighted sum
- adds a bias
- applies an activation function
- produces a binary output
In simple terms, it decides between two classes.
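For example, with assumed weights \( w = (0.6, 0.4) \), bias \( b = -0.5 \), and input \( x = (1, 0) \) (illustrative numbers only), a step-activated perceptron decides:
\[ z = 0.6 \cdot 1 + 0.4 \cdot 0 - 0.5 = 0.1 \ge 0 \;\Rightarrow\; \hat{y} = 1 \]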
Properties #
- Linear separability is the key condition
A perceptron can correctly classify a dataset only if the two classes can be separated by a hyperplane (i.e., the data is linearly separable).
The perceptron is powerful for linear decision problems (many logic gates), but it fails on patterns like XOR which are not linearly separable.
- Logic gates: what it can and cannot represent
- Can represent AND, OR, NAND, NOR, NOT using suitable weights and bias.
- Cannot represent XOR with a single perceptron.
Mathematical Model #
\[ \hat{y} = f\left(\sum_{i=1}^{n} w_i x_i + b\right) \]
Where:
- \( x_i \) → input features
- \( w_i \) → weights
- \( b \) → bias
- \( f(\cdot) \) → activation function
Decision boundary #
The boundary is where the neuron is exactly “on the fence”:
\[ w^T x + b = 0 \]
In 2D, this is a line; in 3D, a plane; in \( d \) dimensions, a hyperplane.
Perceptron as a geometry tool #
Training the perceptron is essentially searching for a separating line/plane that places one class on one side and the other class on the other side.
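A small sketch of this geometric view (the boundary parameters \( w = (1, 1) \), \( b = -1.5 \) are an assumed, AND-like example, not values from the module):

```python
# Which side of an assumed boundary w·x + b = 0 does each point fall on?
w = [1.0, 1.0]
b = -1.5

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
for x in points:
    score = w[0] * x[0] + w[1] * x[1] + b
    side = "positive side (class 1)" if score >= 0 else "negative side (class 0)"
    print(f"{x}: w·x + b = {score:+.1f} -> {side}")
```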
Perceptron Architecture #
flowchart LR
X1[x₁]
X2[x₂]
X3[x₃]
SUM((Σ))
ACT[Activation Function]
Y((ŷ))
X1 -- w₁ --> SUM
X2 -- w₂ --> SUM
X3 -- w₃ --> SUM
SUM -- BIAS(+b) --> ACT -- Output --> Y
Core Components #
1. Inputs (x_1, x_2, … , x_n) #
Inputs are the features or measurable attributes of a data point.
Example (OR gate):
\[ (x_1, x_2) \in \{0,1\}^2 \]
Inputs by themselves have no influence unless multiplied by weights.
2. Weights (w_1, w_2, … , w_n) #
- Weights determine how strongly each input influences the output
- Larger weights → higher importance
- Learned during training
- Act as importance scores for features
3. Bias (b) #
The bias is a constant added to the weighted sum.
- Shifts the decision boundary
- Allows classification even when all inputs are zero
- Prevents the boundary from being forced through the origin
Geometric intuition:
- Weights tilt the decision line
- Bias shifts the line
4. Net Input (Weighted Sum) #
\[ z = \sum_{i=1}^{n} w_i x_i + b \]
This value determines whether the perceptron activates.
5. Activation Function (Step Function) #
The classic perceptron uses a step function:
\[ \hat{y} = \begin{cases} 1 & \text{if } z \ge 0 \\ 0 & \text{otherwise} \end{cases} \]
- Output is binary
- Decision boundary is linear
- Suitable only for linearly separable data
A single perceptron cannot solve problems like XOR because they are not linearly separable.
Why Perceptron Is Limited #
- Uses a linear decision boundary
- Can only solve linearly separable problems
- Cannot model complex, non-linear relationships
This limitation led to multi-layer neural networks.
PLA for logic gates #
Goal: learn parameters (weights and bias) so that predicted output matches the target.
Training ingredients:
- Data: inputs \( x \) and targets \( t \)
- Model: perceptron
- Objective: reduce mismatch between \( t \) and \( \hat{y} \)
- Learning algorithm: Perceptron Learning Algorithm (PLA)
Update intuition:
- if prediction is wrong, move the boundary to correct it.
A common PLA rule (for targets \( t \in \{-1,+1\} \) ):
\[ \text{If } \hat{y} \ne t:\quad w \leftarrow w + \eta\, t\, x,\qquad b \leftarrow b + \eta\, t \]
Where:
- \( \eta > 0 \) is the learning rate.
PLA converges (finds a separating hyperplane) only if the dataset is linearly separable.
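A minimal sketch of this ±1-label update rule (the OR-gate data and learning rate below are illustrative choices, not part of the module):

```python
def sign(z):
    return 1 if z >= 0 else -1

def pla(X, t, eta=1.0, epochs=100):
    """Perceptron Learning Algorithm for targets in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, ti in zip(X, t):
            y_hat = sign(sum(wi * xi for wi, xi in zip(w, x)) + b)
            if y_hat != ti:                      # misclassified point
                w = [wi + eta * ti * xi for wi, xi in zip(w, x)]
                b += eta * ti
                mistakes += 1
        if mistakes == 0:                        # converged: every point correct
            break
    return w, b

# OR gate with {-1, +1} targets (linearly separable, so PLA converges)
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
t = [-1, 1, 1, 1]
print(pla(X, t))
```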
Perceptron for NOT, AND, OR gates #
We can model simple Boolean logic using a perceptron by choosing weights and bias.
- NOT gate. One input \( x \in \{0,1\} \); the output should flip the value.
Example parameters (one workable choice):
- a negative weight and a positive bias, e.g. \( w = -1,\ b = 0.5 \), so \( x=0 \Rightarrow z=0.5 \Rightarrow 1 \) and \( x=1 \Rightarrow z=-0.5 \Rightarrow 0 \)
- AND gate. Truth table:
- \( (0,0)\mapsto 0 \)
- \( (0,1)\mapsto 0 \)
- \( (1,0)\mapsto 0 \)
- \( (1,1)\mapsto 1 \)
Geometric meaning:
- only the point \( (1,1) \) should fall on the “positive” side of the decision boundary.
- OR gate. Truth table:
- \( (0,0)\mapsto 0 \)
- \( (0,1)\mapsto 1 \)
- \( (1,0)\mapsto 1 \)
- \( (1,1)\mapsto 1 \)
Geometric meaning:
- only \( (0,0) \) should fall on the “negative” side.
AND and OR are linearly separable, so a single perceptron can represent them.
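A quick check of such hand-picked parameters (these particular weights and biases are one workable choice among many, assumed here for illustration):

```python
def step(z):
    return 1 if z >= 0 else 0

# One workable parameter choice per gate (many others exist)
def not_gate(x):      return step(-1.0 * x + 0.5)    # w = -1,    b =  0.5
def and_gate(x1, x2): return step(x1 + x2 - 1.5)     # w = (1,1), b = -1.5
def or_gate(x1, x2):  return step(x1 + x2 - 0.5)     # w = (1,1), b = -0.5

print([not_gate(x) for x in (0, 1)])                            # [1, 0]
print([and_gate(a, b) for a, b in [(0,0),(0,1),(1,0),(1,1)]])   # [0, 0, 0, 1]
print([or_gate(a, b)  for a, b in [(0,0),(0,1),(1,0),(1,1)]])   # [0, 1, 1, 1]
```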
Perceptron for linearly separable data #
Definition (course view): two classes are linearly separable if there exists a hyperplane that separates all positive examples from all negative examples.
In 2D:
- there exists a straight line that separates the two classes.
In higher dimensions:
- there exists an \( (n-1) \) dimensional hyperplane separating classes in \( n \) dimensions.
XOR Problem #
The perceptron fails on data that is not linearly separable.
XOR truth table:
- \( (0,0)\mapsto 0 \)
- \( (0,1)\mapsto 1 \)
- \( (1,0)\mapsto 1 \)
- \( (1,1)\mapsto 0 \)
Why a single perceptron fails:
- there is no single straight line (hyperplane) that separates the 1s from the 0s in XOR.
XOR is the clean demonstration that linear models are limited.
To solve XOR, you need non-linearity, typically via hidden layers (an MLP).
Code: AND, OR, XOR from scratch #
Below is a minimal “from scratch” perceptron (no ML libraries) that:
- learns AND / OR (works)
- fails to learn XOR (does not converge to perfect accuracy)
# Perceptron from scratch for logic gates (AND, OR, XOR)
# Labels use {0,1}. We implement a simple PLA-style update.

def step(z):
    return 1 if z >= 0 else 0

def predict(x, w, b):
    # x is a list/tuple of features
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return step(z)

def train_perceptron(X, y, lr=0.1, epochs=50):
    # Initialise weights and bias
    w = [0.0 for _ in range(len(X[0]))]
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xi, ti in zip(X, y):
            yi = predict(xi, w, b)
            err = ti - yi
            if err != 0:
                # Update rule for {0,1} labels
                for j in range(len(w)):
                    w[j] += lr * err * xi[j]
                b += lr * err
                errors += 1
        # Early stop if perfect
        if errors == 0:
            break
    return w, b

def evaluate_gate(name, X, y, w, b):
    preds = [predict(xi, w, b) for xi in X]
    acc = sum(int(pi == ti) for pi, ti in zip(preds, y)) / len(y)
    print(f"{name}: w={w}, b={b:.3f}, preds={preds}, acc={acc:.2f}")

# Inputs for 2-input gates
X = [(0,0), (0,1), (1,0), (1,1)]

# AND gate
y_and = [0,0,0,1]
w_and, b_and = train_perceptron(X, y_and, lr=0.2, epochs=50)
evaluate_gate("AND", X, y_and, w_and, b_and)

# OR gate
y_or = [0,1,1,1]
w_or, b_or = train_perceptron(X, y_or, lr=0.2, epochs=50)
evaluate_gate("OR", X, y_or, w_or, b_or)

# XOR gate (not linearly separable)
y_xor = [0,1,1,0]
w_xor, b_xor = train_perceptron(X, y_xor, lr=0.2, epochs=200)
evaluate_gate("XOR", X, y_xor, w_xor, b_xor)

print("Note: XOR typically does not reach acc=1.00 with a single perceptron.")
From Perceptron to Neural Networks #
A neural network extends the perceptron by stacking many neurons across layers.
1. Input Layer #
- Receives raw feature vector
- No computation happens
2. Hidden Layers #
Hidden layers contain multiple perceptrons that learn intermediate representations.
Hidden layer computation:
\[ \mathbf{z}^{(1)} = W^{(1)}\mathbf{x} + \mathbf{b}^{(1)} \]
\[ \mathbf{a}^{(1)} = \sigma(\mathbf{z}^{(1)}) \]
Where:
- \( W^{(1)} \) → weight matrix
- \( \mathbf{b}^{(1)} \) → bias vector
- \( \sigma \) → non-linear activation (ReLU, Sigmoid, Tanh)
Hidden layers allow the network to learn complex patterns.
3. Output Layer #
\[ \mathbf{z}^{(2)} = W^{(2)}\mathbf{a}^{(1)} + \mathbf{b}^{(2)} \]
\[ \hat{y} = \sigma(\mathbf{z}^{(2)}) \]
Activation depends on the task:
- Sigmoid → binary classification
- Softmax → multi-class classification
- Linear → regression
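As a sketch of why hidden layers fix the XOR problem, the hand-wired two-layer network below uses step activations and assumed weights (hidden units computing OR and NAND, an output unit computing their AND); it is illustrative, not a trained model:

```python
def step(z):
    return 1 if z >= 0 else 0

def xor_mlp(x1, x2):
    """Hand-wired 2-layer network: hidden layer, then output layer."""
    h1 = step( x1 + x2 - 0.5)    # hidden unit 1: OR(x1, x2)
    h2 = step(-x1 - x2 + 1.5)    # hidden unit 2: NAND(x1, x2)
    return step(h1 + h2 - 1.5)   # output unit: AND(h1, h2) = XOR(x1, x2)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(f"XOR({a},{b}) = {xor_mlp(a, b)}")   # 0, 1, 1, 0
```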
Activation Function #
An activation function is any function applied to the neuron’s net input to produce the neuron’s output.
Step Function #
A step function (also called a threshold function) is a particular type of activation function used for binary decisions (hard thresholding).
In a perceptron, you first compute a score (net input) and then “step” to class 0 or 1 depending on whether the score crosses a threshold.
If the weighted evidence \( z \) is at or above the threshold (here, \( z \ge 0 \)), the output is 1.
Otherwise the output is 0.
Step Function makes the perceptron a hard classifier with a linear decision boundary.
But it’s not differentiable at the threshold, which is why modern neural nets usually use smooth activations (ReLU, sigmoid, tanh) when training with gradient descent.
A step function is thus an activation function, but a very specific kind. It differs from the smooth activations used in modern networks in a few respects:
Output type
- Step: outputs only 0 or 1 (or sometimes −1 or +1)
- Others (sigmoid, tanh, ReLU): output is continuous
Differentiability
- Step: not differentiable at the threshold and gradient is zero elsewhere
- Others: (mostly) differentiable, so they work well with gradient descent/backprop
Modern usage
- Step: classic perceptron, simple logic gates
- Smooth activations: training deep neural networks
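A small sketch contrasting the two (using the standard logistic sigmoid; the sample points below are arbitrary):

```python
import math

def step(z):
    return 1 if z >= 0 else 0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)   # smooth, non-zero gradient around z = 0

for z in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    print(f"z={z:+.1f}  step={step(z)}  sigmoid={sigmoid(z):.3f}  d(sigmoid)/dz={sigmoid_grad(z):.3f}")
```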
Summary #
- A perceptron models a straight decision line
- A perceptron is the simplest artificial neuron for binary classification
- It computes a weighted sum and applies a threshold to decide the class.
- It learns a linear decision boundary and works well only for linearly separable data.
- It can model many logic gates (AND/OR/etc.) but fails on XOR, motivating hidden layers.
- Neural networks combine many perceptrons
- Non-linear activations allow complex decision boundaries
- Deep learning is built by stacking perceptrons
Reference #
AIML Module 2 — ANN Perceptron.
Rosenblatt, F. (1958) — The Perceptron: A Probabilistic Model.
Goodfellow, Bengio, Courville — Deep Learning (Ch. 1–6).
Zhang et al. — Dive into Deep Learning (Intro/linear models).