AI

Deep Feedforward Neural Networks (DFNN) for Classification

February 26, 2026

Deep Learning, DFNN, MLP, Classification

Deep Feedforward Neural Networks (DFNN) or Multi-Layer Perceptrons (MLP) for Classification #

A Deep Feedforward Neural Network (DFNN), also called a Multi-Layer Perceptron (MLP), is a neural network with one or more hidden layers where information flows forward only (no recurrence).
For classification, DFNNs learn non-linear decision boundaries by combining hidden layers with non-linear activation functions.

Core idea:
A single neuron can only learn linear boundaries.
Adding hidden layers + non-linearity allows DFNNs to solve problems like XOR.

MLP as solution for XOR #

A single perceptron fails on XOR because XOR is not linearly separable.
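One way to see why a hidden layer helps is to wire up a tiny two-layer network by hand. The sketch below is illustrative only: the step activation, the hand-picked weights, and the function name `xor_mlp` are assumptions chosen for clarity, not learned values.

```python
def step(z):
    """Heaviside step activation: 1 if z > 0, else 0."""
    return 1 if z > 0 else 0


def xor_mlp(x1, x2):
    """Two hidden units plus one output unit, with hand-picked weights."""
    # Hidden unit 1 behaves like OR: fires when at least one input is 1.
    h_or = step(x1 + x2 - 0.5)
    # Hidden unit 2 behaves like AND: fires only when both inputs are 1.
    h_and = step(x1 + x2 - 1.5)
    # Output fires when OR is on and AND is off, which is exactly XOR.
    return step(h_or - h_and - 0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_mlp(a, b))
```

Each hidden unit draws one straight line in the input plane; the output unit combines the two half-planes into a region that no single line could carve out.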

Decision Tree

AI, ML

Decision Tree #

A decision tree classifies an example by asking a sequence of questions about its attributes until it reaches a leaf (final decision).

Key takeaway: A decision tree grows by repeatedly splitting the training data into purer subsets using an impurity measure (Entropy / Gini / Classification Error).

Information Theory #

Decision trees need a way to measure: “How mixed are the class labels at a node?”
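The three impurity measures named above can be sketched in a few lines. This is a minimal illustration under the usual textbook definitions; the helper names (`class_proportions`, `entropy`, `gini`, `classification_error`) and the toy labels are assumptions made for the example.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of examples belonging to each class present at the node."""
    n = len(labels)
    return [count / n for count in Counter(labels).values()]


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity: 1 - sum(p^2)."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


def classification_error(labels):
    """Misclassification rate when predicting the majority class."""
    return 1.0 - max(class_proportions(labels))


mixed = ["yes", "yes", "no", "no"]   # maximally mixed two-class node
print(entropy(mixed), gini(mixed), classification_error(mixed))  # 1.0 0.5 0.5
```

All three measures are zero for a pure node and largest when the classes are evenly mixed, which is why a split that lowers them produces purer subsets.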

Prediction & Forecasting

AI, Statistics

Prediction & Forecasting #

Correlation #

Regression #

Time Series Analysis #

Introduction, Components of time series data #

MA model – basic and weighted MA model #

Time series models #

AR Model
ARIMA Model
SARIMA, SARIMAX, VAR, VARMAX
Simple exponential smoothing model

Reference #

Prediction & Forecasting


Convolutional Neural Networks

April 19, 2026

AI, Deep-Learning

Deep Learning, CNN, Convolutional Neural Networks, Computer Vision, Feature Maps, Receptive Field, Padding, Pooling

Convolutional Neural Networks (CNN) #

Convolutional Neural Networks (CNNs) are specialised neural networks designed for data with spatial structure, especially images. They became the standard model for computer vision because they preserve spatial locality, reuse the same pattern detector across the image, and build representations hierarchically. In practical terms, a CNN starts by learning simple features such as edges and corners, then combines them into textures, shapes, object parts, and finally full semantic categories.

Statistics

March 12, 2026

AI, ML

Statistics, Probability, Data Analysis

Statistics #

Statistical methods help you turn raw data into reliable conclusions, while understanding uncertainty, variability, and confidence.

Statistics provides the language and tools for reasoning about data, uncertainty, and inference.

Machine learning relies on statistics to understand data behaviour, draw conclusions, and validate models.

Collect Data
Present & Organise Data (in a systematic manner)
Analyse Data
Draw Inferences from the Data
Make Decisions from the Data

Statistics Topic	What you learn (plain English)	ML Connection
1. Basic Probability & Statistics	Summarise data; understand spread; basic probability rules	Data understanding (EDA), feature sanity checks, detecting outliers, interpreting “average behaviour”
2. Conditional Probability & Bayes	Update probability using new information; Bayes’ rule	Naïve Bayes, Bayesian thinking, posterior probabilities, probabilistic classification
3. Probability Distributions	Model randomness with distributions; expectation/variance/covariance	Likelihood models, noise assumptions (Gaussian), sampling, probabilistic modelling foundations
4. Hypothesis Testing	Sampling, CLT, confidence intervals, significance tests, ANOVA, MLE	A/B testing, evaluating model improvements, significance vs noise, parameter estimation (MLE)
5. Prediction & Forecasting	Correlation, regression, time series (AR/MA/ARIMA/SARIMA etc.)	Linear regression, forecasting, sequential data modelling, baseline predictive modelling
6. GMM & EM	Mixtures of Gaussians; iterative estimation with EM	Unsupervised learning (soft clustering), density estimation, latent-variable models

flowchart TD
  A["Statistical Methods<br/>AIML ZC418"] --> B["1. Basic Probability and Statistics"]
  A --> C["2. Conditional Probability and Bayes"]
  A --> D["3. Probability Distributions"]
  A --> E["4. Hypothesis Testing"]
  A --> F["5. Prediction and Forecasting"]
  A --> G["6. Gaussian Mixture Model and EM"]

  B --> B1["Central Tendency<br/>Mean - Median - Mode"]
  B --> B2["Variability<br/>Range - Variance - SD - Quartiles"]
  B --> B3["Basic Probability Concepts"]
  B3 --> B31["Axioms of Probability"]
  B3 --> B32["Definition of Probability"]
  B3 --> B33["Mutually Exclusive vs Independent"]

  C --> C1["Conditional Probability"]
  C --> C2["Independence (conditional)"]
  C --> C3["Bayes Theorem"]
  C --> C4["Naive Bayes (intro)"]

  D --> D1["Random Variables<br/>Discrete and Continuous"]
  D --> D2["Expectation - Variance - Covariance"]
  D --> D3["Transformations of RVs"]
  D --> D4["Key Distributions"]
  D4 --> D41["Bernoulli"]
  D4 --> D42["Binomial"]
  D4 --> D43["Poisson"]
  D4 --> D44["Normal (Gaussian)"]
  D4 --> D45["t - Chi-square - F (intro)"]

  E --> E1["Sampling<br/>Random and Stratified"]
  E --> E2["Sampling Distributions<br/>CLT"]
  E --> E3["Estimation<br/>Confidence Intervals"]
  E --> E4["Hypothesis Tests<br/>Means and Proportions"]
  E --> E5["ANOVA<br/>Single and Dual factor"]
  E --> E6["Maximum Likelihood"]

  F --> F1["Correlation"]
  F --> F2["Regression"]
  F --> F3["Time Series Basics<br/>Components"]
  F --> F4["Moving Averages<br/>Simple and Weighted"]
  F --> F5["Time Series Models"]
  F5 --> F51["AR"]
  F5 --> F52["ARMA / ARIMA"]
  F5 --> F53["SARIMA / SARIMAX"]
  F5 --> F54["VAR / VARMAX"]
  F --> F6["Exponential Smoothing"]

  G --> G1["GMM<br/>Mixture of Gaussians"]
  G --> G2["EM Algorithm<br/>E-step - M-step"]

  B -.-> C
  C -.-> D
  D -.-> E
  E -.-> F
  F -.-> G

Data - Types #

flowchart TD
    A[(Data)] --> B["Categorical (Qualitative)"]
    A --> C["Numerical (Quantitative)"]

    B --> B1[Nominal]
    B --> B2[Ordinal]

    C --> C1[Discrete]
    C --> C2[Continuous]

    C2 --> C21[Interval]
    C2 --> C22[Ratio]

    %% Styling
    style A fill:#E1F5FE,stroke:#333
    style B fill:#90CAF9,stroke:#333
    style B1 fill:#90CAF9,stroke:#333
    style B2 fill:#90CAF9,stroke:#333
    style C fill:#FFF9C4,stroke:#333
    style C1 fill:#FFF9C4,stroke:#333
    style C2 fill:#FFF9C4,stroke:#333
    style C21 fill:#FFF9C4,stroke:#333
    style C22 fill:#FFF9C4,stroke:#333

Categorical (Qualitative) #
Categorical data express a qualitative attribute, e.g. hair color or eye color.

Gaussian Mixture model & Expectation Maximization

AI, Statistics

Gaussian Mixture model & Expectation Maximization #

Reference #

Gaussian Mixture model

Expectation Maximization


Instance-based Learning

AI, ML

Instance-based Learning #

Instance-based learning is a family of methods that do not build one explicit global model during training. Instead, they store training examples and delay most of the work until a new query arrives.

When a new point must be classified or predicted, the algorithm compares it with previously seen examples, finds the most relevant neighbours, and uses them to produce the answer.
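The store-then-compare idea described above can be sketched as a small k-nearest-neighbour classifier. This is a minimal illustration, assuming Euclidean distance and majority voting; the function name `knn_predict`, the value of `k`, and the toy points are all hypothetical.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest stored examples.

    `train` is a list of (features, label) pairs; nothing is fitted in advance.
    """
    # All the work happens at query time: rank stored examples by distance.
    neighbours = sorted(train, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# "Training" is just storing labelled examples.
train = [
    ((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
    ((6, 6), "B"), ((7, 6), "B"), ((6, 7), "B"),
]

print(knn_predict(train, (2, 2)))  # A
print(knn_predict(train, (6, 5)))  # B
```

Note that the cost of a prediction grows with the number of stored examples, which is the usual trade-off for skipping an explicit training phase.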

Instance-based Learning covers three linked ideas:

Deep CNN Architectures

April 19, 2026

AI, Deep-Learning

Deep Learning, CNN, Deep CNN Architectures, LeNet, AlexNet, VGG, NiN, GoogLeNet, ResNet, Transfer Learning

Deep CNN Architectures #

Once the basic ideas of convolution, pooling, channels, and classifier heads are understood, the next step is to study how successful CNN architectures are designed in practice. The history of deep CNNs is not just a list of famous models. It is a progression of design ideas: smaller filters, more depth, better optimisation, bottlenecks, multi-scale processing, residual connections, and transfer learning.

Key takeaway:
Deep CNN architectures evolved by solving specific problems one by one: LeNet established the template, AlexNet proved deep learning could dominate large-scale vision, VGG simplified the design, NiN introduced powerful 1 × 1 ideas, GoogLeNet made multi-scale processing efficient, and ResNet solved the optimisation problem of very deep networks.

CNN Pipeline

April 22, 2026

AI, Deep-Learning

Deep Learning, CNN, Keras

CNN Pipeline: Preprocessing & Models #

Understand CNN concepts deeply
Build CNN models step-by-step
Apply CNNs in assignments using Keras

Think of a CNN as a pipeline: Image → Features → Patterns → Prediction

1. Image Representation #

\[ X \in \mathbb{R}^{H \times W \times C} \]

H = Height
W = Width
C = Channels

2. Convolution Operation #

\[ Z(i,j) = \sum_{m,n} X(i+m, j+n) \cdot K(m,n) \]

Sliding filter extracts features
Produces feature maps
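The sliding-filter sum above can be written out directly with nested loops. The sketch below assumes a single-channel image, no padding, and stride 1; the function name `conv2d_valid` and the toy image and kernel are illustrative choices, and real frameworks use far faster implementations.

```python
def conv2d_valid(image, kernel):
    """Compute Z(i, j) = sum over (m, n) of X(i+m, j+n) * K(m, n), no padding, stride 1."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    feature_map = []
    for i in range(h - kh + 1):          # every vertical position where the filter fits
        row = []
        for j in range(w - kw + 1):      # every horizontal position where the filter fits
            total = 0
            for m in range(kh):
                for n in range(kw):
                    total += image[i + m][j + n] * kernel[m][n]
            row.append(total)
        feature_map.append(row)
    return feature_map


# A dark-to-bright vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
kernel = [[-1, 1]]  # 1 x 2 horizontal difference filter

print(conv2d_valid(image, kernel))  # [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
```

The feature map is non-zero only where the pixel values change, which is the sense in which a filter "extracts" a feature such as an edge.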

3. Stride & Padding #

\[ \text{Output} = \left\lfloor \frac{N - F + 2P}{S} \right\rfloor + 1 \]

N = Input size
F = Filter size
P = Padding
S = Stride
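A quick way to check the output-size formula is to evaluate it for a few layer settings. The helper name `conv_output_size` is hypothetical, and integer division is used on the assumption that positions where the filter does not fully fit are dropped.

```python
def conv_output_size(n, f, p=0, s=1):
    """Output length along one dimension for input n, filter f, padding p, stride s."""
    return (n - f + 2 * p) // s + 1


# A 64-pixel side passed through two conv (3x3) + max-pool (2x2, stride 2) stages.
size = 64
size = conv_output_size(size, f=3)        # conv 3x3, no padding
print(size)                               # 62
size = conv_output_size(size, f=2, s=2)   # pool 2x2, stride 2
print(size)                               # 31
size = conv_output_size(size, f=3)        # conv 3x3, no padding
print(size)                               # 29
size = conv_output_size(size, f=2, s=2)   # pool 2x2, stride 2
print(size)                               # 14
```

With padding of 1 and a 3 × 3 filter at stride 1, the formula returns the input size unchanged, which is the usual "same" padding setting.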

4. Activation (ReLU) #

\[ \text{ReLU}(x) = \max(0, x) \]

5. Pooling #

Max Pooling → keeps the strongest activation in each window
Average Pooling → smooths by averaging each window
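The difference between the two pooling types is easiest to see on a small feature map. The sketch below assumes non-overlapping square windows (stride equal to the window size); the function name `pool2d` and the sample values are illustrative.

```python
def pool2d(fmap, size=2, mode="max"):
    """Pool a 2-D feature map with non-overlapping size x size windows."""
    pooled = []
    for i in range(0, len(fmap) - size + 1, size):
        row = []
        for j in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[i + m][j + n] for m in range(size) for n in range(size)]
            if mode == "max":
                row.append(max(window))                 # strongest activation
            else:
                row.append(sum(window) / len(window))   # mean activation
        pooled.append(row)
    return pooled


fmap = [
    [1, 3, 2, 0],
    [5, 7, 1, 1],
    [0, 2, 4, 8],
    [2, 0, 6, 2],
]

print(pool2d(fmap, mode="max"))  # [[7, 2], [2, 8]]
print(pool2d(fmap, mode="avg"))  # [[4.0, 1.0], [1.0, 5.0]]
```

Both variants halve each spatial dimension here; they differ only in how each window is summarised.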

6. Global Average Pooling #

\[ y_k = \frac{1}{HW} \sum_{i,j} x_{i,j,k} \]
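The formula above reduces each channel to a single number. A minimal sketch, assuming the input is a nested list laid out as H × W × C; the function name `global_average_pool` and the sample tensor are made up for the example.

```python
def global_average_pool(x):
    """Average each channel k over all H x W positions; returns C values."""
    h, w, c = len(x), len(x[0]), len(x[0][0])
    return [
        sum(x[i][j][k] for i in range(h) for j in range(w)) / (h * w)
        for k in range(c)
    ]


# A 2 x 2 feature map with 2 channels.
x = [
    [[1, 10], [2, 20]],
    [[3, 30], [4, 40]],
]

print(global_average_pool(x))  # [2.5, 25.0]
```

The output length depends only on the number of channels, not on H or W, which is why this layer can replace Flatten in front of the classifier.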

7. Loss Function #

\[ L = - \sum_{k} y_k \log(\hat{y}_k) \]

y_k = True label for class k (one-hot)
\hat{y}_k = Predicted probability for class k
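The loss can be evaluated by hand for a single example. This sketch assumes a one-hot target and a predicted probability vector; the function name `cross_entropy` and the small `eps` guard against `log(0)` are additions for the example.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """L = -sum(y * log(y_hat)) for one example; eps avoids log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


y_true = [0, 1, 0]                 # the true class is index 1
confident = [0.1, 0.8, 0.1]        # most probability on the true class
unsure = [0.4, 0.2, 0.4]           # little probability on the true class

print(cross_entropy(y_true, confident))  # equals -log(0.8)
print(cross_entropy(y_true, unsure))     # equals -log(0.2), a larger loss
```

Only the probability assigned to the true class contributes, so the loss shrinks towards zero as that probability approaches 1.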

8. CNN Architecture #

graph LR
A[Input Image] --> B[Conv]
B --> C[ReLU]
C --> D[Pooling]
D --> E[Conv Layers]
E --> F[Flatten / GAP]
F --> G[Dense]
G --> H[Output]

9. Training #

Forward pass
Loss computation
Backpropagation
Weight update
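The four steps above can be traced in a toy training loop. To keep it short, this sketch trains a single sigmoid neuron on the logical OR function rather than a CNN; the dataset, learning rate, and epoch count are assumptions, and the gradient `p - t` is the standard result for a sigmoid output with binary cross-entropy.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Toy dataset: logical OR (linearly separable, so one neuron suffices).
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [0, 1, 1, 1]

w, b, lr = [0.0, 0.0], 0.0, 1.0
losses = []

for epoch in range(500):
    grad_w, grad_b, loss = [0.0, 0.0], 0.0, 0.0
    for (x1, x2), t in zip(X, y):
        p = sigmoid(w[0] * x1 + w[1] * x2 + b)                    # 1. forward pass
        loss += -(t * math.log(p) + (1 - t) * math.log(1 - p))    # 2. loss computation
        err = p - t                                               # 3. backpropagation (dL/dz)
        grad_w[0] += err * x1
        grad_w[1] += err * x2
        grad_b += err
    n = len(X)
    w = [wi - lr * g / n for wi, g in zip(w, grad_w)]             # 4. weight update
    b -= lr * grad_b / n
    losses.append(loss / n)

print(losses[0], losses[-1])  # the mean loss falls over training
```

In Keras the same loop is hidden inside `model.fit`, with the gradients computed automatically for every layer.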

10. Keras Implementation #

Model #

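from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dense, Flatten

model = Sequential()

# Block 1: 32 filters of size 3x3 on a 64x64 RGB input, then 2x2 max pooling
model.add(Conv2D(32, (3,3), activation='relu', input_shape=(64,64,3)))
model.add(MaxPooling2D((2,2)))

# Block 2: 64 filters of size 3x3, then 2x2 max pooling
model.add(Conv2D(64, (3,3), activation='relu'))
model.add(MaxPooling2D((2,2)))

# Flatten the feature maps into a vector for the dense layers
model.add(Flatten())

# Classifier head: a single sigmoid unit for binary classification
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))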

Compile #

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

Train #

model.fit(X_train, y_train, epochs=10, batch_size=32)

Predict #

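pred = model.predict(X_test)  # sigmoid probabilities in [0, 1]
labels = (pred > 0.5).astype(int)  # threshold at 0.5 to get 0/1 class labels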

11. Tips #

Normalize images (e.g. scale pixel values to [0, 1])
Use small filters (e.g. 3 × 3)
Avoid too many dense layers (they add many parameters and overfit easily)

12. Summary #

CNN = Automatic feature extractor + classifier

Recurrent Neural Networks

April 19, 2026

AI, Deep-Learning

Deep Learning, RNN, Recurrent Neural Networks, Sequence Modelling, BPTT, Encoder Decoder, Teacher Forcing, Time Series

Recurrent Neural Networks #

Recurrent Neural Networks (RNNs) are neural networks designed for sequential data, where the order of inputs matters and the model must use information from earlier time steps to interpret later ones. Unlike a feedforward network, an RNN does not process each input in isolation. It carries a hidden state from one time step to the next, so the network can build a running summary of what it has seen so far.
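The hidden-state idea can be sketched with a scalar recurrence. This is a minimal illustration, assuming the common form h_t = tanh(w_x · x_t + w_h · h_{t-1} + b); the function name `rnn_forward` and the weight values are arbitrary choices, not trained parameters.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0, h0=0.0):
    """Run a one-unit RNN over a sequence and return the hidden state at each step."""
    h = h0
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


early = rnn_forward([1, 0, 0])   # the signal arrives first, then fades
late = rnn_forward([0, 0, 1])    # the signal arrives last

print(early)
print(late)
```

The two sequences contain the same values in a different order, yet they end in different hidden states, which is the sense in which the network keeps a running, order-sensitive summary.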
