Deep Feedforward Neural Networks (DFNN) or Multi-Layer Perceptrons (MLP) for Classification
A Deep Feedforward Neural Network (DFNN), also called a Multi-Layer Perceptron (MLP), is a neural network with one or more hidden layers where information flows forward only (no recurrence). For classification, DFNNs learn non-linear decision boundaries by combining hidden layers with non-linear activation functions.
Core idea:
A single neuron can only learn linear decision boundaries.
Adding hidden layers with non-linear activations lets a DFNN learn non-linearly separable problems such as XOR.
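To make the core idea concrete, here is a minimal sketch of a one-hidden-layer MLP trained with plain gradient descent until it fits XOR. It assumes only numpy; the layer width, learning rate, and iteration count are illustrative choices, not values from these notes.

```python
# Minimal DFNN/MLP sketch: one hidden layer with a tanh non-linearity,
# trained by gradient descent to fit XOR. Sizes and learning rate are
# illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)          # XOR labels

W1 = rng.normal(size=(2, 8)); b1 = np.zeros((1, 8))
W2 = rng.normal(size=(8, 1)); b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(5000):
    # forward pass: hidden layer (tanh), then output layer (sigmoid)
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # backward pass for the binary cross-entropy loss
    dp = (p - y) / len(X)                  # gradient at the output pre-activation
    dW2 = h.T @ dp;  db2 = dp.sum(0, keepdims=True)
    dh = (dp @ W2.T) * (1 - h ** 2)        # back through tanh
    dW1 = X.T @ dh;  db1 = dh.sum(0, keepdims=True)

    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

print(np.round(p.ravel(), 2))  # should approach [0, 1, 1, 0]
```

A single neuron (no hidden layer) run on the same data would stall near 0.5 for every input, which is the XOR point made above.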
A decision tree classifies an example by asking a sequence of questions about its attributes until it reaches a leaf (final decision).
Key takeaway:
A decision tree grows by repeatedly splitting the training data into purer subsets, with each split chosen using an impurity measure (Entropy, Gini, or Classification Error).
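All three impurity measures can be computed directly from the class proportions at a node. A small sketch follows; the class counts are made up for illustration.

```python
# Impurity measures for a node, from its class counts.
import numpy as np

def impurities(counts):
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()                              # class proportions at the node
    p_nz = p[p > 0]
    entropy = -(p_nz * np.log2(p_nz)).sum()      # H = -sum_i p_i log2 p_i
    gini = 1.0 - (p ** 2).sum()                  # G = 1 - sum_i p_i^2
    class_error = 1.0 - p.max()                  # E = 1 - max_i p_i
    return entropy, gini, class_error

print(impurities([5, 5]))    # maximally impure 2-class node -> (1.0, 0.5, 0.5)
print(impurities([10, 0]))   # pure node -> (0.0, 0.0, 0.0)
```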
Convolutional Neural Networks (CNNs) are specialised neural networks designed for data with spatial structure, especially images. They became the standard model for computer vision because they preserve spatial locality, reuse the same pattern detector across the image, and build representations hierarchically. In practical terms, a CNN starts by learning simple features such as edges and corners, then combines them into textures, shapes, object parts, and finally full semantic categories.
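As a concrete illustration of spatial locality and weight reuse, the sketch below slides a single 3 × 3 edge filter over a toy image, so the same small set of weights detects the same pattern at every position. It assumes only numpy; the image and filter are toy examples, not taken from these notes.

```python
# Illustrative 2D convolution (strictly, cross-correlation, as most deep
# learning libraries implement it): one small filter is reused across the
# whole image.
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((6, 6))
image[:, 3:] = 1.0                        # a vertical dark-to-bright edge
vertical_edge = np.array([[-1., 0., 1.],
                          [-1., 0., 1.],
                          [-1., 0., 1.]])
print(conv2d(image, vertical_edge))       # strong response only near the edge
```

Stacking such filters, interleaving pooling, and feeding the result to a classifier head is what gives the edge-to-texture-to-object hierarchy described above.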
A/B testing, evaluating model improvements, significance vs noise, parameter estimation (MLE)
5. Prediction & Forecasting
Correlation, regression, time series (AR/MA/ARIMA/SARIMA etc.)
Linear regression, forecasting, sequential data modelling, baseline predictive modelling
6. GMM & EM
Mixtures of Gaussians; iterative estimation with EM
Unsupervised learning (soft clustering), density estimation, latent-variable models
flowchart TD
A["Statistical Methods<br/>AIML ZC418"] --> B["1. Basic Probability and Statistics"]
A --> C["2. Conditional Probability and Bayes"]
A --> D["3. Probability Distributions"]
A --> E["4. Hypothesis Testing"]
A --> F["5. Prediction and Forecasting"]
A --> G["6. Gaussian Mixture Model and EM"]
B --> B1["Central Tendency<br/>Mean - Median - Mode"]
B --> B2["Variability<br/>Range - Variance - SD - Quartiles"]
B --> B3["Basic Probability Concepts"]
B3 --> B31["Axioms of Probability"]
B3 --> B32["Definition of Probability"]
B3 --> B33["Mutually Exclusive vs Independent"]
C --> C1["Conditional Probability"]
C --> C2["Independence (conditional)"]
C --> C3["Bayes Theorem"]
C --> C4["Naive Bayes (intro)"]
D --> D1["Random Variables<br/>Discrete and Continuous"]
D --> D2["Expectation - Variance - Covariance"]
D --> D3["Transformations of RVs"]
D --> D4["Key Distributions"]
D4 --> D41["Bernoulli"]
D4 --> D42["Binomial"]
D4 --> D43["Poisson"]
D4 --> D44["Normal (Gaussian)"]
D4 --> D45["t - Chi-square - F (intro)"]
E --> E1["Sampling<br/>Random and Stratified"]
E --> E2["Sampling Distributions<br/>CLT"]
E --> E3["Estimation<br/>Confidence Intervals"]
E --> E4["Hypothesis Tests<br/>Means and Proportions"]
E --> E5["ANOVA<br/>Single and Dual factor"]
E --> E6["Maximum Likelihood"]
F --> F1["Correlation"]
F --> F2["Regression"]
F --> F3["Time Series Basics<br/>Components"]
F --> F4["Moving Averages<br/>Simple and Weighted"]
F --> F5["Time Series Models"]
F5 --> F51["AR"]
F5 --> F52["ARMA / ARIMA"]
F5 --> F53["SARIMA / SARIMAX"]
F5 --> F54["VAR / VARMAX"]
F --> F6["Exponential Smoothing"]
G --> G1["GMM<br/>Mixture of Gaussians"]
G --> G2["EM Algorithm<br/>E-step - M-step"]
B -.-> C
C -.-> D
D -.-> E
E -.-> F
F -.-> G
Instance-based learning is a family of methods that do not build one explicit global model during training. Instead, they store training examples and delay most of the work until a new query arrives.
When a new point must be classified or predicted, the algorithm compares it with previously seen examples, finds the most relevant neighbours, and uses them to produce the answer.
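A k-nearest-neighbours classifier is the simplest concrete example of this idea: training is just storing the data, and all real work happens at query time. The sketch below assumes numpy; the data points and the choice k = 3 are illustrative.

```python
# Minimal instance-based learner: k-nearest neighbours with majority voting.
import numpy as np
from collections import Counter

class KNNClassifier:
    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X = np.asarray(X, dtype=float)   # "training" = memorise examples
        self.y = np.asarray(y)
        return self

    def predict(self, x):
        dists = np.linalg.norm(self.X - np.asarray(x, dtype=float), axis=1)
        nearest = np.argsort(dists)[: self.k]                   # most relevant neighbours
        return Counter(self.y[nearest]).most_common(1)[0][0]    # majority vote

X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]
y = ["A", "A", "A", "B", "B", "B"]
print(KNNClassifier(k=3).fit(X, y).predict([2, 2]))   # -> "A"
```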
Instance-based Learning covers three linked ideas:
Once the basic ideas of convolution, pooling, channels, and classifier heads are understood, the next step is to study how successful CNN architectures are designed in practice. The history of deep CNNs is not just a list of famous models. It is a progression of design ideas: smaller filters, more depth, better optimisation, bottlenecks, multi-scale processing, residual connections, and transfer learning.
Key takeaway: Deep CNN architectures evolved by solving specific problems one by one: LeNet established the template, AlexNet proved deep learning could dominate large-scale vision, VGG simplified the design, NiN introduced the powerful idea of 1 × 1 convolutions, GoogLeNet made multi-scale processing efficient, and ResNet solved the optimisation problem of very deep networks.
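To illustrate the residual idea attributed to ResNet above, the sketch below uses plain numpy, with dense layers standing in for convolution plus batch normalisation. A block learns a residual F(x) and adds it back to its input, so a very deep stack can fall back to an identity mapping simply by driving F toward zero.

```python
# Sketch of a residual block: output = activation(x + F(x)).
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    f = relu(x @ W1) @ W2        # the residual branch F(x)
    return relu(x + f)           # skip connection carries x forward unchanged

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 16))
W1 = rng.normal(scale=0.1, size=(16, 16))
W2 = rng.normal(scale=0.1, size=(16, 16))
print(residual_block(x, W1, W2).shape)   # (1, 16): same shape, input preserved
```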
Recurrent Neural Networks (RNNs) are neural networks designed for sequential data, where the order of inputs matters and the model must use information from earlier time steps to interpret later ones. Unlike a feedforward network, an RNN does not process each input in isolation. It carries a hidden state from one time step to the next, so the network can build a running summary of what it has seen so far.
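A minimal sketch of that running summary follows. It assumes only numpy; the dimensions and random weights are illustrative, and it omits outputs and training entirely.

```python
# Minimal recurrent step: the hidden state h is carried from one time step
# to the next and summarises everything seen so far.
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4
W_xh = rng.normal(scale=0.5, size=(input_dim, hidden_dim))
W_hh = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b_h = np.zeros(hidden_dim)

def rnn_forward(inputs):
    h = np.zeros(hidden_dim)                      # initial hidden state
    for x_t in inputs:                            # process the sequence in order
        h = np.tanh(x_t @ W_xh + h @ W_hh + b_h)  # h_t depends on x_t and h_{t-1}
    return h                                      # summary of the whole sequence

sequence = rng.normal(size=(5, input_dim))        # 5 time steps
print(rnn_forward(sequence))
```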