Machine Learning

Dimensionality reduction and PCA #

PCA and SVM connect linear algebra, geometry, and optimisation.

Dimensionality reduction means representing high-dimensional data using fewer dimensions while trying to preserve the important structure of the data.

Principal Component Analysis, or PCA, is a linear dimensionality reduction method. It finds the orthogonal directions along which the data varies the most, and projects the centred data onto those directions.

Key takeaway: PCA chooses the eigenvectors of the data covariance matrix that correspond to the largest eigenvalues. These eigenvectors span the principal subspace. Each eigenvalue equals the variance of the data along its eigenvector, so the directions with the largest eigenvalues preserve the most variance.
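The key takeaway above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name pca, the toy data, and the choice of k = 1 are assumptions made for the example.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal components."""
    # Centre the data so the covariance is measured around the mean.
    X_centered = X - X.mean(axis=0)
    # Sample covariance matrix of the features (columns).
    cov = np.cov(X_centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Indices of the k largest eigenvalues, largest first.
    order = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, order]
    return X_centered @ components, components, eigvals[order]

# Hypothetical toy data: 200 points stretched along the direction (1, 1).
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, t + 0.1 * rng.normal(size=200)])

Z, components, variances = pca(X, k=1)
print(components.ravel())  # first principal direction, up to sign
print(variances)           # variance of the data along that direction
```

The variance of the projected data Z matches the selected eigenvalue, which is the sense in which the top eigenvectors preserve the most variance.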

AI Learning Resources #

A curated list of high-quality online courses to learn Artificial Intelligence, Machine Learning, and Deep Learning from reputable universities and organisations.



Deep Neural Networks (DNN) #

  • Deep Learning. MIT Press.
    Goodfellow, I., Bengio, Y., & Courville, A. (2016).

  • Introduction to Deep Learning. MIT Press.
    Charniak, E. (2019).

  • Deep Learning with Python (2nd ed.). Manning Publications.
    Chollet, F. (2021).

Machine Learning Pipeline: Preprocessing & Models #

This page covers the core concepts of data preprocessing and model development.

A complete ML pipeline includes preprocessing, feature engineering, feature selection, and model training.
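The four stages listed above can be chained as plain functions. The sketch below uses only NumPy, and every concrete choice in it is an assumption made for illustration: mean imputation and standardisation for preprocessing, a product feature for feature engineering, a correlation filter for feature selection, and least squares for model training. The synthetic data is likewise hypothetical.

```python
import numpy as np

def preprocess(X):
    """Fill missing values with column means, then standardise each column."""
    X = np.where(np.isnan(X), np.nanmean(X, axis=0), X)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def engineer(X):
    """Append the product of the first two columns as an extra feature."""
    return np.column_stack([X, X[:, 0] * X[:, 1]])

def select(X, y, k):
    """Keep the k columns with the largest absolute correlation with y."""
    scores = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    keep = np.sort(np.argsort(scores)[::-1][:k])
    return X[:, keep], keep

def train(X, y):
    """Fit a linear model with an intercept by least squares."""
    A = np.column_stack([np.ones(len(X)), X])
    weights, *_ = np.linalg.lstsq(A, y, rcond=None)
    return weights

# Hypothetical synthetic data: the target depends on the first two features only.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(500, 3))
y = 2.0 * X_raw[:, 0] - 1.0 * X_raw[:, 1] + 0.1 * rng.normal(size=500)
X_raw[::10, 2] = np.nan  # simulate missing readings in the third feature

X_sel, kept = select(engineer(preprocess(X_raw)), y, k=2)
weights = train(X_sel, y)
print(kept)     # indices of the selected features
print(weights)  # intercept followed by one weight per selected feature
```

Keeping each stage as a separate function makes the order of operations explicit and lets any single stage be swapped out without touching the others.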


1. Data Preprocessing Overview #

Raw data is often:

  • Noisy
  • Incomplete
  • Inconsistent

Preprocessing ensures data is suitable for machine learning.
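A quick audit can reveal all three problems before any cleaning is done. The sketch below uses only the Python standard library. The records, the label normalisation rule, and the outlier threshold are hypothetical and chosen purely for illustration.

```python
import statistics

# Hypothetical raw records: (city, temperature in degrees Celsius).
records = [
    ("London", 14.2),
    ("london ", 13.9),  # inconsistent: same city, different spelling
    ("Paris", None),    # incomplete: missing reading
    ("Paris", 15.1),
    ("London", 140.0),  # noisy: implausible value
    ("Paris", 14.8),
]

# Incomplete: count records with no reading.
missing = sum(1 for _, temp in records if temp is None)

# Inconsistent: compare raw labels with a normalised version.
labels = {city for city, _ in records}
normalised = {city.strip().title() for city in labels}

# Noisy: flag readings far from the median (threshold is an assumption).
temps = [temp for _, temp in records if temp is not None]
median = statistics.median(temps)
outliers = [temp for temp in temps if abs(temp - median) > 20]

print(missing, len(labels), len(normalised), outliers)
```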


2. Missing Values #

Why they occur

  • Sensor errors
  • Data collection issues

Methods