Mathematical Foundations for Machine Learning

Machine Learning is built on mathematical principles that allow models to:

  • represent data
  • learn patterns
  • optimise performance

flowchart LR
    DATA[Data]
    MATH[Math Models]
    OPT[Optimisation]
    MODEL[Trained Model]

    DATA --> MATH
    MATH --> OPT
    OPT --> MODEL

Understanding how ML algorithms work internally requires a few core mathematical tools. Algebra deals with relationships between variables and quantities, while calculus focuses on change and optimisation.
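The data → math model → optimisation → trained model flow can be sketched with a tiny example. This is a minimal illustration, not a recommended training recipe: the synthetic data (noisy samples of y = 3x + 2), the learning rate and the step count are all assumptions chosen for the sketch.

```python
import numpy as np

# Data: hypothetical noisy samples of y = 3x + 2 (an assumption for this sketch)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 3.0 * x + 2.0 + rng.normal(0.0, 0.1, size=200)

# Math model: a straight line y_hat = w * x + b
w, b = 0.0, 0.0
learning_rate = 0.1

# Optimisation: gradient descent on the mean squared error
for _ in range(2000):
    error = (w * x + b) - y
    grad_w = 2.0 * np.mean(error * x)  # d(MSE)/dw
    grad_b = 2.0 * np.mean(error)      # d(MSE)/db
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

# Trained model: w and b should land close to 3 and 2
print(f"w = {w:.2f}, b = {b:.2f}")
```

Algebra supplies the model (`w * x + b`); calculus supplies the gradients that tell the optimiser which way to move `w` and `b`.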

Calculus

Calculus is the mathematics of change and accumulation: a framework for understanding and controlling how quantities change.

It answers two big questions:

  • How fast is something changing right now? → derivatives (differentiation)
  • How much has accumulated over an interval? → integrals (integration)

Derivatives also help answer:

  • What happens when inputs change slightly?
  • Where is something maximum or minimum?
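Both questions can be approximated numerically. The sketch below is illustrative only: the function `f(x) = x ** 2`, the helper names `derivative` and `integral`, and the step sizes are assumptions made for this example.

```python
import numpy as np

def f(x):
    return x ** 2

def derivative(func, x, h=1e-5):
    """Central-difference estimate of how fast func changes at x."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

def integral(func, a, b, n=10_000):
    """Trapezoidal-rule estimate of how much func accumulates over [a, b]."""
    xs = np.linspace(a, b, n + 1)
    ys = func(xs)
    dx = (b - a) / n
    return dx * (ys[0] / 2.0 + ys[1:-1].sum() + ys[-1] / 2.0)

slope_at_3 = derivative(f, 3.0)      # analytic value: 2 * 3 = 6
area_0_to_3 = integral(f, 0.0, 3.0)  # analytic value: 3**3 / 3 = 9

print(slope_at_3, area_0_to_3)
```

The estimates agree with the analytic values only up to a small numerical error, which shrinks as `h` gets smaller (within floating-point limits) and `n` gets larger.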

flowchart TD
  A[Calculus] --> B[Limits]
  B --> C[Continuity]
  B --> D[Derivatives]
  B --> E[Integrals]
  D --> F[Optimisation: maxima/minima]
  D --> G[ML: gradients & learning]
  E --> H[Accumulation: area/total change]


  1. Differential Calculus (Rates of Change)

    Studies how a quantity changes in response to a small change in its input. Its central object is the derivative.
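As a worked illustration of a rate of change, here is the standard limit definition of the derivative, applied to the example function f(x) = x^2 (the choice of function is an assumption made for this sketch):

```latex
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}

% Worked example with f(x) = x^2:
\frac{(x + h)^2 - x^2}{h} = \frac{2xh + h^2}{h} = 2x + h
\;\longrightarrow\; 2x \quad \text{as } h \to 0
```

So the rate of change of x^2 at x = 3 is 6: near that point, a small increase in x raises the output about six times as much.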

Principal Component Analysis (PCA)

  • a dimensionality reduction technique
  • reduces the number of features in a dataset while keeping the most important information
  • simplifies complex datasets by transforming correlated features into a smaller set of uncorrelated components
  • uses linear algebra to transform data into new features called principal components
  • finds these by calculating eigenvectors (directions) and eigenvalues (variance along each direction) from the covariance matrix
  • selects the top components with the highest eigenvalues and projects the data onto them to simplify the dataset

PCA prioritises the directions in which the data varies the most, on the assumption that more variance means more useful information.
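The steps above (centre the data, compute the covariance matrix, take its eigenvectors and eigenvalues, keep the top components, project) can be sketched with NumPy. This is a minimal illustration under stated assumptions: the helper name `pca` and the synthetic two-feature dataset are hypothetical, and library implementations typically use an SVD-based approach instead.

```python
import numpy as np

def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components."""
    # Centre each feature so variance is measured around the mean
    X_centered = X - X.mean(axis=0)

    # Covariance matrix of the features (n_features x n_features)
    cov = np.cov(X_centered, rowvar=False)

    # eigh is meant for symmetric matrices; eigenvalues come back in ascending order
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Sort by decreasing eigenvalue and keep the top directions
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order]

    # Project the centred data onto the selected directions
    return X_centered @ components, eigenvalues[order]

# Hypothetical data: two strongly correlated features
rng = np.random.default_rng(0)
t = rng.normal(size=500)
X = np.column_stack([t, 2.0 * t + rng.normal(scale=0.1, size=500)])

X_reduced, top_eigenvalues = pca(X, n_components=1)
print(X_reduced.shape)  # (500, 1)
```

Because the two features are almost perfectly correlated, a single component captures nearly all of the variance, which is the situation where PCA loses the least information.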