Optimizers

Optimisation of Deep models

Optimisation of Deep models #

Optimizers are algorithms that update neural network parameters to reduce the loss function.

Deep networks usually have millions or billions of parameters, so there is usually no closed-form solution.

Instead, training uses iterative optimisation.

Key takeaway:
An optimiser decides how the model moves through the loss landscape towards lower loss.


  • Goal of Optimization
  • Optimization Challenges in Deep Learning
  • Gradient Descent
  • Stochastic Gradient Descent
  • Minibatch Stochastic Gradient Descent
  • Momentum
  • Adagrad and Algorithm
  • RMSProp and Algorithm
  • Adadelta and Algorithm
  • Adam and Algorithm
  • Code Implementation and comparison of algorithms (webinar)

flowchart TD
    A["Optimisers in DNN"] --> B["Gradient Descent Variants"]
    A --> C["Momentum-based Optimiser"]
    A --> D["Adaptive Methods"]
    A --> E["Learning Rate Schedules"]

    D --> D1["Parameter-specific learning rates"]

    E --> E1["Learning rate changes during training"]

    style A fill:#E1F5FE,stroke:#4A90E2,stroke-width:2px
    style B fill:#EDE7F6,stroke:#7E57C2
    style C fill:#C8E6C9,stroke:#43A047
    style D fill:#FFF9C4,stroke:#FBC02D
    style E fill:#F8BBD0,stroke:#D81B60

Goal of Optimisation ☆ #

The goal is to find parameters \( \theta \) that minimise the loss.