Optimisation of Deep models
Optimisers are algorithms that update a neural network's parameters to reduce the loss.
Deep networks usually have millions or billions of parameters and a non-convex loss, so the minimum generally has no closed-form solution.
Instead, training uses iterative optimisation.
Key takeaway:
An optimiser decides how the model moves through the loss landscape towards lower loss.
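To make "iterative optimisation" concrete, here is a minimal sketch of plain gradient descent on a one-parameter toy loss. The loss \( (\theta - 3)^2 \), the learning rate `lr`, the step count and the function names are all illustrative assumptions, not part of any real model; the point is only that repeated small steps against the gradient move \( \theta \) towards lower loss.

```python
# Toy loss L(theta) = (theta - 3)^2, chosen only for illustration (assumption).
def loss(theta: float) -> float:
    return (theta - 3.0) ** 2


def grad(theta: float) -> float:
    # Analytic derivative dL/dtheta = 2 * (theta - 3).
    return 2.0 * (theta - 3.0)


def gradient_descent(theta0: float, lr: float = 0.1, steps: int = 100) -> list:
    """Run plain gradient descent and return every parameter value visited."""
    theta = theta0
    path = [theta]
    for _ in range(steps):
        # Step against the gradient, scaled by the learning rate.
        theta = theta - lr * grad(theta)
        path.append(theta)
    return path


path = gradient_descent(theta0=0.0)
print(f"start loss = {loss(path[0]):.4f}, final loss = {loss(path[-1]):.2e}")
```

With these settings each step shrinks the distance to the minimum at \( \theta = 3 \) by a constant factor, so the loss falls steadily. A learning rate that is too large for this loss (for example `lr = 1.5`) would instead make the iterates diverge, which is one reason the choice of optimiser and step size matters.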
- Goal of Optimisation
- Optimisation Challenges in Deep Learning
- Gradient Descent
- Stochastic Gradient Descent
- Minibatch Stochastic Gradient Descent
- Momentum
- Adagrad and Algorithm
- RMSProp and Algorithm
- Adadelta and Algorithm
- Adam and Algorithm
- Code Implementation and comparison of algorithms (webinar)
flowchart TD
A["Optimisers in DNN"] --> B["Gradient Descent Variants"]
A --> C["Momentum-based Optimiser"]
A --> D["Adaptive Methods"]
A --> E["Learning Rate Schedules"]
D --> D1["Parameter-specific learning rates"]
E --> E1["Learning rate changes during training"]
style A fill:#E1F5FE,stroke:#4A90E2,stroke-width:2px
style B fill:#EDE7F6,stroke:#7E57C2
style C fill:#C8E6C9,stroke:#43A047
style D fill:#FFF9C4,stroke:#FBC02D
style E fill:#F8BBD0,stroke:#D81B60
Goal of Optimisation ☆
The goal is to find parameters \( \theta \) that minimise the loss.
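One common way to write this goal down is as empirical risk minimisation. The notation here is an assumption introduced for illustration: \( \mathcal{L} \) is the training loss, \( N \) the number of training examples, \( f_\theta \) the network, \( (x_i, y_i) \) an input and its target, and \( \ell \) a per-example loss such as squared error or cross-entropy.

\[
\theta^{*} = \arg\min_{\theta} \mathcal{L}(\theta),
\qquad
\mathcal{L}(\theta) = \frac{1}{N} \sum_{i=1}^{N} \ell\bigl(f_\theta(x_i),\, y_i\bigr)
\]

In practice an optimiser rarely reaches \( \theta^{*} \) exactly; it produces a sequence of parameter values whose loss is, ideally, decreasing.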