<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Optimizers on Arshad Siddiqui</title><link>https://arshadhs.github.io/tags/optimizers/</link><description>Recent content in Optimizers on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><atom:link href="https://arshadhs.github.io/tags/optimizers/index.xml" rel="self" type="application/rss+xml"/><item><title>Optimisation of Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</guid><description>&lt;h1 id="optimisation-of-deep-models">
 Optimisation of Deep models
 
 &lt;a class="anchor" href="#optimisation-of-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;p>Optimisers are algorithms that update a neural network's parameters to reduce the loss function.&lt;/p>
&lt;p>Deep networks often have millions or billions of parameters, and the loss is a non-convex function of them, so there is usually no closed-form solution.&lt;/p>
&lt;p>Instead, training uses iterative optimisation.&lt;/p>
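&lt;p>As a minimal sketch of one such iterative scheme, plain gradient descent repeats the update below. The symbols are assumptions introduced here for illustration: \( L \) is the loss, \( \eta \) is the learning rate (step size), and \( t \) indexes the iteration.&lt;/p>
&lt;p>
&lt;span>
 \( \theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t) \)
 &lt;/span>
&lt;/p>
&lt;p>Each step moves the parameters a short distance against the gradient, so for a sufficiently small \( \eta \) the loss typically decreases from one iteration to the next.&lt;/p>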
&lt;blockquote class="book-hint info">
&lt;p>&lt;strong>Key takeaway:&lt;/strong>&lt;br>
An optimiser decides how the model moves through the loss landscape towards lower loss.&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;ul>
&lt;li>Goal of Optimisation&lt;/li>
&lt;li>Optimisation Challenges in Deep Learning&lt;/li>
&lt;li>Gradient Descent&lt;/li>
&lt;li>Stochastic Gradient Descent&lt;/li>
&lt;li>Minibatch Stochastic Gradient Descent&lt;/li>
&lt;li>Momentum&lt;/li>
&lt;li>Adagrad and Algorithm&lt;/li>
&lt;li>RMSProp and Algorithm&lt;/li>
&lt;li>Adadelta and Algorithm&lt;/li>
&lt;li>Adam and Algorithm&lt;/li>
&lt;li>Code Implementation and comparison of algorithms (webinar)&lt;/li>
&lt;/ul>
&lt;hr>


&lt;pre class="mermaid">
flowchart TD
 A[&amp;#34;Optimisers in DNN&amp;#34;] --&amp;gt; B[&amp;#34;Gradient Descent Variants&amp;#34;]
 A --&amp;gt; C[&amp;#34;Momentum-based Optimisers&amp;#34;]
 A --&amp;gt; D[&amp;#34;Adaptive Methods&amp;#34;]
 A --&amp;gt; E[&amp;#34;Learning Rate Schedules&amp;#34;]

 D --&amp;gt; D1[&amp;#34;Parameter-specific learning rates&amp;#34;]

 E --&amp;gt; E1[&amp;#34;Learning rate changes during training&amp;#34;]

 style A fill:#E1F5FE,stroke:#4A90E2,stroke-width:2px
 style B fill:#EDE7F6,stroke:#7E57C2
 style C fill:#C8E6C9,stroke:#43A047
 style D fill:#FFF9C4,stroke:#FBC02D
 style E fill:#F8BBD0,stroke:#D81B60
&lt;/pre>

&lt;hr>
&lt;h2 id="goal-of-optimisation-">
 Goal of Optimisation ☆
 
 &lt;a class="anchor" href="#goal-of-optimisation-">#&lt;/a>
 
&lt;/h2>
&lt;p>The goal is to find parameters 
&lt;span>
 \( \theta \)
 &lt;/span>

 that minimise the loss.&lt;/p></description></item></channel></rss>