<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Gradient Descent on Arshad Siddiqui</title><link>https://arshadhs.github.io/tags/gradient-descent/</link><description>Recent content in Gradient Descent on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 21 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://arshadhs.github.io/tags/gradient-descent/index.xml" rel="self" type="application/rss+xml"/><item><title>Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</guid><description>&lt;h1 id="gradient-descent-for-linear-regression">
 Gradient Descent for Linear Regression
 
 &lt;a class="anchor" href="#gradient-descent-for-linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an iterative optimisation method used to minimise the regression cost function by repeatedly updating parameters in the direction that reduces error.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Iterative method&lt;/strong>&lt;/li>
&lt;li>Types: batch / stochastic / mini-batch&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Gradient descent starts with initial parameter values and repeatedly updates them using the gradient until the cost stops decreasing.&lt;/p>
&lt;/blockquote>
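&lt;p>The update loop above can be sketched in a few lines of NumPy. This is a minimal illustration, not a library implementation: the data, the learning rate, and the iteration count are all invented for the example.&lt;/p>

```python
import numpy as np

# Hypothetical 1-D data: y = 3x + 2 plus noise (illustrative values only).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 200)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, 200)

w, b = 0.0, 0.0   # initial parameter values
lr = 0.5          # learning rate (a hyperparameter)

for _ in range(1000):          # repeat until the cost stops decreasing
    err = w * x + b - y        # prediction error
    # Gradients of the mean-squared-error cost with respect to w and b
    dw = 2 * np.mean(err * x)
    db = 2 * np.mean(err)
    w -= lr * dw               # step in the direction that reduces error
    b -= lr * db

print(w, b)   # should land near the true values 3 and 2
```

Each iteration moves the parameters a small step opposite the gradient; with a sensible learning rate the cost shrinks until the estimates settle near the values that generated the data.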


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
GD[&amp;#34;Gradient&amp;lt;br/&amp;gt;Descent&amp;#34;] --&amp;gt;|minimises| CF[&amp;#34;Cost&amp;lt;br/&amp;gt;function&amp;#34;]
GD --&amp;gt;|updates| W[&amp;#34;Parameters&amp;lt;br/&amp;gt;(weights)&amp;#34;]
GD --&amp;gt;|uses| GR[&amp;#34;Gradient&amp;lt;br/&amp;gt;(slope)&amp;#34;]

GD --&amp;gt; H[&amp;#34;Hyperparameters&amp;#34;]
H --&amp;gt; LR[&amp;#34;Learning&amp;lt;br/&amp;gt;rate&amp;#34;]
H --&amp;gt; BS[&amp;#34;Batch&amp;lt;br/&amp;gt;size&amp;#34;]
H --&amp;gt; EP[&amp;#34;Epochs&amp;#34;]

style GD fill:#90CAF9,stroke:#1E88E5,color:#000

style CF fill:#CE93D8,stroke:#8E24AA,color:#000
style W fill:#CE93D8,stroke:#8E24AA,color:#000
style GR fill:#CE93D8,stroke:#8E24AA,color:#000
style H fill:#CE93D8,stroke:#8E24AA,color:#000
style LR fill:#CE93D8,stroke:#8E24AA,color:#000
style BS fill:#CE93D8,stroke:#8E24AA,color:#000
style EP fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>
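&lt;p>Of the hyperparameters in the diagram, the learning rate is the most delicate. A toy sketch on the one-dimensional cost f(w) = w&amp;#178; (whose gradient is 2w) shows why: a small rate converges, while a rate that overshoots the minimum makes the parameter diverge. The function and values here are illustrative only.&lt;/p>

```python
def run_gd(lr, steps=50):
    """Minimise f(w) = w**2, whose gradient is 2*w, starting from w = 1.0."""
    w = 1.0
    for _ in range(steps):
        w -= lr * 2 * w   # standard gradient-descent update
    return w

print(abs(run_gd(0.1)))   # small learning rate: w shrinks towards the minimum at 0
print(abs(run_gd(1.5)))   # too-large learning rate: each step overshoots, so w blows up
```

Each update multiplies w by (1 - 2*lr), so the iterates contract only when that factor has magnitude below 1; with lr = 1.5 the factor is -2 and the parameter doubles in size every step.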

&lt;hr>
&lt;h2 id="types-of-gd">
 Types of GD
 
 &lt;a class="anchor" href="#types-of-gd">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Gradient Descent&amp;lt;br/&amp;gt;types&amp;#34;] --&amp;gt; BGD[&amp;#34;Batch&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; SGD[&amp;#34;Stochastic&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; MGD[&amp;#34;Mini-batch&amp;lt;br/&amp;gt;GD&amp;#34;]

BGD --&amp;gt; ALL[&amp;#34;All data&amp;lt;br/&amp;gt;per step&amp;#34;]
BGD --&amp;gt; STB[&amp;#34;Smooth&amp;lt;br/&amp;gt;updates&amp;#34;]

SGD --&amp;gt; ONE[&amp;#34;1 sample&amp;lt;br/&amp;gt;per step&amp;#34;]
SGD --&amp;gt; FAST[&amp;#34;Quick&amp;lt;br/&amp;gt;progress&amp;#34;]
SGD --&amp;gt; NOISE[&amp;#34;Noisy&amp;lt;br/&amp;gt;updates&amp;#34;]

MGD --&amp;gt; MB[&amp;#34;Small batch&amp;lt;br/&amp;gt;per step&amp;#34;]
MGD --&amp;gt; PRACT[&amp;#34;Practical&amp;lt;br/&amp;gt;default&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style BGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style SGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style MGD fill:#C8E6C9,stroke:#2E7D32,color:#000

style ALL fill:#CE93D8,stroke:#8E24AA,color:#000
style STB fill:#CE93D8,stroke:#8E24AA,color:#000
style ONE fill:#CE93D8,stroke:#8E24AA,color:#000
style FAST fill:#CE93D8,stroke:#8E24AA,color:#000
style NOISE fill:#CE93D8,stroke:#8E24AA,color:#000
style MB fill:#CE93D8,stroke:#8E24AA,color:#000
style PRACT fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>
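&lt;p>The three variants differ only in how much data feeds each gradient step, so one loop can express all of them by varying the batch size. A sketch, with invented data and hyperparameters: batch_size equal to the dataset size gives batch GD, 1 gives stochastic GD, and anything in between is mini-batch.&lt;/p>

```python
import numpy as np

# Hypothetical data: y = 3x + 2 plus noise (illustrative values only).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 1000)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, 1000)

def gradient(w, b, xb, yb):
    """MSE gradient computed on an arbitrary subset (xb, yb) of the data."""
    err = w * xb + b - yb
    return 2 * np.mean(err * xb), 2 * np.mean(err)

w, b, lr = 0.0, 0.0, 0.5
batch_size = 32   # mini-batch; len(x) would be batch GD, 1 would be SGD

for epoch in range(20):
    idx = rng.permutation(len(x))            # shuffle once per epoch
    for start in range(0, len(x), batch_size):
        sel = idx[start:start + batch_size]  # the samples for this step
        dw, db = gradient(w, b, x[sel], y[sel])
        w -= lr * dw
        b -= lr * db
```

Smaller batches make each step cheaper but noisier; the mini-batch setting shown here is the practical middle ground the diagram labels the default.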

&lt;h3 id="batch">
 Batch
 
 &lt;a class="anchor" href="#batch">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>Computes the gradient over the entire dataset each step, so updates are smooth but every step is expensive; use only when the dataset is small enough, or when you have ample compute and training time&lt;/li>
&lt;/ul>
&lt;h3 id="sgd">
 SGD
 
 &lt;a class="anchor" href="#sgd">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>A common go-to choice: each step uses a single sample, so updates are cheap and initial progress is fast, at the cost of noisy convergence&lt;/p>
&lt;/li>
&lt;/ul></description></item></channel></rss>