<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>ML on Arshad Siddiqui</title><link>https://arshadhs.github.io/categories/ml/</link><description>Recent content in ML on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 18 Mar 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://arshadhs.github.io/categories/ml/index.xml" rel="self" type="application/rss+xml"/><item><title>Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</guid><description>&lt;h1 id="supervised-learning">
 Supervised Learning
 
 &lt;a class="anchor" href="#supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Trained using &lt;strong>labelled data&lt;/strong>.&lt;br>
Each example in the training set includes the &lt;strong>correct output&lt;/strong>.&lt;br>
The algorithm learns to &lt;strong>generalise&lt;/strong> and make predictions on unseen data.&lt;br>
Generally more &lt;strong>accurate&lt;/strong> than unsupervised methods.&lt;br>
Requires &lt;strong>human intervention&lt;/strong> for labelling and setup.&lt;br>
Widely used because, given good-quality labelled data, it delivers &lt;strong>accurate and efficient&lt;/strong> results.&lt;/p>
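&lt;p>A minimal hedged sketch of this workflow (an added illustration, not the article's own code): fit a model on labelled pairs, then predict an unseen input. The toy data and the choice of scikit-learn's LogisticRegression are assumptions.&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: supervised learning = labelled examples in, predictions out.
# Toy data; assumes scikit-learn is installed.
from sklearn.linear_model import LogisticRegression

X_train = [[1.0], [2.0], [3.0], [4.0]]   # features
y_train = [0, 0, 1, 1]                   # correct outputs (labels)

model = LogisticRegression()
model.fit(X_train, y_train)              # learn from labelled data
print(model.predict([[2.5]]))            # generalise to an unseen input
&lt;/code>&lt;/pre>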
&lt;hr>
&lt;h2 id="classification">
 Classification
 
 &lt;a class="anchor" href="#classification">#&lt;/a>
 
&lt;/h2>
&lt;p>Output is &lt;strong>discrete&lt;/strong> (e.g. Yes/No, Spam/Not Spam).&lt;br>
Used for &lt;strong>categorising data&lt;/strong> into predefined classes.&lt;br>
Support Vector Machine (SVM) is a common classifier (a linear classifier with margin-based separation).&lt;/p></description></item><item><title>Differentiation of Univariate Functions</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/</guid><description>&lt;h1 id="differentiation-of-univariate-functions">
 Differentiation of Univariate Functions
 
 &lt;a class="anchor" href="#differentiation-of-univariate-functions">#&lt;/a>
 
&lt;/h1>
&lt;p>Differentiation measures the rate of change. For a function f(x), the derivative f'(x) gives the instantaneous rate of change of f at x.&lt;/p>
&lt;span style="color: red;">
 $[
f'(x) = $lim_{h $to 0} $frac{f(x+h)-f(x)}{h}
$]
&lt;/span>
&lt;p>Interpretation:&lt;/p>
&lt;ul>
&lt;li>Slope of tangent&lt;/li>
&lt;li>Instantaneous rate of change&lt;/li>
&lt;/ul>
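&lt;p>A small numeric sketch of the limit definition (added illustration; the function x**2 is an assumption): shrink h and the difference quotient approaches the derivative.&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: approximate f'(x) from the limit definition by shrinking h.
def f(x):
    return x ** 2            # example function (assumed for illustration)

x = 3.0
for h in [0.1, 0.01, 0.001]:
    approx = (f(x + h) - f(x)) / h
    print(h, approx)         # values approach the true derivative 2x = 6
&lt;/code>&lt;/pre>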
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Unsupervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</guid><description>&lt;h1 id="unsupervised-learning">
 Unsupervised Learning
 
 &lt;a class="anchor" href="#unsupervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Works on &lt;strong>unlabelled raw data&lt;/strong>.&lt;/li>
&lt;li>The algorithm &lt;strong>discovers hidden patterns&lt;/strong> without prior knowledge of outcomes.&lt;/li>
&lt;li>Requires &lt;strong>no human intervention&lt;/strong> during training.&lt;/li>
&lt;li>Does not make direct predictions — it &lt;strong>groups or organises data&lt;/strong> instead.&lt;/li>
&lt;li>Carries a &lt;strong>higher risk&lt;/strong> because there’s no ground truth to verify results.&lt;/li>
&lt;li>Common techniques include &lt;strong>Clustering&lt;/strong>, &lt;strong>Association&lt;/strong>, and &lt;strong>Dimensionality Reduction&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
stateDiagram-v2

 %% ML maths-based colours (same palette as supervised)
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#f3f4f6,stroke:#374151

 %% Root
 USL: Unsupervised Learning

 %% Main branches
 USL --&amp;gt; CLU:::category
 CLU: Clustering

 USL --&amp;gt; DR:::category
 DR: Dimensionality Reduction

 %% Clustering algorithms
 CLU --&amp;gt; KM:::geometry
 KM: K-Means

 CLU --&amp;gt; HC:::geometry
 HC: Hierarchical Clustering

 CLU --&amp;gt; DB:::geometry
 DB: DBSCAN

 %% Probabilistic models
 USL --&amp;gt; PM:::category
 PM: Probabilistic Models

 PM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 PM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;h2 id="clustering">
 Clustering
 
 &lt;a class="anchor" href="#clustering">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Groups &lt;strong>similar data points&lt;/strong> together based on shared features.&lt;/li>
&lt;li>Commonly used for &lt;strong>market segmentation&lt;/strong>, &lt;strong>image compression&lt;/strong>, and &lt;strong>anomaly detection&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-types-of-clustering">
 Common Types of Clustering
 
 &lt;a class="anchor" href="#common-types-of-clustering">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>K-Means Clustering&lt;/strong> – Divides data into &lt;em>K&lt;/em> groups based on similarity.&lt;/li>
&lt;li>&lt;strong>Hierarchical Clustering&lt;/strong> – Builds a hierarchy (tree) of clusters.&lt;/li>
&lt;li>&lt;strong>DBSCAN (Density-Based Spatial Clustering)&lt;/strong> – Groups points close in density; identifies noise/outliers.&lt;/li>
&lt;/ul>
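&lt;p>A hedged K-Means sketch (added here for illustration; the 2-D points are made up and scikit-learn is assumed):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: divide toy 2-D points into K = 2 groups by similarity.
from sklearn.cluster import KMeans

X = [[1, 2], [1, 4], [1, 0],
     [10, 2], [10, 4], [10, 0]]

km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)            # cluster index per point
print(km.cluster_centers_)   # one centroid per cluster
&lt;/code>&lt;/pre>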
&lt;hr>
&lt;h2 id="association">
 Association
 
 &lt;a class="anchor" href="#association">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Identifies &lt;strong>relationships or correlations&lt;/strong> between variables in a dataset.&lt;/li>
&lt;li>Commonly used in &lt;strong>market basket analysis&lt;/strong> (e.g. &amp;ldquo;Customers who bought X also bought Y&amp;rdquo;).&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Apriori Algorithm&lt;/strong> – Finds frequent itemsets and generates association rules.&lt;/li>
&lt;li>&lt;strong>Eclat Algorithm&lt;/strong> – Similar to Apriori but uses set intersections for faster computation.&lt;/li>
&lt;/ul>
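&lt;p>A hedged sketch of the counting step behind Apriori (illustration only; the baskets and support threshold are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: count item pairs across toy baskets and keep the
# "frequent" ones, the starting point of Apriori-style rule mining.
from itertools import combinations
from collections import Counter

baskets = [{"milk", "bread"}, {"milk", "eggs"}, {"milk", "bread", "eggs"}]
min_support = 2

pair_counts = Counter()
for b in baskets:
    for pair in combinations(sorted(b), 2):
        pair_counts[pair] += 1

frequent = {p: c for p, c in pair_counts.items() if c &amp;gt;= min_support}
print(frequent)   # ('bread', 'milk') and ('eggs', 'milk') appear twice
&lt;/code>&lt;/pre>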
&lt;hr>
&lt;h2 id="dimensionality-reduction">
 Dimensionality Reduction
 
 &lt;a class="anchor" href="#dimensionality-reduction">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Reduces the &lt;strong>number of input variables&lt;/strong> to simplify data.&lt;/li>
&lt;li>Helps remove noise and redundancy.&lt;/li>
&lt;li>Commonly used in &lt;strong>data pre-processing&lt;/strong> and &lt;strong>visualisation&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques-1">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques-1">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Principal Component Analysis (PCA)&lt;/strong> – Projects data onto fewer dimensions while keeping most variance.&lt;/li>
&lt;li>&lt;strong>Linear Discriminant Analysis (LDA)&lt;/strong> – Focuses on class separation.&lt;/li>
&lt;li>&lt;strong>t-SNE (t-Distributed Stochastic Neighbour Embedding)&lt;/strong> – Used for visualising high-dimensional data.&lt;/li>
&lt;li>&lt;strong>Autoencoders&lt;/strong> – Neural networks that compress and reconstruct data.&lt;/li>
&lt;/ul>
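&lt;p>A hedged PCA sketch via the SVD (illustration only; numpy and the toy data are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: centre the data, take the top singular vector,
# and project from two dimensions down to one (most variance kept).
import numpy as np

X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0]])       # toy data

Xc = X - X.mean(axis=0)                      # centre
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X1 = Xc @ Vt[:1].T                           # 1-D projection
print(X1.ravel())
&lt;/code>&lt;/pre>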
&lt;hr>


&lt;pre class="mermaid">
mindmap
  root(Unsupervised Learning)
    Clustering
      K Means
      Hierarchical Clustering
      DBSCAN
    Dimensionality Reduction
      PCA
      t SNE
      Autoencoders
    Probabilistic Models
      Gaussian Mixture Model
      Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Partial Differentiation and Gradients</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/</guid><description>&lt;h1 id="partial-differentiation-and-gradients">
 Partial Differentiation and Gradients
 
 &lt;a class="anchor" href="#partial-differentiation-and-gradients">#&lt;/a>
 
&lt;/h1>
&lt;p>For f(x1, x2, &amp;hellip;, xn):&lt;/p>
&lt;span style="color: red;">
 \[
\frac{\partial f}{\partial x_i}
\]
&lt;/span>
&lt;p>Gradient vector:&lt;/p>
&lt;span style="color: red;">
 \[
\nabla f =
\begin{bmatrix}
\frac{\partial f}{\partial x_1} \\
\vdots \\
\frac{\partial f}{\partial x_n}
\end{bmatrix}
\]
&lt;/span>
&lt;p>The gradient points in the direction of steepest ascent.&lt;/p>
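&lt;p>A hedged numeric sketch of partial derivatives and the gradient (illustration only; the test function and step size are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: central-difference gradient of f(x1, x2) = x1**2 + 3*x2.
def f(v):
    x1, x2 = v
    return x1 ** 2 + 3 * x2

def grad(f, v, h=1e-6):
    g = []
    for i in range(len(v)):
        up = list(v); up[i] += h          # nudge coordinate i up
        dn = list(v); dn[i] -= h          # and down
        g.append((f(up) - f(dn)) / (2 * h))
    return g

print(grad(f, [2.0, 5.0]))   # close to the analytic gradient [4, 3]
&lt;/code>&lt;/pre>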


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 Input --&amp;gt; Function
 Function --&amp;gt; Gradient
 Gradient --&amp;gt; Optimisation
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Linear Independence</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/</guid><description>&lt;h1 id="linear-independence">
 Linear Independence
 
 &lt;a class="anchor" href="#linear-independence">#&lt;/a>
 
&lt;/h1>
&lt;p>A set of vectors is &lt;strong>linearly independent&lt;/strong> if none of them can be written as a linear combination of the others.&lt;/p>

&lt;span style="color: green;">
 &lt;span>
 \[ 
c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}
\;\Rightarrow\;
c_1=\cdots=c_k=0
 \]
 &lt;/span>

&lt;/span>
&lt;p>Independence means each vector adds &lt;strong>new information&lt;/strong>.&lt;/p>
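&lt;p>A hedged numeric check (illustration only; numpy and the example vectors are assumptions): the matrix rank equals the number of vectors exactly when they are independent.&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: v3 = v1 + v2, so the set is linearly dependent.
import numpy as np

v1, v2, v3 = [1, 0, 0], [0, 1, 0], [1, 1, 0]
A = np.array([v1, v2, v3]).T          # vectors as columns

rank = np.linalg.matrix_rank(A)
print(rank == A.shape[1])             # False: v3 adds no new information
&lt;/code>&lt;/pre>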
&lt;h2 id="why-it-matters">
 Why it matters
 
 &lt;a class="anchor" href="#why-it-matters">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Detects redundancy&lt;/li>
&lt;li>Connects to rank and basis&lt;/li>
&lt;/ul>
&lt;p>If one vector can already be formed using others, it does not add anything new.&lt;/p></description></item><item><title>Semi-Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</guid><description>&lt;h1 id="semi-supervised-learning">
 Semi-Supervised Learning
 
 &lt;a class="anchor" href="#semi-supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A combination of &lt;strong>labelled&lt;/strong> and &lt;strong>unlabelled data&lt;/strong>.&lt;/li>
&lt;li>Useful when labelling large datasets is &lt;strong>expensive or time-consuming&lt;/strong>.&lt;/li>
&lt;li>Works well with &lt;strong>high-volume datasets&lt;/strong> (e.g. millions of images).&lt;/li>
&lt;li>Only a &lt;strong>small fraction of data&lt;/strong> is labelled (e.g. a few thousand).&lt;/li>
&lt;li>The algorithm learns from both labelled examples and structure in unlabelled data.&lt;/li>
&lt;li>&lt;strong>Ideal for medical imaging&lt;/strong> where labelled data is limited.&lt;/li>
&lt;li>For example, a &lt;strong>radiologist&lt;/strong> can label a small set of medical scans,&lt;br>
and the model uses that to learn from thousands of unlabelled scans.&lt;/li>
&lt;li>Helps improve &lt;strong>accuracy and generalisation&lt;/strong> with minimal manual effort.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Gradients of Vector-Valued and Matrix Functions</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/</guid><description>&lt;h1 id="gradients-of-vector-valued-and-matrix-functions">
 Gradients of Vector-Valued and Matrix Functions
 
 &lt;a class="anchor" href="#gradients-of-vector-valued-and-matrix-functions">#&lt;/a>
 
&lt;/h1>
&lt;p>Covers gradients when outputs or parameters are vectors/matrices.&lt;/p>
&lt;p>If f: R^n -&amp;gt; R^m, the derivative is the Jacobian.&lt;/p>
&lt;span style="color: red;">
 \[
J =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} &amp;amp; \dots &amp;amp; \frac{\partial f_1}{\partial x_n} \\
\vdots &amp;amp; \ddots &amp;amp; \vdots \\
\frac{\partial f_m}{\partial x_1} &amp;amp; \dots &amp;amp; \frac{\partial f_m}{\partial x_n}
\end{bmatrix}
\]
&lt;/span>
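&lt;p>A hedged numeric sketch of the Jacobian (illustration only; the example map and numpy are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: central-difference Jacobian of f(x, y) = (x*y, x + y).
import numpy as np

def f(v):
    x, y = v
    return np.array([x * y, x + y])

def jacobian(f, v, h=1e-6):
    v = np.asarray(v, dtype=float)
    cols = []
    for i in range(v.size):
        e = np.zeros_like(v); e[i] = h
        cols.append((f(v + e) - f(v - e)) / (2 * h))
    return np.stack(cols, axis=1)     # entry (i, j) is d f_i / d x_j

print(jacobian(f, [2.0, 3.0]))        # close to [[3, 2], [1, 1]]
&lt;/code>&lt;/pre>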
&lt;p>For scalar f(x):&lt;/p>
&lt;span style="color: red;">
 \[
H = \nabla^2 f
\]
&lt;/span>
&lt;p>Hessian captures curvature.&lt;/p></description></item><item><title>Reinforcement Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/</guid><description>&lt;h1 id="reinforcement-learning-rl">
 Reinforcement Learning (RL)
 
 &lt;a class="anchor" href="#reinforcement-learning-rl">#&lt;/a>
 
&lt;/h1>
&lt;p>RL is learning by &lt;strong>trial and error&lt;/strong>.&lt;/p>
&lt;p>Reinforcement Learning (RL) is a type of machine learning where an &lt;strong>autonomous agent learns to make decisions by interacting with an environment&lt;/strong>.&lt;/p>
&lt;p>Instead of being told the correct answer, the agent:&lt;/p>
&lt;ul>
&lt;li>takes actions&lt;/li>
&lt;li>observes outcomes&lt;/li>
&lt;li>receives rewards or penalties&lt;/li>
&lt;li>gradually learns a strategy that maximises long-term reward&lt;/li>
&lt;/ul>
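&lt;p>A hedged sketch of that loop (illustration only; the two-armed bandit environment, its reward rates, and the exploration rate are all made-up assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: act, observe a reward, update a value estimate.
import random

Q = [0.0, 0.0]                       # value estimate per action
counts = [0, 0]
true_reward = [0.2, 0.8]             # hidden from the agent

for step in range(500):
    if random.random() &amp;lt; 0.1:       # occasionally explore
        a = random.randrange(2)
    else:                            # otherwise exploit what is known
        a = max(range(2), key=lambda i: Q[i])
    r = 1.0 if random.random() &amp;lt; true_reward[a] else 0.0
    counts[a] += 1
    Q[a] += (r - Q[a]) / counts[a]   # running-average update

print(Q)   # roughly recovers the true reward rates
&lt;/code>&lt;/pre>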

&lt;blockquote class='book-hint '>
 &lt;p>&lt;strong>Reinforcement Learning teaches an agent how to act, not what to predict.&lt;/strong>&lt;/p></description></item><item><title>Useful Gradient Identities</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/</guid><description>&lt;h1 id="useful-gradient-identities">
 Useful Gradient Identities
 
 &lt;a class="anchor" href="#useful-gradient-identities">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: red;">
 [
\nabla (a^T x) = a
]
&lt;/span>
&lt;span style="color: red;">
 [
\nabla (x^T A x) = (A + A^T)x
]
&lt;/span>
&lt;p>If A symmetric:&lt;/p>
&lt;span style="color: red;">
 [
\nabla (x^T A x) = 2Ax
]
&lt;/span>
&lt;p>These are heavily used in &lt;strong>optimisation&lt;/strong>.&lt;/p>
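&lt;p>A hedged numeric check of the second identity (illustration only; numpy and the random point are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: compare a finite-difference gradient of x^T A x
# with the closed form (A + A^T) x.
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
x = rng.normal(size=3)

def f(v):
    return v @ A @ v

h = 1e-6
num = np.array([(f(x + h * e) - f(x - h * e)) / (2 * h) for e in np.eye(3)])
print(np.allclose(num, (A + A.T) @ x))   # True
&lt;/code>&lt;/pre>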
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Inner Products and Dot Product</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/</guid><description>&lt;h1 id="inner-products-and-dot-product">
 Inner Products and Dot Product
 
 &lt;a class="anchor" href="#inner-products-and-dot-product">#&lt;/a>
 
&lt;/h1>
&lt;p>An &lt;strong>inner product&lt;/strong> maps two vectors to a &lt;strong>single scalar&lt;/strong>.&lt;/p>
&lt;p>It allows us to measure:&lt;/p>
&lt;ul>
&lt;li>similarity&lt;/li>
&lt;li>vector length&lt;/li>
&lt;li>projections&lt;/li>
&lt;li>orthogonality&lt;/li>
&lt;/ul>
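&lt;p>A hedged dot-product sketch covering those four uses (illustration only; numpy and the vectors are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: similarity, length, projection, orthogonality.
import numpy as np

a = np.array([3.0, 4.0])
b = np.array([4.0, 3.0])

print(a @ b)                          # scalar similarity (24.0)
print(np.sqrt(a @ a))                 # length of a (5.0)
print((a @ b) / (b @ b) * b)          # projection of a onto b
print(np.isclose(a @ np.array([-4.0, 3.0]), 0))   # orthogonal pair
&lt;/code>&lt;/pre>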


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Inner&amp;lt;br/&amp;gt;products&amp;lt;br/&amp;gt;(types)&amp;#34;] --&amp;gt; DOT[&amp;#34;Euclidean&amp;lt;br/&amp;gt;Dot product&amp;#34;]
T --&amp;gt; WIP[&amp;#34;Weighted&amp;lt;br/&amp;gt;inner product&amp;#34;]
T --&amp;gt; FN[&amp;#34;Function-space&amp;lt;br/&amp;gt;(integral)&amp;#34;]
T --&amp;gt; HERM[&amp;#34;Complex&amp;lt;br/&amp;gt;Hermitian&amp;#34;]
T --&amp;gt; MAT[&amp;#34;Matrix&amp;lt;br/&amp;gt;inner product&amp;lt;br/&amp;gt;(Frobenius)&amp;#34;]

DOT --&amp;gt; Rn[&amp;#34;Vectors in&amp;lt;br/&amp;gt;
&amp;lt;span&amp;gt;
 \( \mathbb{R}^n \)
 &amp;lt;/span&amp;gt;

&amp;#34;]
WIP --&amp;gt; SPD[&amp;#34;SPD matrix&amp;lt;br/&amp;gt;W&amp;#34;]
FN --&amp;gt; L2[&amp;#34;L2 space&amp;lt;br/&amp;gt;functions&amp;#34;]
HERM --&amp;gt; Cn[&amp;#34;Vectors in&amp;lt;br/&amp;gt;C^n&amp;#34;]
MAT --&amp;gt; Mnm[&amp;#34;Matrices&amp;lt;br/&amp;gt;R^{m×n}&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style DOT fill:#C8E6C9,stroke:#2E7D32,color:#000
style WIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style FN fill:#C8E6C9,stroke:#2E7D32,color:#000
style HERM fill:#C8E6C9,stroke:#2E7D32,color:#000
style MAT fill:#C8E6C9,stroke:#2E7D32,color:#000

style Rn fill:#CE93D8,stroke:#8E24AA,color:#000
style SPD fill:#CE93D8,stroke:#8E24AA,color:#000
style L2 fill:#CE93D8,stroke:#8E24AA,color:#000
style Cn fill:#CE93D8,stroke:#8E24AA,color:#000
style Mnm fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h2>
&lt;p>For vectors&lt;br>

&lt;span>
 \( \mathbf{a}, \mathbf{b} \in \mathbb{R}^n \)
 &lt;/span>

&lt;/p></description></item><item><title>Backpropagation and Automatic Differentiation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/</guid><description>&lt;h1 id="backpropagation-and-automatic-differentiation">
 Backpropagation and Automatic Differentiation
 
 &lt;a class="anchor" href="#backpropagation-and-automatic-differentiation">#&lt;/a>
 
&lt;/h1>
&lt;p>Backpropagation applies the chain rule:&lt;/p>
&lt;ul>
&lt;li>efficiently across a computational graph.&lt;/li>
&lt;li>repeatedly.&lt;/li>
&lt;/ul>
&lt;p>Chain rule:&lt;/p>
&lt;span style="color: red;">
 \[
\frac{dL}{dx} = \frac{dL}{dy} \cdot \frac{dy}{dx}
\]
&lt;/span>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 x --&amp;gt; y
 y --&amp;gt; L
&lt;/pre>

&lt;p>Automatic differentiation computes exact derivatives efficiently using computational graphs.&lt;/p>
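&lt;p>A hedged hand-worked sketch of the x to y to L graph above (illustration only; the squared functions are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: forward pass, then the chain rule backwards.
x = 2.0

y = x ** 2               # forward: intermediate value
L = (y - 1.0) ** 2       # forward: loss

dL_dy = 2 * (y - 1.0)    # local derivative at the loss node
dy_dx = 2 * x            # local derivative at the first node
dL_dx = dL_dy * dy_dx    # chain rule: dL/dx = dL/dy * dy/dx

print(dL_dx)             # 2*(x**2 - 1) * 2*x = 24 at x = 2
&lt;/code>&lt;/pre>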
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Higher-order derivatives</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/</guid><description>&lt;h1 id="higher-order-derivatives">
 Higher-order derivatives
 
 &lt;a class="anchor" href="#higher-order-derivatives">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Angles and Orthogonality</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/</guid><description>&lt;h1 id="angles-and-orthogonality">
 Angles and Orthogonality
 
 &lt;a class="anchor" href="#angles-and-orthogonality">#&lt;/a>
 
&lt;/h1>
&lt;p>Once we define an inner product, we can define the &lt;strong>angle between two vectors&lt;/strong>.&lt;/p>
&lt;p>Angles allow us to measure how aligned or different two vectors are in space.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
Angle measures similarity between vectors.
Orthogonality (zero inner product) means no similarity at all.&lt;/p>
&lt;/blockquote>
&lt;h2 id="why-it-matters-in-machine-learning">
 Why It Matters in Machine Learning
 
 &lt;a class="anchor" href="#why-it-matters-in-machine-learning">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>PCA produces orthogonal components&lt;/li>
&lt;li>Orthogonal features reduce redundancy&lt;/li>
&lt;li>Gradient directions depend on angle&lt;/li>
&lt;/ul>
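&lt;p>A hedged numeric sketch (illustration only; numpy and the vectors are assumptions): the inner product gives the angle, and a zero inner product means orthogonality.&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: cos(theta) = a.b / (|a| |b|).
import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.degrees(np.arccos(cos_theta)))          # 45 degrees
print(np.isclose(a @ np.array([0.0, 2.0]), 0))   # orthogonal: no similarity
&lt;/code>&lt;/pre>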
&lt;hr>
&lt;h1 id="angle-formula">
 Angle Formula
 
 &lt;a class="anchor" href="#angle-formula">#&lt;/a>
 
&lt;/h1>
&lt;p>For vectors in n-dimensional space:&lt;/p></description></item><item><title>Taylor’s series</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/</guid><description>&lt;h1 id="linearization-and-multivariate-taylors-series">
 Linearization and multivariate Taylor’s series
 
 &lt;a class="anchor" href="#linearization-and-multivariate-taylors-series">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Maxima and Minima</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/</guid><description>&lt;h1 id="computing-maxima-and-minima-for-unconstrained-optimization">
 Computing maxima and minima for unconstrained optimization
 
 &lt;a class="anchor" href="#computing-maxima-and-minima-for-unconstrained-optimization">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>AI Stages: ANI, AGI, ASI</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-stages/</link><pubDate>Thu, 04 Jul 2024 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-stages/</guid><description>&lt;h1 id="ai-development-stages-ani--agi--asi">
 AI Development Stages: ANI → AGI → ASI
 
 &lt;a class="anchor" href="#ai-development-stages-ani--agi--asi">#&lt;/a>
 
&lt;/h1>
&lt;p>Artificial Intelligence is often described in &lt;strong>three stages&lt;/strong>, based on capability and scope:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>ANI:&lt;/strong> Task-specific intelligence (today’s AI)&lt;/li>
&lt;li>&lt;strong>AGI:&lt;/strong> Human-level general intelligence (future goal)&lt;/li>
&lt;li>&lt;strong>ASI:&lt;/strong> Beyond human intelligence (theoretical)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;img src="https://arshadhs.github.io/images/ai/ai_stages.png" alt="AI Stages" />&lt;/p>
&lt;hr>
&lt;h2 id="ani--artificial-narrow-intelligence">
 ANI — Artificial Narrow Intelligence
 
 &lt;a class="anchor" href="#ani--artificial-narrow-intelligence">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Also called &lt;strong>Weak AI&lt;/strong>&lt;/li>
&lt;li>Designed to perform &lt;strong>one specific task&lt;/strong>&lt;/li>
&lt;li>Operates within a &lt;strong>predefined environment&lt;/strong>&lt;/li>
&lt;li>Cannot generalise beyond its training&lt;/li>
&lt;li>&lt;strong>Most AI systems today are ANI&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>examples&lt;/strong>&lt;/p></description></item><item><title>Neural Networks</title><link>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</guid><description>&lt;h1 id="neural-networks">
 Neural Networks
 
 &lt;a class="anchor" href="#neural-networks">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A &lt;strong>network of artificial neurons&lt;/strong> inspired by how neurons function in the &lt;strong>human brain&lt;/strong>.&lt;/li>
&lt;li>At its core - a &lt;strong>mathematical model&lt;/strong> designed to process and learn from data.&lt;/li>
&lt;li>Neural networks form the &lt;strong>foundation of &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">Deep Learning&lt;/a>&lt;/strong> (involves training large and complex networks on vast amounts of data).&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 subgraph subGraph0[&amp;#34;Input Layer&amp;#34;]
 I1((&amp;#34;Input 1&amp;#34;))
 I2((&amp;#34;Input 2&amp;#34;))
 I3((&amp;#34;Input 3&amp;#34;))
 end
 subgraph subGraph1[&amp;#34;Hidden Layer&amp;#34;]
 H1((&amp;#34;Hidden 1&amp;#34;))
 H2((&amp;#34;Hidden 2&amp;#34;))
 H3((&amp;#34;Hidden 3&amp;#34;))
 end
 subgraph subGraph2[&amp;#34;Output Layer&amp;#34;]
 O((&amp;#34;Output&amp;#34;))
 end
 I1 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I2 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I3 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 H1 --&amp;gt; O
 H2 --&amp;gt; O
 H3 --&amp;gt; O

 style I1 fill:#C8E6C9
 style I2 fill:#C8E6C9
 style I3 fill:#C8E6C9
 style H1 stroke:#2962FF,fill:#BBDEFB
 style H2 fill:#BBDEFB
 style H3 fill:#BBDEFB
 style O fill:#FFCDD2
 style subGraph0 stroke:none,fill:transparent
 style subGraph1 stroke:none,fill:transparent
 style subGraph2 stroke:none,fill:transparent
&lt;/pre>
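&lt;p>A hedged forward-pass sketch of the 3-3-1 network drawn above (illustration only; the random weights and sigmoid activation are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: three inputs, three hidden neurons, one output.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 3)), np.zeros(3)   # input layer to hidden
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)   # hidden layer to output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -0.2, 0.8])       # Input 1..3
h = sigmoid(W1 @ x + b1)             # Hidden 1..3
out = sigmoid(W2 @ h + b2)           # Output
print(out)
&lt;/code>&lt;/pre>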

&lt;hr>
&lt;h3 id="structure-of-a-neural-network">
 Structure of a Neural Network
 
 &lt;a class="anchor" href="#structure-of-a-neural-network">#&lt;/a>
 
&lt;/h3>
&lt;p>A typical neural network has &lt;strong>three main layers&lt;/strong>:&lt;/p></description></item><item><title>Machine Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/</link><pubDate>Tue, 06 Aug 2024 23:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/</guid><description>&lt;h1 id="machine-learning">
 Machine Learning
 
 &lt;a class="anchor" href="#machine-learning">#&lt;/a>
 
&lt;/h1>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
stateDiagram-v2

 %% ===== CLASS DEFINITIONS (Math-based colours) =====
 classDef algebra fill:#cfe8ff,stroke:#1e3a8a,stroke-width:1px
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef logic fill:#ede9fe,stroke:#5b21b6,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#aaaaaa,stroke:#374151,stroke-width:3px

 %% ===== ROOT =====
 ML: Machine Learning

 %% ===== SUPERVISED =====
 ML --&amp;gt; SL:::category
 SL: Supervised Learning

 SL --&amp;gt; Regression
 Regression --&amp;gt; LR:::algebra
 LR: Linear Regression

 LR --&amp;gt; NN:::algebra
 NN: Neural Network

 NN --&amp;gt; DT:::logic
 DT: Decision Tree

 SL --&amp;gt; Classification
 Classification --&amp;gt; NB:::probability
 NB: Naive Bayes

 NB --&amp;gt; KNN:::geometry
 KNN: k-Nearest Neighbours

 KNN --&amp;gt; SVM:::algebra
 SVM: Support Vector Machine
 
 %% ===== UNSUPERVISED =====
 ML --&amp;gt; USL:::category
 USL: Unsupervised Learning

 USL --&amp;gt; Clustering
 Clustering --&amp;gt; KM:::geometry
 KM: K-Means

 KM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 GMM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model

 %% ===== REINFORCEMENT =====
 ML --&amp;gt; RL:::category
 RL: Reinforcement Learning

 RL --&amp;gt; DM:::logic
 DM: Decision Making
&lt;/pre>

&lt;hr>
&lt;details >&lt;summary>Mathematical Legend&lt;/summary>
 &lt;div class="markdown-inner">
&lt;h3 id="algebra--linear-algebra-blue">
 Algebra / Linear Algebra (Blue)
 
 &lt;a class="anchor" href="#algebra--linear-algebra-blue">#&lt;/a>
 
&lt;/h3>
&lt;p>Used heavily when models rely on:&lt;/p></description></item><item><title>Artificial Neuron and Perceptron</title><link>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</guid><description>&lt;h1 id="artificial-neuron-and-perceptron">
 Artificial Neuron and Perceptron
 
 &lt;a class="anchor" href="#artificial-neuron-and-perceptron">#&lt;/a>
 
&lt;/h1>
&lt;blockquote class="book-hint info">
&lt;p>knowledge in neural networks is stored in &lt;strong>connection weights&lt;/strong>, and learning means &lt;strong>modifying those weights&lt;/strong>.&lt;/p>
&lt;/blockquote>
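&lt;p>A minimal hedged sketch of that idea (illustration only; the toy example, learning rate, and step activation are assumptions): one update nudges the connection weights.&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: learning = modifying connection weights.
w = [0.0, 0.0]           # connection weights
b = 0.0
lr = 0.1                 # learning rate

x, target = [1.0, 2.0], 1                   # one labelled example
z = w[0] * x[0] + w[1] * x[1] + b           # weighted sum of inputs
pred = 1 if z &amp;gt; 0 else 0                   # step activation

err = target - pred                         # learning signal
w = [w[0] + lr * err * x[0], w[1] + lr * err * x[1]]
b = b + lr * err                            # knowledge now lives in w and b
print(w, b)
&lt;/code>&lt;/pre>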
&lt;hr>
&lt;h2 id="biological-neuron">
 Biological Neuron
 
 &lt;a class="anchor" href="#biological-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.&lt;/p>
&lt;p>Core components:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Dendrites&lt;/strong>: receive signals from other neurons&lt;/li>
&lt;li>&lt;strong>Cell body (soma)&lt;/strong>: processes incoming signals&lt;/li>
&lt;li>&lt;strong>Axon&lt;/strong>: transmits the output signal&lt;/li>
&lt;li>&lt;strong>Synapses&lt;/strong>: connection points between neurons&lt;/li>
&lt;/ul>
&lt;p>Biological intuition:&lt;/p>
&lt;ul>
&lt;li>many inputs arrive to one neuron&lt;/li>
&lt;li>one neuron can connect out to many neurons&lt;/li>
&lt;li>massive parallelism enables fast perception and recognition&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="artificial-neuron">
 Artificial Neuron
 
 &lt;a class="anchor" href="#artificial-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>An artificial neuron is a simplified computational model inspired by biological neurons.&lt;/p></description></item><item><title>ML Workflow</title><link>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</guid><description>&lt;h1 id="machine-learning-workflow">
 Machine learning Workflow
 
 &lt;a class="anchor" href="#machine-learning-workflow">#&lt;/a>
 
&lt;/h1>
&lt;p>Data is the foundation of any machine learning system.
Quality of data matters more than model complexity.&lt;/p>
&lt;h3 id="role-of-data">
 Role of Data
 
 &lt;a class="anchor" href="#role-of-data">#&lt;/a>
 
&lt;/h3>
&lt;p>Data determines:&lt;/p>
&lt;ul>
&lt;li>What patterns the model can learn&lt;/li>
&lt;li>How well it generalises&lt;/li>
&lt;li>Whether bias or noise is introduced&lt;/li>
&lt;/ul>
&lt;p>Bad data → bad model (even with perfect algorithms).&lt;/p>
&lt;hr>
&lt;h3 id="data-preprocessing-wrangling">
 Data Preprocessing, wrangling
 
 &lt;a class="anchor" href="#data-preprocessing-wrangling">#&lt;/a>
 
&lt;/h3>
&lt;p>Raw data is never ready for training.&lt;/p>
&lt;p>&lt;strong>Data Issues&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Noise
&lt;ul>
&lt;li>For &lt;strong>objects&lt;/strong>, noise is an &lt;strong>extraneous object&lt;/strong>&lt;/li>
&lt;li>For &lt;strong>attributes&lt;/strong>, noise refers to &lt;strong>modification of original values&lt;/strong>&lt;/li>
&lt;li>Handle: apply a &lt;strong>log or Z-score transformation&lt;/strong> to standardise values around the mean&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Outliers
&lt;ul>
&lt;li>Data objects with characteristics that are considerably different than most of the other data objects in the data set&lt;/li>
&lt;li>Handle: Use the &lt;strong>IQR&lt;/strong> method (see the sketch after this list)&lt;/li>
&lt;li>Find Lower and Upper Bound and &lt;strong>replace Outlier with Lower or Upper Bound&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Missing Values
&lt;ul>
&lt;li>Eliminate data objects or variables&lt;/li>
&lt;li>Handle: Estimate missing values
&lt;ul>
&lt;li>&lt;strong>Mean, Median or Mode&lt;/strong>&lt;/li>
&lt;li>Prefer the &lt;strong>Median&lt;/strong> when the data contains &lt;strong>outliers&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Ignore the missing value during analysis&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Duplicate Data
&lt;ul>
&lt;li>Major issue when merging data from heterogeneous sources&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Inconsistent Codes
&lt;ul>
&lt;li>Find all unique codes and map the inconsistent ones to a single standard code&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
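&lt;p>A hedged sketch of the IQR handling mentioned in the list above (illustration only; numpy and the toy values are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: replace outliers with the lower or upper bound.
import numpy as np

x = np.array([10, 12, 11, 13, 12, 95])   # 95 is an outlier
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1

lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
print(np.clip(x, lower, upper))          # the outlier becomes the bound
&lt;/code>&lt;/pre>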
&lt;p>&lt;strong>Data Preprocessing techniques&lt;/strong>&lt;/p></description></item><item><title>Regression(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</guid><description>&lt;h1 id="linear-regression">
 Linear Regression
 
 &lt;a class="anchor" href="#linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Linear Regression is a supervised 
&lt;span style="color: blue;">
 ML
&lt;/span> method used to predict a &lt;strong>numerical&lt;/strong> target by fitting a model that is &lt;strong>linear in its parameters&lt;/strong>.&lt;/p>
&lt;p>In 
&lt;span style="color: blue;">
 ML
&lt;/span>, linear models are a core baseline:
they’re fast, often surprisingly strong, and usually easy to interpret.&lt;/p>
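&lt;p>A hedged baseline sketch (illustration only; numpy and the noisy toy data are assumptions): fit a model that is linear in its parameters by least squares.&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: recover slope and intercept from noisy samples.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = 2.0 * x + 1.0 + rng.normal(scale=0.05, size=x.size)

X = np.column_stack([np.ones_like(x), x])        # bias column + feature
params, *_ = np.linalg.lstsq(X, y, rcond=None)
print(params)                                    # close to [1.0, 2.0]
&lt;/code>&lt;/pre>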
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Linear Regression learns parameters by minimising a squared-error cost.
You can solve it directly (closed form) or iteratively (gradient descent),
and you can extend it using basis functions and regularisation.&lt;/p></description></item><item><title>Ordinary Least Squares</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</guid><description>&lt;h1 id="direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">
 Direct solution method - Ordinary Least Squares and the Line of Best Fit
 
 &lt;a class="anchor" href="#direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">#&lt;/a>
 
&lt;/h1>
&lt;p>It is possible to compute the best parameters for linear regression &lt;strong>in one shot&lt;/strong> (closed-form),
instead of iteratively improving them step-by-step.&lt;/p>
&lt;p>For linear regression, the direct method is usually &lt;strong>Ordinary Least Squares (OLS)&lt;/strong>.&lt;/p>
&lt;p>Ordinary Least Squares (OLS) chooses the “best” line by &lt;strong>minimising squared prediction errors&lt;/strong>.&lt;/p>
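&lt;p>A hedged sketch of the one-shot solution via the normal equations (illustration only; numpy and the toy data are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: solve (X^T X) w = X^T y directly, no iteration.
import numpy as np

X = np.array([[1, 1], [1, 2], [1, 3], [1, 4]], dtype=float)  # bias + x
y = np.array([3.1, 4.9, 7.2, 8.8])

w = np.linalg.solve(X.T @ X, X.T @ y)   # solve rather than invert
print(w)                                # intercept and slope
&lt;/code>&lt;/pre>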
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
OLS defines “best fit” as the line that minimises the total squared residual error across all data points.&lt;/p></description></item><item><title>Cost Function</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</guid><description>&lt;h1 id="cost-function">
 Cost Function
 
 &lt;a class="anchor" href="#cost-function">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>also known as an objective function&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>how far the predicted values are from the actual ones&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>measure of the difference between predicted values and actual values&lt;/p>
&lt;/li>
&lt;li>
&lt;p>quantifies the error between a model&amp;rsquo;s predicted values and actual values&lt;/p>
&lt;/li>
&lt;li>
&lt;p>measures the model’s error on a group of datapoints&lt;/p>
&lt;/li>
&lt;li>
&lt;p>guides fitting, e.g. choosing the best-fit line through the data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>used to evaluate the accuracy of a model’s predictions&lt;/p></description></item><item><title>Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</guid><description>&lt;h1 id="gradient-descent-for-linear-regression">
 Gradient Descent for Linear Regression
 
 &lt;a class="anchor" href="#gradient-descent-for-linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an iterative optimisation method used to minimise the regression cost function by repeatedly updating parameters in the direction that reduces error.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Iterative method&lt;/strong>&lt;/li>
&lt;li>Types: batch / stochastic / mini-batch&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Gradient descent starts with initial parameter values and repeatedly updates them using the gradient until the cost stops decreasing.&lt;/p>
&lt;/blockquote>
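&lt;p>A hedged sketch of that loop (illustration only; the toy data, learning rate, and epoch count are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: repeatedly step parameters against the gradient
# of a mean-squared-error cost until it stops decreasing.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])    # underlying rule: y = 2x + 1

w, b, lr = 0.0, 0.0, 0.05             # initial values, learning rate
for epoch in range(500):
    pred = w * x + b
    grad_w = 2 * np.mean((pred - y) * x)   # d cost / d w
    grad_b = 2 * np.mean(pred - y)         # d cost / d b
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)   # approaches 2 and 1
&lt;/code>&lt;/pre>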


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
GD[&amp;#34;Gradient&amp;lt;br/&amp;gt;Descent&amp;#34;] --&amp;gt;|minimises| CF[&amp;#34;Cost&amp;lt;br/&amp;gt;function&amp;#34;]
GD --&amp;gt;|updates| W[&amp;#34;Parameters&amp;lt;br/&amp;gt;(weights)&amp;#34;]
GD --&amp;gt;|uses| GR[&amp;#34;Gradient&amp;lt;br/&amp;gt;(slope)&amp;#34;]

GD --&amp;gt; H[&amp;#34;Hyperparameters&amp;#34;]
H --&amp;gt; LR[&amp;#34;Learning&amp;lt;br/&amp;gt;rate&amp;#34;]
H --&amp;gt; BS[&amp;#34;Batch&amp;lt;br/&amp;gt;size&amp;#34;]
H --&amp;gt; EP[&amp;#34;Epochs&amp;#34;]

style GD fill:#90CAF9,stroke:#1E88E5,color:#000

style CF fill:#CE93D8,stroke:#8E24AA,color:#000
style W fill:#CE93D8,stroke:#8E24AA,color:#000
style GR fill:#CE93D8,stroke:#8E24AA,color:#000
style H fill:#CE93D8,stroke:#8E24AA,color:#000
style LR fill:#CE93D8,stroke:#8E24AA,color:#000
style BS fill:#CE93D8,stroke:#8E24AA,color:#000
style EP fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="types-of-gd">
 Types of GD
 
 &lt;a class="anchor" href="#types-of-gd">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Gradient Descent&amp;lt;br/&amp;gt;types&amp;#34;] --&amp;gt; BGD[&amp;#34;Batch&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; SGD[&amp;#34;Stochastic&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; MGD[&amp;#34;Mini-batch&amp;lt;br/&amp;gt;GD&amp;#34;]

BGD --&amp;gt; ALL[&amp;#34;All data&amp;lt;br/&amp;gt;per step&amp;#34;]
BGD --&amp;gt; STB[&amp;#34;Smooth&amp;lt;br/&amp;gt;updates&amp;#34;]

SGD --&amp;gt; ONE[&amp;#34;1 sample&amp;lt;br/&amp;gt;per step&amp;#34;]
SGD --&amp;gt; FAST[&amp;#34;Quick&amp;lt;br/&amp;gt;progress&amp;#34;]
SGD --&amp;gt; NOISE[&amp;#34;Noisy&amp;lt;br/&amp;gt;updates&amp;#34;]

MGD --&amp;gt; MB[&amp;#34;Small batch&amp;lt;br/&amp;gt;per step&amp;#34;]
MGD --&amp;gt; PRACT[&amp;#34;Practical&amp;lt;br/&amp;gt;default&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style BGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style SGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style MGD fill:#C8E6C9,stroke:#2E7D32,color:#000

style ALL fill:#CE93D8,stroke:#8E24AA,color:#000
style STB fill:#CE93D8,stroke:#8E24AA,color:#000
style ONE fill:#CE93D8,stroke:#8E24AA,color:#000
style FAST fill:#CE93D8,stroke:#8E24AA,color:#000
style NOISE fill:#CE93D8,stroke:#8E24AA,color:#000
style MB fill:#CE93D8,stroke:#8E24AA,color:#000
style PRACT fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h3 id="batch">
 Batch
 
 &lt;a class="anchor" href="#batch">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>Use only if you have huge compute and a lot of time to train&lt;/li>
&lt;/ul>
&lt;h3 id="sgd">
 SGD
 
 &lt;a class="anchor" href="#sgd">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>go-to solution&lt;/p></description></item><item><title>Classification(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</guid><description>&lt;h1 id="linear-models-for-classification">
 Linear models for Classification
 
 &lt;a class="anchor" href="#linear-models-for-classification">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>categorises data by finding a linear boundary (hyperplane) that separates classes&lt;/li>
&lt;li>calculating a weighted sum of input features plus bias&lt;/li>
&lt;/ul>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Linear&amp;lt;br/&amp;gt;classification&amp;lt;br/&amp;gt;models&amp;#34;] --&amp;gt; P[&amp;#34;Perceptron&amp;#34;]
T --&amp;gt; LR[&amp;#34;Logistic&amp;lt;br/&amp;gt;regression&amp;#34;]
T --&amp;gt; SVM[&amp;#34;Linear&amp;lt;br/&amp;gt;SVM&amp;#34;]

P --&amp;gt;|uses| STEP[&amp;#34;Step&amp;lt;br/&amp;gt;activation&amp;#34;]
LR --&amp;gt;|uses| SIG[&amp;#34;Sigmoid&amp;lt;br/&amp;gt;+ log loss&amp;#34;]
SVM --&amp;gt;|uses| HNG[&amp;#34;Hinge&amp;lt;br/&amp;gt;loss&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style P fill:#C8E6C9,stroke:#2E7D32,color:#000
style LR fill:#C8E6C9,stroke:#2E7D32,color:#000
style SVM fill:#C8E6C9,stroke:#2E7D32,color:#000

style STEP fill:#CE93D8,stroke:#8E24AA,color:#000
style SIG fill:#CE93D8,stroke:#8E24AA,color:#000
style HNG fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h2 id="discriminant-functions">
 Discriminant Functions
 
 &lt;a class="anchor" href="#discriminant-functions">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="decision-theory">
 Decision Theory
 
 &lt;a class="anchor" href="#decision-theory">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-discriminative-classifiers">
 Probabilistic Discriminative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-discriminative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;h2 id="logistic-regression">
 Logistic Regression
 
 &lt;a class="anchor" href="#logistic-regression">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Supervised machine learning algorithm&lt;/li>
&lt;li>Binary &lt;strong>classification&lt;/strong> algorithm&lt;/li>
&lt;li>learns a linear decision boundary, so it works best when classes are roughly linearly separable&lt;/li>
&lt;li>predicts the probability that an input belongs to a specific class&lt;/li>
&lt;li>uses &lt;strong>Sigmoid function&lt;/strong> to convert inputs into a probability value between 0 and 1&lt;/li>
&lt;/ul>
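&lt;p>A hedged sketch of those bullets (illustration only; the weights and input are made-up assumptions): a linear score pushed through the sigmoid becomes a probability.&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: P(y=1 | x) = sigmoid(w.x + b).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.array([1.5, -0.8]), 0.2     # assumed "learned" parameters
x = np.array([2.0, 1.0])

z = w @ x + b                         # weighted sum plus bias
p = sigmoid(z)                        # probability between 0 and 1
print(p, int(p &amp;gt;= 0.5))             # probability and class decision
&lt;/code>&lt;/pre>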
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Logistic regression predicts $P(y=1\mid x)$ using a sigmoid of a linear score $z=w\cdot x+b$,
then learns $w,b$ by maximising likelihood (equivalently minimising log-loss).&lt;/p></description></item><item><title>Foundation Models</title><link>https://arshadhs.github.io/docs/ai/genai/foundation-model/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/foundation-model/</guid><description>&lt;h1 id="foundation-model">
 Foundation Model
 
 &lt;a class="anchor" href="#foundation-model">#&lt;/a>
 
&lt;/h1>
&lt;p>AI models trained on massive datasets to perform a wide range of tasks with minimal fine-tuning.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>are large deep learning neural networks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>are large AI models trained on &lt;strong>massive and diverse datasets&lt;/strong> (text, images, audio, or multiple modalities).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Contain &lt;strong>millions or billions of parameters&lt;/strong>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>designed to perform a &lt;strong>broad range of general tasks&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>designed for &lt;strong>general-purpose intelligence&lt;/strong>, not a single task.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>acts as &lt;strong>base models&lt;/strong> for building specialised AI applications&lt;/p></description></item><item><title>LLM - Model</title><link>https://arshadhs.github.io/docs/ai/genai/llm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/llm/</guid><description>&lt;h1 id="llm--large-language-model">
 LLM – Large Language Model
 
 &lt;a class="anchor" href="#llm--large-language-model">#&lt;/a>
 
&lt;/h1>
&lt;p>Large Language Models (LLMs) are &lt;strong>advanced AI systems&lt;/strong> designed to process, understand, and generate &lt;strong>human-like text&lt;/strong>.&lt;/p>
&lt;p>They learn language by analysing &lt;strong>massive amounts of text data&lt;/strong>, discovering patterns in:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>grammar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>meaning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>context&lt;/p>
&lt;/li>
&lt;li>
&lt;p>relationships between words and sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Built on &lt;strong>Deep Learning&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Implemented using &lt;strong>Neural Networks&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Based on &lt;strong>Transformers&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often combined with tools like:&lt;/p>
&lt;ul>
&lt;li>Retrieval (RAG)&lt;/li>
&lt;li>Agents&lt;/li>
&lt;li>External APIs&lt;/li>
&lt;li>Memory systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="what-makes-an-llm-special">
 What makes an LLM special?
 
 &lt;a class="anchor" href="#what-makes-an-llm-special">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Built using &lt;strong>deep neural networks&lt;/strong>&lt;/li>
&lt;li>Trained on &lt;strong>very large datasets&lt;/strong> (books, articles, code, web text)&lt;/li>
&lt;li>Can perform many tasks &lt;strong>without task-specific training&lt;/strong>&lt;/li>
&lt;li>General-purpose language understanding, not single-task models&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="foundation-transformer-architecture">
 Foundation: Transformer Architecture
 
 &lt;a class="anchor" href="#foundation-transformer-architecture">#&lt;/a>
 
&lt;/h2>
&lt;p>LLMs are based on the &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/transformer/">Transformer Architecture&lt;/a>&lt;/strong>, which allows models to understand &lt;strong>context and long-range dependencies&lt;/strong> in text.&lt;/p></description></item><item><title>AI Agents</title><link>https://arshadhs.github.io/docs/ai/genai/ai-agents/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/ai-agents/</guid><description>&lt;h1 id="ai-agents">
 AI Agents
 
 &lt;a class="anchor" href="#ai-agents">#&lt;/a>
 
&lt;/h1>
&lt;p>Also referred to as Agentic AI.&lt;/p>
&lt;p>AI agents are &lt;strong>intelligent systems&lt;/strong> that can &lt;strong>plan, make decisions, and take actions&lt;/strong> to achieve goals with &lt;strong>minimal human intervention&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>A common use case is &lt;strong>task automation&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>for example booking travel based on a user’s request.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>AI agents typically build on &lt;strong>Generative AI&lt;/strong> and use &lt;strong>Large Language Models (LLMs)&lt;/strong> as the reasoning core.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Agents often interact with tools (APIs, databases, calendars) to complete multi-step workflows.&lt;/p></description></item><item><title>Retrieval-Augmented Generation (RAG)</title><link>https://arshadhs.github.io/docs/ai/genai/rag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/rag/</guid><description>&lt;h1 id="retrieval-augmented-generation-rag">
 Retrieval-Augmented Generation (RAG)
 
 &lt;a class="anchor" href="#retrieval-augmented-generation-rag">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Retrieval-Augmented Generation (RAG)&lt;/strong> is a system design pattern that improves an LLM’s answers by:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Retrieving&lt;/strong> relevant information from an external knowledge source, and then&lt;/li>
&lt;li>&lt;strong>Augmenting&lt;/strong> the LLM prompt with that retrieved context before generating the final response.&lt;/li>
&lt;/ol>
&lt;p>RAG helps an LLM &lt;strong>look things up first&lt;/strong>, then &lt;strong>answer using evidence&lt;/strong>.&lt;/p>
&lt;hr>
&lt;h2 id="why-rag-is-useful">
 Why RAG is Useful
 
 &lt;a class="anchor" href="#why-rag-is-useful">#&lt;/a>
 
&lt;/h2>
&lt;p>RAG is commonly used when:&lt;/p>
&lt;ul>
&lt;li>Your knowledge is in &lt;strong>private documents&lt;/strong> (PDFs, policies, internal wiki)&lt;/li>
&lt;li>You need &lt;strong>up-to-date information&lt;/strong> (things not in the model’s training data)&lt;/li>
&lt;li>You want fewer &lt;strong>hallucinations&lt;/strong> by grounding answers in retrieved sources&lt;/li>
&lt;li>You want &lt;strong>traceability&lt;/strong> (show “where the answer came from”)&lt;/li>
&lt;/ul>
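&lt;p>A hedged toy sketch of the retrieve-then-augment pattern (illustration only, not the article's implementation; the documents and the crude overlap scorer are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: pick the best-matching document, then build the
# augmented prompt the LLM would see at inference time.
docs = {
    "policy.pdf": "refunds are allowed within 30 days of purchase",
    "wiki.md": "the office wifi password rotates every month",
}
question = "how many days do customers have to request a refund"

def score(doc, query):
    return len(set(doc.split()).intersection(query.split()))

best = max(docs, key=lambda name: score(docs[name], question))   # retrieve
prompt = "Context: " + docs[best] + "\n\nQuestion: " + question  # augment
print(prompt)   # the model now answers using retrieved evidence
&lt;/code>&lt;/pre>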
&lt;blockquote class="book-hint info">
&lt;p>RAG does not change the model weights.&lt;br>
It changes what the model &lt;em>sees&lt;/em> at inference time by adding retrieved context.&lt;/p></description></item><item><title>Mathematical Foundation</title><link>https://arshadhs.github.io/docs/ai/maths/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/</guid><description>&lt;h1 id="mathematical-foundations-for-machine-learning">
 Mathematical Foundations for Machine Learning
 
 &lt;a class="anchor" href="#mathematical-foundations-for-machine-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Machine Learning is built on &lt;strong>mathematical principles&lt;/strong> that allow models to:&lt;/p>
&lt;ul>
&lt;li>represent data&lt;/li>
&lt;li>learn patterns&lt;/li>
&lt;li>optimise performance&lt;/li>
&lt;/ul>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 DATA[Data]
 MATH[Math Models]
 OPT[Optimisation]
 MODEL[Trained Model]

 DATA --&amp;gt; MATH
 MATH --&amp;gt; OPT
 OPT --&amp;gt; MODEL
&lt;/pre>

&lt;p>ML requires &lt;strong>core mathematical tools&lt;/strong> to understand how ML algorithms work internally. Algebra deals with relationships between variables and quantities, while Calculus focuses on change and optimization.&lt;/p></description></item><item><title>Decision Tree</title><link>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</guid><description>&lt;h1 id="decision-tree">
 Decision Tree
 
 &lt;a class="anchor" href="#decision-tree">#&lt;/a>
 
&lt;/h1>
&lt;p>A decision tree classifies an example by asking a sequence of questions about its attributes until it reaches a leaf (final decision).&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A decision tree grows by repeatedly splitting the training data into &lt;strong>purer&lt;/strong> subsets using an impurity measure
(Entropy / Gini / Classification Error).&lt;/p>
&lt;/blockquote>
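&lt;p>A hedged sketch of the impurity measures named above (illustration only; the class proportions are assumptions):&lt;/p>
&lt;pre>&lt;code class="language-python"># Hedged sketch: entropy and Gini for the label mix at a node.
import math

def entropy(props):
    return 0.0 - sum(p * math.log2(p) for p in props if p)

def gini(props):
    return 1.0 - sum(p * p for p in props)

pure, mixed = [1.0, 0.0], [0.5, 0.5]
print(entropy(pure), entropy(mixed))   # 0.0 vs 1.0 (most mixed)
print(gini(pure), gini(mixed))         # 0.0 vs 0.5
&lt;/code>&lt;/pre>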
&lt;hr>
&lt;h2 id="information-theory">
 Information Theory
 
 &lt;a class="anchor" href="#information-theory">#&lt;/a>
 
&lt;/h2>
&lt;p>Decision trees need a way to measure:
“How mixed are the class labels at a node?”&lt;/p></description></item><item><title>Statistics</title><link>https://arshadhs.github.io/docs/ai/statistics/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/</guid><description>&lt;h1 id="statistics">
 Statistics
 
 &lt;a class="anchor" href="#statistics">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Statistical methods&lt;/strong> help you turn &lt;strong>raw data into reliable conclusions&lt;/strong>, while understanding &lt;strong>uncertainty, variability, and confidence&lt;/strong>.&lt;/p>
&lt;p>Statistics provides the &lt;strong>language and tools&lt;/strong> for reasoning about data, uncertainty, and inference.&lt;/p>
&lt;p>ML needs &lt;strong>understanding data behaviour&lt;/strong>, drawing conclusions, and validating machine learning models.&lt;/p>
&lt;ul>
&lt;li>Collect Data&lt;/li>
&lt;li>Present &amp;amp; Organise Data (in a systematic manner)&lt;/li>
&lt;li>Analyse Data&lt;/li>
&lt;li>Infer about the Data&lt;/li>
&lt;li>Take Decision from the Data&lt;/li>
&lt;/ul>
&lt;hr>




&lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/00_formulas/">Formula Sheet&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/">Stats Formula Sheet&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/">Basic Statistics&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/">Basic Probability&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/">Hypothesis Testing&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/">Prediction &amp;amp; Forecasting&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/">Gaussian Mixture model &amp;amp; Expectation Maximization&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/">Conditional Probability &amp;amp; Bayes’ Theorem&lt;/a>
  &lt;ul>
   &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/">Conditional Probability&lt;/a>&lt;/li>
   &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/">Bayes’ Theorem&lt;/a>&lt;/li>
   &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/">Naïve Bayes&lt;/a>&lt;/li>
  &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/">Probability Distributions&lt;/a>
  &lt;ul>
   &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/">Random Variables&lt;/a>&lt;/li>
   &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/">Common Probability Distributions&lt;/a>&lt;/li>
  &lt;/ul>
 &lt;/li>
&lt;/ul>


&lt;hr>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Statistics Topic&lt;/th>
 &lt;th>What you learn (plain English)&lt;/th>
 &lt;th>ML Connection&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>1. Basic Probability &amp;amp; Statistics&lt;/td>
 &lt;td>Summarise data;&lt;br>understand spread;&lt;br>basic probability rules&lt;/td>
 &lt;td>Data understanding (EDA), feature sanity checks,&lt;br>detecting outliers, interpreting “average behaviour”&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>2. Conditional Probability &amp;amp; Bayes&lt;/td>
 &lt;td>Update probability using new information;&lt;br>Bayes’ rule&lt;/td>
 &lt;td>Naïve Bayes, Bayesian thinking,&lt;br>posterior probabilities, probabilistic classification&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>3. Probability Distributions&lt;/td>
 &lt;td>Model randomness with distributions;&lt;br>expectation/variance/covariance&lt;/td>
 &lt;td>Likelihood models, noise assumptions (Gaussian), sampling,&lt;br>probabilistic modelling foundations&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>4. Hypothesis Testing&lt;/td>
 &lt;td>Sampling, CLT, confidence intervals,&lt;br>significance tests, ANOVA, MLE&lt;/td>
 &lt;td>A/B testing, evaluating model improvements,&lt;br>significance vs noise, parameter estimation (MLE)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>5. Prediction &amp;amp; Forecasting&lt;/td>
 &lt;td>Correlation, regression,&lt;br>time series (AR/MA/ARIMA/SARIMA etc.)&lt;/td>
 &lt;td>Linear regression, forecasting, sequential data modelling, baseline predictive modelling&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>6. GMM &amp;amp; EM&lt;/td>
 &lt;td>Mixtures of Gaussians;&lt;br>iterative estimation with EM&lt;/td>
 &lt;td>Unsupervised learning (soft clustering),&lt;br>density estimation, latent-variable models&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;hr>


&lt;pre class="mermaid">
flowchart TD
 A[&amp;#34;Statistical Methods&amp;lt;br/&amp;gt;AIML ZC418&amp;#34;] --&amp;gt; B[&amp;#34;1. Basic Probability and Statistics&amp;#34;]
 A --&amp;gt; C[&amp;#34;2. Conditional Probability and Bayes&amp;#34;]
 A --&amp;gt; D[&amp;#34;3. Probability Distributions&amp;#34;]
 A --&amp;gt; E[&amp;#34;4. Hypothesis Testing&amp;#34;]
 A --&amp;gt; F[&amp;#34;5. Prediction and Forecasting&amp;#34;]
 A --&amp;gt; G[&amp;#34;6. Gaussian Mixture Model and EM&amp;#34;]

 B --&amp;gt; B1[&amp;#34;Central Tendency&amp;lt;br/&amp;gt;Mean - Median - Mode&amp;#34;]
 B --&amp;gt; B2[&amp;#34;Variability&amp;lt;br/&amp;gt;Range - Variance - SD - Quartiles&amp;#34;]
 B --&amp;gt; B3[&amp;#34;Basic Probability Concepts&amp;#34;]
 B3 --&amp;gt; B31[&amp;#34;Axioms of Probability&amp;#34;]
 B3 --&amp;gt; B32[&amp;#34;Definition of Probability&amp;#34;]
 B3 --&amp;gt; B33[&amp;#34;Mutually Exclusive vs Independent&amp;#34;]

 C --&amp;gt; C1[&amp;#34;Conditional Probability&amp;#34;]
 C --&amp;gt; C2[&amp;#34;Independence (conditional)&amp;#34;]
 C --&amp;gt; C3[&amp;#34;Bayes Theorem&amp;#34;]
 C --&amp;gt; C4[&amp;#34;Naive Bayes (intro)&amp;#34;]

 D --&amp;gt; D1[&amp;#34;Random Variables&amp;lt;br/&amp;gt;Discrete and Continuous&amp;#34;]
 D --&amp;gt; D2[&amp;#34;Expectation - Variance - Covariance&amp;#34;]
 D --&amp;gt; D3[&amp;#34;Transformations of RVs&amp;#34;]
 D --&amp;gt; D4[&amp;#34;Key Distributions&amp;#34;]
 D4 --&amp;gt; D41[&amp;#34;Bernoulli&amp;#34;]
 D4 --&amp;gt; D42[&amp;#34;Binomial&amp;#34;]
 D4 --&amp;gt; D43[&amp;#34;Poisson&amp;#34;]
 D4 --&amp;gt; D44[&amp;#34;Normal (Gaussian)&amp;#34;]
 D4 --&amp;gt; D45[&amp;#34;t - Chi-square - F (intro)&amp;#34;]

 E --&amp;gt; E1[&amp;#34;Sampling&amp;lt;br/&amp;gt;Random and Stratified&amp;#34;]
 E --&amp;gt; E2[&amp;#34;Sampling Distributions&amp;lt;br/&amp;gt;CLT&amp;#34;]
 E --&amp;gt; E3[&amp;#34;Estimation&amp;lt;br/&amp;gt;Confidence Intervals&amp;#34;]
 E --&amp;gt; E4[&amp;#34;Hypothesis Tests&amp;lt;br/&amp;gt;Means and Proportions&amp;#34;]
 E --&amp;gt; E5[&amp;#34;ANOVA&amp;lt;br/&amp;gt;Single and Dual factor&amp;#34;]
 E --&amp;gt; E6[&amp;#34;Maximum Likelihood&amp;#34;]

 F --&amp;gt; F1[&amp;#34;Correlation&amp;#34;]
 F --&amp;gt; F2[&amp;#34;Regression&amp;#34;]
 F --&amp;gt; F3[&amp;#34;Time Series Basics&amp;lt;br/&amp;gt;Components&amp;#34;]
 F --&amp;gt; F4[&amp;#34;Moving Averages&amp;lt;br/&amp;gt;Simple and Weighted&amp;#34;]
 F --&amp;gt; F5[&amp;#34;Time Series Models&amp;#34;]
 F5 --&amp;gt; F51[&amp;#34;AR&amp;#34;]
 F5 --&amp;gt; F52[&amp;#34;ARMA / ARIMA&amp;#34;]
 F5 --&amp;gt; F53[&amp;#34;SARIMA / SARIMAX&amp;#34;]
 F5 --&amp;gt; F54[&amp;#34;VAR / VARMAX&amp;#34;]
 F --&amp;gt; F6[&amp;#34;Exponential Smoothing&amp;#34;]

 G --&amp;gt; G1[&amp;#34;GMM&amp;lt;br/&amp;gt;Mixture of Gaussians&amp;#34;]
 G --&amp;gt; G2[&amp;#34;EM Algorithm&amp;lt;br/&amp;gt;E-step - M-step&amp;#34;]

 B -.-&amp;gt; C
 C -.-&amp;gt; D
 D -.-&amp;gt; E
 E -.-&amp;gt; F
 F -.-&amp;gt; G
&lt;/pre>

&lt;hr>
&lt;h2 id="data---types">
 Data - Types
 
 &lt;a class="anchor" href="#data---types">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
	A[(Data)] --&amp;gt; B[&amp;#34;Categorical (Qualitative)&amp;#34;]
 A --&amp;gt; C[&amp;#34;Numerical (Quantitative)&amp;#34;]

 B --&amp;gt; B1[Nominal]
 B --&amp;gt; B2[Ordinal]

 C --&amp;gt; C1[Discrete]
 C --&amp;gt; C2[Continuous]

 C2 --&amp;gt; C21[Interval]
 C2 --&amp;gt; C22[Ratio]

 %% Styling
 style A fill:#E1F5FE,stroke:#333
 style B fill:#90CAF9,stroke:#333
 style B1 fill:#90CAF9,stroke:#333
 style B2 fill:#90CAF9,stroke:#333
 style C fill:#FFF9C4,stroke:#333
 style C1 fill:#FFF9C4,stroke:#333
 style C2 fill:#FFF9C4,stroke:#333
 style C21 fill:#FFF9C4,stroke:#333
 style C22 fill:#FFF9C4,stroke:#333
&lt;/pre>
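<p>One practical way to see this split is how dataframe libraries separate numeric from non-numeric columns. A small illustrative sketch (assuming pandas; the column names are invented):</p>
&lt;pre>&lt;code class="language-python">
import pandas as pd

df = pd.DataFrame({
    'eye_color': ['brown', 'blue', 'green'],  # categorical (nominal)
    'height_cm': [172.5, 181.0, 167.2],       # numerical (continuous)
    'children': [0, 2, 1],                    # numerical (discrete)
})

print(df.select_dtypes(include='number').columns.tolist())
print(df.select_dtypes(exclude='number').columns.tolist())
&lt;/code>&lt;/pre>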

&lt;div class="book-steps ">
&lt;ol>
&lt;li>
&lt;h2 id="categorical-qualitative">
 Categorical (Qualitative)
 
 &lt;a class="anchor" href="#categorical-qualitative">#&lt;/a>
 
&lt;/h2>
&lt;p>Expresses a qualitative attribute, e.g. hair color, eye color.&lt;/p></description></item><item><title>Instance-based Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</guid><description>&lt;h1 id="instance-based-learning">
 Instance-based Learning
 
 &lt;a class="anchor" href="#instance-based-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Instance-based learning is a family of methods that &lt;strong>do not build one explicit global model during training&lt;/strong>. Instead, they &lt;strong>store training examples&lt;/strong> and delay most of the work until a new query arrives.&lt;/p>
&lt;p>When a new point must be classified or predicted, the algorithm compares it with previously seen examples, finds the most relevant neighbours, and uses them to produce the answer.&lt;/p>
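&lt;p>For intuition, a minimal k-nearest-neighbours sketch of this query-time workflow (assuming numpy; the data and the helper name knn_predict are illustrative):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    # Lazy learner: training just means storing the examples.
    dists = np.linalg.norm(X_train - x_new, axis=1)  # distance to every stored point
    nearest = np.argsort(dists)[:k]                  # indices of the k closest
    return np.bincount(y_train[nearest]).argmax()    # majority vote

X_train = np.array([[1.0, 1.0], [1.2, 0.8], [8.0, 8.0], [7.5, 8.2]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([7.0, 7.0])))  # 1
&lt;/code>&lt;/pre>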
&lt;p>Instance-based Learning covers three linked ideas:&lt;/p></description></item><item><title>Support Vector Machine</title><link>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</guid><description>&lt;h1 id="support-vector-machine-svm">
 Support Vector Machine (SVM)
 
 &lt;a class="anchor" href="#support-vector-machine-svm">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>Support Vector Machine (SVM)&lt;/strong> is a &lt;strong>supervised machine learning algorithm&lt;/strong> used for:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Classification&lt;/strong> (most common)&lt;/li>
&lt;li>&lt;strong>Regression&lt;/strong> (SVR – Support Vector Regression)&lt;/li>
&lt;/ul>
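&lt;p>A minimal usage sketch (assuming scikit-learn; the toy points are made up):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np
from sklearn.svm import SVC

# Two linearly separable blobs.
X = np.array([[1.0, 1.0], [1.5, 0.5], [7.0, 7.0], [8.0, 6.5]])
y = np.array([0, 0, 1, 1])

clf = SVC(kernel='linear').fit(X, y)
print(clf.support_vectors_)       # the points that pin down the margin
print(clf.predict([[2.0, 2.0]]))  # [0]
&lt;/code>&lt;/pre>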

&lt;blockquote class='book-hint '>
 &lt;p>Find the decision boundary that separates classes with the &lt;strong>maximum margin&lt;/strong>.&lt;/p>
&lt;/blockquote>&lt;blockquote class="book-hint default">
&lt;p>A Support Vector Machine is a supervised learning algorithm that finds an optimal hyperplane by maximising the margin between classes, using support vectors and kernel functions to handle non-linear data.&lt;/p></description></item><item><title>Attention Mechanism</title><link>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</guid><description>&lt;h1 id="attention-mechanism">
 Attention Mechanism
 
 &lt;a class="anchor" href="#attention-mechanism">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Queries, Keys, and Values&lt;/li>
&lt;li>Attention Pooling by Similarity&lt;/li>
&lt;li>Attention Pooling via Nadaraya–Watson Regression&lt;/li>
&lt;li>Attention Scoring Functions&lt;/li>
&lt;li>Dot Product Attention&lt;/li>
&lt;li>Convenience Functions&lt;/li>
&lt;li>Scaled Dot Product Attention (see the sketch after this list)&lt;/li>
&lt;li>Additive Attention&lt;/li>
&lt;li>Bahdanau Attention Mechanism&lt;/li>
&lt;li>Multi-Head Attention&lt;/li>
&lt;li>Self-Attention&lt;/li>
&lt;li>Positional Encoding&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
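&lt;p>As a taste of the material, a minimal numpy sketch of scaled dot product attention (the shapes and values are illustrative):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # query-key similarity, scaled
    return softmax(scores) @ V       # weighted average of the values

# Toy example: 2 queries, 3 key-value pairs, dimension 4.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(2, 4)), rng.normal(size=(3, 4)), rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 4)
&lt;/code>&lt;/pre>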
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>, Cambridge University Press (&lt;a href="https://d2l.ai/chapter_builders-guide/model-construction.html">Ch 10&lt;/a>, &lt;a href="https://d2l.ai/chapter_convolutional-neural-networks/index.html">Ch 7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Bayesian Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</guid><description>&lt;h1 id="bayesian-learning">
 Bayesian Learning
 
 &lt;a class="anchor" href="#bayesian-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="mle-hypothesis">
 MLE Hypothesis
 
 &lt;a class="anchor" href="#mle-hypothesis">#&lt;/a>
 
&lt;/h2>
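&lt;p>In the usual notation, the maximum likelihood (MLE) hypothesis is the one that makes the observed data &lt;span>\( D \)&lt;/span> most probable:&lt;/p>
&lt;span>
 \[ 
h_{\text{MLE}} = \arg\max_{h} P(D \mid h)
 \]
 &lt;/span>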
&lt;h2 id="map-hypothesis">
 MAP Hypothesis
 
 &lt;a class="anchor" href="#map-hypothesis">#&lt;/a>
 
&lt;/h2>
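&lt;p>The maximum a posteriori (MAP) hypothesis additionally weighs in the prior over hypotheses:&lt;/p>
&lt;span>
 \[ 
h_{\text{MAP}} = \arg\max_{h} P(D \mid h)\,P(h)
 \]
 &lt;/span>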
&lt;h2 id="bayes-rule">
 Bayes Rule
 
 &lt;a class="anchor" href="#bayes-rule">#&lt;/a>
 
&lt;/h2>
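&lt;p>Bayes’ rule ties the two together, relating the posterior over a hypothesis to its likelihood and prior:&lt;/p>
&lt;span>
 \[ 
P(h \mid D) = \frac{P(D \mid h)\,P(h)}{P(D)}
 \]
 &lt;/span>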
&lt;h2 id="optimal-bayes-classifier">
 Optimal Bayes Classifier
 
 &lt;a class="anchor" href="#optimal-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="naïve-bayes-classifier">
 Naïve Bayes Classifier
 
 &lt;a class="anchor" href="#na%c3%afve-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-generative-classifiers">
 Probabilistic Generative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-generative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bayesian-linear-regression">
 Bayesian Linear Regression
 
 &lt;a class="anchor" href="#bayesian-linear-regression">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Transformer</title><link>https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/</guid><description>&lt;h1 id="transformer">
 Transformer
 
 &lt;a class="anchor" href="#transformer">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>is an architecture of neural networks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>based on the multi-head attention mechanism&lt;/p>
&lt;/li>
&lt;li>
&lt;p>text is converted to numerical representations called tokens, and each token is converted into a vector via lookup from a word embedding table&lt;/p>
&lt;/li>
&lt;li>
&lt;p>takes a text sequence as input and produces another text sequence as output&lt;/p>
&lt;/li>
&lt;li>
&lt;p>foundation for modern &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/genai/llm/">Large Language Models (LLMs)&lt;/a>&lt;/strong> like ChatGPT and Gemini&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transformer architecture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Model, Positionwise Feed-Forward Networks, Residual Connection and Layer Normalization&lt;/p></description></item><item><title>Ensemble Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</guid><description>&lt;h1 id="ensemble-learning">
 Ensemble Learning
 
 &lt;a class="anchor" href="#ensemble-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="combining-classifiers">
 Combining Classifiers
 
 &lt;a class="anchor" href="#combining-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bagging">
 Bagging
 
 &lt;a class="anchor" href="#bagging">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="random-forest">
 Random Forest
 
 &lt;a class="anchor" href="#random-forest">#&lt;/a>
 
&lt;/h2>
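&lt;p>A minimal sketch of bagged trees via a random forest (assuming scikit-learn; the toy dataset is generated on the fly):&lt;/p>
&lt;pre>&lt;code class="language-python">
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Toy data: 200 samples, 2 classes.
X, y = make_classification(n_samples=200, random_state=0)

# A random forest bags many decorrelated decision trees and
# averages their votes, which reduces variance.
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(model.score(X, y))
&lt;/code>&lt;/pre>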
&lt;h2 id="boosting">
 Boosting
 
 &lt;a class="anchor" href="#boosting">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="adaboost">
 ADABoost
 
 &lt;a class="anchor" href="#adaboost">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="gradient-boosting">
 Gradient Boosting
 
 &lt;a class="anchor" href="#gradient-boosting">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="xgboost">
 XGBoost
 
 &lt;a class="anchor" href="#xgboost">#&lt;/a>
 
&lt;/h3>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Optimisation of Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</guid><description>&lt;h1 id="optimisation-of-deep-models">
 Optimisation of Deep models
 
 &lt;a class="anchor" href="#optimisation-of-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Goal of Optimization&lt;/li>
&lt;li>Optimization Challenges in Deep Learning&lt;/li>
&lt;li>Gradient Descent (see the sketch after this list)&lt;/li>
&lt;li>Stochastic Gradient Descent&lt;/li>
&lt;li>Minibatch Stochastic Gradient Descent&lt;/li>
&lt;li>Momentum&lt;/li>
&lt;li>Adagrad and Algorithm&lt;/li>
&lt;li>RMSProp and Algorithm&lt;/li>
&lt;li>Adadelta and Algorithm&lt;/li>
&lt;li>Adam and Algorithm&lt;/li>
&lt;li>Code Implementation and comparison of algorithms (webinar)&lt;/li>
&lt;/ul>
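&lt;p>A minimal sketch of plain gradient descent on a one-dimensional objective (pure Python; the objective is made up):&lt;/p>
&lt;pre>&lt;code class="language-python">
# Minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
def grad(w):
    return 2.0 * (w - 3.0)

w, lr = 0.0, 0.1
for _ in range(100):
    w -= lr * grad(w)  # step against the gradient
print(w)  # approaches the minimiser w = 3
&lt;/code>&lt;/pre>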
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>, Cambridge University Press (Ch 12)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Evaluation/Comparison</title><link>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</guid><description>&lt;h1 id="machine-learning-model-evaluationcomparison">
 Machine Learning Model Evaluation/Comparison
 
 &lt;a class="anchor" href="#machine-learning-model-evaluationcomparison">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="comparing-machine-learning-models">
 Comparing Machine Learning Models
 
 &lt;a class="anchor" href="#comparing-machine-learning-models">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">
 Emerging requirements e.g., bias, fairness, interpretability of ML models
 
 &lt;a class="anchor" href="#emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Regularisation for Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</guid><description>&lt;h1 id="regularisation-for-deep-models">
 Regularisation for Deep models
 
 &lt;a class="anchor" href="#regularisation-for-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Generalization for regression&lt;/li>
&lt;li>Training Error and Generalization Error&lt;/li>
&lt;li>Underfitting or Overfitting&lt;/li>
&lt;li>Model Selection&lt;/li>
&lt;li>Weight Decay and Norms&lt;/li>
&lt;li>Generalization in Classification&lt;/li>
&lt;li>Environment and Distribution Shift&lt;/li>
&lt;li>Generalization in Deep Learning&lt;/li>
&lt;li>Dropout (see the sketch after this list)&lt;/li>
&lt;li>Batch Normalization&lt;/li>
&lt;li>Layer Normalization&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
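&lt;p>A minimal sketch of (inverted) dropout at the array level (assuming numpy; framework layers handle this internally):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

def dropout(x, p=0.5, training=True):
    # Zero each activation with probability p, then rescale the
    # survivors so the expected activation is unchanged.
    if not training or p == 0.0:
        return x
    mask = np.random.binomial(1, 1.0 - p, size=x.shape)
    return x * mask / (1.0 - p)

print(dropout(np.ones(8), p=0.5))
&lt;/code>&lt;/pre>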
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>, Cambridge University Press (&lt;a href="https://d2l.ai/chapter_introduction/index.html">T1 – Ch 3.6, 3.7; Ch 4.6, 4.7; Ch 5.5, 5.6; Ch 8.5; Ch 11.7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Linear Algebra</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/</guid><description>&lt;h1 id="linear-algebra">
 Linear Algebra
 
 &lt;a class="anchor" href="#linear-algebra">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>study of vectors and matrices&lt;/strong> is called Linear Algebra.&lt;/p>
&lt;p>Linear Algebra provides the &lt;strong>mathematical language&lt;/strong> used &lt;strong>to represent data, transformations, and structure&lt;/strong> in ML.&lt;/p>




&lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/">Linear Systems&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/">Systems of Linear Equations&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/">Matrices&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/matrix-transposition/">Matrix Transposition&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/">Solving Linear Systems&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/">Forward and Backward Substitution&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/">Inverse Matrix&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/">Convex Combination&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/">Vector Spaces&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/020-basis-and-rank/">Basis and Rank&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/">Linear Independence&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/030-norm/">Norm&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/">Inner Products and Dot Product&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/050-lengths-and-distances/">Lengths and Distances&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/">Angles and Orthogonality&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/070-orthonormal-basis/">Orthonormal Basis&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/">Feature Space&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/">Cauchy–Schwarz&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/">Matrix Decompositions&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/special-matrices/">Special Matrices&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/">Characteristic Polynomial&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/">Determinant and Trace&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/">Eigenvalues and Eigenvectors&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/">Cholesky Decomposition&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/">Eigen Decomposition&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/">Diagonalization&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/">Singular Value Decomposition (SVD)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/">Matrix Approximation&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">Dimensionality reduction and PCA&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/">Principal Component Analysis (PCA)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/">PCA Theory&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/">PCA in Practice&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/">Latent Variable Perspective&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/">Mathematical Preliminaries of SVM&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/">Nonlinear SVM and Kernels&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
&lt;/ul>


&lt;hr>
&lt;h2 id="why-linear-algebra-matters-in-ml">
 Why Linear Algebra Matters in ML
 
 &lt;a class="anchor" href="#why-linear-algebra-matters-in-ml">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Every machine learning model uses matrices&lt;/li>
&lt;li>All data in ML is represented using &lt;strong>vectors and matrices&lt;/strong>&lt;/li>
&lt;li>Neural networks are pipelines of matrix operations&lt;/li>
&lt;li>Models apply &lt;strong>matrix transformations&lt;/strong> to data&lt;/li>
&lt;li>Optimisation relies on linear algebra operations&lt;/li>
&lt;/ul>
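&lt;p>To make these points concrete, a tiny numpy sketch (the numbers are arbitrary):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

# Three data points with two features each, stored as rows of a matrix.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# A linear model is just a matrix-vector product: one weight per feature.
w = np.array([0.5, -0.25])
print(X @ w)  # one prediction per row
&lt;/code>&lt;/pre>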
&lt;hr>
&lt;h2 id="what-to-learn">
 What to Learn
 
 &lt;a class="anchor" href="#what-to-learn">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Scalars, vectors, and matrices&lt;/li>
&lt;li>Vector operations (addition, dot product)&lt;/li>
&lt;li>Matrix multiplication &lt;em>(critical)&lt;/em>&lt;/li>
&lt;li>Identity matrices and transpose&lt;/li>
&lt;li>Eigenvalues and eigenvectors &lt;em>(conceptual understanding)&lt;/em>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;ul>
&lt;li>&lt;strong>Scalar&lt;/strong> → a number&lt;/li>
&lt;li>&lt;strong>Vector&lt;/strong> → a directed point&lt;/li>
&lt;li>&lt;strong>Matrix&lt;/strong> → a space transformer&lt;/li>
&lt;li>&lt;strong>Linear transformation&lt;/strong> → structured mapping&lt;/li>
&lt;li>&lt;strong>Feature&lt;/strong> → one axis&lt;/li>
&lt;li>&lt;strong>Feature space&lt;/strong> → where data lives&lt;/li>
&lt;li>&lt;strong>Vector space&lt;/strong> → where vectors live&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/">
 Mathematical Foundation
&lt;/a>&lt;/p></description></item><item><title>Linear Systems</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/</link><pubDate>Thu, 29 Jan 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/</guid><description>&lt;h1 id="linear-systems">
 Linear Systems
 
 &lt;a class="anchor" href="#linear-systems">#&lt;/a>
 
&lt;/h1>
&lt;p>How systems of linear equations are represented and solved using matrices.&lt;/p>
&lt;ul>
&lt;li>the study of vectors and rules to manipulate vectors&lt;/li>
&lt;li>describe multiple linear equations solved simultaneously&lt;/li>
&lt;li>connect algebraic equations with matrix representations&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;img src="https://arshadhs.github.io/images/ai/matrix_vector_operations.png" alt="Matrix" />&lt;/p>
&lt;hr>
&lt;h2 id="idea-of-closure">
 Idea of Closure
 
 &lt;a class="anchor" href="#idea-of-closure">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>performing a specific operation (like addition or multiplication) on members of a set always produces a result that belongs to the same set&lt;/p>
&lt;/li>
&lt;li>
&lt;p>The idea of closure is fundamental to defining a &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/linear-algebra/01-linear-systems">Vector space&lt;/a>&lt;/strong> because it ensures that performing arithmetic operations (addition and scalar multiplication) on vectors within a set does not produce a new element outside that set.&lt;/p></description></item><item><title>Systems of Linear Equations</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/</guid><description>&lt;h1 id="systems-of-linear-equations">
 Systems of Linear Equations
 
 &lt;a class="anchor" href="#systems-of-linear-equations">#&lt;/a>
 
&lt;/h1>
&lt;p>A system of linear equations can be written compactly as:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>
 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>
&lt;span>
 \[ 
A\mathbf{x}=\mathbf{b}
 \]
 &lt;/span>
&lt;/blockquote>
&lt;p>This represents:&lt;/p>
&lt;ul>
&lt;li>a &lt;strong>linear transformation&lt;/strong> applied to an unknown vector &lt;span>\( \mathbf{x} \)&lt;/span>&lt;/li>
&lt;li>producing an output vector &lt;span>\( \mathbf{b} \)&lt;/span>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="key-components">
 Key components
 
 &lt;a class="anchor" href="#key-components">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="coefficient-matrix-a">
 Coefficient matrix (A)
 
 &lt;a class="anchor" href="#coefficient-matrix-a">#&lt;/a>
 
&lt;/h3>
&lt;p>&lt;span>\( A \)&lt;/span> contains the coefficients of the variables.&lt;/p></description></item><item><title>Calculus</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/</guid><description>&lt;h1 id="calculus">
 Calculus
 
 &lt;a class="anchor" href="#calculus">#&lt;/a>
 
&lt;/h1>
&lt;p>Calculus is:&lt;/p>
&lt;ul>
&lt;li>the mathematical framework for understanding and controlling how quantities change&lt;/li>
&lt;li>the mathematics of &lt;strong>change&lt;/strong> and &lt;strong>accumulation&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>It helps answer two big questions:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>How fast is something changing right now?&lt;/strong> What happens when inputs change &lt;strong>slightly&lt;/strong>? Where is something &lt;strong>maximum or minimum&lt;/strong>? → derivatives (differentiation)&lt;/li>
&lt;li>&lt;strong>How much has accumulated over an interval?&lt;/strong> → integrals (integration)&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
 A[Calculus] --&amp;gt; B[Limits]
 B --&amp;gt; C[Continuity]
 B --&amp;gt; D[Derivatives]
 B --&amp;gt; E[Integrals]
 D --&amp;gt; F[Optimisation: maxima/minima]
 D --&amp;gt; G[ML: gradients &amp;amp; learning]
 E --&amp;gt; H[Accumulation: area/total change]
&lt;/pre>

&lt;hr>




&lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">Vector Calculus&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/">Differentiation of Univariate Functions&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/">Partial Differentiation and Gradients&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/">Gradients of Vector-Valued and Matrix Functions&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/">Useful Gradient Identities&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/">Backpropagation and Automatic Differentiation&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/">Higher-order derivatives&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/">Taylor’s series&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/">Maxima and Minima&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">Continuous Optimisation&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/">Optimisation using Gradient Descent&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/">Constrained Optimisation&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/">Lagrange Multipliers&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/">Convex Optimisation&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">Nonlinear Optimisation&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/">Challenges in Gradient-Based Optimisation&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/">Stochastic Gradient Descent (SGD)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/">Momentum-Based Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/">Adaptive Methods: AdaGrad, RMSProp, Adam&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/">Tuning Hyperparameters and Preprocessing&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
&lt;/ul>


&lt;hr>
&lt;div class="book-steps ">
&lt;ol>
&lt;li>
&lt;h2 id="differential-calculus-rates-of-change">
 Differential Calculus (Rates of Change)
 
 &lt;a class="anchor" href="#differential-calculus-rates-of-change">#&lt;/a>
 
&lt;/h2>
&lt;p>Studies &lt;strong>how things change&lt;/strong>.&lt;/p></description></item><item><title>Matrices</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/</guid><description>&lt;h1 id="matrices">
 Matrices
 
 &lt;a class="anchor" href="#matrices">#&lt;/a>
 
&lt;/h1>
&lt;p>Matrices are the &lt;strong>core data structure of linear algebra&lt;/strong> and the &lt;strong>workhorse of machine learning&lt;/strong>.&lt;br>
Almost every ML model can be described as a sequence of matrix operations.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/linear-algebra/03-matrix-decomposition/special-matrices/">Special Matrices&lt;/a>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="matrix">
 Matrix
 
 &lt;a class="anchor" href="#matrix">#&lt;/a>
 
&lt;/h2>
&lt;p>A &lt;strong>matrix&lt;/strong> is a rectangular array of numbers arranged in &lt;strong>rows and columns&lt;/strong>.&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>
 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>
&lt;span>
 \[ 
A \in \mathbb{R}^{m \times n}
 \]
 &lt;/span>
&lt;/blockquote>
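&lt;p>A quick numpy illustration of the row-by-column layout (values are arbitrary):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

# A 2 x 3 matrix: 2 rows, 3 columns.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
print(A.shape)    # (2, 3)
print(A.T.shape)  # (3, 2) after transposition
&lt;/code>&lt;/pre>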
&lt;p>An &lt;span>\( m \times n \)&lt;/span> matrix has:&lt;/p></description></item><item><title>Solving Linear Systems</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/</guid><description>&lt;h1 id="solving-linear-systems">
 Solving Linear Systems
 
 &lt;a class="anchor" href="#solving-linear-systems">#&lt;/a>
 
&lt;/h1>
&lt;p>Solve using:&lt;/p>
&lt;ul>
&lt;li>Substitution Method&lt;/li>
&lt;li>Elimination Method (Multiply &amp;amp; then Subtract)&lt;/li>
&lt;li>Cross Multiplication&lt;/li>
&lt;/ul>
&lt;p>Linear system can have:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>no solution&lt;/strong>&lt;/li>
&lt;li>&lt;strong>a unique solution&lt;/strong>&lt;/li>
&lt;li>&lt;strong>infinitely many solutions&lt;/strong>&lt;/li>
&lt;/ul>
&lt;h2 id="positive-definite-matrices">
 Positive Definite Matrices
 
 &lt;a class="anchor" href="#positive-definite-matrices">#&lt;/a>
 
&lt;/h2>
&lt;p>A square matrix is positive definite if pre-multiplying and post-multiplying it by the same non-zero vector always gives a positive number, no matter which vector we choose.&lt;/p>
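&lt;p>In symbols, &lt;span>\( A \)&lt;/span> is positive definite when:&lt;/p>
&lt;span>
 \[ 
\mathbf{x}^{\top} A \mathbf{x} > 0 \quad \text{for all } \mathbf{x} \neq \mathbf{0}
 \]
 &lt;/span>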
&lt;p>Positive definite symmetric matrices have the property that all their eigenvalues are positive.&lt;/p></description></item><item><title>Forward and Backward Substitution</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/</guid><description>&lt;h1 id="forward-and-backward-substitution">
 Forward and Backward Substitution
 
 &lt;a class="anchor" href="#forward-and-backward-substitution">#&lt;/a>
 
&lt;/h1>
&lt;p>Forward and backward substitution are efficient algorithms used to solve linear systems when the coefficient matrix is &lt;strong>triangular&lt;/strong>.&lt;/p>
&lt;p>They are typically used after:&lt;/p>
&lt;ul>
&lt;li>Gaussian elimination&lt;/li>
&lt;li>LU decomposition&lt;/li>
&lt;/ul>
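&lt;p>As a preview, a minimal numpy sketch of the forward pass (the matrix and vector are made up):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

def forward_substitution(L, b):
    # Solve L y = b for lower-triangular L, one row at a time.
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
    return y

L = np.array([[2.0, 0.0], [1.0, 3.0]])
b = np.array([4.0, 11.0])
print(forward_substitution(L, b))  # [2. 3.]
&lt;/code>&lt;/pre>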
&lt;hr>
&lt;h1 id="1-forward-substitution-lower-triangular-systems">
 1. Forward Substitution (Lower Triangular Systems)
 
 &lt;a class="anchor" href="#1-forward-substitution-lower-triangular-systems">#&lt;/a>
 
&lt;/h1>
&lt;p>Used to solve:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>
 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>
&lt;span>
 \[ 
L\mathbf{x} = \mathbf{b}
 \]
 &lt;/span>
&lt;/blockquote>
&lt;p>where &lt;span>\( L \)&lt;/span> is a &lt;strong>lower triangular matrix&lt;/strong>:&lt;/p></description></item><item><title>Inverse Matrix</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/</guid><description>&lt;h1 id="inverse-matrix">
 Inverse Matrix
 
 &lt;a class="anchor" href="#inverse-matrix">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>inverse of a matrix&lt;/strong> is a matrix that, when multiplied with the original matrix, produces the &lt;strong>identity matrix&lt;/strong>.&lt;/p>
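&lt;p>A quick numeric check (assuming numpy; the matrix is arbitrary but invertible):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
A_inv = np.linalg.inv(A)

# Multiplying a matrix by its inverse recovers the identity.
print(np.round(A @ A_inv, 10))
&lt;/code>&lt;/pre>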
&lt;p>A square matrix &lt;span>\( A \)&lt;/span> is &lt;strong>invertible&lt;/strong> if there exists a matrix &lt;span>\( A^{-1} \)&lt;/span> such that:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>
 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>
&lt;span>
 \[ 
AA^{-1} = A^{-1}A = I
 \]
 &lt;/span>
&lt;/blockquote>
&lt;p>Here:&lt;/p></description></item><item><title>Convex Combination</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/</link><pubDate>Thu, 29 Jan 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/</guid><description>&lt;h1 id="convex-combination-of-two-points">
 Convex Combination of Two Points
 
 &lt;a class="anchor" href="#convex-combination-of-two-points">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>convex combination&lt;/strong> describes how to form a point between two points using weighted averages.&lt;/p>
&lt;p>It is a fundamental building block in several advanced fields:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Linear Algebra &amp;amp; Geometry&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Optimization Theory&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Machine Learning&lt;/strong> (Specifically in SVMs, clustering, and data interpolation)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>Given two points (or vectors) $\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n$, a convex combination of these points is defined as:&lt;/p>
$$\mathbf{x} = \lambda \mathbf{x}_1 + (1 - \lambda)\mathbf{x}_2$$&lt;p>&lt;strong>Where:&lt;/strong>&lt;/p></description></item><item><title>Vector Spaces</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/</guid><description>&lt;h1 id="vector-spaces">
 Vector Spaces
 
 &lt;a class="anchor" href="#vector-spaces">#&lt;/a>
 
&lt;/h1>
&lt;p>A vector space is the mathematical “home” where vectors live and where addition and scaling are valid operations.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>A vector space is a set closed under vector addition and scalar multiplication.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Machine learning operates in vector spaces.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>covers independence, bases, rank, and geometric tools like norms and inner products that are used to measure length, distance, and angles.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>A &lt;strong>vector space&lt;/strong> is a set of vectors that follows &lt;strong>ten axioms&lt;/strong>, defined under two operations:&lt;/p></description></item><item><title>Feature Space</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/</guid><description>&lt;h1 id="feature">
 Feature
 
 &lt;a class="anchor" href="#feature">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>feature&lt;/strong> is an individual measurable property or characteristic of a data point used as input to a machine learning model.&lt;/p>
&lt;p>Each feature corresponds to &lt;strong>one dimension&lt;/strong>.&lt;/p>

&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>

 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>

&lt;span>
 \[ 
x_i \in \mathbb{R}
 \]
 &lt;/span>


&lt;p>A data point with &lt;span>\( d \)&lt;/span> features is represented as:&lt;/p></description></item><item><title>Cauchy–Schwarz</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/</guid><description>&lt;h1 id="cauchyschwarz-inequality">
 Cauchy–Schwarz Inequality
 
 &lt;a class="anchor" href="#cauchyschwarz-inequality">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>Cauchy–Schwarz Inequality&lt;/strong> is one of the most important results in linear algebra.&lt;/p>
&lt;p>It places a fundamental bound on the inner product of two vectors.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>If you see &lt;strong>angle&lt;/strong>, &lt;strong>cosine&lt;/strong>, &lt;strong>similarity&lt;/strong>, or &lt;strong>inner product bounds&lt;/strong>&lt;br>
→ think &lt;strong>Cauchy–Schwarz Inequality&lt;/strong>&lt;/p>
&lt;p>Key Idea:
The inner product (dot product) can never exceed the product of magnitudes.
This ensures all geometric interpretations (angles, cosine) are valid.&lt;/p>
&lt;/blockquote>
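&lt;p>A quick numeric check of the bound (assuming numpy; the vectors are arbitrary):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# |x . y| never exceeds ||x|| ||y||.
print(abs(x @ y))                             # 32.0
print(np.linalg.norm(x) * np.linalg.norm(y))  # about 32.83
&lt;/code>&lt;/pre>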
&lt;hr>
&lt;h2 id="statement-of-the-inequality">
 Statement of the Inequality
 
 &lt;a class="anchor" href="#statement-of-the-inequality">#&lt;/a>
 
&lt;/h2>
&lt;p>For any vectors:&lt;/p></description></item><item><title>Matrix Decompositions</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/</guid><description>&lt;h1 id="matrix-decompositions">
 Matrix Decompositions
 
 &lt;a class="anchor" href="#matrix-decompositions">#&lt;/a>
 
&lt;/h1>
&lt;p>Decompositions reveal structure in matrices and power algorithms like PCA.&lt;/p>
&lt;p>Matrix decompositions break complex matrices into simpler parts.&lt;/p>
&lt;p>From the lecture introduction, matrices are used to describe mappings and transformations of vectors.&lt;/p>
&lt;p>That is why decomposition is important:
it lets us understand a complicated transformation by rewriting it using simpler building blocks.&lt;/p>
&lt;p>In the slides, the topic is introduced as part of three closely connected goals:
how to summarise matrices,
how matrices can be decomposed,
and how the decompositions can be used for matrix approximations.&lt;/p></description></item><item><title>Characteristic Polynomial</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/</guid><description>&lt;h1 id="characteristic-polynomial">
 Characteristic Polynomial
 
 &lt;a class="anchor" href="#characteristic-polynomial">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>characteristic polynomial&lt;/strong> of a square matrix is the key tool used to compute &lt;strong>eigenvalues&lt;/strong>.&lt;/p>
&lt;p>It connects:&lt;/p>
&lt;ul>
&lt;li>Determinants&lt;/li>
&lt;li>Trace&lt;/li>
&lt;li>Eigenvalues&lt;/li>
&lt;li>Matrix structure&lt;/li>
&lt;/ul>
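&lt;p>Before the formal definition, a quick numeric sketch (assuming numpy; the matrix is arbitrary):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])

coeffs = np.poly(A)      # coefficients of det(A - lambda I), highest power first
print(coeffs)            # [ 1. -4.  3.], i.e. lambda**2 - 4*lambda + 3
print(np.roots(coeffs))  # [3. 1.], the eigenvalues
&lt;/code>&lt;/pre>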
&lt;hr>
&lt;h2 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h2>
&lt;p>Let&lt;br>

&lt;span>
 \( A \in \mathbb{R}^{n \times n} \)
 &lt;/span>

&lt;br>
and 
&lt;span>
 \( \lambda \in \mathbb{R} \)
 &lt;/span>

.&lt;/p>
&lt;p>The &lt;strong>characteristic polynomial&lt;/strong> of &lt;span>\( A \)&lt;/span> is defined as:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>
 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>
&lt;span>
 \[ 
p_A(\lambda) = \det(A - \lambda I)
 \]
 &lt;/span>
&lt;/blockquote>
&lt;p>It is a polynomial in &lt;span>\( \lambda \)&lt;/span> of degree &lt;span>\( n \)&lt;/span>.&lt;/p></description></item><item><title>Determinant and Trace</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/</guid><description>&lt;h1 id="determinant-and-trace">
 Determinant and Trace
 
 &lt;a class="anchor" href="#determinant-and-trace">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;h2 id="minor">
 Minor
 
 &lt;a class="anchor" href="#minor">#&lt;/a>
 
&lt;/h2>
&lt;p>The &lt;strong>minor&lt;/strong> of an element 
&lt;span>
 \( a_{ij} \)
 &lt;/span>

 is the determinant of the smaller square matrix formed by:&lt;/p>
&lt;ul>
&lt;li>removing &lt;strong>row&lt;/strong> 
&lt;span>
 \( i \)
 &lt;/span>

&lt;/li>
&lt;li>removing &lt;strong>column&lt;/strong> 
&lt;span>
 \( j \)
 &lt;/span>

&lt;/li>
&lt;/ul>
&lt;p>The minor is denoted 
&lt;span>
 \( M_{ij} \)
 &lt;/span>

.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Minors are used to compute &lt;strong>cofactors&lt;/strong>, which are used for determinants and inverses (via adjoint/adjugate).&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h2 id="cofactor">
 Cofactor
 
 &lt;a class="anchor" href="#cofactor">#&lt;/a>
 
&lt;/h2>
&lt;p>The &lt;strong>cofactor&lt;/strong> of 
&lt;span>
 \( a_{ij} \)
 &lt;/span>

, denoted 
&lt;span>
 \( C_{ij} \)
 &lt;/span>

, is:&lt;/p></description></item><item><title>Eigenvalues and Eigenvectors</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/</guid><description>&lt;h1 id="eigenvalues-and-eigenvectors">
 Eigenvalues and Eigenvectors
 
 &lt;a class="anchor" href="#eigenvalues-and-eigenvectors">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Eigenvalues give scaling.&lt;/li>
&lt;li>Eigenvectors define invariant directions of transformation.&lt;/li>
&lt;/ul>
&lt;p>Eigenvalues and eigenvectors describe directions that remain unchanged under a linear transformation, except for scaling.&lt;/p>
&lt;p>From lectures:
matrix multiplication represents a transformation of space.&lt;br>
Most vectors change direction and magnitude.&lt;br>
Some special vectors only scale.&lt;br>
These are eigenvectors.&lt;/p>
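&lt;p>A small numeric sketch (assuming numpy; the matrix is arbitrary, and the order of the returned eigenvalues may vary):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
values, vectors = np.linalg.eig(A)

print(values)         # [3. 1.]
print(vectors[:, 0])  # a direction A merely scales (by 3)
&lt;/code>&lt;/pre>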
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
A matrix transformation stretches or compresses vectors.
Eigenvectors are directions that remain unchanged.
Eigenvalues tell how much scaling happens.&lt;/p></description></item><item><title>Cholesky Decomposition</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/</guid><description>&lt;h1 id="cholesky-decomposition">
 Cholesky Decomposition
 
 &lt;a class="anchor" href="#cholesky-decomposition">#&lt;/a>
 
&lt;/h1>
&lt;p>Cholesky decomposition is a special matrix factorisation used for symmetric positive definite matrices.&lt;/p>
&lt;p>From lecture discussions, this decomposition is powerful because it reduces a matrix into a triangular form, making computations easier and more stable.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
Cholesky decomposition expresses a matrix as a product of a lower triangular matrix and its transpose.
It is efficient and numerically stable.&lt;/p>
&lt;/blockquote>
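&lt;p>A minimal numeric sketch (assuming numpy; the matrix is a made-up symmetric positive definite example):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

A = np.array([[4.0, 2.0], [2.0, 3.0]])  # symmetric positive definite
L = np.linalg.cholesky(A)               # lower triangular factor

# Rebuilding A from L confirms the factorisation A = L L^T.
print(np.allclose(L @ L.T, A))  # True
&lt;/code>&lt;/pre>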
&lt;hr>
&lt;h2 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h2>
&lt;p>For a symmetric positive definite matrix:&lt;/p></description></item><item><title>Eigen Decomposition</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/</guid><description>&lt;h1 id="eigen-decomposition">
 Eigen Decomposition
 
 &lt;a class="anchor" href="#eigen-decomposition">#&lt;/a>
 
&lt;/h1>
&lt;p>Eigen decomposition expresses a matrix using its eigenvectors and eigenvalues.&lt;/p>
&lt;p>From lecture discussions, this is one of the most important ways to understand the internal structure of a matrix.&lt;/p>
&lt;p>Instead of treating the matrix as a black box, eigen decomposition reveals its fundamental directions and scaling behaviour.&lt;/p>
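&lt;p>A minimal numeric sketch of the reconstruction (assuming numpy; the matrix is arbitrary):&lt;/p>
&lt;pre>&lt;code class="language-python">
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
values, V = np.linalg.eig(A)

# Rebuild A from its directions (eigenvectors) and scalings (eigenvalues).
print(np.allclose(V @ np.diag(values) @ np.linalg.inv(V), A))  # True
&lt;/code>&lt;/pre>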
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
Eigen decomposition rewrites a matrix in terms of directions (eigenvectors) and scaling factors (eigenvalues).
This makes complex transformations easier to understand and compute.&lt;/p></description></item><item><title>Diagonalization</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/</guid><description>&lt;h1 id="diagonalization">
 Diagonalization
 
 &lt;a class="anchor" href="#diagonalization">#&lt;/a>
 
&lt;/h1>
&lt;p>Diagonalisation expresses a matrix using its eigenvectors and eigenvalues when possible.&lt;/p>
&lt;p>From lecture explanation, diagonalisation is one of the most powerful tools because it converts a complicated matrix into a much simpler form.&lt;/p>
&lt;p>Instead of working with a full matrix, we work with a diagonal matrix, which is much easier to analyse and compute.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
If a matrix has enough independent eigenvectors, it can be rewritten as a diagonal matrix using a change of basis.
This simplifies matrix operations significantly.&lt;/p></description></item><item><title>Singular Value Decomposition (SVD)</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/</guid><description>&lt;h1 id="singular-value-decomposition-svd">
 Singular Value Decomposition (SVD)
 
 &lt;a class="anchor" href="#singular-value-decomposition-svd">#&lt;/a>
 
&lt;/h1>
&lt;p>Singular Value Decomposition (SVD) is one of the most important matrix decomposition techniques in linear algebra and machine learning.&lt;/p>
&lt;p>It factorises any matrix into three simpler matrices that reveal its structure.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
SVD decomposes a matrix into rotations + scaling.
It tells us how data is transformed along orthogonal directions.&lt;/p>
&lt;/blockquote>
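&lt;p>A minimal NumPy sketch (an added illustration with a hypothetical rectangular matrix):&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])    # any m x n matrix has an SVD

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruct A = U diag(s) V^T
print(np.allclose(A, U @ np.diag(s) @ Vt))   # True
print(s)   # singular values, in decreasing order
&lt;/code>&lt;/pre>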
&lt;hr>
&lt;h1 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h1>
&lt;p>For any real matrix:

&lt;span style="color: green;">
 &lt;span>
 \[ 
A \in \mathbb{R}^{m \times n}
 \]
 &lt;/span>

&lt;/span>&lt;/p></description></item><item><title>Matrix Approximation</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/</guid><description>&lt;h1 id="matrix-approximation">
 Matrix Approximation
 
 &lt;a class="anchor" href="#matrix-approximation">#&lt;/a>
 
&lt;/h1>
&lt;p>Low-rank approximation keeps the most important structure while reducing noise and computation.&lt;/p>
&lt;hr>
&lt;h2 id="low-rank-approximation">
 Low-Rank Approximation
 
 &lt;a class="anchor" href="#low-rank-approximation">#&lt;/a>
 
&lt;/h2>
&lt;p>Used for:&lt;/p>
&lt;ul>
&lt;li>Dimensionality reduction&lt;/li>
&lt;li>Noise removal&lt;/li>
&lt;li>Efficient computation&lt;/li>
&lt;/ul>
&lt;p>Forms the basis of &lt;strong>PCA&lt;/strong>.&lt;/p>
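&lt;p>A minimal sketch (an added NumPy illustration on hypothetical random data): truncating the SVD to the k largest singular values gives the best rank-k approximation in the Frobenius norm (Eckart-Young).&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 50))   # hypothetical data matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 10                               # keep the k largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# The Frobenius error equals the norm of the discarded singular values
err = np.linalg.norm(A - A_k, 'fro')
print(np.isclose(err, np.sqrt(np.sum(s[k:]**2))))   # True
&lt;/code>&lt;/pre>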
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/">
 Matrix Decompositions
&lt;/a>&lt;/p></description></item><item><title>Vector Calculus</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/</guid><description>&lt;h1 id="vector-calculus">
 Vector Calculus
 
 &lt;a class="anchor" href="#vector-calculus">#&lt;/a>
 
&lt;/h1>
&lt;p>Vector calculus extends differentiation to multivariate and vector-valued functions.&lt;/p>
&lt;p>Gradients power learning. This section builds differentiation skills needed for backpropagation.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/">Differentiation of Univariate Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/">Partial Differentiation and Gradients&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/">Gradients of Vector-Valued and Matrix Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/">Useful Gradient Identities&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/">Backpropagation and Automatic Differentiation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/">Higher-order derivatives&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/">Taylor’s series&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/">Maxima and Minima&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD

 %% Core Node
 PD[&amp;#34;Partial Derivatives&amp;#34;]

 %% Supporting Concepts
 DQ[&amp;#34;Difference Quotient&amp;#34;]
 JH[&amp;#34;Jacobian / Hessian&amp;#34;]
 TS[&amp;#34;Taylor Series&amp;#34;]

 %% Application Chapters
 CH6[&amp;#34;&amp;lt;br/&amp;gt;Probability&amp;#34;]
 CH7[&amp;#34;&amp;lt;br/&amp;gt;Optimization&amp;#34;]
 CH9[&amp;#34;&amp;lt;br/&amp;gt;Regression&amp;#34;]
 CH10[&amp;#34;&amp;lt;br/&amp;gt;Dimensionality Reduction&amp;#34;]
 CH11[&amp;#34;&amp;lt;br/&amp;gt;Density Estimation&amp;#34;]
 CH12[&amp;#34;&amp;lt;br/&amp;gt;Classification&amp;#34;]

 %% Relationships
 DQ --&amp;gt;|defines| PD
 PD --&amp;gt;|collected in| JH
 JH --&amp;gt;|used in| TS
 JH --&amp;gt;|used in| CH6
	
 PD --&amp;gt;|used in| CH7
 PD --&amp;gt;|used in| CH9
 PD --&amp;gt;|used in| CH10
 PD --&amp;gt;|used in| CH11
 PD --&amp;gt;|used in| CH12

 %% Styling (Your Soft Academic Palette)
 style PD fill:#90CAF9,stroke:#1E88E5,color:#000

 style DQ fill:#CE93D8,stroke:#8E24AA,color:#000
 style JH fill:#CE93D8,stroke:#8E24AA,color:#000
 style TS fill:#CE93D8,stroke:#8E24AA,color:#000
 style CH6 fill:#CE93D8,stroke:#8E24AA,color:#000
	
 style CH7 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH9 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH10 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH11 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH12 fill:#C8E6C9,stroke:#2E7D32,color:#000

&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">
 Calculus
&lt;/a>&lt;/p></description></item><item><title>Continuous Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/</guid><description>&lt;h1 id="continuous-optimisation">
 Continuous Optimisation
 
 &lt;a class="anchor" href="#continuous-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;p>Optimisation finds parameters that minimise (or maximise) an objective function.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/">Optimisation using Gradient Descent&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/">Constrained Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/">Lagrange Multipliers&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/">Convex Optimisation&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">
 Calculus
&lt;/a>&lt;/p></description></item><item><title>Optimisation using Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/</guid><description>&lt;h1 id="optimisation-using-gradient-descent">
 Optimisation using Gradient Descent
 
 &lt;a class="anchor" href="#optimisation-using-gradient-descent">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an optimisation algorithm used to train ML models and neural networks.&lt;/p>
&lt;ul>
&lt;li>Gradient descent updates parameters by moving opposite the gradient.&lt;/li>
&lt;/ul>
&lt;p>Trains ML models by minimising the error between predicted and actual results:&lt;/p>
&lt;ul>
&lt;li>iteratively adjusts the model parameters&lt;/li>
&lt;li>moves step-by-step in the direction of the steepest decrease in the loss function&lt;/li>
&lt;li>helps ML models learn the best possible weights for better predictions (a minimal sketch follows the list of variants below)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="types-of-gradient-gescent-learning-algorithms">
 Types of Gradient Gescent learning algorithms
 
 &lt;a class="anchor" href="#types-of-gradient-gescent-learning-algorithms">#&lt;/a>
 
&lt;/h2>
&lt;ol>
&lt;li>Batch gradient descent&lt;/li>
&lt;li>Stochastic gradient descent&lt;/li>
&lt;li>Mini-batch gradient descent&lt;/li>
&lt;/ol>
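&lt;p>A minimal sketch (an added illustration on a hypothetical one-parameter loss) of the basic update rule shared by all three variants:&lt;/p>
&lt;pre>&lt;code class="language-python"># Minimise f(w) = (w - 3)^2 with plain gradient descent
grad = lambda w: 2 * (w - 3)   # derivative of the loss

w, lr = 0.0, 0.1               # initial weight and learning rate
for _ in range(100):
    w -= lr * grad(w)          # step opposite the gradient

print(w)                       # converges towards the minimiser w = 3
&lt;/code>&lt;/pre>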
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Constrained Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/</guid><description>&lt;h1 id="constrained-optimisation">
 Constrained Optimisation
 
 &lt;a class="anchor" href="#constrained-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;p>Optimisation with constraints (equalities/inequalities).&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Lagrange Multipliers</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/</guid><description>&lt;h1 id="lagrange-multipliers">
 Lagrange Multipliers
 
 &lt;a class="anchor" href="#lagrange-multipliers">#&lt;/a>
 
&lt;/h1>
&lt;p>Transforms constrained problems into unconstrained ones using Lagrangians.&lt;/p>
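&lt;p>A brief worked example (an addition to these notes): maximise f(x, y) = xy subject to x + y = 10.&lt;/p>
&lt;span style="color: green;">
 &lt;span>
 \[
\mathcal{L}(x, y, \lambda) = xy + \lambda(10 - x - y), \qquad
\frac{\partial \mathcal{L}}{\partial x} = y - \lambda = 0, \qquad
\frac{\partial \mathcal{L}}{\partial y} = x - \lambda = 0
\]
 &lt;/span>
&lt;/span>
&lt;p>Together with the constraint, these give x = y = 5, so the constrained maximum is f = 25.&lt;/p>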
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Convex Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/</guid><description>&lt;h1 id="convex-optimisation">
 Convex Optimisation
 
 &lt;a class="anchor" href="#convex-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;p>Convex objectives have a single global minimum, making optimisation reliable.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Nonlinear Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/</guid><description>&lt;h1 id="nonlinear-optimisation-in-machine-learning">
 Nonlinear Optimisation in Machine Learning
 
 &lt;a class="anchor" href="#nonlinear-optimisation-in-machine-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Practical training challenges and modern optimisers used in ML.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/">Challenges in Gradient-Based Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/">Stochastic Gradient Descent (SGD)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/">Momentum-Based Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/">Adaptive Methods: AdaGrad, RMSProp, Adam&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/">Tuning Hyperparameters and Preprocessing&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">
 Calculus
&lt;/a>&lt;/p></description></item><item><title>Challenges in Gradient-Based Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/</guid><description>&lt;h1 id="challenges-in-gradient-based-optimisation">
 Challenges in Gradient-Based Optimisation
 
 &lt;a class="anchor" href="#challenges-in-gradient-based-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Local optima and flat regions&lt;/li>
&lt;li>Differential curvature&lt;/li>
&lt;li>Difficult topologies (cliffs and valleys)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Stochastic Gradient Descent (SGD)</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/</guid><description>&lt;h1 id="stochastic-gradient-descent-sgd">
 Stochastic Gradient Descent (SGD)
 
 &lt;a class="anchor" href="#stochastic-gradient-descent-sgd">#&lt;/a>
 
&lt;/h1>
&lt;p>SGD uses mini-batches to trade exact gradients for speed and generalisation.&lt;/p>
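&lt;p>A minimal sketch (an added illustration on hypothetical linear-regression data) of the mini-batch update:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=200)                  # hypothetical inputs
y = 2 * X + 1 + 0.1 * rng.standard_normal(200)    # targets from y = 2x + 1 plus noise

w, b, lr, batch = 0.0, 0.0, 0.1, 16
for _ in range(500):
    idx = rng.choice(200, size=batch, replace=False)  # sample a mini-batch
    err = w * X[idx] + b - y[idx]
    w -= lr * 2 * np.mean(err * X[idx])   # noisy estimate of the true gradient
    b -= lr * 2 * np.mean(err)

print(w, b)   # approaches (2, 1)
&lt;/code>&lt;/pre>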
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Momentum-Based Learning</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/</guid><description>&lt;h1 id="momentum-based-learning">
 Momentum-Based Learning
 
 &lt;a class="anchor" href="#momentum-based-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Momentum smooths updates and helps traverse valleys efficiently.&lt;/p>
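&lt;p>A minimal sketch (an added illustration on a hypothetical quadratic loss) of the classical momentum update:&lt;/p>
&lt;pre>&lt;code class="language-python"># Minimise f(w) = (w - 3)^2 with momentum
grad = lambda w: 2 * (w - 3)

w, v = 0.0, 0.0
lr, beta = 0.05, 0.9          # step size and momentum coefficient
for _ in range(200):
    v = beta * v + grad(w)    # accumulate a running direction
    w -= lr * v               # step along the smoothed direction

print(w)                      # converges towards w = 3
&lt;/code>&lt;/pre>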
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Adaptive Methods: AdaGrad, RMSProp, Adam</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/</guid><description>&lt;h1 id="adaptive-methods-adagrad-rmsprop-adam">
 Adaptive Methods: AdaGrad, RMSProp, Adam
 
 &lt;a class="anchor" href="#adaptive-methods-adagrad-rmsprop-adam">#&lt;/a>
 
&lt;/h1>
&lt;p>Adaptive methods adjust the learning rate individually for each parameter.&lt;/p>
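&lt;p>A minimal sketch (an added illustration) of the standard Adam update on a hypothetical one-parameter loss:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

grad = lambda w: 2 * (w - 3)   # gradient of f(w) = (w - 3)^2

w, m, v = 0.0, 0.0, 0.0
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 501):
    g = grad(w)
    m = b1 * m + (1 - b1) * g          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * g * g      # second-moment estimate
    m_hat = m / (1 - b1 ** t)          # bias corrections
    v_hat = v / (1 - b2 ** t)
    w -= lr * m_hat / (np.sqrt(v_hat) + eps)   # per-parameter step size

print(w)   # approaches w = 3
&lt;/code>&lt;/pre>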
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Tuning Hyperparameters and Preprocessing</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/</guid><description>&lt;h1 id="tuning-hyperparameters-and-preprocessing">
 Tuning Hyperparameters and Preprocessing
 
 &lt;a class="anchor" href="#tuning-hyperparameters-and-preprocessing">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Learning rate schedules&lt;/li>
&lt;li>Initialisation&lt;/li>
&lt;li>Tuning hyperparameters&lt;/li>
&lt;li>Importance of feature preprocessing&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Dimensionality reduction and PCA</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/</guid><description>&lt;h1 id="dimensionality-reduction-and-pca">
 Dimensionality reduction and PCA
 
 &lt;a class="anchor" href="#dimensionality-reduction-and-pca">#&lt;/a>
 
&lt;/h1>
&lt;p>PCA and SVM connect linear algebra, geometry, and optimisation.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/">Principal Component Analysis (PCA)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/">PCA Theory&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/">PCA in Practice&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/">Latent Variable Perspective&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/">Mathematical Preliminaries of SVM&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/">Nonlinear SVM and Kernels&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/">
 Linear Algebra
&lt;/a>&lt;/p></description></item><item><title>Principal Component Analysis (PCA)</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/</guid><description>&lt;h1 id="principal-component-analysis-pca">
 Principal Component Analysis (PCA)
 
 &lt;a class="anchor" href="#principal-component-analysis-pca">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>dimensionality reduction technique&lt;/li>
&lt;li>helps us to &lt;strong>reduce the number of features&lt;/strong> in a dataset while keeping the most important information.&lt;/li>
&lt;li>simplifies complex datasets by transforming correlated features into a smaller set of uncorrelated components.&lt;/li>
&lt;li>uses &lt;strong>linear algebra&lt;/strong> to transform data into &lt;strong>new features&lt;/strong> called principal components.&lt;/li>
&lt;li>finds these by calculating &lt;strong>eigenvectors (directions)&lt;/strong> and &lt;strong>eigenvalues (importance)&lt;/strong> from the &lt;strong>covariance matrix&lt;/strong>.&lt;/li>
&lt;li>PCA &lt;strong>selects the top components with the highest eigenvalues&lt;/strong> and &lt;strong>projects the data onto them to simplify the dataset&lt;/strong> (a minimal sketch follows this list).&lt;/li>
&lt;/ul>
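&lt;p>A minimal sketch (an added NumPy illustration on hypothetical data) of these steps:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))       # 200 samples, 5 features

Xc = X - X.mean(axis=0)                 # centre each feature
C = np.cov(Xc, rowvar=False)            # 5 x 5 covariance matrix

eigvals, eigvecs = np.linalg.eigh(C)    # eigh: for symmetric matrices, ascending order
order = np.argsort(eigvals)[::-1]       # largest eigenvalues first
components = eigvecs[:, order[:2]]      # top 2 principal directions

Z = Xc @ components                     # project the data onto the components
print(Z.shape)                          # (200, 2)
&lt;/code>&lt;/pre>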
&lt;blockquote class="book-hint default">
&lt;p>PCA prioritizes the directions where the data varies the most because more variation = more useful information.&lt;/p></description></item><item><title>PCA Theory</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/</guid><description>&lt;h1 id="pca-theory">
 PCA Theory
 
 &lt;a class="anchor" href="#pca-theory">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Problem setting&lt;/li>
&lt;li>Maximum variance perspective&lt;/li>
&lt;li>Projection perspective&lt;/li>
&lt;li>Eigenvector and low-rank approximations&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>PCA in Practice</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/</guid><description>&lt;h1 id="pca-in-practice">
 PCA in Practice
 
 &lt;a class="anchor" href="#pca-in-practice">#&lt;/a>
 
&lt;/h1>
&lt;p>Key steps of PCA in practice, including considerations in high dimensions.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>Latent Variable Perspective</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/</guid><description>&lt;h1 id="latent-variable-perspective">
 Latent Variable Perspective
 
 &lt;a class="anchor" href="#latent-variable-perspective">#&lt;/a>
 
&lt;/h1>
&lt;p>PCA can be interpreted as modelling data using a smaller number of latent variables.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>Mathematical Preliminaries of SVM</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/</guid><description>&lt;h1 id="mathematical-preliminaries-of-svm">
 Mathematical Preliminaries of SVM
 
 &lt;a class="anchor" href="#mathematical-preliminaries-of-svm">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Primal and dual perspectives&lt;/li>
&lt;li>Geometry of margins&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>Nonlinear SVM and Kernels</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/</guid><description>&lt;h1 id="nonlinear-svm-and-kernels">
 Nonlinear SVM and Kernels
 
 &lt;a class="anchor" href="#nonlinear-svm-and-kernels">#&lt;/a>
 
&lt;/h1>
&lt;p>Kernels allow inner products in high-dimensional feature spaces without explicit mapping.&lt;/p>
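&lt;p>A minimal sketch (an added illustration) of the Gaussian (RBF) kernel, which corresponds to an inner product in an infinite-dimensional feature space computed without ever constructing that space:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def rbf_kernel(x, z, gamma=1.0):
    # k(x, z) = exp(-gamma * ||x - z||^2)
    return np.exp(-gamma * np.sum((x - z) ** 2))

x = np.array([1.0, 2.0])
z = np.array([1.5, 1.0])
print(rbf_kernel(x, z))   # similarity in (0, 1]
&lt;/code>&lt;/pre>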
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item></channel></rss>