<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI on Arshad Siddiqui</title><link>https://arshadhs.github.io/categories/ai/</link><description>Recent content in AI on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 22 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://arshadhs.github.io/categories/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Formula Sheet</title><link>https://arshadhs.github.io/docs/ai/statistics/00_formulas/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/00_formulas/</guid><description>&lt;h1 id="formula-sheet">
 Formula Sheet
 
 &lt;a class="anchor" href="#formula-sheet">#&lt;/a>
 
&lt;/h1>
&lt;p>This page is a quick reference of &lt;strong>definitions + formulas&lt;/strong>, grouped by module.&lt;/p>
&lt;hr>
&lt;h2 id="notation">
 Notation
 
 &lt;a class="anchor" href="#notation">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Sample size: 
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>

 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>

&lt;span>
 \( n \)
 &lt;/span>

 (sample), 
&lt;span>
 \( N \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Sample mean: 
&lt;span>
 \( \bar{x} \)
 &lt;/span>

, population mean: 
&lt;span>
 \( \mu \)
 &lt;/span>

&lt;/li>
&lt;li>Sample variance: 
&lt;span>
 \( s^2 \)
 &lt;/span>

, population variance: 
&lt;span>
 \( \sigma^2 \)
 &lt;/span>

&lt;/li>
&lt;li>Sample SD: 
&lt;span>
 \( s \)
 &lt;/span>

, population SD: 
&lt;span>
 \( \sigma \)
 &lt;/span>

&lt;/li>
&lt;li>Complement: 
&lt;span>
 \( A^c \)
 &lt;/span>

&lt;/li>
&lt;li>Intersection (“and”): 
&lt;span>
 \( A\cap B \)
 &lt;/span>

, union (“or”): 
&lt;span>
 \( A\cup B \)
 &lt;/span>

&lt;/li>
&lt;li>Conditional probability: 
&lt;span>
 \( P(A\mid B) \)
 &lt;/span>

&lt;/li>
&lt;/ul>
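&lt;p>A minimal Python sketch of the sample/population distinction, using the standard-library &lt;code>statistics&lt;/code> module on made-up data: the sample statistics divide by n-1, the population versions by N.&lt;/p>
&lt;pre>&lt;code class="language-python">import statistics as st

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data

print(st.mean(data))       # sample mean x-bar (mu if data is the whole population)
print(st.variance(data))   # s^2: sample variance, n-1 denominator
print(st.pvariance(data))  # sigma^2: population variance, N denominator
print(st.stdev(data), st.pstdev(data))  # s and sigma
&lt;/code>&lt;/pre>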
&lt;hr>
&lt;h1 id="1-basic-probability--statistics">
 1. Basic Probability &amp;amp; Statistics
 
 &lt;a class="anchor" href="#1-basic-probability--statistics">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="11-measures-of-central-tendency">
 1.1 Measures of Central Tendency
 
 &lt;a class="anchor" href="#11-measures-of-central-tendency">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="arithmetic-mean">
 Arithmetic mean
 
 &lt;a class="anchor" href="#arithmetic-mean">#&lt;/a>
 
&lt;/h3>
&lt;p>Sample mean (ungrouped):&lt;/p></description></item><item><title>Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</guid><description>&lt;h1 id="supervised-learning">
 Supervised Learning
 
 &lt;a class="anchor" href="#supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Trained using &lt;strong>labelled data&lt;/strong>: each example in the training set includes the &lt;strong>correct output&lt;/strong>.&lt;br>
The algorithm learns to &lt;strong>generalise&lt;/strong> and make predictions on unseen data.&lt;br>
Requires &lt;strong>human intervention&lt;/strong> for labelling and setup.&lt;br>
Widely used because it is generally more &lt;strong>accurate and efficient&lt;/strong> than unsupervised methods, provided it is trained on good-quality labelled data.&lt;/p>
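&lt;p>A minimal sketch of this train-then-predict loop, assuming scikit-learn is available (the toy data and labels below are made up):&lt;/p>
&lt;pre>&lt;code class="language-python"># fit a classifier on labelled examples, then predict on unseen inputs
from sklearn.svm import SVC

X_train = [[1, 1], [2, 1], [8, 9], [9, 8]]   # features (made up)
y_train = [0, 0, 1, 1]                       # correct outputs (labels)

clf = SVC(kernel="linear")   # linear, margin-based classifier
clf.fit(X_train, y_train)    # learn from the labelled data

print(clf.predict([[1, 2], [9, 9]]))   # generalise to unseen data: [0 1]
&lt;/code>&lt;/pre>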
&lt;hr>
&lt;h2 id="classification">
 Classification
 
 &lt;a class="anchor" href="#classification">#&lt;/a>
 
&lt;/h2>
&lt;p>Output is &lt;strong>discrete&lt;/strong> (e.g. Yes/No, Spam/Not Spam).&lt;br>
Used for &lt;strong>categorising data&lt;/strong> into predefined classes.&lt;br>
Support Vector Machine (SVM) is a common classifier (a linear classifier with margin-based separation).&lt;/p></description></item><item><title>Differentiation of Univariate Functions</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/</guid><description>&lt;h1 id="differentiation-of-univariate-functions">
 Differentiation of Univariate Functions
 
 &lt;a class="anchor" href="#differentiation-of-univariate-functions">#&lt;/a>
 
&lt;/h1>
&lt;p>Differentiation measures the rate of change of a function. For a function f(x), the derivative is defined by the limit:&lt;/p>
&lt;span style="color: red;">
 $[
f'(x) = $lim_{h $to 0} $frac{f(x+h)-f(x)}{h}
$]
&lt;/span>
&lt;p>Interpretation:&lt;/p>
&lt;ul>
&lt;li>Slope of tangent&lt;/li>
&lt;li>Instantaneous rate of change&lt;/li>
&lt;/ul>
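&lt;p>A minimal Python sketch of the limit definition above, approximating the derivative with a small finite step h (the function and step size are made-up choices):&lt;/p>
&lt;pre>&lt;code class="language-python"># forward-difference approximation of f'(x)
def derivative(f, x, h=1e-6):
    return (f(x + h) - f(x)) / h

f = lambda x: x**2
print(derivative(f, 3.0))   # close to 6.0, the slope of the tangent at x = 3
&lt;/code>&lt;/pre>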
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Artificial Intelligence</title><link>https://arshadhs.github.io/docs/ai/</link><pubDate>Thu, 04 Jul 2024 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/</guid><description>&lt;h1 id="my-ai-notes">
 My AI Notes
 
 &lt;a class="anchor" href="#my-ai-notes">#&lt;/a>
 
&lt;/h1>
&lt;p>Learning how machines learn! My working notes as I learn AI.&lt;/p>
&lt;hr>






&lt;pre class="mermaid">
flowchart LR
 AI[Artificial Intelligence]
 ML[Machine Learning]
 DL[Deep Learning]
 FM[Foundation Models]
 LLM[LLM Models]

 AI --&amp;gt; ML
 ML --&amp;gt; DL
 DL --&amp;gt; FM
 FM --&amp;gt; LLM

 style AI fill:#E1F5FE
 style ML fill:#C8E6C9
 style DL fill:#90CAF9
 style FM fill:#64B5F6
 style LLM fill:#FFCCBC
&lt;/pre>

&lt;hr>
&lt;ul>
&lt;li>Mathematical Foundations for Machine Learning&lt;/li>
&lt;li>Statistical Methods&lt;/li>
&lt;li>Machine Learning&lt;/li>
&lt;li>Deep Neural Networks&lt;/li>
&lt;/ul>
&lt;hr>




&lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/">AI Foundation&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stages/">AI Stages: ANI, AGI, ASI&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stack/">AI Stack&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/">AI Pipeline&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-notes/">AI Learning Resources&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">Machine Learning&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/">Supervised Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/">Unsupervised Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/">Semi-Supervised Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/">Reinforcement Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/">ML Workflow&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/">Regression (Linear Models)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/">Ordinary Least Squares&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/">Cost Function&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/">Gradient Descent&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/">Classification (Linear Models)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/">Decision Tree&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/">Instance-based Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/">Support Vector Machine&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/">Bayesian Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/">Ensemble Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/">Evaluation/Comparison&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/99-ml-pipeline-model/">ML Pipeline&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/genai/">Generative AI&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/genai/foundation-model/">Foundation Models&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/genai/llm/">LLM - Model&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/genai/ai-agents/">AI Agents&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/genai/rag/">Retrieval-Augmented Generation (RAG)&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">Deep Learning&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/">Neural Networks&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/">Artificial Neuron and Perceptron&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/030-linear-neural-networks-for-regression/">LNN for Regression&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/035-gradient-descent-algorithm/">Gradient Descent Algorithm&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/040-linear-neural-networks-for-classification/">LNN for Classification&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/050-deep-feedforward/">Deep Feedforward Neural Networks (DFNN) for Classification&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/060-cnn-fundamentals/">Convolutional Neural Networks&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/065-deep-cnn-architectures/">Deep CNN Architectures&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/067-cnn-model/">CNN Pipeline&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/070-recurrent-nn/">Recurrent Neural Networks&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/075-recurrent-nn-deep/">Deep Recurrent Neural Networks&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/">Attention Mechanism&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/">Transformer&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/">Optimisation of Deep models&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/">Regularisation for Deep models&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/">Mathematical Foundation&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/">Linear Algebra&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/">Linear Systems&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/">Systems of Linear Equations&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/">Matrices&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/matrix-transposition/">Matrix Transposition&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/">Solving Linear Systems&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/">Forward and Backward Substitution&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/">Inverse Matrix&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/">Convex Combination&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/">Vector Spaces&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/020-basis-and-rank/">Basis and Rank&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/">Linear Independence&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/030-norm/">Norm&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/">Inner Products and Dot Product&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/050-lengths-and-distances/">Lengths and Distances&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/">Angles and Orthogonality&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/070-orthonormal-basis/">Orthonormal Basis&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/">Feature Space&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/">Cauchy–Schwarz&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/">Matrix Decompositions&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/special-matrices/">Special Matrices&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/">Characteristic Polynomial&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/">Determinant and Trace&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/">Eigenvalues and Eigenvectors&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/">Cholesky Decomposition&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/">Eigen Decomposition&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/">Diagonalization&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/">Singular Value Decomposition (SVD)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/">Matrix Approximation&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">Dimensionality reduction and PCA&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/">Principal Component Analysis (PCA)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/">PCA Theory&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/">PCA in Practice&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/">Latent Variable Perspective&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/">Mathematical Preliminaries of SVM&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/">Nonlinear SVM and Kernels&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">Calculus&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">Vector Calculus&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/">Differentiation of Univariate Functions&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/">Partial Differentiation and Gradients&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/">Gradients of Vector-Valued and Matrix Functions&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/">Useful Gradient Identities&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/">Backpropagation and Automatic Differentiation&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/">Higher-order derivatives&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/">Taylor’s series&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/">Maxima and Minima&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">Continuous Optimisation&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/">Optimisation using Gradient Descent&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/">Constrained Optimisation&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/">Lagrange Multipliers&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/">Convex Optimisation&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">Nonlinear Optimisation&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/">Challenges in Gradient-Based Optimisation&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/">Stochastic Gradient Descent (SGD)&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/">Momentum-Based Learning&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/">Adaptive Methods: AdaGrad, RMSProp, Adam&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/">Tuning Hyperparameters and Preprocessing&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/">Statistics&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/00_formulas/">Formula Sheet&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/">Stats Formula Sheet&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/">Basic Statistics&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/">Basic Probability&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/">Hypothesis Testing&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/">Prediction &amp;amp; Forecasting&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/">Gaussian Mixture model &amp;amp; Expectation Maximization&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/">Conditional Probability &amp;amp; Bayes’ Theorem&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/">Conditional Probability&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/">Bayes’ Theorem&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/">Naïve Bayes&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/">Probability Distributions&lt;/a>
 &lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/">Random Variables&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/">Common Probability Distributions&lt;/a>&lt;/li>
 &lt;/ul>
 &lt;/li>
 &lt;/ul>
 &lt;/li>
&lt;/ul>


&lt;hr>
&lt;ul>
&lt;li>Machine Learning → The broad field where systems learn patterns from data to make predictions or decisions.&lt;/li>
&lt;li>Neural Networks → A subset of machine learning that uses interconnected artificial neurons to model complex relationships.&lt;/li>
&lt;li>Deep Learning → A subset of neural networks that uses many hidden layers to learn high-level features from large datasets.&lt;/li>
&lt;li>Foundation Models → Large deep learning models trained on massive datasets and reused across many tasks using transfer learning.&lt;/li>
&lt;li>LLMs (Large Language Models) → A specialised type of foundation model focused on understanding and generating human language.&lt;/li>
&lt;/ul>
&lt;hr>


&lt;pre class="mermaid">
flowchart TD
AI[&amp;#34;Artificial&amp;lt;br/&amp;gt;Intelligence&amp;#34;]
ML[&amp;#34;Machine&amp;lt;br/&amp;gt;Learning&amp;#34;]
NN[&amp;#34;Neural&amp;lt;br/&amp;gt;Networks&amp;#34;]
DL[&amp;#34;Deep&amp;lt;br/&amp;gt;Learning&amp;#34;]
FM[&amp;#34;Foundation&amp;lt;br/&amp;gt;Models&amp;#34;]
LLM[&amp;#34;LLM&amp;lt;br/&amp;gt;Models&amp;#34;]

AI --&amp;gt; ML
ML --&amp;gt; NN
NN --&amp;gt; DL
DL --&amp;gt; FM
FM --&amp;gt; LLM

LR[&amp;#34;Linear&amp;lt;br/&amp;gt;Regression&amp;#34;]
DT[&amp;#34;Decision&amp;lt;br/&amp;gt;Trees&amp;#34;]
ML --&amp;gt; LR
ML --&amp;gt; DT

MLP[&amp;#34;MLP&amp;#34;]
CNN[&amp;#34;CNN&amp;#34;]
NN --&amp;gt; MLP
NN --&amp;gt; CNN

CNNDL[&amp;#34;CNN&amp;lt;br/&amp;gt;(deep)&amp;#34;]
RNN[&amp;#34;RNN&amp;#34;]
DL --&amp;gt; CNNDL
DL --&amp;gt; RNN

BERT[&amp;#34;BERT&amp;#34;]
CLIP[&amp;#34;CLIP&amp;#34;]
FM --&amp;gt; BERT
FM --&amp;gt; CLIP

GPT[&amp;#34;GPT&amp;#34;]
LLAMA[&amp;#34;LLaMA&amp;#34;]
LLM --&amp;gt; GPT
LLM --&amp;gt; LLAMA

TEXT[&amp;#34;Text&amp;#34;]
IMAGE[&amp;#34;Images&amp;#34;]
AUDIO[&amp;#34;Audio&amp;#34;]
VIDEO[&amp;#34;Video&amp;#34;]
LLM --&amp;gt; TEXT
LLM --&amp;gt; IMAGE
LLM --&amp;gt; AUDIO
LLM --&amp;gt; VIDEO

style AI fill:#90CAF9,stroke:#1E88E5,color:#000
style ML fill:#90CAF9,stroke:#1E88E5,color:#000
style NN fill:#90CAF9,stroke:#1E88E5,color:#000

style DL fill:#CE93D8,stroke:#8E24AA,color:#000
style FM fill:#CE93D8,stroke:#8E24AA,color:#000

style LLM fill:#C8E6C9,stroke:#2E7D32,color:#000
style LR fill:#C8E6C9,stroke:#2E7D32,color:#000
style DT fill:#C8E6C9,stroke:#2E7D32,color:#000
style MLP fill:#C8E6C9,stroke:#2E7D32,color:#000
style CNN fill:#C8E6C9,stroke:#2E7D32,color:#000
style CNNDL fill:#C8E6C9,stroke:#2E7D32,color:#000
style RNN fill:#C8E6C9,stroke:#2E7D32,color:#000
style BERT fill:#C8E6C9,stroke:#2E7D32,color:#000
style CLIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style GPT fill:#C8E6C9,stroke:#2E7D32,color:#000
style LLAMA fill:#C8E6C9,stroke:#2E7D32,color:#000
style TEXT fill:#C8E6C9,stroke:#2E7D32,color:#000
style IMAGE fill:#C8E6C9,stroke:#2E7D32,color:#000
style AUDIO fill:#C8E6C9,stroke:#2E7D32,color:#000
style VIDEO fill:#C8E6C9,stroke:#2E7D32,color:#000
&lt;/pre>

&lt;hr>
&lt;p>&lt;img src="https://arshadhs.github.io/images/ai/ai_ml_dl_ds_diagram.png" alt="AI, ML, DL, and Data Science Diagram" />&lt;/p></description></item><item><title>Stats Formula Sheet</title><link>https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/</link><pubDate>Wed, 25 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/</guid><description>&lt;h1 id="stats-formula-sheet">
 Stats Formula Sheet
 
 &lt;a class="anchor" href="#stats-formula-sheet">#&lt;/a>
 
&lt;/h1>
&lt;p>Keep this page as a quick reference of &lt;strong>definitions + formulas&lt;/strong>.&lt;/p>
&lt;hr>
&lt;h2 id="notation">
 Notation
 
 &lt;a class="anchor" href="#notation">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Sample size: 
&lt;span>
 \( n \)
 &lt;/span>

 (sample), 
&lt;span>
 \( N \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Mean: 
&lt;span>
 \( \bar{x} \)
 &lt;/span>

 (sample), 
&lt;span>
 \( \mu \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Variance: 
&lt;span>
 \( s^2 \)
 &lt;/span>

 (sample), 
&lt;span>
 \( \sigma^2 \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Standard deviation: 
&lt;span>
 \( s \)
 &lt;/span>

 (sample), 
&lt;span>
 \( \sigma \)
 &lt;/span>

 (population)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="module-1-basic-statistics">
 Module 1: Basic Statistics
 
 &lt;a class="anchor" href="#module-1-basic-statistics">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="measures-of-central-tendency">
 Measures of Central Tendency
 
 &lt;a class="anchor" href="#measures-of-central-tendency">#&lt;/a>
 
&lt;/h3>
&lt;p>&lt;strong>Sample mean (ungrouped):&lt;/strong>&lt;/p></description></item><item><title>Unsupervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</guid><description>&lt;h1 id="unsupervised-learning">
 Unsupervised Learning
 
 &lt;a class="anchor" href="#unsupervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Works on &lt;strong>unlabelled raw data&lt;/strong>.&lt;/li>
&lt;li>The algorithm &lt;strong>discovers hidden patterns&lt;/strong> without prior knowledge of outcomes.&lt;/li>
&lt;li>Requires &lt;strong>no human intervention&lt;/strong> during training.&lt;/li>
&lt;li>Does not make direct predictions — it &lt;strong>groups or organises data&lt;/strong> instead.&lt;/li>
&lt;li>Carries a &lt;strong>higher risk&lt;/strong> because there’s no ground truth to verify results.&lt;/li>
&lt;li>Common techniques include &lt;strong>Clustering&lt;/strong>, &lt;strong>Association&lt;/strong>, and &lt;strong>Dimensionality Reduction&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;hr>






&lt;pre class="mermaid">
stateDiagram-v2

 %% ML maths-based colours (same palette as supervised)
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#f3f4f6,stroke:#374151

 %% Root
 USL: Unsupervised Learning

 %% Main branches
 USL --&amp;gt; CLU:::category
 CLU: Clustering

 USL --&amp;gt; DR:::category
 DR: Dimensionality Reduction

 %% Clustering algorithms
 CLU --&amp;gt; KM:::geometry
 KM: K-Means

 CLU --&amp;gt; HC:::geometry
 HC: Hierarchical Clustering

 CLU --&amp;gt; DB:::geometry
 DB: DBSCAN

 %% Probabilistic models
 USL --&amp;gt; PM:::category
 PM: Probabilistic Models

 PM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 PM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;h2 id="clustering">
 Clustering
 
 &lt;a class="anchor" href="#clustering">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Groups &lt;strong>similar data points&lt;/strong> together based on shared features.&lt;/li>
&lt;li>Commonly used for &lt;strong>market segmentation&lt;/strong>, &lt;strong>image compression&lt;/strong>, and &lt;strong>anomaly detection&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-types-of-clustering">
 Common Types of Clustering
 
 &lt;a class="anchor" href="#common-types-of-clustering">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>K-Means Clustering&lt;/strong> – Divides data into &lt;em>K&lt;/em> groups based on similarity.&lt;/li>
&lt;li>&lt;strong>Hierarchical Clustering&lt;/strong> – Builds a hierarchy (tree) of clusters.&lt;/li>
&lt;li>&lt;strong>DBSCAN (Density-Based Spatial Clustering)&lt;/strong> – Groups points close in density; identifies noise/outliers.&lt;/li>
&lt;/ul>
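&lt;p>A minimal K-Means sketch, assuming scikit-learn is available and using made-up 2-D points with two obvious groups:&lt;/p>
&lt;pre>&lt;code class="language-python">from sklearn.cluster import KMeans

# unlabelled data: no ground truth is given to the algorithm
X = [[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]]

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)           # discovered cluster of each point
print(km.cluster_centers_)  # the two learned centroids
&lt;/code>&lt;/pre>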
&lt;hr>
&lt;h2 id="association">
 Association
 
 &lt;a class="anchor" href="#association">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Identifies &lt;strong>relationships or correlations&lt;/strong> between variables in a dataset.&lt;/li>
&lt;li>Commonly used in &lt;strong>market basket analysis&lt;/strong> (e.g. &amp;ldquo;Customers who bought X also bought Y&amp;rdquo;).&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Apriori Algorithm&lt;/strong> – Finds frequent itemsets and generates association rules.&lt;/li>
&lt;li>&lt;strong>Eclat Algorithm&lt;/strong> – Similar to Apriori but uses set intersections for faster computation.&lt;/li>
&lt;/ul>
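&lt;p>A minimal market-basket sketch in plain Python (the baskets are made up): count how often item pairs co-occur, which is the raw ingredient behind Apriori-style association rules.&lt;/p>
&lt;pre>&lt;code class="language-python">from itertools import combinations
from collections import Counter

baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
]

pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# support of a pair = fraction of baskets in which it appears together
for pair, count in pair_counts.most_common(3):
    print(pair, count / len(baskets))
&lt;/code>&lt;/pre>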
&lt;hr>
&lt;h2 id="dimensionality-reduction">
 Dimensionality Reduction
 
 &lt;a class="anchor" href="#dimensionality-reduction">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Reduces the &lt;strong>number of input variables&lt;/strong> to simplify data.&lt;/li>
&lt;li>Helps remove noise and redundancy.&lt;/li>
&lt;li>Commonly used in &lt;strong>data pre-processing&lt;/strong> and &lt;strong>visualisation&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques-1">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques-1">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Principal Component Analysis (PCA)&lt;/strong> – Projects data onto fewer dimensions while keeping most variance.&lt;/li>
&lt;li>&lt;strong>Linear Discriminant Analysis (LDA)&lt;/strong> – Focuses on class separation.&lt;/li>
&lt;li>&lt;strong>t-SNE (t-Distributed Stochastic Neighbour Embedding)&lt;/strong> – Used for visualising high-dimensional data.&lt;/li>
&lt;li>&lt;strong>Autoencoders&lt;/strong> – Neural networks that compress and reconstruct data.&lt;/li>
&lt;/ul>
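&lt;p>A minimal PCA sketch, assuming scikit-learn and numpy are available; the data is random, with one feature made nearly redundant on purpose:&lt;/p>
&lt;pre>&lt;code class="language-python">from sklearn.decomposition import PCA
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.01 * X[:, 2]    # third feature is almost redundant

pca = PCA(n_components=2)
X2 = pca.fit_transform(X)             # reduced to 2 input variables
print(pca.explained_variance_ratio_)  # variance kept by each component
&lt;/code>&lt;/pre>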
&lt;hr>


&lt;pre class="mermaid">
mindmap
  root(Unsupervised Learning)
    Clustering
      K Means
      Hierarchical Clustering
      DBSCAN
    Dimensionality Reduction
      PCA
      t SNE
      Autoencoders
    Probabilistic Models
      Gaussian Mixture Model
      Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Partial Differentiation and Gradients</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/</guid><description>&lt;h1 id="partial-differentiation-and-gradients">
 Partial Differentiation and Gradients
 
 &lt;a class="anchor" href="#partial-differentiation-and-gradients">#&lt;/a>
 
&lt;/h1>
&lt;p>For f(x1, x2, &amp;hellip;, xn):&lt;/p>
&lt;span style="color: red;">
 [
\frac{\partial f}{\partial x_i}
]
&lt;/span>
&lt;p>Gradient vector:&lt;/p>
&lt;span style="color: red;">
 [
\nabla f =
\begin{bmatrix}
\frac{\partial f}{\partial x_1} \
\vdots \
\frac{\partial f}{\partial x_n}
\end{bmatrix}
]
&lt;/span>
&lt;p>The gradient points in the direction of steepest ascent.&lt;/p>
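&lt;p>A minimal numpy sketch (numpy assumed available): approximate each partial derivative with a finite difference and stack them into the gradient vector.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def grad(f, x, h=1e-6):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x)) / h   # partial derivative w.r.t. x_i
    return g

f = lambda x: x[0]**2 + 3 * x[1]       # made-up example function
print(grad(f, np.array([1.0, 2.0])))   # approximately [2, 3]
&lt;/code>&lt;/pre>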






&lt;pre class="mermaid">
flowchart LR
 Input --&amp;gt; Function
 Function --&amp;gt; Gradient
 Gradient --&amp;gt; Optimisation
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Linear Independence</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/</guid><description>&lt;h1 id="linear-independence">
 Linear Independence
 
 &lt;a class="anchor" href="#linear-independence">#&lt;/a>
 
&lt;/h1>
&lt;p>A set of vectors is &lt;strong>linearly independent&lt;/strong> if none of them can be written as a linear combination of the others.&lt;/p>

&lt;span style="color: green;">
 &lt;span>
 \[ 
c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}
\;\Rightarrow\;
c_1=\cdots=c_k=0
 \]
 &lt;/span>

&lt;/span>
&lt;p>Independence means each vector adds &lt;strong>new information&lt;/strong>.&lt;/p>
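&lt;p>A minimal numpy check (numpy assumed; the vectors are made up): stack the vectors as columns; the set is independent exactly when the matrix has full column rank.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

v1, v2, v3 = [1, 0, 0], [0, 1, 0], [1, 1, 0]   # v3 = v1 + v2: redundant
A = np.column_stack([v1, v2, v3])

print(np.linalg.matrix_rank(A))   # 2, not 3: the set is dependent
&lt;/code>&lt;/pre>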
&lt;h2 id="why-it-matters">
 Why it matters
 
 &lt;a class="anchor" href="#why-it-matters">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Detects redundancy&lt;/li>
&lt;li>Connects to rank and basis&lt;/li>
&lt;/ul>
&lt;p>If one vector can already be formed using others, it does not add anything new.&lt;/p></description></item><item><title>Semi-Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</guid><description>&lt;h1 id="semi-supervised-learning">
 Semi-Supervised Learning
 
 &lt;a class="anchor" href="#semi-supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A combination of &lt;strong>labelled&lt;/strong> and &lt;strong>unlabelled data&lt;/strong>.&lt;/li>
&lt;li>Useful when labelling large datasets is &lt;strong>expensive or time-consuming&lt;/strong>.&lt;/li>
&lt;li>Works well with &lt;strong>high-volume datasets&lt;/strong> (e.g. millions of images).&lt;/li>
&lt;li>Only a &lt;strong>small fraction of data&lt;/strong> is labelled (e.g. a few thousand).&lt;/li>
&lt;li>The algorithm learns from both labelled examples and structure in unlabelled data.&lt;/li>
&lt;li>&lt;strong>Ideal for medical imaging&lt;/strong> where labelled data is limited.&lt;/li>
&lt;li>For example, a &lt;strong>radiologist&lt;/strong> can label a small set of medical scans,&lt;br>
and the model uses that to learn from thousands of unlabelled scans.&lt;/li>
&lt;li>Helps improve &lt;strong>accuracy and generalisation&lt;/strong> with minimal manual effort.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Gradients of Vector-Valued and Matrix Functions</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/</guid><description>&lt;h1 id="gradients-of-vector-valued-and-matrix-functions">
 Gradients of Vector-Valued and Matrix Functions
 
 &lt;a class="anchor" href="#gradients-of-vector-valued-and-matrix-functions">#&lt;/a>
 
&lt;/h1>
&lt;p>Covers gradients when outputs or parameters are vectors/matrices.&lt;/p>
&lt;p>If f: R^n -&amp;gt; R^m, the derivative is the Jacobian.&lt;/p>
&lt;span style="color: red;">
 [
J =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} &amp;amp; \dots &amp;amp; \frac{\partial f_1}{\partial x_n} \
\vdots &amp;amp; \ddots &amp;amp; \vdots \
\frac{\partial f_m}{\partial x_1} &amp;amp; \dots &amp;amp; \frac{\partial f_m}{\partial x_n}
\end{bmatrix}
]
&lt;/span>
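&lt;p>A minimal numpy sketch (numpy assumed): build the Jacobian column by column with finite differences, for a made-up f from R^2 to R^2.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def jacobian(f, x, h=1e-6):
    y = f(x)
    J = np.zeros((len(y), len(x)))
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = h
        J[:, i] = (f(x + e) - y) / h   # column i: partials w.r.t. x_i
    return J

f = lambda x: np.array([x[0] * x[1], np.sin(x[0])])
print(jacobian(f, np.array([1.0, 2.0])))   # a 2x2 Jacobian
&lt;/code>&lt;/pre>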
&lt;p>For a scalar-valued f(x), the matrix of second derivatives is the Hessian:&lt;/p>
&lt;span style="color: red;">
 [
H = \nabla^2 f
]
&lt;/span>
&lt;p>Hessian captures curvature.&lt;/p></description></item><item><title>Reinforcement Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/</guid><description>&lt;h1 id="reinforcement-learning-rl">
 Reinforcement Learning (RL)
 
 &lt;a class="anchor" href="#reinforcement-learning-rl">#&lt;/a>
 
&lt;/h1>
&lt;p>RL is learning by &lt;strong>trial and error&lt;/strong>.&lt;/p>
&lt;p>Reinforcement Learning (RL) is a type of machine learning where an &lt;strong>autonomous agent learns to make decisions by interacting with an environment&lt;/strong>.&lt;/p>
&lt;p>Instead of being told the correct answer, the agent:&lt;/p>
&lt;ul>
&lt;li>takes actions&lt;/li>
&lt;li>observes outcomes&lt;/li>
&lt;li>receives rewards or penalties&lt;/li>
&lt;li>gradually learns a strategy that maximises long-term reward&lt;/li>
&lt;/ul>
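&lt;p>A minimal sketch of this reward loop on a two-armed bandit (all numbers made up): the agent explores with probability epsilon, otherwise exploits its current value estimates.&lt;/p>
&lt;pre>&lt;code class="language-python">import random

true_reward_prob = [0.3, 0.7]   # hidden from the agent
estimates = [0.0, 0.0]          # agent's value estimate per action
counts = [0, 0]
epsilon = 0.1                   # exploration rate

random.seed(42)
for step in range(1000):
    if random.random() > epsilon:                       # exploit...
        action = 0 if estimates[0] > estimates[1] else 1
    else:                                               # ...or explore
        action = random.randrange(2)
    # environment returns a reward of 1 with the action's probability
    reward = 1 if random.random() > 1 - true_reward_prob[action] else 0
    counts[action] += 1
    estimates[action] += (reward - estimates[action]) / counts[action]

print(estimates)   # approaches [0.3, 0.7]; the agent learned to act
&lt;/code>&lt;/pre>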

&lt;blockquote class='book-hint '>
 &lt;p>&lt;strong>Reinforcement Learning teaches an agent how to act, not what to predict.&lt;/strong>&lt;/p></description></item><item><title>Useful Gradient Identities</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/</guid><description>&lt;h1 id="useful-gradient-identities">
 Useful Gradient Identities
 
 &lt;a class="anchor" href="#useful-gradient-identities">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: red;">
 [
\nabla (a^T x) = a
]
&lt;/span>
&lt;span style="color: red;">
 [
\nabla (x^T A x) = (A + A^T)x
]
&lt;/span>
&lt;p>If A symmetric:&lt;/p>
&lt;span style="color: red;">
 [
\nabla (x^T A x) = 2Ax
]
&lt;/span>
&lt;p>These are heavily used in &lt;strong>optimisation&lt;/strong>.&lt;/p>
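&lt;p>A minimal numpy check of the quadratic-form identity (numpy assumed; A and x are made-up values): compare a finite-difference gradient against (A + A^T)x.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

A = np.array([[2.0, 1.0], [0.0, 3.0]])   # deliberately non-symmetric
x = np.array([1.0, 2.0])

f = lambda x: x @ A @ x
h = 1e-6
num = np.array([(f(x + h * e) - f(x)) / h for e in np.eye(2)])

print(num)             # numerical gradient
print((A + A.T) @ x)   # matches the identity (A + A^T) x
&lt;/code>&lt;/pre>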
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Inner Products and Dot Product</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/</guid><description>&lt;h1 id="inner-products-and-dot-product">
 Inner Products and Dot Product
 
 &lt;a class="anchor" href="#inner-products-and-dot-product">#&lt;/a>
 
&lt;/h1>
&lt;p>An &lt;strong>inner product&lt;/strong> maps two vectors to a &lt;strong>single scalar&lt;/strong>.&lt;/p>
&lt;p>It allows us to measure:&lt;/p>
&lt;ul>
&lt;li>similarity&lt;/li>
&lt;li>vector length&lt;/li>
&lt;li>projections&lt;/li>
&lt;li>orthogonality&lt;/li>
&lt;/ul>
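&lt;p>A minimal numpy sketch of those four quantities (numpy assumed; the vectors are made up):&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, 0.0, 1.0])

print(a @ b)                  # similarity: the inner (dot) product
print(np.sqrt(a @ a))         # vector length of a (here 3.0)
print((a @ b) / (b @ b) * b)  # projection of a onto b
print(np.array([1.0, 0.0]) @ np.array([0.0, 1.0]))  # 0.0: orthogonal
&lt;/code>&lt;/pre>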






&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Inner&amp;lt;br/&amp;gt;products&amp;lt;br/&amp;gt;(types)&amp;#34;] --&amp;gt; DOT[&amp;#34;Euclidean&amp;lt;br/&amp;gt;Dot product&amp;#34;]
T --&amp;gt; WIP[&amp;#34;Weighted&amp;lt;br/&amp;gt;inner product&amp;#34;]
T --&amp;gt; FN[&amp;#34;Function-space&amp;lt;br/&amp;gt;(integral)&amp;#34;]
T --&amp;gt; HERM[&amp;#34;Complex&amp;lt;br/&amp;gt;Hermitian&amp;#34;]
T --&amp;gt; MAT[&amp;#34;Matrix&amp;lt;br/&amp;gt;inner product&amp;lt;br/&amp;gt;(Frobenius)&amp;#34;]

DOT --&amp;gt; Rn[&amp;#34;Vectors in&amp;lt;br/&amp;gt;
&amp;lt;span&amp;gt;
 \( \mathbb{R}^n \)
 &amp;lt;/span&amp;gt;

&amp;#34;]
WIP --&amp;gt; SPD[&amp;#34;SPD matrix&amp;lt;br/&amp;gt;W&amp;#34;]
FN --&amp;gt; L2[&amp;#34;L2 space&amp;lt;br/&amp;gt;functions&amp;#34;]
HERM --&amp;gt; Cn[&amp;#34;Vectors in&amp;lt;br/&amp;gt;C^n&amp;#34;]
MAT --&amp;gt; Mnm[&amp;#34;Matrices&amp;lt;br/&amp;gt;R^{m×n}&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style DOT fill:#C8E6C9,stroke:#2E7D32,color:#000
style WIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style FN fill:#C8E6C9,stroke:#2E7D32,color:#000
style HERM fill:#C8E6C9,stroke:#2E7D32,color:#000
style MAT fill:#C8E6C9,stroke:#2E7D32,color:#000

style Rn fill:#CE93D8,stroke:#8E24AA,color:#000
style SPD fill:#CE93D8,stroke:#8E24AA,color:#000
style L2 fill:#CE93D8,stroke:#8E24AA,color:#000
style Cn fill:#CE93D8,stroke:#8E24AA,color:#000
style Mnm fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h2>
&lt;p>For vectors&lt;br>

&lt;span>
 \( \mathbf{a}, \mathbf{b} \in \mathbb{R}^n \)
 &lt;/span>

&lt;/p></description></item><item><title>Backpropagation and Automatic Differentiation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/</guid><description>&lt;h1 id="backpropagation-and-automatic-differentiation">
 Backpropagation and Automatic Differentiation
 
 &lt;a class="anchor" href="#backpropagation-and-automatic-differentiation">#&lt;/a>
 
&lt;/h1>
&lt;p>Backpropagation applies the chain rule repeatedly and efficiently across a computational graph.&lt;/p>
&lt;p>Chain rule:&lt;/p>
&lt;span style="color: red;">
 [
\frac{dL}{dx} = \frac{dL}{dy} \cdot \frac{dy}{dx}
]
&lt;/span>






&lt;pre class="mermaid">
flowchart LR
 x --&amp;gt; y
 y --&amp;gt; L
&lt;/pre>

&lt;p>Automatic differentiation computes exact derivatives efficiently using computational graphs.&lt;/p>
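&lt;p>A minimal sketch of the chain rule on the graph above, with made-up functions y = x**2 and L = sin(y), applied by hand:&lt;/p>
&lt;pre>&lt;code class="language-python">import math

x = 1.5
y = x**2              # forward pass through the graph x -> y -> L
L = math.sin(y)

dL_dy = math.cos(y)   # local derivative at the last node
dy_dx = 2 * x         # local derivative at the first node
dL_dx = dL_dy * dy_dx # chain rule: the backpropagated gradient

print(dL_dx)
&lt;/code>&lt;/pre>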
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Higher-order derivatives</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/</guid><description>&lt;h1 id="higher-order-derivatives">
 Higher-order derivatives
 
 &lt;a class="anchor" href="#higher-order-derivatives">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Angles and Orthogonality</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/</guid><description>&lt;h1 id="angles-and-orthogonality">
 Angles and Orthogonality
 
 &lt;a class="anchor" href="#angles-and-orthogonality">#&lt;/a>
 
&lt;/h1>
&lt;p>Once we define an inner product, we can define the &lt;strong>angle between two vectors&lt;/strong>.&lt;/p>
&lt;p>Angles allow us to measure how aligned or different two vectors are in space.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key idea:
The angle measures how similar two vectors are.
Orthogonality (a zero inner product) means no similarity at all.&lt;/p>
&lt;/blockquote>
&lt;h2 id="why-it-matters-in-machine-learning">
 Why It Matters in Machine Learning
 
 &lt;a class="anchor" href="#why-it-matters-in-machine-learning">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>PCA produces orthogonal components&lt;/li>
&lt;li>Orthogonal features reduce redundancy&lt;/li>
&lt;li>Gradient directions depend on angle&lt;/li>
&lt;/ul>
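&lt;p>A minimal numpy sketch (numpy assumed; the vectors are made up): recover the angle from the inner product via cos(theta) = (a . b) / (|a| |b|), then check an orthogonal pair.&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])

cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(np.degrees(np.arccos(cos_theta)))   # 45.0 degrees

print(a @ np.array([0.0, 2.0]))   # 0.0: orthogonal, no similarity
&lt;/code>&lt;/pre>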
&lt;hr>
&lt;h1 id="angle-formula">
 Angle Formula
 
 &lt;a class="anchor" href="#angle-formula">#&lt;/a>
 
&lt;/h1>
&lt;p>For vectors in n-dimensional space:&lt;/p></description></item><item><title>Taylor’s series</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/</guid><description>&lt;h1 id="linearization-and-multivariate-taylors-series">
 Linearization and multivariate Taylor’s series
 
 &lt;a class="anchor" href="#linearization-and-multivariate-taylors-series">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>Maxima and Minima</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/</guid><description>&lt;h1 id="computing-maxima-and-minima-for-unconstrained-optimization">
 Computing maxima and minima for unconstrained optimization
 
 &lt;a class="anchor" href="#computing-maxima-and-minima-for-unconstrained-optimization">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">
 Vector Calculus
&lt;/a>&lt;/p></description></item><item><title>AI Foundation</title><link>https://arshadhs.github.io/docs/ai/foundation/</link><pubDate>Mon, 26 Jan 2026 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/</guid><description>&lt;h1 id="ai">
 AI
 
 &lt;a class="anchor" href="#ai">#&lt;/a>
 
&lt;/h1>
&lt;p>A selection of notes that didn&amp;rsquo;t fit elsewhere or are still being worked on.&lt;/p>
&lt;hr>




&lt;ul>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stages/">AI Stages: ANI, AGI, ASI&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stack/">AI Stack&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/">AI Pipeline&lt;/a>&lt;/li>
 &lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-notes/">AI Learning Resources&lt;/a>&lt;/li>
&lt;/ul>


&lt;hr>
&lt;a href="https://arshadhs.github.io/">Home&lt;/a></description></item><item><title>AI Stages: ANI, AGI, ASI</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-stages/</link><pubDate>Thu, 04 Jul 2024 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-stages/</guid><description>&lt;h1 id="ai-development-stages-ani--agi--asi">
 AI Development Stages: ANI → AGI → ASI
 
 &lt;a class="anchor" href="#ai-development-stages-ani--agi--asi">#&lt;/a>
 
&lt;/h1>
&lt;p>Artificial Intelligence is often described in &lt;strong>three stages&lt;/strong>, based on capability and scope:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>ANI:&lt;/strong> Task-specific intelligence (today’s AI)&lt;/li>
&lt;li>&lt;strong>AGI:&lt;/strong> Human-level general intelligence (future goal)&lt;/li>
&lt;li>&lt;strong>ASI:&lt;/strong> Beyond human intelligence (theoretical)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;img src="https://arshadhs.github.io/images/ai/ai_stages.png" alt="AI Stages" />&lt;/p>
&lt;hr>
&lt;h2 id="ani--artificial-narrow-intelligence">
 ANI — Artificial Narrow Intelligence
 
 &lt;a class="anchor" href="#ani--artificial-narrow-intelligence">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Also called &lt;strong>Weak AI&lt;/strong>&lt;/li>
&lt;li>Designed to perform &lt;strong>one specific task&lt;/strong>&lt;/li>
&lt;li>Operates within a &lt;strong>predefined environment&lt;/strong>&lt;/li>
&lt;li>Cannot generalise beyond its training&lt;/li>
&lt;li>&lt;strong>Most AI systems today are ANI&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>examples&lt;/strong>&lt;/p></description></item><item><title>Basic Statistics</title><link>https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/</guid><description>&lt;h1 id="basic-statistics">
 Basic Statistics
 
 &lt;a class="anchor" href="#basic-statistics">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Statistics&lt;/strong>: describes data (what you &lt;em>see&lt;/em>).&lt;br>
&lt;strong>Probability&lt;/strong>: models uncertainty (what you &lt;em>don’t know&lt;/em> yet).&lt;/p>
&lt;ul>
&lt;li>Summarise a dataset using central tendency and variability&lt;/li>
&lt;li>Explain core probability ideas using simple examples&lt;/li>
&lt;li>Apply the axioms of probability&lt;/li>
&lt;li>Distinguish mutually exclusive vs independent events&lt;/li>
&lt;/ul>
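&lt;p>A minimal Python sketch of these summaries, using the standard-library &lt;code>statistics&lt;/code> module on made-up scores:&lt;/p>
&lt;pre>&lt;code class="language-python">import statistics as st

scores = [4, 8, 6, 5, 3, 8]   # made-up data

print(st.mean(scores), st.median(scores), st.mode(scores))
print(max(scores) - min(scores))   # range
print(st.variance(scores))         # sample variance (n-1 denominator)
print(st.stdev(scores))            # sample standard deviation
&lt;/code>&lt;/pre>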
&lt;hr>






&lt;pre class="mermaid">
flowchart TD
 A[Dataset] --&amp;gt; B[Central Tendency]
 A --&amp;gt; C[Variability]
 B --&amp;gt; B1[Mean]
 B --&amp;gt; B2[Median]
 B --&amp;gt; B3[Mode]
 C --&amp;gt; C1[Range]
 C --&amp;gt; C2[Variance]
 C --&amp;gt; C3[Standard Deviation]
 C --&amp;gt; C4[IQR]
&lt;/pre>

&lt;hr>
&lt;h2 id="measures-of-central-tendency">
 Measures of Central Tendency
 
 &lt;a class="anchor" href="#measures-of-central-tendency">#&lt;/a>
 
&lt;/h2>
&lt;p>Central tendency tells you where the “middle” of the data is: it summarises a set of scores with a &lt;strong>single number&lt;/strong> that describes the performance of the group.&lt;/p></description></item><item><title>Basic Probability</title><link>https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/</guid><description>&lt;h1 id="basic-probability">
 Basic Probability
 
 &lt;a class="anchor" href="#basic-probability">#&lt;/a>
 
&lt;/h1>
&lt;p>Probability models uncertainty:
what you &lt;em>don’t know&lt;/em> yet, but want to reason about.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Probability is a number between &lt;strong>0 and 1&lt;/strong> that measures how likely an event is.
The whole topic is about defining &lt;strong>events&lt;/strong> clearly and applying a few core rules consistently.&lt;/p>
&lt;/blockquote>
&lt;p>Probability quantifies uncertainty: a number between 0 and 1.&lt;/p>
&lt;ul>
&lt;li>0 means: impossible&lt;/li>
&lt;li>1 means: certain&lt;/li>
&lt;/ul>
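&lt;p>A minimal simulation sketch (a made-up experiment): estimate the probability of rolling at least one six in four dice rolls, and note the estimate lands between 0 and 1.&lt;/p>
&lt;pre>&lt;code class="language-python">import random

random.seed(0)
trials = 100_000
hits = sum(
    1 for _ in range(trials)
    if any(random.randint(1, 6) == 6 for _ in range(4))
)
print(hits / trials)   # about 0.518, i.e. 1 - (5/6)**4
&lt;/code>&lt;/pre>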
&lt;hr>
&lt;h2 id="terminology">
 Terminology
 
 &lt;a class="anchor" href="#terminology">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="random-experiment">
 Random experiment
 
 &lt;a class="anchor" href="#random-experiment">#&lt;/a>
 
&lt;/h3>
&lt;p>A random experiment is an action whose outcome is not known in advance.&lt;/p></description></item><item><title>Neural Networks</title><link>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</guid><description>&lt;h1 id="neural-networks">
 Neural Networks
 
 &lt;a class="anchor" href="#neural-networks">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A &lt;strong>network of artificial neurons&lt;/strong> inspired by how neurons function in the &lt;strong>human brain&lt;/strong>.&lt;/li>
&lt;li>At its core - a &lt;strong>mathematical model&lt;/strong> designed to process and learn from data.&lt;/li>
&lt;li>Neural networks form the &lt;strong>foundation of &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">Deep Learning&lt;/a>&lt;/strong> (involves training large and complex networks on vast amounts of data).&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 subgraph subGraph0[&amp;#34;Input Layer&amp;#34;]
 I1((&amp;#34;Input 1&amp;#34;))
 I2((&amp;#34;Input 2&amp;#34;))
 I3((&amp;#34;Input 3&amp;#34;))
 end
 subgraph subGraph1[&amp;#34;Hidden Layer&amp;#34;]
 H1((&amp;#34;Hidden 1&amp;#34;))
 H2((&amp;#34;Hidden 2&amp;#34;))
 H3((&amp;#34;Hidden 3&amp;#34;))
 end
 subgraph subGraph2[&amp;#34;Output Layer&amp;#34;]
 O((&amp;#34;Output&amp;#34;))
 end
 I1 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I2 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I3 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 H1 --&amp;gt; O
 H2 --&amp;gt; O
 H3 --&amp;gt; O

 style I1 fill:#C8E6C9
 style I2 fill:#C8E6C9
 style I3 fill:#C8E6C9
 style H1 stroke:#2962FF,fill:#BBDEFB
 style H2 fill:#BBDEFB
 style H3 fill:#BBDEFB
 style O fill:#FFCDD2
 style subGraph0 stroke:none,fill:transparent
 style subGraph1 stroke:none,fill:transparent
 style subGraph2 stroke:none,fill:transparent
&lt;/pre>

&lt;hr>
&lt;h3 id="structure-of-a-neural-network">
 Structure of a Neural Network
 
 &lt;a class="anchor" href="#structure-of-a-neural-network">#&lt;/a>
 
&lt;/h3>
&lt;p>A typical neural network has &lt;strong>three main layers&lt;/strong>:&lt;/p></description></item><item><title>Conditional Probability &amp; Bayes’ Theorem</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/</guid><description>&lt;h1 id="conditional-probability--bayes-theorem">
 Conditional Probability &amp;amp; Bayes’ Theorem
 
 &lt;a class="anchor" href="#conditional-probability--bayes-theorem">#&lt;/a>
 
&lt;/h1>
&lt;p>Probability often changes when we &lt;strong>learn new information&lt;/strong>.&lt;/p>
&lt;p>Conditional probability and Bayes’ theorem give a structured way to &lt;strong>update beliefs&lt;/strong> using evidence.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Conditional probability updates probabilities after observing an event.&lt;/p>
&lt;p>Bayes’ theorem lets you estimate a hidden cause from observed evidence.&lt;/p>
&lt;p>Naïve Bayes turns Bayes’ theorem into a practical classifier by assuming conditional independence of features given the class.&lt;/p>
&lt;/blockquote>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD

A[Conditional&amp;lt;br/&amp;gt;probability] --&amp;gt;|foundation| B[Bayes&amp;lt;br/&amp;gt;theorem]
D[Independent&amp;lt;br/&amp;gt;events] --&amp;gt;|implies| C[Independence]
C --&amp;gt;|simplifies| A

E[Prior] --&amp;gt;|with likelihood| B
F[Likelihood] --&amp;gt;|updates| H[Posterior]
G[Evidence] --&amp;gt;|normalises| B
B --&amp;gt;|yields| H

I[Naïve&amp;lt;br/&amp;gt;Bayes] --&amp;gt;|uses| B
J[Naïve&amp;lt;br/&amp;gt;assumption] --&amp;gt;|assumes| C
K[Features] --&amp;gt;|given class| J
L[Class] --&amp;gt;|conditions| J
I --&amp;gt;|predicts| M[Classification]
M --&amp;gt;|selects| L

style A fill:#90CAF9,stroke:#1E88E5,color:#000
style B fill:#90CAF9,stroke:#1E88E5,color:#000
style C fill:#90CAF9,stroke:#1E88E5,color:#000

style D fill:#CE93D8,stroke:#8E24AA,color:#000
style E fill:#CE93D8,stroke:#8E24AA,color:#000
style F fill:#CE93D8,stroke:#8E24AA,color:#000
style G fill:#CE93D8,stroke:#8E24AA,color:#000
style J fill:#CE93D8,stroke:#8E24AA,color:#000
style K fill:#CE93D8,stroke:#8E24AA,color:#000
style L fill:#CE93D8,stroke:#8E24AA,color:#000

style H fill:#C8E6C9,stroke:#2E7D32,color:#000
style I fill:#C8E6C9,stroke:#2E7D32,color:#000
style M fill:#C8E6C9,stroke:#2E7D32,color:#000

&lt;/pre>

&lt;hr>
&lt;h2 id="quick-summary">
 Quick summary
 
 &lt;a class="anchor" href="#quick-summary">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Conditional probability:
updates probability after an event is known.&lt;/li>
&lt;li>Multiplication rule:
computes joint probability from conditional parts.&lt;/li>
&lt;li>Independence:
tested using 
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>

 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>

&lt;span>
 \( P(A\cap B)=P(A)P(B) \)
 &lt;/span>

.&lt;/li>
&lt;li>Total probability:
breaks a probability into weighted cases.&lt;/li>
&lt;li>Bayes’ theorem:
reverses conditioning to infer causes from evidence.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="whats-next">
 What’s next
 
 &lt;a class="anchor" href="#whats-next">#&lt;/a>
 
&lt;/h2>
&lt;p>Probability Distributions&lt;br>
Move from events to random variables and distributions.&lt;/p></description></item><item><title>Machine Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/</link><pubDate>Tue, 06 Aug 2024 23:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/</guid><description>&lt;h1 id="machine-learning">
 Machine Learning
 
 &lt;a class="anchor" href="#machine-learning">#&lt;/a>
 
&lt;/h1>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
stateDiagram-v2

 %% ===== CLASS DEFINITIONS (Math-based colours) =====
 classDef algebra fill:#cfe8ff,stroke:#1e3a8a,stroke-width:1px
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef logic fill:#ede9fe,stroke:#5b21b6,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#aaaaaa,stroke:#374151,stroke-width:3px

 %% ===== ROOT =====
 ML: Machine Learning

 %% ===== SUPERVISED =====
 ML --&amp;gt; SL:::category
 SL: Supervised Learning

 SL --&amp;gt; Regression
 Regression --&amp;gt; LR:::algebra
 LR: Linear Regression

 LR --&amp;gt; NN:::algebra
 NN: Neural Network

 NN --&amp;gt; DT:::logic
 DT: Decision Tree

 SL --&amp;gt; Classification
 Classification --&amp;gt; NB:::probability
 NB: Naive Bayes

 NB --&amp;gt; KNN:::geometry
 KNN: k-Nearest Neighbours

 KNN --&amp;gt; SVM:::algebra
 SVM: Support Vector Machine
 
 %% ===== UNSUPERVISED =====
 ML --&amp;gt; USL:::category
 USL: Unsupervised Learning

 USL --&amp;gt; Clustering
 Clustering --&amp;gt; KM:::geometry
 KM: K-Means

 KM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 GMM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model

 %% ===== REINFORCEMENT =====
 ML --&amp;gt; RL:::category
 RL: Reinforcement Learning

 RL --&amp;gt; DM:::logic
 DM: Decision Making
&lt;/pre>

&lt;hr>
&lt;details >&lt;summary>Mathematical Legend&lt;/summary>
 &lt;div class="markdown-inner">
&lt;h3 id="algebra--linear-algebra-blue">
 Algebra / Linear Algebra (Blue)
 
 &lt;a class="anchor" href="#algebra--linear-algebra-blue">#&lt;/a>
 
&lt;/h3>
&lt;p>Used heavily when models rely on:&lt;/p></description></item><item><title>AI Stack</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-stack/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-stack/</guid><description>&lt;h1 id="ai-stack">
 AI Stack
 
 &lt;a class="anchor" href="#ai-stack">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>AI Stack&lt;/strong> describes the &lt;strong>layers required to build an end-to-end AI system&lt;/strong>, from infrastructure at the bottom to user-facing applications at the top.&lt;/p>
&lt;p>Different organisations represent the AI stack differently; this is a simplified conceptual view for learning.&lt;/p>
&lt;p>Each layer depends on the one below it.&lt;/p>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
graph TB

 subgraph APP[&amp;#34;Applications&amp;#34;]
 A[User Interfaces &amp;amp; Integrations]
 end

 subgraph ORCH[&amp;#34;Orchestration&amp;#34;]
 O[Workflows • Agents • Control Logic]
 end

 subgraph DATA[&amp;#34;Data&amp;#34;]
 D[Data Sources • Pipelines • Vector DBs]
 end

 subgraph MODEL[&amp;#34;Models&amp;#34;]
 M[ML • DL • Foundation Models • LLMs]
 end

 subgraph INFRA[&amp;#34;Infrastructure&amp;#34;]
 I[Cloud • On-prem • GPUs • Storage]
 end

 %% Styling
 style APP fill:#FFCCBC
 style ORCH fill:#90CAF9
 style DATA fill:#BBDEFB
 style MODEL fill:#C8E6C9
 style INFRA fill:#E1F5FE

 style A fill:#FFE0B2
 style O fill:#B3E5FC
 style D fill:#E3F2FD
 style M fill:#DCEDC8
 style I fill:#E1F5FE
&lt;/pre>

&lt;hr>
&lt;h2 id="1-infrastructure">
 1. Infrastructure
 
 &lt;a class="anchor" href="#1-infrastructure">#&lt;/a>
 
&lt;/h2>
&lt;p>The foundation that provides &lt;strong>compute and storage&lt;/strong>.&lt;/p></description></item><item><title>Artificial Neuron and Perceptron</title><link>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</guid><description>&lt;h1 id="artificial-neuron-and-perceptron">
 Artificial Neuron and Perceptron
 
 &lt;a class="anchor" href="#artificial-neuron-and-perceptron">#&lt;/a>
 
&lt;/h1>
&lt;blockquote class="book-hint info">
&lt;p>Knowledge in neural networks is stored in &lt;strong>connection weights&lt;/strong>, and learning means &lt;strong>modifying those weights&lt;/strong>.&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h2 id="biological-neuron">
 Biological Neuron
 
 &lt;a class="anchor" href="#biological-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.&lt;/p>
&lt;p>Core components:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Dendrites&lt;/strong>: receive signals from other neurons&lt;/li>
&lt;li>&lt;strong>Cell body (soma)&lt;/strong>: processes incoming signals&lt;/li>
&lt;li>&lt;strong>Axon&lt;/strong>: transmits the output signal&lt;/li>
&lt;li>&lt;strong>Synapses&lt;/strong>: connection points between neurons&lt;/li>
&lt;/ul>
&lt;p>Biological intuition:&lt;/p>
&lt;ul>
&lt;li>many inputs arrive to one neuron&lt;/li>
&lt;li>one neuron can connect out to many neurons&lt;/li>
&lt;li>massive parallelism enables fast perception and recognition&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="artificial-neuron">
 Artificial Neuron
 
 &lt;a class="anchor" href="#artificial-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>An artificial neuron is a simplified computational model inspired by biological neurons.&lt;/p></description></item><item><title>ML Workflow</title><link>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</guid><description>&lt;h1 id="machine-learning-workflow">
 Machine learning Workflow
 
 &lt;a class="anchor" href="#machine-learning-workflow">#&lt;/a>
 
&lt;/h1>
&lt;p>Data is the foundation of any machine learning system.
Quality of data matters more than model complexity.&lt;/p>
&lt;h3 id="role-of-data">
 Role of Data
 
 &lt;a class="anchor" href="#role-of-data">#&lt;/a>
 
&lt;/h3>
&lt;p>Data determines:&lt;/p>
&lt;ul>
&lt;li>What patterns the model can learn&lt;/li>
&lt;li>How well it generalises&lt;/li>
&lt;li>Whether bias or noise is introduced&lt;/li>
&lt;/ul>
&lt;p>Bad data → bad model (even with perfect algorithms).&lt;/p>
&lt;hr>
&lt;h3 id="data-preprocessing-wrangling">
 Data Preprocessing, wrangling
 
 &lt;a class="anchor" href="#data-preprocessing-wrangling">#&lt;/a>
 
&lt;/h3>
&lt;p>Raw data is never ready for training.&lt;/p>
&lt;p>&lt;strong>Data Issues&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Noise
&lt;ul>
&lt;li>For &lt;strong>objects&lt;/strong>, noise is an &lt;strong>extraneous object&lt;/strong>&lt;/li>
&lt;li>For &lt;strong>attributes&lt;/strong>, noise refers to &lt;strong>modification of original values&lt;/strong>&lt;/li>
&lt;li>Handle: apply a &lt;strong>log or Z-score transform&lt;/strong> to rescale the affected values&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Outliers
&lt;ul>
&lt;li>Data objects with characteristics that are considerably different from most of the other data objects in the data set&lt;/li>
&lt;li>Handle: Use &lt;strong>IQR&lt;/strong> method&lt;/li>
&lt;li>Find Lower and Upper Bound and &lt;strong>replace Outlier with Lower or Upper Bound&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Missing Values
&lt;ul>
&lt;li>Eliminate data objects or variables&lt;/li>
&lt;li>Handle: Estimate missing values
&lt;ul>
&lt;li>&lt;strong>Mean, Median or Mode&lt;/strong>&lt;/li>
&lt;li>Prefer the &lt;strong>Median&lt;/strong> if there are &lt;strong>outliers&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Ignore the missing value during analysis&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Duplicate Data
&lt;ul>
&lt;li>Major issue when merging data from heterogeneous sources&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Inconsistent Codes
&lt;ul>
&lt;li>Find all unique codes and map the inconsistent ones to a single consistent value&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Data Preprocessing techniques&lt;/strong>&lt;/p></description></item><item><title>Conditional Probability</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/</guid><description>&lt;h1 id="conditional-probability">
 Conditional Probability
 
 &lt;a class="anchor" href="#conditional-probability">#&lt;/a>
 
&lt;/h1>
&lt;p>Conditional probability updates the probability of an event when new information is available.&lt;/p>
&lt;p>It shows up whenever a question says:&lt;/p>
&lt;ul>
&lt;li>“given that…”&lt;/li>
&lt;li>“among those who…”&lt;/li>
&lt;li>“out of the items that…”&lt;/li>
&lt;li>“if it does not fail immediately…”&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Conditional probability is always:&lt;/p>
&lt;p>joint probability ÷ probability of the condition.&lt;/p>
&lt;p>The condition must not be an impossible event.&lt;/p>
&lt;/blockquote>
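&lt;p>A quick worked example (illustrative numbers, not from the original notes): if 
&lt;span>
 \( P(A\cap B)=0.2 \)
 &lt;/span>

 and 
&lt;span>
 \( P(B)=0.5 \)
 &lt;/span>

, then 
&lt;span>
 \( P(A\mid B)=\dfrac{0.2}{0.5}=0.4 \)
 &lt;/span>

.&lt;/p>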
&lt;hr>
&lt;h2 id="prior-vs-posterior">
 Prior vs posterior
 
 &lt;a class="anchor" href="#prior-vs-posterior">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Prior probability:
probability with no condition (before new information)&lt;/p></description></item><item><title>Bayes’ Theorem</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/</guid><description>&lt;h1 id="bayes-theorem">
 Bayes’ Theorem
 
 &lt;a class="anchor" href="#bayes-theorem">#&lt;/a>
 
&lt;/h1>
&lt;h3 id="21-total-probability-needed-for-bayes">
 2.1 Total probability (needed for Bayes)
 
 &lt;a class="anchor" href="#21-total-probability-needed-for-bayes">#&lt;/a>
 
&lt;/h3>
&lt;p>Often we split the world into cases 
&lt;span>
 \( E_1,E_2,\dots,E_k \)
 &lt;/span>

 that:&lt;/p>
&lt;ul>
&lt;li>are mutually exclusive&lt;/li>
&lt;li>cover the whole sample space&lt;/li>
&lt;/ul>
&lt;p>Then for any event 
&lt;span>
 \( A \)
 &lt;/span>

:&lt;/p>
&lt;span style="color: red;">
 &lt;span>
 \[ 
P(A)=\sum_{i=1}^{k} P(A\mid E_i)\,P(E_i)
 \]
 &lt;/span>
&lt;/span>
&lt;p>Tree intuition:&lt;/p>


&lt;pre class="mermaid">
flowchart TD
 S[Start] --&amp;gt; E1[Case E1]
 S --&amp;gt; E2[Case E2]
 S --&amp;gt; E3[Case E3]
 E1 --&amp;gt; A1[&amp;#34;A happens&amp;#34;]
 E2 --&amp;gt; A2[&amp;#34;A happens&amp;#34;]
 E3 --&amp;gt; A3[&amp;#34;A happens&amp;#34;]
&lt;/pre>
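&lt;p>A small worked example (illustrative numbers): with three equally likely cases, 
&lt;span>
 \( P(E_i)=\tfrac{1}{3} \)
 &lt;/span>

, and 
&lt;span>
 \( P(A\mid E_1)=0.9,\ P(A\mid E_2)=0.5,\ P(A\mid E_3)=0.1 \)
 &lt;/span>

, the formula gives 
&lt;span>
 \( P(A)=\tfrac{1}{3}(0.9+0.5+0.1)=0.5 \)
 &lt;/span>

.&lt;/p>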

&lt;hr>
&lt;h3 id="22-bayes-theorem-two-event-form">
 2.2 Bayes’ theorem (two-event form)
 
 &lt;a class="anchor" href="#22-bayes-theorem-two-event-form">#&lt;/a>
 
&lt;/h3>
&lt;p>Bayes&amp;rsquo; Theorem is a mathematical formula used to determine the &lt;strong>conditional probability of an event based on prior knowledge and new evidence&lt;/strong>.&lt;/p></description></item><item><title>Naïve Bayes</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/</guid><description>&lt;h1 id="naïve-bayes">
 Naïve Bayes
 
 &lt;a class="anchor" href="#na%c3%afve-bayes">#&lt;/a>
 
&lt;/h1>
&lt;p>Naïve Bayes is a &lt;strong>probabilistic classifier&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>A supervised learning problem&lt;/li>
&lt;li>Binary classification: the target variable takes one of two classes&lt;/li>
&lt;li>The hypothesis is the target class you want to assign&lt;/li>
&lt;li>The total (prior) probability of Yes and No is computed first from the training data&lt;/li>
&lt;li>The posterior probability is obtained once you start studying the observed data&lt;/li>
&lt;li>The given instance is classified into the class whose hypothesis has the maximum posterior probability&lt;/li>
&lt;/ul>
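&lt;p>A minimal sketch of this counting-based workflow, on hypothetical toy data (the feature values and labels below are invented for illustration):&lt;/p>
&lt;pre>&lt;code class="language-python">from collections import Counter

# Toy training data: (outlook, wind) features with a yes/no label.
data = [('sunny', 'strong', 'no'), ('sunny', 'weak', 'no'),
        ('rainy', 'weak', 'yes'), ('overcast', 'weak', 'yes'),
        ('rainy', 'strong', 'no'), ('overcast', 'strong', 'yes')]

labels = [row[-1] for row in data]
priors = {c: n / len(data) for c, n in Counter(labels).items()}

def likelihood(feature_index, value, label):
    # Fraction of rows of this class that show the given feature value.
    rows = [row for row in data if row[-1] == label]
    matches = sum(1 for row in rows if row[feature_index] == value)
    return matches / len(rows)

def posterior_scores(x):
    # Naive assumption: multiply per-feature likelihoods given the class.
    scores = {}
    for c in priors:
        score = priors[c]
        for i, value in enumerate(x):
            score = score * likelihood(i, value, c)
        scores[c] = score
    return scores

print(posterior_scores(('sunny', 'weak')))   # pick the class with the max score
&lt;/code>&lt;/pre>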
&lt;p>It predicts a class label by computing:&lt;/p></description></item><item><title>Probability Distributions</title><link>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/</guid><description>&lt;h1 id="probability-distributions">
 Probability Distributions
 
 &lt;a class="anchor" href="#probability-distributions">#&lt;/a>
 
&lt;/h1>
&lt;p>Probability distributions are the bridge between:
real-world randomness and mathematical modelling.&lt;/p>
&lt;p>A random experiment produces outcomes.
A random variable turns those outcomes into numbers.
A probability distribution tells you how likely each number (or range of numbers) is.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A distribution is a complete “story” about uncertainty:
what values are possible, how likely they are, and how we summarise them (mean, variance).&lt;/p>
&lt;/blockquote>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
	PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; RV[&amp;#34;Random&amp;lt;br/&amp;gt;variables&amp;#34;]
	PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; DS[&amp;#34;Common&amp;lt;br/&amp;gt;distributions&amp;#34;]

	style PD fill:#90CAF9,stroke:#1E88E5,color:#000
	style RV fill:#90CAF9,stroke:#1E88E5,color:#000
	style DS fill:#90CAF9,stroke:#1E88E5,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="aiml-connection">
 AI/ML Connection
 
 &lt;a class="anchor" href="#aiml-connection">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Many ML models are probabilistic:
they assume data (or errors) follow a distribution.&lt;/li>
&lt;li>Loss functions often come from distribution assumptions:
squared loss aligns with Gaussian noise.&lt;/li>
&lt;li>Naïve Bayes (from the previous module) becomes practical once you can model:

&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>

 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>

&lt;span>
 \( P(X\mid Y) \)
 &lt;/span>

 using suitable distributions.&lt;/li>
&lt;/ul>
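&lt;p>One line makes the squared-loss example concrete: under Gaussian noise, the negative log-likelihood of an observation is, up to constants,&lt;/p>
&lt;span>
 \[ 
-\log p(y\mid x)=\frac{(y-\hat{y})^2}{2\sigma^2}+\text{const}
 \]
 &lt;/span>
&lt;p>so maximising the likelihood is the same as minimising squared error.&lt;/p>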
&lt;blockquote class="book-hint warning">
&lt;p>In practice:
choosing a distribution is a modelling decision.
It affects:
prediction, uncertainty estimates, and what “rare” or “typical” means in your data.&lt;/p></description></item><item><title>LNN for Regression</title><link>https://arshadhs.github.io/docs/ai/deep-learning/030-linear-neural-networks-for-regression/</link><pubDate>Sun, 15 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/030-linear-neural-networks-for-regression/</guid><description>&lt;h1 id="linear-neural-networks-for-regression">
 Linear Neural Networks for Regression
 
 &lt;a class="anchor" href="#linear-neural-networks-for-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>linear neural network for regression&lt;/strong> is a model that predicts a &lt;strong>continuous&lt;/strong> target by taking a weighted sum of input features and applying the &lt;strong>identity activation&lt;/strong> (so the output can be any real number).&lt;/p>
&lt;ul>
&lt;li>Single neuron for regression (predicting &lt;em>how much&lt;/em> / &lt;em>how many&lt;/em>)&lt;/li>
&lt;li>Data + linear model (single neuron, no hidden layers) + squared loss&lt;/li>
&lt;li>Training using the &lt;strong>batch gradient descent&lt;/strong> algorithm (see the sketch below)&lt;/li>
&lt;li>Prediction (inference)&lt;/li>
&lt;li>E.g. Auto MPG (UCI)-style prediction with a single neuron (from-scratch code)&lt;/li>
&lt;/ul>
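&lt;p>A minimal from-scratch sketch of this loop, on invented toy data rather than the Auto MPG set:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

# Toy data: one feature, continuous target (roughly y = 2x).
X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

w, b = 0.0, 0.0
lr = 0.01

for epoch in range(1000):
    y_hat = w * X + b                 # identity activation
    error = y_hat - y
    grad_w = 2 * np.mean(error * X)   # d(MSE)/dw over the full batch
    grad_b = 2 * np.mean(error)       # d(MSE)/db
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)          # parameters after training
print(w * 6.0 + b)   # inference for a new input x = 6.0
&lt;/code>&lt;/pre>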
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 D[&amp;#34;Data&amp;lt;br/&amp;gt;X, y&amp;#34;] --&amp;gt; M[&amp;#34;Linear model&amp;lt;br/&amp;gt;w, b&amp;lt;br/&amp;gt;Single neuron&amp;#34;]
 M --&amp;gt; A[&amp;#34;Activation&amp;lt;br/&amp;gt;Identity&amp;#34;]
 A --&amp;gt; L[&amp;#34;Loss&amp;lt;br/&amp;gt;MSE (Squared error)&amp;#34;]
 L --&amp;gt; O[&amp;#34;Optimiser&amp;lt;br/&amp;gt;Batch GD / Mini-batch GD&amp;#34;]
 O --&amp;gt; P[&amp;#34;Parameters&amp;lt;br/&amp;gt;w, b&amp;#34;]
 P --&amp;gt; I[&amp;#34;Inference&amp;lt;br/&amp;gt;Predict ŷ (number) for new x&amp;#34;]

 %% Pastel colour scheme
 style D fill:#E3F2FD,stroke:#1E88E5,stroke-width:1px
 style M fill:#E8F5E9,stroke:#43A047,stroke-width:1px
 style A fill:#FFF3E0,stroke:#FB8C00,stroke-width:1px
 style L fill:#FCE4EC,stroke:#D81B60,stroke-width:1px
 style O fill:#F3E5F5,stroke:#8E24AA,stroke-width:1px
 style P fill:#E0F7FA,stroke:#00838F,stroke-width:1px
 style I fill:#F1F8E9,stroke:#558B2F,stroke-width:1px
&lt;/pre>

&lt;hr>
&lt;h2 id="regression">
 Regression
 
 &lt;a class="anchor" href="#regression">#&lt;/a>
 
&lt;/h2>
&lt;p>Regression is a supervised learning task that predicts a continuous-valued output based on input features.&lt;/p></description></item><item><title>Generative AI</title><link>https://arshadhs.github.io/docs/ai/genai/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/</guid><description>&lt;h1 id="generative-ai">
 Generative AI
 
 &lt;a class="anchor" href="#generative-ai">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Generative Artificial Intelligence (GenAI)&lt;/strong> refers to a class of AI systems that can &lt;strong>generate new content&lt;/strong> such as text, images, audio, video, or code, rather than only making predictions or classifications.&lt;/p>
&lt;p>GenAI systems learn &lt;strong>patterns and representations from large datasets&lt;/strong> and use them to produce &lt;strong>novel outputs&lt;/strong> that resemble the data they were trained on.&lt;/p>
&lt;hr>
&lt;h2 id="how-generative-ai-differs-from-traditional-ai">
 How Generative AI Differs from Traditional AI
 
 &lt;a class="anchor" href="#how-generative-ai-differs-from-traditional-ai">#&lt;/a>
 
&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Traditional AI&lt;/th>
 &lt;th>Generative AI&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Predicts or classifies&lt;/td>
 &lt;td>Generates new content&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Task-specific models&lt;/td>
 &lt;td>General-purpose models&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Fixed outputs&lt;/td>
 &lt;td>Open-ended outputs&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Often rule-based&lt;/td>
 &lt;td>Data-driven and probabilistic&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="core-idea-of-generative-ai">
 Core Idea of Generative AI
 
 &lt;a class="anchor" href="#core-idea-of-generative-ai">#&lt;/a>
 
&lt;/h2>

&lt;blockquote class='book-hint '>
 &lt;p>&lt;strong>Instead of learning “what label to assign”, Generative AI learns “how data is structured” and then creates new data following that structure.&lt;/strong>&lt;/p></description></item><item><title>AI Pipeline</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/</link><pubDate>Thu, 04 Jul 2024 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/</guid><description>&lt;h1 id="ai-pipeline">
 AI Pipeline
 
 &lt;a class="anchor" href="#ai-pipeline">#&lt;/a>
 
&lt;/h1>
&lt;p>The AI pipeline is a continuous process where data is collected, prepared, used to train models, evaluated for performance, and continuously improved after deployment.&lt;/p>
&lt;div class="book-steps ">
&lt;ol>
&lt;li>
&lt;h2 id="collect-data">
 Collect Data
 
 &lt;a class="anchor" href="#collect-data">#&lt;/a>
 
&lt;/h2>
&lt;/li>
&lt;li>
&lt;h2 id="prepare-data">
 Prepare data
 
 &lt;a class="anchor" href="#prepare-data">#&lt;/a>
 
&lt;/h2>
&lt;/li>
&lt;li>
&lt;h2 id="train-model">
 Train Model
 
 &lt;a class="anchor" href="#train-model">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Iterate until model is good enough&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;h2 id="deploy-model">
 Deploy Model
 
 &lt;a class="anchor" href="#deploy-model">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Get data back&lt;/li>
&lt;li>Maintain &amp;amp; update model&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;/div>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
timeline
 title AI Pipeline
 Collect Data : Data Ingestion
 : Data Understanding
 Prepare Data : Cleaning
 : Feature Engineering
 : Sampling
 Train Model : Model Training
 : Validation &amp;amp; Metrics
 Deploy Model : Deployment
 : Monitoring &amp;amp; Retraining
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/foundation/">
 AI Foundation
&lt;/a>&lt;/p></description></item><item><title>Regression(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</guid><description>&lt;h1 id="linear-regression">
 Linear Regression
 
 &lt;a class="anchor" href="#linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Linear Regression is a supervised 
&lt;span style="color: blue;">
 ML
&lt;/span> method used to predict a &lt;strong>numerical&lt;/strong> target by fitting a model that is &lt;strong>linear in its parameters&lt;/strong>.&lt;/p>
&lt;p>In 
&lt;span style="color: blue;">
 ML
&lt;/span>, linear models are a core baseline:
they’re fast, often surprisingly strong, and usually easy to interpret.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Linear Regression learns parameters by minimising a squared-error cost.
You can solve it directly (closed form) or iteratively (gradient descent),
and you can extend it using basis functions and regularisation.&lt;/p></description></item><item><title>Random Variables</title><link>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/</guid><description>&lt;h1 id="random-variables">
 Random Variables
 
 &lt;a class="anchor" href="#random-variables">#&lt;/a>
 
&lt;/h1>
&lt;p>A random variable is a way to attach numbers to outcomes of a random experiment.&lt;/p>
&lt;p>It lets us move from:
“what happened?”
to:
“what number should we analyse?”&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A random variable is a &lt;em>function&lt;/em> from the sample space to real numbers.
Once you define the random variable clearly, the rest (pmf/pdf/cdf, mean, variance) becomes systematic.&lt;/p>
&lt;/blockquote>
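&lt;p>A tiny sketch (added for illustration) makes the “function from outcomes to numbers” idea concrete for two coin tosses:&lt;/p>
&lt;pre>&lt;code class="language-python">from itertools import product

# Sample space for two fair coin tosses; the random variable X counts heads.
outcomes = list(product('HT', repeat=2))        # HH, HT, TH, TT
X = {o: o.count('H') for o in outcomes}

# pmf of X: each outcome has probability 1/4.
pmf = {}
for o, value in X.items():
    pmf[value] = pmf.get(value, 0) + 1 / len(outcomes)

print(pmf)   # {2: 0.25, 1: 0.5, 0: 0.25}
&lt;/code>&lt;/pre>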
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; RV[&amp;#34;Random&amp;lt;br/&amp;gt;variables&amp;#34;]

RV --&amp;gt; T[&amp;#34;Types&amp;#34;]
T --&amp;gt; RV1[&amp;#34;Discrete&amp;lt;br/&amp;gt;RVs&amp;#34;]
T --&amp;gt; RV2[&amp;#34;Continuous&amp;lt;br/&amp;gt;RVs&amp;#34;]

RV --&amp;gt; F[&amp;#34;PMF / PDF / CDF&amp;#34;]
RV --&amp;gt; S[&amp;#34;Mean / Variance&amp;lt;br/&amp;gt;Covariance&amp;#34;]
RV --&amp;gt; J[&amp;#34;Joint &amp;amp; Marginal&amp;lt;br/&amp;gt;distributions&amp;#34;]
RV --&amp;gt; X[&amp;#34;Transformations&amp;#34;]

style PD fill:#90CAF9,stroke:#1E88E5,color:#000
style RV fill:#90CAF9,stroke:#1E88E5,color:#000

style T fill:#CE93D8,stroke:#8E24AA,color:#000
style F fill:#CE93D8,stroke:#8E24AA,color:#000
style S fill:#CE93D8,stroke:#8E24AA,color:#000
style J fill:#CE93D8,stroke:#8E24AA,color:#000
style X fill:#CE93D8,stroke:#8E24AA,color:#000
style RV1 fill:#CE93D8,stroke:#8E24AA,color:#000
style RV2 fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="1-definition">
 1) Definition
 
 &lt;a class="anchor" href="#1-definition">#&lt;/a>
 
&lt;/h2>
&lt;p>Random variable:
a rule that assigns a number to each outcome.&lt;/p></description></item><item><title>Common Probability Distributions</title><link>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/</guid><description>&lt;h1 id="common-probability-distributions">
 Common Probability Distributions
 
 &lt;a class="anchor" href="#common-probability-distributions">#&lt;/a>
 
&lt;/h1>
&lt;p>Once you can describe a random variable using a pmf or pdf, the next step is to use
named distributions that appear repeatedly in real data and in ML models.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Named distributions give you ready-made probability models for common patterns:
binary outcomes, counts, and measurement noise.&lt;/p>
&lt;/blockquote>
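&lt;p>A minimal sketch, assuming scipy.stats as tooling (not part of the original notes), shows these named distributions as ready-made objects:&lt;/p>
&lt;pre>&lt;code class="language-python">from scipy.stats import bernoulli, binom, poisson, norm

print(bernoulli.pmf(1, p=0.3))            # one trial, success probability 0.3
print(binom.pmf(2, n=10, p=0.3))          # 2 successes in 10 trials
print(poisson.pmf(4, mu=2.5))             # 4 events when the rate is 2.5
print(norm.pdf(0.5, loc=0.0, scale=1.0))  # density of N(0, 1) at 0.5
&lt;/code>&lt;/pre>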
&lt;hr>


&lt;pre class="mermaid">
flowchart TD
PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; DS[&amp;#34;Common&amp;lt;br/&amp;gt;distributions&amp;#34;]

DS --&amp;gt; DIS[&amp;#34;Discrete&amp;#34;]
DS --&amp;gt; CON[&amp;#34;Continuous&amp;#34;]

DIS --&amp;gt; D1[&amp;#34;Bernoulli&amp;#34;]
DIS --&amp;gt; D2[&amp;#34;Binomial&amp;#34;]
DIS --&amp;gt; D3[&amp;#34;Poisson&amp;#34;]

CON --&amp;gt; D4[&amp;#34;Normal&amp;lt;br/&amp;gt;(Gaussian)&amp;#34;]
CON --&amp;gt; D5[&amp;#34;t / Chi-square / F&amp;lt;br/&amp;gt;(intro)&amp;#34;]

style PD fill:#90CAF9,stroke:#1E88E5,color:#000
style DS fill:#90CAF9,stroke:#1E88E5,color:#000

style DIS fill:#CE93D8,stroke:#8E24AA,color:#000
style CON fill:#CE93D8,stroke:#8E24AA,color:#000

style D1 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D2 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D3 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D4 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D5 fill:#C8E6C9,stroke:#2E7D32,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="1-bernoulli-distribution-binary">
 1) Bernoulli distribution (binary)
 
 &lt;a class="anchor" href="#1-bernoulli-distribution-binary">#&lt;/a>
 
&lt;/h2>
&lt;p>Use when:
one trial has two outcomes (success/failure).&lt;/p></description></item><item><title>Ordinary Least Squares</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</guid><description>&lt;h1 id="direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">
 Direct solution method - Ordinary Least Squares and the Line of Best Fit
 
 &lt;a class="anchor" href="#direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">#&lt;/a>
 
&lt;/h1>
&lt;p>It is possible to compute the best parameters for linear regression &lt;strong>in one shot&lt;/strong> (closed-form),
instead of iteratively improving them step-by-step.&lt;/p>
&lt;p>For linear regression, the direct method is usually &lt;strong>Ordinary Least Squares (OLS)&lt;/strong>.&lt;/p>
&lt;p>Ordinary Least Squares (OLS) chooses the “best” line by &lt;strong>minimising squared prediction errors&lt;/strong>.&lt;/p>
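&lt;p>A minimal sketch of the one-shot solution via the normal equations, on invented data:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

# Illustrative data (not from the original notes): y is roughly 3x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 3.9, 7.2, 10.1, 12.8])

# Design matrix with a bias column, then the normal equations in one shot:
# theta = (X^T X)^(-1) X^T y
X = np.column_stack([np.ones_like(x), x])
theta = np.linalg.solve(X.T @ X, X.T @ y)

print(theta)   # [intercept, slope]
&lt;/code>&lt;/pre>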
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
OLS defines “best fit” as the line that minimises the total squared residual error across all data points.&lt;/p></description></item><item><title>Cost Function</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</guid><description>&lt;h1 id="cost-function">
 Cost Function
 
 &lt;a class="anchor" href="#cost-function">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>also known as an objective function&lt;/p>
&lt;/li>
&lt;li>
&lt;p>quantifies &lt;strong>how far the predicted values are from the actual ones&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>measures the model’s error on a group of datapoints&lt;/p>
&lt;/li>
&lt;li>
&lt;p>guides the choice of the best-fit line through the data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>used to evaluate the accuracy of a model’s predictions&lt;/p></description></item><item><title>Gradient Descent Algorithm</title><link>https://arshadhs.github.io/docs/ai/deep-learning/035-gradient-descent-algorithm/</link><pubDate>Thu, 26 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/035-gradient-descent-algorithm/</guid><description>&lt;h1 id="gradient-descent-algorithm">
 Gradient Descent Algorithm
 
 &lt;a class="anchor" href="#gradient-descent-algorithm">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient Descent Algorithm (GDA) is&lt;/p>
&lt;ul>
&lt;li>an &lt;strong>optimisation method&lt;/strong>&lt;/li>
&lt;li>used to &lt;strong>train models&lt;/strong>&lt;/li>
&lt;li>by repeatedly updating parameters (weights and biases) to &lt;strong>reduce the loss&lt;/strong>&lt;/li>
&lt;/ul>
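&lt;p>Written compactly (standard notation, added here for reference), each step moves the parameters 
&lt;span>
 \( \theta \)
 &lt;/span>

 against the gradient of the loss:&lt;/p>
&lt;span>
 \[ 
\theta \leftarrow \theta - \eta\,\nabla_{\theta} L(\theta)
 \]
 &lt;/span>
&lt;p>where 
&lt;span>
 \( \eta \)
 &lt;/span>

 is the learning rate.&lt;/p>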
&lt;blockquote class="book-hint info">
&lt;p>In deep learning, the default training approach is almost always &lt;strong>mini-batch gradient descent&lt;/strong>, usually with &lt;strong>Adam&lt;/strong> or &lt;strong>SGD + momentum&lt;/strong>.&lt;/p>
&lt;/blockquote>
&lt;p>Gradient Descent is &lt;strong>used in both regression and classification&lt;/strong>.&lt;/p>
&lt;p>It’s not tied to the task type — it’s tied to the fact you have:&lt;/p></description></item><item><title>Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</guid><description>&lt;h1 id="gradient-descent-for-linear-regression">
 Gradient Descent for Linear Regression
 
 &lt;a class="anchor" href="#gradient-descent-for-linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an iterative optimisation method used to minimise the regression cost function by repeatedly updating parameters in the direction that reduces error.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Iterative method&lt;/strong>&lt;/li>
&lt;li>Types: batch / stochastic / mini-batch&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Gradient descent starts with initial parameter values and repeatedly updates them using the gradient until the cost stops decreasing.&lt;/p>
&lt;/blockquote>
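&lt;p>For a single-feature linear model with mean squared error, the gradients used in each update are the standard results (stated here for reference):&lt;/p>
&lt;span>
 \[ 
\frac{\partial J}{\partial w}=\frac{2}{n}\sum_{i=1}^{n}\left(\hat{y}_i-y_i\right)x_i,
\qquad
\frac{\partial J}{\partial b}=\frac{2}{n}\sum_{i=1}^{n}\left(\hat{y}_i-y_i\right)
 \]
 &lt;/span>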


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
GD[&amp;#34;Gradient&amp;lt;br/&amp;gt;Descent&amp;#34;] --&amp;gt;|minimises| CF[&amp;#34;Cost&amp;lt;br/&amp;gt;function&amp;#34;]
GD --&amp;gt;|updates| W[&amp;#34;Parameters&amp;lt;br/&amp;gt;(weights)&amp;#34;]
GD --&amp;gt;|uses| GR[&amp;#34;Gradient&amp;lt;br/&amp;gt;(slope)&amp;#34;]

GD --&amp;gt; H[&amp;#34;Hyperparameters&amp;#34;]
H --&amp;gt; LR[&amp;#34;Learning&amp;lt;br/&amp;gt;rate&amp;#34;]
H --&amp;gt; BS[&amp;#34;Batch&amp;lt;br/&amp;gt;size&amp;#34;]
H --&amp;gt; EP[&amp;#34;Epochs&amp;#34;]

style GD fill:#90CAF9,stroke:#1E88E5,color:#000

style CF fill:#CE93D8,stroke:#8E24AA,color:#000
style W fill:#CE93D8,stroke:#8E24AA,color:#000
style GR fill:#CE93D8,stroke:#8E24AA,color:#000
style H fill:#CE93D8,stroke:#8E24AA,color:#000
style LR fill:#CE93D8,stroke:#8E24AA,color:#000
style BS fill:#CE93D8,stroke:#8E24AA,color:#000
style EP fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="types-of-gd">
 Types of GD
 
 &lt;a class="anchor" href="#types-of-gd">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Gradient Descent&amp;lt;br/&amp;gt;types&amp;#34;] --&amp;gt; BGD[&amp;#34;Batch&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; SGD[&amp;#34;Stochastic&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; MGD[&amp;#34;Mini-batch&amp;lt;br/&amp;gt;GD&amp;#34;]

BGD --&amp;gt; ALL[&amp;#34;All data&amp;lt;br/&amp;gt;per step&amp;#34;]
BGD --&amp;gt; STB[&amp;#34;Smooth&amp;lt;br/&amp;gt;updates&amp;#34;]

SGD --&amp;gt; ONE[&amp;#34;1 sample&amp;lt;br/&amp;gt;per step&amp;#34;]
SGD --&amp;gt; FAST[&amp;#34;Quick&amp;lt;br/&amp;gt;progress&amp;#34;]
SGD --&amp;gt; NOISE[&amp;#34;Noisy&amp;lt;br/&amp;gt;updates&amp;#34;]

MGD --&amp;gt; MB[&amp;#34;Small batch&amp;lt;br/&amp;gt;per step&amp;#34;]
MGD --&amp;gt; PRACT[&amp;#34;Practical&amp;lt;br/&amp;gt;default&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style BGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style SGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style MGD fill:#C8E6C9,stroke:#2E7D32,color:#000

style ALL fill:#CE93D8,stroke:#8E24AA,color:#000
style STB fill:#CE93D8,stroke:#8E24AA,color:#000
style ONE fill:#CE93D8,stroke:#8E24AA,color:#000
style FAST fill:#CE93D8,stroke:#8E24AA,color:#000
style NOISE fill:#CE93D8,stroke:#8E24AA,color:#000
style MB fill:#CE93D8,stroke:#8E24AA,color:#000
style PRACT fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h3 id="batch">
 Batch
 
 &lt;a class="anchor" href="#batch">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>Use only if you have huge compute and a lot of time to train&lt;/li>
&lt;/ul>
&lt;h3 id="sgd">
 SGD
 
 &lt;a class="anchor" href="#sgd">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>go-to solution&lt;/p></description></item><item><title>Hypothesis Testing</title><link>https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/</guid><description>&lt;h1 id="hypothesis-testing">
 Hypothesis Testing
 
 &lt;a class="anchor" href="#hypothesis-testing">#&lt;/a>
 
&lt;/h1>
&lt;p>Hypothesis testing is a structured way to decide:&lt;/p>
&lt;p>Is what we see in a sample just random variation,
or is there evidence of a real effect in the population?&lt;/p>
&lt;p>Hypothesis testing sits inside &lt;strong>inferential statistics&lt;/strong>:
we use a &lt;strong>sample&lt;/strong> to make a statement about a &lt;strong>population&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>Sampling (random and stratified)&lt;/li>
&lt;li>Sampling distribution and Central Limit Theorem&lt;/li>
&lt;li>Estimation (confidence intervals and confidence level)&lt;/li>
&lt;li>Testing hypotheses (mean, proportion, ANOVA)&lt;/li>
&lt;li>Maximum likelihood (MLE)&lt;/li>
&lt;/ul>
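&lt;p>A minimal sketch of the estimation step (illustrative numbers; assumes a large-sample 95% z-interval):&lt;/p>
&lt;pre>&lt;code class="language-python">import math

# Sample summary (illustrative): n observations, sample mean and SD.
n, x_bar, s = 100, 52.3, 8.1

z = 1.96                         # approx. 95% confidence level
margin = z * s / math.sqrt(n)
print(x_bar - margin, x_bar + margin)   # 95% confidence interval for the mean
&lt;/code>&lt;/pre>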
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
The logic is always the same:&lt;/p></description></item><item><title>LNN for Classification</title><link>https://arshadhs.github.io/docs/ai/deep-learning/040-linear-neural-networks-for-classification/</link><pubDate>Sun, 15 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/040-linear-neural-networks-for-classification/</guid><description>&lt;h1 id="linear-nn-for-classification">
 Linear NN for Classification
 
 &lt;a class="anchor" href="#linear-nn-for-classification">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>Linear Neural Network (LNN) for classification&lt;/strong> uses &lt;strong>no hidden layers&lt;/strong>.&lt;br>
It learns a &lt;strong>linear decision boundary&lt;/strong> and outputs &lt;strong>class probabilities&lt;/strong>, then converts them into predicted classes.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Neural-network view:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Binary classification&lt;/strong> → logistic regression (single neuron + sigmoid)&lt;/li>
&lt;li>&lt;strong>Multi-class classification&lt;/strong> → softmax regression (K output neurons + softmax)&lt;/li>
&lt;/ul>
&lt;/blockquote>
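&lt;p>A minimal sketch (added for illustration) of the two activations named above:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def sigmoid(z):
    # Binary case: squashes a linear score into a probability.
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Multi-class case: turns K linear scores into K probabilities.
    e = np.exp(z - np.max(z))   # shift for numerical stability
    return e / e.sum()

scores = np.array([2.0, 0.5, -1.0])   # K = 3 output neurons
probs = softmax(scores)
print(probs, probs.argmax())          # probabilities and predicted class
print(sigmoid(0.8))                   # single-neuron binary probability
&lt;/code>&lt;/pre>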
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 D[&amp;#34;Data&amp;lt;br/&amp;gt;X, y&amp;#34;] --&amp;gt; M[&amp;#34;Linear model&amp;lt;br/&amp;gt;w, b&amp;#34;]
 M --&amp;gt; A[&amp;#34;Activation&amp;lt;br/&amp;gt;Sigmoid / Softmax&amp;#34;]
 A --&amp;gt; L[&amp;#34;Loss&amp;lt;br/&amp;gt;Cross-entropy&amp;#34;]
 L --&amp;gt; O[&amp;#34;Optimiser&amp;lt;br/&amp;gt;Mini-batch GD / Adam&amp;#34;]
 O --&amp;gt; P[&amp;#34;Updated parameters&amp;lt;br/&amp;gt;w, b&amp;#34;]
 P --&amp;gt; I[&amp;#34;Inference&amp;lt;br/&amp;gt;Probabilities → class&amp;#34;]

 %% Pastel colour scheme
 style D fill:#E3F2FD,stroke:#1E88E5,stroke-width:1px
 style M fill:#E8F5E9,stroke:#43A047,stroke-width:1px
 style A fill:#FFF3E0,stroke:#FB8C00,stroke-width:1px
 style L fill:#FCE4EC,stroke:#D81B60,stroke-width:1px
 style O fill:#F3E5F5,stroke:#8E24AA,stroke-width:1px
 style P fill:#E0F7FA,stroke:#00838F,stroke-width:1px
 style I fill:#F1F8E9,stroke:#558B2F,stroke-width:1px
&lt;/pre>

&lt;hr>
&lt;h2 id="classification">
 Classification
 
 &lt;a class="anchor" href="#classification">#&lt;/a>
 
&lt;/h2>
&lt;p>Classification predicts a &lt;strong>discrete class label&lt;/strong>.&lt;br>
Common settings:&lt;/p></description></item><item><title>Classification(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</guid><description>&lt;h1 id="linear-models-for-classification">
 Linear models for Classification
 
 &lt;a class="anchor" href="#linear-models-for-classification">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>categorises data by finding a linear boundary (hyperplane) that separates classes&lt;/li>
&lt;li>scores each example by calculating a weighted sum of input features plus a bias&lt;/li>
&lt;/ul>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Linear&amp;lt;br/&amp;gt;classification&amp;lt;br/&amp;gt;models&amp;#34;] --&amp;gt; P[&amp;#34;Perceptron&amp;#34;]
T --&amp;gt; LR[&amp;#34;Logistic&amp;lt;br/&amp;gt;regression&amp;#34;]
T --&amp;gt; SVM[&amp;#34;Linear&amp;lt;br/&amp;gt;SVM&amp;#34;]

P --&amp;gt;|uses| STEP[&amp;#34;Step&amp;lt;br/&amp;gt;activation&amp;#34;]
LR --&amp;gt;|uses| SIG[&amp;#34;Sigmoid&amp;lt;br/&amp;gt;+ log loss&amp;#34;]
SVM --&amp;gt;|uses| HNG[&amp;#34;Hinge&amp;lt;br/&amp;gt;loss&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style P fill:#C8E6C9,stroke:#2E7D32,color:#000
style LR fill:#C8E6C9,stroke:#2E7D32,color:#000
style SVM fill:#C8E6C9,stroke:#2E7D32,color:#000

style STEP fill:#CE93D8,stroke:#8E24AA,color:#000
style SIG fill:#CE93D8,stroke:#8E24AA,color:#000
style HNG fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h2 id="discriminant-functions">
 Discriminant Functions
 
 &lt;a class="anchor" href="#discriminant-functions">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="decision-theory">
 Decision Theory
 
 &lt;a class="anchor" href="#decision-theory">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-discriminative-classifiers">
 Probabilistic Discriminative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-discriminative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;h2 id="logistic-regression">
 Logistic Regression
 
 &lt;a class="anchor" href="#logistic-regression">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Supervised machine learning algorithm&lt;/li>
&lt;li>Binary &lt;strong>classification&lt;/strong> algorithm&lt;/li>
&lt;li>assumes a linear decision boundary (works best when classes are roughly linearly separable)&lt;/li>
&lt;li>predicts the probability that an input belongs to a specific class&lt;/li>
&lt;li>uses &lt;strong>Sigmoid function&lt;/strong> to convert inputs into a probability value between 0 and 1&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Logistic regression predicts $P(y=1\mid x)$ using a sigmoid of a linear score $z=w\cdot x+b$,
then learns $w,b$ by maximising likelihood (equivalently minimising log-loss).&lt;/p></description></item><item><title>Foundation Models</title><link>https://arshadhs.github.io/docs/ai/genai/foundation-model/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/foundation-model/</guid><description>&lt;h1 id="foundation-model">
 Foundation Model
 
 &lt;a class="anchor" href="#foundation-model">#&lt;/a>
 
&lt;/h1>
&lt;p>AI models trained on massive datasets to perform a wide range of tasks with minimal fine-tuning.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>are large deep learning neural networks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>are large AI models trained on &lt;strong>massive and diverse datasets&lt;/strong> (text, images, audio, or multiple modalities).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Contain &lt;strong>millions or billions of parameters&lt;/strong>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>designed to perform a &lt;strong>broad range of general tasks&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>designed for &lt;strong>general-purpose intelligence&lt;/strong>, not a single task.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>act as &lt;strong>base models&lt;/strong> for building specialised AI applications&lt;/p></description></item><item><title>LLM - Model</title><link>https://arshadhs.github.io/docs/ai/genai/llm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/llm/</guid><description>&lt;h1 id="llm--large-language-model">
 LLM – Large Language Model
 
 &lt;a class="anchor" href="#llm--large-language-model">#&lt;/a>
 
&lt;/h1>
&lt;p>Large Language Models (LLMs) are &lt;strong>advanced AI systems&lt;/strong> designed to process, understand, and generate &lt;strong>human-like text&lt;/strong>.&lt;/p>
&lt;p>They learn language by analysing &lt;strong>massive amounts of text data&lt;/strong>, discovering patterns in:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>grammar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>meaning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>context&lt;/p>
&lt;/li>
&lt;li>
&lt;p>relationships between words and sentences&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Built on &lt;strong>Deep Learning&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Implemented using &lt;strong>Neural Networks&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Based on &lt;strong>Transformers&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often combined with tools like:&lt;/p>
&lt;ul>
&lt;li>Retrieval (RAG)&lt;/li>
&lt;li>Agents&lt;/li>
&lt;li>External APIs&lt;/li>
&lt;li>Memory systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="what-makes-an-llm-special">
 What makes an LLM special?
 
 &lt;a class="anchor" href="#what-makes-an-llm-special">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Built using &lt;strong>deep neural networks&lt;/strong>&lt;/li>
&lt;li>Trained on &lt;strong>very large datasets&lt;/strong> (books, articles, code, web text)&lt;/li>
&lt;li>Can perform many tasks &lt;strong>without task-specific training&lt;/strong>&lt;/li>
&lt;li>General-purpose language understanding, not single-task models&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="foundation-transformer-architecture">
 Foundation: Transformer Architecture
 
 &lt;a class="anchor" href="#foundation-transformer-architecture">#&lt;/a>
 
&lt;/h2>
&lt;p>LLMs are based on the &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/transformer/">Transformer Architecture&lt;/a>&lt;/strong>, which allows models to understand &lt;strong>context and long-range dependencies&lt;/strong> in text.&lt;/p></description></item><item><title>AI Agents</title><link>https://arshadhs.github.io/docs/ai/genai/ai-agents/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/ai-agents/</guid><description>&lt;h1 id="ai-agents">
 AI Agents
 
 &lt;a class="anchor" href="#ai-agents">#&lt;/a>
 
&lt;/h1>
&lt;p>Also referred to as Agentic AI.&lt;/p>
&lt;p>AI agents are &lt;strong>intelligent systems&lt;/strong> that can &lt;strong>plan, make decisions, and take actions&lt;/strong> to achieve goals with &lt;strong>minimal human intervention&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>A common use case is &lt;strong>task automation&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>for example booking travel based on a user’s request.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>AI agents typically build on &lt;strong>Generative AI&lt;/strong> and use &lt;strong>Large Language Models (LLMs)&lt;/strong> as the reasoning core.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Agents often interact with tools (APIs, databases, calendars) to complete multi-step workflows.&lt;/p></description></item><item><title>Retrieval-Augmented Generation (RAG)</title><link>https://arshadhs.github.io/docs/ai/genai/rag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/rag/</guid><description>&lt;h1 id="retrieval-augmented-generation-rag">
 Retrieval-Augmented Generation (RAG)
 
 &lt;a class="anchor" href="#retrieval-augmented-generation-rag">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Retrieval-Augmented Generation (RAG)&lt;/strong> is a system design pattern that improves an LLM’s answers by:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Retrieving&lt;/strong> relevant information from an external knowledge source, and then&lt;/li>
&lt;li>&lt;strong>Augmenting&lt;/strong> the LLM prompt with that retrieved context before generating the final response.&lt;/li>
&lt;/ol>
&lt;p>RAG helps an LLM &lt;strong>look things up first&lt;/strong>, then &lt;strong>answer using evidence&lt;/strong>.&lt;/p>
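&lt;p>A minimal sketch of the pattern; the retrieve and generate functions below are hypothetical placeholders, not a real library API:&lt;/p>
&lt;pre>&lt;code class="language-python">def retrieve(query, documents, top_k=2):
    # Hypothetical retriever: rank documents by naive word overlap.
    words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(words.intersection(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def generate(prompt):
    # Placeholder for an LLM call; a real system would query a model here.
    return 'LLM answer grounded in: ' + prompt

docs = ['Our refund policy allows returns within 30 days.',
        'Support hours are 9am to 5pm on weekdays.']

query = 'What is the refund policy?'
context = '\n'.join(retrieve(query, docs))
answer = generate('Context:\n' + context + '\n\nQuestion: ' + query)
print(answer)
&lt;/code>&lt;/pre>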
&lt;hr>
&lt;h2 id="why-rag-is-useful">
 Why RAG is Useful
 
 &lt;a class="anchor" href="#why-rag-is-useful">#&lt;/a>
 
&lt;/h2>
&lt;p>RAG is commonly used when:&lt;/p>
&lt;ul>
&lt;li>Your knowledge is in &lt;strong>private documents&lt;/strong> (PDFs, policies, internal wiki)&lt;/li>
&lt;li>You need &lt;strong>up-to-date information&lt;/strong> (things not in the model’s training data)&lt;/li>
&lt;li>You want fewer &lt;strong>hallucinations&lt;/strong> by grounding answers in retrieved sources&lt;/li>
&lt;li>You want &lt;strong>traceability&lt;/strong> (show “where the answer came from”)&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>RAG does not change the model weights.&lt;br>
It changes what the model &lt;em>sees&lt;/em> at inference time by adding retrieved context.&lt;/p></description></item><item><title>Mathematical Foundation</title><link>https://arshadhs.github.io/docs/ai/maths/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/</guid><description>&lt;h1 id="mathematical-foundations-for-machine-learning">
 Mathematical Foundations for Machine Learning
 
 &lt;a class="anchor" href="#mathematical-foundations-for-machine-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Machine Learning is built on &lt;strong>mathematical principles&lt;/strong> that allow models to:&lt;/p>
&lt;ul>
&lt;li>represent data&lt;/li>
&lt;li>learn patterns&lt;/li>
&lt;li>optimise performance&lt;/li>
&lt;/ul>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 DATA[Data]
 MATH[Math Models]
 OPT[Optimisation]
 MODEL[Trained Model]

 DATA --&amp;gt; MATH
 MATH --&amp;gt; OPT
 OPT --&amp;gt; MODEL
&lt;/pre>

&lt;p>ML requires &lt;strong>core mathematical tools&lt;/strong> to understand how ML algorithms work internally. Algebra deals with relationships between variables and quantities, while Calculus focuses on change and optimization.&lt;/p></description></item><item><title>Deep Feedforward Neural Networks (DFNN) for Classification</title><link>https://arshadhs.github.io/docs/ai/deep-learning/050-deep-feedforward/</link><pubDate>Thu, 26 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/050-deep-feedforward/</guid><description>&lt;h1 id="deep-feedforward-neural-networks-dfnn-or-multi-layer-perceptrons-mlp-for-classification">
 Deep Feedforward Neural Networks (DFNN) or Multi Layer Perceptrons (MLP) for Classification
 
 &lt;a class="anchor" href="#deep-feedforward-neural-networks-dfnn-or-multi-layer-perceptrons-mlp-for-classification">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>Deep Feedforward Neural Network (DFNN)&lt;/strong>, also called a &lt;strong>Multi-Layer Perceptron (MLP)&lt;/strong>, is a neural network with one or more &lt;strong>hidden layers&lt;/strong> where information flows &lt;strong>forward only&lt;/strong> (no recurrence).&lt;br>
For classification, DFNNs learn &lt;strong>non-linear decision boundaries&lt;/strong> by combining hidden layers with &lt;strong>non-linear activation functions&lt;/strong>.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Core idea:&lt;/p>
&lt;ul>
&lt;li>A single neuron can only learn &lt;strong>linear&lt;/strong> boundaries.&lt;/li>
&lt;li>Adding &lt;strong>hidden layers + non-linearity&lt;/strong> allows DFNNs to solve problems like &lt;strong>XOR&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;/blockquote>
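&lt;p>A tiny forward-pass sketch with hand-picked weights (chosen here for illustration) shows how one hidden layer solves XOR:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def step(z):
    # Threshold activation: 1 when z is positive, else 0.
    return np.heaviside(z, 0)

def xor_mlp(x1, x2):
    x = np.array([x1, x2])
    h1 = step(x.sum() - 0.5)        # fires for OR(x1, x2)
    h2 = step(x.sum() - 1.5)        # fires for AND(x1, x2)
    return int(step(h1 - h2 - 0.5)) # OR and not AND = XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
&lt;/code>&lt;/pre>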
&lt;hr>
&lt;h2 id="mlp-as-solution-for-xor">
 MLP as solution for XOR
 
 &lt;a class="anchor" href="#mlp-as-solution-for-xor">#&lt;/a>
 
&lt;/h2>
&lt;p>A single perceptron fails on XOR because XOR is &lt;strong>not linearly separable&lt;/strong>.&lt;/p></description></item><item><title>Decision Tree</title><link>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</guid><description>&lt;h1 id="decision-tree">
 Decision Tree
 
 &lt;a class="anchor" href="#decision-tree">#&lt;/a>
 
&lt;/h1>
&lt;p>A decision tree classifies an example by asking a sequence of questions about its attributes until it reaches a leaf (final decision).&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A decision tree grows by repeatedly splitting the training data into &lt;strong>purer&lt;/strong> subsets using an impurity measure
(Entropy / Gini / Classification Error).&lt;/p>
&lt;/blockquote>
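&lt;p>As a hedged sketch, the two most common impurity measures can be computed directly from the class labels at a node (NumPy assumed):&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def entropy(labels):
    # Shannon entropy (in bits) of the class distribution at a node.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def gini(labels):
    # Gini impurity: chance of mislabelling a randomly drawn sample.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

print(entropy([0, 0, 1, 1]))  # 1.0  (maximally mixed node)
print(gini([0, 0, 1, 1]))     # 0.5
&lt;/code>&lt;/pre>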
&lt;hr>
&lt;h2 id="information-theory">
 Information Theory
 
 &lt;a class="anchor" href="#information-theory">#&lt;/a>
 
&lt;/h2>
&lt;p>Decision trees need a way to measure:
“How mixed are the class labels at a node?”&lt;/p></description></item><item><title>Prediction &amp; Forecasting</title><link>https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/</guid><description>&lt;h1 id="prediction--forecasting">
 Prediction &amp;amp; Forecasting
 
 &lt;a class="anchor" href="#prediction--forecasting">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="correlation">
 Correlation
 
 &lt;a class="anchor" href="#correlation">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="regression">
 Regression
 
 &lt;a class="anchor" href="#regression">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="time-series-analysis">
 Time Series Analysis
 
 &lt;a class="anchor" href="#time-series-analysis">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="introduction-components-of-time-series-data">
 Introduction, Components of time series data
 
 &lt;a class="anchor" href="#introduction-components-of-time-series-data">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="ma-model--basic-and-weighted-ma-model">
 MA model – basic and weighted MA model
 
 &lt;a class="anchor" href="#ma-model--basic-and-weighted-ma-model">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="time-series-models">
 Time series models
 
 &lt;a class="anchor" href="#time-series-models">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>AR Model&lt;/li>
&lt;li>ARIMA Model&lt;/li>
&lt;li>SARIMA, SARIMAX, VAR, VARMAX&lt;/li>
&lt;li>Simple exponential smoothing model&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/statistics/">
 Statistics
&lt;/a>&lt;/p></description></item><item><title>Convolutional Neural Networks</title><link>https://arshadhs.github.io/docs/ai/deep-learning/060-cnn-fundamentals/</link><pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/060-cnn-fundamentals/</guid><description>&lt;h1 id="convolutional-neural-networks-cnn">
 Convolutional Neural Networks (CNN)
 
 &lt;a class="anchor" href="#convolutional-neural-networks-cnn">#&lt;/a>
 
&lt;/h1>
&lt;p>Convolutional Neural Networks (CNNs) are specialised neural networks designed for data with spatial structure, especially images. They became the standard model for computer vision because they preserve spatial locality, reuse the same pattern detector across the image, and build representations hierarchically. In practical terms, a CNN starts by learning simple features such as edges and corners, then combines them into textures, shapes, object parts, and finally full semantic categories.&lt;/p></description></item><item><title>Statistics</title><link>https://arshadhs.github.io/docs/ai/statistics/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/</guid><description>&lt;h1 id="statistics">
 Statistics
 
 &lt;a class="anchor" href="#statistics">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Statistical methods&lt;/strong> help you turn &lt;strong>raw data into reliable conclusions&lt;/strong>, while understanding &lt;strong>uncertainty, variability, and confidence&lt;/strong>.&lt;/p>
&lt;p>Statistics provides the &lt;strong>language and tools&lt;/strong> for reasoning about data, uncertainty, and inference.&lt;/p>
&lt;p>ML relies on statistics for &lt;strong>understanding data behaviour&lt;/strong>, drawing conclusions, and validating machine learning models.&lt;/p>
&lt;ul>
&lt;li>Collect Data&lt;/li>
&lt;li>Present &amp;amp; Organise Data (in a systematic manner)&lt;/li>
&lt;li>Analyse Data&lt;/li>
&lt;li>Infer about the Data&lt;/li>
&lt;li>Make Decisions from the Data&lt;/li>
&lt;/ul>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/00_formulas/">Formula Sheet&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/">Stats Formula Sheet&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/">Basic Statistics&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/">Basic Probability&lt;/a>
 &lt;/li>
 
 
 
 
 
 
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/">Hypothesis Testing&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/">Prediction &amp;amp; Forecasting&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/">Gaussian Mixture model &amp;amp; Expectation Maximization&lt;/a>
 &lt;/li>
 
 

 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/">Conditional Probability &amp;amp; Bayes’ Theorem&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/">Conditional Probability&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/">Bayes’ Theorem&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/">Naïve Bayes&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/">Probability Distributions&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/">Random Variables&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/">Common Probability Distributions&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>


&lt;hr>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Statistics Topic&lt;/th>
 &lt;th>What you learn (plain English)&lt;/th>
 &lt;th>ML Connection&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>1. Basic Probability &amp;amp; Statistics&lt;/td>
 &lt;td>Summarise data;&lt;br>understand spread;&lt;br>basic probability rules&lt;/td>
 &lt;td>Data understanding (EDA), feature sanity checks,&lt;br>detecting outliers, interpreting “average behaviour”&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>2. Conditional Probability &amp;amp; Bayes&lt;/td>
 &lt;td>Update probability using new information;&lt;br>Bayes’ rule&lt;/td>
 &lt;td>Naïve Bayes, Bayesian thinking,&lt;br>posterior probabilities, probabilistic classification&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>3. Probability Distributions&lt;/td>
 &lt;td>Model randomness with distributions;&lt;br>expectation/variance/covariance&lt;/td>
 &lt;td>Likelihood models, noise assumptions (Gaussian), sampling,&lt;br>probabilistic modelling foundations&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>4. Hypothesis Testing&lt;/td>
 &lt;td>Sampling, CLT, confidence intervals,&lt;br>significance tests, ANOVA, MLE&lt;/td>
 &lt;td>A/B testing, evaluating model improvements,&lt;br>significance vs noise, parameter estimation (MLE)&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>5. Prediction &amp;amp; Forecasting&lt;/td>
 &lt;td>Correlation, regression,&lt;br>time series (AR/MA/ARIMA/SARIMA etc.)&lt;/td>
 &lt;td>Linear regression, forecasting, sequential data modelling, baseline predictive modelling&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>6. GMM &amp;amp; EM&lt;/td>
 &lt;td>Mixtures of Gaussians;&lt;br>iterative estimation with EM&lt;/td>
 &lt;td>Unsupervised learning (soft clustering),&lt;br>density estimation, latent-variable models&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;hr>


&lt;pre class="mermaid">
flowchart TD
 A[&amp;#34;Statistical Methods&amp;lt;br/&amp;gt;AIML ZC418&amp;#34;] --&amp;gt; B[&amp;#34;1. Basic Probability and Statistics&amp;#34;]
 A --&amp;gt; C[&amp;#34;2. Conditional Probability and Bayes&amp;#34;]
 A --&amp;gt; D[&amp;#34;3. Probability Distributions&amp;#34;]
 A --&amp;gt; E[&amp;#34;4. Hypothesis Testing&amp;#34;]
 A --&amp;gt; F[&amp;#34;5. Prediction and Forecasting&amp;#34;]
 A --&amp;gt; G[&amp;#34;6. Gaussian Mixture Model and EM&amp;#34;]

 B --&amp;gt; B1[&amp;#34;Central Tendency&amp;lt;br/&amp;gt;Mean - Median - Mode&amp;#34;]
 B --&amp;gt; B2[&amp;#34;Variability&amp;lt;br/&amp;gt;Range - Variance - SD - Quartiles&amp;#34;]
 B --&amp;gt; B3[&amp;#34;Basic Probability Concepts&amp;#34;]
 B3 --&amp;gt; B31[&amp;#34;Axioms of Probability&amp;#34;]
 B3 --&amp;gt; B32[&amp;#34;Definition of Probability&amp;#34;]
 B3 --&amp;gt; B33[&amp;#34;Mutually Exclusive vs Independent&amp;#34;]

 C --&amp;gt; C1[&amp;#34;Conditional Probability&amp;#34;]
 C --&amp;gt; C2[&amp;#34;Independence (conditional)&amp;#34;]
 C --&amp;gt; C3[&amp;#34;Bayes Theorem&amp;#34;]
 C --&amp;gt; C4[&amp;#34;Naive Bayes (intro)&amp;#34;]

 D --&amp;gt; D1[&amp;#34;Random Variables&amp;lt;br/&amp;gt;Discrete and Continuous&amp;#34;]
 D --&amp;gt; D2[&amp;#34;Expectation - Variance - Covariance&amp;#34;]
 D --&amp;gt; D3[&amp;#34;Transformations of RVs&amp;#34;]
 D --&amp;gt; D4[&amp;#34;Key Distributions&amp;#34;]
 D4 --&amp;gt; D41[&amp;#34;Bernoulli&amp;#34;]
 D4 --&amp;gt; D42[&amp;#34;Binomial&amp;#34;]
 D4 --&amp;gt; D43[&amp;#34;Poisson&amp;#34;]
 D4 --&amp;gt; D44[&amp;#34;Normal (Gaussian)&amp;#34;]
 D4 --&amp;gt; D45[&amp;#34;t - Chi-square - F (intro)&amp;#34;]

 E --&amp;gt; E1[&amp;#34;Sampling&amp;lt;br/&amp;gt;Random and Stratified&amp;#34;]
 E --&amp;gt; E2[&amp;#34;Sampling Distributions&amp;lt;br/&amp;gt;CLT&amp;#34;]
 E --&amp;gt; E3[&amp;#34;Estimation&amp;lt;br/&amp;gt;Confidence Intervals&amp;#34;]
 E --&amp;gt; E4[&amp;#34;Hypothesis Tests&amp;lt;br/&amp;gt;Means and Proportions&amp;#34;]
 E --&amp;gt; E5[&amp;#34;ANOVA&amp;lt;br/&amp;gt;Single and Dual factor&amp;#34;]
 E --&amp;gt; E6[&amp;#34;Maximum Likelihood&amp;#34;]

 F --&amp;gt; F1[&amp;#34;Correlation&amp;#34;]
 F --&amp;gt; F2[&amp;#34;Regression&amp;#34;]
 F --&amp;gt; F3[&amp;#34;Time Series Basics&amp;lt;br/&amp;gt;Components&amp;#34;]
 F --&amp;gt; F4[&amp;#34;Moving Averages&amp;lt;br/&amp;gt;Simple and Weighted&amp;#34;]
 F --&amp;gt; F5[&amp;#34;Time Series Models&amp;#34;]
 F5 --&amp;gt; F51[&amp;#34;AR&amp;#34;]
 F5 --&amp;gt; F52[&amp;#34;ARMA / ARIMA&amp;#34;]
 F5 --&amp;gt; F53[&amp;#34;SARIMA / SARIMAX&amp;#34;]
 F5 --&amp;gt; F54[&amp;#34;VAR / VARMAX&amp;#34;]
 F --&amp;gt; F6[&amp;#34;Exponential Smoothing&amp;#34;]

 G --&amp;gt; G1[&amp;#34;GMM&amp;lt;br/&amp;gt;Mixture of Gaussians&amp;#34;]
 G --&amp;gt; G2[&amp;#34;EM Algorithm&amp;lt;br/&amp;gt;E-step - M-step&amp;#34;]

 B -.-&amp;gt; C
 C -.-&amp;gt; D
 D -.-&amp;gt; E
 E -.-&amp;gt; F
 F -.-&amp;gt; G
&lt;/pre>

&lt;hr>
&lt;h2 id="data---types">
 Data - Types
 
 &lt;a class="anchor" href="#data---types">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
	A[(Data)] --&amp;gt; B[&amp;#34;Categorical (Qualitative)&amp;#34;]
 A --&amp;gt; C[&amp;#34;Numerical (Quantitative)&amp;#34;]

 B --&amp;gt; B1[Nominal]
 B --&amp;gt; B2[Ordinal]

 C --&amp;gt; C1[Discrete]
 C --&amp;gt; C2[Continuous]

 C2 --&amp;gt; C21[Interval]
 C2 --&amp;gt; C22[Ratio]

 %% Styling
 style A fill:#E1F5FE,stroke:#333
 style B fill:#90CAF9,stroke:#333
 style B1 fill:#90CAF9,stroke:#333
 style B2 fill:#90CAF9,stroke:#333
 style C fill:#FFF9C4,stroke:#333
 style C1 fill:#FFF9C4,stroke:#333
 style C2 fill:#FFF9C4,stroke:#333
 style C21 fill:#FFF9C4,stroke:#333
 style C22 fill:#FFF9C4,stroke:#333
&lt;/pre>

&lt;div class="book-steps ">
&lt;ol>
&lt;li>
&lt;h2 id="categorical-qualitative">
 Categorical (Qualitative)
 
 &lt;a class="anchor" href="#categorical-qualitative">#&lt;/a>
 
&lt;/h2>
&lt;p>Expresses a qualitative attribute,
e.g. hair colour, eye colour&lt;/p></description></item><item><title>Gaussian Mixture model &amp; Expectation Maximization</title><link>https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/</guid><description>&lt;h1 id="gaussian-mixture-model--expectation-maximization">
 Gaussian Mixture model &amp;amp; Expectation Maximization
 
 &lt;a class="anchor" href="#gaussian-mixture-model--expectation-maximization">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;p>&lt;a href="https://www.geeksforgeeks.org/machine-learning/gaussian-mixture-model/">Gaussian Mixture model&lt;/a>&lt;/p>
&lt;p>&lt;a href="https://www.geeksforgeeks.org/machine-learning/ml-expectation-maximization-algorithm/">Expectation Maximization&lt;/a>&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/statistics/">
 Statistics
&lt;/a>&lt;/p></description></item><item><title>Instance-based Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</guid><description>&lt;h1 id="instance-based-learning">
 Instance-based Learning
 
 &lt;a class="anchor" href="#instance-based-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Instance-based learning is a family of methods that &lt;strong>do not build one explicit global model during training&lt;/strong>. Instead, they &lt;strong>store training examples&lt;/strong> and delay most of the work until a new query arrives.&lt;/p>
&lt;p>When a new point must be classified or predicted, the algorithm compares it with previously seen examples, finds the most relevant neighbours, and uses them to produce the answer.&lt;/p>
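&lt;p>A minimal k-nearest-neighbours sketch of that query-time work (NumPy assumed; integer class labels):&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    # No global model: store the examples, compare only when a query arrives.
    dists = np.linalg.norm(X_train - x_query, axis=1)  # distance to each stored example
    nearest = np.argsort(dists)[:k]                    # indices of the k closest neighbours
    return np.bincount(y_train[nearest]).argmax()      # majority vote among them
&lt;/code>&lt;/pre>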
&lt;p>Instance-based Learning covers three linked ideas:&lt;/p></description></item><item><title>Deep CNN Architectures</title><link>https://arshadhs.github.io/docs/ai/deep-learning/065-deep-cnn-architectures/</link><pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/065-deep-cnn-architectures/</guid><description>&lt;h1 id="deep-cnn-architectures">
 Deep CNN Architectures
 
 &lt;a class="anchor" href="#deep-cnn-architectures">#&lt;/a>
 
&lt;/h1>
&lt;p>Once the basic ideas of convolution, pooling, channels, and classifier heads are understood, the next step is to study how successful CNN architectures are designed in practice. The history of deep CNNs is not just a list of famous models. It is a progression of design ideas: smaller filters, more depth, better optimisation, bottlenecks, multi-scale processing, residual connections, and transfer learning.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>&lt;strong>Key takeaway:&lt;/strong>&lt;br>
Deep CNN architectures evolved by solving specific problems one by one: &lt;strong>LeNet&lt;/strong> established the template, &lt;strong>AlexNet&lt;/strong> proved deep learning could dominate large-scale vision, &lt;strong>VGG&lt;/strong> simplified the design, &lt;strong>NiN&lt;/strong> introduced powerful &lt;code>1 × 1&lt;/code> ideas, &lt;strong>GoogLeNet&lt;/strong> made multi-scale processing efficient, and &lt;strong>ResNet&lt;/strong> solved the optimisation problem of very deep networks.&lt;/p></description></item><item><title>CNN Pipeline</title><link>https://arshadhs.github.io/docs/ai/deep-learning/067-cnn-model/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/067-cnn-model/</guid><description>&lt;h1 id="cnn-pipeline-preprocessing--models">
 CNN Pipeline: Preprocessing &amp;amp; Models
 
 &lt;a class="anchor" href="#cnn-pipeline-preprocessing--models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Understand CNN concepts deeply&lt;/li>
&lt;li>Build CNN models step-by-step&lt;/li>
&lt;li>Apply CNNs in assignments using Keras&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Think of CNN as a pipeline:
Image → Features → Patterns → Prediction&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h1 id="1-image-representation">
 1. Image Representation
 
 &lt;a class="anchor" href="#1-image-representation">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: green;">
 &lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>
 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>
&lt;span>
 \[ 
X \in \mathbb{R}^{H \times W \times C}
 \]
 &lt;/span>
&lt;/span>
&lt;ul>
&lt;li>H = Height&lt;/li>
&lt;li>W = Width&lt;/li>
&lt;li>C = Channels&lt;/li>
&lt;/ul>
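&lt;p>For instance (shapes illustrative), a 64 × 64 RGB image is just a rank-3 array:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

X = np.random.rand(64, 64, 3)  # H=64, W=64, C=3 (RGB channels)
print(X.shape)                 # (64, 64, 3)
&lt;/code>&lt;/pre>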
&lt;hr>
&lt;h1 id="2-convolution-operation">
 2. Convolution Operation
 
 &lt;a class="anchor" href="#2-convolution-operation">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: green;">
 &lt;span>
 \[ 
Z(i,j) = \sum_{m,n} X(i+m, j+n) \cdot K(m,n)
 \]
 &lt;/span>
&lt;/span>
&lt;ul>
&lt;li>Sliding filter extracts features&lt;/li>
&lt;li>Produces feature maps&lt;/li>
&lt;/ul>
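&lt;p>A plain-NumPy sketch of this operation (valid cross-correlation, exactly as the formula above is written; not an optimised implementation):&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def conv2d(X, K):
    # Z(i, j) = sum over m, n of X(i+m, j+n) * K(m, n)
    h, w = K.shape
    H, W = X.shape
    Z = np.zeros((H - h + 1, W - w + 1))
    for i in range(Z.shape[0]):
        for j in range(Z.shape[1]):
            Z[i, j] = np.sum(X[i:i + h, j:j + w] * K)  # sliding-window dot product
    return Z

edge = np.array([[1.0, -1.0]])  # a tiny horizontal edge detector
print(conv2d(np.eye(4), edge))  # feature map highlighting the diagonal edges
&lt;/code>&lt;/pre>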
&lt;hr>
&lt;h1 id="3-stride--padding">
 3. Stride &amp;amp; Padding
 
 &lt;a class="anchor" href="#3-stride--padding">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: green;">
 &lt;span>
 \[ 
\text{Output} = \frac{N - F + 2P}{S} + 1
 \]
 &lt;/span>
&lt;/span>
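&lt;p>In code form (integer division plays the role of the floor when sizes do not divide evenly):&lt;/p>
&lt;pre>&lt;code class="language-python">def conv_output_size(N, F, P=0, S=1):
    # N: input size, F: filter size, P: padding, S: stride
    return (N - F + 2 * P) // S + 1

print(conv_output_size(N=64, F=3, P=1, S=1))  # 64: padding 1 preserves the size
print(conv_output_size(N=64, F=2, P=0, S=2))  # 32: stride 2 halves the map
&lt;/code>&lt;/pre>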
&lt;hr>
&lt;h1 id="4-activation-relu">
 4. Activation (ReLU)
 
 &lt;a class="anchor" href="#4-activation-relu">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: green;">
 &lt;span>
 \[ 
\mathrm{ReLU}(x) = \max(0, x)
 \]
 &lt;/span>
&lt;/span>
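&lt;p>Element-wise in NumPy:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

def relu(x):
    return np.maximum(0, x)  # negative activations are clipped to zero

print(relu(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 3.]
&lt;/code>&lt;/pre>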
&lt;hr>
&lt;h1 id="5-pooling">
 5. Pooling
 
 &lt;a class="anchor" href="#5-pooling">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Max Pooling → strongest feature&lt;/li>
&lt;li>Average Pooling → smooth&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="6-global-average-pooling">
 6. Global Average Pooling
 
 &lt;a class="anchor" href="#6-global-average-pooling">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: green;">
 &lt;span>
 \[ 
y_k = \frac{1}{HW} \sum_{i,j} x_{i,j,k}
 \]
 &lt;/span>
&lt;/span>
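&lt;p>Equivalently, averaging each channel over its spatial dimensions:&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

feature_maps = np.random.rand(8, 8, 32)  # H x W x C feature maps
gap = feature_maps.mean(axis=(0, 1))     # y_k = (1 / HW) * sum over i, j
print(gap.shape)                         # (32,): one value per channel
&lt;/code>&lt;/pre>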
&lt;hr>
&lt;h1 id="7-loss-function">
 7. Loss Function
 
 &lt;a class="anchor" href="#7-loss-function">#&lt;/a>
 
&lt;/h1>
&lt;span style="color: green;">
 &lt;span>
 \[ 
L = - \sum y \log(\hat{y})
 \]
 &lt;/span>
&lt;/span>
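&lt;p>A quick numeric check of the cross-entropy loss (values illustrative):&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

y = np.array([0.0, 1.0, 0.0])      # one-hot true label
y_hat = np.array([0.1, 0.7, 0.2])  # predicted class probabilities
loss = -np.sum(y * np.log(y_hat))  # only the true-class term survives
print(loss)                        # ~0.357 = -log(0.7)
&lt;/code>&lt;/pre>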
&lt;hr>
&lt;h1 id="8-cnn-architecture">
 8. CNN Architecture
 
 &lt;a class="anchor" href="#8-cnn-architecture">#&lt;/a>
 
&lt;/h1>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">graph LR
A[Input Image] --&amp;gt; B[Conv]
B --&amp;gt; C[ReLU]
C --&amp;gt; D[Pooling]
D --&amp;gt; E[Conv Layers]
E --&amp;gt; F[Flatten / GAP]
F --&amp;gt; G[Dense]
G --&amp;gt; H[Output]&lt;/pre>
&lt;hr>
&lt;h1 id="9-training">
 9. Training
 
 &lt;a class="anchor" href="#9-training">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Forward pass&lt;/li>
&lt;li>Loss computation&lt;/li>
&lt;li>Backpropagation&lt;/li>
&lt;li>Weight update&lt;/li>
&lt;/ul>
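&lt;p>These four steps can be made explicit with a low-level TensorFlow sketch (assuming &lt;code>model&lt;/code>, &lt;code>loss_fn&lt;/code> and &lt;code>optimizer&lt;/code> are already defined; &lt;code>model.fit&lt;/code> runs the same loop for you):&lt;/p>
&lt;pre>&lt;code class="language-python">import tensorflow as tf

def train_step(model, loss_fn, optimizer, x, y):
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)  # forward pass
        loss = loss_fn(y, y_pred)         # loss computation
    grads = tape.gradient(loss, model.trainable_variables)            # backpropagation
    optimizer.apply_gradients(zip(grads, model.trainable_variables))  # weight update
    return loss
&lt;/code>&lt;/pre>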
&lt;hr>
&lt;h1 id="10-keras-implementation">
 10. Keras Implementation
 
 &lt;a class="anchor" href="#10-keras-implementation">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="model">
 Model
 
 &lt;a class="anchor" href="#model">#&lt;/a>
 
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-python" data-lang="python">&lt;span style="display:flex;">&lt;span>&lt;span style="color:#f92672">from&lt;/span> tensorflow.keras.models &lt;span style="color:#f92672">import&lt;/span> Sequential
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>&lt;span style="color:#f92672">from&lt;/span> tensorflow.keras.layers &lt;span style="color:#f92672">import&lt;/span> Conv2D, MaxPooling2D, Dense, Flatten
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model &lt;span style="color:#f92672">=&lt;/span> Sequential()
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>add(Conv2D(&lt;span style="color:#ae81ff">32&lt;/span>, (&lt;span style="color:#ae81ff">3&lt;/span>,&lt;span style="color:#ae81ff">3&lt;/span>), activation&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&amp;#39;relu&amp;#39;&lt;/span>, input_shape&lt;span style="color:#f92672">=&lt;/span>(&lt;span style="color:#ae81ff">64&lt;/span>,&lt;span style="color:#ae81ff">64&lt;/span>,&lt;span style="color:#ae81ff">3&lt;/span>)))
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>add(MaxPooling2D((&lt;span style="color:#ae81ff">2&lt;/span>,&lt;span style="color:#ae81ff">2&lt;/span>)))
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>add(Conv2D(&lt;span style="color:#ae81ff">64&lt;/span>, (&lt;span style="color:#ae81ff">3&lt;/span>,&lt;span style="color:#ae81ff">3&lt;/span>), activation&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&amp;#39;relu&amp;#39;&lt;/span>))
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>add(MaxPooling2D((&lt;span style="color:#ae81ff">2&lt;/span>,&lt;span style="color:#ae81ff">2&lt;/span>)))
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>add(Flatten())
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>add(Dense(&lt;span style="color:#ae81ff">128&lt;/span>, activation&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&amp;#39;relu&amp;#39;&lt;/span>))
&lt;/span>&lt;/span>&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>add(Dense(&lt;span style="color:#ae81ff">1&lt;/span>, activation&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&amp;#39;sigmoid&amp;#39;&lt;/span>))
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="compile">
 Compile
 
 &lt;a class="anchor" href="#compile">#&lt;/a>
 
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-python" data-lang="python">&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>compile(optimizer&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&amp;#39;adam&amp;#39;&lt;/span>, loss&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#e6db74">&amp;#39;binary_crossentropy&amp;#39;&lt;/span>, metrics&lt;span style="color:#f92672">=&lt;/span>[&lt;span style="color:#e6db74">&amp;#39;accuracy&amp;#39;&lt;/span>])
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="train">
 Train
 
 &lt;a class="anchor" href="#train">#&lt;/a>
 
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-python" data-lang="python">&lt;span style="display:flex;">&lt;span>model&lt;span style="color:#f92672">.&lt;/span>fit(X_train, y_train, epochs&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">10&lt;/span>, batch_size&lt;span style="color:#f92672">=&lt;/span>&lt;span style="color:#ae81ff">32&lt;/span>)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
&lt;h2 id="predict">
 Predict
 
 &lt;a class="anchor" href="#predict">#&lt;/a>
 
&lt;/h2>
&lt;div class="highlight">&lt;pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;">&lt;code class="language-python" data-lang="python">&lt;span style="display:flex;">&lt;span>pred &lt;span style="color:#f92672">=&lt;/span> model&lt;span style="color:#f92672">.&lt;/span>predict(X_test)
&lt;/span>&lt;/span>&lt;/code>&lt;/pre>&lt;/div>&lt;hr>
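&lt;p>Since the network ends in a sigmoid, &lt;code>predict&lt;/code> returns probabilities; a common follow-up (the 0.5 threshold is a choice, not a rule) is:&lt;/p>
&lt;pre>&lt;code class="language-python">labels = (pred &amp;gt; 0.5).astype(int)  # probabilities to hard 0/1 class labels
&lt;/code>&lt;/pre>
&lt;hr>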
&lt;h1 id="11-tips">
 11. Tips
 
 &lt;a class="anchor" href="#11-tips">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Normalize images&lt;/li>
&lt;li>Use small filters&lt;/li>
&lt;li>Avoid too many dense layers&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h1 id="12-summary">
 12. Summary
 
 &lt;a class="anchor" href="#12-summary">#&lt;/a>
 
&lt;/h1>
&lt;blockquote class="book-hint info">
&lt;p>CNN = Automatic feature extractor + classifier&lt;/p></description></item><item><title>Recurrent Neural Networks</title><link>https://arshadhs.github.io/docs/ai/deep-learning/070-recurrent-nn/</link><pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/070-recurrent-nn/</guid><description>&lt;h1 id="recurrent-neural-networks">
 Recurrent Neural Networks
 
 &lt;a class="anchor" href="#recurrent-neural-networks">#&lt;/a>
 
&lt;/h1>
&lt;p>Recurrent Neural Networks (RNNs) are neural networks designed for &lt;strong>sequential data&lt;/strong>, where the order of inputs matters and the model must use information from earlier time steps to interpret later ones. Unlike a feedforward network, an RNN does not process each input in isolation. It carries a &lt;strong>hidden state&lt;/strong> from one time step to the next, so the network can build a running summary of what it has seen so far.&lt;/p></description></item><item><title>Support Vector Machine</title><link>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</guid><description>&lt;h1 id="support-vector-machine-svm">
 Support Vector Machine (SVM)
 
 &lt;a class="anchor" href="#support-vector-machine-svm">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>Support Vector Machine (SVM)&lt;/strong> is a &lt;strong>supervised machine learning algorithm&lt;/strong> used for:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Classification&lt;/strong> (most common)&lt;/li>
&lt;li>&lt;strong>Regression&lt;/strong> (SVR – Support Vector Regression)&lt;/li>
&lt;/ul>

&lt;blockquote class='book-hint '>
 &lt;p>Find the decision boundary that separates classes with the &lt;strong>maximum margin&lt;/strong>.&lt;/p>
&lt;/blockquote>&lt;blockquote class="book-hint default">
&lt;p>A Support Vector Machine is a supervised learning algorithm that finds an optimal hyperplane by maximising the margin between classes, using support vectors and kernel functions to handle non-linear data.&lt;/p></description></item><item><title>Deep Recurrent Neural Networks</title><link>https://arshadhs.github.io/docs/ai/deep-learning/075-recurrent-nn-deep/</link><pubDate>Sun, 19 Apr 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/075-recurrent-nn-deep/</guid><description>&lt;h1 id="deep-recurrent-neural-networks">
 Deep Recurrent Neural Networks
 
 &lt;a class="anchor" href="#deep-recurrent-neural-networks">#&lt;/a>
 
&lt;/h1>
&lt;p>Vanilla RNNs introduce the hidden-state idea, but they struggle on longer and more complex sequences because gradients can vanish across time. Deep recurrent models extend the RNN idea in two important ways:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>make the recurrent architecture richer&lt;/strong>, for example by stacking multiple recurrent layers or using information from both directions,&lt;/li>
&lt;li>&lt;strong>use gates and memory cells&lt;/strong> to control what should be remembered, forgotten, updated, and exposed.&lt;/li>
&lt;/ol>
&lt;p>This is why practical recurrent modelling usually moves from a simple RNN to &lt;strong>stacked RNNs, bidirectional RNNs, GRUs, or LSTMs&lt;/strong>.&lt;/p></description></item><item><title>Attention Mechanism</title><link>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</guid><description>&lt;h1 id="attention-mechanism">
 Attention Mechanism
 
 &lt;a class="anchor" href="#attention-mechanism">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Queries, Keys, and Values&lt;/li>
&lt;li>Attention Pooling by Similarity&lt;/li>
&lt;li>Attention Pooling via Nadaraya–Watson Regression&lt;/li>
&lt;li>Attention Scoring Functions&lt;/li>
&lt;li>Dot Product Attention&lt;/li>
&lt;li>Convenience Functions&lt;/li>
&lt;li>Scaled Dot Product Attention&lt;/li>
&lt;li>Additive Attention&lt;/li>
&lt;li>Bahdanau Attention Mechanism&lt;/li>
&lt;li>Multi-Head Attention&lt;/li>
&lt;li>Self-Attention&lt;/li>
&lt;li>Positional Encoding&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>, Cambridge University Press (&lt;a href="https://d2l.ai/chapter_builders-guide/model-construction.html">Ch 10&lt;/a>, &lt;a href="https://d2l.ai/chapter_convolutional-neural-networks/index.html">Ch 7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Bayesian Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</guid><description>&lt;h1 id="bayesian-learning">
 Bayesian Learning
 
 &lt;a class="anchor" href="#bayesian-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="mle-hypothesis">
 MLE Hypothesis
 
 &lt;a class="anchor" href="#mle-hypothesis">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="map-hypothesis">
 MAP Hypothesis
 
 &lt;a class="anchor" href="#map-hypothesis">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bayes-rule">
 Bayes Rule
 
 &lt;a class="anchor" href="#bayes-rule">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="optimal-bayes-classifier">
 Optimal Bayes Classifier
 
 &lt;a class="anchor" href="#optimal-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="naïve-bayes-classifier">
 Naïve Bayes Classifier
 
 &lt;a class="anchor" href="#na%c3%afve-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-generative-classifiers">
 Probabilistic Generative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-generative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bayesian-linear-regression">
 Bayesian Linear Regression
 
 &lt;a class="anchor" href="#bayesian-linear-regression">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Transformer</title><link>https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/</guid><description>&lt;h1 id="transformer">
 Transformer
 
 &lt;a class="anchor" href="#transformer">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>is a neural network architecture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>based on the multi-head attention mechanism&lt;/p>
&lt;/li>
&lt;li>
&lt;p>text is converted to numerical representations called tokens, and each token is converted into a vector via lookup from a word embedding table&lt;/p>
&lt;/li>
&lt;li>
&lt;p>takes a text sequence as input and produces another text sequence as output&lt;/p>
&lt;/li>
&lt;li>
&lt;p>foundation for modern &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/genai/llm/">Large Language Models (LLMs)&lt;/a>&lt;/strong> like ChatGPT and Gemini&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transformer architecture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Model, Positionwise Feed-Forward Networks, Residual Connection and Layer Normalization&lt;/p></description></item><item><title>Ensemble Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</guid><description>&lt;h1 id="ensemble-learning">
 Ensemble Learning
 
 &lt;a class="anchor" href="#ensemble-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="combining-classifiers">
 Combining Classifiers
 
 &lt;a class="anchor" href="#combining-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bagging">
 Bagging
 
 &lt;a class="anchor" href="#bagging">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="random-forest">
 Random Forest
 
 &lt;a class="anchor" href="#random-forest">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="boosting">
 Boosting
 
 &lt;a class="anchor" href="#boosting">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="adaboost">
 ADABoost
 
 &lt;a class="anchor" href="#adaboost">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="gradient-boosting">
 Gradient Boosting
 
 &lt;a class="anchor" href="#gradient-boosting">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="xgboost">
 XGBoost
 
 &lt;a class="anchor" href="#xgboost">#&lt;/a>
 
&lt;/h3>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Optimisation of Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</guid><description>&lt;h1 id="optimisation-of-deep-models">
 Optimisation of Deep models
 
 &lt;a class="anchor" href="#optimisation-of-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Goal of Optimization&lt;/li>
&lt;li>Optimization Challenges in Deep Learning&lt;/li>
&lt;li>Gradient Descent&lt;/li>
&lt;li>Stochastic Gradient Descent&lt;/li>
&lt;li>Minibatch Stochastic Gradient Descent&lt;/li>
&lt;li>Momentum&lt;/li>
&lt;li>Adagrad and Algorithm&lt;/li>
&lt;li>RMSProp and Algorithm&lt;/li>
&lt;li>Adadelta and Algorithm&lt;/li>
&lt;li>Adam and Algorithm&lt;/li>
&lt;li>Code Implementation and comparison of algorithms (webinar)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>, Cambridge University Press (Ch 12)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Evaluation/Comparison</title><link>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</guid><description>&lt;h1 id="machine-learning-model-evaluationcomparison">
 Machine Learning Model Evaluation/Comparison
 
 &lt;a class="anchor" href="#machine-learning-model-evaluationcomparison">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="comparing-machine-learning-models">
 Comparing Machine Learning Models
 
 &lt;a class="anchor" href="#comparing-machine-learning-models">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">
 Emerging requirements e.g., bias, fairness, interpretability of ML models
 
 &lt;a class="anchor" href="#emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Regularisation for Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</guid><description>&lt;h1 id="regularisation-for-deep-models">
 Regularisation for Deep models
 
 &lt;a class="anchor" href="#regularisation-for-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Generalization for regression&lt;/li>
&lt;li>Training Error and Generalization Error&lt;/li>
&lt;li>Underfitting or Overfitting&lt;/li>
&lt;li>Model Selection&lt;/li>
&lt;li>Weight Decay and Norms&lt;/li>
&lt;li>Generalization in Classification&lt;/li>
&lt;li>Environment and Distribution Shift&lt;/li>
&lt;li>Generalization in Deep Learning&lt;/li>
&lt;li>Dropout&lt;/li>
&lt;li>Batch Normalization&lt;/li>
&lt;li>Layer Normalization&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>, Cambridge University Press (&lt;a href="https://d2l.ai/chapter_introduction/index.html">T1 – Ch 3.6, 3.7; Ch 4.6, 4.7; Ch 5.5, 5.6; Ch 8.5; Ch 11.7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Linear Algebra</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/</guid><description>&lt;h1 id="linear-algebra">
 Linear Algebra
 
 &lt;a class="anchor" href="#linear-algebra">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>study of vectors and matrices&lt;/strong> is called Linear Algebra.&lt;/p>
&lt;p>Linear Algebra provides the &lt;strong>mathematical language&lt;/strong> used &lt;strong>to represent data, transformations, and structure&lt;/strong> in ML.&lt;/p>




&lt;ul>
 
 
 
 
 
 
 
 
 
 
 
 
 
 

 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/">Linear Systems&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/">Systems of Linear Equations&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/">Matrices&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/matrix-transposition/">Matrix Transposition&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/">Solving Linear Systems&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/">Forward and Backward Substitution&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/">Inverse Matrix&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/">Convex Combination&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/">Vector Spaces&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/020-basis-and-rank/">Basis and Rank&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/">Linear Independence&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/030-norm/">Norm&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/">Inner Products and Dot Product&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/050-lengths-and-distances/">Lengths and Distances&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/">Angles and Orthogonality&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/070-orthonormal-basis/">Orthonormal Basis&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/">Feature Space&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/">Cauchy–Schwarz&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/">Matrix Decompositions&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/special-matrices/">Special Matrices&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/">Characteristic Polynomial&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/">Determinant and Trace&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/">Eigenvalues and Eigenvectors&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/">Cholesky Decomposition&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/">Eigen Decomposition&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/">Diagonalization&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/">Singular Value Decomposition (SVD)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/">Matrix Approximation&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">Dimensionality reduction and PCA&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/">Principal Component Analysis (PCA)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/">PCA Theory&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/">PCA in Practice&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/">Latent Variable Perspective&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/">Mathematical Preliminaries of SVM&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/">Nonlinear SVM and Kernels&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>


&lt;hr>
&lt;h2 id="why-linear-algebra-matters-in-ml">
 Why Linear Algebra Matters in ML
 
 &lt;a class="anchor" href="#why-linear-algebra-matters-in-ml">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Every machine learning model uses matrices&lt;/li>
&lt;li>All data in ML is represented using &lt;strong>vectors and matrices&lt;/strong>&lt;/li>
&lt;li>Neural networks are pipelines of matrix operations&lt;/li>
&lt;li>Models apply &lt;strong>matrix transformations&lt;/strong> to data&lt;/li>
&lt;li>Optimisation relies on linear algebra operations&lt;/li>
&lt;/ul>
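&lt;p>A minimal NumPy sketch of the &lt;em>pipelines of matrix operations&lt;/em> point above: a single dense layer is a matrix transformation of the data (shapes illustrative).&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

W = np.random.rand(4, 3)  # weights: map 3 input features to 4 units
b = np.random.rand(4)     # bias vector
x = np.random.rand(3)     # one sample with 3 features
y = W @ x + b             # matrix-vector product plus a shift
print(y.shape)            # (4,)
&lt;/code>&lt;/pre>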
&lt;hr>
&lt;h2 id="what-to-learn">
 What to Learn
 
 &lt;a class="anchor" href="#what-to-learn">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Scalars, vectors, and matrices&lt;/li>
&lt;li>Vector operations (addition, dot product)&lt;/li>
&lt;li>Matrix multiplication &lt;em>(critical)&lt;/em>&lt;/li>
&lt;li>Identity matrices and transpose&lt;/li>
&lt;li>Eigenvalues and eigenvectors &lt;em>(conceptual understanding)&lt;/em>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;ul>
&lt;li>&lt;strong>Scalar&lt;/strong> → a number&lt;/li>
&lt;li>&lt;strong>Vector&lt;/strong> → a directed point&lt;/li>
&lt;li>&lt;strong>Matrix&lt;/strong> → a space transformer&lt;/li>
&lt;li>&lt;strong>Linear transformation&lt;/strong> → structured mapping&lt;/li>
&lt;li>&lt;strong>Feature&lt;/strong> → one axis&lt;/li>
&lt;li>&lt;strong>Feature space&lt;/strong> → where data lives&lt;/li>
&lt;li>&lt;strong>Vector space&lt;/strong> → where vectors live&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/">
 Mathematical Foundation
&lt;/a>&lt;/p></description></item><item><title>Linear Systems</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/</link><pubDate>Thu, 29 Jan 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/</guid><description>&lt;h1 id="linear-systems">
 Linear Systems
 
 &lt;a class="anchor" href="#linear-systems">#&lt;/a>
 
&lt;/h1>
&lt;p>How systems of linear equations are represented and solved using matrices.&lt;/p>
&lt;ul>
&lt;li>the study of vectors and the rules for manipulating them&lt;/li>
&lt;li>describe multiple linear equations solved simultaneously&lt;/li>
&lt;li>connect algebraic equations with matrix representations&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;img src="https://arshadhs.github.io/images/ai/matrix_vector_operations.png" alt="Matrix" />&lt;/p>
&lt;hr>
&lt;h2 id="idea-of-closure">
 Idea of Closure
 
 &lt;a class="anchor" href="#idea-of-closure">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>performing a specific operation (like addition or multiplication) on members of a set always produces a result that belongs to the same set&lt;/p>
&lt;/li>
&lt;li>
&lt;p>idea of closure is fundamental to defining a &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/linear-algebra/01-linear-systems">Vector space&lt;/a>&lt;/strong> because it ensures that performing arithmetic operations (addition and scalar multiplication) on vectors within a set does not produce a new element outside that set.&lt;/p></description></item><item><title>Systems of Linear Equations</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/</guid><description>&lt;h1 id="systems-of-linear-equations">
 Systems of Linear Equations
 
 &lt;a class="anchor" href="#systems-of-linear-equations">#&lt;/a>
 
&lt;/h1>
&lt;p>A system of linear equations can be written compactly as:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>
 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>
&lt;span>
 \[ 
A\mathbf{x}=\mathbf{b}
 \]
 &lt;/span>
&lt;/blockquote>
&lt;p>This represents:&lt;/p>
&lt;ul>
&lt;li>a &lt;strong>linear transformation&lt;/strong> applied to an unknown vector \( \mathbf{x} \)&lt;/li>
&lt;li>producing an output vector \( \mathbf{b} \)&lt;/li>
&lt;/ul>
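&lt;p>A quick NumPy check of this compact form (values illustrative):&lt;/p>
&lt;pre>&lt;code class="language-python">import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])    # coefficient matrix
b = np.array([3.0, 5.0])      # output vector
x = np.linalg.solve(A, b)     # unknown vector with A x = b
print(x)                      # [0.8 1.4]
print(np.allclose(A @ x, b))  # True: A maps x to b
&lt;/code>&lt;/pre>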
&lt;hr>
&lt;h2 id="key-components">
 Key components
 
 &lt;a class="anchor" href="#key-components">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="coefficient-matrix-a">
 Coefficient matrix (A)
 
 &lt;a class="anchor" href="#coefficient-matrix-a">#&lt;/a>
 
&lt;/h3>
&lt;p>\( A \) contains the coefficients of the variables.&lt;/p></description></item><item><title>Calculus</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/</guid><description>&lt;h1 id="calculus">
 Calculus
 
 &lt;a class="anchor" href="#calculus">#&lt;/a>
 
&lt;/h1>
&lt;p>Calculus is:&lt;/p>
&lt;ul>
&lt;li>the mathematical framework for understanding and controlling how quantities change&lt;/li>
&lt;li>the mathematics of &lt;strong>change&lt;/strong> and &lt;strong>accumulation&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>It helps answer:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>How fast is something changing right now?&lt;/strong> → derivatives (differentiation)&lt;/li>
&lt;li>&lt;strong>What happens when inputs change slightly?&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Where is something maximum or minimum?&lt;/strong>&lt;/li>
&lt;li>&lt;strong>How much has accumulated over an interval?&lt;/strong> → integrals (integration)&lt;/li>
&lt;/ul>
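&lt;p>As a tiny plain-Python sketch, the &lt;em>right now&lt;/em> question can be approximated numerically with a central difference:&lt;/p>
&lt;pre>&lt;code class="language-python">def derivative(f, x, h=1e-6):
    # Central difference: approximate instantaneous rate of change of f at x.
    return (f(x + h) - f(x - h)) / (2 * h)

print(derivative(lambda x: x ** 2, 3.0))  # ~6.0, since d/dx x^2 = 2x
&lt;/code>&lt;/pre>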
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
 A[Calculus] --&amp;gt; B[Limits]
 B --&amp;gt; C[Continuity]
 B --&amp;gt; D[Derivatives]
 B --&amp;gt; E[Integrals]
 D --&amp;gt; F[Optimisation: maxima/minima]
 D --&amp;gt; G[ML: gradients &amp;amp; learning]
 E --&amp;gt; H[Accumulation: area/total change]
&lt;/pre>

&lt;hr>




&lt;ul>
 
 
 
 
 
 
 
 
 
 
 

 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">Vector Calculus&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/">Differentiation of Univariate Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/">Partial Differentiation and Gradients&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/">Gradients of Vector-Valued and Matrix Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/">Useful Gradient Identities&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/">Backpropagation and Automatic Differentiation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/">Higher-order derivatives&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/">Taylor’s series&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/">Maxima and Minima&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">Continuous Optimisation&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/">Optimisation using Gradient Descent&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/">Constrained Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/">Lagrange Multipliers&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/">Convex Optimisation&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">Nonlinear Optimisation&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/">Challenges in Gradient-Based Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/">Stochastic Gradient Descent (SGD)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/">Momentum-Based Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/">Adaptive Methods: AdaGrad, RMSProp, Adam&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/">Tuning Hyperparameters and Preprocessing&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>


&lt;hr>
&lt;div class="book-steps ">
&lt;ol>
&lt;li>
&lt;h2 id="differential-calculus-rates-of-change">
 Differential Calculus (Rates of Change)
 
 &lt;a class="anchor" href="#differential-calculus-rates-of-change">#&lt;/a>
 
&lt;/h2>
&lt;p>Studies &lt;strong>how things change&lt;/strong>.&lt;/p></description></item><item><title>Matrices</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/</guid><description>&lt;h1 id="matrices">
 Matrices
 
 &lt;a class="anchor" href="#matrices">#&lt;/a>
 
&lt;/h1>
&lt;p>Matrices are the &lt;strong>core data structure of linear algebra&lt;/strong> and the &lt;strong>workhorse of machine learning&lt;/strong>.&lt;br>
Almost every ML model can be described as a sequence of matrix operations.&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/special-matrices/">Special Matrices&lt;/a>&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="matrix">
 Matrix
 
 &lt;a class="anchor" href="#matrix">#&lt;/a>
 
&lt;/h2>
&lt;p>A &lt;strong>matrix&lt;/strong> is a rectangular array of numbers arranged in &lt;strong>rows and columns&lt;/strong>.&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;span>
 \[ 
A \in \mathbb{R}^{m \times n}
 \]
 &lt;/span>
&lt;/blockquote>
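&lt;p>As a minimal illustration (a sketch assuming NumPy; the values are arbitrary):&lt;/p>
&lt;pre>&lt;code>import numpy as np

# A has 2 rows and 3 columns, i.e. a matrix in R^(2 x 3)
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
print(A.shape)  # (2, 3)
&lt;/code>&lt;/pre>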
&lt;p>An \( m \times n \) matrix has:&lt;/p></description></item><item><title>Solving Linear Systems</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/</guid><description>&lt;h1 id="solving-linear-systems">
 Solving Linear Systems
 
 &lt;a class="anchor" href="#solving-linear-systems">#&lt;/a>
 
&lt;/h1>
&lt;p>Solve using:&lt;/p>
&lt;ul>
&lt;li>Substitution Method&lt;/li>
&lt;li>Elimination Method (Multiply &amp;amp; then Subtract)&lt;/li>
&lt;li>Cross Multiplication&lt;/li>
&lt;/ul>
&lt;p>A linear system can have exactly one of the following (a quick example of the unique-solution case follows the list):&lt;/p>
&lt;ul>
&lt;li>&lt;strong>no solution&lt;/strong>&lt;/li>
&lt;li>&lt;strong>a unique solution&lt;/strong>&lt;/li>
&lt;li>&lt;strong>infinitely many solutions&lt;/strong>&lt;/li>
&lt;/ul>
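&lt;p>For the unique-solution case, a minimal sketch assuming NumPy (the system is made up for illustration):&lt;/p>
&lt;pre>&lt;code>import numpy as np

# x + 2y = 5 and 3x + 4y = 11 have the unique solution x = 1, y = 2
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 11.0])
print(np.linalg.solve(A, b))  # [1. 2.]
&lt;/code>&lt;/pre>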
&lt;h2 id="positive-definite-matrices">
 Positive Definite Matrices
 
 &lt;a class="anchor" href="#positive-definite-matrices">#&lt;/a>
 
&lt;/h2>
&lt;p>A square matrix is positive definite if pre-multiplying and post-multiplying it by the same non-zero vector always gives a positive number, no matter which vector we choose.&lt;/p>
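&lt;p>A minimal numerical check, as a sketch assuming NumPy (the matrix and test vector are illustrative; the eigenvalue printout anticipates the property below):&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[2.0, -1.0],
              [-1.0, 2.0]])   # symmetric positive definite

x = np.array([3.0, 1.0])      # any non-zero vector
print(x @ A @ x)              # quadratic form, gives 14.0 (positive)
print(np.linalg.eigvalsh(A))  # [1. 3.], all eigenvalues positive
&lt;/code>&lt;/pre>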
&lt;p>Positive definite symmetric matrices have the property that all their eigenvalues are positive.&lt;/p></description></item><item><title>Forward and Backward Substitution</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/</guid><description>&lt;h1 id="forward-and-backward-substitution">
 Forward and Backward Substitution
 
 &lt;a class="anchor" href="#forward-and-backward-substitution">#&lt;/a>
 
&lt;/h1>
&lt;p>Forward and backward substitution are efficient algorithms used to solve linear systems when the coefficient matrix is &lt;strong>triangular&lt;/strong>; a minimal sketch of the forward case follows the list below.&lt;/p>
&lt;p>They are typically used after:&lt;/p>
&lt;ul>
&lt;li>Gaussian elimination&lt;/li>
&lt;li>LU decomposition&lt;/li>
&lt;/ul>
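&lt;p>For the lower triangular case, a minimal sketch assuming NumPy (the helper name &lt;code>forward_substitution&lt;/code> is illustrative):&lt;/p>
&lt;pre>&lt;code>import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for lower triangular L, one row at a time."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # everything to the left of the diagonal is already known
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x

L = np.array([[2.0, 0.0],
              [1.0, 3.0]])
b = np.array([4.0, 8.0])
print(forward_substitution(L, b))  # [2. 2.]
&lt;/code>&lt;/pre>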
&lt;hr>
&lt;h1 id="1-forward-substitution-lower-triangular-systems">
 1. Forward Substitution (Lower Triangular Systems)
 
 &lt;a class="anchor" href="#1-forward-substitution-lower-triangular-systems">#&lt;/a>
 
&lt;/h1>
&lt;p>Used to solve:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;span>
 \[ 
L\mathbf{x} = \mathbf{b}
 \]
 &lt;/span>
&lt;/blockquote>
&lt;p>where \( L \) is a &lt;strong>lower triangular matrix&lt;/strong>:&lt;/p></description></item><item><title>Inverse Matrix</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/</guid><description>&lt;h1 id="inverse-matrix">
 Inverse Matrix
 
 &lt;a class="anchor" href="#inverse-matrix">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>inverse of a matrix&lt;/strong> is a matrix that, when multiplied with the original matrix, produces the &lt;strong>identity matrix&lt;/strong>.&lt;/p>
&lt;p>A square matrix \( A \) is &lt;strong>invertible&lt;/strong> if there exists a matrix \( A^{-1} \) such that:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;span>
 \[ 
AA^{-1} = A^{-1}A = I
 \]
 &lt;/span>
&lt;/blockquote>
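&lt;p>A minimal numerical sketch, assuming NumPy:&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
A_inv = np.linalg.inv(A)
print(A @ A_inv)  # the identity matrix, up to floating-point rounding
&lt;/code>&lt;/pre>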
&lt;p>Here:&lt;/p></description></item><item><title>Convex Combination</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/</link><pubDate>Thu, 29 Jan 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/</guid><description>&lt;h1 id="convex-combination-of-two-points">
 Convex Combination of Two Points
 
 &lt;a class="anchor" href="#convex-combination-of-two-points">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>convex combination&lt;/strong> describes how to form a point between two points using weighted averages; a small numerical sketch follows the list below.&lt;/p>
&lt;p>It is a fundamental building block in several advanced fields:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Linear Algebra &amp;amp; Geometry&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Optimization Theory&lt;/strong>&lt;/li>
&lt;li>&lt;strong>Machine Learning&lt;/strong> (Specifically in SVMs, clustering, and data interpolation)&lt;/li>
&lt;/ul>
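&lt;p>A minimal sketch of such a weighted average, assuming NumPy (&lt;code>lam&lt;/code> stands for the weight \( \lambda \) in the definition below):&lt;/p>
&lt;pre>&lt;code>import numpy as np

x1 = np.array([0.0, 0.0])
x2 = np.array([4.0, 2.0])

lam = 0.25                     # weight between 0 and 1
x = lam * x1 + (1 - lam) * x2
print(x)  # [3.  1.5], a quarter of the way from x2 towards x1
&lt;/code>&lt;/pre>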
&lt;hr>
&lt;p>Given two points (or vectors) $\mathbf{x}_1, \mathbf{x}_2 \in \mathbb{R}^n$, a convex combination of these points is defined as:&lt;/p>
$$\mathbf{x} = \lambda \mathbf{x}_1 + (1 - \lambda)\mathbf{x}_2$$&lt;p>&lt;strong>Where:&lt;/strong>&lt;/p></description></item><item><title>Vector Spaces</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/</guid><description>&lt;h1 id="vector-spaces">
 Vector Spaces
 
 &lt;a class="anchor" href="#vector-spaces">#&lt;/a>
 
&lt;/h1>
&lt;p>A vector space is the mathematical “home” where vectors live and where addition and scaling are valid operations.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>A vector space is a set closed under vector addition and scalar multiplication.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Machine learning operates in vector spaces.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>This part covers independence, bases, rank, and geometric tools like norms and inner products, which are used to measure length, distance, and angles.&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>A &lt;strong>vector space&lt;/strong> is a set of vectors that follows &lt;strong>ten axioms&lt;/strong>, defined under two operations:&lt;/p></description></item><item><title>Feature Space</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/</guid><description>&lt;h1 id="feature">
 Feature
 
 &lt;a class="anchor" href="#feature">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>feature&lt;/strong> is an individual measurable property or characteristic of a data point used as input to a machine learning model.&lt;/p>
&lt;p>Each feature corresponds to &lt;strong>one dimension&lt;/strong>.&lt;/p>
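&lt;p>For instance, a data point with three features lives in a 3-dimensional feature space (a minimal sketch assuming NumPy; the feature names are made up):&lt;/p>
&lt;pre>&lt;code>import numpy as np

# one data point: [height_cm, weight_kg, age_years], three features
x = np.array([170.0, 65.0, 30.0])
print(x.shape)  # (3,)
&lt;/code>&lt;/pre>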


&lt;span>
 \[ 
x_i \in \mathbb{R}
 \]
 &lt;/span>


&lt;p>A data point with \( d \) features is represented as:&lt;/p></description></item><item><title>Cauchy–Schwarz</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/</guid><description>&lt;h1 id="cauchyschwarz-inequality">
 Cauchy–Schwarz Inequality
 
 &lt;a class="anchor" href="#cauchyschwarz-inequality">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>Cauchy–Schwarz Inequality&lt;/strong> is one of the most important results in linear algebra.&lt;/p>
&lt;p>It places a fundamental bound on the inner product of two vectors.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>If you see &lt;strong>angle&lt;/strong>, &lt;strong>cosine&lt;/strong>, &lt;strong>similarity&lt;/strong>, or &lt;strong>inner product bounds&lt;/strong>&lt;br>
→ think &lt;strong>Cauchy–Schwarz Inequality&lt;/strong>&lt;/p>
&lt;p>Key Idea:
The magnitude of the inner product (dot product) can never exceed the product of the vector magnitudes.
This ensures all geometric interpretations (angles, cosine) are valid.&lt;/p>
&lt;/blockquote>
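&lt;p>A quick numerical check of the bound, as a sketch assuming NumPy (the vectors are arbitrary):&lt;/p>
&lt;pre>&lt;code>import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([3.0, 0.0, 4.0])

lhs = abs(x @ y)                             # magnitude of the inner product
rhs = np.linalg.norm(x) * np.linalg.norm(y)  # product of the norms
print(lhs, rhs)  # 11.0 15.0, so the inequality holds
&lt;/code>&lt;/pre>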
&lt;hr>
&lt;h2 id="statement-of-the-inequality">
 Statement of the Inequality
 
 &lt;a class="anchor" href="#statement-of-the-inequality">#&lt;/a>
 
&lt;/h2>
&lt;p>For any vectors:&lt;/p></description></item><item><title>Matrix Decompositions</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/</guid><description>&lt;h1 id="matrix-decompositions">
 Matrix Decompositions
 
 &lt;a class="anchor" href="#matrix-decompositions">#&lt;/a>
 
&lt;/h1>
&lt;p>Decompositions reveal structure in matrices and power algorithms like PCA.&lt;/p>
&lt;p>Matrix decompositions break complex matrices into simpler parts.&lt;/p>
&lt;p>From the lecture introduction, matrices are used to describe mappings and transformations of vectors.&lt;/p>
&lt;p>That is why decomposition is important:
it lets us understand a complicated transformation by rewriting it using simpler building blocks.&lt;/p>
&lt;p>In the slides, the topic is introduced as part of three closely connected goals:
how to summarise matrices,
how matrices can be decomposed,
and how the decompositions can be used for matrix approximations.&lt;/p></description></item><item><title>Characteristic Polynomial</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/</guid><description>&lt;h1 id="characteristic-polynomial">
 Characteristic Polynomial
 
 &lt;a class="anchor" href="#characteristic-polynomial">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>characteristic polynomial&lt;/strong> of a square matrix is the key tool used to compute &lt;strong>eigenvalues&lt;/strong>.&lt;/p>
&lt;p>It connects:&lt;/p>
&lt;ul>
&lt;li>Determinants&lt;/li>
&lt;li>Trace&lt;/li>
&lt;li>Eigenvalues&lt;/li>
&lt;li>Matrix structure&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h2>
&lt;p>Let&lt;br>

&lt;span>
 \( A \in \mathbb{R}^{n \times n} \)
 &lt;/span>

&lt;br>
and 
&lt;span>
 \( \lambda \in \mathbb{R} \)
 &lt;/span>

.&lt;/p>
&lt;p>The &lt;strong>characteristic polynomial&lt;/strong> of \( A \) is defined as:&lt;/p>
&lt;blockquote class="book-hint danger">
&lt;span>
 \[ 
p_A(\lambda) = \det(A - \lambda I)
 \]
 &lt;/span>
&lt;/blockquote>
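&lt;p>A minimal sketch assuming NumPy, whose &lt;code>np.poly&lt;/code> returns the characteristic-polynomial coefficients of a square matrix:&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# coefficients, highest degree first: lambda^2 - 4*lambda + 3,
# i.e. lambda^2 - trace(A)*lambda + det(A) for this 2x2 matrix
print(np.poly(A))  # [ 1. -4.  3.]
&lt;/code>&lt;/pre>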
&lt;p>It is a polynomial in 
&lt;span>
 \( \lambda \)
 &lt;/span>

 of degree \( n \).&lt;/p></description></item><item><title>Determinant and Trace</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/</guid><description>&lt;h1 id="determinant-and-trace">
 Determinant and Trace
 
 &lt;a class="anchor" href="#determinant-and-trace">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;h2 id="minor">
 Minor
 
 &lt;a class="anchor" href="#minor">#&lt;/a>
 
&lt;/h2>
&lt;p>The &lt;strong>minor&lt;/strong> of an element 
&lt;span>
 \( a_{ij} \)
 &lt;/span>

 is the determinant of the smaller square matrix formed by:&lt;/p>
&lt;ul>
&lt;li>removing &lt;strong>row&lt;/strong> 
&lt;span>
 \( i \)
 &lt;/span>

&lt;/li>
&lt;li>removing &lt;strong>column&lt;/strong> 
&lt;span>
 \( j \)
 &lt;/span>

&lt;/li>
&lt;/ul>
&lt;p>The minor is denoted 
&lt;span>
 \( M_{ij} \)
 &lt;/span>

.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Minors are used to compute &lt;strong>cofactors&lt;/strong>, which are used for determinants and inverses (via adjoint/adjugate).&lt;/p>
&lt;/blockquote>
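&lt;p>A minimal sketch of computing one minor, assuming NumPy (delete row \( i \), delete column \( j \), take the determinant):&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0]])

i, j = 0, 1  # the minor M_01
sub = np.delete(np.delete(A, i, axis=0), j, axis=1)
print(np.linalg.det(sub))  # det([[4, 6], [7, 10]]) = -2.0
&lt;/code>&lt;/pre>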
&lt;hr>
&lt;h2 id="cofactor">
 Cofactor
 
 &lt;a class="anchor" href="#cofactor">#&lt;/a>
 
&lt;/h2>
&lt;p>The &lt;strong>cofactor&lt;/strong> of 
&lt;span>
 \( a_{ij} \)
 &lt;/span>

, denoted 
&lt;span>
 \( C_{ij} \)
 &lt;/span>

, is:&lt;/p></description></item><item><title>Eigenvalues and Eigenvectors</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/</guid><description>&lt;h1 id="eigenvalues-and-eigenvectors">
 Eigenvalues and Eigenvectors
 
 &lt;a class="anchor" href="#eigenvalues-and-eigenvectors">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Eigenvalues give scaling.&lt;/li>
&lt;li>Eigenvectors define invariant directions of transformation.&lt;/li>
&lt;/ul>
&lt;p>Eigenvalues and eigenvectors describe directions that remain unchanged under a linear transformation, except for scaling.&lt;/p>
&lt;p>From lectures:
matrix multiplication represents a transformation of space.&lt;br>
Most vectors change direction and magnitude.&lt;br>
Some special vectors only scale.&lt;br>
These are eigenvectors.&lt;/p>
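&lt;p>A minimal numerical sketch assuming NumPy, checking \( A\mathbf{v} = \lambda\mathbf{v} \) for one eigenpair:&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

eigvals, eigvecs = np.linalg.eig(A)
v = eigvecs[:, 0]             # first eigenvector (a column)
print(A @ v, eigvals[0] * v)  # both sides agree: [2. 0.] [2. 0.]
&lt;/code>&lt;/pre>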
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
A matrix transformation stretches or compresses vectors.
Eigenvectors are directions that remain unchanged.
Eigenvalues tell how much scaling happens.&lt;/p></description></item><item><title>Cholesky Decomposition</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/</guid><description>&lt;h1 id="cholesky-decomposition">
 Cholesky Decomposition
 
 &lt;a class="anchor" href="#cholesky-decomposition">#&lt;/a>
 
&lt;/h1>
&lt;p>Cholesky decomposition is a special matrix factorisation used for symmetric positive definite matrices.&lt;/p>
&lt;p>From lecture discussions, this decomposition is powerful because it reduces a matrix into a triangular form, making computations easier and more stable.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
Cholesky decomposition expresses a matrix as a product of a lower triangular matrix and its transpose.
It is efficient and numerically stable.&lt;/p>
&lt;/blockquote>
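&lt;p>A minimal sketch assuming NumPy:&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])  # symmetric positive definite

L = np.linalg.cholesky(A)   # lower triangular factor
print(L)
print(L @ L.T)              # reconstructs A
&lt;/code>&lt;/pre>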
&lt;hr>
&lt;h2 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h2>
&lt;p>For a symmetric positive definite matrix:&lt;/p></description></item><item><title>Eigen Decomposition</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/</guid><description>&lt;h1 id="eigen-decomposition">
 Eigen Decomposition
 
 &lt;a class="anchor" href="#eigen-decomposition">#&lt;/a>
 
&lt;/h1>
&lt;p>Eigen decomposition expresses a matrix using its eigenvectors and eigenvalues.&lt;/p>
&lt;p>From lecture discussions, this is one of the most important ways to understand the internal structure of a matrix.&lt;/p>
&lt;p>Instead of treating the matrix as a black box, eigen decomposition reveals its fundamental directions and scaling behaviour.&lt;/p>
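&lt;p>A minimal sketch assuming NumPy, using the standard factorisation \( A = PDP^{-1} \) with eigenvectors as the columns of \( P \) (the notation is the usual one, not necessarily the lecture&amp;rsquo;s):&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigvals, P = np.linalg.eig(A)    # columns of P are eigenvectors
D = np.diag(eigvals)             # eigenvalues on the diagonal
print(P @ D @ np.linalg.inv(P))  # reconstructs A
&lt;/code>&lt;/pre>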
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
Eigen decomposition rewrites a matrix in terms of directions (eigenvectors) and scaling factors (eigenvalues).
This makes complex transformations easier to understand and compute.&lt;/p></description></item><item><title>Diagonalization</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/</guid><description>&lt;h1 id="diagonalization">
 Diagonalization
 
 &lt;a class="anchor" href="#diagonalization">#&lt;/a>
 
&lt;/h1>
&lt;p>Diagonalisation expresses a matrix using its eigenvectors and eigenvalues when possible.&lt;/p>
&lt;p>From lecture explanation, diagonalisation is one of the most powerful tools because it converts a complicated matrix into a much simpler form.&lt;/p>
&lt;p>Instead of working with a full matrix, we work with a diagonal matrix, which is much easier to analyse and compute.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
If a matrix has enough independent eigenvectors, it can be rewritten as a diagonal matrix using a change of basis.
This simplifies matrix operations significantly.&lt;/p></description></item><item><title>Singular Value Decomposition (SVD)</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/</link><pubDate>Wed, 18 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/</guid><description>&lt;h1 id="singular-value-decomposition-svd">
 Singular Value Decomposition (SVD)
 
 &lt;a class="anchor" href="#singular-value-decomposition-svd">#&lt;/a>
 
&lt;/h1>
&lt;p>Singular Value Decomposition (SVD) is one of the most important matrix decomposition techniques in linear algebra and machine learning.&lt;/p>
&lt;p>It factorises any matrix into three simpler matrices that reveal its structure.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key Idea:
SVD decomposes a matrix into rotations + scaling.
It tells us how data is transformed along orthogonal directions.&lt;/p>
&lt;/blockquote>
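&lt;p>A minimal sketch assuming NumPy:&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)
print(s)                    # singular values, largest first
print(U @ np.diag(s) @ Vt)  # reconstructs A
&lt;/code>&lt;/pre>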
&lt;hr>
&lt;h1 id="definition">
 Definition
 
 &lt;a class="anchor" href="#definition">#&lt;/a>
 
&lt;/h1>
&lt;p>For any matrix in real space:

&lt;span style="color: green;">
 &lt;span>
 \[ 
A \in \mathbb{R}^{m \times n}
 \]
 &lt;/span>

&lt;/span>&lt;/p></description></item><item><title>Matrix Approximation</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/</guid><description>&lt;h1 id="matrix-approximation">
 Matrix Approximation
 
 &lt;a class="anchor" href="#matrix-approximation">#&lt;/a>
 
&lt;/h1>
&lt;p>Low-rank approximation keeps the most important structure while reducing noise and computation.&lt;/p>
&lt;hr>
&lt;h2 id="low-rank-approximation">
 Low-Rank Approximation
 
 &lt;a class="anchor" href="#low-rank-approximation">#&lt;/a>
 
&lt;/h2>
&lt;p>Used for:&lt;/p>
&lt;ul>
&lt;li>Dimensionality reduction&lt;/li>
&lt;li>Noise removal&lt;/li>
&lt;li>Efficient computation&lt;/li>
&lt;/ul>
&lt;p>Forms the basis of &lt;strong>PCA&lt;/strong>.&lt;/p>
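&lt;p>A minimal sketch of a rank-\( k \) approximation via truncated SVD, assuming NumPy (the data is random, purely for illustration):&lt;/p>
&lt;pre>&lt;code>import numpy as np

A = np.random.default_rng(0).normal(size=(6, 5))

U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2                                        # keep the 2 largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
print(np.linalg.norm(A - A_k))               # remaining error (Frobenius norm)
&lt;/code>&lt;/pre>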
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/">
 Matrix Decompositions
&lt;/a>&lt;/p></description></item><item><title>Vector Calculus</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/</guid><description>&lt;h1 id="vector-calculus">
 Vector Calculus
 
 &lt;a class="anchor" href="#vector-calculus">#&lt;/a>
 
&lt;/h1>
&lt;p>Vector calculus extends differentiation to multivariate and vector-valued functions.&lt;/p>
&lt;p>Gradients power learning. This section builds differentiation skills needed for backpropagation.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/">Differentiation of Univariate Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/">Partial Differentiation and Gradients&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/">Gradients of Vector-Valued and Matrix Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/">Useful Gradient Identities&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/">Backpropagation and Automatic Differentiation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/">Higher-order derivatives&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/">Taylor’s series&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/">Maxima and Minima&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>






&lt;pre class="mermaid">
flowchart TD

 %% Core Node
 PD[&amp;#34;Partial Derivatives&amp;#34;]

 %% Supporting Concepts
 DQ[&amp;#34;Difference Quotient&amp;#34;]
 JH[&amp;#34;Jacobian / Hessian&amp;#34;]
 TS[&amp;#34;Taylor Series&amp;#34;]

 %% Application Chapters
 CH6[&amp;#34;Chapter 6&amp;lt;br/&amp;gt;Probability&amp;#34;]
 CH7[&amp;#34;Chapter 7&amp;lt;br/&amp;gt;Optimization&amp;#34;]
 CH9[&amp;#34;Chapter 9&amp;lt;br/&amp;gt;Regression&amp;#34;]
 CH10[&amp;#34;Chapter 10&amp;lt;br/&amp;gt;Dimensionality Reduction&amp;#34;]
 CH11[&amp;#34;Chapter 11&amp;lt;br/&amp;gt;Density Estimation&amp;#34;]
 CH12[&amp;#34;Chapter 12&amp;lt;br/&amp;gt;Classification&amp;#34;]

 %% Relationships
 DQ --&amp;gt;|defines| PD
 PD --&amp;gt;|collected in| JH
 JH --&amp;gt;|used in| TS
 JH --&amp;gt;|used in| CH6
	
 PD --&amp;gt;|used in| CH7
 PD --&amp;gt;|used in| CH9
 PD --&amp;gt;|used in| CH10
 PD --&amp;gt;|used in| CH11
 PD --&amp;gt;|used in| CH12

 %% Styling (Your Soft Academic Palette)
 style PD fill:#90CAF9,stroke:#1E88E5,color:#000

 style DQ fill:#CE93D8,stroke:#8E24AA,color:#000
 style JH fill:#CE93D8,stroke:#8E24AA,color:#000
 style TS fill:#CE93D8,stroke:#8E24AA,color:#000
 style CH6 fill:#CE93D8,stroke:#8E24AA,color:#000
	
 style CH7 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH9 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH10 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH11 fill:#C8E6C9,stroke:#2E7D32,color:#000
 style CH12 fill:#C8E6C9,stroke:#2E7D32,color:#000

&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">
 Calculus
&lt;/a>&lt;/p></description></item><item><title>Continuous Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/</guid><description>&lt;h1 id="continuous-optimisation">
 Continuous Optimisation
 
 &lt;a class="anchor" href="#continuous-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;p>Optimisation finds parameters that minimise (or maximise) an objective function.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/">Optimisation using Gradient Descent&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/">Constrained Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/">Lagrange Multipliers&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/">Convex Optimisation&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">
 Calculus
&lt;/a>&lt;/p></description></item><item><title>Optimisation using Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/</guid><description>&lt;h1 id="optimisation-using-gradient-descent">
 Optimisation using Gradient Descent
 
 &lt;a class="anchor" href="#optimisation-using-gradient-descent">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an optimisation algorithm used to train ML models and neural networks.&lt;/p>
&lt;ul>
&lt;li>Gradient descent updates parameters by moving opposite the gradient.&lt;/li>
&lt;/ul>
&lt;p>It trains a model by minimising the error:&lt;/p>
&lt;ul>
&lt;li>between predicted and actual results&lt;/li>
&lt;li>by iteratively adjusting its parameters&lt;/li>
&lt;li>moving step‑by‑step in the direction of the steepest decrease in the loss function, which helps the model learn the weights that give the best predictions (a minimal sketch of the update follows the list of variants below)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="types-of-gradient-descent-learning-algorithms">
 Types of Gradient Descent learning algorithms
 
 &lt;a class="anchor" href="#types-of-gradient-descent-learning-algorithms">#&lt;/a>
 
&lt;/h2>
&lt;ol>
&lt;li>Batch gradient descent&lt;/li>
&lt;li>Stochastic gradient descent&lt;/li>
&lt;li>Mini-batch gradient descent&lt;/li>
&lt;/ol>
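&lt;p>A minimal sketch of the basic update on a toy one-parameter loss (plain Python; the loss is made up for illustration):&lt;/p>
&lt;pre>&lt;code># minimise L(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3)
theta, lr = 0.0, 0.1        # initial parameter and learning rate
for _ in range(100):
    grad = 2 * (theta - 3)
    theta -= lr * grad      # step opposite the gradient
print(theta)                # approximately 3.0, the minimiser
&lt;/code>&lt;/pre>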
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Constrained Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/</guid><description>&lt;h1 id="constrained-optimisation">
 Constrained Optimisation
 
 &lt;a class="anchor" href="#constrained-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;p>Optimisation with constraints (equalities/inequalities).&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Lagrange Multipliers</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/</guid><description>&lt;h1 id="lagrange-multipliers">
 Lagrange Multipliers
 
 &lt;a class="anchor" href="#lagrange-multipliers">#&lt;/a>
 
&lt;/h1>
&lt;p>Transforms constrained problems into unconstrained ones using Lagrangians.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Convex Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/</guid><description>&lt;h1 id="convex-optimisation">
 Convex Optimisation
 
 &lt;a class="anchor" href="#convex-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;p>For a convex objective every local minimum is a global minimum, which makes optimisation reliable.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">
 Continuous Optimisation
&lt;/a>&lt;/p></description></item><item><title>Nonlinear Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/</guid><description>&lt;h1 id="nonlinear-optimisation-in-machine-learning">
 Nonlinear Optimisation in Machine Learning
 
 &lt;a class="anchor" href="#nonlinear-optimisation-in-machine-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Practical training challenges and modern optimisers used in ML.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/">Challenges in Gradient-Based Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/">Stochastic Gradient Descent (SGD)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/">Momentum-Based Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/">Adaptive Methods: AdaGrad, RMSProp, Adam&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/">Tuning Hyperparameters and Preprocessing&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">
 Calculus
&lt;/a>&lt;/p></description></item><item><title>Challenges in Gradient-Based Optimisation</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/</guid><description>&lt;h1 id="challenges-in-gradient-based-optimisation">
 Challenges in Gradient-Based Optimisation
 
 &lt;a class="anchor" href="#challenges-in-gradient-based-optimisation">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Local optima and flat regions&lt;/li>
&lt;li>Differential curvature&lt;/li>
&lt;li>Difficult topologies (cliffs and valleys)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Stochastic Gradient Descent (SGD)</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/</guid><description>&lt;h1 id="stochastic-gradient-descent-sgd">
 Stochastic Gradient Descent (SGD)
 
 &lt;a class="anchor" href="#stochastic-gradient-descent-sgd">#&lt;/a>
 
&lt;/h1>
&lt;p>SGD uses mini-batches to trade exact gradients for speed and generalisation.&lt;/p>
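&lt;p>A minimal sketch of mini-batch SGD on a toy least-squares problem, assuming NumPy (data, batch size, and learning rate are illustrative):&lt;/p>
&lt;pre>&lt;code>import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true

w, lr, batch = np.zeros(3), 0.1, 10
for epoch in range(50):
    idx = rng.permutation(len(y))        # reshuffle each epoch
    for start in range(0, len(y), batch):
        b = idx[start:start + batch]     # one mini-batch
        grad = 2 * X[b].T @ (X[b] @ w - y[b]) / batch
        w -= lr * grad                   # noisy gradient step
print(w)  # close to [ 1.  -2.   0.5]
&lt;/code>&lt;/pre>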
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Momentum-Based Learning</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/</guid><description>&lt;h1 id="momentum-based-learning">
 Momentum-Based Learning
 
 &lt;a class="anchor" href="#momentum-based-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Momentum smooths updates and helps traverse valleys efficiently.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Adaptive Methods: AdaGrad, RMSProp, Adam</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/</guid><description>&lt;h1 id="adaptive-methods-adagrad-rmsprop-adam">
 Adaptive Methods: AdaGrad, RMSProp, Adam
 
 &lt;a class="anchor" href="#adaptive-methods-adagrad-rmsprop-adam">#&lt;/a>
 
&lt;/h1>
&lt;p>Adaptive methods adjust learning rates per-parameter.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Tuning Hyperparameters and Preprocessing</title><link>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/</guid><description>&lt;h1 id="tuning-hyperparameters-and-preprocessing">
 Tuning Hyperparameters and Preprocessing
 
 &lt;a class="anchor" href="#tuning-hyperparameters-and-preprocessing">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Learning rate schedules&lt;/li>
&lt;li>Initialisation&lt;/li>
&lt;li>Tuning hyperparameters&lt;/li>
&lt;li>Importance of feature preprocessing&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">
 Nonlinear Optimisation
&lt;/a>&lt;/p></description></item><item><title>Dimensionality reduction and PCA</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/</guid><description>&lt;h1 id="dimensionality-reduction-and-pca">
 Dimensionality reduction and PCA
 
 &lt;a class="anchor" href="#dimensionality-reduction-and-pca">#&lt;/a>
 
&lt;/h1>
&lt;p>PCA and SVM connect linear algebra, geometry, and optimisation.&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/">Principal Component Analysis (PCA)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/">PCA Theory&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/">PCA in Practice&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/">Latent Variable Perspective&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/">Mathematical Preliminaries of SVM&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/">Nonlinear SVM and Kernels&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/">
 Linear Algebra
&lt;/a>&lt;/p></description></item><item><title>Principal Component Analysis (PCA)</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/</guid><description>&lt;h1 id="principal-component-analysis-pca">
 Principal Component Analysis (PCA)
 
 &lt;a class="anchor" href="#principal-component-analysis-pca">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A &lt;strong>dimensionality reduction&lt;/strong> technique.&lt;/li>
&lt;li>Helps us &lt;strong>reduce the number of features&lt;/strong> in a dataset while keeping the most important information.&lt;/li>
&lt;li>Simplifies complex datasets by transforming correlated features into a smaller set of uncorrelated components.&lt;/li>
&lt;li>Uses &lt;strong>linear algebra&lt;/strong> to transform data into &lt;strong>new features&lt;/strong> called principal components.&lt;/li>
&lt;li>Finds these by calculating &lt;strong>eigenvectors (directions)&lt;/strong> and &lt;strong>eigenvalues (importance)&lt;/strong> from the &lt;strong>covariance matrix&lt;/strong>.&lt;/li>
&lt;li>&lt;strong>Selects the top components with the highest eigenvalues&lt;/strong> and &lt;strong>projects the data onto them to simplify the dataset&lt;/strong> (a sketch follows the list).&lt;/li>
&lt;/ul>
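&lt;p>A minimal sketch of these steps, assuming NumPy (random data, purely for illustration):&lt;/p>
&lt;pre>&lt;code>import numpy as np

X = np.random.default_rng(0).normal(size=(200, 5))  # 200 samples, 5 features

Xc = X - X.mean(axis=0)                # 1. centre the data
C = np.cov(Xc, rowvar=False)           # 2. covariance matrix (5 x 5)
eigvals, eigvecs = np.linalg.eigh(C)   # 3. eigenvalues in ascending order
top2 = eigvecs[:, -2:]                 # directions with largest eigenvalues
Z = Xc @ top2                          # 4. project onto 2 components
print(Z.shape)                         # (200, 2)
&lt;/code>&lt;/pre>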
&lt;blockquote class="book-hint default">
&lt;p>PCA prioritizes the directions where the data varies the most because more variation = more useful information.&lt;/p></description></item><item><title>PCA Theory</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/</guid><description>&lt;h1 id="pca-theory">
 PCA Theory
 
 &lt;a class="anchor" href="#pca-theory">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Problem setting&lt;/li>
&lt;li>Maximum variance perspective&lt;/li>
&lt;li>Projection perspective&lt;/li>
&lt;li>Eigenvector and low-rank approximations&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>PCA in Practice</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/</guid><description>&lt;h1 id="pca-in-practice">
 PCA in Practice
 
 &lt;a class="anchor" href="#pca-in-practice">#&lt;/a>
 
&lt;/h1>
&lt;p>Key steps of PCA in practice, including considerations in high dimensions.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>Latent Variable Perspective</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/</guid><description>&lt;h1 id="latent-variable-perspective">
 Latent Variable Perspective
 
 &lt;a class="anchor" href="#latent-variable-perspective">#&lt;/a>
 
&lt;/h1>
&lt;p>PCA can be interpreted as modelling data using a smaller number of latent variables.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>Mathematical Preliminaries of SVM</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/</guid><description>&lt;h1 id="mathematical-preliminaries-of-svm">
 Mathematical Preliminaries of SVM
 
 &lt;a class="anchor" href="#mathematical-preliminaries-of-svm">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Primal and dual perspectives&lt;/li>
&lt;li>Geometry of margins&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>Nonlinear SVM and Kernels</title><link>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/</guid><description>&lt;h1 id="nonlinear-svm-and-kernels">
 Nonlinear SVM and Kernels
 
 &lt;a class="anchor" href="#nonlinear-svm-and-kernels">#&lt;/a>
 
&lt;/h1>
&lt;p>Kernels allow inner products in high-dimensional feature spaces without explicit mapping.&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">
 Dimensionality reduction and PCA
&lt;/a>&lt;/p></description></item><item><title>AI Learning Resources</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-notes/</link><pubDate>Sat, 03 Jan 2026 12:00:00 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-notes/</guid><description>&lt;h1 id="ai-learning-resources">
 AI Learning Resources
 
 &lt;a class="anchor" href="#ai-learning-resources">#&lt;/a>
 
&lt;/h1>
&lt;p>A curated list of &lt;strong>high-quality online courses&lt;/strong> to learn Artificial Intelligence, Machine Learning, and Deep Learning from reputable universities and organisations.&lt;/p>
&lt;hr>
&lt;h2 id="recommended-books--references">
 Recommended Books &amp;amp; References
 
 &lt;a class="anchor" href="#recommended-books--references">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;h3 id="deep-neural-networks-dnn">
 Deep Neural Networks (DNN)
 
 &lt;a class="anchor" href="#deep-neural-networks-dnn">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Deep Learning&lt;/strong>. MIT Press.&lt;br>
Goodfellow, I., Bengio, Y., &amp;amp; Courville, A. (2016).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Introduction to Deep Learning&lt;/strong>. MIT Press.&lt;br>
Charniak, E. (2019).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Deep Learning with Python&lt;/strong>. Simon &amp;amp; Schuster.&lt;br>
Chollet, F. (2021).&lt;/p></description></item><item><title>ML Pipeline</title><link>https://arshadhs.github.io/docs/ai/machine-learning/99-ml-pipeline-model/</link><pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/99-ml-pipeline-model/</guid><description>&lt;h1 id="machine-learning-pipeline-preprocessing--models">
 Machine Learning Pipeline: Preprocessing &amp;amp; Models
 
 &lt;a class="anchor" href="#machine-learning-pipeline-preprocessing--models">#&lt;/a>
 
&lt;/h1>
&lt;p>This page explains both &lt;strong>data preprocessing&lt;/strong> and &lt;strong>model development concepts&lt;/strong> in a clear, structured way to support understanding.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>A complete ML pipeline includes preprocessing, feature engineering, feature selection, and model training.&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h1 id="1-data-preprocessing-overview">
 1. Data Preprocessing Overview
 
 &lt;a class="anchor" href="#1-data-preprocessing-overview">#&lt;/a>
 
&lt;/h1>
&lt;p>Raw data is often:&lt;/p>
&lt;ul>
&lt;li>Noisy&lt;/li>
&lt;li>Incomplete&lt;/li>
&lt;li>Inconsistent&lt;/li>
&lt;/ul>
&lt;p>Preprocessing ensures data is suitable for machine learning.&lt;/p>
&lt;hr>
&lt;h1 id="2-missing-values">
 2. Missing Values
 
 &lt;a class="anchor" href="#2-missing-values">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Why they occur&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Sensor errors&lt;/li>
&lt;li>Data collection issues&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>Methods&lt;/strong>&lt;/p></description></item></channel></rss>