<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI on Arshad Siddiqui</title><link>https://arshadhs.github.io/tags/ai/</link><description>Recent content in AI on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 22 Apr 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://arshadhs.github.io/tags/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Formula Sheet</title><link>https://arshadhs.github.io/docs/ai/statistics/00_formulas/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/00_formulas/</guid><description>&lt;h1 id="formula-sheet">
 Formula Sheet
 
 &lt;a class="anchor" href="#formula-sheet">#&lt;/a>
 
&lt;/h1>
&lt;p>This page is a quick reference of &lt;strong>definitions + formulas&lt;/strong>, grouped by module.&lt;/p>
&lt;hr>
&lt;h2 id="notation">
 Notation
 
 &lt;a class="anchor" href="#notation">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Sample size: 
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>

 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>

&lt;span>
 \( n \)
 &lt;/span>

 (sample), 
&lt;span>
 \( N \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Sample mean: 
&lt;span>
 \( \bar{x} \)
 &lt;/span>

, population mean: 
&lt;span>
 \( \mu \)
 &lt;/span>

&lt;/li>
&lt;li>Sample variance: 
&lt;span>
 \( s^2 \)
 &lt;/span>

, population variance: 
&lt;span>
 \( \sigma^2 \)
 &lt;/span>

&lt;/li>
&lt;li>Sample SD: 
&lt;span>
 \( s \)
 &lt;/span>

, population SD: 
&lt;span>
 \( \sigma \)
 &lt;/span>

&lt;/li>
&lt;li>Complement: 
&lt;span>
 \( A^c \)
 &lt;/span>

&lt;/li>
&lt;li>Intersection (“and”): 
&lt;span>
 \( A\cap B \)
 &lt;/span>

, union (“or”): 
&lt;span>
 \( A\cup B \)
 &lt;/span>

&lt;/li>
&lt;li>Conditional probability: 
&lt;span>
 \( P(A\mid B) \)
 &lt;/span>

&lt;/li>
&lt;/ul>
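&lt;p>As a worked reminder of how this notation combines, conditional probability is defined (assuming \( P(B) > 0 \)) by:&lt;/p>
&lt;p>
&lt;span>
 \( P(A\mid B) = \frac{P(A\cap B)}{P(B)} \)
 &lt;/span>
&lt;/p>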
&lt;hr>
&lt;h1 id="1-basic-probability--statistics">
 1. Basic Probability &amp;amp; Statistics
 
 &lt;a class="anchor" href="#1-basic-probability--statistics">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="11-measures-of-central-tendency">
 1.1 Measures of Central Tendency
 
 &lt;a class="anchor" href="#11-measures-of-central-tendency">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="arithmetic-mean">
 Arithmetic mean
 
 &lt;a class="anchor" href="#arithmetic-mean">#&lt;/a>
 
&lt;/h3>
&lt;p>Sample mean (ungrouped):&lt;/p></description></item><item><title>Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</guid><description>&lt;h1 id="supervised-learning">
 Supervised Learning
 
 &lt;a class="anchor" href="#supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Trained using &lt;strong>labelled data&lt;/strong>.&lt;br>
Each example in the training set includes the &lt;strong>correct output&lt;/strong>.&lt;br>
The algorithm learns to &lt;strong>generalise&lt;/strong> and make predictions on unseen data.&lt;br>
Requires &lt;strong>human intervention&lt;/strong> for labelling and setup.&lt;br>
Widely used because it is generally more &lt;strong>accurate&lt;/strong> than unsupervised methods, producing &lt;strong>highly accurate results&lt;/strong> when trained on good-quality labelled data.&lt;/p>
&lt;hr>
&lt;h2 id="classification">
 Classification
 
 &lt;a class="anchor" href="#classification">#&lt;/a>
 
&lt;/h2>
&lt;p>Output is &lt;strong>discrete&lt;/strong> (e.g. Yes/No, Spam/Not Spam).&lt;br>
Used for &lt;strong>categorising data&lt;/strong> into predefined classes.&lt;br>
Support Vector Machine (SVM) is a common classifier (a linear classifier with margin-based separation).&lt;/p></description></item><item><title>Artificial Intelligence</title><link>https://arshadhs.github.io/docs/ai/</link><pubDate>Thu, 04 Jul 2024 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/</guid><description>&lt;h1 id="my-ai-notes">
 My AI Notes
 
 &lt;a class="anchor" href="#my-ai-notes">#&lt;/a>
 
&lt;/h1>
&lt;p>Learning how machines learn! My working notes as I learn AI.&lt;/p>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 AI[Artificial Intelligence]
 ML[Machine Learning]
 DL[Deep Learning]
 FM[Foundation Models]
 LLM[LLM Models]

 AI --&amp;gt; ML
 ML --&amp;gt; DL
 DL --&amp;gt; FM
 FM --&amp;gt; LLM

 style AI fill:#E1F5FE
 style ML fill:#C8E6C9
 style DL fill:#90CAF9
 style FM fill:#64B5F6
 style LLM fill:#FFCCBC
&lt;/pre>

&lt;hr>
&lt;ul>
&lt;li>Mathematical Foundations for Machine Learning&lt;/li>
&lt;li>Statistical Methods&lt;/li>
&lt;li>Machine Learning&lt;/li>
&lt;li>Deep Neural Networks&lt;/li>
&lt;/ul>
&lt;hr>




&lt;ul>
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/">AI Foundation&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stages/">AI Stages: ANI, AGI, ASI&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stack/">AI Stack&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/">AI Pipeline&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-notes/">AI Learning Resources&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">Machine Learning&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/">Supervised Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/">Unsupervised Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/">Semi-Supervised Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/">Reinforcement Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/">ML Workflow&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/">Regression(Linear Models)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/">Ordinary Least Squares&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/">Cost Function&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/">Gradient Descent&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/">Classification(Linear Models)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/">Decision Tree&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/">Instance-based Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/">Support Vector Machine&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/">Bayesian Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/">Ensemble Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/">Evaluation/Comparison&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/99-ml-pipeline-model/">ML Pipeline&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/genai/">Generative AI&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/genai/foundation-model/">Foundation Models&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/genai/llm/">LLM - Model&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/genai/ai-agents/">AI Agents&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/genai/rag/">Retrieval-Augmented Generation (RAG)&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">Deep Learning&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/">Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/">Artificial Neuron and Perceptron&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/030-linear-neural-networks-for-regression/">LNN for Regression&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/035-gradient-descent-algorithm/">Gradient Descent Algorithm&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/040-linear-neural-networks-for-classification/">LNN for Classification&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/050-deep-feedforward/">Deep Feedforward Neural Networks (DFNN) for Classification&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/060-cnn-fundamentals/">Convolutional Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/065-deep-cnn-architectures/">Deep CNN Architectures&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/067-cnn-model/">CNN Pipeline&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/070-recurrent-nn/">Recurrent Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/075-recurrent-nn-deep/">Deep Recurrent Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/">Attention Mechanism&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/">Transformer&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/">Optimisation of Deep models&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/">Regularisation for Deep models&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/">Mathematical Foundation&lt;/a>

 
 



&lt;ul>
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/">Linear Algebra&lt;/a>

 
 



&lt;ul>
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/">Linear Systems&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/010-systems-of-linear-equations/">Systems of Linear Equations&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/020-matrices/">Matrices&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/matrix-transposition/">Matrix Transposition&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/030-solving-linear-systems/">Solving Linear Systems&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/forward-backward/">Forward and Backward Substitution&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/inverse-matrix/">Inverse Matrix&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/01-linear-systems/convex/">Convex Combination&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/">Vector Spaces&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/020-basis-and-rank/">Basis and Rank&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/010-linear-independence/">Linear Independence&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/030-norm/">Norm&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/040-inner-products/">Inner Products and Dot Product&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/050-lengths-and-distances/">Lengths and Distances&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/060-angles-and-orthogonality/">Angles and Orthogonality&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/070-orthonormal-basis/">Orthonormal Basis&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/feature-space/">Feature Space&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/02-vector-spaces/cauchyschwarz/">Cauchy–Schwarz&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/">Matrix Decompositions&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/special-matrices/">Special Matrices&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/characteristic-polynomial/">Characteristic Polynomial&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/010-determinant-and-trace/">Determinant and Trace&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/020-eigenvalues-and-eigenvectors/">Eigenvalues and Eigenvectors&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/030-cholesky-decomposition/">Cholesky Decomposition&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/040-eigen-decomposition/">Eigen Decomposition&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/diagonalization/">Diagonalization&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/050-singular-value-decomposition/">Singular Value Decomposition (SVD)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/03-matrix-decomposition/060-matrix-approximation/">Matrix Approximation&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/">Dimensionality reduction and PCA&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca/">Principal Component Analysis (PCA)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-theory/">PCA Theory&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/pca-practice/">PCA in Practice&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/latent-variable-view/">Latent Variable Perspective&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/svm-mathematical-foundations/">Mathematical Preliminaries of SVM&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/010-linear-algebra/07-dimensionality-reduction/kernels/">Nonlinear SVM and Kernels&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/">Calculus&lt;/a>

 
 



&lt;ul>
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/">Vector Calculus&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/010-univariate-differentiation/">Differentiation of Univariate Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/020-partial-derivatives-and-gradients/">Partial Differentiation and Gradients&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/030-vector-and-matrix-gradients/">Gradients of Vector-Valued and Matrix Functions&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/050-gradient-identities/">Useful Gradient Identities&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/060-backpropagation/">Backpropagation and Automatic Differentiation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/070-higher-order-derivatives/">Higher-order derivatives&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/080-taylors-series/">Taylor’s series&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/04-vector-calculus/090-maxima-and-minima/">Maxima and Minima&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/">Continuous Optimisation&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/gradient-descent/">Optimisation using Gradient Descent&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/constrained-optimisation/">Constrained Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/lagrange-multipliers/">Lagrange Multipliers&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/05-optimisation/convex-optimisation/">Convex Optimisation&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/">Nonlinear Optimisation&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/optimisation-challenges/">Challenges in Gradient-Based Optimisation&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/stochastic-gradient-descent/">Stochastic Gradient Descent (SGD)&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/momentum-methods/">Momentum-Based Learning&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/adaptive-methods/">Adaptive Methods: AdaGrad, RMSProp, Adam&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/maths/020-calculus/06-nonlinear-optimisation/hyperparameter-tuning/">Tuning Hyperparameters and Preprocessing&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/">Statistics&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/00_formulas/">Formula Sheet&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/">Stats Formula Sheet&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/">Basic Statistics&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/">Basic Probability&lt;/a>
 &lt;/li>
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/">Hypothesis Testing&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/">Prediction &amp;amp; Forecasting&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/">Gaussian Mixture model &amp;amp; Expectation Maximization&lt;/a>
 &lt;/li>
 
 

 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/">Conditional Probability &amp;amp; Bayes’ Theorem&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/">Conditional Probability&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/">Bayes’ Theorem&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/">Naïve Bayes&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/">Probability Distributions&lt;/a>

 
 



&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/">Random Variables&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/">Common Probability Distributions&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>

 &lt;/li>
 
&lt;/ul>


&lt;hr>
&lt;ul>
&lt;li>Machine Learning → The broad field where systems learn patterns from data to make predictions or decisions.&lt;/li>
&lt;li>Neural Networks → A subset of machine learning that uses interconnected artificial neurons to model complex relationships.&lt;/li>
&lt;li>Deep Learning → A subset of neural networks that uses many hidden layers to learn high-level features from large datasets.&lt;/li>
&lt;li>Foundation Models → Large deep learning models trained on massive datasets and reused across many tasks using transfer learning.&lt;/li>
&lt;li>LLMs (Large Language Models) → A specialised type of foundation model focused on understanding and generating human language.&lt;/li>
&lt;/ul>
&lt;hr>


&lt;pre class="mermaid">
flowchart TD
AI[&amp;#34;Artificial&amp;lt;br/&amp;gt;Intelligence&amp;#34;]
ML[&amp;#34;Machine&amp;lt;br/&amp;gt;Learning&amp;#34;]
NN[&amp;#34;Neural&amp;lt;br/&amp;gt;Networks&amp;#34;]
DL[&amp;#34;Deep&amp;lt;br/&amp;gt;Learning&amp;#34;]
FM[&amp;#34;Foundation&amp;lt;br/&amp;gt;Models&amp;#34;]
LLM[&amp;#34;LLM&amp;lt;br/&amp;gt;Models&amp;#34;]

AI --&amp;gt; ML
ML --&amp;gt; NN
NN --&amp;gt; DL
DL --&amp;gt; FM
FM --&amp;gt; LLM

LR[&amp;#34;Linear&amp;lt;br/&amp;gt;Regression&amp;#34;]
DT[&amp;#34;Decision&amp;lt;br/&amp;gt;Trees&amp;#34;]
ML --&amp;gt; LR
ML --&amp;gt; DT

MLP[&amp;#34;MLP&amp;#34;]
CNN[&amp;#34;CNN&amp;#34;]
NN --&amp;gt; MLP
NN --&amp;gt; CNN

CNNDL[&amp;#34;CNN&amp;lt;br/&amp;gt;(deep)&amp;#34;]
RNN[&amp;#34;RNN&amp;#34;]
DL --&amp;gt; CNNDL
DL --&amp;gt; RNN

BERT[&amp;#34;BERT&amp;#34;]
CLIP[&amp;#34;CLIP&amp;#34;]
FM --&amp;gt; BERT
FM --&amp;gt; CLIP

GPT[&amp;#34;GPT&amp;#34;]
LLAMA[&amp;#34;LLaMA&amp;#34;]
LLM --&amp;gt; GPT
LLM --&amp;gt; LLAMA

TEXT[&amp;#34;Text&amp;#34;]
IMAGE[&amp;#34;Images&amp;#34;]
AUDIO[&amp;#34;Audio&amp;#34;]
VIDEO[&amp;#34;Video&amp;#34;]
LLM --&amp;gt; TEXT
LLM --&amp;gt; IMAGE
LLM --&amp;gt; AUDIO
LLM --&amp;gt; VIDEO

style AI fill:#90CAF9,stroke:#1E88E5,color:#000
style ML fill:#90CAF9,stroke:#1E88E5,color:#000
style NN fill:#90CAF9,stroke:#1E88E5,color:#000

style DL fill:#CE93D8,stroke:#8E24AA,color:#000
style FM fill:#CE93D8,stroke:#8E24AA,color:#000

style LLM fill:#C8E6C9,stroke:#2E7D32,color:#000
style LR fill:#C8E6C9,stroke:#2E7D32,color:#000
style DT fill:#C8E6C9,stroke:#2E7D32,color:#000
style MLP fill:#C8E6C9,stroke:#2E7D32,color:#000
style CNN fill:#C8E6C9,stroke:#2E7D32,color:#000
style CNNDL fill:#C8E6C9,stroke:#2E7D32,color:#000
style RNN fill:#C8E6C9,stroke:#2E7D32,color:#000
style BERT fill:#C8E6C9,stroke:#2E7D32,color:#000
style CLIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style GPT fill:#C8E6C9,stroke:#2E7D32,color:#000
style LLAMA fill:#C8E6C9,stroke:#2E7D32,color:#000
style TEXT fill:#C8E6C9,stroke:#2E7D32,color:#000
style IMAGE fill:#C8E6C9,stroke:#2E7D32,color:#000
style AUDIO fill:#C8E6C9,stroke:#2E7D32,color:#000
style VIDEO fill:#C8E6C9,stroke:#2E7D32,color:#000
&lt;/pre>

&lt;hr>
&lt;p>&lt;img src="https://arshadhs.github.io/images/ai/ai_ml_dl_ds_diagram.png" alt="AI, ML, DL, and Data Science Diagram" />&lt;/p></description></item><item><title>Stats Formula Sheet</title><link>https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/</link><pubDate>Wed, 25 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/ism-formula-sheet/</guid><description>&lt;h1 id="stats-formula-sheet">
 Stats Formula Sheet
 
 &lt;a class="anchor" href="#stats-formula-sheet">#&lt;/a>
 
&lt;/h1>
&lt;p>Keep this page as a quick reference of &lt;strong>definitions + formulas&lt;/strong>.&lt;/p>
&lt;hr>
&lt;h2 id="notation">
 Notation
 
 &lt;a class="anchor" href="#notation">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Sample size: 
&lt;span>
 \( n \)
 &lt;/span>

 (sample), 
&lt;span>
 \( N \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Mean: 
&lt;span>
 \( \bar{x} \)
 &lt;/span>

 (sample), 
&lt;span>
 \( \mu \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Variance: 
&lt;span>
 \( s^2 \)
 &lt;/span>

 (sample), 
&lt;span>
 \( \sigma^2 \)
 &lt;/span>

 (population)&lt;/li>
&lt;li>Standard deviation: 
&lt;span>
 \( s \)
 &lt;/span>

 (sample), 
&lt;span>
 \( \sigma \)
 &lt;/span>

 (population)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="module-1-basic-statistics">
 Module 1: Basic Statistics
 
 &lt;a class="anchor" href="#module-1-basic-statistics">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="measures-of-central-tendency">
 Measures of Central Tendency
 
 &lt;a class="anchor" href="#measures-of-central-tendency">#&lt;/a>
 
&lt;/h3>
&lt;p>&lt;strong>Sample mean (ungrouped):&lt;/strong>&lt;/p></description></item><item><title>Unsupervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</guid><description>&lt;h1 id="unsupervised-learning">
 Unsupervised Learning
 
 &lt;a class="anchor" href="#unsupervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Works on &lt;strong>unlabelled raw data&lt;/strong>.&lt;/li>
&lt;li>The algorithm &lt;strong>discovers hidden patterns&lt;/strong> without prior knowledge of outcomes.&lt;/li>
&lt;li>Requires &lt;strong>no human intervention&lt;/strong> during training.&lt;/li>
&lt;li>Does not predict labels directly; it &lt;strong>groups or organises data&lt;/strong> instead.&lt;/li>
&lt;li>Results are &lt;strong>harder to validate&lt;/strong> because there is no ground truth to compare against.&lt;/li>
&lt;li>Common techniques include &lt;strong>Clustering&lt;/strong>, &lt;strong>Association&lt;/strong>, and &lt;strong>Dimensionality Reduction&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
stateDiagram-v2

 %% ML maths-based colours (same palette as supervised)
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#f3f4f6,stroke:#374151

 %% Root
 USL: Unsupervised Learning

 %% Main branches
 USL --&amp;gt; CLU:::category
 CLU: Clustering

 USL --&amp;gt; DR:::category
 DR: Dimensionality Reduction

 %% Clustering algorithms
 CLU --&amp;gt; KM:::geometry
 KM: K-Means

 CLU --&amp;gt; HC:::geometry
 HC: Hierarchical Clustering

 CLU --&amp;gt; DB:::geometry
 DB: DBSCAN

 %% Probabilistic models
 USL --&amp;gt; PM:::category
 PM: Probabilistic Models

 PM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 PM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;h2 id="clustering">
 Clustering
 
 &lt;a class="anchor" href="#clustering">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Groups &lt;strong>similar data points&lt;/strong> together based on shared features.&lt;/li>
&lt;li>Commonly used for &lt;strong>market segmentation&lt;/strong>, &lt;strong>image compression&lt;/strong>, and &lt;strong>anomaly detection&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-types-of-clustering">
 Common Types of Clustering
 
 &lt;a class="anchor" href="#common-types-of-clustering">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>K-Means Clustering&lt;/strong> – Divides data into &lt;em>K&lt;/em> groups based on similarity.&lt;/li>
&lt;li>&lt;strong>Hierarchical Clustering&lt;/strong> – Builds a hierarchy (tree) of clusters.&lt;/li>
&lt;li>&lt;strong>DBSCAN (Density-Based Spatial Clustering)&lt;/strong> – Groups points that lie in dense regions and flags sparse points as noise/outliers.&lt;/li>
&lt;/ul>
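&lt;p>As a concrete sketch of the assign-then-update loop at the heart of K-Means, here is a minimal pure-Python version (the points, the choice of k and the iteration count below are illustrative assumptions, not a real dataset or a library implementation):&lt;/p>

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-Means sketch: assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # nearest centroid by squared Euclidean distance
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centroids[c])))
            clusters[i].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # keep the old centroid if a cluster goes empty
                centroids[i] = tuple(sum(xs) / len(cl) for xs in zip(*cl))
    return centroids, clusters

# Two well-separated blobs: one centroid should settle near each.
pts = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
       (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centroids, clusters = kmeans(pts, k=2)
```

&lt;p>In practice a library implementation (with smarter initialisation and a convergence test) would be used; the sketch only shows the two alternating steps.&lt;/p>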
&lt;hr>
&lt;h2 id="association">
 Association
 
 &lt;a class="anchor" href="#association">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Identifies &lt;strong>relationships or correlations&lt;/strong> between variables in a dataset.&lt;/li>
&lt;li>Commonly used in &lt;strong>market basket analysis&lt;/strong> (e.g. &amp;ldquo;Customers who bought X also bought Y&amp;rdquo;).&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Apriori Algorithm&lt;/strong> – Finds frequent itemsets and generates association rules.&lt;/li>
&lt;li>&lt;strong>Eclat Algorithm&lt;/strong> – Similar to Apriori but uses set intersections for faster computation.&lt;/li>
&lt;/ul>
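&lt;p>The core Apriori idea – keep only frequent single items, then count itemsets built from them – can be sketched in a few lines (the baskets and support threshold are hypothetical, and this covers only the first two levels, not the full algorithm):&lt;/p>

```python
from itertools import combinations
from collections import Counter

# Hypothetical market-basket transactions.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "bread"},
    {"beer", "crisps"},
    {"bread", "butter", "jam"},
]

min_support = 3  # an itemset is "frequent" if it appears in 3+ baskets

# Level 1: count single items and keep the frequent ones.
item_counts = Counter(item for b in baskets for item in b)
frequent_items = {i for i, n in item_counts.items() if n >= min_support}

# Level 2: count only pairs built from frequent items (the pruning step).
pair_counts = Counter()
for b in baskets:
    for pair in combinations(sorted(b.intersection(frequent_items)), 2):
        pair_counts[pair] += 1

frequent_pairs = {p: n for p, n in pair_counts.items() if n >= min_support}
```

&lt;p>A rule such as “customers who bought bread also bought butter” would then be derived from the surviving frequent pairs.&lt;/p>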
&lt;hr>
&lt;h2 id="dimensionality-reduction">
 Dimensionality Reduction
 
 &lt;a class="anchor" href="#dimensionality-reduction">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Reduces the &lt;strong>number of input variables&lt;/strong> to simplify data.&lt;/li>
&lt;li>Helps remove noise and redundancy.&lt;/li>
&lt;li>Commonly used in &lt;strong>data pre-processing&lt;/strong> and &lt;strong>visualisation&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques-1">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques-1">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Principal Component Analysis (PCA)&lt;/strong> – Projects data onto fewer dimensions while keeping most variance.&lt;/li>
&lt;li>&lt;strong>Linear Discriminant Analysis (LDA)&lt;/strong> – Focuses on class separation.&lt;/li>
&lt;li>&lt;strong>t-SNE (t-Distributed Stochastic Neighbour Embedding)&lt;/strong> – Used for visualising high-dimensional data.&lt;/li>
&lt;li>&lt;strong>Autoencoders&lt;/strong> – Neural networks that compress and reconstruct data.&lt;/li>
&lt;/ul>
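&lt;p>A minimal PCA sketch via the SVD, on synthetic data whose variance lies mostly along one direction (all numbers below are assumptions for illustration):&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2-D data stretched along one direction.
t = rng.normal(size=200)
X = np.column_stack([t, 0.5 * t + 0.1 * rng.normal(size=200)])

Xc = X - X.mean(axis=0)             # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)     # variance ratio per component
Z = Xc @ Vt[:1].T                   # project onto the first component
```

&lt;p>Here the first component should capture well over 90% of the variance, so the 2-D data is effectively 1-D – which is exactly the redundancy dimensionality reduction removes.&lt;/p>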
&lt;hr>


&lt;pre class="mermaid">
mindmap
 root(Unsupervised Learning)
 Clustering
 K Means
 Hierarchical Clustering
 DBSCAN
 Dimensionality Reduction
 PCA
 t SNE
 Autoencoders
 Probabilistic Models
 Gaussian Mixture Model
 Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Semi-Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</guid><description>&lt;h1 id="semi-supervised-learning">
 Semi-Supervised Learning
 
 &lt;a class="anchor" href="#semi-supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A combination of &lt;strong>labelled&lt;/strong> and &lt;strong>unlabelled data&lt;/strong>.&lt;/li>
&lt;li>Useful when labelling large datasets is &lt;strong>expensive or time-consuming&lt;/strong>.&lt;/li>
&lt;li>Works well with &lt;strong>high-volume datasets&lt;/strong> (e.g. millions of images).&lt;/li>
&lt;li>Only a &lt;strong>small fraction of data&lt;/strong> is labelled (e.g. a few thousand).&lt;/li>
&lt;li>The algorithm learns from both labelled examples and structure in unlabelled data.&lt;/li>
&lt;li>&lt;strong>Ideal for medical imaging&lt;/strong> where labelled data is limited.&lt;/li>
&lt;li>For example, a &lt;strong>radiologist&lt;/strong> can label a small set of medical scans,&lt;br>
and the model uses that to learn from thousands of unlabelled scans.&lt;/li>
&lt;li>Helps improve &lt;strong>accuracy and generalisation&lt;/strong> with minimal manual effort.&lt;/li>
&lt;/ul>
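&lt;p>One common semi-supervised strategy is &lt;strong>self-training&lt;/strong> (pseudo-labelling): fit a simple model on the labelled points, label the most confident unlabelled point, and refit. A toy 1-D sketch with a nearest-centroid classifier (all data values are hypothetical):&lt;/p>

```python
# A few labelled points and a larger pool of unlabelled ones.
labelled = [(1.0, "A"), (1.2, "A"), (8.0, "B"), (8.3, "B")]
unlabelled = [0.9, 1.4, 2.0, 7.5, 8.1, 9.0]

def centroids(data):
    """Mean position of each class among the labelled points."""
    out = {}
    for cls in {c for _, c in data}:
        vals = [x for x, c in data if c == cls]
        out[cls] = sum(vals) / len(vals)
    return out

for _ in range(3):                  # a few self-training rounds
    cents = centroids(labelled)
    if not unlabelled:
        break
    # pseudo-label the unlabelled point closest to any class centroid
    x = min(unlabelled, key=lambda u: min(abs(u - m) for m in cents.values()))
    cls = min(cents, key=lambda c: abs(x - cents[c]))
    labelled.append((x, cls))
    unlabelled.remove(x)
```

&lt;p>Each round grows the labelled set with the most confident guess, which is the same principle a radiologist-seeded model uses at scale.&lt;/p>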
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Reinforcement Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-reinforcement/</guid><description>&lt;h1 id="reinforcement-learning-rl">
 Reinforcement Learning (RL)
 
 &lt;a class="anchor" href="#reinforcement-learning-rl">#&lt;/a>
 
&lt;/h1>
&lt;p>RL is learning by &lt;strong>trial and error&lt;/strong>.&lt;/p>
&lt;p>Reinforcement Learning (RL) is a type of machine learning where an &lt;strong>autonomous agent learns to make decisions by interacting with an environment&lt;/strong>.&lt;/p>
&lt;p>Instead of being told the correct answer, the agent:&lt;/p>
&lt;ul>
&lt;li>takes actions&lt;/li>
&lt;li>observes outcomes&lt;/li>
&lt;li>receives rewards or penalties&lt;/li>
&lt;li>gradually learns a strategy that maximises long-term reward&lt;/li>
&lt;/ul>
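&lt;p>The act–observe–reward loop above can be sketched with an epsilon-greedy agent on a two-armed bandit, the simplest RL setting (the payout probabilities and exploration rate are hypothetical, and the agent never sees them):&lt;/p>

```python
import random

rng = random.Random(42)
true_p = [0.2, 0.8]                 # hidden payout probability of each arm
counts = [0, 0]                     # pulls per arm
values = [0.0, 0.0]                 # running estimate of each arm's reward

for step in range(2000):
    if rng.random() > 0.9:          # explore 10% of the time
        arm = rng.randrange(2)
    else:                           # otherwise exploit the best estimate
        arm = max(range(2), key=lambda a: values[a])
    # Bernoulli reward: pays 1 with probability true_p[arm]
    reward = 1.0 if rng.random() > 1 - true_p[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
```

&lt;p>By trial and error the agent's estimates converge towards the true payout rates, and it ends up pulling the better arm most of the time – maximising long-term reward without ever being told the answer.&lt;/p>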

&lt;blockquote class='book-hint '>
 &lt;p>&lt;strong>Reinforcement Learning teaches an agent how to act, not what to predict.&lt;/strong>&lt;/p></description></item><item><title>AI Foundation</title><link>https://arshadhs.github.io/docs/ai/foundation/</link><pubDate>Mon, 26 Jan 2026 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/</guid><description>&lt;h1 id="ai">
 AI
 
 &lt;a class="anchor" href="#ai">#&lt;/a>
 
&lt;/h1>
&lt;p>A selection of notes that didn&amp;rsquo;t fit elsewhere or are still being worked on!&lt;/p>
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stages/">AI Stages: ANI, AGI, ASI&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-stack/">AI Stack&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/">AI Pipeline&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/foundation/ai-notes/">AI Learning Resources&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>
&lt;a href="https://arshadhs.github.io/">Home&lt;/a></description></item><item><title>AI Stages: ANI, AGI, ASI</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-stages/</link><pubDate>Thu, 04 Jul 2024 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-stages/</guid><description>&lt;h1 id="ai-development-stages-ani--agi--asi">
 AI Development Stages: ANI → AGI → ASI
 
 &lt;a class="anchor" href="#ai-development-stages-ani--agi--asi">#&lt;/a>
 
&lt;/h1>
&lt;p>Artificial Intelligence is often described in &lt;strong>three stages&lt;/strong>, based on capability and scope:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>ANI:&lt;/strong> Task-specific intelligence (today’s AI)&lt;/li>
&lt;li>&lt;strong>AGI:&lt;/strong> Human-level general intelligence (future goal)&lt;/li>
&lt;li>&lt;strong>ASI:&lt;/strong> Beyond human intelligence (theoretical)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;img src="https://arshadhs.github.io/images/ai/ai_stages.png" alt="AI Stages" />&lt;/p>
&lt;hr>
&lt;h2 id="ani--artificial-narrow-intelligence">
 ANI — Artificial Narrow Intelligence
 
 &lt;a class="anchor" href="#ani--artificial-narrow-intelligence">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Also called &lt;strong>Weak AI&lt;/strong>&lt;/li>
&lt;li>Designed to perform &lt;strong>one specific task&lt;/strong>&lt;/li>
&lt;li>Operates within a &lt;strong>predefined environment&lt;/strong>&lt;/li>
&lt;li>Cannot generalise beyond its training&lt;/li>
&lt;li>&lt;strong>Most AI systems today are ANI&lt;/strong>&lt;/li>
&lt;/ul>
&lt;p>&lt;strong>examples&lt;/strong>&lt;/p></description></item><item><title>Basic Statistics</title><link>https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/01_basic_statistics/</guid><description>&lt;h1 id="basic-statistics">
 Basic Statistics
 
 &lt;a class="anchor" href="#basic-statistics">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Statistics&lt;/strong>: describes data (what you &lt;em>see&lt;/em>).&lt;br>
&lt;strong>Probability&lt;/strong>: models uncertainty (what you &lt;em>don’t know&lt;/em> yet).&lt;/p>
&lt;ul>
&lt;li>Summarise a dataset using central tendency and variability&lt;/li>
&lt;li>Explain core probability ideas using simple examples&lt;/li>
&lt;li>Apply the axioms of probability&lt;/li>
&lt;li>Distinguish mutually exclusive vs independent events&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
 A[Dataset] --&amp;gt; B[Central Tendency]
 A --&amp;gt; C[Variability]
 B --&amp;gt; B1[Mean]
 B --&amp;gt; B2[Median]
 B --&amp;gt; B3[Mode]
 C --&amp;gt; C1[Range]
 C --&amp;gt; C2[Variance]
 C --&amp;gt; C3[Standard Deviation]
 C --&amp;gt; C4[IQR]
&lt;/pre>
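&lt;p>Every measure in the diagram above is available in Python's standard &lt;code>statistics&lt;/code> module; a quick sketch on a made-up set of scores:&lt;/p>

```python
import statistics as st

scores = [4, 8, 6, 5, 3, 8, 9, 5, 7, 5]   # hypothetical test scores

# Central tendency
mean = st.mean(scores)
median = st.median(scores)
mode = st.mode(scores)

# Variability
data_range = max(scores) - min(scores)
variance = st.pvariance(scores)           # population variance
std_dev = st.pstdev(scores)
q1, q2, q3 = st.quantiles(scores, n=4)    # quartiles; IQR = q3 - q1
```

&lt;p>For these scores the mean is 6.0 while the median is 5.5 – a first hint that the two can disagree whenever the data are not symmetric.&lt;/p>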

&lt;hr>
&lt;h2 id="measures-of-central-tendency">
 Measures of Central Tendency
 
 &lt;a class="anchor" href="#measures-of-central-tendency">#&lt;/a>
 
&lt;/h2>
&lt;p>Central tendency tells you where the “middle” of the data is.
It summarises a set of scores with a &lt;strong>single number&lt;/strong> that represents the &lt;strong>performance&lt;/strong> of the group.&lt;/p></description></item><item><title>Basic Probability</title><link>https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/01_basic_probability/</guid><description>&lt;h1 id="basic-probability">
 Basic Probability
 
 &lt;a class="anchor" href="#basic-probability">#&lt;/a>
 
&lt;/h1>
&lt;p>Probability models uncertainty:
what you &lt;em>don’t know&lt;/em> yet, but want to reason about.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Probability is a number between &lt;strong>0 and 1&lt;/strong> that measures how likely an event is.
The whole topic is about defining &lt;strong>events&lt;/strong> clearly and applying a few core rules consistently.&lt;/p>
&lt;/blockquote>
&lt;p>Probability quantifies uncertainty: a number between 0 and 1.&lt;/p>
&lt;ul>
&lt;li>0 means: impossible&lt;/li>
&lt;li>1 means: certain&lt;/li>
&lt;/ul>
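&lt;p>A quick simulation makes the 0-to-1 scale concrete: estimate the chance of rolling a six and compare it with the theoretical 1/6 (the seed and trial count are arbitrary choices):&lt;/p>

```python
import random

rng = random.Random(1)
trials = 60_000
# Count how often a fair die shows a six.
sixes = sum(1 for _ in range(trials) if rng.randint(1, 6) == 6)
estimate = sixes / trials   # empirical probability, near 1/6
```

&lt;p>The empirical frequency always lands between 0 and 1, and with enough trials it hovers close to the true probability.&lt;/p>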
&lt;hr>
&lt;h2 id="terminology">
 Terminology
 
 &lt;a class="anchor" href="#terminology">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="random-experiment">
 Random experiment
 
 &lt;a class="anchor" href="#random-experiment">#&lt;/a>
 
&lt;/h3>
&lt;p>A random experiment is an action whose outcome is not known in advance.&lt;/p></description></item><item><title>Neural Networks</title><link>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</guid><description>&lt;h1 id="neural-networks">
 Neural Networks
 
 &lt;a class="anchor" href="#neural-networks">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A &lt;strong>network of artificial neurons&lt;/strong> inspired by how neurons function in the &lt;strong>human brain&lt;/strong>.&lt;/li>
&lt;li>At its core - a &lt;strong>mathematical model&lt;/strong> designed to process and learn from data.&lt;/li>
&lt;li>Neural networks form the &lt;strong>foundation of &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">Deep Learning&lt;/a>&lt;/strong> (involves training large and complex networks on vast amounts of data).&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 subgraph subGraph0[&amp;#34;Input Layer&amp;#34;]
 I1((&amp;#34;Input 1&amp;#34;))
 I2((&amp;#34;Input 2&amp;#34;))
 I3((&amp;#34;Input 3&amp;#34;))
 end
 subgraph subGraph1[&amp;#34;Hidden Layer&amp;#34;]
 H1((&amp;#34;Hidden 1&amp;#34;))
 H2((&amp;#34;Hidden 2&amp;#34;))
 H3((&amp;#34;Hidden 3&amp;#34;))
 end
 subgraph subGraph2[&amp;#34;Output Layer&amp;#34;]
 O((&amp;#34;Output&amp;#34;))
 end
 I1 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I2 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I3 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 H1 --&amp;gt; O
 H2 --&amp;gt; O
 H3 --&amp;gt; O

 style I1 fill:#C8E6C9
 style I2 fill:#C8E6C9
 style I3 fill:#C8E6C9
 style H1 stroke:#2962FF,fill:#BBDEFB
 style H2 fill:#BBDEFB
 style H3 fill:#BBDEFB
 style O fill:#FFCDD2
 style subGraph0 stroke:none,fill:transparent
 style subGraph1 stroke:none,fill:transparent
 style subGraph2 stroke:none,fill:transparent
&lt;/pre>
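&lt;p>The 3-3-1 network in the diagram above can be traced numerically with a single forward pass; the input values, weights and biases below are arbitrary illustrative numbers, not trained parameters:&lt;/p>

```python
import math

def sigmoid(z):
    """Squash a weighted sum into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

x = [0.5, -1.0, 2.0]                 # Input layer (3 values)

W_hidden = [[0.2, -0.4, 0.1],        # one weight row per hidden neuron
            [0.7, 0.3, -0.2],
            [-0.5, 0.6, 0.9]]
b_hidden = [0.1, -0.1, 0.0]

w_out = [0.3, -0.6, 0.8]             # hidden-to-output weights
b_out = 0.05

# Each hidden neuron: weighted sum of all inputs, plus bias, then sigmoid.
hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
          for row, b in zip(W_hidden, b_hidden)]
# Output neuron: weighted sum of all hidden activations.
output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
```

&lt;p>Training would then consist of adjusting the weight matrices so that &lt;code>output&lt;/code> moves towards the desired target.&lt;/p>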

&lt;hr>
&lt;h3 id="structure-of-a-neural-network">
 Structure of a Neural Network
 
 &lt;a class="anchor" href="#structure-of-a-neural-network">#&lt;/a>
 
&lt;/h3>
&lt;p>A typical neural network has &lt;strong>three main layers&lt;/strong>:&lt;/p></description></item><item><title>Conditional Probability &amp; Bayes’ Theorem</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/</guid><description>&lt;h1 id="conditional-probability--bayes-theorem">
 Conditional Probability &amp;amp; Bayes’ Theorem
 
 &lt;a class="anchor" href="#conditional-probability--bayes-theorem">#&lt;/a>
 
&lt;/h1>
&lt;p>Probability often changes when we &lt;strong>learn new information&lt;/strong>.&lt;/p>
&lt;p>Conditional probability and Bayes’ theorem give a structured way to &lt;strong>update beliefs&lt;/strong> using evidence.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Conditional probability updates probabilities after observing an event.&lt;/p>
&lt;p>Bayes’ theorem lets you estimate a hidden cause from observed evidence.&lt;/p>
&lt;p>Naïve Bayes turns Bayes’ theorem into a practical classifier by assuming conditional independence of features given the class.&lt;/p>
&lt;/blockquote>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD

A[Conditional&amp;lt;br/&amp;gt;probability] --&amp;gt;|foundation| B[Bayes&amp;lt;br/&amp;gt;theorem]
D[Independent&amp;lt;br/&amp;gt;events] --&amp;gt;|implies| C[Independence]
C --&amp;gt;|simplifies| A

E[Prior] --&amp;gt;|with likelihood| B
F[Likelihood] --&amp;gt;|updates| H[Posterior]
G[Evidence] --&amp;gt;|normalises| B
B --&amp;gt;|yields| H

I[Naïve&amp;lt;br/&amp;gt;Bayes] --&amp;gt;|uses| B
J[Naïve&amp;lt;br/&amp;gt;assumption] --&amp;gt;|assumes| C
K[Features] --&amp;gt;|given class| J
L[Class] --&amp;gt;|conditions| J
I --&amp;gt;|predicts| M[Classification]
M --&amp;gt;|selects| L

style A fill:#90CAF9,stroke:#1E88E5,color:#000
style B fill:#90CAF9,stroke:#1E88E5,color:#000
style C fill:#90CAF9,stroke:#1E88E5,color:#000

style D fill:#CE93D8,stroke:#8E24AA,color:#000
style E fill:#CE93D8,stroke:#8E24AA,color:#000
style F fill:#CE93D8,stroke:#8E24AA,color:#000
style G fill:#CE93D8,stroke:#8E24AA,color:#000
style J fill:#CE93D8,stroke:#8E24AA,color:#000
style K fill:#CE93D8,stroke:#8E24AA,color:#000
style L fill:#CE93D8,stroke:#8E24AA,color:#000

style H fill:#C8E6C9,stroke:#2E7D32,color:#000
style I fill:#C8E6C9,stroke:#2E7D32,color:#000
style M fill:#C8E6C9,stroke:#2E7D32,color:#000

&lt;/pre>

&lt;hr>
&lt;h2 id="quick-summary">
 Quick summary
 
 &lt;a class="anchor" href="#quick-summary">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Conditional probability:
updates probability after an event is known.&lt;/li>
&lt;li>Multiplication rule:
computes joint probability from conditional parts.&lt;/li>
&lt;li>Independence:
tested using 
&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>

 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>

&lt;span>
 \( P(A\cap B)=P(A)P(B) \)
 &lt;/span>

.&lt;/li>
&lt;li>Total probability:
breaks a probability into weighted cases.&lt;/li>
&lt;li>Bayes’ theorem:
reverses conditioning to infer causes from evidence.&lt;/li>
&lt;/ul>
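&lt;p>The independence test from the summary can be verified exactly by enumerating two fair dice (the choice of events A and B here is just an example):&lt;/p>

```python
from itertools import product

# All 36 equally likely outcomes of two dice.
outcomes = list(product(range(1, 7), repeat=2))
N = len(outcomes)

# A = "first die is even", B = "second die is 5 or 6".
A = [o for o in outcomes if o[0] % 2 == 0]
B = [o for o in outcomes if o[1] >= 5]
AB = [o for o in outcomes if o[0] % 2 == 0 and o[1] >= 5]

p_a, p_b, p_ab = len(A) / N, len(B) / N, len(AB) / N
# Independence holds: p_ab equals p_a * p_b (1/6 = 1/2 * 1/3).
```
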
&lt;hr>
&lt;h2 id="whats-next">
 What’s next
 
 &lt;a class="anchor" href="#whats-next">#&lt;/a>
 
&lt;/h2>
&lt;p>Probability Distributions&lt;br>
Move from events to random variables and distributions.&lt;/p></description></item><item><title>Machine Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/</link><pubDate>Tue, 06 Aug 2024 23:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/</guid><description>&lt;h1 id="machine-learning">
 Machine Learning
 
 &lt;a class="anchor" href="#machine-learning">#&lt;/a>
 
&lt;/h1>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
stateDiagram-v2

 %% ===== CLASS DEFINITIONS (Math-based colours) =====
 classDef algebra fill:#cfe8ff,stroke:#1e3a8a,stroke-width:1px
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef logic fill:#ede9fe,stroke:#5b21b6,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#aaaaaa,stroke:#374151,stroke-width:3px

 %% ===== ROOT =====
 ML: Machine Learning

 %% ===== SUPERVISED =====
 ML --&amp;gt; SL:::category
 SL: Supervised Learning

 SL --&amp;gt; Regression
 Regression --&amp;gt; LR:::algebra
 LR: Linear Regression

 LR --&amp;gt; NN:::algebra
 NN: Neural Network

 NN --&amp;gt; DT:::logic
 DT: Decision Tree

 SL --&amp;gt; Classification
 Classification --&amp;gt; NB:::probability
 NB: Naive Bayes

 NB --&amp;gt; KNN:::geometry
 KNN: k-Nearest Neighbours

 KNN --&amp;gt; SVM:::algebra
 SVM: Support Vector Machine
 
 %% ===== UNSUPERVISED =====
 ML --&amp;gt; USL:::category
 USL: Unsupervised Learning

 USL --&amp;gt; Clustering
 Clustering --&amp;gt; KM:::geometry
 KM: K-Means

 KM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 GMM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model

 %% ===== REINFORCEMENT =====
 ML --&amp;gt; RL:::category
 RL: Reinforcement Learning

 RL --&amp;gt; DM:::logic
 DM: Decision Making
&lt;/pre>

&lt;hr>
&lt;details >&lt;summary>Mathematical Legend&lt;/summary>
 &lt;div class="markdown-inner">
&lt;h3 id="algebra--linear-algebra-blue">
 Algebra / Linear Algebra (Blue)
 
 &lt;a class="anchor" href="#algebra--linear-algebra-blue">#&lt;/a>
 
&lt;/h3>
&lt;p>Used heavily when models rely on:&lt;/p></description></item><item><title>AI Stack</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-stack/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-stack/</guid><description>&lt;h1 id="ai-stack">
 AI Stack
 
 &lt;a class="anchor" href="#ai-stack">#&lt;/a>
 
&lt;/h1>
&lt;p>The &lt;strong>AI Stack&lt;/strong> describes the &lt;strong>layers required to build an end-to-end AI system&lt;/strong>, from infrastructure at the bottom to user-facing applications at the top.&lt;/p>
&lt;p>Different organisations represent the AI stack differently; this is a simplified conceptual view for learning.&lt;/p>
&lt;p>Each layer depends on the one below it.&lt;/p>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
graph TB

 subgraph APP[&amp;#34;Applications&amp;#34;]
 A[User Interfaces &amp;amp; Integrations]
 end

 subgraph ORCH[&amp;#34;Orchestration&amp;#34;]
 O[Workflows • Agents • Control Logic]
 end

 subgraph DATA[&amp;#34;Data&amp;#34;]
 D[Data Sources • Pipelines • Vector DBs]
 end

 subgraph MODEL[&amp;#34;Models&amp;#34;]
 M[ML • DL • Foundation Models • LLMs]
 end

 subgraph INFRA[&amp;#34;Infrastructure&amp;#34;]
 I[Cloud • On-prem • GPUs • Storage]
 end

 %% Styling
 style APP fill:#FFCCBC
 style ORCH fill:#90CAF9
 style DATA fill:#BBDEFB
 style MODEL fill:#C8E6C9
 style INFRA fill:#E1F5FE

 style A fill:#FFE0B2
 style O fill:#B3E5FC
 style D fill:#E3F2FD
 style M fill:#DCEDC8
 style I fill:#E1F5FE
&lt;/pre>

&lt;hr>
&lt;h2 id="1-infrastructure">
 1. Infrastructure
 
 &lt;a class="anchor" href="#1-infrastructure">#&lt;/a>
 
&lt;/h2>
&lt;p>The foundation that provides &lt;strong>compute and storage&lt;/strong>.&lt;/p></description></item><item><title>Artificial Neuron and Perceptron</title><link>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</guid><description>&lt;h1 id="artificial-neuron-and-perceptron">
 Artificial Neuron and Perceptron
 
 &lt;a class="anchor" href="#artificial-neuron-and-perceptron">#&lt;/a>
 
&lt;/h1>
&lt;blockquote class="book-hint info">
&lt;p>Knowledge in neural networks is stored in &lt;strong>connection weights&lt;/strong>, and learning means &lt;strong>modifying those weights&lt;/strong>.&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h2 id="biological-neuron">
 Biological Neuron
 
 &lt;a class="anchor" href="#biological-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.&lt;/p>
&lt;p>Core components:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Dendrites&lt;/strong>: receive signals from other neurons&lt;/li>
&lt;li>&lt;strong>Cell body (soma)&lt;/strong>: processes incoming signals&lt;/li>
&lt;li>&lt;strong>Axon&lt;/strong>: transmits the output signal&lt;/li>
&lt;li>&lt;strong>Synapses&lt;/strong>: connection points between neurons&lt;/li>
&lt;/ul>
&lt;p>Biological intuition:&lt;/p>
&lt;ul>
&lt;li>many inputs arrive at one neuron&lt;/li>
&lt;li>one neuron can connect out to many neurons&lt;/li>
&lt;li>massive parallelism enables fast perception and recognition&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="artificial-neuron">
 Artificial Neuron
 
 &lt;a class="anchor" href="#artificial-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>An artificial neuron is a simplified computational model inspired by biological neurons.&lt;/p></description></item><item><title>ML Workflow</title><link>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</guid><description>&lt;h1 id="machine-learning-workflow">
 Machine Learning Workflow
 
 &lt;a class="anchor" href="#machine-learning-workflow">#&lt;/a>
 
&lt;/h1>
&lt;p>Data is the foundation of any machine learning system.
Quality of data matters more than model complexity.&lt;/p>
&lt;h3 id="role-of-data">
 Role of Data
 
 &lt;a class="anchor" href="#role-of-data">#&lt;/a>
 
&lt;/h3>
&lt;p>Data determines:&lt;/p>
&lt;ul>
&lt;li>What patterns the model can learn&lt;/li>
&lt;li>How well it generalises&lt;/li>
&lt;li>Whether bias or noise is introduced&lt;/li>
&lt;/ul>
&lt;p>Bad data → bad model (even with perfect algorithms).&lt;/p>
&lt;hr>
&lt;h3 id="data-preprocessing-wrangling">
 Data Preprocessing, wrangling
 
 &lt;a class="anchor" href="#data-preprocessing-wrangling">#&lt;/a>
 
&lt;/h3>
&lt;p>Raw data is rarely ready for training as-is.&lt;/p>
&lt;p>&lt;strong>Data Issues&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Noise
&lt;ul>
&lt;li>For &lt;strong>objects&lt;/strong>, noise is an &lt;strong>extraneous object&lt;/strong>&lt;/li>
&lt;li>For &lt;strong>attributes&lt;/strong>, noise refers to &lt;strong>modification of original values&lt;/strong>&lt;/li>
&lt;li>Handle: apply a log or z-score transformation to rescale values around the mean&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Outliers
&lt;ul>
&lt;li>Data objects with characteristics that are considerably different from most of the other data objects in the data set&lt;/li>
&lt;li>Handle: Use &lt;strong>IQR&lt;/strong> method&lt;/li>
&lt;li>Compute the lower and upper bounds and &lt;strong>replace each outlier with the nearer bound&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Missing Values
&lt;ul>
&lt;li>Eliminate data objects or variables&lt;/li>
&lt;li>Handle: Estimate missing values
&lt;ul>
&lt;li>&lt;strong>Mean, Median or Mode&lt;/strong>&lt;/li>
&lt;li>Prefer the &lt;strong>Median&lt;/strong> if the data contains &lt;strong>outliers&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Ignore the missing value during analysis&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Duplicate Data
&lt;ul>
&lt;li>Major issue when merging data from heterogeneous sources&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Inconsistent Codes
&lt;ul>
&lt;li>Handle: find all unique values and map inconsistent codes to a single consistent value&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
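&lt;p>Two of the fixes above – IQR-based outlier capping and median imputation – can be sketched together in plain Python (the raw values are hypothetical, with one obvious outlier and one missing value):&lt;/p>

```python
import statistics as st

raw = [12, 14, 13, 15, 14, 98, 13, None, 15]   # 98 is an outlier

present = [x for x in raw if x is not None]
q1, _, q3 = st.quantiles(present, n=4)          # quartiles
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr      # IQR bounds

def cap(x):
    """Clamp a value into [low, high], replacing outliers with a bound."""
    return max(low, min(high, x))

capped = [cap(x) if x is not None else None for x in raw]
# Fill the missing value with the median of the cleaned data.
median = st.median(x for x in capped if x is not None)
cleaned = [median if x is None else x for x in capped]
```

&lt;p>After cleaning, the outlier sits at the upper bound rather than distorting the mean, and the gap is filled with a robust central value.&lt;/p>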
&lt;p>&lt;strong>Data Preprocessing techniques&lt;/strong>&lt;/p></description></item><item><title>Conditional Probability</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/021_conditional_prob/</guid><description>&lt;h1 id="conditional-probability">
 Conditional Probability
 
 &lt;a class="anchor" href="#conditional-probability">#&lt;/a>
 
&lt;/h1>
&lt;p>Conditional probability updates the probability of an event when new information is available.&lt;/p>
&lt;p>It shows up whenever a question says:&lt;/p>
&lt;ul>
&lt;li>“given that…”&lt;/li>
&lt;li>“among those who…”&lt;/li>
&lt;li>“out of the items that…”&lt;/li>
&lt;li>“if it does not fail immediately…”&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Conditional probability is always:&lt;/p>
&lt;p>joint probability ÷ probability of the condition.&lt;/p>
&lt;p>The condition must not be an impossible event.&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h2 id="prior-vs-posterior">
 Prior vs posterior
 
 &lt;a class="anchor" href="#prior-vs-posterior">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>
&lt;p>Prior probability:
probability with no condition (before new information)&lt;/p></description></item><item><title>Bayes’ Theorem</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/022_bayes_theorem/</guid><description>&lt;h1 id="bayes-theorem">
 Bayes’ Theorem
 
 &lt;a class="anchor" href="#bayes-theorem">#&lt;/a>
 
&lt;/h1>
&lt;h3 id="21-total-probability-needed-for-bayes">
 2.1 Total probability (needed for Bayes)
 
 &lt;a class="anchor" href="#21-total-probability-needed-for-bayes">#&lt;/a>
 
&lt;/h3>
&lt;p>Often we split the world into cases 
&lt;span>
 \( E_1,E_2,\dots,E_k \)
 &lt;/span>

 that:&lt;/p>
&lt;ul>
&lt;li>are mutually exclusive&lt;/li>
&lt;li>cover the whole sample space&lt;/li>
&lt;/ul>
&lt;p>Then for any event 
&lt;span>
 \( A \)
 &lt;/span>

:&lt;/p>
&lt;span style="color: red;">
 &lt;span>
 \[ 
P(A)=\sum_{i=1}^{k} P(A\mid E_i)\,P(E_i)
 \]
 &lt;/span>
&lt;/span>
&lt;p>Tree intuition:&lt;/p>


&lt;pre class="mermaid">
flowchart TD
 S[Start] --&amp;gt; E1[Case E1]
 S --&amp;gt; E2[Case E2]
 S --&amp;gt; E3[Case E3]
 E1 --&amp;gt; A1[&amp;#34;A happens&amp;#34;]
 E2 --&amp;gt; A2[&amp;#34;A happens&amp;#34;]
 E3 --&amp;gt; A3[&amp;#34;A happens&amp;#34;]
&lt;/pre>
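&lt;p>A numeric check of the total probability formula above, with hypothetical case probabilities and conditionals (three mutually exclusive cases that cover the sample space):&lt;/p>

```python
# P(E_i): mutually exclusive cases that sum to 1.
p_E = [0.5, 0.3, 0.2]
# P(A | E_i): how likely A is within each case.
p_A_given_E = [0.9, 0.5, 0.1]

# P(A) = sum over i of P(A | E_i) * P(E_i)
p_A = sum(pa * pe for pa, pe in zip(p_A_given_E, p_E))
# 0.9*0.5 + 0.5*0.3 + 0.1*0.2 = 0.62
```
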

&lt;hr>
&lt;h3 id="22-bayes-theorem-two-event-form">
 2.2 Bayes’ theorem (two-event form)
 
 &lt;a class="anchor" href="#22-bayes-theorem-two-event-form">#&lt;/a>
 
&lt;/h3>
&lt;p>Bayes&amp;rsquo; Theorem is a mathematical formula used to determine the &lt;strong>conditional probability of an event based on prior knowledge and new evidence&lt;/strong>.&lt;/p></description></item><item><title>Naïve Bayes</title><link>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/conditional-probability/023_naive_bayes/</guid><description>&lt;h1 id="naïve-bayes">
 Naïve Bayes
 
 &lt;a class="anchor" href="#na%c3%afve-bayes">#&lt;/a>
 
&lt;/h1>
&lt;p>Naïve Bayes is a &lt;strong>probabilistic classifier&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>A &lt;strong>supervised learning&lt;/strong> problem&lt;/li>
&lt;li>Binary classification – the target variable takes one of two classes&lt;/li>
&lt;li>The hypothesis is the class label you want to assign&lt;/li>
&lt;li>The total probability (prior) of each class (e.g. Yes and No) is computed first&lt;/li>
&lt;li>The posterior is obtained once the evidence in the data is taken into account&lt;/li>
&lt;li>The instance is assigned to the hypothesis (class) with the maximum posterior probability&lt;/li>
&lt;/ul>
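&lt;p>A count-based sketch of the idea on a tiny hypothetical dataset: priors come from class counts, likelihoods from per-feature counts within each class, and the predicted class is the one with the larger posterior score (no smoothing, for clarity):&lt;/p>

```python
from collections import Counter, defaultdict

# Each row: ((outlook, wind), label) – a toy play/don't-play dataset.
rows = [
    (("sunny", "calm"), "Yes"),
    (("sunny", "windy"), "No"),
    (("rainy", "calm"), "Yes"),
    (("rainy", "windy"), "No"),
    (("sunny", "calm"), "Yes"),
    (("rainy", "windy"), "No"),
]

priors = Counter(label for _, label in rows)          # class counts
feature_counts = defaultdict(Counter)                 # (label, i) -> value counts
for feats, label in rows:
    for i, v in enumerate(feats):
        feature_counts[(label, i)][v] += 1

def posterior_score(feats, label):
    # P(label) times the product of P(feature_i | label); the naive
    # assumption lets us multiply per-feature likelihoods.
    score = priors[label] / len(rows)
    for i, v in enumerate(feats):
        score *= feature_counts[(label, i)][v] / priors[label]
    return score

def classify(feats):
    return max(priors, key=lambda label: posterior_score(feats, label))

prediction = classify(("sunny", "calm"))
```

&lt;p>A real implementation would add Laplace smoothing so an unseen feature value does not zero out the whole product.&lt;/p>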
&lt;p>It predicts a class label by computing:&lt;/p></description></item><item><title>Probability Distributions</title><link>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/</guid><description>&lt;h1 id="probability-distributions">
 Probability Distributions
 
 &lt;a class="anchor" href="#probability-distributions">#&lt;/a>
 
&lt;/h1>
&lt;p>Probability distributions are the bridge between:
real-world randomness and mathematical modelling.&lt;/p>
&lt;p>A random experiment produces outcomes.
A random variable turns those outcomes into numbers.
A probability distribution tells you how likely each number (or range of numbers) is.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A distribution is a complete “story” about uncertainty:
what values are possible, how likely they are, and how we summarise them (mean, variance).&lt;/p>
&lt;/blockquote>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
	PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; RV[&amp;#34;Random&amp;lt;br/&amp;gt;variables&amp;#34;]
	PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; DS[&amp;#34;Common&amp;lt;br/&amp;gt;distributions&amp;#34;]

	style PD fill:#90CAF9,stroke:#1E88E5,color:#000
	style RV fill:#90CAF9,stroke:#1E88E5,color:#000
	style DS fill:#90CAF9,stroke:#1E88E5,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="aiml-connection">
 AI/ML Connection
 
 &lt;a class="anchor" href="#aiml-connection">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Many ML models are probabilistic:
they assume data (or errors) follow a distribution.&lt;/li>
&lt;li>Loss functions often come from distribution assumptions:
squared loss aligns with Gaussian noise.&lt;/li>
&lt;li>Naïve Bayes (from the previous module) becomes practical once you can model:

&lt;link rel="stylesheet" href="https://arshadhs.github.io/katex/katex.min.css" />
&lt;script defer src="https://arshadhs.github.io/katex/katex.min.js">&lt;/script>

 &lt;script defer src="https://arshadhs.github.io/katex/auto-render.min.js" onload="renderMathInElement(document.body, {
 &amp;#34;delimiters&amp;#34;: [
 {&amp;#34;left&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$$&amp;#34;, &amp;#34;display&amp;#34;: true},
 {&amp;#34;left&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;$&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\(&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\)&amp;#34;, &amp;#34;display&amp;#34;: false},
 {&amp;#34;left&amp;#34;: &amp;#34;\\[&amp;#34;, &amp;#34;right&amp;#34;: &amp;#34;\\]&amp;#34;, &amp;#34;display&amp;#34;: true}
 ]
});">&lt;/script>

&lt;span>
 \( P(X\mid Y) \)
 &lt;/span>

 using suitable distributions.&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint warning">
&lt;p>In practice:
choosing a distribution is a modelling decision.
It affects:
prediction, uncertainty estimates, and what “rare” or “typical” means in your data.&lt;/p></description></item><item><title>Generative AI</title><link>https://arshadhs.github.io/docs/ai/genai/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/</guid><description>&lt;h1 id="generative-ai">
 Generative AI
 
 &lt;a class="anchor" href="#generative-ai">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Generative Artificial Intelligence (GenAI)&lt;/strong> refers to a class of AI systems that can &lt;strong>generate new content&lt;/strong> such as text, images, audio, video, or code, rather than only making predictions or classifications.&lt;/p>
&lt;p>GenAI systems learn &lt;strong>patterns and representations from large datasets&lt;/strong> and use them to produce &lt;strong>novel outputs&lt;/strong> that resemble the data they were trained on.&lt;/p>
&lt;hr>
&lt;h2 id="how-generative-ai-differs-from-traditional-ai">
 How Generative AI Differs from Traditional AI
 
 &lt;a class="anchor" href="#how-generative-ai-differs-from-traditional-ai">#&lt;/a>
 
&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Traditional AI&lt;/th>
 &lt;th>Generative AI&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>Predicts or classifies&lt;/td>
 &lt;td>Generates new content&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Task-specific models&lt;/td>
 &lt;td>General-purpose models&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Fixed outputs&lt;/td>
 &lt;td>Open-ended outputs&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>Often rule-based&lt;/td>
 &lt;td>Data-driven and probabilistic&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
&lt;hr>
&lt;h2 id="core-idea-of-generative-ai">
 Core Idea of Generative AI
 
 &lt;a class="anchor" href="#core-idea-of-generative-ai">#&lt;/a>
 
&lt;/h2>

&lt;blockquote class='book-hint '>
 &lt;p>&lt;strong>Instead of learning “what label to assign”, Generative AI learns “how data is structured” and then creates new data following that structure.&lt;/strong>&lt;/p></description></item><item><title>AI Pipeline</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/</link><pubDate>Thu, 04 Jul 2024 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-pipeline/</guid><description>&lt;h1 id="ai-pipeline">
 AI Pipeline
 
 &lt;a class="anchor" href="#ai-pipeline">#&lt;/a>
 
&lt;/h1>
&lt;p>The AI pipeline is a continuous process where data is collected, prepared, used to train models, evaluated for performance, and continuously improved after deployment.&lt;/p>
&lt;div class="book-steps ">
&lt;ol>
&lt;li>
&lt;h2 id="collect-data">
 Collect Data
 
 &lt;a class="anchor" href="#collect-data">#&lt;/a>
 
&lt;/h2>
&lt;/li>
&lt;li>
&lt;h2 id="prepare-data">
 Prepare data
 
 &lt;a class="anchor" href="#prepare-data">#&lt;/a>
 
&lt;/h2>
&lt;/li>
&lt;li>
&lt;h2 id="train-model">
 Train Model
 
 &lt;a class="anchor" href="#train-model">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Iterate until model is good enough&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>
&lt;h2 id="deploy-model">
 Deploy Model
 
 &lt;a class="anchor" href="#deploy-model">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Get data back&lt;/li>
&lt;li>Maintain &amp;amp; update model&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ol>
&lt;/div>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
timeline
 title AI Pipeline
 Collect Data : Data Ingestion
 : Data Understanding
 Prepare Data : Cleaning
 : Feature Engineering
 : Sampling
 Train Model : Model Training
 : Validation &amp;amp; Metrics
 Deploy Model : Deployment
 : Monitoring &amp;amp; Retraining
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/foundation/">
 AI Foundation
&lt;/a>&lt;/p></description></item><item><title>Regression(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</guid><description>&lt;h1 id="linear-regression">
 Linear Regression
 
 &lt;a class="anchor" href="#linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Linear Regression is a supervised 
&lt;span style="color: blue;">
 ML
&lt;/span> method used to predict a &lt;strong>numerical&lt;/strong> target by fitting a model that is &lt;strong>linear in its parameters&lt;/strong>.&lt;/p>
&lt;p>In 
&lt;span style="color: blue;">
 ML
&lt;/span>, linear models are a core baseline:
they’re fast, often surprisingly strong, and usually easy to interpret.&lt;/p>
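&lt;p>A minimal sketch of that baseline, fitting a line to synthetic data (the true slope 2 and intercept 1 here are illustrative assumptions):&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data (illustrative): y = 2x + 1 plus Gaussian noise.
x = rng.uniform(0, 10, size=200)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=200)

# Fit a line by least squares; the learned parameters should be
# close to the true slope (2) and intercept (1).
slope, intercept = np.polyfit(x, y, deg=1)
```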
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Linear Regression learns parameters by minimising a squared-error cost.
You can solve it directly (closed form) or iteratively (gradient descent),
and you can extend it using basis functions and regularisation.&lt;/p></description></item><item><title>Random Variables</title><link>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/random-variables/</guid><description>&lt;h1 id="random-variables">
 Random Variables
 
 &lt;a class="anchor" href="#random-variables">#&lt;/a>
 
&lt;/h1>
&lt;p>A random variable is a way to attach numbers to outcomes of a random experiment.&lt;/p>
&lt;p>It lets us move from:
“what happened?”
to:
“what number should we analyse?”&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A random variable is a &lt;em>function&lt;/em> from the sample space to real numbers.
Once you define the random variable clearly, the rest (pmf/pdf/cdf, mean, variance) becomes systematic.&lt;/p>
&lt;/blockquote>
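&lt;p>A small NumPy sketch of the "random variable as a function" idea, using three coin tosses as an illustrative experiment:&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(2)

# Random experiment: toss a fair coin 3 times (0 = tails, 1 = heads),
# repeated 50,000 times.
tosses = rng.integers(0, 2, size=(50_000, 3))

# Random variable X = number of heads: a *function* of each outcome.
X = tosses.sum(axis=1)

# Empirical pmf over {0, 1, 2, 3}; theory says 1/8, 3/8, 3/8, 1/8.
pmf = np.bincount(X, minlength=4) / X.size
```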
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; RV[&amp;#34;Random&amp;lt;br/&amp;gt;variables&amp;#34;]

RV --&amp;gt; T[&amp;#34;Types&amp;#34;]
T --&amp;gt; RV1[&amp;#34;Discrete&amp;lt;br/&amp;gt;RVs&amp;#34;]
T --&amp;gt; RV2[&amp;#34;Continuous&amp;lt;br/&amp;gt;RVs&amp;#34;]

RV --&amp;gt; F[&amp;#34;PMF / PDF / CDF&amp;#34;]
RV --&amp;gt; S[&amp;#34;Mean / Variance&amp;lt;br/&amp;gt;Covariance&amp;#34;]
RV --&amp;gt; J[&amp;#34;Joint &amp;amp; Marginal&amp;lt;br/&amp;gt;distributions&amp;#34;]
RV --&amp;gt; X[&amp;#34;Transformations&amp;#34;]

style PD fill:#90CAF9,stroke:#1E88E5,color:#000
style RV fill:#90CAF9,stroke:#1E88E5,color:#000

style T fill:#CE93D8,stroke:#8E24AA,color:#000
style F fill:#CE93D8,stroke:#8E24AA,color:#000
style S fill:#CE93D8,stroke:#8E24AA,color:#000
style J fill:#CE93D8,stroke:#8E24AA,color:#000
style X fill:#CE93D8,stroke:#8E24AA,color:#000
style RV1 fill:#CE93D8,stroke:#8E24AA,color:#000
style RV2 fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="1-definition">
 1) Definition
 
 &lt;a class="anchor" href="#1-definition">#&lt;/a>
 
&lt;/h2>
&lt;p>Random variable:
a rule that assigns a number to each outcome.&lt;/p></description></item><item><title>Common Probability Distributions</title><link>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/</link><pubDate>Sun, 22 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/probability_distributions/common-distributions/</guid><description>&lt;h1 id="common-probability-distributions">
 Common Probability Distributions
 
 &lt;a class="anchor" href="#common-probability-distributions">#&lt;/a>
 
&lt;/h1>
&lt;p>Once you can describe a random variable using a pmf or pdf, the next step is to use
named distributions that appear repeatedly in real data and in ML models.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Named distributions give you ready-made probability models for common patterns:
binary outcomes, counts, and measurement noise.&lt;/p>
&lt;/blockquote>
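&lt;p>These named distributions are available directly in NumPy. A quick sketch (the parameter values are illustrative) that draws samples and checks that the sample means track the theoretical means:&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000

bern = rng.binomial(1, 0.3, size=n)     # Bernoulli(p=0.3): one binary trial
binom = rng.binomial(10, 0.3, size=n)   # Binomial(n=10, p=0.3): success count
pois = rng.poisson(4.0, size=n)         # Poisson(lam=4): event counts
norm = rng.normal(0.0, 1.0, size=n)     # Normal(0, 1): measurement noise

# Sample means track the theoretical means: p, n*p, lam, mu.
means = [bern.mean(), binom.mean(), pois.mean(), norm.mean()]
```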
&lt;hr>


&lt;pre class="mermaid">
flowchart TD
PD[&amp;#34;Probability&amp;lt;br/&amp;gt;distributions&amp;#34;] --&amp;gt; DS[&amp;#34;Common&amp;lt;br/&amp;gt;distributions&amp;#34;]

DS --&amp;gt; DIS[&amp;#34;Discrete&amp;#34;]
DS --&amp;gt; CON[&amp;#34;Continuous&amp;#34;]

DIS --&amp;gt; D1[&amp;#34;Bernoulli&amp;#34;]
DIS --&amp;gt; D2[&amp;#34;Binomial&amp;#34;]
DIS --&amp;gt; D3[&amp;#34;Poisson&amp;#34;]

CON --&amp;gt; D4[&amp;#34;Normal&amp;lt;br/&amp;gt;(Gaussian)&amp;#34;]
CON --&amp;gt; D5[&amp;#34;t / Chi-square / F&amp;lt;br/&amp;gt;(intro)&amp;#34;]

style PD fill:#90CAF9,stroke:#1E88E5,color:#000
style DS fill:#90CAF9,stroke:#1E88E5,color:#000

style DIS fill:#CE93D8,stroke:#8E24AA,color:#000
style CON fill:#CE93D8,stroke:#8E24AA,color:#000

style D1 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D2 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D3 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D4 fill:#C8E6C9,stroke:#2E7D32,color:#000
style D5 fill:#C8E6C9,stroke:#2E7D32,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="1-bernoulli-distribution-binary">
 1) Bernoulli distribution (binary)
 
 &lt;a class="anchor" href="#1-bernoulli-distribution-binary">#&lt;/a>
 
&lt;/h2>
&lt;p>Use when:
one trial has two outcomes (success/failure).&lt;/p></description></item><item><title>Ordinary Least Squares</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</guid><description>&lt;h1 id="direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">
 Direct solution method - Ordinary Least Squares and the Line of Best Fit
 
 &lt;a class="anchor" href="#direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">#&lt;/a>
 
&lt;/h1>
&lt;p>It is possible to compute the best parameters for linear regression &lt;strong>in one shot&lt;/strong> (closed-form),
instead of iteratively improving them step-by-step.&lt;/p>
&lt;p>For linear regression, the direct method is usually &lt;strong>Ordinary Least Squares (OLS)&lt;/strong>.&lt;/p>
&lt;p>Ordinary Least Squares (OLS) chooses the “best” line by &lt;strong>minimising squared prediction errors&lt;/strong>.&lt;/p>
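&lt;p>A minimal NumPy sketch of the one-shot solution via the normal equations (the synthetic data, with true slope 3 and intercept minus 2, is illustrative):&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data (illustrative): y = 3x - 2 plus noise.
x = rng.uniform(-5, 5, size=300)
y = 3.0 * x - 2.0 + rng.normal(0.0, 0.4, size=300)

# Design matrix with a bias column, then the normal equations,
# solved in one shot with no iteration.
X = np.column_stack([np.ones_like(x), x])
w = np.linalg.solve(X.T @ X, X.T @ y)
intercept, slope = w
```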
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
OLS defines “best fit” as the line that minimises the total squared residual error across all data points.&lt;/p></description></item><item><title>Cost Function</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</guid><description>&lt;h1 id="cost-function">
 Cost Function
 
 &lt;a class="anchor" href="#cost-function">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>also known as an objective function&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>how far the predicted values are from the actual ones&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>measure of the difference between predicted values and actual values&lt;/p>
&lt;/li>
&lt;li>
&lt;p>quantifies the error between a model&amp;rsquo;s predicted values and actual values&lt;/p>
&lt;/li>
&lt;li>
&lt;p>measures the model’s error on a group of datapoints&lt;/p>
&lt;/li>
&lt;li>
&lt;p>guides model fitting: the best-fit line is the one that minimises this error measure&lt;/p>
&lt;/li>
&lt;li>
&lt;p>used to evaluate the accuracy of a model’s predictions&lt;/p></description></item><item><title>Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</guid><description>&lt;h1 id="gradient-descent-for-linear-regression">
 Gradient Descent for Linear Regression
 
 &lt;a class="anchor" href="#gradient-descent-for-linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an iterative optimisation method used to minimise the regression cost function by repeatedly updating parameters in the direction that reduces error.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Iterative method&lt;/strong>&lt;/li>
&lt;li>Types: batch / stochastic / mini-batch&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Gradient descent starts with initial parameter values and repeatedly updates them using the gradient until the cost stops decreasing.&lt;/p>
&lt;/blockquote>
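&lt;p>A minimal sketch of batch gradient descent for one-variable linear regression (the synthetic data, learning rate, and iteration count are illustrative choices):&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic data (illustrative): y = 2x + 1 plus a little noise.
x = rng.uniform(0.0, 1.0, size=100)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.05, size=100)

w, b = 0.0, 0.0    # initial parameter values
lr = 0.5           # learning rate (a hyperparameter)

# Batch gradient descent: every step uses all the data.
for _ in range(2000):
    error = (w * x + b) - y
    dw = 2.0 * np.mean(error * x)   # gradient of the MSE cost w.r.t. w
    db = 2.0 * np.mean(error)       # gradient of the MSE cost w.r.t. b
    w = w - lr * dw
    b = b - lr * db
```

&lt;p>Stochastic and mini-batch variants change only how much data each step sees, not the update rule itself.&lt;/p>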


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
GD[&amp;#34;Gradient&amp;lt;br/&amp;gt;Descent&amp;#34;] --&amp;gt;|minimises| CF[&amp;#34;Cost&amp;lt;br/&amp;gt;function&amp;#34;]
GD --&amp;gt;|updates| W[&amp;#34;Parameters&amp;lt;br/&amp;gt;(weights)&amp;#34;]
GD --&amp;gt;|uses| GR[&amp;#34;Gradient&amp;lt;br/&amp;gt;(slope)&amp;#34;]

GD --&amp;gt; H[&amp;#34;Hyperparameters&amp;#34;]
H --&amp;gt; LR[&amp;#34;Learning&amp;lt;br/&amp;gt;rate&amp;#34;]
H --&amp;gt; BS[&amp;#34;Batch&amp;lt;br/&amp;gt;size&amp;#34;]
H --&amp;gt; EP[&amp;#34;Epochs&amp;#34;]

style GD fill:#90CAF9,stroke:#1E88E5,color:#000

style CF fill:#CE93D8,stroke:#8E24AA,color:#000
style W fill:#CE93D8,stroke:#8E24AA,color:#000
style GR fill:#CE93D8,stroke:#8E24AA,color:#000
style H fill:#CE93D8,stroke:#8E24AA,color:#000
style LR fill:#CE93D8,stroke:#8E24AA,color:#000
style BS fill:#CE93D8,stroke:#8E24AA,color:#000
style EP fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="types-of-gd">
 Types of GD
 
 &lt;a class="anchor" href="#types-of-gd">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Gradient Descent&amp;lt;br/&amp;gt;types&amp;#34;] --&amp;gt; BGD[&amp;#34;Batch&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; SGD[&amp;#34;Stochastic&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; MGD[&amp;#34;Mini-batch&amp;lt;br/&amp;gt;GD&amp;#34;]

BGD --&amp;gt; ALL[&amp;#34;All data&amp;lt;br/&amp;gt;per step&amp;#34;]
BGD --&amp;gt; STB[&amp;#34;Smooth&amp;lt;br/&amp;gt;updates&amp;#34;]

SGD --&amp;gt; ONE[&amp;#34;1 sample&amp;lt;br/&amp;gt;per step&amp;#34;]
SGD --&amp;gt; FAST[&amp;#34;Quick&amp;lt;br/&amp;gt;progress&amp;#34;]
SGD --&amp;gt; NOISE[&amp;#34;Noisy&amp;lt;br/&amp;gt;updates&amp;#34;]

MGD --&amp;gt; MB[&amp;#34;Small batch&amp;lt;br/&amp;gt;per step&amp;#34;]
MGD --&amp;gt; PRACT[&amp;#34;Practical&amp;lt;br/&amp;gt;default&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style BGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style SGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style MGD fill:#C8E6C9,stroke:#2E7D32,color:#000

style ALL fill:#CE93D8,stroke:#8E24AA,color:#000
style STB fill:#CE93D8,stroke:#8E24AA,color:#000
style ONE fill:#CE93D8,stroke:#8E24AA,color:#000
style FAST fill:#CE93D8,stroke:#8E24AA,color:#000
style NOISE fill:#CE93D8,stroke:#8E24AA,color:#000
style MB fill:#CE93D8,stroke:#8E24AA,color:#000
style PRACT fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h3 id="batch">
 Batch
 
 &lt;a class="anchor" href="#batch">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>Use only if you have huge compute and a lot of time to train&lt;/li>
&lt;/ul>
&lt;h3 id="sgd">
 SGD
 
 &lt;a class="anchor" href="#sgd">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>go-to solution&lt;/p></description></item><item><title>Deep Learning</title><link>https://arshadhs.github.io/docs/ai/deep-learning/</link><pubDate>Wed, 22 Apr 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/</guid><description>&lt;h1 id="deep-learning">
 Deep Learning
 
 &lt;a class="anchor" href="#deep-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Subset of ML&lt;/li>
&lt;li>focuses on algorithms inspired by the structure and function of the brain, called &lt;strong>Artificial Neural Networks&lt;/strong>.&lt;/li>
&lt;li>A &lt;a href="https://arshadhs.github.io/docs/ai/neural-network/">neural network&lt;/a> with multiple hidden layers and multiple nodes in each hidden layer is known as a deep learning system or a deep neural network.&lt;/li>
&lt;li>Allows systems to &lt;strong>automatically learn hierarchical representations&lt;/strong> (features) from raw input, such as images, sound, or text.&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="operational-steps-for-neural-architectures">
 Operational Steps for Neural Architectures
 
 &lt;a class="anchor" href="#operational-steps-for-neural-architectures">#&lt;/a>
 
&lt;/h2>
&lt;table>
 &lt;thead>
 &lt;tr>
 &lt;th>Step&lt;/th>
 &lt;th>Perceptron (Boolean/Logic)&lt;/th>
 &lt;th>Linear Regression Network&lt;/th>
 &lt;th>Binary Classification (Logistic)&lt;/th>
 &lt;th>DFNN / MLP (Classification)&lt;/th>
 &lt;/tr>
 &lt;/thead>
 &lt;tbody>
 &lt;tr>
 &lt;td>&lt;strong>1. Input&lt;/strong>&lt;/td>
 &lt;td>Take binary or discrete inputs 
&lt;span>
 \( x_1, \dots, x_n \)
 &lt;/span>

&lt;/td>
 &lt;td>Take numerical features 
&lt;span>
 \( x \)
 &lt;/span>

&lt;/td>
 &lt;td>Take numerical features 
&lt;span>
 \( x \)
 &lt;/span>

&lt;/td>
 &lt;td>Take high-dimensional numerical or categorical features&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;strong>2. Weighted Sum&lt;/strong>&lt;/td>
 &lt;td>Single calculation: 
&lt;span>
 \( z = \sum (w_i x_i) + b \)
 &lt;/span>

&lt;/td>
 &lt;td>Single calculation: 
&lt;span>
 \( \hat{y} = w_0 + w_1 x \)
 &lt;/span>

&lt;/td>
 &lt;td>Single calculation: 
&lt;span>
 \( z = W x + b \)
 &lt;/span>

&lt;/td>
 &lt;td>Multiple stages: 
&lt;span>
 \( z^{[l]} = W^{[l]} a^{[l-1]} + b^{[l]} \)
 &lt;/span>

 for each layer 
&lt;span>
 \( l \)
 &lt;/span>

&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;strong>3. Activation&lt;/strong>&lt;/td>
 &lt;td>Step Function: Output 1 if 
&lt;span>
 \( z \geq 0 \)
 &lt;/span>

, else 0&lt;/td>
 &lt;td>Identity: The output remains 
&lt;span>
 \( z \)
 &lt;/span>

 (no non-linear change)&lt;/td>
 &lt;td>Sigmoid: Maps 
&lt;span>
 \( z \)
 &lt;/span>

 to a probability between 0 and 1&lt;/td>
 &lt;td>ReLU for hidden layers; Softmax/Sigmoid for the output layer&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;strong>4. Loss / Error&lt;/strong>&lt;/td>
 &lt;td>Error = Target − Output&lt;/td>
 &lt;td>Mean Squared Error (MSE): 
&lt;span>
 \( J = \frac{1}{2N} \sum (Y - \hat{y})^2 \)
 &lt;/span>

&lt;/td>
 &lt;td>Binary Cross-Entropy (BCE): penalises based on probability distance&lt;/td>
 &lt;td>BCE or Categorical Cross-Entropy for multiple classes&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;strong>5. Optimisation&lt;/strong>&lt;/td>
 &lt;td>Update weights only on misclassification&lt;/td>
 &lt;td>Gradient Descent: compute gradients of the cost and update weights iteratively&lt;/td>
 &lt;td>Backpropagation: compute error signals 
&lt;span>
 \( \delta \)
 &lt;/span>

 and gradients 
&lt;span>
 \( dW \)
 &lt;/span>

&lt;/td>
 &lt;td>Backpropagation: recursive chain rule to update all hidden layer weights&lt;/td>
 &lt;/tr>
 &lt;tr>
 &lt;td>&lt;strong>6. Output&lt;/strong>&lt;/td>
 &lt;td>Discrete Boolean value (0 or 1)&lt;/td>
 &lt;td>Continuous numerical value (e.g., house prices)&lt;/td>
 &lt;td>Single probability score or class label&lt;/td>
 &lt;td>A vector of probabilities for multiple classes&lt;/td>
 &lt;/tr>
 &lt;/tbody>
&lt;/table>
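&lt;p>Steps 1 to 3 of the table can be sketched as a single forward pass in NumPy (the layer sizes and random weights here are illustrative, not trained values):&lt;/p>

```python
import numpy as np

rng = np.random.default_rng(6)

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One forward pass: z[l] = W[l] a[l-1] + b[l], then an activation.
x = rng.normal(size=4)                            # 4 input features

W1, b1 = rng.normal(size=(3, 4)), np.zeros(3)     # hidden layer (ReLU)
W2, b2 = rng.normal(size=(1, 3)), np.zeros(1)     # output layer (sigmoid)

a1 = relu(W1 @ x + b1)            # hidden activations, all non-negative
y_hat = sigmoid(W2 @ a1 + b2)     # a probability strictly between 0 and 1
```

&lt;p>Training (steps 4 and 5) would then compare y_hat with a label and backpropagate the error.&lt;/p>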
&lt;hr>




&lt;ul>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/">Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/">Artificial Neuron and Perceptron&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/030-linear-neural-networks-for-regression/">LNN for Regression&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/035-gradient-descent-algorithm/">Gradient Descent Algorithm&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/040-linear-neural-networks-for-classification/">LNN for Classification&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/050-deep-feedforward/">Deep Feedforward Neural Networks (DFNN) for Classification&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/060-cnn-fundamentals/">Convolutional Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/065-deep-cnn-architectures/">Deep CNN Architectures&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/067-cnn-model/">CNN Pipeline&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/070-recurrent-nn/">Recurrent Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/075-recurrent-nn-deep/">Deep Recurrent Neural Networks&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/">Attention Mechanism&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/">Transformer&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/">Optimisation of Deep models&lt;/a>
 &lt;/li>
 
 
 
 
 &lt;li>
 &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/">Regularisation for Deep models&lt;/a>
 &lt;/li>
 
 

 
 
&lt;/ul>


&lt;hr>


&lt;pre class="mermaid">
flowchart LR
 %% Input Layer
 subgraph subGraph0[&amp;#34;Input Layer&amp;#34;]
 I1((&amp;#34;Input 1&amp;#34;))
 I2((&amp;#34;Input 2&amp;#34;))
 I3((&amp;#34;Input 3&amp;#34;))
 end

 %% Hidden Layers
 subgraph subGraph1[&amp;#34;Hidden Layer 1&amp;#34;]
 H1a((&amp;#34;H1-1&amp;#34;))
 H1b((&amp;#34;H1-2&amp;#34;))
 H1c((&amp;#34;H1-3&amp;#34;))
 end

 subgraph subGraph2[&amp;#34;Hidden Layer 2&amp;#34;]
 H2a((&amp;#34;H2-1&amp;#34;))
 H2b((&amp;#34;H2-2&amp;#34;))
 H2c((&amp;#34;H2-3&amp;#34;))
 end

 subgraph subGraph3[&amp;#34;Hidden Layer 3&amp;#34;]
 H3a((&amp;#34;H3-1&amp;#34;))
 H3b((&amp;#34;H3-2&amp;#34;))
 H3c((&amp;#34;H3-3&amp;#34;))
 end

 %% Output Layer
 subgraph subGraph4[&amp;#34;Output Layer&amp;#34;]
 O((&amp;#34;Output&amp;#34;))
 end

 %% Connections: Input to Hidden Layer 1
 I1 --&amp;gt; H1a &amp;amp; H1b &amp;amp; H1c
 I2 --&amp;gt; H1a &amp;amp; H1b &amp;amp; H1c
 I3 --&amp;gt; H1a &amp;amp; H1b &amp;amp; H1c

 %% Connections: Hidden Layer 1 to Hidden Layer 2
 H1a --&amp;gt; H2a &amp;amp; H2b &amp;amp; H2c
 H1b --&amp;gt; H2a &amp;amp; H2b &amp;amp; H2c
 H1c --&amp;gt; H2a &amp;amp; H2b &amp;amp; H2c

 %% Connections: Hidden Layer 2 to Hidden Layer 3
 H2a --&amp;gt; H3a &amp;amp; H3b &amp;amp; H3c
 H2b --&amp;gt; H3a &amp;amp; H3b &amp;amp; H3c
 H2c --&amp;gt; H3a &amp;amp; H3b &amp;amp; H3c

 %% Connections: Hidden Layer 3 to Output
 H3a --&amp;gt; O
 H3b --&amp;gt; O
 H3c --&amp;gt; O

 %% Styling
 style I1 fill:#C8E6C9
 style I2 fill:#C8E6C9
 style I3 fill:#C8E6C9
 style H1a fill:#BBDEFB
 style H1b fill:#BBDEFB
 style H1c fill:#BBDEFB
 style H2a fill:#90CAF9
 style H2b fill:#90CAF9
 style H2c fill:#90CAF9
 style H3a fill:#64B5F6
 style H3b fill:#64B5F6
 style H3c fill:#64B5F6
 style O fill:#FFCDD2
 style subGraph0 stroke:none,fill:transparent
 style subGraph1 stroke:none,fill:transparent
 style subGraph2 stroke:none,fill:transparent
 style subGraph3 stroke:none,fill:transparent
 style subGraph4 stroke:none,fill:transparent
&lt;/pre>

&lt;hr>
&lt;h2 id="types-of-neural-networks">
 Types of Neural Networks
 
 &lt;a class="anchor" href="#types-of-neural-networks">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Standard NN - small, fully connected networks for smaller, simpler data (e.g. real-estate pricing)&lt;/li>
&lt;li>CNN - Convolutional - used for images (e.g. photo tagging, object detection)&lt;/li>
&lt;li>RNN - Recurrent - used for sequential data such as text and speech (e.g. speech recognition, translation)&lt;/li>
&lt;li>Hybrid NN - combinations of the above (e.g. autonomous driving)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="components-of-dl">
 Components of DL
 
 &lt;a class="anchor" href="#components-of-dl">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Data&lt;/li>
&lt;li>Learning Algorithm : How to transform data&lt;/li>
&lt;li>&lt;strong>Loss Function&lt;/strong>: objective function that &lt;strong>quantifies how well the model is doing&lt;/strong>; the lower the loss, the better the model.&lt;/li>
&lt;li>Optimisation Algorithm: searches for the parameter values that &lt;strong>minimise the loss function&lt;/strong>. Popular optimisation algorithms for deep learning are based on an approach called &lt;strong>gradient descent&lt;/strong>.&lt;/li>
&lt;li>Model&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="applications">
 Applications
 
 &lt;a class="anchor" href="#applications">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Computer Vision (e.g., face detection, medical imaging)&lt;/li>
&lt;li>Natural Language Processing (e.g., ChatGPT, translation)&lt;/li>
&lt;li>Self Driving Cars&lt;/li>
&lt;li>Speech Assistants (e.g., Alexa, Siri)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="intution">
 Intuition
 
 &lt;a class="anchor" href="#intution">#&lt;/a>
 
&lt;/h2>
&lt;p>Deep Learning is the methodology, DNN is a model.&lt;/p></description></item><item><title>Hypothesis Testing</title><link>https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/</link><pubDate>Thu, 12 Mar 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/04_hypothesis_testing/</guid><description>&lt;h1 id="hypothesis-testing">
 Hypothesis Testing
 
 &lt;a class="anchor" href="#hypothesis-testing">#&lt;/a>
 
&lt;/h1>
&lt;p>Hypothesis testing is a structured way to decide:&lt;/p>
&lt;p>Is what we see in a sample just random variation,
or is there evidence of a real effect in the population?&lt;/p>
&lt;p>Hypothesis testing sits inside &lt;strong>inferential statistics&lt;/strong>:
we use a &lt;strong>sample&lt;/strong> to make a statement about a &lt;strong>population&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>Sampling (random and stratified)&lt;/li>
&lt;li>Sampling distribution and Central Limit Theorem&lt;/li>
&lt;li>Estimation (confidence intervals and confidence level)&lt;/li>
&lt;li>Testing hypotheses (mean, proportion, ANOVA)&lt;/li>
&lt;li>Maximum likelihood (MLE)&lt;/li>
&lt;/ul>
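&lt;p>A minimal sketch of a test about a mean, using a large-sample normal approximation (the null value 50, the true mean 55, and the sample size are all illustrative assumptions):&lt;/p>

```python
import math
import numpy as np

rng = np.random.default_rng(7)

# H0: the population mean is 50. The sample actually comes from a
# population with mean 55, so we expect strong evidence against H0.
sample = rng.normal(loc=55.0, scale=10.0, size=400)

n = sample.size
se = sample.std(ddof=1) / math.sqrt(n)    # standard error of the mean
z = (sample.mean() - 50.0) / se           # test statistic under H0

# Two-sided p-value from the normal approximation (n is large).
p_value = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))

alpha = 0.05
reject_h0 = bool(np.less(p_value, alpha))
```

&lt;p>For small samples the t distribution replaces the normal approximation, but the logic is the same.&lt;/p>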
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
The logic is always the same:&lt;/p></description></item><item><title>Classification(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</guid><description>&lt;h1 id="linear-models-for-classification">
 Linear models for Classification
 
 &lt;a class="anchor" href="#linear-models-for-classification">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>categorises data by finding a linear boundary (hyperplane) that separates classes&lt;/li>
&lt;li>calculating a weighted sum of input features plus bias&lt;/li>
&lt;/ul>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Linear&amp;lt;br/&amp;gt;classification&amp;lt;br/&amp;gt;models&amp;#34;] --&amp;gt; P[&amp;#34;Perceptron&amp;#34;]
T --&amp;gt; LR[&amp;#34;Logistic&amp;lt;br/&amp;gt;regression&amp;#34;]
T --&amp;gt; SVM[&amp;#34;Linear&amp;lt;br/&amp;gt;SVM&amp;#34;]

P --&amp;gt;|uses| STEP[&amp;#34;Step&amp;lt;br/&amp;gt;activation&amp;#34;]
LR --&amp;gt;|uses| SIG[&amp;#34;Sigmoid&amp;lt;br/&amp;gt;+ log loss&amp;#34;]
SVM --&amp;gt;|uses| HNG[&amp;#34;Hinge&amp;lt;br/&amp;gt;loss&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style P fill:#C8E6C9,stroke:#2E7D32,color:#000
style LR fill:#C8E6C9,stroke:#2E7D32,color:#000
style SVM fill:#C8E6C9,stroke:#2E7D32,color:#000

style STEP fill:#CE93D8,stroke:#8E24AA,color:#000
style SIG fill:#CE93D8,stroke:#8E24AA,color:#000
style HNG fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h2 id="discriminant-functions">
 Discriminant Functions
 
 &lt;a class="anchor" href="#discriminant-functions">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="decision-theory">
 Decision Theory
 
 &lt;a class="anchor" href="#decision-theory">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-discriminative-classifiers">
 Probabilistic Discriminative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-discriminative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;h2 id="logistic-regression">
 Logistic Regression
 
 &lt;a class="anchor" href="#logistic-regression">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Supervised machine learning algorithm&lt;/li>
&lt;li>Binary &lt;strong>classification&lt;/strong> algorithm&lt;/li>
&lt;li>assumes a linear decision boundary, but does not require the classes to be perfectly linearly separable (perfect separability actually makes the weights diverge)&lt;/li>
&lt;li>predicts the probability that an input belongs to a specific class&lt;/li>
&lt;li>uses the &lt;strong>sigmoid function&lt;/strong> to convert the linear score into a probability between 0 and 1&lt;/li>
&lt;/ul>
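&lt;p>A minimal sketch of these ideas in plain Python (the single-example gradient loop below is illustrative only, not a full training implementation):&lt;/p>

```python
import math

def sigmoid(z):
    # squashes any real score into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    # P(y = 1 | x) = sigmoid(w . x + b)
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def log_loss(p, y):
    # y in {0, 1}; confident wrong predictions are penalised heavily
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def sgd_step(w, b, x, y, lr=0.1):
    # the gradient of log-loss w.r.t. the score z is simply (p - y)
    p = predict_proba(w, b, x)
    err = p - y
    w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w, b - lr * err

w, b = [0.0, 0.0], 0.0
x, y = [1.0, 2.0], 1
before = log_loss(predict_proba(w, b, x), y)
for _ in range(50):
    w, b = sgd_step(w, b, x, y)
after = log_loss(predict_proba(w, b, x), y)
print(before > after)  # True: the loss falls as w, b fit the point
```

&lt;p>The simple form of the gradient, &lt;code>p - y&lt;/code>, is what makes minimising log-loss equivalent to maximising the likelihood.&lt;/p>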
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Logistic regression predicts $P(y=1\mid x)$ using a sigmoid of a linear score $z=w\cdot x+b$,
then learns $w,b$ by maximising likelihood (equivalently minimising log-loss).&lt;/p></description></item><item><title>Foundation Models</title><link>https://arshadhs.github.io/docs/ai/genai/foundation-model/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/foundation-model/</guid><description>&lt;h1 id="foundation-model">
 Foundation Model
 
 &lt;a class="anchor" href="#foundation-model">#&lt;/a>
 
&lt;/h1>
&lt;p>AI models trained on massive datasets to perform a wide range of tasks with minimal fine-tuning.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>are large deep learning neural networks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>trained on &lt;strong>massive and diverse datasets&lt;/strong> (text, images, audio, or multiple modalities)&lt;/p>
&lt;/li>
&lt;li>
&lt;p>contain &lt;strong>millions or billions of parameters&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>designed for &lt;strong>general-purpose intelligence&lt;/strong>: a broad range of tasks, not a single task&lt;/p>
&lt;/li>
&lt;li>
&lt;p>acts as &lt;strong>base models&lt;/strong> for building specialised AI applications&lt;/p></description></item><item><title>LLM - Model</title><link>https://arshadhs.github.io/docs/ai/genai/llm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/llm/</guid><description>&lt;h1 id="llm--large-language-model">
 LLM – Large Language Model
 
 &lt;a class="anchor" href="#llm--large-language-model">#&lt;/a>
 
&lt;/h1>
&lt;p>Large Language Models (LLMs) are &lt;strong>advanced AI systems&lt;/strong> designed to process, understand, and generate &lt;strong>human-like text&lt;/strong>.&lt;/p>
&lt;p>They learn language by analysing &lt;strong>massive amounts of text data&lt;/strong>, discovering patterns in:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>grammar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>meaning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>context&lt;/p>
&lt;/li>
&lt;li>
&lt;p>relationships between words and sentences&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Key characteristics:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Built on &lt;strong>Deep Learning&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Implemented using &lt;strong>Neural Networks&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Based on &lt;strong>Transformers&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often combined with tools like:&lt;/p>
&lt;ul>
&lt;li>Retrieval (RAG)&lt;/li>
&lt;li>Agents&lt;/li>
&lt;li>External APIs&lt;/li>
&lt;li>Memory systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="what-makes-an-llm-special">
 What makes an LLM special?
 
 &lt;a class="anchor" href="#what-makes-an-llm-special">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Built using &lt;strong>deep neural networks&lt;/strong>&lt;/li>
&lt;li>Trained on &lt;strong>very large datasets&lt;/strong> (books, articles, code, web text)&lt;/li>
&lt;li>Can perform many tasks &lt;strong>without task-specific training&lt;/strong>&lt;/li>
&lt;li>General-purpose language understanding, not single-task models&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="foundation-transformer-architecture">
 Foundation: Transformer Architecture
 
 &lt;a class="anchor" href="#foundation-transformer-architecture">#&lt;/a>
 
&lt;/h2>
&lt;p>LLMs are based on the &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/transformer/">Transformer Architecture&lt;/a>&lt;/strong>, which allows models to understand &lt;strong>context and long-range dependencies&lt;/strong> in text.&lt;/p></description></item><item><title>AI Agents</title><link>https://arshadhs.github.io/docs/ai/genai/ai-agents/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/ai-agents/</guid><description>&lt;h1 id="ai-agents">
 AI Agents
 
 &lt;a class="anchor" href="#ai-agents">#&lt;/a>
 
&lt;/h1>
&lt;p>Also referred to as Agentic AI.&lt;/p>
&lt;p>AI agents are &lt;strong>intelligent systems&lt;/strong> that can &lt;strong>plan, make decisions, and take actions&lt;/strong> to achieve goals with &lt;strong>minimal human intervention&lt;/strong>.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>A common use case is &lt;strong>task automation&lt;/strong>, for example booking travel based on a user’s request.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>AI agents typically build on &lt;strong>Generative AI&lt;/strong> and use &lt;strong>Large Language Models (LLMs)&lt;/strong> as the reasoning core.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Agents often interact with tools (APIs, databases, calendars) to complete multi-step workflows.&lt;/p></description></item><item><title>Retrieval-Augmented Generation (RAG)</title><link>https://arshadhs.github.io/docs/ai/genai/rag/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/rag/</guid><description>&lt;h1 id="retrieval-augmented-generation-rag">
 Retrieval-Augmented Generation (RAG)
 
 &lt;a class="anchor" href="#retrieval-augmented-generation-rag">#&lt;/a>
 
&lt;/h1>
&lt;p>&lt;strong>Retrieval-Augmented Generation (RAG)&lt;/strong> is a system design pattern that improves an LLM’s answers by:&lt;/p>
&lt;ol>
&lt;li>&lt;strong>Retrieving&lt;/strong> relevant information from an external knowledge source, and then&lt;/li>
&lt;li>&lt;strong>Augmenting&lt;/strong> the LLM prompt with that retrieved context before generating the final response.&lt;/li>
&lt;/ol>
&lt;p>RAG helps an LLM &lt;strong>look things up first&lt;/strong>, then &lt;strong>answer using evidence&lt;/strong>.&lt;/p>
&lt;hr>
&lt;h2 id="why-rag-is-useful">
 Why RAG is Useful
 
 &lt;a class="anchor" href="#why-rag-is-useful">#&lt;/a>
 
&lt;/h2>
&lt;p>RAG is commonly used when:&lt;/p>
&lt;ul>
&lt;li>Your knowledge is in &lt;strong>private documents&lt;/strong> (PDFs, policies, internal wiki)&lt;/li>
&lt;li>You need &lt;strong>up-to-date information&lt;/strong> (things not in the model’s training data)&lt;/li>
&lt;li>You want fewer &lt;strong>hallucinations&lt;/strong> by grounding answers in retrieved sources&lt;/li>
&lt;li>You want &lt;strong>traceability&lt;/strong> (show “where the answer came from”)&lt;/li>
&lt;/ul>
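&lt;p>A toy sketch of the retrieve-then-augment loop. The similarity here is naive word overlap and &lt;code>build_prompt&lt;/code> is a hypothetical helper; real systems use embeddings and a vector store, but the pattern is the same:&lt;/p>

```python
# Toy RAG loop: rank documents by (naive) word overlap with the query,
# then splice the top-k into the prompt before calling the LLM.
def similarity(query, doc):
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)  # Jaccard overlap of word sets

def retrieve(query, corpus, k=2):
    ranked = sorted(corpus, key=lambda d: similarity(query, d), reverse=True)
    return ranked[:k]

def build_prompt(query, contexts):
    ctx = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using ONLY this context:\n{ctx}\n\nQuestion: {query}"

corpus = [
    "The refund policy allows returns within 30 days.",
    "Shipping takes 3 to 5 business days.",
    "Support is available via email and chat.",
]
query = "What is the refund policy?"
prompt = build_prompt(query, retrieve(query, corpus))
print(prompt)  # grounded prompt, ready to send to the model
```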
&lt;blockquote class="book-hint info">
&lt;p>RAG does not change the model weights.&lt;br>
It changes what the model &lt;em>sees&lt;/em> at inference time by adding retrieved context.&lt;/p></description></item><item><title>Decision Tree</title><link>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</guid><description>&lt;h1 id="decision-tree">
 Decision Tree
 
 &lt;a class="anchor" href="#decision-tree">#&lt;/a>
 
&lt;/h1>
&lt;p>A decision tree classifies an example by asking a sequence of questions about its attributes until it reaches a leaf (final decision).&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A decision tree grows by repeatedly splitting the training data into &lt;strong>purer&lt;/strong> subsets using an impurity measure
(Entropy / Gini / Classification Error).&lt;/p>
&lt;/blockquote>
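&lt;p>The impurity measures named above can be sketched in a few lines of plain Python (labels below are made-up examples):&lt;/p>

```python
import math
from collections import Counter

def proportions(labels):
    n = len(labels)
    return [c / n for c in Counter(labels).values()]

def entropy(labels):
    # H = -sum p*log2(p): 0 for a pure node, 1 for a 50/50 binary node
    return sum(-p * math.log2(p) for p in proportions(labels))

def gini(labels):
    # G = 1 - sum p^2: 0 for a pure node, 0.5 for a 50/50 binary node
    return 1.0 - sum(p * p for p in proportions(labels))

def information_gain(parent, splits):
    # impurity drop achieved by a split (measured here with entropy)
    n = len(parent)
    return entropy(parent) - sum(len(s) / n * entropy(s) for s in splits)

mixed = ["yes", "yes", "no", "no"]
print(entropy(mixed), gini(mixed))  # 1.0 0.5
print(information_gain(mixed, [["yes", "yes"], ["no", "no"]]))  # 1.0: a perfect split
```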
&lt;hr>
&lt;h2 id="information-theory">
 Information Theory
 
 &lt;a class="anchor" href="#information-theory">#&lt;/a>
 
&lt;/h2>
&lt;p>Decision trees need a way to measure:
“How mixed are the class labels at a node?”&lt;/p></description></item><item><title>Prediction &amp; Forecasting</title><link>https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/05_prediction_n_forecasting/</guid><description>&lt;h1 id="prediction--forecasting">
 Prediction &amp;amp; Forecasting
 
 &lt;a class="anchor" href="#prediction--forecasting">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="correlation">
 Correlation
 
 &lt;a class="anchor" href="#correlation">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="regression">
 Regression
 
 &lt;a class="anchor" href="#regression">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="time-series-analysis">
 Time Series Analysis
 
 &lt;a class="anchor" href="#time-series-analysis">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="introduction-components-of-time-series-data">
 Introduction, Components of time series data
 
 &lt;a class="anchor" href="#introduction-components-of-time-series-data">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="ma-model--basic-and-weighted-ma-model">
 MA model – basic and weighted MA model
 
 &lt;a class="anchor" href="#ma-model--basic-and-weighted-ma-model">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="time-series-models">
 Time series models
 
 &lt;a class="anchor" href="#time-series-models">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>AR Model&lt;/li>
&lt;li>ARIMA Model&lt;/li>
&lt;li>SARIMA, SARIMAX, VAR, VARMAX&lt;/li>
&lt;li>Simple exponential smoothing model&lt;/li>
&lt;/ul>
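&lt;p>The basic and weighted MA models mentioned above can be sketched as one-step forecasts (the sales series below is made-up data):&lt;/p>

```python
def simple_ma(series, window):
    """Forecast the next value as the mean of the last `window` observations."""
    return sum(series[-window:]) / window

def weighted_ma(series, weights):
    """Weights (most recent last) should sum to 1; recent points count more."""
    recent = series[-len(weights):]
    return sum(w * x for w, x in zip(weights, recent))

sales = [10, 12, 13, 12, 15, 16]
print(simple_ma(sales, 3))                  # (12 + 15 + 16) / 3
print(weighted_ma(sales, [0.2, 0.3, 0.5]))  # 0.2*12 + 0.3*15 + 0.5*16 = 14.9
```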
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/statistics/">
 Statistics
&lt;/a>&lt;/p></description></item><item><title>Gaussian Mixture model &amp; Expectation Maximization</title><link>https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/statistics/06_prediction_n_forecasting/</guid><description>&lt;h1 id="gaussian-mixture-model--expectation-maximization">
 Gaussian Mixture model &amp;amp; Expectation Maximization
 
 &lt;a class="anchor" href="#gaussian-mixture-model--expectation-maximization">#&lt;/a>
 
&lt;/h1>
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;p>&lt;a href="https://www.geeksforgeeks.org/machine-learning/gaussian-mixture-model/">Gaussian Mixture model&lt;/a>&lt;/p>
&lt;p>&lt;a href="https://www.geeksforgeeks.org/machine-learning/ml-expectation-maximization-algorithm/">Expectation Maximization&lt;/a>&lt;/p>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/statistics/">
 Statistics
&lt;/a>&lt;/p></description></item><item><title>Instance-based Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</guid><description>&lt;h1 id="instance-based-learning">
 Instance-based Learning
 
 &lt;a class="anchor" href="#instance-based-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Instance-based learning is a family of methods that &lt;strong>do not build one explicit global model during training&lt;/strong>. Instead, they &lt;strong>store training examples&lt;/strong> and delay most of the work until a new query arrives.&lt;/p>
&lt;p>When a new point must be classified or predicted, the algorithm compares it with previously seen examples, finds the most relevant neighbours, and uses them to produce the answer.&lt;/p>
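&lt;p>This query-time behaviour is easiest to see in k-nearest neighbours, the canonical instance-based method. A minimal sketch (k = 3 and the training points are made up):&lt;/p>

```python
import math
from collections import Counter

def knn_classify(query, examples, k=3):
    """examples: list of (feature_vector, label) pairs.
    'Training' is just storing them; all work happens here, at query time."""
    nearest = sorted(examples, key=lambda ex: math.dist(query, ex[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [([1.0, 1.0], "A"), ([1.2, 0.8], "A"),
         ([4.0, 4.0], "B"), ([4.2, 3.9], "B"), ([3.8, 4.1], "B")]
print(knn_classify([1.1, 0.9], train))  # "A": 2 of the 3 nearest are A
```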
&lt;p>Instance-based Learning covers three linked ideas:&lt;/p></description></item><item><title>Support Vector Machine</title><link>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</guid><description>&lt;h1 id="support-vector-machine-svm">
 Support Vector Machine (SVM)
 
 &lt;a class="anchor" href="#support-vector-machine-svm">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>Support Vector Machine (SVM)&lt;/strong> is a &lt;strong>supervised machine learning algorithm&lt;/strong> used for:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Classification&lt;/strong> (most common)&lt;/li>
&lt;li>&lt;strong>Regression&lt;/strong> (SVR – Support Vector Regression)&lt;/li>
&lt;/ul>
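&lt;p>The margin geometry behind SVMs reduces to two formulas: a point’s distance to the hyperplane is |w·x + b| / ||w||, and for a canonical SVM (score ±1 at the support vectors) the margin width is 2 / ||w||. A small numeric sketch (w and b are illustrative, not learned):&lt;/p>

```python
import math

def distance_to_hyperplane(w, b, x):
    """Geometric distance |w.x + b| / ||w|| from point x to the boundary."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(z) / norm

def margin_width(w):
    """Canonical SVM: support vectors satisfy |w.x + b| = 1, so width = 2 / ||w||."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

w, b = [3.0, 4.0], -5.0                           # ||w|| = 5
print(distance_to_hyperplane(w, b, [2.0, 1.0]))   # |6 + 4 - 5| / 5 = 1.0
print(margin_width(w))                            # 2 / 5 = 0.4
```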

&lt;blockquote class="book-hint">
 &lt;p>Find the decision boundary that separates classes with the &lt;strong>maximum margin&lt;/strong>.&lt;/p>
&lt;/blockquote>&lt;blockquote class="book-hint default">
&lt;p>A Support Vector Machine is a supervised learning algorithm that finds an optimal hyperplane by maximising the margin between classes, using support vectors and kernel functions to handle non-linear data.&lt;/p></description></item><item><title>Attention Mechanism</title><link>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</guid><description>&lt;h1 id="attention-mechanism">
 Attention Mechanism
 
 &lt;a class="anchor" href="#attention-mechanism">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Queries, Keys, and Values&lt;/li>
&lt;li>Attention Pooling by Similarity&lt;/li>
&lt;li>Attention Pooling via Nadaraya–Watson Regression&lt;/li>
&lt;li>Attention Scoring Functions&lt;/li>
&lt;li>Dot Product Attention&lt;/li>
&lt;li>Convenience Functions&lt;/li>
&lt;li>Scaled Dot Product Attention&lt;/li>
&lt;li>Additive Attention&lt;/li>
&lt;li>Bahdanau Attention Mechanism&lt;/li>
&lt;li>Multi-Head Attention&lt;/li>
&lt;li>Self-Attention&lt;/li>
&lt;li>Positional Encoding&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
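&lt;p>Scaled dot-product attention, the core of the list above, is softmax(QKᵀ/√d&lt;sub>k&lt;/sub>)V. A pure-Python sketch on tiny hand-written matrices (no batching or masking, illustrative only):&lt;/p>

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V; one row per query."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)  # how much each key/value pair matters to q
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]                      # one query
K = [[1.0, 0.0], [0.0, 1.0]]          # two keys
V = [[10.0, 0.0], [0.0, 10.0]]        # their values
print(scaled_dot_product_attention(Q, K, V))  # pulled toward the matching key's value
```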
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>. Cambridge University Press. (&lt;a href="https://d2l.ai/chapter_builders-guide/model-construction.html">Ch 10&lt;/a>, &lt;a href="https://d2l.ai/chapter_convolutional-neural-networks/index.html">Ch 7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Bayesian Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</guid><description>&lt;h1 id="bayesian-learning">
 Bayesian Learning
 
 &lt;a class="anchor" href="#bayesian-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="mle-hypothesis">
 MLE Hypothesis
 
 &lt;a class="anchor" href="#mle-hypothesis">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="map-hypothesis">
 MAP Hypothesis
 
 &lt;a class="anchor" href="#map-hypothesis">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bayes-rule">
 Bayes Rule
 
 &lt;a class="anchor" href="#bayes-rule">#&lt;/a>
 
&lt;/h2>
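&lt;p>Bayes’ rule, P(H|D) = P(D|H)·P(H) / P(D), in a classic diagnostic-test calculation (the numbers below are made up for illustration):&lt;/p>

```python
def bayes(prior, likelihood, evidence):
    """Posterior P(H|D) = P(D|H) * P(H) / P(D)."""
    return likelihood * prior / evidence

# Assumed numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p_disease = 0.01
p_pos_given_disease = 0.95
# Total probability of testing positive (law of total probability):
p_pos = p_pos_given_disease * p_disease + 0.05 * (1 - p_disease)
posterior = bayes(p_disease, p_pos_given_disease, p_pos)
print(round(posterior, 3))  # 0.161: a positive test is far from conclusive
```

&lt;p>The low posterior despite a 95% sensitive test shows why the prior matters: most positives come from the much larger healthy population.&lt;/p>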
&lt;h2 id="optimal-bayes-classifier">
 Optimal Bayes Classifier
 
 &lt;a class="anchor" href="#optimal-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="naïve-bayes-classifier">
 Naïve Bayes Classifier
 
 &lt;a class="anchor" href="#na%c3%afve-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-generative-classifiers">
 Probabilistic Generative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-generative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bayesian-linear-regression">
 Bayesian Linear Regression
 
 &lt;a class="anchor" href="#bayesian-linear-regression">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Transformer</title><link>https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/</link><pubDate>Mon, 15 Dec 2025 10:55:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/090-transformer/</guid><description>&lt;h1 id="transformer">
 Transformer
 
 &lt;a class="anchor" href="#transformer">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>is an architecture of neural networks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>based on the multi-head attention mechanism&lt;/p>
&lt;/li>
&lt;li>
&lt;p>text is split into units called tokens; each token is mapped to a numerical ID and then converted into a vector via lookup in a word-embedding table&lt;/p>
&lt;/li>
&lt;li>
&lt;p>takes a text sequence as input and produces another text sequence as output&lt;/p>
&lt;/li>
&lt;li>
&lt;p>foundation for modern &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/genai/llm/">Large Language Models (LLMs)&lt;/a>&lt;/strong> like ChatGPT and Gemini&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Transformer architecture&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Model, Positionwise Feed-Forward Networks, Residual Connection and Layer Normalization&lt;/p></description></item><item><title>Ensemble Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</guid><description>&lt;h1 id="ensemble-learning">
 Ensemble Learning
 
 &lt;a class="anchor" href="#ensemble-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="combining-classifiers">
 Combining Classifiers
 
 &lt;a class="anchor" href="#combining-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bagging">
 Bagging
 
 &lt;a class="anchor" href="#bagging">#&lt;/a>
 
&lt;/h2>
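&lt;p>Bagging = &lt;em>b&lt;/em>ootstrap &lt;em>agg&lt;/em>regating: train each base learner on a bootstrap resample, then combine by majority vote. The two building blocks, sketched in plain Python (base learners themselves omitted):&lt;/p>

```python
import random
from collections import Counter

def bootstrap(data, rng):
    """Sample n examples WITH replacement: each base learner gets its own view."""
    return [rng.choice(data) for _ in data]

def majority_vote(predictions):
    """Combine the base learners' outputs: the most common label wins."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)
data = list(range(10))
print(len(bootstrap(data, rng)))       # 10: same size, duplicates likely
print(majority_vote(["A", "B", "A"]))  # A
```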
&lt;h2 id="random-forest">
 Random Forest
 
 &lt;a class="anchor" href="#random-forest">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="boosting">
 Boosting
 
 &lt;a class="anchor" href="#boosting">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="adaboost">
 ADABoost
 
 &lt;a class="anchor" href="#adaboost">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="gradient-boosting">
 Gradient Boosting
 
 &lt;a class="anchor" href="#gradient-boosting">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="xgboost">
 XGBoost
 
 &lt;a class="anchor" href="#xgboost">#&lt;/a>
 
&lt;/h3>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Optimisation of Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</guid><description>&lt;h1 id="optimisation-of-deep-models">
 Optimisation of Deep models
 
 &lt;a class="anchor" href="#optimisation-of-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Goal of Optimization&lt;/li>
&lt;li>Optimization Challenges in Deep Learning&lt;/li>
&lt;li>Gradient Descent&lt;/li>
&lt;li>Stochastic Gradient Descent&lt;/li>
&lt;li>Minibatch Stochastic Gradient Descent&lt;/li>
&lt;li>Momentum&lt;/li>
&lt;li>Adagrad and its algorithm&lt;/li>
&lt;li>RMSProp and its algorithm&lt;/li>
&lt;li>Adadelta and its algorithm&lt;/li>
&lt;li>Adam and its algorithm&lt;/li>
&lt;li>Code Implementation and comparison of algorithms (webinar)&lt;/li>
&lt;/ul>
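&lt;p>The momentum update from the list above, sketched on a toy one-dimensional quadratic loss (the loss, learning rate, and decay factor are illustrative choices):&lt;/p>

```python
def grad(w):
    # gradient of the toy loss f(w) = (w - 3)^2, minimised at w = 3
    return 2.0 * (w - 3.0)

def sgd_momentum(w, steps=100, lr=0.1, beta=0.9):
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)  # velocity: exponentially decayed sum of gradients
        w = w - lr * v          # step along the smoothed direction
    return w

print(sgd_momentum(0.0))  # close to the minimum at w = 3
```

&lt;p>The velocity term is what lets momentum keep moving through shallow, noisy gradients where plain SGD stalls or oscillates.&lt;/p>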
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>. Cambridge University Press. (Ch 12)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Evaluation/Comparison</title><link>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</guid><description>&lt;h1 id="machine-learning-model-evaluationcomparison">
 Machine Learning Model Evaluation/Comparison
 
 &lt;a class="anchor" href="#machine-learning-model-evaluationcomparison">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="comparing-machine-learning-models">
 Comparing Machine Learning Models
 
 &lt;a class="anchor" href="#comparing-machine-learning-models">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">
 Emerging requirements, e.g. bias, fairness, and interpretability of ML models
 
 &lt;a class="anchor" href="#emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Regularisation for Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</guid><description>&lt;h1 id="regularisation-for-deep-models">
 Regularisation for Deep models
 
 &lt;a class="anchor" href="#regularisation-for-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Generalization for regression&lt;/li>
&lt;li>Training Error and Generalization Error&lt;/li>
&lt;li>Underfitting or Overfitting&lt;/li>
&lt;li>Model Selection&lt;/li>
&lt;li>Weight Decay and Norms&lt;/li>
&lt;li>Generalization in Classification&lt;/li>
&lt;li>Environment and Distribution Shift&lt;/li>
&lt;li>Generalization in Deep Learning&lt;/li>
&lt;li>Dropout&lt;/li>
&lt;li>Batch Normalization&lt;/li>
&lt;li>Layer Normalization&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
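&lt;p>Dropout, from the list above, in its usual “inverted” form: during training each unit is zeroed with probability p and the survivors are rescaled by 1/(1−p), so the expected activation is unchanged and nothing special is needed at test time. A minimal sketch (activation values are made up):&lt;/p>

```python
import random

def dropout(activations, p, rng, train=True):
    """Inverted dropout: zero each unit with probability p during training,
    rescale the rest by 1/(1-p); identity at inference time."""
    if not train or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]

rng = random.Random(42)
h = [1.0, 2.0, 3.0, 4.0]
print(dropout(h, 0.5, rng))               # some units zeroed, survivors doubled
print(dropout(h, 0.5, rng, train=False))  # unchanged at inference
```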
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>. Cambridge University Press. (&lt;a href="https://d2l.ai/chapter_introduction/index.html">T1 – Ch 3.6, 3.7; Ch 4.6, 4.7; Ch 5.5, 5.6; Ch 8.5; Ch 11.7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>AI Learning Resources</title><link>https://arshadhs.github.io/docs/ai/foundation/ai-notes/</link><pubDate>Sat, 03 Jan 2026 12:00:00 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/foundation/ai-notes/</guid><description>&lt;h1 id="ai-learning-resources">
 AI Learning Resources
 
 &lt;a class="anchor" href="#ai-learning-resources">#&lt;/a>
 
&lt;/h1>
&lt;p>A curated list of &lt;strong>high-quality online courses&lt;/strong> to learn Artificial Intelligence, Machine Learning, and Deep Learning from reputable universities and organisations.&lt;/p>
&lt;hr>
&lt;h2 id="recommended-books--references">
 Recommended Books &amp;amp; References
 
 &lt;a class="anchor" href="#recommended-books--references">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;h3 id="deep-neural-networks-dnn">
 Deep Neural Networks (DNN)
 
 &lt;a class="anchor" href="#deep-neural-networks-dnn">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>&lt;strong>Deep Learning&lt;/strong>. MIT Press.&lt;br>
Goodfellow, I., Bengio, Y., &amp;amp; Courville, A. (2016).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Introduction to Deep Learning&lt;/strong>. MIT Press.&lt;br>
Charniak, E. (2019).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>&lt;strong>Deep Learning with Python&lt;/strong>. Simon &amp;amp; Schuster.&lt;br>
Chollet, F. (2021).&lt;/p></description></item></channel></rss>