<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>ML on Arshad Siddiqui</title><link>https://arshadhs.github.io/tags/ml/</link><description>Recent content in ML on Arshad Siddiqui</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 21 Feb 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://arshadhs.github.io/tags/ml/index.xml" rel="self" type="application/rss+xml"/><item><title>Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-supervised/</guid><description>&lt;h1 id="supervised-learning">
 Supervised Learning
 
 &lt;a class="anchor" href="#supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Trained using &lt;strong>labelled data&lt;/strong>.&lt;br>
Each example in the training set includes the &lt;strong>correct output&lt;/strong>.&lt;br>
The algorithm learns to &lt;strong>generalise&lt;/strong> and make predictions on unseen data.&lt;br>
Generally more &lt;strong>accurate&lt;/strong> than unsupervised methods.&lt;br>
Requires &lt;strong>human intervention&lt;/strong> for labelling and setup.&lt;br>
Widely used because it is &lt;strong>accurate and efficient&lt;/strong> when trained on good-quality labelled data.&lt;/p>
&lt;hr>
&lt;h2 id="classification">
 Classification
 
 &lt;a class="anchor" href="#classification">#&lt;/a>
 
&lt;/h2>
&lt;p>Output is &lt;strong>discrete&lt;/strong> (e.g. Yes/No, Spam/Not Spam).&lt;br>
Used for &lt;strong>categorising data&lt;/strong> into predefined classes.&lt;br>
Support Vector Machine (SVM) is a common classifier (a linear classifier with margin-based separation).&lt;/p></description></item><item><title>Unsupervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-unsupervised/</guid><description>&lt;h1 id="unsupervised-learning">
 Unsupervised Learning
 
 &lt;a class="anchor" href="#unsupervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Works on &lt;strong>unlabelled raw data&lt;/strong>.&lt;/li>
&lt;li>The algorithm &lt;strong>discovers hidden patterns&lt;/strong> without prior knowledge of outcomes.&lt;/li>
&lt;li>Requires &lt;strong>no human intervention&lt;/strong> during training.&lt;/li>
&lt;li>Does not make direct predictions — it &lt;strong>groups or organises data&lt;/strong> instead.&lt;/li>
&lt;li>Carries a &lt;strong>higher risk&lt;/strong> because there’s no ground truth to verify results.&lt;/li>
&lt;li>Common techniques include &lt;strong>Clustering&lt;/strong>, &lt;strong>Association&lt;/strong>, and &lt;strong>Dimensionality Reduction&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
stateDiagram-v2

 %% ML maths-based colours (same palette as supervised)
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#f3f4f6,stroke:#374151

 %% Root
 USL: Unsupervised Learning

 %% Main branches
 USL --&amp;gt; CLU:::category
 CLU: Clustering

 USL --&amp;gt; DR:::category
 DR: Dimensionality Reduction

 %% Clustering algorithms
 CLU --&amp;gt; KM:::geometry
 KM: K-Means

 CLU --&amp;gt; HC:::geometry
 HC: Hierarchical Clustering

 CLU --&amp;gt; DB:::geometry
 DB: DBSCAN

 %% Probabilistic models
 USL --&amp;gt; PM:::category
 PM: Probabilistic Models

 PM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 PM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;h2 id="clustering">
 Clustering
 
 &lt;a class="anchor" href="#clustering">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Groups &lt;strong>similar data points&lt;/strong> together based on shared features.&lt;/li>
&lt;li>Commonly used for &lt;strong>market segmentation&lt;/strong>, &lt;strong>image compression&lt;/strong>, and &lt;strong>anomaly detection&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-types-of-clustering">
 Common Types of Clustering
 
 &lt;a class="anchor" href="#common-types-of-clustering">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>K-Means Clustering&lt;/strong> – Divides data into &lt;em>K&lt;/em> groups based on similarity.&lt;/li>
&lt;li>&lt;strong>Hierarchical Clustering&lt;/strong> – Builds a hierarchy (tree) of clusters.&lt;/li>
&lt;li>&lt;strong>DBSCAN (Density-Based Spatial Clustering)&lt;/strong> – Groups points close in density; identifies noise/outliers.&lt;/li>
&lt;/ul>
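&lt;p>A minimal sketch of the K-Means loop on 1-D data (the toy points and the two-cluster choice are illustrative):&lt;/p>

```python
# Minimal 1-D K-Means: assign each point to the nearest centroid,
# then recompute each centroid as the mean of its assigned points.
def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        clusters = {c: [] for c in range(len(centroids))}
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # empty clusters keep their previous centroid
        centroids = [sum(m) / len(m) if m else centroids[c]
                     for c, m in clusters.items()]
    return sorted(centroids)

# Two well-separated groups converge to their group means.
data = [1.0, 1.2, 0.8, 10.0, 10.4, 9.6]
print(kmeans_1d(data, [0.0, 5.0]))  # [1.0, 10.0]
```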
&lt;hr>
&lt;h2 id="association">
 Association
 
 &lt;a class="anchor" href="#association">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Identifies &lt;strong>relationships or correlations&lt;/strong> between variables in a dataset.&lt;/li>
&lt;li>Commonly used in &lt;strong>market basket analysis&lt;/strong> (e.g. &amp;ldquo;Customers who bought X also bought Y&amp;rdquo;).&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Apriori Algorithm&lt;/strong> – Finds frequent itemsets and generates association rules.&lt;/li>
&lt;li>&lt;strong>Eclat Algorithm&lt;/strong> – Similar to Apriori but uses set intersections for faster computation.&lt;/li>
&lt;/ul>
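&lt;p>A brute-force sketch of the support-counting idea behind Apriori (the full algorithm prunes candidate itemsets level by level; the basket data below is made up):&lt;/p>

```python
from itertools import combinations

# Toy market-basket sketch: count how often each item pair occurs
# together, keeping pairs that meet a minimum support threshold.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
]

def frequent_pairs(baskets, min_support=2):
    items = sorted(set().union(*baskets))
    result = {}
    for pair in combinations(items, 2):
        support = sum(1 for t in baskets if set(pair).issubset(t))
        if support >= min_support:
            result[pair] = support
    return result

print(frequent_pairs(baskets))
# {('bread', 'butter'): 2, ('bread', 'milk'): 2}
```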
&lt;hr>
&lt;h2 id="dimensionality-reduction">
 Dimensionality Reduction
 
 &lt;a class="anchor" href="#dimensionality-reduction">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Reduces the &lt;strong>number of input variables&lt;/strong> to simplify data.&lt;/li>
&lt;li>Helps remove noise and redundancy.&lt;/li>
&lt;li>Commonly used in &lt;strong>data pre-processing&lt;/strong> and &lt;strong>visualisation&lt;/strong>.&lt;/li>
&lt;/ul>
&lt;h3 id="common-techniques-1">
 Common Techniques
 
 &lt;a class="anchor" href="#common-techniques-1">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>&lt;strong>Principal Component Analysis (PCA)&lt;/strong> – Projects data onto fewer dimensions while keeping most variance.&lt;/li>
&lt;li>&lt;strong>Linear Discriminant Analysis (LDA)&lt;/strong> – Focuses on class separation.&lt;/li>
&lt;li>&lt;strong>t-SNE (t-Distributed Stochastic Neighbour Embedding)&lt;/strong> – Used for visualising high-dimensional data.&lt;/li>
&lt;li>&lt;strong>Autoencoders&lt;/strong> – Neural networks that compress and reconstruct data.&lt;/li>
&lt;/ul>
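&lt;p>A sketch of finding the first principal component of 2-D data via power iteration on the covariance matrix (toy data chosen so the answer is known):&lt;/p>

```python
# PCA sketch: the first principal component is the direction of maximum
# variance of the centred data, found here by power iteration on the
# 2x2 covariance matrix.
def first_component(xs, ys, iters=50):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cx = [x - mx for x in xs]
    cy = [y - my for y in ys]
    sxx = sum(a * a for a in cx) / n
    syy = sum(a * a for a in cy) / n
    sxy = sum(a * b for a, b in zip(cx, cy)) / n
    v = (1.0, 0.0)
    for _ in range(iters):
        w = (sxx * v[0] + sxy * v[1], sxy * v[0] + syy * v[1])
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = (w[0] / norm, w[1] / norm)
    return v

# Points lying on the line y = x: the component points along (1, 1).
vx, vy = first_component([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
print(round(vx, 3), round(vy, 3))  # 0.707 0.707
```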
&lt;hr>


&lt;pre class="mermaid">
mindmap
 root(Unsupervised Learning)
 Clustering
 K Means
 Hierarchical Clustering
 DBSCAN
 Dimensionality Reduction
 PCA
 t SNE
 Autoencoders
 Probabilistic Models
 Gaussian Mixture Model
 Hidden Markov Model
&lt;/pre>

&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Semi-Supervised Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</link><pubDate>Sat, 03 Jan 2026 10:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/ml-semi-supervised/</guid><description>&lt;h1 id="semi-supervised-learning">
 Semi-Supervised Learning
 
 &lt;a class="anchor" href="#semi-supervised-learning">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A combination of &lt;strong>labelled&lt;/strong> and &lt;strong>unlabelled data&lt;/strong>.&lt;/li>
&lt;li>Useful when labelling large datasets is &lt;strong>expensive or time-consuming&lt;/strong>.&lt;/li>
&lt;li>Works well with &lt;strong>high-volume datasets&lt;/strong> (e.g. millions of images).&lt;/li>
&lt;li>Only a &lt;strong>small fraction of data&lt;/strong> is labelled (e.g. a few thousand).&lt;/li>
&lt;li>The algorithm learns from both labelled examples and structure in unlabelled data.&lt;/li>
&lt;li>&lt;strong>Ideal for medical imaging&lt;/strong> where labelled data is limited.&lt;/li>
&lt;li>For example, a &lt;strong>radiologist&lt;/strong> can label a small set of medical scans,&lt;br>
and the model uses that to learn from thousands of unlabelled scans.&lt;/li>
&lt;li>Helps improve &lt;strong>accuracy and generalisation&lt;/strong> with minimal manual effort.&lt;/li>
&lt;/ul>
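&lt;p>A toy self-training sketch, assuming a 1-NN base classifier and made-up 1-D data: the model labels the unlabelled point it is most confident about, adds it to the labelled pool, and repeats:&lt;/p>

```python
# Toy self-training: a 1-NN rule labels unlabelled points one at a time,
# always starting with the point closest to the current labelled pool.
def nearest_label(x, labelled):
    return min(labelled, key=lambda pair: abs(pair[0] - x))[1]

def self_train(labelled, unlabelled):
    labelled = list(labelled)
    pool = sorted(unlabelled)
    while pool:
        # "confidence" here is simply closeness to an already-labelled point
        x = min(pool, key=lambda u: min(abs(u - lx) for lx, _ in labelled))
        labelled.append((x, nearest_label(x, labelled)))
        pool.remove(x)
    return sorted(labelled)

seed = [(0.0, "A"), (10.0, "B")]
# points near 0 end up labelled "A", points near 10 labelled "B"
print(self_train(seed, [1.0, 2.0, 9.0, 8.0]))
```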
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Neural Networks</title><link>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/010-neural-network/</guid><description>&lt;h1 id="neural-networks">
 Neural Networks
 
 &lt;a class="anchor" href="#neural-networks">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>A &lt;strong>network of artificial neurons&lt;/strong> inspired by how neurons function in the &lt;strong>human brain&lt;/strong>.&lt;/li>
&lt;li>At its core, a &lt;strong>mathematical model&lt;/strong> designed to process and learn from data.&lt;/li>
&lt;li>Neural networks form the &lt;strong>foundation of &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">Deep Learning&lt;/a>&lt;/strong> (involves training large and complex networks on vast amounts of data).&lt;/li>
&lt;/ul>
&lt;hr>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart LR
 subgraph subGraph0[&amp;#34;Input Layer&amp;#34;]
 I1((&amp;#34;Input 1&amp;#34;))
 I2((&amp;#34;Input 2&amp;#34;))
 I3((&amp;#34;Input 3&amp;#34;))
 end
 subgraph subGraph1[&amp;#34;Hidden Layer&amp;#34;]
 H1((&amp;#34;Hidden 1&amp;#34;))
 H2((&amp;#34;Hidden 2&amp;#34;))
 H3((&amp;#34;Hidden 3&amp;#34;))
 end
 subgraph subGraph2[&amp;#34;Output Layer&amp;#34;]
 O((&amp;#34;Output&amp;#34;))
 end
 I1 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I2 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 I3 --&amp;gt; H1 &amp;amp; H2 &amp;amp; H3
 H1 --&amp;gt; O
 H2 --&amp;gt; O
 H3 --&amp;gt; O

 style I1 fill:#C8E6C9
 style I2 fill:#C8E6C9
 style I3 fill:#C8E6C9
 style H1 stroke:#2962FF,fill:#BBDEFB
 style H2 fill:#BBDEFB
 style H3 fill:#BBDEFB
 style O fill:#FFCDD2
 style subGraph0 stroke:none,fill:transparent
 style subGraph1 stroke:none,fill:transparent
 style subGraph2 stroke:none,fill:transparent
&lt;/pre>
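&lt;p>The 3-3-1 network in the diagram can be sketched as a forward pass (the weights, biases and sigmoid activation below are illustrative choices):&lt;/p>

```python
import math

# Forward pass through a 3-3-1 network: each hidden unit takes a
# weighted sum of the inputs plus a bias and applies a sigmoid;
# the output unit does the same over the hidden activations.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    hidden = [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
              for w, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(wo * h for wo, h in zip(w_out, hidden)) + b_out)

x = [1.0, 0.5, -0.5]
w_hidden = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.2], [-0.5, 0.6, 0.1]]
b_hidden = [0.0, 0.1, -0.1]
w_out = [0.5, -0.4, 0.9]
b_out = 0.05
print(round(forward(x, w_hidden, b_hidden, w_out, b_out), 4))
```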

&lt;hr>
&lt;h3 id="structure-of-a-neural-network">
 Structure of a Neural Network
 
 &lt;a class="anchor" href="#structure-of-a-neural-network">#&lt;/a>
 
&lt;/h3>
&lt;p>A typical neural network has &lt;strong>three main layers&lt;/strong>:&lt;/p></description></item><item><title>Machine Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/</link><pubDate>Tue, 06 Aug 2024 23:29:52 +0100</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/</guid><description>&lt;h1 id="machine-learning">
 Machine Learning
 
 &lt;a class="anchor" href="#machine-learning">#&lt;/a>
 
&lt;/h1>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
stateDiagram-v2

 %% ===== CLASS DEFINITIONS (Math-based colours) =====
 classDef algebra fill:#cfe8ff,stroke:#1e3a8a,stroke-width:1px
 classDef probability fill:#d1fae5,stroke:#065f46,stroke-width:1px
 classDef geometry fill:#ffedd5,stroke:#9a3412,stroke-width:1px
 classDef logic fill:#ede9fe,stroke:#5b21b6,stroke-width:1px
 classDef category font-style:italic,font-weight:bold,fill:#aaaaaa,stroke:#374151,stroke-width:3px

 %% ===== ROOT =====
 ML: Machine Learning

 %% ===== SUPERVISED =====
 ML --&amp;gt; SL:::category
 SL: Supervised Learning

 SL --&amp;gt; Regression
 Regression --&amp;gt; LR:::algebra
 LR: Linear Regression

 LR --&amp;gt; NN:::algebra
 NN: Neural Network

 NN --&amp;gt; DT:::logic
 DT: Decision Tree

 SL --&amp;gt; Classification
 Classification --&amp;gt; NB:::probability
 NB: Naive Bayes

 NB --&amp;gt; KNN:::geometry
 KNN: k-Nearest Neighbours

 KNN --&amp;gt; SVM:::algebra
 SVM: Support Vector Machine
 
 %% ===== UNSUPERVISED =====
 ML --&amp;gt; USL:::category
 USL: Unsupervised Learning

 USL --&amp;gt; Clustering
 Clustering --&amp;gt; KM:::geometry
 KM: K-Means

 KM --&amp;gt; GMM:::probability
 GMM: Gaussian Mixture Model

 GMM --&amp;gt; HMM:::probability
 HMM: Hidden Markov Model

 %% ===== REINFORCEMENT =====
 ML --&amp;gt; RL:::category
 RL: Reinforcement Learning

 RL --&amp;gt; DM:::logic
 DM: Decision Making
&lt;/pre>

&lt;hr>
&lt;details >&lt;summary>Mathematical Legend&lt;/summary>
 &lt;div class="markdown-inner">
&lt;h3 id="algebra--linear-algebra-blue">
 Algebra / Linear Algebra (Blue)
 
 &lt;a class="anchor" href="#algebra--linear-algebra-blue">#&lt;/a>
 
&lt;/h3>
&lt;p>Used heavily when models rely on:&lt;/p></description></item><item><title>Artificial Neuron and Perceptron</title><link>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/020-perceptron/</guid><description>&lt;h1 id="artificial-neuron-and-perceptron">
 Artificial Neuron and Perceptron
 
 &lt;a class="anchor" href="#artificial-neuron-and-perceptron">#&lt;/a>
 
&lt;/h1>
&lt;blockquote class="book-hint info">
&lt;p>Knowledge in neural networks is stored in &lt;strong>connection weights&lt;/strong>, and learning means &lt;strong>modifying those weights&lt;/strong>.&lt;/p>
&lt;/blockquote>
&lt;hr>
&lt;h2 id="biological-neuron">
 Biological Neuron
 
 &lt;a class="anchor" href="#biological-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>A biological neuron is a specialised cell that processes and transmits information through electrical and chemical signals.&lt;/p>
&lt;p>Core components:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Dendrites&lt;/strong>: receive signals from other neurons&lt;/li>
&lt;li>&lt;strong>Cell body (soma)&lt;/strong>: processes incoming signals&lt;/li>
&lt;li>&lt;strong>Axon&lt;/strong>: transmits the output signal&lt;/li>
&lt;li>&lt;strong>Synapses&lt;/strong>: connection points between neurons&lt;/li>
&lt;/ul>
&lt;p>Biological intuition:&lt;/p>
&lt;ul>
&lt;li>many inputs arrive to one neuron&lt;/li>
&lt;li>one neuron can connect out to many neurons&lt;/li>
&lt;li>massive parallelism enables fast perception and recognition&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="artificial-neuron">
 Artificial Neuron
 
 &lt;a class="anchor" href="#artificial-neuron">#&lt;/a>
 
&lt;/h2>
&lt;p>An artificial neuron is a simplified computational model inspired by biological neurons.&lt;/p></description></item><item><title>ML Workflow</title><link>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/02-ml-workflow/</guid><description>&lt;h1 id="machine-learning-workflow">
 Machine learning Workflow
 
 &lt;a class="anchor" href="#machine-learning-workflow">#&lt;/a>
 
&lt;/h1>
&lt;p>Data is the foundation of any machine learning system.
Quality of data matters more than model complexity.&lt;/p>
&lt;h3 id="role-of-data">
 Role of Data
 
 &lt;a class="anchor" href="#role-of-data">#&lt;/a>
 
&lt;/h3>
&lt;p>Data determines:&lt;/p>
&lt;ul>
&lt;li>What patterns the model can learn&lt;/li>
&lt;li>How well it generalises&lt;/li>
&lt;li>Whether bias or noise is introduced&lt;/li>
&lt;/ul>
&lt;p>Bad data → bad model (even with perfect algorithms).&lt;/p>
&lt;hr>
&lt;h3 id="data-preprocessing-wrangling">
 Data Preprocessing, wrangling
 
 &lt;a class="anchor" href="#data-preprocessing-wrangling">#&lt;/a>
 
&lt;/h3>
&lt;p>Raw data is never ready for training.&lt;/p>
&lt;p>&lt;strong>Data Issues&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>Noise
&lt;ul>
&lt;li>For &lt;strong>objects&lt;/strong>, noise is an &lt;strong>extraneous object&lt;/strong>&lt;/li>
&lt;li>For &lt;strong>attributes&lt;/strong>, noise refers to &lt;strong>modification of original values&lt;/strong>&lt;/li>
&lt;li>Handle: apply a &lt;strong>log or z-score transform&lt;/strong> to standardise the values&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Outliers
&lt;ul>
&lt;li>Data objects with characteristics that are considerably different than most of the other data objects in the data set&lt;/li>
&lt;li>Handle: Use &lt;strong>IQR&lt;/strong> method&lt;/li>
&lt;li>Find Lower and Upper Bound and &lt;strong>replace Outlier with Lower or Upper Bound&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Missing Values
&lt;ul>
&lt;li>Eliminate data objects or variables&lt;/li>
&lt;li>Handle: Estimate missing values
&lt;ul>
&lt;li>&lt;strong>Mean, Median or Mode&lt;/strong>&lt;/li>
&lt;li>Prefer the &lt;strong>Median&lt;/strong> if the data contains &lt;strong>outliers&lt;/strong>&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Ignore the missing value during analysis&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Duplicate Data
&lt;ul>
&lt;li>Major issue when merging data from heterogeneous sources&lt;/li>
&lt;/ul>
&lt;/li>
&lt;li>Inconsistent Codes
&lt;ul>
&lt;li>Find all unique codes and map the inconsistent ones to a single consistent value&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
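&lt;p>A sketch of the IQR method on made-up data: compute the quartiles, derive the bounds, and clip values outside them (the quartile convention chosen here is one of several):&lt;/p>

```python
import statistics

# IQR outlier handling: compute Q1 and Q3, form the bounds
# Q1 - 1.5*IQR and Q3 + 1.5*IQR, and clip values to those bounds.
def clip_outliers(values):
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [min(max(v, lower), upper) for v in values]

data = [10, 12, 11, 13, 12, 95]   # 95 is an outlier
print(clip_outliers(data))  # [10, 12, 11, 13, 12, 15.0]
```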
&lt;p>&lt;strong>Data Preprocessing techniques&lt;/strong>&lt;/p></description></item><item><title>Regression(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-linear-models-regression/</guid><description>&lt;h1 id="linear-regression">
 Linear Regression
 
 &lt;a class="anchor" href="#linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Linear Regression is a supervised 
&lt;span style="color: blue;">
 ML
&lt;/span> method used to predict a &lt;strong>numerical&lt;/strong> target by fitting a model that is &lt;strong>linear in its parameters&lt;/strong>.&lt;/p>
&lt;p>In 
&lt;span style="color: blue;">
 ML
&lt;/span>, linear models are a core baseline:
they’re fast, often surprisingly strong, and usually easy to interpret.&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Linear Regression learns parameters by minimising a squared-error cost.
You can solve it directly (closed form) or iteratively (gradient descent),
and you can extend it using basis functions and regularisation.&lt;/p></description></item><item><title>Ordinary Least Squares</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-ordinary-least-squares/</guid><description>&lt;h1 id="direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">
 Direct solution method - Ordinary Least Squares and the Line of Best Fit
 
 &lt;a class="anchor" href="#direct-solution-method---ordinary-least-squares-and-the-line-of-best-fit">#&lt;/a>
 
&lt;/h1>
&lt;p>It is possible to compute the best parameters for linear regression &lt;strong>in one shot&lt;/strong> (closed-form),
instead of iteratively improving them step-by-step.&lt;/p>
&lt;p>For linear regression, the direct method is usually &lt;strong>Ordinary Least Squares (OLS)&lt;/strong>.&lt;/p>
&lt;p>Ordinary Least Squares (OLS) chooses the “best” line by &lt;strong>minimising squared prediction errors&lt;/strong>.&lt;/p>
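&lt;p>For a single feature, the closed-form OLS solution reduces to slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). A minimal sketch on made-up points:&lt;/p>

```python
# Closed-form simple OLS: no iteration, just means, covariance and variance.
def ols_fit(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Points generated from y = 2x + 1 are recovered exactly.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(ols_fit(xs, ys))  # (2.0, 1.0)
```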
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
OLS defines “best fit” as the line that minimises the total squared residual error across all data points.&lt;/p></description></item><item><title>Cost Function</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-cost-function/</guid><description>&lt;h1 id="cost-function">
 Cost Function
 
 &lt;a class="anchor" href="#cost-function">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>
&lt;p>also known as an objective function&lt;/p>
&lt;/li>
&lt;li>
&lt;p>quantifies &lt;strong>how far the predicted values are from the actual ones&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>measures the model&amp;rsquo;s error over a group of datapoints&lt;/p>
&lt;/li>
&lt;li>
&lt;p>guides optimisation: minimising the cost yields the best-fit line through the data&lt;/p>
&lt;/li>
&lt;li>
&lt;p>used to evaluate the accuracy of a model’s predictions&lt;/p></description></item><item><title>Gradient Descent</title><link>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</link><pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/03-gradient-descent-linear-regression/</guid><description>&lt;h1 id="gradient-descent-for-linear-regression">
 Gradient Descent for Linear Regression
 
 &lt;a class="anchor" href="#gradient-descent-for-linear-regression">#&lt;/a>
 
&lt;/h1>
&lt;p>Gradient descent is an iterative optimisation method used to minimise the regression cost function by repeatedly updating parameters in the direction that reduces error.&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Iterative method&lt;/strong>&lt;/li>
&lt;li>Types: batch / stochastic / mini-batch&lt;/li>
&lt;/ul>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Gradient descent starts with initial parameter values and repeatedly updates them using the gradient until the cost stops decreasing.&lt;/p>
&lt;/blockquote>
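&lt;p>A minimal batch gradient descent sketch for simple linear regression (the learning rate, epoch count and data are illustrative):&lt;/p>

```python
# Batch gradient descent for y = w*x + b: each epoch computes the
# gradient of the mean squared error over ALL points, then moves the
# parameters a small step against it.
def gradient_descent(xs, ys, lr=0.05, epochs=2000):
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        preds = [w * x + b for x in xs]
        grad_w = sum((p - y) * x for p, y, x in zip(preds, ys, xs)) * 2 / n
        grad_b = sum(p - y for p, y in zip(preds, ys)) * 2 / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]  # generated from y = 2x + 1
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))  # converges towards 2.0 and 1.0
```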


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
GD[&amp;#34;Gradient&amp;lt;br/&amp;gt;Descent&amp;#34;] --&amp;gt;|minimises| CF[&amp;#34;Cost&amp;lt;br/&amp;gt;function&amp;#34;]
GD --&amp;gt;|updates| W[&amp;#34;Parameters&amp;lt;br/&amp;gt;(weights)&amp;#34;]
GD --&amp;gt;|uses| GR[&amp;#34;Gradient&amp;lt;br/&amp;gt;(slope)&amp;#34;]

GD --&amp;gt; H[&amp;#34;Hyperparameters&amp;#34;]
H --&amp;gt; LR[&amp;#34;Learning&amp;lt;br/&amp;gt;rate&amp;#34;]
H --&amp;gt; BS[&amp;#34;Batch&amp;lt;br/&amp;gt;size&amp;#34;]
H --&amp;gt; EP[&amp;#34;Epochs&amp;#34;]

style GD fill:#90CAF9,stroke:#1E88E5,color:#000

style CF fill:#CE93D8,stroke:#8E24AA,color:#000
style W fill:#CE93D8,stroke:#8E24AA,color:#000
style GR fill:#CE93D8,stroke:#8E24AA,color:#000
style H fill:#CE93D8,stroke:#8E24AA,color:#000
style LR fill:#CE93D8,stroke:#8E24AA,color:#000
style BS fill:#CE93D8,stroke:#8E24AA,color:#000
style EP fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;hr>
&lt;h2 id="types-of-gd">
 Types of GD
 
 &lt;a class="anchor" href="#types-of-gd">#&lt;/a>
 
&lt;/h2>


&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Gradient Descent&amp;lt;br/&amp;gt;types&amp;#34;] --&amp;gt; BGD[&amp;#34;Batch&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; SGD[&amp;#34;Stochastic&amp;lt;br/&amp;gt;GD&amp;#34;]
T --&amp;gt; MGD[&amp;#34;Mini-batch&amp;lt;br/&amp;gt;GD&amp;#34;]

BGD --&amp;gt; ALL[&amp;#34;All data&amp;lt;br/&amp;gt;per step&amp;#34;]
BGD --&amp;gt; STB[&amp;#34;Smooth&amp;lt;br/&amp;gt;updates&amp;#34;]

SGD --&amp;gt; ONE[&amp;#34;1 sample&amp;lt;br/&amp;gt;per step&amp;#34;]
SGD --&amp;gt; FAST[&amp;#34;Quick&amp;lt;br/&amp;gt;progress&amp;#34;]
SGD --&amp;gt; NOISE[&amp;#34;Noisy&amp;lt;br/&amp;gt;updates&amp;#34;]

MGD --&amp;gt; MB[&amp;#34;Small batch&amp;lt;br/&amp;gt;per step&amp;#34;]
MGD --&amp;gt; PRACT[&amp;#34;Practical&amp;lt;br/&amp;gt;default&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style BGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style SGD fill:#C8E6C9,stroke:#2E7D32,color:#000
style MGD fill:#C8E6C9,stroke:#2E7D32,color:#000

style ALL fill:#CE93D8,stroke:#8E24AA,color:#000
style STB fill:#CE93D8,stroke:#8E24AA,color:#000
style ONE fill:#CE93D8,stroke:#8E24AA,color:#000
style FAST fill:#CE93D8,stroke:#8E24AA,color:#000
style NOISE fill:#CE93D8,stroke:#8E24AA,color:#000
style MB fill:#CE93D8,stroke:#8E24AA,color:#000
style PRACT fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h3 id="batch">
 Batch
 
 &lt;a class="anchor" href="#batch">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>Use only if you have huge compute and a lot of time to train&lt;/li>
&lt;/ul>
&lt;h3 id="sgd">
 SGD
 
 &lt;a class="anchor" href="#sgd">#&lt;/a>
 
&lt;/h3>
&lt;ul>
&lt;li>
&lt;p>go-to solution&lt;/p></description></item><item><title>Classification(Linear Models)</title><link>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/04-linear-models-classification/</guid><description>&lt;h1 id="linear-models-for-classification">
 Linear models for Classification
 
 &lt;a class="anchor" href="#linear-models-for-classification">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>categorises data by finding a linear boundary (hyperplane) that separates classes&lt;/li>
&lt;li>calculating a weighted sum of input features plus bias&lt;/li>
&lt;/ul>


&lt;script src="https://arshadhs.github.io/mermaid.min.js">&lt;/script>

 &lt;script>mermaid.initialize({
 "flowchart": {
 "useMaxWidth":true
 },
 "theme": "default"
}
)&lt;/script>




&lt;pre class="mermaid">
flowchart TD
T[&amp;#34;Linear&amp;lt;br/&amp;gt;classification&amp;lt;br/&amp;gt;models&amp;#34;] --&amp;gt; P[&amp;#34;Perceptron&amp;#34;]
T --&amp;gt; LR[&amp;#34;Logistic&amp;lt;br/&amp;gt;regression&amp;#34;]
T --&amp;gt; SVM[&amp;#34;Linear&amp;lt;br/&amp;gt;SVM&amp;#34;]

P --&amp;gt;|uses| STEP[&amp;#34;Step&amp;lt;br/&amp;gt;activation&amp;#34;]
LR --&amp;gt;|uses| SIG[&amp;#34;Sigmoid&amp;lt;br/&amp;gt;+ log loss&amp;#34;]
SVM --&amp;gt;|uses| HNG[&amp;#34;Hinge&amp;lt;br/&amp;gt;loss&amp;#34;]

style T fill:#90CAF9,stroke:#1E88E5,color:#000

style P fill:#C8E6C9,stroke:#2E7D32,color:#000
style LR fill:#C8E6C9,stroke:#2E7D32,color:#000
style SVM fill:#C8E6C9,stroke:#2E7D32,color:#000

style STEP fill:#CE93D8,stroke:#8E24AA,color:#000
style SIG fill:#CE93D8,stroke:#8E24AA,color:#000
style HNG fill:#CE93D8,stroke:#8E24AA,color:#000
&lt;/pre>

&lt;h2 id="discriminant-functions">
 Discriminant Functions
 
 &lt;a class="anchor" href="#discriminant-functions">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="decision-theory">
 Decision Theory
 
 &lt;a class="anchor" href="#decision-theory">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-discriminative-classifiers">
 Probabilistic Discriminative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-discriminative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;h2 id="logistic-regression">
 Logistic Regression
 
 &lt;a class="anchor" href="#logistic-regression">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Supervised machine learning algorithm&lt;/li>
&lt;li>Binary &lt;strong>classification&lt;/strong> algorithm&lt;/li>
&lt;li>learns a &lt;strong>linear decision boundary&lt;/strong>, so it works best when classes are roughly linearly separable&lt;/li>
&lt;li>predicts the probability that an input belongs to a specific class&lt;/li>
&lt;li>uses &lt;strong>Sigmoid function&lt;/strong> to convert inputs into a probability value between 0 and 1&lt;/li>
&lt;/ul>
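&lt;p>A prediction sketch: the sigmoid of the linear score z = w·x + b gives the class probability (the weights below are illustrative, not learned):&lt;/p>

```python
import math

# Logistic regression prediction: squash the linear score through a
# sigmoid to get a probability between 0 and 1.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

w, b = [1.5, -2.0], 0.5
print(round(predict_proba([2.0, 0.5], w, b), 3))  # 0.924
```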
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
Logistic regression predicts $P(y=1\mid x)$ using a sigmoid of a linear score $z=w\cdot x+b$,
then learns $w,b$ by maximising likelihood (equivalently minimising log-loss).&lt;/p></description></item><item><title>Foundation Models</title><link>https://arshadhs.github.io/docs/ai/genai/foundation-model/</link><pubDate>Sun, 14 Dec 2025 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/foundation-model/</guid><description>&lt;h1 id="foundation-model">
 Foundation Model
 
 &lt;a class="anchor" href="#foundation-model">#&lt;/a>
 
&lt;/h1>
&lt;p>AI models trained on massive datasets to perform a wide range of tasks with minimal fine-tuning.&lt;/p>
&lt;ul>
&lt;li>
&lt;p>are large deep learning neural networks&lt;/p>
&lt;/li>
&lt;li>
&lt;p>are large AI models trained on &lt;strong>massive and diverse datasets&lt;/strong> (text, images, audio, or multiple modalities).&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Contain &lt;strong>millions or billions of parameters&lt;/strong>.&lt;/p>
&lt;/li>
&lt;li>
&lt;p>designed for &lt;strong>general-purpose intelligence&lt;/strong> across a &lt;strong>broad range of tasks&lt;/strong>, not a single task&lt;/p>
&lt;/li>
&lt;li>
&lt;p>acts as &lt;strong>base models&lt;/strong> for building specialised AI applications&lt;/p></description></item><item><title>LLM - Model</title><link>https://arshadhs.github.io/docs/ai/genai/llm/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/genai/llm/</guid><description>&lt;h1 id="llm--large-language-model">
 LLM – Large Language Model
 
 &lt;a class="anchor" href="#llm--large-language-model">#&lt;/a>
 
&lt;/h1>
&lt;p>Large Language Models (LLMs) are &lt;strong>advanced AI systems&lt;/strong> designed to process, understand, and generate &lt;strong>human-like text&lt;/strong>.&lt;/p>
&lt;p>They learn language by analysing &lt;strong>massive amounts of text data&lt;/strong>, discovering patterns in:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>grammar&lt;/p>
&lt;/li>
&lt;li>
&lt;p>meaning&lt;/p>
&lt;/li>
&lt;li>
&lt;p>context&lt;/p>
&lt;/li>
&lt;li>
&lt;p>relationships between words and sentences&lt;/p>
&lt;/li>
&lt;/ul>
&lt;p>Key characteristics:&lt;/p>
&lt;ul>
&lt;li>
&lt;p>Built on &lt;strong>Deep Learning&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Implemented using &lt;strong>Neural Networks&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Based on &lt;strong>Transformers&lt;/strong>&lt;/p>
&lt;/li>
&lt;li>
&lt;p>Often combined with tools like:&lt;/p>
&lt;ul>
&lt;li>Retrieval (RAG)&lt;/li>
&lt;li>Agents&lt;/li>
&lt;li>External APIs&lt;/li>
&lt;li>Memory systems&lt;/li>
&lt;/ul>
&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="what-makes-an-llm-special">
 What makes an LLM special?
 
 &lt;a class="anchor" href="#what-makes-an-llm-special">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>Built using &lt;strong>deep neural networks&lt;/strong>&lt;/li>
&lt;li>Trained on &lt;strong>very large datasets&lt;/strong> (books, articles, code, web text)&lt;/li>
&lt;li>Can perform many tasks &lt;strong>without task-specific training&lt;/strong>&lt;/li>
&lt;li>General-purpose language understanding, not single-task models&lt;/li>
&lt;/ul>
&lt;hr>
&lt;h2 id="foundation-transformer-architecture">
 Foundation: Transformer Architecture
 
 &lt;a class="anchor" href="#foundation-transformer-architecture">#&lt;/a>
 
&lt;/h2>
&lt;p>LLMs are based on the &lt;strong>&lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/transformer/">Transformer Architecture&lt;/a>&lt;/strong>, which allows models to understand &lt;strong>context and long-range dependencies&lt;/strong> in text.&lt;/p></description></item><item><title>Decision Tree</title><link>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/05-decision-tree/</guid><description>&lt;h1 id="decision-tree">
 Decision Tree
 
 &lt;a class="anchor" href="#decision-tree">#&lt;/a>
 
&lt;/h1>
&lt;p>A decision tree classifies an example by asking a sequence of questions about its attributes until it reaches a leaf (final decision).&lt;/p>
&lt;blockquote class="book-hint info">
&lt;p>Key takeaway:
A decision tree grows by repeatedly splitting the training data into &lt;strong>purer&lt;/strong> subsets using an impurity measure
(Entropy / Gini / Classification Error).&lt;/p>
&lt;/blockquote>
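&lt;p>The impurity measures named in the hint above can be computed in a few lines. A minimal plain-Python sketch (illustrative only; the function names are mine, not from the post):&lt;/p>

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of the class labels at a node."""
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    return -sum(p * math.log2(p) for p in probs)

def gini(labels):
    """Gini impurity: chance of mislabelling a randomly drawn example."""
    n = len(labels)
    probs = [count / n for count in Counter(labels).values()]
    return 1 - sum(p * p for p in probs)

# A pure node has zero impurity; a 50/50 split is maximally impure.
print(entropy(["yes"] * 4), gini(["yes"] * 4))
print(entropy(["yes", "no", "yes", "no"]), gini(["yes", "no", "yes", "no"]))
```

&lt;p>A split is chosen to maximise the impurity reduction (information gain) between the parent node and the weighted child nodes.&lt;/p>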
&lt;hr>
&lt;h2 id="information-theory">
 Information Theory
 
 &lt;a class="anchor" href="#information-theory">#&lt;/a>
 
&lt;/h2>
&lt;p>Decision trees need a way to measure:
“How mixed are the class labels at a node?”&lt;/p></description></item><item><title>Instance-based Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/06-instance-based-learning/</guid><description>&lt;h1 id="instance-based-learning">
 Instance-based Learning
 
 &lt;a class="anchor" href="#instance-based-learning">#&lt;/a>
 
&lt;/h1>
&lt;p>Instance-based learning is a family of methods that &lt;strong>do not build one explicit global model during training&lt;/strong>. Instead, they &lt;strong>store training examples&lt;/strong> and delay most of the work until a new query arrives.&lt;/p>
&lt;p>When a new point must be classified or predicted, the algorithm compares it with previously seen examples, finds the most relevant neighbours, and uses them to produce the answer.&lt;/p>
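&lt;p>The query-time procedure described above can be sketched as a tiny k-nearest-neighbour classifier (a plain-Python illustration, not code from the post):&lt;/p>

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours.

    `train` is a list of (point, label) pairs. Nothing is learned up front;
    all the work happens when the query arrives, as in instance-based learning.
    """
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((6, 5), "B")]
print(knn_predict(train, (0.5, 0.5)))  # the three nearest stored points are all "A"
```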
&lt;p>Instance-based Learning covers three linked ideas:&lt;/p></description></item><item><title>Support Vector Machine</title><link>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/07-support-vector-machines/</guid><description>&lt;h1 id="support-vector-machine-svm">
 Support Vector Machine (SVM)
 
 &lt;a class="anchor" href="#support-vector-machine-svm">#&lt;/a>
 
&lt;/h1>
&lt;p>A &lt;strong>Support Vector Machine (SVM)&lt;/strong> is a &lt;strong>supervised machine learning algorithm&lt;/strong> used for:&lt;/p>
&lt;ul>
&lt;li>&lt;strong>Classification&lt;/strong> (most common)&lt;/li>
&lt;li>&lt;strong>Regression&lt;/strong> (SVR – Support Vector Regression)&lt;/li>
&lt;/ul>
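&lt;p>A minimal soft-margin linear SVM can be trained by sub-gradient descent on the hinge loss. The sketch below is a plain-Python illustration of the idea, not a production implementation (libraries such as scikit-learn provide tuned solvers):&lt;/p>

```python
def train_linear_svm(data, lam=0.01, lr=0.05, epochs=500):
    """Soft-margin linear SVM in 2-D, trained by sub-gradient descent on
    the hinge loss max(0, 1 - y * (w . x + b)) plus an L2 penalty on w.
    `data` is a list of (x, y) pairs with x a 2-tuple and y in {-1, +1}."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin >= 1:
                # outside the margin: only the weight-decay term applies
                w = [wi * (1 - lr * lam) for wi in w]
            else:
                # inside the margin or misclassified: hinge sub-gradient step
                w = [wi - lr * (lam * wi - y * xi) for wi, xi in zip(w, x)]
                b = b + lr * y
    return w, b

def svm_predict(w, b, x):
    return 1 if w[0] * x[0] + w[1] * x[1] + b >= 0 else -1

# Two linearly separable clusters
data = [((0, 0), -1), ((1, 0), -1), ((0, 1), -1),
        ((4, 4), 1), ((5, 4), 1), ((4, 5), 1)]
w, b = train_linear_svm(data)
print(svm_predict(w, b, (0, 0)), svm_predict(w, b, (5, 5)))
```

&lt;p>The training points whose margins end up closest to 1 are the support vectors; only they determine the final boundary.&lt;/p>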

&lt;blockquote class="book-hint">
 &lt;p>Find the decision boundary that separates classes with the &lt;strong>maximum margin&lt;/strong>.&lt;/p>
&lt;/blockquote>&lt;blockquote class="book-hint default">
&lt;p>A Support Vector Machine is a supervised learning algorithm that finds an optimal hyperplane by maximising the margin between classes, using support vectors and kernel functions to handle non-linear data.&lt;/p></description></item><item><title>Attention Mechanism</title><link>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/080-attention-mechanism/</guid><description>&lt;h1 id="attention-mechanism">
 Attention Mechanism
 
 &lt;a class="anchor" href="#attention-mechanism">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Queries, Keys, and Values&lt;/li>
&lt;li>Attention Pooling by Similarity&lt;/li>
&lt;li>Attention Pooling via Nadaraya–Watson Regression&lt;/li>
&lt;li>Attention Scoring Functions&lt;/li>
&lt;li>Dot Product Attention&lt;/li>
&lt;li>Convenience Functions&lt;/li>
&lt;li>Scaled Dot Product Attention&lt;/li>
&lt;li>Additive Attention&lt;/li>
&lt;li>Bahdanau Attention Mechanism&lt;/li>
&lt;li>Multi-Head Attention&lt;/li>
&lt;li>Self-Attention&lt;/li>
&lt;li>Positional Encoding&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
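&lt;p>Several of the topics above combine in scaled dot-product attention, which can be sketched in plain Python (illustrative only, not the webinar code):&lt;/p>

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d)) V for lists-of-lists Q, K, V."""
    d = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)  # one attention weight per key, summing to 1
        out.append([sum(wt * v[j] for wt, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]                  # one query
K = [[1.0, 0.0], [0.0, 1.0]]      # two keys
V = [[10.0, 0.0], [0.0, 10.0]]    # two values
# The query matches the first key more strongly, so the output leans toward V[0]
print(scaled_dot_product_attention(Q, K, V))
```

&lt;p>Multi-head attention runs several such attention functions in parallel on learned projections of Q, K, and V, then concatenates the results.&lt;/p>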
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>. Cambridge University Press. (&lt;a href="https://d2l.ai/chapter_builders-guide/model-construction.html">Ch 10&lt;/a>, &lt;a href="https://d2l.ai/chapter_convolutional-neural-networks/index.html">Ch 7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Bayesian Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/08-bayesian-learning/</guid><description>&lt;h1 id="bayesian-learning">
 Bayesian Learning
 
 &lt;a class="anchor" href="#bayesian-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="mle-hypothesis">
 MLE Hypothesis
 
 &lt;a class="anchor" href="#mle-hypothesis">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="map-hypothesis">
 MAP Hypothesis
 
 &lt;a class="anchor" href="#map-hypothesis">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bayes-rule">
 Bayes Rule
 
 &lt;a class="anchor" href="#bayes-rule">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="optimal-bayes-classifier">
 Optimal Bayes Classifier
 
 &lt;a class="anchor" href="#optimal-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="naïve-bayes-classifier">
 Naïve Bayes Classifier
 
 &lt;a class="anchor" href="#na%c3%afve-bayes-classifier">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="probabilistic-generative-classifiers">
 Probabilistic Generative Classifiers
 
 &lt;a class="anchor" href="#probabilistic-generative-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bayesian-linear-regression">
 Bayesian Linear Regression
 
 &lt;a class="anchor" href="#bayesian-linear-regression">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Ensemble Learning</title><link>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/09-ensemble-learning/</guid><description>&lt;h1 id="ensemble-learning">
 Ensemble Learning
 
 &lt;a class="anchor" href="#ensemble-learning">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="combining-classifiers">
 Combining Classifiers
 
 &lt;a class="anchor" href="#combining-classifiers">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="bagging">
 Bagging
 
 &lt;a class="anchor" href="#bagging">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="random-forest">
 Random Forest
 
 &lt;a class="anchor" href="#random-forest">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="boosting">
 Boosting
 
 &lt;a class="anchor" href="#boosting">#&lt;/a>
 
&lt;/h2>
&lt;h3 id="adaboost">
 AdaBoost
 
 &lt;a class="anchor" href="#adaboost">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="gradient-boosting">
 Gradient Boosting
 
 &lt;a class="anchor" href="#gradient-boosting">#&lt;/a>
 
&lt;/h3>
&lt;h3 id="xgboost">
 XGBoost
 
 &lt;a class="anchor" href="#xgboost">#&lt;/a>
 
&lt;/h3>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Optimisation of Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/100-optimise-deep-models/</guid><description>&lt;h1 id="optimisation-of-deep-models">
 Optimisation of Deep models
 
 &lt;a class="anchor" href="#optimisation-of-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Goal of Optimization&lt;/li>
&lt;li>Optimization Challenges in Deep Learning&lt;/li>
&lt;li>Gradient Descent&lt;/li>
&lt;li>Stochastic Gradient Descent&lt;/li>
&lt;li>Minibatch Stochastic Gradient Descent&lt;/li>
&lt;li>Momentum&lt;/li>
&lt;li>Adagrad&lt;/li>
&lt;li>RMSProp&lt;/li>
&lt;li>Adadelta&lt;/li>
&lt;li>Adam&lt;/li>
&lt;li>Code implementation and comparison of algorithms (webinar)&lt;/li>
&lt;/ul>
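&lt;p>The first few topics above can be sketched in a few lines. A minimal gradient-descent-with-momentum loop in plain Python (illustrative only, not the webinar code):&lt;/p>

```python
def gd_momentum(grad, x0, lr=0.1, beta=0.9, steps=300):
    """Gradient descent with momentum: the velocity accumulates an
    exponentially weighted sum of past gradients, damping oscillation."""
    x, v = x0, 0.0
    for _ in range(steps):
        v = beta * v + grad(x)   # update velocity
        x = x - lr * v           # step along the smoothed direction
    return x

# Minimise f(x) = (x - 3)^2, whose gradient is 2 * (x - 3)
x_star = gd_momentum(lambda x: 2 * (x - 3), x0=0.0)
print(x_star)  # approaches the minimiser x = 3
```

&lt;p>Setting beta to 0 recovers plain gradient descent; stochastic and minibatch variants replace the exact gradient with a noisy estimate computed on a sampled batch.&lt;/p>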
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>. Cambridge University Press. (Ch 12)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item><item><title>Evaluation/Comparison</title><link>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/machine-learning/11-ml-model-evaluation-comparison/</guid><description>&lt;h1 id="machine-learning-model-evaluationcomparison">
 Machine Learning Model Evaluation/Comparison
 
 &lt;a class="anchor" href="#machine-learning-model-evaluationcomparison">#&lt;/a>
 
&lt;/h1>
&lt;h2 id="comparing-machine-learning-models">
 Comparing Machine Learning Models
 
 &lt;a class="anchor" href="#comparing-machine-learning-models">#&lt;/a>
 
&lt;/h2>
&lt;h2 id="emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">
 Emerging requirements, e.g. bias, fairness, and interpretability of ML models
 
 &lt;a class="anchor" href="#emerging-requirements-eg-bias-fairness-interpretability-of-ml-models">#&lt;/a>
 
&lt;/h2>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/machine-learning/">
 Machine Learning
&lt;/a>&lt;/p></description></item><item><title>Regularisation for Deep models</title><link>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://arshadhs.github.io/docs/ai/deep-learning/110-regularisation-deep-models/</guid><description>&lt;h1 id="regularisation-for-deep-models">
 Regularisation for Deep models
 
 &lt;a class="anchor" href="#regularisation-for-deep-models">#&lt;/a>
 
&lt;/h1>
&lt;ul>
&lt;li>Generalization for Regression&lt;/li>
&lt;li>Training Error and Generalization Error&lt;/li>
&lt;li>Underfitting or Overfitting&lt;/li>
&lt;li>Model Selection&lt;/li>
&lt;li>Weight Decay and Norms&lt;/li>
&lt;li>Generalization in Classification&lt;/li>
&lt;li>Environment and Distribution Shift&lt;/li>
&lt;li>Generalization in Deep Learning&lt;/li>
&lt;li>Dropout&lt;/li>
&lt;li>Batch Normalization&lt;/li>
&lt;li>Layer Normalization&lt;/li>
&lt;li>Code implementation (webinar)&lt;/li>
&lt;/ul>
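&lt;p>Of the topics above, dropout is simple to sketch. A minimal inverted-dropout layer in plain Python (illustrative only, not the webinar implementation):&lt;/p>

```python
import random

def dropout_layer(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: during training, zero each unit with probability p
    and scale the survivors by 1 / (1 - p), so the expected activation is
    unchanged. At inference time the layer is the identity."""
    if not training or p == 0.0:
        return list(activations)
    rng = rng or random.Random(0)  # fixed seed here only for reproducibility
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]

h = [1.0, 2.0, 3.0, 4.0]
print(dropout_layer(h, p=0.5))                  # surviving units are doubled
print(dropout_layer(h, p=0.5, training=False))  # identity at inference
```

&lt;p>Because the scaling happens at training time, no rescaling is needed at inference, which is why frameworks implement this "inverted" form.&lt;/p>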
&lt;hr>
&lt;h2 id="reference">
 Reference
 
 &lt;a class="anchor" href="#reference">#&lt;/a>
 
&lt;/h2>
&lt;ul>
&lt;li>&lt;strong>Dive into Deep Learning&lt;/strong>. Cambridge University Press. (&lt;a href="https://d2l.ai/chapter_introduction/index.html">T1 Ch 3.6–3.7, 4.6–4.7, 5.5–5.6, 8.5, 11.7&lt;/a>)&lt;/li>
&lt;/ul>
&lt;hr>
&lt;p>&lt;a href="https://arshadhs.github.io/">Home&lt;/a> | &lt;a href="https://arshadhs.github.io/docs/ai/deep-learning/">
 Deep Learning
&lt;/a>&lt;/p></description></item></channel></rss>