SVM

Primal and Dual Perspective for Linear SVM

A linear Support Vector Machine finds a hyperplane that separates two classes with the maximum possible margin.

The primal view gives the direct geometric optimisation problem. The dual view rewrites the problem using Lagrange multipliers and reveals why only support vectors matter.

Key takeaway: Linear SVM maximises the margin by minimising \( \frac{1}{2}\|w\|^2 \) subject to correct-classification constraints. The dual solution expresses \( w \) …
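
The link between the two views can be checked numerically on a toy problem. The sketch below is illustrative only: the four data points, the bias `b` and the multipliers `alpha` are assumptions worked out by hand for this dataset, not the output of a solver. It rebuilds the weight vector from the multipliers using the standard relation \( w = \sum_i \alpha_i y_i x_i \), then confirms that every constraint \( y_i (w \cdot x_i + b) \ge 1 \) holds and that the primal and dual objectives agree.

```python
import math

# Toy data (assumed for illustration): two points on the margin
# plus two points that lie farther from the boundary.
X = [(1.0, 1.0), (-1.0, -1.0), (2.0, 3.0), (-3.0, -1.0)]
y = [1, -1, 1, -1]

# Lagrange multipliers and bias derived by hand for this dataset.
alpha = [0.25, 0.25, 0.0, 0.0]
b = 0.0


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


# Recover the weight vector from the dual variables: w = sum_i alpha_i y_i x_i.
w = tuple(
    sum(a * yi * xi[d] for a, yi, xi in zip(alpha, y, X))
    for d in range(2)
)

# Functional margins y_i (w . x_i + b); each must be at least 1.
margins = [yi * (dot(w, xi) + b) for xi, yi in zip(X, y)]

primal_objective = 0.5 * dot(w, w)
dual_objective = sum(alpha) - 0.5 * dot(w, w)

# Only points with a non-zero multiplier contribute to w.
support_indices = [i for i, a in enumerate(alpha) if a > 0]
margin_width = 2.0 / math.sqrt(dot(w, w))

print(w, margins, support_indices)
print(primal_objective, dual_objective, margin_width)
```

The two points with zero multipliers could be moved or deleted (as long as they stay outside the margin) without changing `w`, which is the sense in which only the support vectors matter.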

Support Vector Machine

Support Vector Machine (SVM) is a supervised machine learning algorithm used for:

  • Classification (most common)
  • Regression (SVR – Support Vector Regression)

It connects many earlier ideas:

  • classification and decision boundaries
  • linear classifiers
  • margins
  • optimisation
  • constrained optimisation
  • kernels for non-linear data

SVM is a discriminative classifier.

That means it does not try to model how each class is generated.

Instead, it tries to find the best separating boundary between classes.
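
In practice this means a trained linear SVM needs only the boundary parameters to make a prediction. The sketch below assumes a hypothetical weight vector `w` and bias `b` chosen by hand; it is not a training procedure, only the decision rule \( \operatorname{sign}(w \cdot x + b) \).

```python
def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Hypothetical boundary parameters, chosen by hand for illustration.
w, b = (0.5, 0.5), 0.0

labels = [predict(w, b, x) for x in [(2.0, 1.0), (-1.0, -3.0)]]
print(labels)  # [1, -1]
```

No class-conditional distribution appears anywhere in the rule; the classifier is defined entirely by the boundary.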

Mathematical Preliminaries for SVM

Support Vector Machines use optimisation, geometry and kernels. Before deriving SVM, we need constrained optimisation, Lagrange multipliers, primal and dual problems, KKT conditions, hyperplanes and kernel functions.

Key takeaway: SVM is built on constrained optimisation. The hard-margin SVM primal problem is a quadratic optimisation problem with linear inequality constraints. The dual problem uses Lagrange multipliers and leads naturally to support vectors and kernels.
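
For reference, the standard hard-margin formulation behind this takeaway can be written out as follows. The notation is an assumption, since the text above does not fix it: training pairs \( (x_i, y_i) \) with labels \( y_i \in \{-1, +1\} \), weight vector \( w \), bias \( b \) and Lagrange multipliers \( \alpha_i \).

\[
\begin{aligned}
\text{Primal:}\quad & \min_{w,\,b}\ \tfrac{1}{2}\|w\|^2
  \quad \text{subject to}\quad y_i\,(w^\top x_i + b) \ge 1,\quad i = 1,\dots,n \\[4pt]
\text{Lagrangian:}\quad & L(w, b, \alpha) = \tfrac{1}{2}\|w\|^2
  - \sum_{i=1}^{n} \alpha_i \bigl[\, y_i\,(w^\top x_i + b) - 1 \,\bigr],
  \quad \alpha_i \ge 0 \\[4pt]
\text{Stationarity:}\quad & w = \sum_{i=1}^{n} \alpha_i y_i x_i,
  \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 \\[4pt]
\text{Dual:}\quad & \max_{\alpha}\ \sum_{i=1}^{n} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^\top x_j
  \quad \text{subject to}\quad \alpha_i \ge 0,\ \ \sum_{i=1}^{n} \alpha_i y_i = 0 \\[4pt]
\text{Complementary slackness:}\quad & \alpha_i \bigl[\, y_i\,(w^\top x_i + b) - 1 \,\bigr] = 0
\end{aligned}
\]

Complementary slackness forces \( \alpha_i = 0 \) for every point strictly outside the margin, so only the support vectors contribute to \( w \). The data enter the dual only through the inner products \( x_i^\top x_j \), which is where a kernel function can be substituted.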

Nonlinear SVM

A linear SVM works well when the data can be separated by a straight line or hyperplane. When the data is not linearly separable in the original input space, nonlinear SVM maps the data to a higher-dimensional feature space where a linear separator may exist.
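
A small example makes the idea concrete. The sketch below uses an assumed 1-D dataset and a hypothetical feature map `phi(x) = (x, x^2)`; neither comes from the text above. No threshold on the line separates the classes, but after mapping to two dimensions a hand-picked hyperplane does.

```python
def phi(x):
    """Hypothetical feature map from 1-D input to a 2-D feature space."""
    return (x, x * x)


# Assumed toy data: the inner points are positive, the outer points negative.
X = [-2.0, -1.0, 1.0, 2.0]
y = [-1, 1, 1, -1]


def separates(w, b, points, labels):
    """True if label * (w . p + b) is strictly positive for every point."""
    return all(
        label * (sum(wi * pi for wi, pi in zip(w, p)) + b) > 0
        for p, label in zip(points, labels)
    )


# In 1-D a linear classifier is a threshold t plus an orientation s:
# f(x) = s * (x - t). One threshold per gap between sorted points,
# plus one on each side, covers every distinct labelling.
thresholds = [-3.0, -1.5, 0.0, 1.5, 3.0]
separable_1d = any(
    separates((s,), -s * t, [(x,) for x in X], y)
    for t in thresholds
    for s in (1.0, -1.0)
)

# In the mapped space the hyperplane -x^2 + 2.5 = 0 splits the classes.
mapped = [phi(x) for x in X]
separable_2d = separates((0.0, -1.0), 2.5, mapped, y)

print(separable_1d, separable_2d)  # False True
```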

Key takeaway: Nonlinear SVM uses the kernel trick. Instead of explicitly mapping \( x \) to \( \phi(x) \) …
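
As a minimal illustration of the kernel trick, the sketch below assumes a degree-2 homogeneous polynomial kernel \( k(x, z) = (x \cdot z)^2 \) on 2-D inputs; this kernel and its feature map are chosen for the example and are not named in the text above. It checks that evaluating the kernel in the input space gives the same number as an inner product of explicitly mapped vectors.

```python
import math


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


def phi(x):
    """Explicit feature map matching the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """k(x, z) = (x . z)^2, computed without building phi(x) or phi(z)."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)

explicit = dot(phi(x), phi(z))   # inner product in the 3-D feature space
implicit = poly_kernel(x, z)     # same value from the 2-D inputs alone

print(implicit, math.isclose(explicit, implicit))
```

Here the feature space has only three dimensions, so the saving is small; the same identity is what allows kernels whose feature spaces are very large or infinite-dimensional to be used at the cost of one kernel evaluation per pair of points.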