Instance-based Learning

k-Nearest Neighbor Learning #

Distance / similarity choices #

Cosine similarity vs Euclidean distance #

Cosine similarity compares direction (angle), not magnitude:

\[ \cos(\theta)=\frac{x\cdot y}{\|x\|\|y\|} \]

When to use cosine:

- text/document vectors (high-dimensional, sparse)
- when vector length should not dominate similarity
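
A minimal sketch of the cosine formula above in plain Python, assuming dense vectors stored as lists. The function name `cosine_similarity` and the term-count vectors are illustrative assumptions, not taken from any particular library.

```python
import math


def cosine_similarity(x, y):
    """Cosine of the angle between x and y: (x . y) / (||x|| ||y||)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Hypothetical term-count vectors: doc_long is doc_short scaled by 3,
# as if the same text had been repeated three times.
doc_short = [1, 2, 0, 1]
doc_long = [3, 6, 0, 3]
doc_other = [0, 0, 5, 0]  # shares no terms with doc_short

same_direction = cosine_similarity(doc_short, doc_long)  # approximately 1.0
no_overlap = cosine_similarity(doc_short, doc_other)     # 0.0
```

Scaling a vector leaves its cosine similarity unchanged, which is why document length does not dominate the comparison.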

Euclidean distance measures straight-line distance:

\[ d(x,y)=\|x-y\|=\sqrt{\sum_{i}(x_i-y_i)^2} \]

When to use Euclidean:

- continuous features on comparable scales (often after normalisation, so that no single large-range feature dominates the distance)
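
The effect of scale can be sketched as follows, assuming min-max normalisation as one possible normalisation scheme. The helper names (`euclidean_distance`, `min_max_scale`, `nearest_index`) and the (age, income) rows are hypothetical, chosen only to show how the nearest neighbour can change once features are put on a comparable scale.

```python
import math


def euclidean_distance(x, y):
    """Straight-line distance ||x - y|| between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def min_max_scale(rows):
    """Rescale every column to [0, 1] using that column's min and max."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def nearest_index(query, candidates):
    """Index of the candidate with the smallest Euclidean distance to query."""
    return min(range(len(candidates)),
               key=lambda i: euclidean_distance(query, candidates[i]))


# Hypothetical (age in years, income in dollars) rows.
rows = [
    [25, 50_000],  # query point
    [65, 50_500],  # very different age, similar income
    [27, 52_000],  # similar age, slightly different income
    [45, 90_000],  # widens the income range
]

# Raw features: income differences swamp age differences.
raw_nearest = nearest_index(rows[0], rows[1:3])

# Scaled features: both columns contribute on the same [0, 1] scale.
scaled = min_max_scale(rows)
scaled_nearest = nearest_index(scaled[0], scaled[1:3])
```

On the raw data the 65-year-old is the nearest neighbour of the query (index 0), because a 500-dollar income gap is numerically small next to a 2,000-dollar one, while the 40-year age gap barely registers. After scaling, the 27-year-old becomes the nearest neighbour (index 1).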

Locally Weighted Regression (LWR) Learning #

Radial Basis Functions #
