Differentiation of Univariate Functions
Differentiation measures the rate of change of a function. For a function f(x), the derivative is defined as
\[
f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
\]
Interpretation:
- Slope of the tangent line to f at x
- Instantaneous rate of change of f at x
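To make the limit concrete, here is a minimal NumPy sketch (the choice of f(x) = x² and the step sizes are assumptions made for the example) comparing difference quotients against the analytic derivative:

```python
import numpy as np

def f(x):
    return x ** 2                       # example function; f'(x) = 2x

x = 1.5
for h in [1e-1, 1e-3, 1e-5]:
    approx = (f(x + h) - f(x)) / h      # forward difference quotient
    print(h, approx, abs(approx - 2 * x))   # error shrinks as h -> 0
```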
Partial Differentiation and Gradients
For f(x_1, x_2, \dots, x_n), the partial derivative with respect to \( x_i \) is
\[
\frac{\partial f}{\partial x_i}
\]
The gradient vector collects all partial derivatives:
\[
\nabla f =
\begin{bmatrix}
\frac{\partial f}{\partial x_1} \\
\vdots \\
\frac{\partial f}{\partial x_n}
\end{bmatrix}
\]
The gradient points in the direction of steepest ascent.
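As a concrete check, a minimal NumPy sketch (the function f(x_1, x_2) = x_1² + 3x_2 and the step size are assumptions for the example) approximates each component of the gradient with central differences:

```python
import numpy as np

def f(x):
    return x[0] ** 2 + 3.0 * x[1]       # example function; gradient = [2*x1, 3]

def numerical_gradient(f, x, eps=1e-6):
    grad = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        grad[i] = (f(x + e) - f(x - e)) / (2 * eps)   # central difference
    return grad

x = np.array([1.0, 2.0])
print(numerical_gradient(f, x))         # approx [2.0, 3.0]
```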
flowchart LR
Input --> Function
Function --> Gradient
Gradient --> Optimisation
Linear Independence
A set of vectors is linearly independent if none of them can be written as a linear combination of the others.
\[
c_1\mathbf{v}_1 + \cdots + c_k\mathbf{v}_k = \mathbf{0}
\;\Rightarrow\;
c_1=\cdots=c_k=0
\]
Independence means each vector adds new information.
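A practical numerical test (a minimal sketch; the example vectors are arbitrary) stacks the vectors as columns and compares the matrix rank with the number of vectors:

```python
import numpy as np

v1 = np.array([1.0, 0.0, 2.0])
v2 = np.array([0.0, 1.0, 1.0])
v3 = v1 + 2 * v2                        # deliberately a combination of v1 and v2

V = np.column_stack([v1, v2, v3])
rank = np.linalg.matrix_rank(V)
print(rank == V.shape[1])               # False: the set is linearly dependent
```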
Gradients of Vector-Valued and Matrix Functions
This section covers gradients when the outputs or the parameters are vectors or matrices.
If \( f: \mathbb{R}^n \to \mathbb{R}^m \), the derivative is the \( m \times n \) Jacobian matrix:
\[
J =
\begin{bmatrix}
\frac{\partial f_1}{\partial x_1} & \dots & \frac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1} & \dots & \frac{\partial f_m}{\partial x_n}
\end{bmatrix}
\]
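As an illustration (a minimal sketch; the map f and the step size are assumptions chosen for the example), the Jacobian of a small vector-valued function can be approximated column by column:

```python
import numpy as np

def f(x):
    # example map from R^2 to R^2
    return np.array([x[0] * x[1], np.sin(x[0])])

def numerical_jacobian(f, x, eps=1e-6):
    m, n = f(x).size, x.size
    J = np.zeros((m, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = eps
        J[:, j] = (f(x + e) - f(x - e)) / (2 * eps)   # central difference per column
    return J

x = np.array([1.0, 2.0])
print(numerical_jacobian(f, x))
# analytic Jacobian: [[x2, x1], [cos(x1), 0]] = [[2, 1], [0.5403, 0]]
```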
For a scalar-valued function f(x), the second derivatives are collected in the Hessian:
\[
H = \nabla^2 f
\]
The Hessian captures the curvature of f.
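The same finite-difference idea extends to second derivatives (a minimal sketch; the quadratic form and symmetric matrix A are assumptions chosen so the answer, 2A, is known):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])              # symmetric example matrix

def f(x):
    return x @ A @ x                    # quadratic form x^T A x

def numerical_hessian(f, x, eps=1e-4):
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.zeros(n), np.zeros(n)
            ei[i], ej[j] = eps, eps
            # second-order central difference for d^2 f / (dx_i dx_j)
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return H

x = np.array([0.5, -1.0])
print(numerical_hessian(f, x))          # approx 2*A for symmetric A
```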
Useful Gradient Identities
\[
\nabla (a^T x) = a
\]
\[
\nabla (x^T A x) = (A + A^T)x
\]
If A is symmetric:
\[
\nabla (x^T A x) = 2Ax
\]
These identities are used heavily in optimisation.
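A minimal sketch of that use (the matrix A, vector a, step size, and iteration count are assumptions for the example): gradient descent on the quadratic f(x) = x^T A x + a^T x, with the gradient written directly from the identities above.

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])              # symmetric positive definite (assumed)
a = np.array([1.0, -1.0])

def grad(x):
    return 2 * A @ x + a                # gradient of x^T A x + a^T x via the identities

x = np.array([2.0, 2.0])
for _ in range(200):
    x = x - 0.1 * grad(x)               # plain gradient-descent update
print(x, np.allclose(grad(x), 0, atol=1e-6))   # converges to the stationary point
```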
Inner Products and Dot Product
An inner product maps two vectors to a single scalar.
It allows us to measure:
- similarity
- vector length
- projections
- orthogonality
flowchart TD
T["Inner<br/>products<br/>(types)"] --> DOT["Euclidean<br/>Dot product"]
T --> WIP["Weighted<br/>inner product"]
T --> FN["Function-space<br/>(integral)"]
T --> HERM["Complex<br/>Hermitian"]
T --> MAT["Matrix<br/>inner product<br/>(Frobenius)"]
DOT --> Rn["Vectors in<br/>R^n"]
WIP --> SPD["SPD matrix<br/>W"]
FN --> L2["L2 space<br/>functions"]
HERM --> Cn["Vectors in<br/>C^n"]
MAT --> Mnm["Matrices<br/>R^{m×n}"]
style T fill:#90CAF9,stroke:#1E88E5,color:#000
style DOT fill:#C8E6C9,stroke:#2E7D32,color:#000
style WIP fill:#C8E6C9,stroke:#2E7D32,color:#000
style FN fill:#C8E6C9,stroke:#2E7D32,color:#000
style HERM fill:#C8E6C9,stroke:#2E7D32,color:#000
style MAT fill:#C8E6C9,stroke:#2E7D32,color:#000
style Rn fill:#CE93D8,stroke:#8E24AA,color:#000
style SPD fill:#CE93D8,stroke:#8E24AA,color:#000
style L2 fill:#CE93D8,stroke:#8E24AA,color:#000
style Cn fill:#CE93D8,stroke:#8E24AA,color:#000
style Mnm fill:#CE93D8,stroke:#8E24AA,color:#000
Definition
For vectors \( \mathbf{a}, \mathbf{b} \in \mathbb{R}^n \), the dot product is
\[
\mathbf{a}^T \mathbf{b} = \sum_{i=1}^{n} a_i b_i
\]
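A minimal NumPy sketch (the vectors are arbitrary examples) showing how the dot product yields similarity, length, and an orthogonality test:

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([3.0, 0.0, 4.0])

dot = a @ b                                   # inner product a^T b = 11
length_a = np.sqrt(a @ a)                     # ||a|| = sqrt(a^T a) = 3
cos_angle = dot / (np.linalg.norm(a) * np.linalg.norm(b))   # cosine similarity
print(dot, length_a, cos_angle)
print(np.isclose(dot, 0.0))                   # True only if a and b are orthogonal
```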
Backpropagation and Automatic Differentiation
Backpropagation applies the chain rule repeatedly and efficiently across a computational graph.
Chain rule:
\[
\frac{dL}{dx} = \frac{dL}{dy} \cdot \frac{dy}{dx}
\]
flowchart LR
x --> y
y --> L
Automatic differentiation computes exact derivatives efficiently using computational graphs.
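A hand-rolled sketch of the same idea (the choices y = x² and L = sin(y) are assumptions mirroring the x → y → L graph above): the forward pass stores intermediate values, and the backward pass multiplies local derivatives along the graph.

```python
import numpy as np

x = 1.5

# forward pass through the graph x -> y -> L
y = x ** 2
L = np.sin(y)

# backward pass: chain rule dL/dx = dL/dy * dy/dx
dL_dy = np.cos(y)        # local derivative of L = sin(y)
dy_dx = 2 * x            # local derivative of y = x^2
dL_dx = dL_dy * dy_dx

print(dL_dx)             # matches d/dx sin(x^2) = 2x * cos(x^2)
```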