Gradient

Definition

\[\nabla f=\left(\frac{\partial f}{\partial x_1},\ldots,\frac{\partial f}{\partial x_n}\right)\]
Properties

  • Directional derivative: \(D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}\)
  • Maximum rate of change: \(|\nabla f|\), attained in the direction of \(\nabla f\)

Examples

Example 1. Compute \(\nabla f\) for \(f(x, y) = x^2 + y^2\).

Solution. \(\nabla f = (2x, 2y)\).
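A quick numeric sanity check of this answer, sketched with central finite differences (the helper `grad_fd` and the step size `h` are illustrative choices, not part of the lesson):

```python
def grad_fd(f, x, y, h=1e-6):
    """Approximate the gradient of f(x, y) with central differences."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (dfdx, dfdy)

f = lambda x, y: x**2 + y**2
gx, gy = grad_fd(f, 3.0, 4.0)
print(gx, gy)  # close to the exact gradient (2*3, 2*4) = (6, 8)
```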

In Depth

The gradient \(\nabla f = (\partial f/\partial x_1, \ldots, \partial f/\partial x_n)\) is the multivariable generalization of the derivative. It is a vector field that points in the direction of steepest ascent of \(f\), with magnitude equal to the maximum rate of change.

The directional derivative \(D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}\) gives the rate of change in any direction \(\mathbf{u}\). The gradient is perpendicular to level curves (in 2D) and level surfaces (in 3D) — a fact used in implicit differentiation and in defining surface normals.
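Both facts in this paragraph can be checked numerically. The sketch below (the function name and the test point are illustrative) evaluates \(D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}\) for \(f = x^2 + y^2\) at \((1, 0)\): along \(\nabla f\) the rate is \(|\nabla f|\), and along the tangent to the level circle it is zero.

```python
import math

def directional_derivative(grad, u):
    """D_u f = grad(f) . u, after normalizing u to a unit vector."""
    norm = math.hypot(u[0], u[1])
    return grad[0] * u[0] / norm + grad[1] * u[1] / norm

# f = x^2 + y^2 at (1, 0): grad f = (2, 0), |grad f| = 2
grad = (2.0, 0.0)
print(directional_derivative(grad, (1, 0)))  # 2.0: steepest-ascent direction
print(directional_derivative(grad, (0, 1)))  # 0.0: tangent to the level circle
```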

Gradient descent is the workhorse optimization algorithm in machine learning. Starting from an initial point, it iteratively moves in the direction \(-\nabla f\) (steepest descent) to minimize \(f\). Variants include stochastic gradient descent (SGD), momentum, Adam, and RMSProp, each addressing different challenges in high-dimensional optimization.
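The iteration described above can be sketched in a few lines. This is a minimal plain-gradient-descent toy (fixed learning rate, no stopping criterion), not any particular library's implementation; the quadratic objective is an assumed example:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Plain gradient descent: repeatedly step along -grad(f)."""
    x = list(x0)
    for _ in range(steps):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# Minimize f(x, y) = x^2 + y^2, whose gradient is (2x, 2y); minimum at origin.
xmin = gradient_descent(lambda x: [2 * x[0], 2 * x[1]], [3.0, 4.0])
print(xmin)  # both coordinates near 0
```

Each step here contracts the iterate by the factor \(1 - 2\,\mathrm{lr}\), which is why the fixed learning rate works for this simple quadratic; the SGD/momentum/Adam variants mentioned above replace this fixed step with adaptive schemes.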

The gradient theorem (fundamental theorem for line integrals): \(\int_C \nabla f \cdot d\mathbf{r} = f(\mathbf{b}) - f(\mathbf{a})\). A vector field is conservative (path-independent) if and only if it is the gradient of some scalar potential function.
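A numeric illustration of the gradient theorem, under assumed choices: \(f = x^2 + y^2\) and an arbitrary spiral path \(\mathbf{r}(t) = (t\cos t, t\sin t)\). The midpoint-rule line integral of \(\nabla f \cdot \mathbf{r}'(t)\) should match the endpoint difference \(f(\mathbf{b}) - f(\mathbf{a})\):

```python
import math

def f(x, y):
    return x**2 + y**2

def r(t):
    """An arbitrary sample path (a spiral) from r(0) to r(2*pi)."""
    return (t * math.cos(t), t * math.sin(t))

def line_integral_of_grad_f(n=20000):
    """Midpoint-rule approximation of the line integral of grad f along r."""
    a, b = 0.0, 2 * math.pi
    dt = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * dt
        x, y = r(t)
        h = 1e-6  # central-difference step for r'(t)
        x1, y1 = r(t + h)
        x0, y0 = r(t - h)
        dxdt, dydt = (x1 - x0) / (2 * h), (y1 - y0) / (2 * h)
        total += (2 * x * dxdt + 2 * y * dydt) * dt  # grad f = (2x, 2y)
    return total

endpoint_difference = f(*r(2 * math.pi)) - f(*r(0.0))
print(line_integral_of_grad_f(), endpoint_difference)  # both near (2*pi)^2
```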

Key Properties & Applications

The gradient is central to optimization. At a local minimum or maximum of \(f\), \(\nabla f=\mathbf{0}\) (necessary condition). The second-derivative test uses the Hessian matrix \(H_{ij}=\partial^2f/\partial x_i\partial x_j\): if \(\nabla f=\mathbf{0}\) and \(H\) is positive definite, the point is a local minimum.
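A small sketch of the second-derivative test for an assumed example, \(f(x, y) = x^2 + 3y^2\): the gradient \((2x, 6y)\) vanishes at the origin, where the Hessian is \(\mathrm{diag}(2, 6)\). Positive definiteness of a symmetric \(2\times 2\) matrix can be checked with Sylvester's criterion (leading principal minors positive):

```python
def is_positive_definite_2x2(H):
    """Sylvester's criterion for a symmetric 2x2 matrix."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return H[0][0] > 0 and det > 0

# Hessian of f(x, y) = x^2 + 3*y^2 at the critical point (0, 0)
H = [[2.0, 0.0], [0.0, 6.0]]
print(is_positive_definite_2x2(H))  # True: (0, 0) is a local minimum
```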

Lagrange multipliers use the gradient to optimize \(f\) subject to constraints \(g=0\): at a constrained extremum, \(\nabla f=\lambda\nabla g\). This says the gradient of \(f\) is parallel to the gradient of the constraint — the level curves of \(f\) and \(g\) are tangent.
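A worked instance of the condition \(\nabla f = \lambda \nabla g\), for the assumed problem of maximizing \(f(x, y) = xy\) subject to \(g(x, y) = x + y - 1 = 0\): since \(\nabla f = (y, x)\) and \(\nabla g = (1, 1)\), the condition forces \(x = y = \lambda\), and the constraint gives \(x = y = 1/2\). The snippet just verifies that solution:

```python
# Candidate from solving grad f = lambda * grad g with the constraint:
x, y, lam = 0.5, 0.5, 0.5

grad_f = (y, x)          # gradient of f(x, y) = x*y
grad_g = (1.0, 1.0)      # gradient of g(x, y) = x + y - 1

assert grad_f == (lam * grad_g[0], lam * grad_g[1])  # gradients are parallel
assert x + y - 1 == 0                                # constraint satisfied
print("constrained extremum at (1/2, 1/2), f =", x * y)
```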

In physics, the gradient of a scalar potential gives a force field: \(\mathbf{F}=-\nabla V\) (conservative force). The electric field is \(\mathbf{E}=-\nabla\phi\); gravity is \(\mathbf{g}=-\nabla\Phi\). The negative sign means forces point toward lower potential energy.
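The sign convention can be illustrated in one dimension with an assumed spring potential \(V(x) = \tfrac{1}{2}kx^2\), whose force should come out as Hooke's law \(F = -kx\) (the helper name and finite-difference step are illustrative):

```python
def force_from_potential(V, x, h=1e-6):
    """F = -dV/dx via a central difference (1-D sketch of F = -grad V)."""
    return -(V(x + h) - V(x - h)) / (2 * h)

k = 2.0
V = lambda x: 0.5 * k * x**2
print(force_from_potential(V, 1.5))  # close to -k * 1.5 = -3.0
```

The negative result confirms the remark above: the force points back toward the potential minimum at \(x = 0\).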

Further Reading & Context

The gradient connects multivariable calculus to many areas of mathematics and its applications: optimization, differential equations, geometry, and physics. The definitions and theorems above provide the foundation for more advanced work in analysis and applied mathematics.

Historical development: the \(\nabla\) (nabla) operator was introduced by William Rowan Hamilton in the mid-19th century, and modern vector calculus, including the gradient as used here, was systematized by Josiah Willard Gibbs and Oliver Heaviside toward the end of that century. The modern treatment provides rigor, while computational tools enable practical application.

In modern mathematics, the gradient appears throughout pure and applied work, from coursework to research. Its connections to computer science (optimization and machine learning), physics (field theory), and engineering make it a versatile and important tool, and mastery of the core results opens doors to further study in analysis, geometry, and optimization.

Recommended next steps: work through the gradient theorem and the Lagrange-multiplier condition with full proofs, explore the connections to optimization and physics described above, and practice with problems ranging from computing gradients and directional derivatives to proving that the gradient is normal to level sets.

Deep Dive: Gradient

This lesson extends the core ideas of the gradient with rigorous reasoning, edge-case checks, and application framing in calculus.

Practice Set

Practice. Derive one main result on this page, for example that \(D_{\mathbf{u}}f\) is maximized when \(\mathbf{u}\) points along \(\nabla f\), and validate it with a numeric or geometric check.

Goal. Confirm the assumptions, each transformation step, and the final interpretation.

Editorial Notes

Last editorial review: 2026-04-14.