Question 1

What is the geometric intuition of a subgradient?

Accepted Answer

Geometrically, a subgradient $ g $ of a convex function $ f $ at a point $ x $ defines the hyperplane $ y = f(x) + g^T(z-x) $, which supports the epigraph of $ f $ at $ (x, f(x)) $. The hyperplane touches the graph at $ z = x $ and lies on or below it everywhere else, that is, $ f(z) \ge f(x) + g^T(z-x) $ for all $ z $. It plays the role of the tangent plane of a differentiable function, but at a non-differentiable point it can 'tilt' within a range of slopes while still supporting the epigraph.
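As a rough numerical illustration of the supporting-hyperplane inequality, the sketch below tests candidate slopes for $ f(x) = |x| $ at the kink $ x = 0 $. The helper name `is_subgradient` and the sample grid are assumptions made for this example; a finite grid can refute a candidate slope but cannot prove the inequality for every $ z $.

```python
def is_subgradient(f, x, g, zs, tol=1e-12):
    """Return True if f(z) >= f(x) + g*(z - x) holds at every sample z (1-D case)."""
    return all(f(z) >= f(x) + g * (z - x) - tol for z in zs)

# Hypothetical sample grid on [-5, 5]; chosen only for illustration.
zs = [i / 100.0 for i in range(-500, 501)]

# At the kink x = 0 of |x|, slopes inside [-1, 1] give lines that stay below the graph.
at_kink = [is_subgradient(abs, 0.0, g, zs) for g in (-1.0, 0.0, 0.5, 1.0)]

# A slope outside [-1, 1] cuts through the graph, so it is not a subgradient.
too_steep = is_subgradient(abs, 0.0, 1.5, zs)

print(at_kink, too_steep)
```

Every slope in $ [-1, 1] $ passes the check at $ x = 0 $, while $ g = 1.5 $ fails because the line $ y = 1.5z $ rises above $ |z| $ for $ z > 0 $.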

Question 2

Why is the subgradient a set of vectors instead of a single vector?

Accepted Answer

At points where a convex function is differentiable, the subdifferential (the set of all subgradients) contains a single vector, the gradient. At non-differentiable points (e.g., a 'kink' or 'corner'), there can be multiple supporting hyperplanes, each with a different slope that keeps it on or below the function. The subdifferential collects all such slopes, which is why it is a set rather than a single vector.
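For a concrete case, consider a 1-D function that is the maximum of finitely many affine pieces, $ f(x) = \max_i (a_i x + b_i) $. Its set of subgradients at $ x $ is the interval spanned by the slopes of the pieces that attain the maximum there. The sketch below assumes this setting; the function name `subdifferential_max_affine` and the tolerance are illustrative choices, not part of the original answer.

```python
def subdifferential_max_affine(pieces, x, tol=1e-9):
    """Interval (lo, hi) of subgradients of f(x) = max_i (a_i*x + b_i) at x.

    pieces: list of (a_i, b_i) pairs. In 1-D the subgradient set is the
    interval spanned by the slopes of the pieces that are active at x.
    """
    values = [a * x + b for a, b in pieces]
    fx = max(values)
    # A piece is "active" if it attains the maximum (up to a small tolerance).
    active = [a for (a, _), v in zip(pieces, values) if fx - v <= tol]
    return min(active), max(active)

abs_pieces = [(1.0, 0.0), (-1.0, 0.0)]  # |x| written as max(x, -x)

kink = subdifferential_max_affine(abs_pieces, 0.0)    # both pieces active
smooth = subdifferential_max_affine(abs_pieces, 2.0)  # only the slope-1 piece active

print(kink, smooth)
```

At the kink both pieces are active and the result is the whole interval from $ -1 $ to $ 1 $; away from the kink only one piece is active and the interval collapses to a single slope, the ordinary derivative.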

Question 3

When is the subgradient equivalent to the gradient?

Accepted Answer

For a convex function $ f $ and a point $ x $ in the interior of its domain, the subdifferential $ \partial f(x) $ reduces to the gradient $ \nabla f(x) $ if and only if $ f $ is differentiable at $ x $. In this case, the subdifferential contains exactly one element: $ \partial f(x) = \{\nabla f(x)\} $.
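A short worked example may make the single-element case concrete. Take $ f(z) = z^2 $ on the real line (chosen here only for illustration) and ask which slopes $ g $ satisfy $ f(z) \ge f(x) + g(z - x) $ for all $ z $. The gap between the function and the candidate line factors as:

```latex
\begin{aligned}
f(z) - f(x) - g\,(z - x) &= z^2 - x^2 - g\,(z - x) \\
                         &= (z - x)\,(z + x - g).
\end{aligned}
```

The product of these two linear factors is non-negative for every $ z $ only when their roots coincide, that is, when $ g - x = x $. This gives $ g = 2x = f'(x) $, for which the gap becomes $ (z - x)^2 \ge 0 $. Any other slope makes the gap negative for $ z $ strictly between the two roots, so the gradient is the only subgradient.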

Question 4

What happens if the functions are not convex?

Accepted Answer

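The subgradient, as defined through the global inequality $ f(z) \ge f(x) + g^T(z-x) $, is primarily useful for convex functions. Generalized gradients exist for non-convex functions (e.g., Clarke subgradients), but they are defined through a more involved local construction and have different properties. Many of the elegant properties of convex subgradients do not carry over in general: for instance, a Clarke subgradient need not give a global affine lower bound on the function, and the sum rule holds in general only as an inclusion rather than an equality.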
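To see what goes wrong without convexity, the sketch below searches for slopes satisfying the convex subgradient inequality for the concave function $ f(x) = -|x| $ at $ x = 0 $. The helper `is_global_underestimator`, the sample grid, and the candidate slopes are assumptions made for this example.

```python
def is_global_underestimator(f, x, g, zs, tol=1e-12):
    """Return True if f(z) >= f(x) + g*(z - x) holds at every sample z (1-D case)."""
    return all(f(z) >= f(x) + g * (z - x) - tol for z in zs)

def neg_abs(z):
    """A non-convex (concave) function with a kink at 0."""
    return -abs(z)

# Hypothetical sample grid and candidate slopes, chosen only for illustration.
zs = [i / 100.0 for i in range(-500, 501)]
candidates = [k / 10.0 for k in range(-30, 31)]

# Collect every candidate slope whose line stays on or below -|z| on the grid.
valid = [g for g in candidates if is_global_underestimator(neg_abs, 0.0, g, zs)]

print(valid)
```

No candidate survives: the point $ z = 1 $ forces $ g \le -1 $ while $ z = -1 $ forces $ g \ge 1 $, so no line through the origin stays below $ -|z| $. The convex definition yields an empty set here, which is one reason non-convex analysis relies on local notions such as the Clarke subdifferential.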

Properties of Subgradients for Non-Differentiable Convex Functions
