8. Sub-Gradient

0. Preface

  • Introduces some properties of the sub-gradient, laying the groundwork for the sub-gradient methods that follow.

1. Sub-Gradient

$g$ is a sub-gradient of a convex function $f$ at $x$ if $f(y)\geq f(x)+g^T(y-x)$ for all $y$
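
As a quick numerical illustration (my own sketch, not part of the original notes): for $f(x)=|x|$, every $g\in[-1,1]$ satisfies the sub-gradient inequality at $x=0$.

```python
import numpy as np

# f(x) = |x| is convex; at x = 0 every g in [-1, 1] is a sub-gradient,
# i.e. f(y) >= f(0) + g * (y - 0) for all y.
f = lambda x: abs(x)
x = 0.0
ys = np.linspace(-5, 5, 1001)

for g in [-1.0, -0.3, 0.0, 0.7, 1.0]:
    assert np.all(f(ys) >= f(x) + g * (ys - x))  # inequality holds everywhere
print("every g in [-1, 1] satisfies the sub-gradient inequality at x = 0")
```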

1.1 Properties

  • unconstrained optimality: $x$ minimizes $f(x)$ if and only if $0\in\partial f(x)$

1.2 Subdifferential

the set of all sub-gradients of $f$ at $x$:

$\partial f(x)=\{g\mid f(y)\geq f(x)+g^T(y-x)\text{ for all }y\}$

$\partial f(x)$ is an intersection of half-spaces in $g$ (one for each $y$), so it is a closed convex set.

1.3 Monotonicity

The sub-gradient of a convex function is monotone (by analogy with the gradient): for $g_x\in\partial f(x)$ and $g_y\in\partial f(y)$, $(g_x-g_y)^T(x-y)\geq 0$.
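
A quick numerical check of this monotonicity (my own sketch, using $f(x)=\|x\|_1$ with the sub-gradient choice $g=\operatorname{sign}(x)$):

```python
import numpy as np

# For convex f(x) = ||x||_1, g = sign(x) is a sub-gradient at x (valid when
# x has no zero entries). Monotonicity: (g_x - g_y)^T (x - y) >= 0.
rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    gx, gy = np.sign(x), np.sign(y)
    assert (gx - gy) @ (x - y) >= 0
print("monotonicity holds on all sampled pairs")
```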

1.4 Sub-level set

A nonzero sub-gradient $g$ at $x$ defines a supporting hyperplane of the sub-level set $\{y\mid f(y)\leq f(x)\}$ at $x$: $g^T(y-x)\leq 0$ for every $y$ in that set.

2. Sub-Gradient calculus

  • weak result: find one sub-gradient
  • strong result: find all sub-gradients (the whole subdifferential)

2.1 Basic rules

  • Differentiable functions:

    $\partial f(x)=\{\nabla f(x)\}$ if $f$ is differentiable at $x$

  • Non-negative linear combination (Thm. Moreau-Rockafellar):

    if $f(x)=\alpha_1f_1(x)+\alpha_2f_2(x)\text{ with } \alpha_1,\alpha_2\geq0$

    then $\partial f(x)=\alpha_1\partial f_1(x)+\alpha_2\partial f_2(x)$

  • Affine transformation of variables:

    if $f(x)=h(Ax+b)$

    then $\partial f(x)=A^T\partial h(Ax+b)$
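
Combining the affine rule with the 1-norm gives a practical recipe. The sketch below is my own code (the helper name `subgrad_l1_affine` is made up): it computes one sub-gradient of $f(x)=\|Ax+b\|_1$ and verifies the sub-gradient inequality numerically.

```python
import numpy as np

def subgrad_l1_affine(A, b, x):
    """One sub-gradient of f(x) = ||Ax + b||_1, via the affine rule:
    g = A^T s with s in the subdifferential of ||.||_1 at z = Ax + b;
    sign(z) is a valid choice (any value in [-1, 1] works where z_i == 0)."""
    z = A @ x + b
    return A.T @ np.sign(z)

# Sanity check: the sub-gradient inequality f(y) >= f(x) + g^T (y - x).
rng = np.random.default_rng(1)
A, b = rng.standard_normal((4, 3)), rng.standard_normal(4)
f = lambda x: np.linalg.norm(A @ x + b, 1)
x = rng.standard_normal(3)
g = subgrad_l1_affine(A, b, x)
for _ in range(1000):
    y = rng.standard_normal(3)
    assert f(y) >= f(x) + g @ (y - x) - 1e-9
print("sub-gradient inequality verified on random points")
```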

2.2 Pointwise maximum

  • $f(x)=\max\{f_1(x),\cdots,f_m(x)\}$. Define $I(x)=\{i\mid f(x)=f_i(x)\}$

  • weak result: choose any $k\in I(x)$; any sub-gradient $g\in\partial f_k(x)$ is a sub-gradient of $f$ at $x$

  • strong result (Dubovitskii-Milyutin): $\partial f(x)=\mathbf{conv}\bigcup_{i\in I(x)}\partial f_i(x)$

Several of the functions $f_i$ may be active at the same point, which is why the convex hull of all active subdifferentials is needed.

  • Example: the 1-norm, $f(x)=\|x\|_1$, worked out below.
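
Working out the 1-norm example with the strong result (a standard computation, filled in here): write $\|x\|_1=\max\{s^Tx\mid s\in\{-1,1\}^n\}$. The active sign vectors at $x$ are those agreeing with $\operatorname{sign}(x_k)$ on the nonzero coordinates, so the convex hull of the active gradients factors coordinate-wise:

$$\partial\|x\|_1=J_1\times\cdots\times J_n,\qquad J_k=\begin{cases}\{\operatorname{sign}(x_k)\}, & x_k\neq 0,\\ [-1,1], & x_k=0.\end{cases}$$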

2.3 Maximum eigenvalue

  • $f(x)=\lambda_{\max}(A(x))=\sup_{\|y\|_2=1}y^TA(x)y$

    where $A(x)=A_0+x_1A_1+\cdots+x_nA_n$

    Find a sub-gradient at $\hat{x}$:

  • choose any unit eigenvector $y$ of $A(\hat{x})$ with eigenvalue $\lambda_{\max}(A(\hat{x}))$

  • the gradient of the affine function $x\mapsto y^TA(x)y$ is an answer:

    $(y^TA_1y,\cdots,y^TA_ny)\in \partial f(\hat{x})$
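
A numerical sketch of this recipe (my own code; `numpy.linalg.eigh` returns eigenvalues in ascending order, so the last column of the eigenvector matrix is a unit top eigenvector):

```python
import numpy as np

def subgrad_lambda_max(As, x):
    """One sub-gradient of f(x) = lambda_max(A0 + x1*A1 + ... + xn*An).
    As = [A0, A1, ..., An], all symmetric. Returns g with
    g_i = y^T A_i y for a unit top eigenvector y of A(x)."""
    A = As[0] + sum(xi * Ai for xi, Ai in zip(x, As[1:]))
    w, V = np.linalg.eigh(A)      # eigenvalues in ascending order
    y = V[:, -1]                  # unit eigenvector for lambda_max
    return np.array([y @ Ai @ y for Ai in As[1:]])

rng = np.random.default_rng(2)
sym = lambda M: (M + M.T) / 2
As = [sym(rng.standard_normal((4, 4))) for _ in range(3)]  # A0, A1, A2
f = lambda x: np.linalg.eigvalsh(As[0] + x[0]*As[1] + x[1]*As[2])[-1]
x = rng.standard_normal(2)
g = subgrad_lambda_max(As, x)
for _ in range(1000):
    y = rng.standard_normal(2)
    assert f(y) >= f(x) + g @ (y - x) - 1e-9
print("sub-gradient inequality for lambda_max verified")
```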

2.4 Some other examples

  • Minimization over one block of variables: $f(x)=\inf_yh(x,y)$, where $h$ is jointly convex.

  • Euclidean distance to a set: $f(x)=\inf_{y\in C}\|x-y\|_2$, where $C$ is convex.

  • Composition: $f(x)=h(f_1(x),\cdots,f_k(x))$, where each $f_i$ and $h$ are convex and $h$ is non-decreasing.

  • Optimal value function:

    Consider the optimal value $f(u,v)$ of: minimize $f_0(x)$

    s.t. $f_i(x)\leq u_i,\ Ax=b+v$

    If strong duality holds for the $(\hat{u},\hat{v})$-perturbed problem and its dual optimum $(\hat{\lambda},\hat{\nu})$ is finite and attained, we can conclude that $(-\hat{\lambda},-\hat{\nu})\in \partial f(\hat{u},\hat{v})$ (a short derivation follows this list).

    • $\lambda,\nu$ are the Lagrange multipliers attached to the corresponding perturbation terms; up to sign, they play the role of gradients of the optimal value.

    • $u\in \partial f(v) \Leftrightarrow v\in \partial f^*(u)\tag{8-1}$ (for closed convex $f$)

    • the minimization problem is thereby turned into a sub-gradient problem (?)
  • Expectation: $f(x)=\mathbb{E}\,h(x,u)$, with $u$ random and $h(\cdot,u)$ convex for each $u$
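
Returning to the optimal value function: the claimed sub-gradient follows from weak duality (a standard argument, filled in here). For any $x$ feasible for the $(u,v)$-perturbed problem, with $L$ the Lagrangian of the $(\hat{u},\hat{v})$-problem,

$$\begin{aligned} f_0(x)&\geq f_0(x)+\textstyle\sum_i\hat{\lambda}_i(f_i(x)-u_i)+\hat{\nu}^T(Ax-b-v)\\ &=L(x,\hat{\lambda},\hat{\nu})-\hat{\lambda}^T(u-\hat{u})-\hat{\nu}^T(v-\hat{v})\\ &\geq f(\hat{u},\hat{v})-\hat{\lambda}^T(u-\hat{u})-\hat{\nu}^T(v-\hat{v}), \end{aligned}$$

where the first step uses $\hat{\lambda}\geq0$ and feasibility, and the last uses strong duality. Taking the infimum over feasible $x$ gives $f(u,v)\geq f(\hat{u},\hat{v})+(-\hat{\lambda})^T(u-\hat{u})+(-\hat{\nu})^T(v-\hat{v})$, which is exactly the sub-gradient inequality.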

3. Optimality conditions

  • $x^\star$ minimizes $f(x)$ iff $0\in \partial f(x^\star)$
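
As a small example of this condition (my own sketch; `soft_threshold` is a name I made up): for $f(x)=\frac12(x-a)^2+t|x|$ the minimizer is the soft-threshold of $a$, and $0\in\partial f(x^\star)=\{x^\star-a\}+t\,\partial|x^\star|$ can be checked directly.

```python
import numpy as np

def soft_threshold(a, t):
    """Minimizer of f(x) = 0.5*(x - a)**2 + t*|x|  (t > 0)."""
    return np.sign(a) * max(abs(a) - t, 0.0)

for a in [2.0, 0.3, -1.5]:
    t = 1.0
    xs = soft_threshold(a, t)
    # partial f(x) = {x - a} + t * partial|x|; 0 must be a member at xs.
    if xs != 0:
        assert abs((xs - a) + t * np.sign(xs)) < 1e-12
    else:  # partial|0| = [-1, 1], so we need |xs - a| = |a| <= t
        assert abs(a) <= t
    print(f"a={a}: x*={xs}, 0 in subdifferential: OK")
```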

3.1 Constrained problems

KKT conditions: $(x^\star,\lambda^\star)$ are primal-dual optimal iff

  • Strong duality holds
  • $x^\star$ is primal feasible
  • $\lambda^\star\geq 0$
  • $\lambda_i^\star f_i(x^\star)=0$ for $i=1,\cdots,m$
  • $x^\star$ minimizes $L(x,\lambda^\star)$ over $x$, i.e. $0\in\partial_x L(x^\star,\lambda^\star)$

4. Directional derivative

  • Definition ($f$ at $x$ in the direction $y$): $f'(x;y)=\lim_{t\to0^+}\dfrac{f(x+ty)-f(x)}{t}$
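
The notes stop at the definition; as a numerical illustration (my own sketch), the standard identity $f'(x;y)=\sup_{g\in\partial f(x)}g^Ty$ can be checked for the 1-norm at a non-differentiable point:

```python
import numpy as np

f = lambda x: np.linalg.norm(x, 1)
x = np.array([1.0, 0.0, -2.0])   # zero coordinate -> non-differentiable point
y = np.array([0.5, -1.0, 0.3])

# Numeric directional derivative: (f(x + t*y) - f(x)) / t for small t > 0.
t = 1e-8
num = (f(x + t * y) - f(x)) / t

# sup over the subdifferential: g_k = sign(x_k) if x_k != 0, else sign(y_k)
# (the maximizing choice on the interval [-1, 1]).
g = np.where(x != 0, np.sign(x), np.sign(y))
print(num, g @ y)   # both ~ 0.5 + 1.0 - 0.3 = 1.2
```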