Globalization Techniques

Linesearch Methods

The linesearch module in NLP.py implements a few typical line searches: given the current iterate \(x_k\), the value \(f(x_k)\), the gradient \(\nabla f(x_k)\) and the search direction \(d_k\), each returns a step size \(\alpha_k > 0\) such that \(x_k + \alpha_k d_k\) is an improved iterate in an appropriate sense. Typical conditions to be satisfied by \(\alpha_k\) include the Armijo condition

\[f(x_k + \alpha_k d_k) \leq f(x_k) + \beta \alpha_k \nabla f(x_k)^T d_k\]

for some \(\beta \in (0,1)\); the Wolfe conditions, which consist of the Armijo condition supplemented with

\[\nabla f(x_k + \alpha_k d_k)^T d_k \geq \gamma \nabla f(x_k)^T d_k\]

for some \(\gamma \in (\beta,1)\); and the strong Wolfe conditions, which consist of the Armijo condition supplemented with

\[|\nabla f(x_k + \alpha_k d_k)^T d_k| \leq \gamma |\nabla f(x_k)^T d_k|\]

for some \(\gamma \in (\beta,1)\).
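These conditions are straightforward to test numerically along the one-dimensional merit function \(\phi(t) = f(x_k + t d_k)\). The following sketch is independent of NLP.py (the function names are illustrative, and `beta`/`gamma` play the roles of \(\beta\) and \(\gamma\) above):

```python
def armijo(phi0, dphi0, phi_t, t, beta=1.0e-4):
    """Armijo (sufficient decrease) condition."""
    return phi_t <= phi0 + beta * t * dphi0

def wolfe(dphi0, dphi_t, gamma=0.9):
    """Curvature part of the (weak) Wolfe conditions."""
    return dphi_t >= gamma * dphi0

def strong_wolfe(dphi0, dphi_t, gamma=0.9):
    """Curvature part of the strong Wolfe conditions."""
    return abs(dphi_t) <= gamma * abs(dphi0)

# Example on f(x) = x**2 at x_k = 1 with d_k = -1, so phi(t) = (1 - t)**2:
phi = lambda t: (1.0 - t) ** 2       # phi(0) = 1, phi'(0) = -2, phi'(0.5) = -1
ok = armijo(1.0, -2.0, phi(0.5), 0.5) and strong_wolfe(-2.0, -1.0)  # True
```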

The linesearch Module

A general linesearch framework.

class linesearch.LineSearch(linemodel, name='Generic Linesearch', **kwargs)

Bases: object

A generic linesearch class.

Most methods of this class should be overridden by subclassing.

__init__(linemodel, name='Generic Linesearch', **kwargs)

Initialize a linesearch method.

Parameters:
linemodel:C1LineModel or C2LineModel instance
Keywords:
step:initial step size (default: 1.0)
value:initial function value (computed if not supplied)
slope:initial slope (computed if not supplied)
trial_value:function value at x + t*d (computed if not supplied)
name:linesearch procedure name.
check_slope(slope)

Check whether the supplied direction is a descent direction.

iterate

Return current full-space iterate.

While not normally needed by the linesearch procedure, it is here so it can be recovered on exit and doesn’t need to be recomputed.

linemodel

Return underlying line model.

name

Return linesearch procedure name.

next()
slope

Return initial merit function slope in search direction.

step

Return current step size.

stepmin

Return minimum permitted step size.

trial_value

Return current merit function value.

value

Return initial merit function value.

class linesearch.ArmijoLineSearch(*args, **kwargs)

Bases: linesearch.LineSearch

Armijo backtracking linesearch.

__init__(*args, **kwargs)

Instantiate an Armijo backtracking linesearch.

The search stops as soon as a step size t is found such that

ϕ(t) <= ϕ(0) + t * ftol * ϕ’(0)

where 0 < ftol < 1 and ϕ’(0) is the directional derivative of the merit function f in the descent direction d.

Keywords:
ftol:constant used in Armijo condition (default: 1.0e-4)
bkmax:maximum number of backtracking steps (default: 20)
decr:factor by which to reduce the steplength during the backtracking (default: 1.5).
bk
bkmax
check_slope(slope)

Check whether the supplied direction is a descent direction.

decr
ftol
iterate

Return current full-space iterate.

While not normally needed by the linesearch procedure, it is here so it can be recovered on exit and doesn’t need to be recomputed.

linemodel

Return underlying line model.

name

Return linesearch procedure name.

next()
slope

Return initial merit function slope in search direction.

step

Return current step size.

stepmin

Return minimum permitted step size.

trial_value

Return current merit function value.

value

Return initial merit function value.
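The backtracking loop behind ArmijoLineSearch can be sketched as follows. This is a self-contained illustration using the documented keywords (ftol, bkmax, decr) and their defaults, not NLP.py's actual implementation:

```python
def armijo_backtrack(phi, phi0, dphi0, step=1.0, ftol=1.0e-4, bkmax=20, decr=1.5):
    """Backtrack until phi(t) <= phi(0) + t * ftol * phi'(0), or give up.

    phi   -- merit function along the search direction, phi(t) = f(x + t*d)
    phi0  -- phi(0)
    dphi0 -- directional derivative phi'(0); must be negative (descent)
    """
    if dphi0 >= 0.0:
        raise ValueError("not a descent direction")
    t = step
    for _ in range(bkmax):
        if phi(t) <= phi0 + t * ftol * dphi0:
            return t
        t /= decr          # reduce the steplength by the factor `decr`
    return None            # no acceptable step within bkmax backtracks

# Example: f(x) = x**4 at x = 1 with d = -1, so phi(t) = (1 - t)**4.
# The full step t = 1 already satisfies the Armijo condition here.
t = armijo_backtrack(lambda t: (1.0 - t) ** 4, phi0=1.0, dphi0=-4.0)
```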

The pyswolfe Module

Strong Wolfe linesearch.

A translation of the original Fortran implementation of the Moré and Thuente linesearch ensuring satisfaction of the strong Wolfe conditions.

The method is described in

J. J. Moré and D. J. Thuente, Line search algorithms with guaranteed sufficient decrease, ACM Transactions on Mathematical Software (TOMS), Volume 20, Issue 3, pp. 286–307, 1994. DOI: http://dx.doi.org/10.1145/192115.192132

class wolfe.StrongWolfeLineSearch(*args, **kwargs)

Bases: nlp.ls.linesearch.LineSearch

Moré-Thuente linesearch for the strong Wolfe conditions.

ϕ(t) ≤ ϕ(0) + ftol * t * ϕ’(0) (Armijo condition)

|ϕ’(t)| ≤ gtol * |ϕ’(0)| (curvature condition)

where 0 < ftol < gtol < 1.

__init__(*args, **kwargs)

Instantiate a strong Wolfe linesearch procedure.

Keywords:
ftol:constant used in Armijo condition (1.0e-4)
gtol:constant used in curvature condition (0.9)
xtol:minimal relative step bracket length (1.0e-10)
lb:initial lower bound of the bracket
ub:initial upper bound of the bracket
check_slope(slope)

Check whether the supplied direction is a descent direction.

ftol
gtol
iterate

Return current full-space iterate.

While not normally needed by the linesearch procedure, it is here so it can be recovered on exit and doesn’t need to be recomputed.

lb
linemodel

Return underlying line model.

name

Return linesearch procedure name.

next()
slope

Return initial merit function slope in search direction.

step

Return current step size.

stepmin

Return minimum permitted step size.

trial_slope
trial_value

Return current merit function value.

ub
value

Return initial merit function value.

xtol

The pymswolfe Module

PyMSWolfe: Jorge Nocedal’s modified Moré and Thuente linesearch, guaranteeing satisfaction of the strong Wolfe conditions.

class pymswolfe.StrongWolfeLineSearch(f, x, g, d, obj, grad, **kwargs)

A general-purpose linesearch procedure enforcing the strong Wolfe conditions

f(x+td) <= f(x) + ftol * t * <g(x),d> (Armijo condition)

| <g(x+td),d> | <= gtol * | <g(x),d> | (curvature condition)

This is a Python interface to Jorge Nocedal’s modification of the Moré and Thuente linesearch. Usage of this class is slightly different from the original Moré and Thuente linesearch.

Instantiate as follows

SWLS = StrongWolfeLineSearch(f, x, g, d, obj, grad)

where

  • f is the objective value at the current iterate x
  • x is the current iterate
  • g is the objective gradient at the current iterate x
  • d is the current search direction
  • obj is a scalar function used to evaluate the value of
    the objective at a given point
  • grad is a scalar function used to evaluate the gradient
    of the objective at a given point.
Keywords:
ftol:the constant used in the Armijo condition (1e-4)
gtol:the constant used in the curvature condition (0.9)
xtol:a minimal relative step bracket length (1e-16)
stp:an initial step value (1.0)
stpmin:the initial lower bound of the bracket (1e-20)
stpmax:the initial upper bound of the bracket (1e+20)
maxfev:the maximum number of function evaluations permitted (20)

To ensure existence of a step satisfying the strong Wolfe conditions, d should be a descent direction for f at x and ftol <= gtol.

The final value of the step is held in SWLS.stp.

After the search, SWLS.armijo will be set to True if the step computed satisfies the Armijo condition and SWLS.curvature will be set to True if the step satisfies the curvature condition.

__init__(f, x, g, d, obj, grad, **kwargs)
search()

Trust-Region Methods

Trust-region methods are an alternative to linesearch methods as a mechanism to enforce global convergence, i.e., \(\lim_{k \to \infty} \nabla f(x_k) = 0\). In trust-region methods, a quadratic model is approximately minimized at each iteration subject to a trust-region constraint:

\[\begin{split}\begin{align*} \min_{d \in \mathbb{R}^n} & g_k^T d + \tfrac{1}{2} d^T H_k d \\ \text{s.t.} & \|d\| \leq \Delta_k, \end{align*}\end{split}\]

where \(g_k = \nabla f(x_k)\), \(H_k \approx \nabla^2 f(x_k)\) and \(\Delta_k > 0\) is the current trust-region radius.
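A classical cheap approximation to this subproblem is the Cauchy point: minimize the model along the steepest-descent direction \(-g_k\), truncated at the trust-region boundary. The sketch below is illustrative only (it is not one of NLP.py's solvers) and uses plain Python lists for \(g_k\) and \(H_k\):

```python
import math

def cauchy_step(g, H, delta):
    """Minimize g^T d + 0.5 d^T H d over d = -t*g, t >= 0, with ||d|| <= delta.

    g     -- gradient vector (list of floats)
    H     -- Hessian approximation (list of rows)
    delta -- trust-region radius
    """
    gnorm = math.sqrt(sum(gi * gi for gi in g))
    Hg = [sum(hij * gj for hij, gj in zip(row, g)) for row in H]
    gHg = sum(gi * hgi for gi, hgi in zip(g, Hg))
    if gHg <= 0.0:
        t = delta / gnorm                      # model decreases along -g: go to the boundary
    else:
        t = min(gnorm ** 2 / gHg, delta / gnorm)  # unconstrained minimizer, clipped at the boundary
    return [-t * gi for gi in g]

# Example: g = (1, 0), H = I, delta = 10; the unconstrained minimizer -g is interior.
d = cauchy_step([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 10.0)   # [-1.0, 0.0]
```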

The trustregion Module

Class definition for Trust-Region Algorithm and Management.

class trustregion.TrustRegion(**kwargs)

Bases: object

A trust-region management class.

__init__(**kwargs)

Instantiate an object allowing management of a trust region.

Keywords:
radius:Initial trust-region radius (default: 1.0)
eta1:Step acceptance threshold (default: 0.01)
eta2:Radius increase threshold (default: 0.99)
gamma1:Radius decrease factor (default: 1/3)
gamma2:Radius increase factor (default: 2.5)

Subclass and override update_radius() to implement custom trust-region management rules.

See, e.g.,

A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.

ratio(f, f_trial, m, check_positive=True)

Compute the ratio of actual versus predicted reduction.

rho = (f - f_trial)/(-m).

reset_radius()

Reset radius to original value.

update_radius(ratio, step_norm)

Update the trust-region radius.

The rule implemented by this method is:

radius = gamma1 * step_norm   if ared/pred < eta1
radius = gamma2 * radius      if ared/pred >= eta2
radius unchanged              otherwise,

where ared/pred is the quotient computed by ratio().
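The ratio() and update_radius() pair can be sketched as follows. The defaults mirror the documented keywords (eta1, eta2, gamma1, gamma2), but this is an illustration, not NLP.py's code:

```python
def ratio(f, f_trial, m):
    """Actual versus predicted reduction, rho = (f - f_trial) / (-m),
    where m is the value of the quadratic model at the step (m < 0 for a
    predicted decrease)."""
    return (f - f_trial) / (-m)

def update_radius(radius, rho, step_norm,
                  eta1=0.01, eta2=0.99, gamma1=1.0 / 3, gamma2=2.5):
    """Trust-region radius update implementing the rule above."""
    if rho < eta1:
        return gamma1 * step_norm   # poor agreement: shrink around the step
    if rho >= eta2:
        return gamma2 * radius      # excellent agreement: enlarge
    return radius                   # acceptable step: leave unchanged

# A step with perfect model agreement (rho = 1) enlarges the radius:
r = update_radius(1.0, ratio(10.0, 9.0, -1.0), 0.5)   # 2.5
```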

class trustregion.TrustRegionSolver(qp, cg_solver, **kwargs)

Bases: object

A generic class for implementing solvers for the trust-region subproblem.

minimize q(d) subject to ||d|| <= radius,

where q(d) is a quadratic function, not necessarily convex.

The trust-region constraint ||d|| <= radius can be defined in any norm, although most derived classes currently implement the Euclidean norm only. Note, however, that any elliptical norm may be used via a preconditioner.

For more information on trust-region methods, see

A. R. Conn, N. I. M. Gould and Ph. L. Toint, Trust-Region Methods, MP01 MPS-SIAM Series on Optimization, 2000.

__init__(qp, cg_solver, **kwargs)

Basic solver for a quadratic trust-region subproblem.

Keywords:
qp:a QPModel instance
cg_solver:a class to be instantiated and used as solver for the trust-region problem.
m

Return the value of the quadratic model.

model_value

Return the value of the quadratic model.

niter

Return the number of iterations made by the solver.

qp

Return the QPModel instance.

solve(*args, **kwargs)

Solve the trust-region subproblem.

step

Return the step computed by the solver.

step_norm

Return the norm of the step computed by the solver.