API


Documentation

Public

NLS_Solver.AbstractNLS (Type)
abstract type AbstractNLS end 

Defines an abstract non-linear least squares problem (NLS). In our context, such a problem is essentially a differentiable function $\mathbf{r}$:

\[\mathbf{r}: \theta\in\mathbb{R}^{n_θ}\mapsto \mathbf{r}(\mathbf{\theta})\in\mathbb{R}^{n_S}\]

where:

  • $\mathbf{r}(\mathbf{θ})∈\mathbb{R}^{n_S}$ is the residue vector,
  • $\mathbf{θ}∈\mathbb{R}^{n_θ}$ is the parameter vector to be optimized

The objective function to minimize is:

\[f(θ)=\frac{1}{2}\| \mathbf{r}(θ) \|^2\]

The classical approach uses a linear approximation of $\mathbf{r}$:

\[\mathbf{r}(\mathbf{θ}+δ\mathbf{θ})\approx \mathbf{r}(\mathbf{θ}) + \mathbf{J}(\mathbf{θ})\cdot δ\mathbf{θ}\]

where $\mathbf{J}$ is the Jacobian:

\[\mathbf{J}_{i,j}=\partial_j r^i(\mathbf{θ}),\ i\in[1,n_S],\ j\in[1,n_θ]\]

This leads to

\[f(\mathbf{θ}+δ\mathbf{θ})\approx f(\mathbf{θ}) + \langle \nabla f, δ\mathbf{θ} \rangle + \frac{1}{2} \langle \nabla^2 f \cdot δ\mathbf{θ}, δ\mathbf{θ} \rangle\]

where the gradient $\nabla f$ is $\mathbf{J}^t \mathbf{r}$ and the (approximate) Hessian $\nabla^2 f$ is $\mathbf{J}^t \mathbf{J}$.

To implement such a model, you must define the evaluation functions documented below, in particular eval_r and eval_r_J.
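For illustration, here is a minimal sketch of a concrete problem type (the type name and residue function are hypothetical, and depending on the package version additional methods, e.g. dimension accessors, may be required):

# Hypothetical concrete problem with r(θ) = (θ₁ - 1, θ₂²), hence n_θ = n_S = 2
struct MyProblem <: NLS_Solver.AbstractNLS end

# Residue vector r(θ)
NLS_Solver.eval_r(nls::MyProblem, θ::AbstractVector) = [θ[1] - 1, θ[2]^2]

# Residue vector and its Jacobian J(θ)
function NLS_Solver.eval_r_J(nls::MyProblem, θ::AbstractVector)
    r = NLS_Solver.eval_r(nls, θ)
    J = [1.0 0.0;
         0.0 2*θ[2]]
    (r, J)
end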

NLS_Solver.BoundConstraints (Type)

Store bound constraints $[l,u]$

At construction time, we check that no component is NaN and that the condition $l\le u$ holds. Note however that some components may be infinite.

Constructors

The following constructors are available:

  • BoundConstraints(n): constructs $[0.0,1.0]^n$
  • BoundConstraints(T,n): constructs $[T(0),T(1)]^n$ where components are of type T
  • BoundConstraints(l,u): constructs $[l,u]$ where l and u are lower and upper bound vectors
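For instance (a sketch with illustrative values):

bc1 = BoundConstraints(3)                        # [0.0,1.0]^3
bc2 = BoundConstraints(Float32, 3)               # [0.0f0,1.0f0]^3
bc3 = BoundConstraints([0.0, -Inf], [1.0, Inf])  # mixed finite and infinite bounds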

NLS_Solver.LevenbergMarquardt_BC_Conf (Type)
mutable struct LevenbergMarquardt_BC_Conf <: Abstract_BC_Solver_Conf
    ...
end

Use this constructor

LevenbergMarquardt_BC_Conf()

to initialize the bound constrained Levenberg-Marquardt solver default configuration parameters.

To solve a problem with this method, you must then call:

solve(nls::AbstractNLS, θ_init::AbstractVector, bc::BoundConstraints, conf::Abstract_BC_Solver_Conf)
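A minimal sketch of the call sequence, where nls stands for any AbstractNLS instance (for instance one built with create_NLS_problem_using_ForwardDiff below) and the numerical values are illustrative:

θ_init = [0.5, 0.5]
bc     = BoundConstraints([-1.0, -1.0], [1.0, 1.0])
conf   = LevenbergMarquardt_BC_Conf()
result = solve(nls, θ_init, bc, conf)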

NLS_Solver.LevenbergMarquardt_Conf (Type)
mutable struct LevenbergMarquardt_Conf <: Abstract_Solver_Conf
    ...
end

Use this constructor

LevenbergMarquardt_Conf()

to initialize the Levenberg-Marquardt solver default configuration parameters.

To solve a problem with this method, you must then call:

solve(nls::AbstractNLS, θ_init::AbstractVector, conf::Abstract_Solver_Conf)
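A minimal sketch, where nls stands for any AbstractNLS instance:

θ_init = [0.5, 0.5]    # initial guess
conf   = LevenbergMarquardt_Conf()
result = solve(nls, θ_init, conf)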

NLS_Solver.create_NLS_problem_using_ForwardDiff (Method)
create_NLS_problem_using_ForwardDiff(r::Function, domain_image_dim::Pair{Int,Int})

Creates a specialized AbstractNLS instance whose eval_r_J function is automatically defined using automatic differentiation.

  • r is a function that maps a parameter vector θ to its residue. The Jacobian matrix is computed using the ForwardDiff package.
  • domain_image_dim is a pair of the form θ length => r length that defines domain and codomain dimensions.

Usage example

An example defining the Rosenbrock function

\[\frac{1}{2}\|r(\theta)\|^2\text{ where }r = \sqrt{2} \left( \begin{array}{c} 1-\theta_1 \\ 10(\theta_2-\theta_1^2) \end{array} \right)\]

nls = create_NLS_problem_using_ForwardDiff(2 => 2) do θ
     sqrt(2) * [ 1-θ[1], 10*(θ[2]-θ[1]^2) ]
end
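The returned nls behaves like any other AbstractNLS instance; it can be passed directly to solve, e.g. solve(nls, [-1.2, 1.0], LevenbergMarquardt_Conf()); the known minimizer of this problem is $θ=(1,1)$.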
NLS_Solver.eval_r (Method)
eval_r(nls::AbstractNLS,
        θ::AbstractVector) -> r

Compute the residue vector $\mathbf{r}$.

NLS_Solver.eval_r_J (Method)
eval_r_J(nls::AbstractNLS,θ::AbstractVector) -> (r,J)

Compute the residue vector $\mathbf{r}$ and its Jacobian $\mathbf{J}$.
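For example, with the Rosenbrock problem nls built above (a sketch):

θ = [0.0, 0.0]
r = eval_r(nls, θ)         # residue vector at θ
r, J = eval_r_J(nls, θ)    # residue vector and Jacobian at θ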


Private

Base.:+ (Method)
Base.:+(bc::BoundConstraints,τ::AbstractArray)

Translate bound constraints

\[[a+τ,b+τ] = [a,b]+τ\]

See: BoundConstraints
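For instance, a sketch translating $[0,1]^2$ by $τ=(1,2)$:

bc = BoundConstraints(2)        # [0.0,1.0]^2
bc_shifted = bc + [1.0, 2.0]    # expected bounds: [1,2] × [2,3]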

Base.:- (Method)
Base.:-(bc::BoundConstraints,τ::AbstractArray)

Translate bound constraints

\[[a-τ,b-τ] = [a,b]-τ\]

See: BoundConstraints

NLS_Solver.check_first_order (Method)
check_first_order(∇f::AbstractVector{<:Real},
                  xstar::AbstractVector{<:Real},
                  bc::BoundConstraints{<:Real,1})

check_first_order(Q::Symmetric{<:Real},
                  q::AbstractVector{<:Real},
                  xstar::AbstractVector{<:Real},
                  bc::BoundConstraints{<:Real,1})

Check First-Order Conditions (see Bound Constrained Optimization slides)

If $x^\star=\arg\min\limits_{x\in[l,u]} f(x)$, then:

\[\partial_i f(x^\star) \left\{\begin{array}{ll} \ge 0, & \text{if } x^\star[i] = l[i] \\ = 0, & \text{if } l[i] < x^\star[i] < u[i] \\ \le 0, & \text{if } x^\star[i] = u[i] \\ \end{array} \right.\]

This is equivalent to:

\[x^\star = P_{[l,u]}(x^\star-\nabla f(x^\star))\]

According to the previous result, this function returns:

\[\max \left| x^\star - P_{[l,u]}\left(x^\star-(Q x^\star+q)\right) \right|\]

For a local stationary point, this quantity must be zero.

The second signature is a wrapper that computes $∇f=Q x^\star+q$ and then calls the first one.
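As a plain-Julia illustration of this criterion (not the package function itself), the returned quantity can be computed with a componentwise clamp playing the role of the projection $P_{[l,u]}$:

# Componentwise projection of y onto [l,u]
project(y, l, u) = clamp.(y, l, u)

# max |x - P(x - ∇f)|: zero at a first-order stationary point
first_order_criterion(x, ∇f, l, u) = maximum(abs.(x .- project(x .- ∇f, l, u)))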

NLS_Solver.clean_τ! (Method)
clean_τ!(τ::AbstractArray{<:Real},              
         Z::AbstractArray{BoundConstraint_Enum})

By definition τ=-Qx-q. If the algorithm converged, then one must have τ[i]=0 when the constraint is inactive.

This function updates τ by overwriting τ[i]=0 when Z[i]=inactive.

NLS_Solver.compute_δL_unconstrained (Method)

Compute δL = L(0)-L(h) where L is the quadratic model

\[L(h)=f(\theta) + \langle \nabla f, h \rangle + \frac{1}{2}\langle \nabla^2 f h, h \rangle\]

with $f(\theta)=\frac{1}{2}\| r(θ) \|_2^2$, $\nabla f = J^t r$ and $\nabla^2 f = J^t J$

A direct computation gives:

\[δL = L(0)-L(h) = -\left( \langle J^tr, h \rangle + \frac{1}{2}\langle \nabla^2 f h, h \rangle \right)\]

However, one can avoid the computation of $\nabla^2 f \cdot h$ by using the fact that $h$ is the solution of:

\[(\nabla^2 f + \mu I)h + \nabla f = 0\]

With this hypothesis, one gets:

\[δL = L(0)-L(h) = \frac{1}{2} \langle h, \mu h - \nabla f \rangle\]
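The agreement between the two expressions can be checked numerically; a self-contained sketch, independent of the package internals:

using LinearAlgebra

J = [1.0 0.0; 0.0 2.0; 1.0 1.0]   # an arbitrary 3×2 Jacobian
r = [1.0, -1.0, 0.5]              # an arbitrary residue vector
μ = 0.1

∇f = J' * r                       # gradient J^t r
H  = J' * J                       # approximate Hessian J^t J
h  = -((H + μ*I) \ ∇f)            # LM step: (∇²f + μI)h = -∇f

δL_direct = -(dot(∇f, h) + 0.5 * dot(H * h, h))
δL_cheap  = 0.5 * dot(h, μ*h - ∇f)   # same value, without the ∇²f⋅h product
# δL_direct ≈ δL_cheap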

NLS_Solver.compute_δf (Method)

Compute the true variation of the real model: $δf = \frac{1}{2}(r^t(θ)r(θ)-r^t(θ+h)r(θ+h))$

Compared to $δL$, things are simpler here. However, a useful trick is to use the equivalent formulation:

\[δf = \frac{1}{2}(r^t(θ)r(θ)-r^t(θ+h)r(θ+h)) = \frac{1}{2}(r(θ)-r(θ+h))^t(r(θ)+r(θ+h))\]

which has better numerical behavior.
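The gain is the classical cancellation argument: when $r(θ)$ and $r(θ+h)$ are close, subtracting the two squared norms loses most significant digits, whereas the factored form computes the small difference first. A sketch:

r0 = [1.0 + 1e-9, 2.0]    # r(θ)
r1 = [1.0, 2.0]           # r(θ+h), very close to r(θ)

δf_naive    = 0.5 * (sum(abs2, r0) - sum(abs2, r1))
δf_factored = 0.5 * sum((r0 .- r1) .* (r0 .+ r1))   # better conditioned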

NLS_Solver.initialize_x_Z (Method)
initialize_x_Z(x_init::AbstractArray,
               bc::BoundConstraints)

Create (x,Z) from initial guess x_init and bound constraints bc

Z is created by recording how x_init fulfils the bound constraints bc.

x is the projection of x_init onto the bounded domain [l,u].

NLS_Solver.multiplier_τ (Method)
multiplier_τ(::Abstract_BC_QuadSolver_Result)

Returns the multipliers stored in a compact form (see τ definition, TODO)

NLS_Solver.quadratic_subproblem (Method)

Solve

\[\min\limits_{h∈[θ^l-θ,θ^u-θ]}\frac{1}{2}h^t.(H+μI).h + ∇f^t.h\]

We use the quadratic model of $f$; the bound constraints are chosen such that the step $h$ keeps the update $θ+h$ inside the $[θ^l,θ^u]$ bounds.

NLS_Solver.restrict_to_inactive! (Method)
restrict_to_inactive!(Q::Symmetric,
                      q::AbstractVector,
                      Z::AbstractVector{BoundConstraint_Enum},
                      lb::AbstractVector{<:Real},
                      ub::AbstractVector{<:Real})
restrict_to_inactive!(Q::Symmetric,
                      q::AbstractVector,
                      Z::AbstractVector{BoundConstraint_Enum},
                      bc::BoundConstraints{<:Real,1})

In-place modification of $(Q,q)$ that produces $(\tilde{Q},\tilde{q})$ such that the initial optimization problem:

\[\tilde{x} = \arg\min \frac{1}{2} x^t Q x + q^tx\]

under these constraints:

\[x[i] = \left\{\begin{array}{ll} l[i], & \text{if } Z[i] = -1 \\ u[i], & \text{if } Z[i] = +1 \end{array}\right.\]

is transformed into this linear system:

\[\tilde{Q}\tilde{x}+\tilde{q}=0\]

NLS_Solver.update_Z! (Method)
update_Z!(x::AbstractVector,
          τ::AbstractVector,
          Z::AbstractVector{BoundConstraint_Enum},
          lb::AbstractVector,
          ub::AbstractVector)
update_Z!(x::AbstractVector,
          τ::AbstractVector,
          Z::AbstractVector{BoundConstraint_Enum},
          bc::BoundConstraints{<:Real,1})

This function updates Z according to x, τ and the bound values lb, ub.

It also counts how many changes were made during this update.

No change means that the algorithm has converged.

Note: this function only modifies Z and returns the number of bad hypotheses.

NLS_Solver.update_x! (Method)
update_x!(x::AbstractArray,
          Z::AbstractArray{BoundConstraint_Enum},
          bc::BoundConstraints)

Update x value such that:

\[x[i] = \left\{\begin{array}{ll} l[i], & \text{if } Z[i] = -1 \\ u[i], & \text{if } Z[i] = +1 \end{array}\right.\]

When $Z[i]=0$ the $x[i]$ value is unaffected.
