Least Squares Optimization

leastSquares: Least Squares

Subpackage which provides methods for linear and nonlinear least squares problems, e.g.:

\[\min_x \Vert Ax - b \Vert_2\]

and

\[\min_{x \in \mathbb{R}^n} f(x),\]

for an objective function \(f\) having the special form \(f(x)=\frac{1}{2}\sum\limits_{j=1}^m r_j^2(x)=\frac{1}{2}\|r(x)\|_2^2\).

Right now we can solve these kinds of problems with the following methods.

  • Linear Least Squares
    1. linear least squares (solving the normal equations)
  • Nonlinear Least Squares
    1. Gauss-Newton algorithm with several configurable options.
    2. Levenberg-Marquardt with a line-search or trust-region strategy.

Documentation is available in the docstrings.

Methods

oppy.leastSquares.linear_least_square module

Linear Least Squares Algorithm.

This module contains the function linear_least_square() and all helper functions for solving the linear least squares problem:

\[\min_x \Vert Ax - b \Vert_2\]
oppy.leastSquares.linear_least_square.linear_least_square(A, b, options=None)

Linear Least Squares.

This function solves the normal equations, i.e. a linear least squares problem of the form

\[\Vert Ax' - b \Vert_2 = \min_x \Vert Ax - b \Vert_2\]

by using either Cholesky decomposition, QR decomposition or singular value decomposition. Matrices in sparse format are not supported. In case you need a least squares function for sparse matrices anyway,

scipy.sparse.linalg.lsqr

may be the desired function.
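
For instance, a sparse problem could be handled along the following lines (a minimal sketch; the shapes and density are arbitrary):

>>> import numpy as np
>>> from scipy.sparse import random as sparse_random
>>> from scipy.sparse.linalg import lsqr
>>> A_sparse = sparse_random(500, 50, density=0.05, format='csr')
>>> b = np.random.rand(500)
>>> x = lsqr(A_sparse, b)[0]  # lsqr returns a tuple; the solution is its first entry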

Parameters:
  • A (numpy.ndarray, shape (m,n)) – Matrix A of the least squares problem.
  • b (numpy.ndarray, shape (m,)) – RHS b of the least squares problem.
  • options (Options, optional) –

    Options for linear_least_square, represented as an Options object. The default is None. Possible options are:

    • disp (bool, optional)
      If disp is True the method will show some results during the process. The default is False.
    • fargs (str, optional)
      Specifies the decomposition method: 'chol' for Cholesky decomposition, 'QR' for QR decomposition and 'svd' for singular value decomposition. If fargs=None, the QR decomposition is performed. The default is None.
Returns:

x – Solution of the least squares problem.

Return type:

numpy.ndarray, shape (n,)

Examples

In this example, a problem with a large dense matrix is solved.

>>> import numpy as np
>>> from oppy.leastSquares import linear_least_square
>>> m = 500
>>> n = 50
>>> A = np.random.rand(m, n)
>>> b = np.random.rand(m)
>>> x = linear_least_square(A, b)
>>> print(np.linalg.norm(A.dot(x) - b))  # may vary
6.143064057239379
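
As a sanity check, the result can be compared against NumPy's built-in least squares solver; for a full-rank dense problem both should agree up to rounding:

>>> x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
>>> np.allclose(x, x_ref)
True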

References

W. Dahmen, A. Reusken. Numerik für Ingenieure und Naturwissenschaftler, see Dahmen and Reusken [DR06].

Luik. “Numerik 1”. Lecture Notes, see Luik [Lui10].
oppy.leastSquares.linear_least_square.bidiagonal_decomposition(A)

Determine the bidiagonal decomposition of a matrix.

This function determines the bidiagonal decomposition of A s.t.

\[A = Q_L \cdot B \cdot Q_R.\]
Parameters: A (numpy.ndarray, shape (m,n)) – matrix A.
Returns: res – Contains [QL, A, QR] in a list.
Return type: list
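
The factorization can be verified numerically (a hedged sketch, assuming the returned list unpacks as QL, B, QR with the middle entry holding the bidiagonalized matrix):

>>> import numpy as np
>>> from oppy.leastSquares.linear_least_square import bidiagonal_decomposition
>>> A = np.random.rand(6, 4)
>>> QL, B, QR = bidiagonal_decomposition(A.copy())
>>> np.allclose(QL @ B @ QR, A)  # should recover A up to rounding
True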
oppy.leastSquares.linear_least_square.singular_value_decomposition(A)

Calculate the singular value decomposition of a matrix A.

This function calculates the singular value decomposition of a matrix A such that

\[U^TAV = \Sigma,\]

where \(U\) and \(V\) are orthogonal matrices and \(\Sigma\) is a diagonal matrix containing the singular values of \(A\).

Parameters: A (numpy.ndarray, shape (m,n)) – matrix A.
Returns:
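
The defining identity can be illustrated with NumPy's reference SVD (this sketch uses numpy.linalg.svd, not the function above):

>>> import numpy as np
>>> A = np.random.rand(5, 3)
>>> U, s, Vt = np.linalg.svd(A)
>>> Sigma = np.zeros((5, 3))
>>> np.fill_diagonal(Sigma, s)
>>> np.allclose(U.T @ A @ Vt.T, Sigma)  # U^T A V = Sigma with V = Vt.T
True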
oppy.leastSquares.linear_least_square.singular_value_solution(A, b)

Solve the least squares problem using the SVD.

This function solves the linear problem

\[\Vert Ax' - b \Vert_2 = \min_x \Vert Ax - b \Vert_2\]

by using the singular value decomposition, similar to Chapter 4.7 in Dahmen and Reusken [DR06].

Parameters:
  • A (numpy.ndarray, shape (m,n)) – Matrix of the least squares problem.
  • b (numpy.ndarray, shape (m,)) – RHS of the least squares problem.
Returns:

x – Solution of the least squares problem.

Return type:

numpy.ndarray, shape (n,)

References

W. Dahmen, A. Reusken. Numerik für Ingenieure und Naturwissenschaftler, see Dahmen and Reusken [DR06].
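
For orientation, an SVD-based least squares solve amounts to only a few lines (a minimal NumPy sketch for a matrix with full column rank, not the oppy implementation itself):

import numpy as np

def svd_lsq(A, b):
    # x = V * Sigma^{-1} * U^T * b, using the reduced SVD of A
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return Vt.T @ ((U.T @ b) / s)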

oppy.leastSquares.linear_least_square.signum(x)

Sign function with a modified convention at zero.

This function modifies the standard sign function such that sign(0) = 1.

Parameters: x (numpy.ndarray, shape (n,)) – Input vector x.
Returns: sigma – The sign of x, with sign(0) = 1.
Return type: numpy.ndarray, shape (n,)
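
The documented convention can be sketched as follows (an illustrative reimplementation, not necessarily the oppy source):

import numpy as np

def signum(x):
    # Standard sign, except that zero entries map to +1.
    sigma = np.sign(x)
    sigma[sigma == 0] = 1
    return sigma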
oppy.leastSquares.linear_least_square.sort(x, A)

Sort entries of a vector in decreasing order.

This function sorts the entries of a vector x in decreasing order and permutes the columns of a given matrix in the corresponding order.

Parameters:
  • x (numpy.ndarray, shape (n,)) – Vector whose entries are sorted.
  • A (numpy.ndarray) – Matrix whose columns are permuted in the corresponding order.
Returns:
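
The intended behavior can be sketched with numpy.argsort (an illustrative reimplementation, not necessarily the oppy source):

import numpy as np

def sort(x, A):
    # Indices that order x in decreasing order.
    idx = np.argsort(x)[::-1]
    # Reorder x and permute the columns of A consistently.
    return x[idx], A[:, idx]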

oppy.leastSquares.nonlinear_least_square

Nonlinear Least-Squares Algorithm.

This module contains the function nonlinear_least_square() and all helper functions to solve the minimization problem:

\[\min_{x \in \mathbb{R}^n} f(x),\]

for an objective function \(f\) having the special form \(f(x)=\frac{1}{2}\sum\limits_{j=1}^m r_j^2(x)=\frac{1}{2}\|r(x)\|_2^2\).

The residual vector \(r(x)=(r_1(x),\ldots,r_m(x))^T\) contains \(m\) smooth functions \(r_j:\mathbb{R}^n\rightarrow\mathbb{R}\). Furthermore it is assumed that \(m \geq n\) is satisfied.
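
For later reference, the gradient and the Gauss-Newton approximation of the Hessian of such an \(f\) take a particularly simple form in terms of the Jacobian \(J(x)\) of \(r\):

\[\nabla f(x) = J(x)^T r(x), \qquad \nabla^2 f(x) = J(x)^T J(x) + \sum_{j=1}^m r_j(x) \nabla^2 r_j(x) \approx J(x)^T J(x),\]

which is why the methods below only require the residual vector and its Jacobian.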

oppy.leastSquares.nonlinear_least_square.nonlinear_least_square(fhandle, dfhandle, x, options=None)

Nonlinear Least-Squares Algorithm.

Parameters:
  • fhandle (callable) –

    The residual vector r. The objective function to be minimized is built from it:

    fhandle(x, *fargs) -> array_like, shape (m,).

    where \(x\) is a 1-D array with shape (n,) and fargs is a tuple of the fixed parameters needed to completely specify the function.

  • dfhandle (callable) –

    The Jacobian matrix of the residual vector r:

    dfhandle(x, *fargs) -> array_like, shape (m, n).

    where \(x\) is a 1-D array with shape (n,) and fargs is a tuple of the fixed parameters needed to completely specify the function.

  • x (numpy.ndarray, shape (n,)) – Initial guess. Array of real elements of size (n,), where ‘n’ is the number of independent variables.
  • options (Options, optional) –

    Options for the nonlinear least-squares algorithm, represented as an Options object. The default is None. Possible options are:

    • method (str)
      Specifies which method is used to solve the nonlinear least-squares problem. Two line-search based methods are available: gauss_newton (the Gauss-Newton method) and levenberg_marquardt (a Levenberg-Marquardt variant of the Gauss-Newton method). The input levenberg_marquardt_trust_region applies a Levenberg-Marquardt method within a trust-region approach. The default is levenberg_marquardt_trust_region.
    • tol_abs (float)
      Absolute tolerance for the termination condition, initialized with \(10^{-7}\).
    • tol_rel (float)
      Relative tolerance for the termination condition, initialized with \(10^{-7}\).

    Note

    The optimization algorithm stops, if the norm of the current residual res[k] satisfies the inequality

    \[\mathrm{res}[k] < \mathrm{stopping\_criteria},\]

    where stopping_criteria is given by

    \[\mathrm{stopping\_criteria} = \mathrm{tol\_rel} \cdot \mathrm{res}[0] + \mathrm{tol\_abs},\]

    and res[0] is the norm of the first residual. If the value of stopping_criteria is bigger than \(0.1\), e.g. if the norm of the first residual is large while the relative tolerance is small, then this value is capped at \(0.1\) (see the sketch after this options list).

    • fargs (tuple)
      Tuple containing additional arguments for fhandle.
    • nmax (int or float)
      Maximum number of iterations, initialized with 500.
    • linesearch (callable)
      Callable function that computes a step length. The default is None and therefore initialized to wolfe().
    • disp (bool)
      If disp == True, some details about the progress will be shown during the optimization.
    • norm (callable)
      Callable function that computes the norm of the gradient \(\nabla f\) at a vector \(x\). The default is None and therefore initialized to the standard norm in \(\mathbb{R}^n\), i.e.
      \[\mathrm{norm}: x \mapsto ||x||_2\]

      Note

      If you want to change the norm, please remember that you also have to change the inner_prod function in your linesearch algorithm, such that the norm is computed by

      \[\mathrm{norm} : x \mapsto \sqrt{\mathrm{inner\_prod}(x,x)}.\]
    • solve_linear (str)
      Algorithm that is chosen for solving the linear system
      \[J(x_k)^T J(x_k)\, p = -J(x_k)^T r(x_k)\]

      to obtain the search direction \(p_k\). It is initialized with qr (QR factorization); chol (Cholesky factorization) and svd (singular value decomposition) are also available.

    • delta (float)
      Initial trust-region radius. The default is 100.
    • delta_max (float)
      Upper bound for the trust-region radius. The default is \(10^{12}\).
    • eta (float)
      Threshold which decides whether a step is accepted. It has to be chosen in \((0,1)\). The default is 0.0001.
    • sigma (float)
      Possible deviation from the trust-region constraint. The default is 0.1.
    • scaling (bool)
      If set to True, the scaling of the problem is incorporated. The default is False.
    • results (str or list)
      String or list to specify the attributes for the Results object, initialized with 'normal'.
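
As referenced in the note above, the termination test amounts to a few lines (a hedged sketch with a hypothetical function name, not the oppy source):

def stopped(res, k, tol_rel=1e-7, tol_abs=1e-7):
    # Hypothetical sketch of the stopping test described in the note above.
    stopping_criteria = tol_rel * res[0] + tol_abs
    stopping_criteria = min(stopping_criteria, 0.1)  # cap when res[0] is large
    return res[k] < stopping_criteria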
Returns:

results – Instance of the class Results, representing the optimization results. The attributes of the object depend on the choice of the options.results parameter. With the default choice the object has the following attributes:

  • x (numpy.ndarray, shape (n,))

    Solution vector of the optimization problem.

  • iterations (int)

    Number of iterations the algorithm needed.

  • res (list)

    List with the norms of the residuals.

Return type:

Results

Examples

Minimizing the Rosenbrock function.

>>> from oppy.tests.costfunctions import rosenbrock_least_square,\
... gradient_rosenbrock_least_square
>>> import numpy as np

The nonlinear_least_square() function is chosen to minimize the function.

>>> from oppy.leastSquares.nonlinear_least_square import \
... nonlinear_least_square

Additionally, a starting value has to be chosen.

>>> x0 = np.array([6,3])

Now we can execute the nonlinear_least_square() function:

>>> result = nonlinear_least_square(rosenbrock_least_square,
...                                  gradient_rosenbrock_least_square, x0)
>>> print(result)
 iterations: 2
        res: array([7.94844897e+04, 1.11803399e+04, 2.24792062e-13])
          x: array([1., 1.])

References

J. Nocedal, S.J. Wright. Numerical Optimization, see Nocedal and Wright [NW06], pp. 245-257.

J.J. Moré. “The Levenberg-Marquardt algorithm: Implementation and theory”, see Moré [Mor78].

oppy.leastSquares.nonlinear_least_square.givens_qr(R_I, n, m)

Calculate a QR decomposition of a special matrix.

This function calculates the \(QR\) decomposition of a given \((m+n, n)\) matrix, which is already upper triangular except for \(n\) elements. The function is used in the implementation of the Levenberg-Marquardt method as a trust-region method. More precisely, the input matrix has the form:

\[\begin{split}\left[ \begin{array}{c} R \\ 0 \\ \sqrt{\lambda}I \end{array} \right]\end{split}\]

where \(R\) is an upper triangular matrix. The algorithm performs \((n(n+1))/2\) Givens rotations to get its QR decomposition.

Parameters:
  • R_I (numpy.ndarray, shape (m+n, n)) – The input matrix R_I is already upper triangular, except for the elements \((m+1, 1), (m+2, 2), ..., (m+n, n)\).
  • n (int) – Number of columns of the input matrix.
  • m (int) – Integer such that the number of rows of R_I is \(m+n\).
Returns:

  • R_I (numpy.ndarray, shape (m+n, n)) – Rotated matrix R_I, which is now an upper triangular matrix.
  • Q.T (numpy.ndarray, shape (m+n, m+n)) – This is the transpose of the product of all rotations, which were used to rotate the input matrix R_I.
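
For orientation, the stacked input matrix from the description above can be assembled as follows (an illustrative sketch; the shapes follow the docstring, and lam is a hypothetical damping parameter):

import numpy as np

n, m, lam = 3, 5, 0.5
R = np.triu(np.random.rand(n, n))   # upper triangular block
R_I = np.vstack([
    R,                              # rows 1, ..., n
    np.zeros((m - n, n)),           # rows n+1, ..., m
    np.sqrt(lam) * np.eye(n),       # rows m+1, ..., m+n with entries (m+j, j)
])                                  # shape (m+n, n)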
oppy.leastSquares.nonlinear_least_square.givens_rotation(A, i, j)

Calculate one Givens rotation.

This function calculates one Givens rotation which rotates a given matrix A such that the entry A[i, j] becomes zero.

Parameters:
  • A (numpy.ndarray, shape (m+n, n)) – Input matrix which is rotated.
  • i (int) – Row number of the entry which will be zero after the rotation.
  • j (int) – Column number of the entry which will be zero after the rotation.
Returns:

  • B (numpy.ndarray, shape (m+n, n)) – Givens rotated matrix B.
  • Q (numpy.ndarray, shape (m+n, m+n)) – Givens rotation matrix, it is \(Q \cdot A = B\).
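
A compact sketch of such a rotation (an illustrative reimplementation which assumes that A[i, j] is eliminated against the pivot A[j, j]; the actual oppy source may differ):

import numpy as np

def givens_rotation(A, i, j):
    # Rotate rows j and i so that A[i, j] becomes zero.
    a, b = A[j, j], A[i, j]
    r = np.hypot(a, b)
    c, s = a / r, b / r
    Q = np.eye(A.shape[0])
    Q[j, j] = Q[i, i] = c
    Q[j, i], Q[i, j] = s, -s
    return Q @ A, Q  # B = Q @ A satisfies B[i, j] == 0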