Solving Nonlinear Least Squares Problems Using the Gauß-Newton Method

The example in this notebook is based on a measurement series taken from [1].

[1] A. Croeze, L. Pittman, and W. Reynolds. “Solving nonlinear least-squares problems with the Gauss-Newton and Levenberg-Marquardt methods”. In: (2012), p. 16.

Introduction

In this notebook we use the Gauß-Newton method for a model-fitting task that serves as an example of a nonlinear least-squares problem:

We use population data of the United States between 1815 and 1885 [1], as shown in the following table. The period of time is mapped to the interval \([1,8]\) to simplify further computations. The measured values \(y_i\) are the population of the United States (in millions) in the corresponding year.

year   1815   1825   1835   1845   1855   1865   1875   1885
t         1      2      3      4      5      6      7      8
y       8.3   11.0   14.7   19.7   26.7   35.2   44.4   55.9

The following function is utilized to model the given data points:

\[\begin{equation*} \varphi(x,t) = x_0\exp{(x_1t)} \end{equation*}\]

where \(\varphi(x,t)\) describes the modeled number of citizens of the United States. The time points \(t\) are given by mapping the year values as described above. The vector \(x\in\mathbb{R}^2\) holds the two parameters that are used to fit the model.

import numpy as np

# measured values
time = [1, 2, 3, 4, 5, 6, 7, 8]
measured = [8.3, 11.0, 14.7, 19.7, 26.7, 35.2, 44.4, 55.9]

# model and its derivative
def model(x, t):
    return x[0]*np.exp(x[1]*t)

def dmodel(x, t):
    # derivative of the model with respect to the parameters x
    return np.array([np.exp(x[1]*t),
                     t*x[0]*np.exp(x[1]*t)])
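Before handing the derivative to a solver, it is worth verifying it numerically. The following sketch (not part of the original example) compares dmodel against a central finite-difference approximation at an arbitrary point:

```python
import numpy as np

def model(x, t):
    return x[0]*np.exp(x[1]*t)

def dmodel(x, t):
    # derivative of the model with respect to the parameters x
    return np.array([np.exp(x[1]*t),
                     t*x[0]*np.exp(x[1]*t)])

# central finite-difference approximation of the derivative w.r.t. x
x = np.array([2.5, 0.25])
t = 3.0
h = 1e-6
e0, e1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
fd = np.array([(model(x + h*e0, t) - model(x - h*e0, t)) / (2*h),
               (model(x + h*e1, t) - model(x - h*e1, t)) / (2*h)])

# the analytical and numerical derivatives should agree closely
print(np.max(np.abs(fd - dmodel(x, t))))
```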

Model Fitting via oppy

Preparation: Building the Sum of Squared Residuals (SSR)

As explained in the notebook about least-squares problems, minimizing the sum of squared residuals (SSR)

\[\begin{align*} \min_{x\in\mathbb{R}^n} \ \frac{1}{2}\sum\limits_{j=1}^m \bigl(\Phi(x,t_j)-y_j\bigr)^2 \end{align*}\]

can be used to fit a model to given data points. In our case, the following SSR is set up:

\[\begin{equation*} \min_{x\in\mathbb{R}^2} \ \frac{1}{2}\sum\limits_{j=1}^{8} \bigl( \underbrace{x_0\exp{(x_1t_j)}-y_j}_{r_j(x)}\bigr)^2 \end{equation*}\]

The Gauß-Newton method does not take the complete SSR as input, but rather the residual vector \(r(x)\) (and its derivative):

# define residuals (one can use the costfunctions data_least_square and
# gradient_data_least_square integrated in oppy to reduce coding lines)
from oppy.tests.costfunctions import data_least_square, gradient_data_least_square

def f_fun(x):
    return data_least_square(x, model, time, measured)

def df_fun(x):
    return gradient_data_least_square(x, dmodel, time, measured)
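For readers without oppy installed, the residual vector and its Jacobian can also be assembled directly with NumPy. The sketch below assumes that data_least_square returns the stacked residuals \(r_j(x)=\varphi(x,t_j)-y_j\) and gradient_data_least_square the corresponding Jacobian; the names r_fun and J_fun are illustrative, not part of oppy:

```python
import numpy as np

time = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
measured = np.array([8.3, 11.0, 14.7, 19.7, 26.7, 35.2, 44.4, 55.9])

def r_fun(x):
    # residual vector: r_j(x) = phi(x, t_j) - y_j
    return x[0]*np.exp(x[1]*time) - measured

def J_fun(x):
    # Jacobian: row j holds the partial derivatives of r_j w.r.t. x0 and x1
    return np.column_stack((np.exp(x[1]*time),
                            time*x[0]*np.exp(x[1]*time)))

x0 = np.array([2.5, 0.25])
print(r_fun(x0).shape, J_fun(x0).shape)  # (8,) (8, 2)
```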

Applying the Gauß-Newton Method

As a first step, a starting point for the iteration process is set. In addition, further imports are performed that are necessary for calling the nonlinear_least_square() function with the Gauß-Newton method.

# import Gauß-Newton method and the options file
from oppy.leastSquares import nonlinear_least_square
from oppy.options import Options

# starting point:
x0 = np.array([2.5, 0.25])

a) Using the Gauß-Newton Method with its Default Settings

In the following lines, the Gauß-Newton method is applied. All optional values are left at their defaults.

# apply the Gauß-Newton method with its default setting
# in the options the Gauss-Newton method can be chosen:
options = Options(method='gauss_newton')
result = nonlinear_least_square(f_fun, df_fun, x0, options)

# print the results
print(result)
 iterations: 6
        res: array([1.15448714e+04, 1.18918917e+04, 7.71959039e+02, 3.68152203e+00,
       3.08907754e-02, 1.17316561e-03, 3.79220090e-05])
          x: array([7.00015188, 0.26207664])
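To illustrate what such a solver does internally, here is a bare-bones, undamped Gauss-Newton iteration in plain NumPy. This is a sketch of the textbook algorithm, not a reproduction of oppy's implementation; each step solves the linear least-squares problem \(J\,\Delta x = -r\) for the linearized residuals:

```python
import numpy as np

time = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
measured = np.array([8.3, 11.0, 14.7, 19.7, 26.7, 35.2, 44.4, 55.9])

def residual(x):
    return x[0]*np.exp(x[1]*time) - measured

def jacobian(x):
    return np.column_stack((np.exp(x[1]*time),
                            time*x[0]*np.exp(x[1]*time)))

x = np.array([2.5, 0.25])
for _ in range(50):
    r = residual(x)
    J = jacobian(x)
    # Gauss-Newton step: solve the linear least-squares problem J dx = -r
    dx = np.linalg.lstsq(J, -r, rcond=None)[0]
    x = x + dx
    if np.linalg.norm(dx) < 1e-10:
        break

print(x)  # roughly [7.0002, 0.2621]
```

The parameters agree with the result reported by oppy above.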

b) Using the Gauß-Newton Method with Further Options

Alternatively, some options can be set (for a complete list of all options, see the method’s documentation):

options_gauss_newton = Options()
options_gauss_newton.method = 'gauss_newton'
options_gauss_newton.nmax = 100
options_gauss_newton.tol_rel = 1e-6
options_gauss_newton.tol_abs = 1e-6

# apply the Gauß-Newton method with further options
result_options = nonlinear_least_square(f_fun, df_fun, x0, options_gauss_newton)

# print the results
print(result_options)
 iterations: 5
        res: array([1.15448714e+04, 1.18918917e+04, 7.71959039e+02, 3.68152203e+00,
       3.08907754e-02, 1.17316561e-03])
          x: array([7.00015461, 0.26207658])

c) Using the Levenberg-Marquardt Variant of the Gauß-Newton Method

The nonlinear_least_square() function also implements a variant of the Gauß-Newton method that follows a simple version of the Levenberg-Marquardt algorithm. It can be selected via the options:

options_levenberg_marquardt = Options(method='levenberg_marquardt')

# apply the Levenberg-Marquardt variant of the Gauß-Newton method
result_lev = nonlinear_least_square(f_fun, df_fun, x0, options_levenberg_marquardt)

# print the results
print(result_lev)
 iterations: 6
        res: array([1.15448714e+04, 1.20201297e+04, 7.85136012e+02, 3.80880879e+00,
       3.11472615e-02, 1.18009947e-03, 3.80636286e-05])
          x: array([7.00015188, 0.26207664])
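For comparison, a minimal Levenberg-Marquardt iteration can be sketched as well. The damping update below (scale \(\lambda\) by a factor of 10 depending on whether a step reduces the SSR) is a common textbook scheme and an assumption on our part, not necessarily what oppy implements:

```python
import numpy as np

time = np.array([1., 2., 3., 4., 5., 6., 7., 8.])
measured = np.array([8.3, 11.0, 14.7, 19.7, 26.7, 35.2, 44.4, 55.9])

def residual(x):
    return x[0]*np.exp(x[1]*time) - measured

def jacobian(x):
    return np.column_stack((np.exp(x[1]*time),
                            time*x[0]*np.exp(x[1]*time)))

x = np.array([2.5, 0.25])
lam = 1e-2  # damping parameter
for _ in range(100):
    r = residual(x)
    J = jacobian(x)
    # damped normal equations: (J^T J + lam*I) dx = -J^T r
    A = J.T @ J + lam*np.eye(2)
    dx = np.linalg.solve(A, -J.T @ r)
    if np.sum(residual(x + dx)**2) < np.sum(r**2):
        x = x + dx    # step reduces the SSR: accept it, reduce damping
        lam /= 10.0
    else:
        lam *= 10.0   # step rejected: increase damping
    if np.linalg.norm(dx) < 1e-10:
        break

print(x)  # roughly [7.0002, 0.2621]
```

With a small initial \(\lambda\) the iteration behaves almost like plain Gauss-Newton on this well-behaved problem, matching the parameters reported by oppy above.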