{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Solving Nonlinear Least Squares Problems Using the Gauß-Newton Method\n", "\n", "**The example in this notebook is based on a measurement series taken from [1]**.\n", "\n", "[1] A. Croeze, L. Pittman, andW. Reynolds. “Solving nonlinear least-squares problems with the Gauss-Newton and Levenberg-Marquardt methods”. en. In: (2012), p. 16.\n", "\n", "## Introduction\n", "\n", "In this notebook we want to use the Gauß-Newton method to perform the following model fitting process serving as an example for a nonlinear least squares problem:\n", "\n", "We use population data in the United states between 1815 and 1885 [1], as shown in the following Table. The period of time is mapped to the interval $[1,8]$ to simplify further computations. The measured values $y_i$ are the average number of citizens in the United states in the corresponding year." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "| year | 1815 | 1825 | 1835 | 1845 | 1855 | 1865 | 1875 | 1885 |\n", "| -: | :-: | :-: | :-: | :-: | :-: | :-: | :-: | :-: |\n", "| t | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |\n", "| y | 8.3 | 11.0 | 14.7 | 19.7 | 26.7 | 35.2 | 44.4 | 55.9" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following function is utilized to model the given data points:\n", "\\begin{equation*}\n", " \\varphi(x,t) = x_0\\exp{(x_1t)}\n", "\\end{equation*}\n", "where $\\varphi(x,t)$ describes the modeled number of citizens of the United States. The time points $t$ are given by mapping the year values as described above. The vector $x\\in\\mathbb{R}^2$ holds the two parameters that are used to fit the model.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "is_executing": true, "name": "#%%\n" } }, "outputs": [], "source": [ "import numpy as np\n", "\n", "# measured values\n", "time = [1, 2, 3, 4, 5, 6, 7, 8]\n", "measured = [8.3, 11.0, 14.7, 19.7, 26.7, 35.2, 44.4, 55.9]\n", "\n", "# model and its derivative\n", "def model(x, t):\n", " return x[0]*np.exp(x[1]*t)\n", "\n", "def dmodel(x, t):\n", " derivation = np.array([np.exp(x[1]*t),\n", " t*x[0]*np.exp(x[1]*t)])\n", " return derivation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Model Fitting via oppy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Preparation: Building the Sum of Squared Residuals (SSR)\n", "\n", "As explained in the notebook about least-squares problems, minimizing the sum of squared residuals (SSR)\n", "\\begin{align*}\n", " \\min_{x\\in\\mathbb{R}^n} \\ \\frac{1}{2}\\sum\\limits_{j=1}^m \\bigl(\\Phi(x,t_j)-y_j\\bigr)^2\n", "\\end{align*}\n", "can be used to fit a model to given data points. In our case, the following SSR is set up:\n", "\\begin{equation*}\n", " \\min_{x\\in\\mathbb{R}^3} \\ \\frac{1}{2}\\sum\\limits_{j=1}^{30} \\bigl( \\underbrace{x_0\\exp{(x_1t_j)}-y_j}_{r_j(x)}\\bigr)^2\n", "\\end{equation*}\n", "The Gauß-Newton method takes not take the complete SSR as an input, but the residual vector $r(x)$ (and its derivative):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# define residuals (one can use the costfunctions data_least_square and\n", "# gradient_data_least_square integrated in oppy to reduce coding lines)\n", "from oppy.tests.costfunctions import data_least_square, gradient_data_least_square\n", "\n", "def f_fun(x):\n", " return data_least_square(x, model, time, measured)\n", "\n", "def df_fun(x):\n", " return gradient_data_least_square(x, dmodel, time, measured)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Applying the Gauß-Netwon Method\n", "\n", "As a first step, a starting point for the iteration process is set. In addition, further imports are performed that are necessary for the application of the gauss_newton() method." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# import Gauß-Newton method and the options file\n", "from oppy.leastSquares import nonlinear_least_square\n", "from oppy.options import Options\n", "\n", "# starting point:\n", "x0 = np.array([2.5, 0.25])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### a) Using the Gauß-Netwon Method with its Default Setting\n", "In the following line, the Gauss-Newton method is applied. All optional values are left to their default." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "# apply the Gauß-Newton method with its default setting\n", "# in the options the Gauss-Newton method can be chosen:\n", "options = Options(method='gauss_newton')\n", "result = nonlinear_least_square(f_fun, df_fun, x0, options)\n", "\n", "# print the results\n", "print(result)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### b) Using the Gauß-Netwon Method with Further Options\n", "Alternatively, some options can be set (for a complete list of all options, see the method's documentation):" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "options_gauss_newton = Options()\n", "options_gauss_newton.method = 'gauss_newton'\n", "options_gauss_newton.nmax = 100\n", "options_gauss_newton.tol_rel = 1e-6\n", "options_gauss_newton.tol_abs = 1e-6\n", "\n", "# apply the Gauß-Newton method with further options\n", "result_options = nonlinear_least_square(f_fun, df_fun, x0, options_gauss_newton)\n", "\n", "# print the results\n", "print(result_options)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### c) Using the Levenberg-Marquardt Variant of the Gauß-Netwon Method\n", "There also exists a variant of the Gauss-Newton method implemented in the nonlinear_least_square() function, which follows a simple version of the Levenberg-Marquardt algorithm. This variant can be selected via the options:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "pycharm": { "name": "#%%\n" } }, "outputs": [], "source": [ "options_levenberg_marquardt = Options(method='levenberg_marquardt')\n", "\n", "# apply the Levenberg-Marquardt variant of the Gauß-Newton method\n", "result_lev = nonlinear_least_square(f_fun, df_fun, x0, options_levenberg_marquardt)\n", "\n", "# print the results\n", "print(result_lev)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.8" } }, "nbformat": 4, "nbformat_minor": 1 }