Solving the Rosenbrock Problem

Introduction

In this notebook we want to minimize the “Rosenbrock function”. According to Wikipedia, its general form is given by

\[\begin{align*} f : \mathbb{R}^2 &\to \mathbb{R}\\ (x, y) &\mapsto (a-x)^2 + b(y-x^2)^2 \end{align*}\]

for \(a \in \mathbb{R}\) and \(b > 0\). This function has its global minimum at \(x^* = (a, a^2)\) with \(f(x^*) = 0\). In our case we set \(a=1\) and \(b=100\), so the global minimum is at \((1,1)\). The function looks like this:

import numpy as np
import matplotlib.pyplot as plt
# import rosenbrock function and its derivatives
from oppy.tests.costfunctions import rosenbrock, gradient_rosenbrock, hessian_rosenbrock
# import additional plotting modules
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import cm
# ==================
# surface plot
# ==================
xDef = np.linspace(-2, 2, 100)
yDef = np.linspace(-0.5, 3, 100)
X,Y = np.meshgrid(xDef, yDef)

rosenbrock_vec = lambda x,y: rosenbrock(np.array([x,y]))
Z = rosenbrock_vec(X,Y)

fig = plt.figure(1)
ax = fig.add_subplot(111, projection='3d')
surf = ax.plot_surface(X,Y,Z,cmap=cm.gist_rainbow)
plt.title("Rosenbrock function: $a=1, b=100$")
plt.show()

# ==================
# contour plot
# ==================
xDef = np.linspace(-0.5, 2, 100)
yDef = np.linspace(-0.5, 2, 100)
X,Y = np.meshgrid(xDef, yDef)

rosenbrock_vec = lambda x,y: rosenbrock(np.array([x,y]))
Z = rosenbrock_vec(X,Y)

plt.figure(2)
cp = plt.contour(X,Y,Z,np.logspace(-0.5,3.5,7,base=10),cmap=cm.coolwarm)
plt.clabel(cp, inline=True, fontsize=12)
plt.plot(1.0,1.0, "rx", label="x* = (1,1)")
plt.title("Rosenbrock function: $a=1, b=100$")
plt.legend(loc="upper right")
plt.show()
[Figure: 3D surface plot and contour plot of the Rosenbrock function for \(a=1, b=100\); the red cross in the contour plot marks the minimum \(x^* = (1,1)\).]
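
Besides function values, the gradient-based methods used later also need derivatives. For reference, the gradient and Hessian of \(f\) are

\[\begin{align*} \nabla f(x, y) &= \begin{pmatrix} -2(a-x) - 4bx(y - x^2) \\ 2b(y - x^2) \end{pmatrix}, & \nabla^2 f(x, y) &= \begin{pmatrix} 2 - 4b(y - x^2) + 8bx^2 & -4bx \\ -4bx & 2b \end{pmatrix}, \end{align*}\]

so the gradient indeed vanishes at \(x^* = (a, a^2)\). The imported gradient_rosenbrock and hessian_rosenbrock helpers presumably implement exactly these expressions.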

This function is a good test case for our optimization algorithms, since the global minimum lies inside a long, narrow, parabolically shaped flat valley. Most minimization methods reach this valley easily, but converging to the global minimum along it is usually hard.
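
To illustrate this, the following small sketch (not part of the original notebook) runs plain steepest descent with a fixed step size, reusing the rosenbrock and gradient_rosenbrock helpers imported above and assuming gradient_rosenbrock accepts the same array argument as rosenbrock; the starting point, step size and iteration cap are arbitrary choices for illustration.

# ==================
# illustration: fixed-step steepest descent (sketch, not an oppy solver)
# ==================
x = np.array([-1.0, 1.0])          # start on the far side of the valley
step = 1e-3                        # small fixed step size, chosen ad hoc
for k in range(20000):
    g = gradient_rosenbrock(x)     # assumed to accept np.array([x, y]) like rosenbrock
    if np.linalg.norm(g) < 1e-6:   # stop once the gradient is (almost) zero
        break
    x = x - step * g               # steepest-descent update
print(k, x, rosenbrock(x))         # progress along the valley is very slow

Even after thousands of such iterations the iterate typically only creeps along the valley floor towards \((1,1)\), which is exactly the behaviour the line-search and second-order methods compared below are designed to overcome.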

Minimization of the Rosenbrock function via oppy

Conclusion: Comparison of convergence rates

In the cell below, all results obtained with the Wolfe line search are plotted in one graph.

#==============================================================================
# Plot the norm of the gradient for all schemes (Wolfe line search)
#==============================================================================

plt.figure(figsize=(15, 10))

plt.semilogy(result_dict["res_grad_wolfe"].res, label="res_grad_wolfe")
plt.semilogy(result_dict["res_newton_wolfe"].res, label="res_newton_wolfe")
plt.semilogy(result_dict["res_quasi_wolfe"].res, label="res_quasi_wolfe")
plt.semilogy(result_dict["res_nonlinear_cg_wolfe"].res, label="res_nonlinear_cg_wolfe")

plt.legend(loc="lower right")
plt.title("Convergence results for different methods")
plt.ylabel("Norm of the gradient")
plt.xlabel("Number of iterations")
plt.show()
[Figure: semilogarithmic plot of the gradient norm over the iterations for the gradient, Newton, quasi-Newton and nonlinear CG methods with Wolfe line search.]