2.12.5 Multivariate Newton’s Method

2.12.5  Multivariate Newton’s Method

Newton’s method generalizes naturally to multiple dimensions. We seek a solution x for


where f : [n] → [n] is differentiable. Start with seed value x[1], and recursively obtain subsequent values x[2], x[3], x[4], … as follows. At each value x[k], construct Jacobian approximation [k] for f :


and select x[k + 1] as the solution to


Substituting [2.250] into [2.251] and setting x = x[k + 1] yields the recursive equation


which generalizes [2.142]

Newton’s method entails similar convergence issues in multiple dimensions as in a single dimension. Just as the univariate method fails if f ′(x[k]) = 0, so will the multivariate method fail if Jf (x[k]) is singular. Issues of no solution or multiple solutions also arise.

If an analytic formula for the Jacobian of f is unavailable, there is a generalization of the secant method called Broyden’s method. See Dennis and Schnabel (1983) or Burden and Faires (2010). A simpler solution is to apply Newton’s method using finite differences to approximate the Jacobian matrix.

2.12.6  Line searches

For Newton’s method, define the kth step length as x[k + 1] – x[k]. By construction, [k] tends to approximate f best near x[k], so small step lengths are desirable. It is the nature of Newton’s method that step lengths tend to decrease as it approaches a solution. For this reason, convergence is excellent close to a solution but can be poor further from a solution. Line searches are a technique for selectively shortening step lengths during the first few iterations of Newton’s method. They are invaluable in situations where a seed value x[1] may be some distance from a solution.

Suppose we are performing Newton’s method on a function f : 2 → 2. Newton’s method has just determined point x[k]. It and subsequent point x[k + 1] are plotted in Exhibit 2.20, which is a topographical graph. It indicates contour lines on which f (x) is constant.

Exhibit 2.20: A topographical graph of the norm of f. Numbers next to contour lines indicate the value of f(x) on those lines. In the example, an iteration of Newton’s method causes the norm of f to increase going from x[k] to x[k + 1]. As illustrated, this is due to the large step size.

Since our goal is to find a value x such that f (x) = 0, we expect the norm f (x[k]) to diminish with each iteration of Newton’s method. In the specific iteration illustrated in Exhibit 2.20, this does not happen. Moving along the line that connects x[k] and x[k + 1], the norm f (x) initially does decrease—from over 5 to less than 4. By the time we reach x[k + 1], however, it has risen again to exceed 6. The step length is too long. We may address the problem by searching backwards along the line between x[k] and x[k + 1] to find a more suitable value to use instead of x[k + 1]. This is called a line search. Line searches are performed in various ways. They raise two questions:

  1. When should we use them?
  2. How should we perform them?

When Newton’s method is close to a solution, it doesn’t make sense to perform line searches. Convergence will be rapid anyway, so there is no need to encumber the process. Since we have no way of knowing if Newton’s method is near a solution, we need some criteria to determine at which iterations to perform a line search. A simple criterion is to require7


and perform a line search if the criterion is not met.

If a line search is called for, it is performed by approximating f with a quadratic polynomial on the line between x[k] and x[k + 1]. Define g :  →  as the restriction of f to that line:


We have




and from these construct quadratic approximation for g:


This is minimized at


We set


which satisfies


Let x* be the value for x corresponding to λ*


Replace the unacceptable value for x[k + 1] with x*.

We are not yet done because our new value x[k + 1] was obtained with quadratic approximation [2.158] and a minimum bound of 0.1 for λ*, so it may also fail criterion [2.153]. If it satisfies the criterion, proceed to the next iteration of Newton’s method. Otherwise, perform another line search based upon x[k] and the new x[k + 1].

Line searches may be ineffective if components fi of f have different magnitudes in a neighborhood of a solution. Consider a function f : 2 → 2 whose components f1 and f2 have magnitudes on the order of 106 and 10–4 near a solution. Then f will be dominated by the value of f1, causing criterion [2.153] to largely ignore convergence with regard to f2. This problem can usually be solved with scaling. Define the function


where the ai are selected so that the components of all have similar magnitudes in a neighborhood of a solution. Then apply Newton’s method with line searches to solve



Use Newton’s method with line searches to solve the equation f(x) = 0, where f : 2 → 2 is defined as


Use seed value x[1] = (1, 1). Do not use scaling.