Mykel J. Kochenderfer
Tim A. Wheeler
All rights reserved. Permission is granted, free of charge, to use the code snippets associated with this book, subject to the condition that the source of the code is acknowledged. No other parts of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.
This book was set in TeX Gyre Pagella by the authors in LaTeX.
Printed and bound in the United States of America.
Names: Kochenderfer, Mykel J., 1980- author. | Wheeler, Tim A. (Tim Allan), author.
Title: Algorithms for optimization / Mykel J. Kochenderfer and Tim A. Wheeler.
Description: Cambridge, MA: The MIT Press, [2019] | Includes bibliographical references and index.
Identifiers: LCCN 2018023091 | ISBN 9780262039420 (hardcover: alk. paper)
Subjects: LCSH: Algorithms. | Algorithms--Problems, exercises, etc.
Contents
List of Figures
. Queen Dido, founder of Carthage, was granted as much land as she could enclose with a bullhide thong. She made a semicircle with each end of the thong against the Mediterranean Sea, thus enclosing the maximum possible area. This problem is mentioned in Virgil's Aeneid (19 BCE).
. The design optimization process. We seek to automate the optimization procedure highlighted in blue.
. A one-dimensional optimization problem. Note that the minimum is merely the best in the feasible set; lower points may exist outside the feasible region.
. The problem of minimizing x subject to the constraint x > 1 has no solution because the constraint boundary, x = 1, is not feasible.
. Examples of critical points of interest to optimization algorithms (where the derivative is zero) on a univariate function.
. Examples of the necessary but insufficient conditions for strong local minima.
. The three local regions where the gradient is zero.
. Dependencies for the chapters in Algorithms for Optimization. Lighter arrows indicate weak dependencies.
. The function f is drawn in black and the tangent line to f(x) is drawn in blue. The derivative of f at x is the slope of the tangent line.
. The tangent line is obtained by joining points with sufficiently small step differences.
. Each component of the gradient defines a local tangent line. These tangent lines define the local tangent hyperplane. The gradient vector points in the direction of greatest increase.
. A comparison of the error in derivative estimate for the function sin(x) at x = 1/2 as the step size is varied. The linear error of the forward difference method and the quadratic error of the central difference and complex methods can be seen by the constant slopes on the right-hand side. The complex step method avoids the subtractive cancellation error that occurs when differencing two function evaluations that are close together (a code sketch comparing these estimates follows this list).
. The computational graph for ln(ab + max(a, 2)).
. The computational graph for ln(ab + max(a, 2)) being set up for forward accumulation to calculate ∂f/∂a with a = 3 and b = 2.
. The computational graph for ln(ab + max(a, 2)) after forward accumulation is applied to calculate ∂f/∂a with a = 3 and b = 2.
. Three points shown bracketing a minimum.
. An example of running bracket_minimum on a function (a sketch of this routine follows this list). The method reverses direction between the first and second iteration and then expands until a minimum is bracketed in the fourth iteration.
. Our initial guess for two queries will remove one-third of the initial interval.
. The most we can guarantee to shrink our interval is by just under a factor of two.
. With three queries we can shrink the domain by a factor of three. The third query is made based on the result of the first two queries.
. For n queries we are guaranteed to shrink our interval by a factor of F_{n+1}. The length of every interval constructed during Fibonacci search can be expressed in terms of the final interval times a Fibonacci number. If the final, smallest interval has length I_n, then the second smallest interval has length I_{n-1} = F_2 I_n, the third smallest interval has length I_{n-2} = F_3 I_n, and so forth.
. For n queries of a univariate function we are guaranteed to shrink a bracketing interval by a factor of φ^{n-1}.
. Fibonacci and golden section search on a unimodal function (a golden section search sketch follows this list).
. Fibonacci and golden section search on a nonunimodal function.
. Quadratic fit search fits a quadratic function to three bracketing points (black dots) and uses the analytic minimum (blue dot) to determine the next set of bracketing points.
. Four iterations of the quadratic fit method.
. The first iteration of the Shubert-Piyavskii method.
. Updating the lower bound involves sampling a new point and intersecting the new lines with the existing sawtooth.
. Nine iterations of the Shubert-Piyavskii method proceeding left to right and top to bottom. The blue lines are uncertainty regions in which the global minimum could lie.
. A horizontal line drawn from any y ∈ [f(a), f(b)] must intersect the graph at least once.
. Four iterations of the bisection method (a sketch follows this list). The horizontal line corresponds to f(x) = 0. Note that multiple roots exist within the initial bracket.
. A bracketing method initialized such that it straddles the two roots in this figure will expand forever, never to find a sign change. Also, if the initial interval is between the two roots, doubling the interval can cause both ends of the interval to simultaneously pass the two roots.
. The sufficient decrease condition, the first Wolfe condition, can always be satisfied by a sufficiently small step size along a descent direction.
. Backtracking line search used on the Rosenbrock function (appendix B.6); a sketch of the routine follows this list. The black lines show the seven iterations taken by the descent method and the red lines show the points considered during each line search.
. The curvature condition, the second Wolfe condition, is necessary to ensure that second-order function approximations have positive curvature, thereby having a unique global minimum.
. Regions where the curvature condition is satisfied.
. Regions where the strong curvature condition is satisfied.
. The interval [0, α] is guaranteed to bracket an interval containing a step length satisfying the strong Wolfe conditions when any of these three conditions is true.
. The first phase of strong backtracking line search, indicated with black open circles, is used to bracket an interval. In this case, the bracketing condition on f(x + α^{(4)}d) is triggered, causing the bracketing interval to be [α^{(3)}, α^{(4)}]. Then the zoom phase, indicated by the red open circle, shrinks the bracketed region until a suitable step length is found.
. Trust region methods constrain the next step to lie within a local region. The trusted region is expanded and contracted based on the predictive performance of models of the objective function.
. Trust region optimization used on the Rosenbrock function (appendix B.6).
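Code Sketches for Selected Figures

A few of the methods named in the captions above are sketched below in Julia, the language of the book's code snippets. Each sketch is a minimal illustration under stated assumptions, not a reproduction of the book's figures.

The first sketch compares the derivative estimates behind the finite-difference figure: the forward difference has O(h) error, the central difference has O(h^2) error, and the complex step method matches the central difference's accuracy while avoiding subtractive cancellation. The function and evaluation point, sin at x = 1/2, come from the caption; the step size h = 1e-6 is an arbitrary choice for illustration.

    # Estimate d/dx sin(x) at x = 1/2 three ways and compare to cos(1/2).
    f, x, h = sin, 0.5, 1e-6

    forward = (f(x + h) - f(x)) / h          # O(h) error
    central = (f(x + h) - f(x - h)) / (2h)   # O(h^2) error
    cstep   = imag(f(x + h*im)) / h          # O(h^2) error, no subtractive cancellation

    exact = cos(x)                           # true derivative of sin
    @show abs(forward - exact) abs(central - exact) abs(cstep - exact)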
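The bracket_minimum routine referenced above follows the book's bracketing scheme (the copyright notice permits use of the book's code snippets with acknowledgment): step downhill from a starting point, reversing direction if the first step goes uphill, and expand the step by a factor k until the function value rises, which brackets a minimum of a unimodal function. The defaults for s and k are illustrative.

    function bracket_minimum(f, x=0; s=1e-2, k=2.0)
        a, ya = x, f(x)
        b, yb = a + s, f(a + s)
        if yb > ya                 # first step went uphill; reverse direction
            a, b = b, a
            ya, yb = yb, ya
            s = -s
        end
        while true
            c, yc = b + s, f(b + s)
            if yc > yb             # value rose, so a minimum lies between a and c
                return a < c ? (a, c) : (c, a)
            end
            a, ya, b, yb = b, yb, c, yc
            s *= k                 # expand the step
        end
    end

For example, bracket_minimum(x -> (x - 2)^2) returns an interval containing the minimizer x = 2.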
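Golden section search, sketched next in a standard formulation, places each query so that every evaluation after the first shrinks the bracketing interval by the factor 1/φ, yielding the overall φ^{n-1} reduction cited in the caption. It assumes f is unimodal on [a, b].

    function golden_section_search(f, a, b, n)
        ρ = Base.MathConstants.golden - 1  # 1/φ ≈ 0.618
        d = ρ*b + (1 - ρ)*a                # interior query point
        yd = f(d)
        for i in 1:n-1
            c = ρ*a + (1 - ρ)*b            # mirror-image query point
            yc = f(c)
            if yc < yd
                b, d, yd = d, c, yc        # keep the left subinterval
            else
                a, b = b, c                # keep the right subinterval
            end
        end
        return a < b ? (a, b) : (b, a)
    end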
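The bisection sketch below assumes, as in the figure, an initial bracket [a, b] over which f changes sign; it halves the bracket until it is shorter than a tolerance ϵ, returning an interval that contains a root (one of possibly several, as the caption notes).

    function bisection(f, a, b, ϵ)
        if a > b; a, b = b, a; end   # ensure a < b
        ya, yb = f(a), f(b)
        ya == 0 && (b = a)           # an endpoint is already a root
        yb == 0 && (a = b)
        while b - a > ϵ
            x = (a + b)/2
            y = f(x)
            if y == 0
                a, b = x, x          # landed exactly on a root
            elseif sign(y) == sign(ya)
                a = x                # sign change lies in [x, b]
            else
                b = x                # sign change lies in [a, x]
            end
        end
        return (a, b)
    end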
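Finally, backtracking line search enforces the sufficient decrease condition from the captions above: starting from a maximum step α, it shrinks the step by a factor p until f(x + αd) ≤ f(x) + βα∇f(x)ᵀd. The loop terminates for small enough α whenever d is a descent direction; the defaults for p and β are common illustrative choices.

    using LinearAlgebra: dot

    function backtracking_line_search(f, ∇f, x, d, α; p=0.5, β=1e-4)
        y, g = f(x), ∇f(x)
        # Shrink α until the sufficient decrease (first Wolfe) condition holds.
        # Requires d to be a descent direction, i.e., dot(g, d) < 0.
        while f(x + α*d) > y + β*α*dot(g, d)
            α *= p
        end
        return α
    end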