Algorithms for Minimization Without Derivatives

Richard P. Brent

About This Book

This outstanding text for graduate students and researchers proposes improvements to existing algorithms, extends their related mathematical theories, and offers details on new algorithms for approximating local and global minima. None of the algorithms requires an evaluation of derivatives; all depend entirely on sequential function evaluation, a highly practical scenario in the frequent event of difficult-to-evaluate derivatives.
Topics include the use of successive interpolation for finding simple zeros of a function and its derivatives; an algorithm with guaranteed convergence for finding a minimum of a function of one variable; global minimization given an upper bound on the second derivative; and a new algorithm for minimizing a function of several variables without calculating derivatives. Many numerical examples augment the text, along with a complete analysis of the rate of convergence for most algorithms and error bounds that allow for the effect of rounding errors.


Information

Year: 2013
ISBN: 9780486143682

1

INTRODUCTION AND SUMMARY

Section 1

INTRODUCTION

Consider the problem of finding an approximate zero or minimum of a function of one real variable, using limited-precision arithmetic on a sequential digital computer. The function f may not be differentiable, or the derivative f′ may be difficult to compute, so a method which uses only computed values of f is desirable. Since an evaluation of f may be very expensive in terms of computer time, a good method should guarantee to find a correct solution, to within some prescribed tolerance, using only a small number of function evaluations. Hence, we study algorithms which depend on evaluating f at a small number of points, and for which certain desirable properties are guaranteed, even in the presence of rounding errors.
Slow, safe algorithms are seldom preferred in practice to fast algorithms which may occasionally fail. Thus, we want algorithms which are guaranteed to succeed in a reasonable time even for the most “difficult” functions, yet are as fast as commonly used algorithms for “easy” functions. For example, bisection is a safe method for finding a zero of a function which changes sign in a given interval, but from our point of view it is not an acceptable method, because it is just as slow for any function, no matter how well behaved, as it is in the worst possible case (ignoring the possibility that an exact zero may occasionally be found by chance). As a contrasting example, consider the method of successive linear interpolation, which converges superlinearly to a simple zero of a CÂč function, provided that the initial approximation is good and rounding errors are unimportant. This method is not acceptable either, for in practice there may be no way of knowing in advance if the zero is simple, if the initial approximation is sufficiently good to ensure convergence, or if the effect of rounding errors is important.
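To make the contrast concrete, the following is a minimal Python sketch of successive linear interpolation (the secant method). It is purely illustrative and is not one of the book's ALGOL procedures; the starting points, tolerance, and test function are arbitrary choices.

```python
def secant(f, x0, x1, tol=1e-12, max_iter=50):
    """Successive linear interpolation (secant method) for a zero of f.

    Converges superlinearly to a simple zero when started close enough,
    but may stall or diverge for poor starting points -- the failure
    mode discussed above.
    """
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        if f1 == f0:                                  # secant is horizontal; no step possible
            raise RuntimeError("secant step impossible: equal function values")
        x2 = x1 - f1 * (x1 - x0) / (f1 - f0)          # zero of the line through the two points
        if abs(x2 - x1) < tol:
            return x2
        x0, f0, x1, f1 = x1, f1, x2, f(x2)
    raise RuntimeError("no convergence within max_iter evaluations")

# Example: a simple zero of the classic test cubic x**3 - 2x - 5, near 2.0945.
print(secant(lambda x: x**3 - 2 * x - 5, 2.0, 3.0))
```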
In Chapter 4 we describe an algorithm which, by combining some of the desirable features of bisection and successive linear interpolation, does come close to satisfying our requirements: it is guaranteed to converge (i.e., halt) after a reasonably small number of function evaluations, and the rate of convergence for well-behaved functions is so fast that a less reliable algorithm is unlikely to be preferred on grounds of speed.
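The combination can be caricatured in a few lines of Python: keep an interval over which f changes sign, try an interpolation step, and fall back on bisection whenever interpolation would leave the bracket. The sketch below is only that caricature, with arbitrary tolerances; the Chapter 4 algorithm is considerably more careful about tolerances, rounding errors, and when an interpolated point is accepted.

```python
def zero_in_sketch(f, a, b, tol=1e-12, max_iter=200):
    """Sketch of a bracketing zero-finder: f(a) and f(b) must differ in sign.

    Each step tries a secant (linear interpolation) point; if that point
    lies outside the open interval between a and b, a bisection point is
    used instead.  The sign change is always retained, so the result
    lies in the original bracket.
    """
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        if abs(b - a) <= tol or fb == 0.0:
            break
        x = b - fb * (b - a) / (fb - fa) if fb != fa else 0.5 * (a + b)
        if not (min(a, b) < x < max(a, b)):           # interpolation left the bracket
            x = 0.5 * (a + b)                         # fall back on bisection
        fx = f(x)
        if fa * fx <= 0.0:                            # sign change is between a and x
            b, fb = x, fx
        else:                                         # sign change is between x and b
            a, fa = x, fx
    return b if abs(fb) <= abs(fa) else a             # endpoint with the smaller residual

print(zero_in_sketch(lambda x: x**3 - 2 * x - 5, 2.0, 3.0))
```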
An analogous algorithm, which finds a local minimum of a function of one variable by a combination of golden section search and successive parabolic interpolation, is described in Chapter 5. This algorithm fails to satisfy one of our requirements: in certain applications where repeated one-dimensional minimizations are required, and where accuracy is not very important, a faster (though less reliable) method is preferable. One such application, finding local minima of functions of several variables without calculating derivatives, is discussed in Chapter 7. (Note that wherever we consider minima we could equally well consider maxima.)
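For orientation, the “safe” component of that combination, golden section search, is sketched below in Python (a generic illustration, not the Chapter 5 procedure; the test function and tolerance are arbitrary). Its bracket shrinks by a fixed factor per function evaluation whatever the function, which is precisely the uniform slowness that successive parabolic interpolation avoids on well-behaved functions.

```python
import math

def golden_section(f, a, b, tol=1e-8):
    """Golden section search for a local minimum of a unimodal f on [a, b].

    The bracket shrinks by the constant factor 0.618... per function
    evaluation, so the cost is the same however well behaved f is.
    """
    invphi = (math.sqrt(5.0) - 1.0) / 2.0             # 1/phi = 0.6180...
    c = b - invphi * (b - a)
    d = a + invphi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc < fd:                                   # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = f(c)
        else:                                         # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = f(d)
    return 0.5 * (a + b)

# Example: the minimum of (x - 2)**2 + 1 on [0, 5] is at x = 2.
print(golden_section(lambda x: (x - 2.0)**2 + 1.0, 0.0, 5.0))
```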
Most algorithms for minimizing a nonlinear function of one or more variables find, at best, a local minimum. For a function with several local minima, there is no guarantee that the local minimum found is the global (i.e., true or lowest) minimum. Since it is the global minimum which is of interest in most applications, this is a serious practical disadvantage of most minimization algorithms, and our algorithm given in Chapter 5 is no exception. The usual remedy is to try several different starting points and, perhaps, vary some of the parameters of the minimization procedure, in the hope that the lowest local minimum found is the global minimum. This approach is inefficient, as the same local minimum may be found several times. It is also unreliable, for, no matter how many starting points are tried, it is impossible to be quite sure that the global minimum has been found.
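Schematically, the usual remedy looks like the following Python fragment. It is purely illustrative: the local minimizer here is scipy's derivative-free Nelder–Mead routine, which is not an algorithm from this book, and the bounds, seed, and number of starts are arbitrary.

```python
import numpy as np
from scipy.optimize import minimize

def multistart(f, bounds, n_starts=20, seed=0):
    """Run a derivative-free local minimizer from several random starting
    points and keep the lowest local minimum found.  The same minimum may
    be found repeatedly, and nothing proves that the lowest value found
    is the global minimum."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds).T
    best = None
    for _ in range(n_starts):
        x0 = rng.uniform(lo, hi)
        res = minimize(f, x0, method="Nelder-Mead")   # uses function values only
        if best is None or res.fun < best.fun:
            best = res
    return best.x, best.fun

# Example: a one-dimensional function with two local minima.
x, fx = multistart(lambda x: (x[0]**2 - 1.0)**2 + 0.1 * x[0], [(-2.0, 2.0)])
print(x, fx)
```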
In Chapter 6 we discuss the problem of finding the global minimum to within a prescribed tolerance. It is possible to give an algorithm for solving this problem, provided that a little a priori information about the function to be minimized is known. We describe an efficient algorithm, applicable if an upper bound on f″ is known, and we show how this algorithm can be used recursively to find the global minimum of a function of several variables. Unfortunately, because the amount of computation involved increases exponentially with the number of variables, the recursive method is practical only for functions of less than four variables. For functions of more variables, we still have to resort to the unreliable “trial and error” method, unless special information about the function to be minimized is available.
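The role of the a priori information can be illustrated with an elementary inequality: if |f″(x)| ≀ M on a subinterval, then f lies above the chord through the endpoint values minus (M/2)(x − l)(r − x), so every subinterval carries a computable lower bound on f. The Python sketch below uses that bound in a crude branch-and-bound loop; it illustrates the principle only, and is far less economical with function evaluations than the Chapter 6 algorithm.

```python
import heapq
import math

def global_min_sketch(f, a, b, M, tol=1e-6):
    """Crude branch-and-bound for the global minimum on [a, b] when |f''| <= M.

    On a subinterval [l, r] with known endpoint values, linear interpolation
    is in error by at most 0.5*M*(x-l)*(r-x), so
        f(x) >= chord(x) - 0.5*M*(x-l)*(r-x),
    whose minimum is a computable lower bound.  Subintervals are explored
    smallest lower bound first and bisected, until no remaining subinterval
    could undercut the best value found by more than tol.
    """
    def lower_bound(l, fl, r, fr):
        slope = (fr - fl) / (r - l)
        x = min(max(0.5 * (l + r) - slope / M, l), r)   # minimiser of the convex bound, clipped
        return fl + slope * (x - l) - 0.5 * M * (x - l) * (r - x)

    fa, fb = f(a), f(b)
    best_x, best = (a, fa) if fa <= fb else (b, fb)
    heap = [(lower_bound(a, fa, b, fb), a, fa, b, fb)]
    while heap:
        lb, l, fl, r, fr = heapq.heappop(heap)
        if lb >= best - tol:              # nothing left can improve 'best' by more than tol
            break
        m = 0.5 * (l + r)
        fm = f(m)
        if fm < best:
            best_x, best = m, fm
        heapq.heappush(heap, (lower_bound(l, fl, m, fm), l, fl, m, fm))
        heapq.heappush(heap, (lower_bound(m, fm, r, fr), m, fm, r, fr))
    return best_x, best

# Example: f(x) = sin(x) + sin(3x) on [0, 6]; |f''| <= 1 + 9 = 10 everywhere.
print(global_min_sketch(lambda x: math.sin(x) + math.sin(3.0 * x), 0.0, 6.0, M=10.0))
```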
Thus, we are led to consider practical methods for finding local (unconstrained) minima of functions of several variables. As before, we consider methods which depend on evaluating the function at a small number of points. Unfortunately, without imposing very strict conditions on the functions to be minimized, it is not possible to guarantee that an n-dimensional minimization algorithm produces results which are correct to within some prescribed tolerance, or that the effect of rounding errors has been taken into account. We have to be satisfied with algorithms which nearly always give correct results for the functions likely to arise in practical applications.
As suggested by the length of our bibliography, there has recently been considerable interest in the unconstrained minimization problem. Thus, we can hardly expect to find a good method which is completely unrelated to the known ones. In Chapter 7 we take one of the better methods which does not use derivatives, that of Powell (1964), and modify it to try to overcome some of the difficulties observed in the literature. Numerical tests suggest that our proposed method is faster than Powell's original method, and just as reliable. It also compares quite well with a different method proposed by Stewart (1967), at least for functions of less than ten variables. (We have few numerical results for non-quadratic functions of ten or more variables.)
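For orientation, the skeleton of a direction-set method of this general family is sketched below in Python, with scipy's scalar minimizer standing in for the one-dimensional searches. It is deliberately naive and is neither Powell's method nor the Chapter 7 modification; in particular it discards directions far more crudely than either.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def direction_set_sketch(f, x0, n_cycles=50, tol=1e-10):
    """Bare-bones direction-set minimization in the spirit of Powell (1964).

    Each cycle minimizes along every current direction in turn (using only
    function values), then discards the oldest direction, appends the overall
    displacement of the cycle as a new direction, and does one more line
    search along it.  Powell's method and the Chapter 7 modification replace
    directions far more carefully, to keep the direction set from becoming
    nearly linearly dependent.
    """
    x = np.asarray(x0, dtype=float)
    n = x.size
    dirs = [np.eye(n)[i] for i in range(n)]            # start with the coordinate directions
    fx = f(x)
    for _ in range(n_cycles):
        x_start, f_start = x.copy(), fx
        for d in dirs:
            res = minimize_scalar(lambda t: f(x + t * d))   # 1-D minimization, no derivatives
            x, fx = x + res.x * d, res.fun
        new_dir = x - x_start
        if np.linalg.norm(new_dir) > 0.0:
            dirs = dirs[1:] + [new_dir]
            res = minimize_scalar(lambda t: f(x + t * new_dir))
            x, fx = x + res.x * new_dir, res.fun
        if f_start - fx < tol:                          # a full cycle gained almost nothing
            break
    return x, fx

# Example: Rosenbrock's classic banana-shaped test function.
rosen = lambda x: 100.0 * (x[1] - x[0]**2)**2 + (1.0 - x[0])**2
print(direction_set_sketch(rosen, [-1.2, 1.0]))
```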
ALGOL implementations of all the above algorithms are given. Most testing was done with ALGOL W (Wirth and Hoare (1966)) on IBM 360/67 and 360/91 computers. As ALGOL W is not widely used, we give ALGOL 60 procedures (Naur (1963)), except for the n-dimensional minimization algorithm. FORTRAN subroutines for the one-dimensional zero-finding and minimization algorithms are given in the Appendix.
To recapitulate, we describe algorithms, and give ALGOL procedures, for solving the following problems efficiently, using only function (not derivative) evaluations:
1. Finding a zero of a function of one variable if an interval in which the function changes sign is given;
2. Finding a local minimum of a function of one variable, defined on a given interval;
3. Finding, to within a prescribed tolerance, the global minimum of a function of one or more variables, given upper bounds on the second derivatives;
4. Finding a local minimum of a function of several variables.
For the first three algorithms, rigorous bounds on the error and the number of function evaluations required are established, taking the effect of rounding errors into account. Some results concerning the order of convergence of the first two algorithms, and preliminary results on interpolation and divided differences, are also of interest.

Section 2

SUMMARY

In this section we summarize the main results of the following chapters. A more detailed discussion is given at the appropriate places in each chapter. This summary is intended to serve as a guide to the reader who is interested in some of our results, but not in others. To assist such a reader, an attempt has been made to keep each chapter as self-contained as possible.

Chapter 2

In Chapter 2 we collect some results on Taylor series, Lagrange interpolation, and divided differences. Most of these results are needed in Chapter 3, and the casual reader might prefer to skip Chapter 2 and refer back to it when necessary. Some of the results are similar to classical ones, but instead of assuming that f has n + 1 continuous derivatives, we only assume that f^(n) is Lipschitz continuous, and the term f^(n+1)(Ο) in the classical results is replaced by a number which is bounded in absolute value by a Lipschitz constant. For example, Lemmas 2.3.1, 2.3.2, 2.4.1, and 2.5.1 are of this nature. Since a Lipschitz continuous function is differentiable almost everywhere, these results are not surprising, although they have not been found in the literature, except where references are given. (Sometimes Lipschitz conditions are imposed on the derivatives of functions of several variables: see, for example, Armijo (1966) and McCormick (1969).) The proofs are mostly similar to those for the classical results.
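In generic notation (not the book's), the shape of these results can be indicated as follows: the classical Lagrange remainder requires n + 1 continuous derivatives, whereas the Lipschitz form keeps the same product but replaces the derivative value at an unknown point by a quantity that is merely bounded.

```latex
% Classical remainder: f has n+1 continuous derivatives, \xi lies in the spanned interval.
f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{i=0}^{n} (x - x_i)

% Lipschitz form (the flavour of Lemmas 2.3.1-2.5.1): f^{(n)} Lipschitz with constant L.
f(x) - p_n(x) = \frac{\theta}{(n+1)!} \prod_{i=0}^{n} (x - x_i), \qquad |\theta| \le L
```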
Theorem 2.6.1 is a slight generalization of some results of Ralston (1963, 1965) on differentiating the error in Lagrange interpolation. It is included both for its independent interest, and because it may be used to prove a slightly weaker form of Lemma 3.6.1 for the important case q = 2. (A proof along these lines is sketched in Kowalik and Osborne (1968).)
An interesting result of Chapter 2 is Theorem 2.6.2, which gives an expression for the derivative of the error in Lagrange interpolation at the points of interpolation. It is well ...
