Alefeld (1995) proposes the TOMS 748 root-finding algorithm and claims (page 327) that it “has the best behavior of all 12 methods” compared, including Brent’s (1973). This claim has been endorsed by credible authorities. However, it depends on questionable tabulations of test results, although that is not readily evident because the underlying data are not provided. Reconstructing the missing raw dataset, and tabulating it more reasonably, shows that Brent’s method generally performs better than TOMS 748.
The classic algorithm for finding a root of a univariate equation is Brent’s method, which Press (2007, p. 455) names the “method of choice”:
Brent’s method combines the sureness of bisection with the speed of a higher-order method when appropriate. We recommend it as the method of choice for general one-dimensional root finding where a function’s values only (and not its derivative or functional form) are available.
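To illustrate the idea in C++ (lmi's implementation language), here is a minimal sketch of such a safeguarded hybrid. It is not Brent's actual algorithm, which adds inverse quadratic interpolation and subtler acceptance tests; it only shows how sureness and speed combine: prefer a fast (secant) step, but fall back to bisection whenever the fast step leaves the bracket.

```cpp
#include <cmath>
#include <cstdio>

// Illustrative sketch only, assuming a bracket [a,b] with f(a)*f(b) < 0:
// take a secant step when it stays inside the bracket, else bisect.
// Brent's actual method adds inverse quadratic interpolation and
// subtler acceptance tests.
template<typename F>
double hybrid_root(F f, double a, double b, double tol)
{
    double fa = f(a);
    double fb = f(b);
    while(std::fabs(b - a) > tol)
        {
        double s = b - fb * (b - a) / (fb - fa); // secant step
        double m = 0.5 * (a + b);                // bisection step
        double x = (a < s && s < b) ? s : m;     // safeguard: keep the bracket
        double fx = f(x);
        if((fa < 0.0) == (fx < 0.0)) {a = x; fa = fx;} // root in [x,b]
        else                         {b = x; fb = fx;} // root in [a,x]
        }
    return 0.5 * (a + b);
}

int main()
{
    // Problem #1 from Alefeld's suite: sin(x) - x/2 on [pi/2, pi].
    auto f = [](double x) {return std::sin(x) - x / 2.0;};
    std::printf("%.12f\n", hybrid_root(f, 1.5707963267948966, 3.1415926535897931, 1.0e-12));
}
```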
Many authors have advanced intriguing alternatives to Brent’s method, and they necessarily strive to demonstrate some improvement in performance. The usual criterion—the only one Alefeld uses, and the only one considered here—is the number of function evaluations required to find a root within a given tolerance[1]. Whatever theoretical innovations a new algorithm may present (e.g., Brent favors inverse quadratic interpolation (IQI) where Alefeld prefers cubic (ICI)—see Appendix I), we want the algorithm that performs best in practice.
Although Alefeld measures performance for twelve algorithms, only comparison to the reigning method of choice really matters. Of those twelve, seven are progressive refinements of the authors' own work, culminating in Algorithm 4.2 (“TOMS 748” hereinafter for ease of reference). Of the other five methods, one is Brent’s, three are by Dekker (upon whose work Brent’s is based), and the last, by Le (1985), is of the same Dekker–Brent family. Here, Brent’s method is taken as the standard of comparison, and only it and TOMS 748 are compared.
Alefeld presents its case in six tables. Table I lays out the test suite (15 distinct formulas, with a total of 154 sets of parameters). Tables II–VI summarize the number of function evaluations for various groups of problems, and are recapitulated below. Excepting only Table IV (which turns out to be pathological), TOMS 748 appears to demonstrate a strikingly consistent 4–6% advantage over Brent’s method:
Exhibit 1: Recapitulation of Alefeld’s data tables

| table | tolerance | Alefeld | Brent | (B−A)/B | tests |
| II | 1E−07 | 2650 | 2804 | 5.5% | all {1–15} |
| II | 1E−10 | 2786 | 2905 | 4.1% | |
| II | 1E−15 | 2859 | 2975 | 3.9% | |
| II | 0 | 2884 | 3008 | 4.1% | |
| III | 1E−07 | 2347 | 2501 | 6.2% | all {1–15}, but only if solvable by Dekker (see below) |
| III | 1E−10 | 2469 | 2589 | 4.6% | |
| III | 1E−15 | 2535 | 2651 | 4.4% | |
| III | 0 | 2554 | 2674 | 4.5% | |
| IV | 1E−15 | 29 | 23 | −26.1% | pathological {13} only |
| V | 1E−15 | 128 | 136 | 5.9% | {3, 14, 15} only |
| VI | 1E−15 | 114 | 120 | 5.0% | all but {3, 13–15} |

Totals are much lower for tables IV–VI because those tables are limited to selected test problems and use only one set of parameters per problem. [In headers here and elsewhere, A means Alefeld and B means Brent.]
Tables II and VI are analyzed in depth below. As for the others: Table III merely restricts Table II to the problems solvable by Dekker’s method; Table IV covers only the pathological problem #13, discussed below; and Table V covers only problems #3, #14, and #15.
These tables appear to make a prima facie case that TOMS 748 is quite consistently better than Brent’s method, and should therefore become the new algorithm of choice. Furthermore, it was published in Transactions on Mathematical Software (TOMS), a leading peer-reviewed journal, and has been included in the prestigious Collected Algorithms of the ACM (CACM), whose policy states:
An algorithm must either provide a capability not readily available or perform a task better in some way than has been done before. … In the case where the new algorithm is claimed to be an improvement over currently available codes supporting evidence must be included in the submission. Efficiency claims need to be supported on a wide range of practical problems.
Others have endorsed it, presumably because they trust the ACM to have reviewed the evidence rigorously. For example, one DConf 2016 paper calls it the “State Of The Art”, and the influential C++ boost library says “the Brent–Dekker algorithm, although very well know[n], is not provided by this library as TOMS 748 algorithm … [is] superior”.
On the strength of such recommendations, the TOMS 748 algorithm was experimentally tried in our principal work, the lmi life-insurance illustration system, by translating the original FORTRAN to C using f2c. However, Brent’s method performs much better in lmi, as shown in this commit, which reports the number of function evaluations for the 388 test cases in lmi’s regression tests:
| algorithm | evaluations | max | mean | sample-std-dev |
| Brent | 7332 | 65 | 18.9 | 7.13 |
| Alefeld | 7622 | 76 | 19.6 | 7.65 |
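To make the column meanings precise, here is a minimal sketch of how such summary statistics can be computed from per-test evaluation counts; the counts in the vector below are invented for illustration, not lmi's actual data. Note that the standard deviation uses the sample (n − 1) divisor.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <numeric>
#include <vector>

// Summarize per-test function-evaluation counts as in the table above:
// total, max, mean, and sample standard deviation (n-1 divisor).
// The counts here are made up for illustration only.
int main()
{
    std::vector<double> evals {12, 17, 19, 23, 14, 65, 18, 21};
    double n     = static_cast<double>(evals.size());
    double total = std::accumulate(evals.begin(), evals.end(), 0.0);
    double mean  = total / n;
    double ss    = 0.0;
    for(double e : evals) {ss += (e - mean) * (e - mean);}
    double sd    = std::sqrt(ss / (n - 1.0)); // sample std dev
    std::printf
        ("total %.0f  max %.0f  mean %.2f  sample-std-dev %.2f\n"
        ,total
        ,*std::max_element(evals.begin(), evals.end())
        ,mean
        ,sd
        );
}
```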
This unexpected result might suggest that lmi’s use cases are atypical, but they are simply polynomials. Could the f2c translation be incorrect? Or is the TOMS 748 algorithm not as “superior” as Tables II–VI depict? These questions can be addressed only after reconstructing the missing raw data, but a few preliminary comments must be made first.
A careful reading of Alefeld (1995) reveals several anomalies. There is a mistake in the code[2] for one formula, though it’s apparently an inconsequential typo. And the value given for machine epsilon is implausible[3], evidencing a lack of rigorous review. But superficial flaws like those would be easy to fix.
An internal contradiction for problem #13 is of greater concern. The unit-test benchmark file (testout in the CACM archive) gives the root as “2.2317679157465D-02”, but Alefeld (1995, p. 343) declares that values in that vicinity are incorrect:
If we code xe^(−x^(−2)) in Fortran 77 as x ⋅ (e^(−1 ⁄ x²)) then all algorithms that solve this problem within 1000 iterations deliver values around 0.02 as the exact solution, because the result of the computation of 0.02 ⋅ (e^(−1 ⁄ (0.02)²)) on our machine is equal to 0. However, when we code xe^(−x^(−2)) as x ⁄ e^(1 ⁄ x²), all algorithms give correct solutions.
The subexpression x ⋅ e^(−x^(−2)) has no roots, but the actual problem defines the function as zero at x = 0, so the correct solution is x = 0. Due to inevitable underflow, no iterative algorithm that uses customary floating-point numbers can find that solution unless by chance[4].
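The underflow is easy to demonstrate; a minimal check in C++ on any IEEE 754 double-precision platform:

```cpp
#include <cmath>
#include <cstdio>

// Problem #13 behaves pathologically because exp(-1/x^2) underflows:
// near x = 0.02, the exponent is -1/0.02^2 = -2500, far below the
// underflow threshold (exp(x) is zero in double precision for
// x < about -745), so x * exp(-1/x^2) evaluates to exactly zero.
int main()
{
    double x = 0.02;
    std::printf("exp(-1/x^2)     = %g\n", std::exp(-1.0 / (x * x)));     // prints 0
    std::printf("x * exp(-1/x^2) = %g\n", x * std::exp(-1.0 / (x * x))); // prints 0
}
```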
Problem #13 is nevertheless included in the analysis of Table II below for comparison with Alefeld’s published totals. This favors neither of the algorithms compared here, because the number of function evaluations is 18 for both in our reconstructed data. (Curiously, Table IV says that Brent’s method takes 23 evaluations and Alefeld’s takes 29, but that’s meaningless when the correct root cannot be found.)
To verify the performance claims that depend on Alefeld (1995)’s Tables II–VI, it is necessary to reconstruct the underlying raw data that were not provided. This could be done directly, using the FORTRAN code that they do provide. Lacking FORTRAN expertise, we chose instead to use the f2c translation that would be needed for lmi anyway, on an x86_64-pc-linux-gnu platform. (Appendix II gives links to the reconstructed data.)
This indirect approach usefully validates the translation, using the testout file in Alefeld’s code archive, which gives (only) the roots found by TOMS 748. Those roots were all reproduced accurately[5] (except for problem #13, discussed above), and the numbers of function evaluations agree closely with published Tables II and VI, so the f2c translation seems to be correct.
Relative error, defined as the (reconstructed − published) ⁄ published number of function evaluations, is a fraction of a percent in the reconstruction of Table II’s “BR” and “4.2” columns:
Exhibit 2: Table II, reconstructed vs. published
| tolerance | A, 1E−07 | B, 1E−07 | A, 1E−10 | B, 1E−10 | A, 1E−15 | B, 1E−15 | A, zero | B, zero | A, totals | B, totals |
| reconstructed | 2658 | 2807 | 2787 | 2907 | 2854 | 2974 | 2875 | 2991 | 11174 | 11679 |
| published | 2650 | 2804 | 2786 | 2905 | 2859 | 2975 | 2884 | 3008 | 11179 | 11692 |
| discrepancy | 8 | 3 | 1 | 2 | −5 | −1 | −9 | −17 | −5 | −13 |
| relative error | 0.30% | 0.11% | 0.04% | 0.07% | −0.17% | −0.03% | −0.31% | −0.57% | −0.04% | −0.11% |
Table VI’s “BR” and “4.2” columns are reproduced as follows:
Exhibit 3: Table VI, reconstructed vs. published
problem | 1 | 2 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | total |
A, published | 10 | 11 | 13 | 10 | 11 | 7 | 11 | 9 | 9 | 18 | 5 | 114 |
A, reconstructed | 10 | 12 | 18 | 10 | 11 | 7 | 11 | 9 | 9 | 18 | 5 | 120 |
A, diff | – | 1 | 5 | – | – | – | – | – | – | – | – | 6 |
B, published | 9 | 10 | 15 | 10 | 13 | 9 | 11 | 10 | 9 | 14 | 10 | 120 |
B, reconstructed | 9 | 10 | 17 | 10 | 13 | 9 | 11 | 10 | 9 | 14 | 10 | 122 |
B, diff | – | – | 2 | – | – | – | – | – | – | – | – | 2 |
It may be all right to disregard the one-unit difference for problem #2, but the discrepancies for problem #4 require some discussion. Formula #4 is x^n − a, with parameters a = 1 and n = 4 here, i.e., x^4 − 1, and it is curious that the only significant discrepancy occurs with such a simple formula.

Presumably the code (as translated by f2c) is correct, because the other data in Table VI, and especially in the much more comprehensive Table II, are reproduced so closely. The reconstructed data for problem #4 with other parameters are generally consistent with those shown here, so perhaps the power function’s implementation on the authors’ machine behaved strangely in the neighborhood of unity, in a way that somehow favored their algorithm. Or maybe this is just an error in transcription from the raw data to the published tables. Our analysis proceeds on the assumption that this is an isolated aberration whose ultimate cause may never be known.
Appending a measure of relative performance to Exhibit 2 shows how Table II creates the impression that TOMS 748 outperforms Brent by about four or five percent:
Exhibit 4a: Table II, relative performance
| tolerance | A, 1E−07 | B, 1E−07 | A, 1E−10 | B, 1E−10 | A, 1E−15 | B, 1E−15 | A, zero | B, zero |
| reconstructed | 2658 | 2807 | 2787 | 2907 | 2854 | 2974 | 2875 | 2991 |
| published | 2650 | 2804 | 2786 | 2905 | 2859 | 2975 | 2884 | 3008 |
| discrepancy | 8 | 3 | 1 | 2 | −5 | −1 | −9 | −17 |
| relative error | 0.3% | 0.1% | 0.0% | 0.1% | −0.2% | 0.0% | −0.3% | −0.6% |
| (B−A)/B, reconstructed totals | 5.3% | | 4.1% | | 4.0% | | 3.9% | |
| (B−A)/B, published totals | 5.5% | | 4.1% | | 3.9% | | 4.1% | |
However, taking the mean number of evaluations for each problem separately, and tabulating the sum of those means, leads to the opposite conclusion—that Brent outperforms TOMS 748 by about four percent:
Exhibit 4b: Table II, sum of means
| tolerance | A, 1E−07 | B, 1E−07 | A, 1E−10 | B, 1E−10 | A, 1E−15 | B, 1E−15 | A, zero | B, zero |
| sums of means | 204.528 | 196.346 | 214.405 | 206.387 | 222.480 | 213.069 | 225.400 | 216.014 |
| (B−A)/B, means | −4.2% | | −3.9% | | −4.4% | | −4.3% | |
How can summarizing the same data in two different ways lead to two opposite conclusions? The reason is that most problems are tested with multiple sets of parameters, and the multiplicity varies greatly. For example, problem #14 finds a zero of (N ⁄ 20) ⋅ (x ⁄ 1.5 + sin(x) − 1), for each of N = 1, 2, 3, …, 40. Alefeld’s testout file gives the same root for this problem forty separate times, as expected because N doesn’t affect the root; unsurprisingly, N doesn’t affect the number of evaluations either. Thus, problem #14 is really only one test, whose ostensible forty-fold multiplicity is clearly degenerate.
It may be useful to multiply both sides of an equation by a variety of constants, but only as an ancillary experiment to determine whether that affects the performance of one algorithm or another. Yet when no effect is found, as in this case, the result should be counted only once in a summary intended to compare overall performance. With forty times the weight of some other problems, problem #14 constitutes more than one-quarter of the Table II totals.
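The effect of lopsided weights is easy to reproduce in miniature. The following sketch uses invented numbers (not Alefeld's data) for a hypothetical two-problem suite in which algorithm A wins narrowly on a forty-parameter problem and loses badly on a one-parameter problem; raw totals and sums of per-problem means then rank the algorithms oppositely.

```cpp
#include <cstdio>
#include <vector>

// Two ways to summarize the same data: raw totals weight each problem
// by its number of parameter sets; the sum of per-problem means weights
// each problem equally. With multiplicities as lopsided as Table II's
// (40 for problem #14 vs 1 for several others), the two summaries can
// rank the algorithms oppositely. All numbers here are invented.
struct Problem {int multiplicity; double evals_a; double evals_b;};

int main()
{
    std::vector<Problem> suite
        {{40, 14.0, 15.0}   // A slightly better, counted forty times
        ,{ 1, 30.0, 18.0}   // B much better, counted once
        };
    double tot_a = 0.0, tot_b = 0.0, mean_a = 0.0, mean_b = 0.0;
    for(auto const& p : suite)
        {
        tot_a  += p.multiplicity * p.evals_a;
        tot_b  += p.multiplicity * p.evals_b;
        mean_a += p.evals_a; // per-problem mean: multiplicity cancels
        mean_b += p.evals_b;
        }
    std::printf("totals:       A %.0f vs B %.0f\n", tot_a, tot_b);   // A 590 vs B 618: A looks better
    std::printf("sum of means: A %.0f vs B %.0f\n", mean_a, mean_b); // A 44 vs B 33: B looks better
}
```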
It is illuminating to tabulate the problems, and their weights, in their original order:
Exhibit 5a: Number of parameters per problem
problem | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | total |
multiplicity | 1 | 10 | 3 | 14 | 1 | 10 | 3 | 5 | 7 | 5 | 4 | 19 | 1 | 40 | 31 | 154 |
proportion | 1% | 6% | 2% | 9% | 1% | 6% | 2% | 3% | 5% | 3% | 3% | 12% | 1% | 26% | 20% |
and also by decreasing multiplicity, indicating which algorithm is faster (by totalling the number of evaluations for each problem across all four tolerances):
Exhibit 5b: Number of parameters per problem, sorted
problem | 14 | 15 | 12 | 4 | 2 | 6 | 9 | 8 | 10 | 11 | 3 | 7 | 1 | 5 | 13 | total |
multiplicity | 40 | 31 | 19 | 14 | 10 | 10 | 7 | 5 | 5 | 4 | 3 | 3 | 1 | 1 | 1 | 154 |
proportion | 26% | 20% | 12% | 9% | 6% | 6% | 5% | 3% | 3% | 3% | 2% | 2% | 1% | 1% | 1% | |
cumulative | 26% | 46% | 58% | 68% | 74% | 81% | 85% | 88% | 92% | 94% | 96% | 98% | 99% | 99% | 100% | |
faster | A | A | B | B | B | A | B | B | B | B | B | A | B | tie | tie |
TOMS 748 outperforms Brent’s method on four problems, but Brent’s method is better on nine (the other two being ties). The two problems for which TOMS 748 performs relatively best, #14 and #15, account for 71 out of the total 154 cases (46.1%). Assigning inordinate weights to those two problems distorts the outcome. Using the per-problem means, rather than the sums, yields a more robust summation, with exactly the opposite outcome.
Unlike Table II, Table VI considers only eleven of the fifteen problems, and selects a single set of parameters for each. Selecting one set is not as robust as taking the mean across all parameters, but, as long as the selection is unbiased, it would be expected to avoid the distortion introduced by the multiplicity issue, because problems #14 and #15 are omitted. However, the published Table VI gives the impression that TOMS 748 performs about five percent better than Brent’s method, whereas, averaged across all parameters, it actually performs about nine percent worse:
Exhibit 6: Table VI, subsets vs. means (data match the “reconstructed” rows of Exhibit 3, except where explicitly marked “published” here; the “diff” columns are A − B, so positive values favor Brent)

| problem | parameter | A, selected | B, selected | diff | A, mean | B, mean | diff |
| 1 | | 10 | 9 | 1 | 10.0 | 9.0 | 1.0 |
| 2 | 2 | 12 | 10 | 2 | 14.0 | 12.7 | 1.3 |
| 4 | 1, 4, [0,5] | 18 | 17 | 1 | 19.0 | 16.7 | 2.3 |
| 5 | | 10 | 10 | 0 | 10.0 | 10.0 | 0.0 |
| 6 | 20 | 11 | 13 | −2 | 11.1 | 11.8 | −0.7 |
| 7 | 10 | 7 | 9 | −2 | 7.7 | 9.0 | −1.3 |
| 8 | 10 | 11 | 11 | 0 | 9.6 | 9.6 | 0.0 |
| 9 | 1 | 9 | 10 | −1 | 9.9 | 8.3 | 1.6 |
| 10 | 5 | 9 | 9 | 0 | 10.6 | 10.6 | 0.0 |
| 11 | 20 | 18 | 14 | 4 | 17.0 | 10.8 | 6.3 |
| 12 | 3 | 5 | 10 | −5 | 10.5 | 10.2 | 0.3 |
| totals | | 120 | 122 | −2 | 129.3 | 118.7 | 10.6 |
| published | | 114 | 120 | −6 | [raw data not published] | | |
| discrepancy | | 6 | 2 | | | | |
| relative error | | 5.3% | 1.7% | | | | |
| (B−A)/B, published data | | 5.0% | | | | | |
| (B−A)/B, reconstructed data | | 1.6% | | | −9.0% | | |
Comparing the grand totals of function evaluations for Table VI’s eleven problems yields a similar result:
| Alefeld | Brent | B−A | (B−A)/B |
| 997 | 921 | −76 | −8.3% |
It would seem that the parameter selections are not representative. Problem #12, for example, exhibits the following differences (A − B) in the reconstructed number of function evaluations for its nineteen parameter values, lower numbers indicating better performance for TOMS 748 as above:

{−1, −5, 0, −1, −1, −1, −1, 1, 1, 2, 1, 0, 1, 1, 2, 1, 1, 2, 2}

of which only the second—the most favorable for TOMS 748 by far—was chosen for Table VI. No rationale is given for this choice; replacing −5 with the mode, +1, for this problem alone would change the published totals to 120 for both algorithms, nullifying the portrayed TOMS 748 advantage.
Using the mean for each problem, instead of choosing a single parameter for each, shows that Brent’s method outperforms TOMS 748 by nine percent.
Alefeld’s Tables II–VI summarize the same data in several different ways, all of which (except Table IV, comprising only the pathological problem #13) consistently depict TOMS 748 as performing better than Brent’s method. This consistency is unexpected: for example, problems #14 and #15, which favor TOMS 748, constitute 46% of the tests in Table II and 80% in Table V, but none in Table VI. And Tables IV–VI form a partition of Table II, so their relative-performance figures would be expected to vary in offsetting ways around Table II’s overall figure. The expected pattern emerges when all are tabulated comparably from reconstructed data:
Exhibit 7: Published vs. reconstructed data, tol=1E−15 only

| data | table | Alefeld | Brent | (B−A)/B | tests |
| published | II | 2859 | 2975 | 3.9% | all {1–15} |
| | IV | 29 | 23 | −26.1% | pathological {13} only |
| | V | 128 | 136 | 5.9% | {3, 14, 15} only |
| | VI | 114 | 120 | 5.0% | all but {3, 13–15} |
| reconstructed | II | 2854 | 2974 | 4.0% | all {1–15} |
| | IV | 18 | 18 | 0.0% | pathological {13} only |
| | V | 1839 | 2035 | 9.6% | {3, 14, 15} only |
| | VI | 997 | 921 | −8.3% | all but {3, 13–15} |

Published Tables V and VI summarize only selected data, whereas the reconstructed portion uses all data; therefore, in the reconstruction, II ≡ IV ∪ V ∪ VI (check: 18 + 1839 + 997 = 2854 and 18 + 2035 + 921 = 2974).
Although our analysis so far has taken the test suite as a given, other suites might be used. For example, eleven of the problems in Alefeld’s (1995, p. 341) test suite match Le’s (1985, pp. 257–258) closely or exactly[6]. Considering only those problems (i.e., excluding #1, #12, #14, and #15), the total numbers of evaluations, tabulated as in Table II, are:
Exhibit 8: All tests common to both Alefeld and Le

| tolerance | Alefeld | Brent | (B−A)/B |
| 1E−07 | 808 | 719 | −12.4% |
| 1E−10 | 846 | 761 | −11.2% |
| 1E−15 | 877 | 792 | −10.7% |
| 0 | 888 | 802 | −10.7% |
| totals | 3419 | 3074 | −11.2% |
From that point of view, Brent’s method performs better.
The tables in Alefeld (1995) seem to establish that their TOMS 748 algorithm consistently performs about 5% better than Brent’s (1973). This conclusion depends delicately on the various ways the test problems were selected and weighted. Inordinate weight was given to two problems that favor TOMS 748. The appearance of consistency across tables disappears under careful examination. Reasonable analysis of the same data (once reconstructed) supports the opposite conclusion: that Brent’s method generally outperforms Alefeld’s.
Inclusion in CACM is a strong endorsement of an algorithm’s general superiority. Alefeld (1995) doesn’t meet that standard, so CACM should reconsider its endorsement of TOMS 748. To prevent its readers from drawing conclusions that are not supported by the evidence provided, the editors of TOMS should carefully review the paper de novo, publish the raw data underlying its tables, and address the defects documented above.
Varah’s brief review said: “Although the new algorithms are marginally more economical in terms of function evaluations, it is striking how well Brent’s method compares in practice, 25 years after its introduction.” The first clause seems to rest on insufficient skepticism, but the rest remains true even half a century after Brent devised his method—which remains the algorithm of choice.
[1] Others may try to improve worst-case performance: for instance, Le (1985) trades away some performance in order to establish a tight upper bound on the number of evaluations.
[2] In the CACM archive, driver.f codes problem #15 thus, in relevant part:

```fortran
  280 CONTINUE
      IF(X .GE. (1.0D-3)*2.0D0/(DN+1.0D0)) THEN
         FX=DEXP(1.0D0)-1.859D0
```

where “.GE.” means ‘≥’, but the formula in Table I uses ‘>’ here. Probably this is just a proofreading error.
[3] Alefeld (1995, p. 340) gives “1.9073486328 × 10^−16” as the machine precision for the authors’ AT&T 3B2-1000 hardware. But that is a binary machine, so its machine precision must be a negative-integer power of two, which this number is not. Curiously, CACM’s policy:
standard conforming enquiry functions should be used to obtain values for any machine dependent constants (for example, the machine epsilon, the largest positive real number, etc.) required by the software.
has been complied with literally: the subroutine RMP in enclofx.f, as translated by f2c, correctly gives a value of 2^−52 ≈ 2.220446e−16 on contemporary IEEE 754 hardware.
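A minimal C++ check confirms the correct value on any IEEE 754 platform, using the standard enquiry facilities CACM's policy calls for:

```cpp
#include <cfloat>
#include <cstdio>
#include <limits>

// Machine epsilon for IEEE 754 binary64 is 2^-52 = 2.220446e-16 exactly,
// a power of two, unlike the paper's reported 1.9073486328e-16.
int main()
{
    double eps = std::numeric_limits<double>::epsilon();
    std::printf("epsilon     = %a = %.7g\n", eps, eps); // 0x1p-52 = 2.220446e-16
    std::printf("DBL_EPSILON = %.7g\n", DBL_EPSILON);
}
```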
[4] This pathological problem is the same as eq. 2.7 in the fourth chapter of Brent (1973, p. 50), who uses it only to demonstrate that the secant method may take an excessive number of iterations.
[5] See this commit, which shows the roots for all 154 tests. For example, the first problem’s root is hard-coded for unit testing, with Alefeld’s testout value pasted into a comment:

```diff
+    auto f01n01 = [](double x) {return std::sin(x) - x / 2.0;};
+    // Alefeld: 1.8954942670340;
+    double r01n01 = 1.89549426703398093963;
```
[6] Corresponding test problems in both papers:
| Le | | Alefeld |
| – | | 1 |
| 1 | ≈ | 2 |
| 2 | ≡ | 3 |
| 3 | ≡ | 4 |
| 4 | ≡ | 5 |
| 5 | ≈ | 6 |
| 6 | ≈ | 7 |
| 7 | ≈ | 8 |
| 8 | ≡ | 9 |
| 9 | ≡ | 10 |
| 10 | ≡ | 11 |
| 11 | | – |
| – | | 12 |
| 12 | ≡ | 13 |
| 13 | | – |
| 14 | | – |
| 15 | | – |
| 16 | | – |
| 17 | | – |
| – | | 14 |
| – | | 15 |
Where available, permanent URLs are given as “[PURL]” alternatives, in case the primary ones go stale.
Alefeld, G.; Potra, F.; Shi, Yixun: “Algorithm 748: Enclosing Zeros of Continuous Functions”, ACM Trans. Math. Softw., Vol. 21, No. 3, September 1995, Pages 327–344. Reviewed by Varah, J., ACM Computing Reviews, Review #: CR119678 (9605-0371). Available online: citation; PDF [PURL]; code [PURL]. The code archive includes a testout file containing the roots computed for all 154 test cases, by Algorithm 4.2 only.
Brent, R. P.: “Algorithms for Minimization without Derivatives”, Prentice-Hall, Englewood Cliffs, New Jersey, 1973. Online copy provided by Brent [PURL]; PDF [PURL].
Kahan, W.: “Lecture Notes on Real Root-Finding”, University of California at Berkeley, Berkeley, California, 2016. Online copy provided by Kahan [PURL].
Le, D.: “An Efficient Derivative-Free Method for Solving Nonlinear Equations”, ACM Trans. Math. Softw., Vol. 11, No. 3, September 1985, Pages 250–262. Available online: citation; PDF [PURL]. Le uses a third-order Newton method with numerical approximations of derivatives, and achieves a tight upper bound of no more than four times the number of evaluations required by pure bisection.
Press, W. H., et al.: Numerical Recipes (3rd ed.), Cambridge University Press, Cambridge, 2007. Available online: snippet including quoted passage.
Varah, James Martin: [Review of Alefeld et al. (1995)], Computing Reviews, Association for Computing Machinery, New York, New York, 1996. Available online: HTML [PURL].
Evolving from earlier algorithms, TOMS 748 introduces three innovations.
First, Alefeld (1995, p. 328) says that in most earlier work, including Brent’s, “only the convergence rate of {|x_n − x*|}, where x_n is the current estimate of x*, has been studied and not the convergence rate of the diameters (b_n − a_n)”. For TOMS 748, a proof of superlinear convergence of the diameters is given. While this is academically interesting, it isn’t shown to confer any practical advantage.
Second, Alefeld (1995, p. 329) says TOMS 748’s “improvements are achieved by employing inverse cubic interpolation instead of quadratic interpolation” (increasing the asymptotic order of convergence of the diameters from (1 + √5) ⁄ 2 ≈ 1.6180… to (2 + √7)^(1⁄3) ≈ 1.6686…), and that this “is optimal in a certain class of algorithms”. Kahan’s (2016, p. 6) general counterpoint discusses three phases:
Phase 1 : Flailing [:] Initial iterates x approximate the desired root z poorly. They may move towards it, or wander, or jump about as if at random, but they do not converge rapidly.
Phase 2 : Converging [:] Differences between successive iterates x dwindle,— rapidly, we hope.
Phase 3 : Dithering [:] Indistinguishable from Flailing except that different iterates x differ much less from a root and may (very nearly) repeat. Dithering is due entirely to roundoff.
… Converging is what we hope the chosen iteration does quickly, and usually it does; and when it does, the search for a zero can spend relatively little time in Phase 2. Why then is so much of the literature about numerical methods concerned with this phase? Perhaps because it is the easiest phase to analyze. Ultimately superlinear ( fast ) convergence is rarely difficult to accomplish, as we shall see; Newton’s iteration usually converges quadratically. Convergence faster than that is an interesting topic omitted from these notes because it reduces only the time spent in Phase 2
Brent (1973, p. 51) experimentally found that favoring IQI over the secant method saved about half of one function evaluation per root. If the average is 20, then that improvement is only 0.5 ⁄ 20 = 2.5%. Progressively higher-order methods yield diminishing returns: the roots converge with order 1.618… (secant), 1.839… (IQI: tribonacci constant), 1.928… (ICI: tetrabonacci), and so on. Thus, Brent (1973, p. 57) felt that “nothing much can be gained by going beyond linear or quadratic interpolation”. It should also be noted that higher-order methods are more susceptible to numerical errors such as roundoff, overflow, underflow, and catastrophic cancellation, especially in Phase 3.
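For concreteness, a single IQI step can be written in a few lines. This is a sketch in C++ of the three-point inverse interpolation Brent's method prefers (TOMS 748's ICI extends the same idea to four points), derived by Lagrange interpolation of x as a function of y, evaluated at y = 0; a real root finder must also guard against coincident function values and reject steps that leave the bracket.

```cpp
#include <cmath>
#include <cstdio>

// One inverse quadratic interpolation (IQI) step: fit x as a quadratic
// in y through (f(a),a), (f(b),b), (f(c),c), and evaluate it at y = 0.
// Sketch only: assumes fa, fb, fc are pairwise distinct.
double iqi_step(double a, double fa, double b, double fb, double c, double fc)
{
    return a * fb * fc / ((fa - fb) * (fa - fc))
         + b * fa * fc / ((fb - fa) * (fb - fc))
         + c * fa * fb / ((fc - fa) * (fc - fb));
}

int main()
{
    // Problem #1 from Alefeld's suite: sin(x) - x/2, root near 1.8954942670.
    auto f = [](double x) {return std::sin(x) - x / 2.0;};
    double a = 1.6, b = 2.1, c = 1.9;
    std::printf("IQI estimate: %.10f\n", iqi_step(a, f(a), b, f(b), c, f(c)));
}
```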
Third, Alefeld (1995, p. 329) uses “double-length secant steps”. The secant method often converges slowly from one side of the bounding interval, and doubling the size of a step is a heuristic that hopes to overshoot the root, leading to faster two-sided convergence. However, the secant method already has an asymptotic order of convergence of about 1.618…, and this heuristic sacrifices that theoretically proven property: when convergence would otherwise proceed rapidly, this technique spoils it.
Only experiment can determine whether this set of innovations is profitable.
The replication crisis shows that publication of raw data is crucial to upholding scientific integrity. As one journal editor says: “No raw data, no science”. PLOS requires authors to submit “The values behind the means, standard deviations and other measures reported”. This appendix shows where to find the data that underlie our findings, and (assuming proficiency with common tools) how to regenerate them.
Our data files are published in lmi’s git repository. The raw files used in our reconstructions are zero_a_x86_64-pc-linux-gnu, for Alefeld’s algorithm, and zero_b_x86_64-pc-linux-gnu, for Brent’s. Optionally running this command:

gwc/toms_vs_brent.sh

in a working copy recreates them from scratch (verified with HEAD = fcc1f6714aec5e of 2022-04-13). Thus, everything in this article can be reproduced from our published code alone (with the sole exception of an incidental comment about lmi’s regression tests, some of which depend on proprietary data).
The zero_consolidated file results from manually combining, reformatting, and labelling the raw data for easier reference. A spreadsheet (cryptographically signed) shows how the consolidated data have been analyzed and tabulated.
Copyright © 2022, 2023 Gregory W. Chicares. This program, including its documentation, is free software. Read the terms under which you can redistribute and modify it.
Maintained by Gregory W. Chicares. The latest version of this file can be found at the lmi website.