Bug 15219 - Behaviour of integrate() has changed
Behaviour of integrate() has changed
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Accuracy
R 2.15.2
x86_64/x64/amd64 (64-bit) Windows 64-bit
: P5 normal
Assigned To: R-core
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-02-24 20:52 UTC by J. R. M. Hosking
Modified: 2013-07-15 13:20 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description J. R. M. Hosking 2013-02-24 20:52:53 UTC
I have only just noticed this, but starting with R 2.12.0 the results
returned from integrate() are sometimes different from those given by
the original Fortran routines in QUADPACK.  I do not think that this
change can have been intended, since it has some unfortunate effects:
(i) reducing the error tolerance will, more often than previously, not
reduce the actual error; (ii) the actual error may exceed the error
bound reported by the function.  Point (ii) is particularly perturbing;
I do not recall having encountered this behaviour in R before version
2.12.0, or in any of my Fortran programs that have called numerical
integration routines from QUADPACK.

Test case for finite interval:

  fun <- function(x) (-log(x))^(-1/2)
  #
  res <- integrate(fun, 0, 1, rel.tol=1e-4)
  res
  cat("actual error", res$value - sqrt(pi), "\n")
  #
  res <- integrate(fun, 0, 1, rel.tol=1e-6)
  res
  cat("actual error", res$value - sqrt(pi), "\n")
  #
  res <- integrate(fun, 0, 1, rel.tol=1e-8)
  res
  cat("actual error", res$value - sqrt(pi), "\n")

Test case for infinite interval:

  fun <- function(x) x^(-1/2)*exp(-x)
  #
  res <- integrate(fun, 0, Inf, rel.tol=1e-4)
  res
  cat("actual error", res$value - sqrt(pi), "\n")
  #
  res <- integrate(fun, 0, Inf, rel.tol=1e-6);
  res
  cat("actual error", res$value - sqrt(pi), "\n")
  #
  res <- integrate(fun, 0, Inf, rel.tol=1e-8);
  res
  cat("actual error", res$value - sqrt(pi), "\n")

Test case results (for finite interval; infinite interval shows similar pattern):

# With R 2.11.1
# As tolerance decreases, reported error decreases
# and actual error remains lower than reported error
> fun <- function(x) (-log(x))^(-1/2)
> #
> res <- integrate(fun, 0, 1, rel.tol=1e-4)
> res
1.772454 with absolute error < 1.7e-05
> cat("actual error", res$value - sqrt(pi), "\n")
actual error 1.396563e-08
> #
> res <- integrate(fun, 0, 1, rel.tol=1e-6)
> res
1.772454 with absolute error < 4.7e-08
> cat("actual error", res$value - sqrt(pi), "\n")
actual error 1.299407e-10
> #
> res <- integrate(fun, 0, 1, rel.tol=1e-8)
> res
1.772454 with absolute error < 1.5e-08
> cat("actual error", res$value - sqrt(pi), "\n")
actual error -1.567546e-11

# With R 2.15.2 (R 2.12.0 and R-devel 2013-02-16 r61969 give the same results)
# As tolerance decreases, reported error decreases
# but actual error remains much the same
> fun <- function(x) (-log(x))^(-1/2)
> #
> res <- integrate(fun, 0, 1, rel.tol=1e-4)
> res
1.772461 with absolute error < 5.2e-05
> cat("actual error", res$value - sqrt(pi), "\n")
actual error 7.340643e-06
> #
> res <- integrate(fun, 0, 1, rel.tol=1e-6)
> res
1.772461 with absolute error < 8.4e-07
> cat("actual error", res$value - sqrt(pi), "\n")
actual error 7.340636e-06
> #
> res <- integrate(fun, 0, 1, rel.tol=1e-8)
> res
1.772461 with absolute error < 5.4e-11
> cat("actual error", res$value - sqrt(pi), "\n")
actual error 7.34062e-06
> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
>


Examination of the code of integrate.c shows that there were several
changes between R 2.11.1 and R 2.12.0. I am by no means a C expert, so
the following diagnosis is speculative.  But consider this code fragment
from the R 2.12.0 and R 2.15.2 versions of integrate.c:

    for (*last = 2; *last <= *limit; ++(*last)) {  /* line 1394 */

        [code elided]

        if (ierro != 3 && erlarg > ertest) {       /* line 1499 */

           [code elided]

            for (k = id; k <= jupbnd; ++k) {
                maxerr = iord[nrmax];
                errmax = elist[maxerr];
                if (fabs(blist[maxerr] - alist[maxerr]) > small) {
                    continue;
                }
                ++nrmax;
                /* L50: */
            }
        }

        [code elided]

    }

The 'continue' statement in the inner loop will jump to the end of the
inner loop.  But in R 2.11.1 the corresponding statement was 'goto L90',
a jump to the end of the outer loop.  (In the QUADPACK Fortran code the
corresponding statement,

  if(dabs(blist(maxerr)-alist(maxerr)).gt.small) go to 90

is preceded by the comment "jump out of do-loop" --
http://www.netlib.org/quadpack/dqagse.f, lines 375-376).  I have
verified that replacing this 'continue' by a jump to the end of the
outer loop will restore the R 2.11.1 behaviour of integrate(), at least
in the test cases given above.  Somebody more knowledgeable about the
code than I am should verify that this change is safe and appropriate,
but my tentative suggestion is that this change should be made in the
current integrate.c. The replacement needs to be made twice, once each
in routines dqagie and dqagse.  A patch follows.    

--- E:\R\R-2.15.2\src\appl\integrate.c  2013-02-23 21:47:53.821259100 -0500
+++ E:\R\integrate.fix\src\integrate.c  2013-02-23 21:47:43.972695800 -0500
@@ -809,7 +809,7 @@
                 maxerr = iord[nrmax];
                 errmax = elist[maxerr];
                 if (fabs(blist[maxerr] - alist[maxerr]) > small) {
-                    continue;
+                    goto L90;
                 }
                 ++nrmax;
                 /* L50: */
@@ -850,6 +850,8 @@
         extrap = FALSE;
         small *= .5;
         erlarg = errsum;
+L90:
+        ;
     }

 /* L100:     set final result and error estimate. */
@@ -1511,7 +1513,7 @@
                 maxerr = iord[nrmax];
                 errmax = elist[maxerr];
                 if (fabs(blist[maxerr] - alist[maxerr]) > small) {
-                    continue;
+                    goto L90;
                 }
                 ++nrmax;
                 /* L50: */
@@ -1551,6 +1553,8 @@
         extrap = FALSE;
         small *= .5;
         erlarg = errsum;
+L90:
+        ;
     }