Bug 15620 - dnorm accurate for large x
Summary: dnorm accurate for large x
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Accuracy
Version: R 3.0.2
Hardware: Other All
Importance: P5 normal
Assignee: Martin Maechler
URL:
Depends on:
Blocks:
 
Reported: 2013-12-30 04:40 UTC by M Welinder
Modified: 2015-12-14 13:49 UTC

See Also:


Description M Welinder 2013-12-30 04:40:35 UTC
dnorm uses the obvious formula c*exp(-0.5*x*x), ignoring mu and sigma here
for simplicity.  That is fine for moderate arguments, but exp has a nasty
habit of blowing up rounding errors such as those for x*x when the argument
is large: an absolute error in exp's argument becomes a relative error of
about the same size in its result, so for x near 35 (where x*x is about 1225)
even a half-ulp rounding error in x*x, about 2^-43, turns into a relative
error near 1e-13 in the density.

For example:

x=35+2^-46.

dnorm(x,0,1) --> 3.940396277134232207447585696367E-267
mathematica  --> 3.940396277134064456725435240768312101152712969120510158e-267

That's only about 13 correct digits.  That's not a catastrophe if dnorm
is the end product, but dnorm is used as a building block in countless
places.

Suggestion for large x, non-log case:

1. Drop the sign (to ensure symmetry).
2. Split x into w+f with |f|<=1/2
3. Use c*exp(-0.5*w*w)*exp(-0.5*f*f)*exp(-w*f)

This helps because w*w has no rounding error and w*f has ~5 extra bits of
accuracy because it isn't lumped together with the large w*w.

It's not clear what makes x large, but something like >15 is probably fine.

This would be 3x slower for large x.  If that is not acceptable, one
could reduce it to 2x by tabulating the first exp.  (With c folded in for
a tiny bit of extra accuracy.)
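For concreteness, here is a minimal R sketch of the above (dnorm_split is a
hypothetical name, not code from this report; an actual fix would go into
the C sources):

    ## Sketch of the suggestion: x = w + f with |f| <= 1/2
    dnorm_split <- function(x) {
        x <- abs(x)       # 1. drop the sign
        w <- round(x)     # 2. split x into w + f with |f| <= 1/2
        f <- x - w
        ## 3. w*w is exact in double precision (w has few bits here), and
        ##    the cross term w*f is no longer lumped in with the large w*w.
        exp(-0.5 * w * w) * exp(-0.5 * f * f) * exp(-w * f) / sqrt(2 * pi)
    }
    dnorm_split(35 + 2^-46)  # should agree with the Mathematica value
                             # above to nearly full double precision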
Comment 1 M Welinder 2013-12-30 14:22:49 UTC
Actually, we can do even better.  dnorm underflows around x=37; with long
double it would underflow around 150.  So the whole part will have no more
than 8 interesting bits.  We can cram 26 bits into the first part without
rounding error, so code like this ought to do better:

x = fabs (x);
if (x >= 256) {
    /* The main formula might produce a NaN in this case.  */
    return 0;
} else {
    /* x1 will have only 24 or fewer bits.  */
    double x1 = floor (x * 65536 + 0.5) / 65536;
    double x2 = x - x1;
    /* c = 1/sqrt(2*pi) as above */
    return c * exp(-0.5*x1*x1) * exp(-0.5*x2*x2) * exp(-x1*x2);
}
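(As noted above, dnorm already underflows around x = 37 in double precision,
so returning 0 for x >= 256 loses nothing; the cutoff merely guarantees that
x1, i.e. x rounded to a multiple of 1/65536, fits in 8 + 16 = 24 bits, which
keeps x1*x1 exact.)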
Comment 2 Brian Ripley 2013-12-30 17:00:20 UTC
All you can assume about long double is that it is no less accurate than double.  See the C99/C11 standards.
Comment 3 M Welinder 2013-12-30 18:27:20 UTC
This seems to work for me without making assumptions about the
precision of "double".


    x = fabs (x);
    if (x >= 2 * sqrt (DBL_MAX))
        return R_D__0;
    if (give_log)
        return -(M_LN_SQRT_2PI + 0.5 * x * x + log(sigma));
    else if (x < 5)
        return M_1_SQRT_2PI * exp(-0.5 * x * x) / sigma;
    else {
        /*
         * Split x into two parts, x = x1 + x2, such that |x2| <= 2^-16.
         * Assuming that we are using IEEE doubles, that means that
         * x1*x1 is error free for x < 1024 (above which we will underflow
         * anyway).  If we are not using IEEE doubles then this is
         * still an improvement over the naive formula.
         */
        double x1 = floor (x * 65536 + 0.5) / 65536;
        double x2 = x - x1;
        return M_1_SQRT_2PI / sigma *
            (exp(-0.5 * x1 * x1) *
             exp((-0.5 * x2 - x1) * x2));
    }
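(Compared with the comment-1 version, the last two exponential factors have
been merged using the identity -x^2/2 = -x1^2/2 + (-x2/2 - x1)*x2, so only
two calls to exp are needed instead of three.)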
Comment 4 Martin Maechler 2013-12-31 16:38:22 UTC
(In reply to M Welinder from comment #3)

I've implemented a version of this in R-devel (actually using ldexp(.) for
faster multiplication by powers of 2), and then checked the result for both
accuracy and time consumption:
For accuracy checking (my own) package 'Rmpfr' is very useful;
for timing, I've used package  'microbenchmark'.

Summary:
 - The new code uses 20% more CPU in a (somewhat) realistic setting
 - The accuracy is improved, indeed,  but then really only for these very rare
   numbers such as   <integer> +/- 2^-50  (you've used  35 + 2^-46).
I'm very hesitant to activate this, because the problem is triggered so
rarely *and*, after all, the accuracy is still 13 digits in these rare cases,
with the full 2e-16 (or better) relative accuracy in all the normal cases.

Here's some R "log" for the timing {on one platform}:


#### We provide  pnorm() explicitly -- but at the price of a warning about
#### "shadowing" pnorm() from R's 'stats' package.

###--> ~/R/Pkgs/Rmpfr/R/special-fun.R
## dnorm() is much simpler {and not part of 'Rmpfr'}:
dnorm <- function (x, mean = 0, sd = 1, log = FALSE) {
    if(is.numeric(x) && is.numeric(mean) && is.numeric(sd))
        stats::dnorm(x, mean, sd, log=log)
    else if(is(x, "mpfr") || is(mean, "mpfr") || is(sd, "mpfr")) {
        stopifnot(length(log) == 1)
        rr <- x <- (as(x, "mpfr") - mean) / sd
        prec.x <- max(.getPrec(x))
        twopi <- 2*Const("pi", prec.x)
        ## f(x) =  1/(sigma*sqrt(2pi)) * exp(-1/2 x^2)
        if(log) ## log( f(x) ) = -[ log(sigma) + log(2pi)/2 + x^2 / 2]
            -(log(as(sd,"mpfr")) + (log(twopi) + x*x)/2)
        else exp(-x^2/2) / (as(sd,"mpfr")*sqrt(twopi))
    } else stop("invalid arguments (x,mean,sd)")
}

options(digits=21)
dnorm(5)
require("Rmpfr")
all.equal(dnorm(mpfr(5, 100)), dnorm(5), tol= 1e-17) # 2.0187 e-17
all.equal(dnorm(mpfr(6, 120)), dnorm(6), tol= 1e-17) # 9.8807 e-17
all.equal(dnorm(mpfr(7, 120)), dnorm(7), tol= 1e-17) # 9.8807 e-17

relErr <- function(target, current) as.numeric(1 - current/target)

x0 <- seq(2, 40, by=1/4)
x <- sort(c(x0, x0+2^-46, x0+2^-50, x0- 2^-54))

system.time(## takes a couple of seconds:
re.x <- sapply(x, function(.) relErr(dnorm(mpfr(., 150)), dnorm(.)))
) ## 7.5

## Now, I'm using a version of  Morten Welinder's proposal for PR#15620
## https://bugs.r-project.org/bugzilla/show_bug.cgi?id=15620
## in ~/R/D/r-devel/R/src/nmath/dnorm.c

##---> But this is *not* active by default: You need to recompile
##  with  -DR_ACCURATE_dnorm to get that.
## I've kept both the logs (microbenchmark)
## ->  ./distr-ex.Rout.~Rdevel~
## ->  ./distr-ex.Rout
## and the plots
## -> ./dnorm-relErr_R_3.0.2.pdf  and
## -> ./dnorm-relErr_R_3.1.0.pdf <<- with the accurate (slow) code

pdf(print(paste0("dnorm-relErr_R_",getRversion(),".pdf")))

op <- par(mar=par("mar")+c(0,0,0,1.5))
plot(abs(re.x) ~ x, subset=abs(re.x) < 1e-6, type="o", col="gray25", log="y",
     ylab = expression(group("|", rel.err, "|")),
     main = "| rel.Error( dnorm(x) ) |",
     yaxt="n"); sfsmisc::eaxis(2)
col.e <- adjustcolor("tomato", 0.75)
eps.c <- .Machine$double.eps
f <- c(1,16, 256)
(fE <- as.expression(lapply(f, function(.)
    if(. == 1) quote(epsilon[c]) else substitute(F*epsilon[c], list(F= .)))))
abline(h = f*eps.c, lty=3, lwd=2, col=col.e)
axis(4, at=f*eps.c, labels = fE, las=1, col=col.e, col.axis=col.e)
par(op)
abline(v=3:5, lty=2, col="gray")
mtext(R.version.string, 1, adj=1, cex = 0.5, line=-1)


detach("package:Rmpfr")
rm(dnorm) # from above
require(microbenchmark)

mb <- microbenchmark(dnorm(x), dnorm(x, log=TRUE))
mb
##---- with the accurate dnorm: --------  2 calls -----

## Unit: microseconds
##                  expr     min      lq   median      uq      max neval
##              dnorm(x) 117.609 118.168 118.5875 125.711 1342.033   100
##  dnorm(x, log = TRUE)  76.264  77.102  77.5220  85.204  182.979   100

## Unit: microseconds
##                  expr     min      lq  median       uq     max neval
##              dnorm(x) 117.610 118.308 119.006 140.9365 153.088   100
##  dnorm(x, log = TRUE)  76.265  76.824  77.382  77.6620  88.556   100

##---- with the regular R 3.0.2 dnorm : --------  2 calls -----

## Unit: microseconds
##                  expr    min     lq  median     uq     max neval
##              dnorm(x) 83.249 83.807 94.7025 96.937 107.274   100
##  dnorm(x, log = TRUE) 82.131 82.690 82.9690 83.249 498.094   100

## Unit: microseconds
##                  expr    min     lq median     uq     max neval
##              dnorm(x) 83.528 84.087 96.658 97.217 108.949   100
##  dnorm(x, log = TRUE) 82.410 82.690 83.248 83.249  97.217   100
Comment 5 M Welinder 2013-12-31 18:01:39 UTC
I think you are mistaken as to how often this happens -- I didn't discover
it for constructed numbers like 35+2^-46 but for some random number.

Here's another sample with only 13 correct digits:

(1/Sqrt[2*Pi])*Exp[-(2486362294277461.5/2^46)^2/2] -->
    3.196270108727368905046765648427127746702004721034032742... × 10^-272

R gets...
> dnorm(2486362294277461.5/2^46,0,1)
[1] 3.196270108727435e-272

(That's 35+1/3 rounded to nearest double, btw.)

Basically it follows from the use of exp that this problem will exist
for all large x for which x*x suffers rounding errors.  The reason you
don't see the problem is your test data:

    x0 <- seq(2, 40, by=1/4)

*All* these numbers have fewer than 10 significant bits, so x*x is error
free.  Use "by=1/10" and you will see the problem.
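For example, reusing relErr() and the mpfr-aware dnorm() wrapper from
comment 4 (a hypothetical session, not part of this report):

    ## The same experiment on a grid whose points need many mantissa bits:
    x0   <- seq(2, 40, by = 1/10)   # 1/10 is not exactly representable
    re.x <- sapply(x0, function(.) relErr(dnorm(mpfr(., 150)), dnorm(.)))
    summary(abs(re.x))   # relative errors around 1e-13 should show up
                         # for the larger x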
Comment 6 Martin Maechler 2014-01-01 14:54:37 UTC
(In reply to M Welinder from comment #5)
> I think you are mistaken as to how often this happens -- I didn't discover
> it for constructed numbers like 35+2^-46 but for some random number.
> 

You are right; I had been mistaken there.
Indeed, it is typical that about 8 bits of precision are lost (on my 64-bit platform) for "most" numbers around 35.
OTOH, the extra effort is not negligible (and you never claimed it would be).

It is a border case of what "should happen"... in other cases, too, we do worse than 13 digits of accuracy.

After another round of experiments, I now tend to *make* the change (more accurate but slower for |x| > 5), but keep a compilation option which prefers speed.

Martin