Bug 15628 - dpois_raw accuracy
Summary: dpois_raw accuracy
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: Accuracy (show other bugs)
Version: R-devel (trunk)
Hardware: Other All
: P5 enhancement
Assignee: R-core
Depends on:
Reported: 2014-01-03 16:55 UTC by M Welinder
Modified: 2014-01-10 04:38 UTC (History)
0 users

See Also:

Extended bd0 code (14.69 KB, text/x-csrc)
2014-01-10 04:38 UTC, M Welinder

Note You need to log in before you can comment on or make changes to this bug.
Description M Welinder 2014-01-03 16:55:56 UTC
The work horse for dpois, dgamma, and dchisq is dpois_raw.  It has
problems with accuracy when the result is small.  For example:

> options(digits=20)
> dpois(1400,1000)
[1] 1.466773344196698e-33

    1.466773344196568303089528189573057152277244031720195885... × 10^-33

I.e., only about 13 correct digits.

The basic problem is that the function works with log-probability.  Basically
it does


stirlerr is about 1/(12x) and thus harmless, but bd0 can get big.  In this
case it's about 71 and the rounding error on its value is blown up by exp.

It's not entirely clear to me how to solve this completely, but there are
things that can be done that will be faster and more accurate for certain
parameter ranges.  Assuming the non-log case:

x<=170 && lambda<690:

  This assumes an accurate fact(double) function.  An accurate gamma function
  can be used if also x>=1 (so that doing x+1 doesn't affect accuracy much).
  Note: pow(lambda,x) can overflow.  If that happens, bail out and try the
  next formula:

x<=170 && lambda<2*690:
  (f < 1 ? f*f/fact(x) : f/fact(x)*f) where f=pow(lambda,x/2)*exp(-lambda/2)

  Again, pow can overflow and that needs to be detected.

The above can be used for the log case too, except when the products

These ranges are too restrictive to help with the dpois(1400,1000) case,
but they will improve things like dpois(40,7.5).
Comment 1 M Welinder 2014-01-10 04:38:55 UTC
Created attachment 1552 [details]
Extended bd0 code

I think this problem can be solved with the attached code.  The code was
developed for Gnumeric, but should work more or less directly in R.

The code implements ebd0, an extended version of bd0, that returns its
result in two parts.  The basic approach is to scale x/M to become close
to 1 using a factor from a fixed set of factors whose logs are known
to higher-than-double precision.  I used ~100 bits precision which is
probably somewhat higher than needed.  Someone ought to check the values.

The result is split in whole and fractional part in a simple way.  One
could use Dekker addition for even higher accuracy.

With this, the main case of dpois_raw just becomes

    ebd0 (x, lambda, &yh, &yl);
    yl += stirlerr(x);
    f2 = M_2PI * x;

    return give_log
	    ? -yl - yh - 0.5 * log (f2)
	    : exp (-yl) * exp (-yh) / sqrt (f2);

Note, that dbinom_raw and dt appear to suffer from the same problem.