Bug 15162 - Underflow error in pt and pf
Underflow error in pt and pf
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Accuracy
R 2.15.2 patched
All All
: P5 normal
Assigned To: R-core
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-01-09 21:21 UTC by Roby Joehanes
Modified: 2013-03-28 20:17 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Roby Joehanes 2013-01-09 21:21:03 UTC
> pt(45, df=5253, lower.tail=FALSE, log.p=TRUE)
[1] -Inf

But:
> pt(50, df=5253, lower.tail=FALSE, log.p=TRUE)
[1] -1027.082

This cannot be true. The same situation happens to pf:

> pf(45^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE)
[1] -Inf

> pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE)
[1] -1026.389

Plus, the results for pt and pf given the same value should be the same. That is, the output of this:
> pt(50, df=5253, lower.tail=FALSE, log.p=TRUE)

should be the same with that of this:
> pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE)

But it does not.

The problem for pt turns out to be an underflow in bgrat function in nmath/toms708.c file. But for values of 50, the function being used to compute the approximation is bpser (label 110) and it works.

Not sure about pf yet.
Comment 1 Roby Joehanes 2013-01-09 22:01:25 UTC
I think a fix for this is that right after the following statement:
bgrat(b0, a0, y0, x0, w1, 15*eps, &ierr1);

We add: if(w1 <= 0) goto L110;

Let me know whether this is appropriate or not.
Comment 2 Roby Joehanes 2013-01-10 00:10:10 UTC
To make the results of pt and pf the same, modify the following lines from nmath/pt.c:

	val = (n > x * x)
	    ? pbeta (x * x / (n + x * x), 0.5, n / 2., /*lower_tail*/0, log_p)
	    : pbeta (1. / nx,             n / 2., 0.5, /*lower_tail*/1, log_p);


to:

	val = (n > x * x)
	    ? pbeta (x * x / (n + x * x), 0.5, n / 2., lower_tail, log_p)
	    : pbeta (1. / nx,             n / 2., 0.5, !lower_tail, log_p);

Now we have the same value.
Comment 3 Martin Maechler 2013-01-11 08:43:10 UTC
Thank you for the succinct report and you the patch proposals!

I am assigning this bug to myself, (I hope, though the bugzilla interface behaves un-intuitive here)
but will need a couple of days before getting to this,
as really I'll want to check the situation and compare with my somewhat extensive 
"test suite".

Martin
Comment 4 Martin Maechler 2013-03-04 17:23:48 UTC
(In reply to comment #3)
> Thank you for the succinct report and you the patch proposals!
> 
> I am assigning this bug to myself, (I hope, though the bugzilla interface
> behaves un-intuitive here)
> but will need a couple of days before getting to this,
> as really I'll want to check the situation and compare with my somewhat
> extensive 
> "test suite".
> 
> Martin

First: as you hinted, too,  this is all about  pbeta() really, which is called
      from pt() and pf()

Your proposal (to change in src/nmath/toms708.c )

   if(w1 <= 0) goto L110;

would have let to very inefficient (aka S.L.O.W) code
for larger 'df', e.g.,

df <- 1e10
x2 <- 1450; pbeta(x2/(df + x2), 0.5, df/2, lower.tail=FALSE, log.p=TRUE)

would take about 5 sec..
I have gone through much of bratio(), notably got rid of many more 'goto'
(in order to allow translation to more modern comp languages),
and have introduced more careful rescaling, notably working in log scale in even more places.
Finally this bug is fixed...
I'd be happy if other  pbeta() related bugs were reported now, rather than in a few months time...

Martin
Comment 5 Roby Joehanes 2013-03-14 21:41:52 UTC
I'm currently in the middle of code review on toms708.c that I took from R-alpha_2013-03-13_r62247.tar.gz.

I noticed that in function bup, there are some lines:
    ret_val = log_p
	? brcmp1(mu, a, b, x, y, TRUE) - log(a)
	: brcmp1(mu, a, b, x, y, FALSE)  / a;
    if (n == 1 || (log_p && ret_val == ML_NEGINF) || (!log_p && ret_val == 0.))
	return ret_val;

These "log_p"s are undefined. Shouldn't it be "give_log" instead?
Comment 6 Roby Joehanes 2013-03-14 23:21:25 UTC
Also, another question: How about my proposed change to pt.c that addressed the apparent discrepancy between pf and pt? I would expect the output of this:
> pt(50, df=5253, lower.tail=FALSE, log.p=TRUE)

to be the same with that of this:
> pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE)

It does not seem to be applied in the latest alpha source (R-alpha_2013-03-13_r62247.tar.gz)
Comment 7 Martin Maechler 2013-03-15 08:55:33 UTC
(In reply to comment #5)
> I'm currently in the middle of code review on toms708.c that I took from
> R-alpha_2013-03-13_r62247.tar.gz.
> 
> I noticed that in function bup, there are some lines:
>     ret_val = log_p
>     ? brcmp1(mu, a, b, x, y, TRUE) - log(a)
>     : brcmp1(mu, a, b, x, y, FALSE)  / a;
>     if (n == 1 || (log_p && ret_val == ML_NEGINF) || (!log_p && ret_val == 0.))
>     return ret_val;
> 
> These "log_p"s are undefined. Shouldn't it be "give_log" instead?


Well, they are identical (hint: the compiler would have complained !!),
but you are right that --- for better readability ---
we should use  give_log there.

As the consequence 'nil'  (identical compiled code!),
I'll not commit such a change now... but only once there is need for another tweak.
Comment 8 Martin Maechler 2013-03-15 08:57:31 UTC
(In reply to comment #6)
> Also, another question: How about my proposed change to pt.c that addressed the
> apparent discrepancy between pf and pt? I would expect the output of this:
> > pt(50, df=5253, lower.tail=FALSE, log.p=TRUE)
> 
> to be the same with that of this:
> > pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE)
> 
> It does not seem to be applied in the latest alpha source
> (R-alpha_2013-03-13_r62247.tar.gz)

Hopefully not!  Your statement and your proposed patch
were completely wrong.

You have to redo your math, to see that the above should *NOT* be the same
(but differ by a somewhat well known mathematical constant) !!
Comment 9 Roby Joehanes 2013-03-28 20:17:06 UTC
Ah, you are right. They should differ by a factor of 0.5. Thanks.