Bugzilla – Bug 15162

Underflow error in pt and pf

Last modified: 2013-03-28 20:17:06 UTC

> pt(45, df=5253, lower.tail=FALSE, log.p=TRUE) [1] -Inf But: > pt(50, df=5253, lower.tail=FALSE, log.p=TRUE) [1] -1027.082 This cannot be true. The same situation happens to pf: > pf(45^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE) [1] -Inf > pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE) [1] -1026.389 Plus, the results for pt and pf given the same value should be the same. That is, the output of this: > pt(50, df=5253, lower.tail=FALSE, log.p=TRUE) should be the same with that of this: > pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE) But it does not. The problem for pt turns out to be an underflow in bgrat function in nmath/toms708.c file. But for values of 50, the function being used to compute the approximation is bpser (label 110) and it works. Not sure about pf yet.

I think a fix for this is that right after the following statement: bgrat(b0, a0, y0, x0, w1, 15*eps, &ierr1); We add: if(w1 <= 0) goto L110; Let me know whether this is appropriate or not.

To make the results of pt and pf the same, modify the following lines from nmath/pt.c: val = (n > x * x) ? pbeta (x * x / (n + x * x), 0.5, n / 2., /*lower_tail*/0, log_p) : pbeta (1. / nx, n / 2., 0.5, /*lower_tail*/1, log_p); to: val = (n > x * x) ? pbeta (x * x / (n + x * x), 0.5, n / 2., lower_tail, log_p) : pbeta (1. / nx, n / 2., 0.5, !lower_tail, log_p); Now we have the same value.

Thank you for the succinct report and you the patch proposals! I am assigning this bug to myself, (I hope, though the bugzilla interface behaves un-intuitive here) but will need a couple of days before getting to this, as really I'll want to check the situation and compare with my somewhat extensive "test suite". Martin

(In reply to comment #3) > Thank you for the succinct report and you the patch proposals! > > I am assigning this bug to myself, (I hope, though the bugzilla interface > behaves un-intuitive here) > but will need a couple of days before getting to this, > as really I'll want to check the situation and compare with my somewhat > extensive > "test suite". > > Martin First: as you hinted, too, this is all about pbeta() really, which is called from pt() and pf() Your proposal (to change in src/nmath/toms708.c ) if(w1 <= 0) goto L110; would have let to very inefficient (aka S.L.O.W) code for larger 'df', e.g., df <- 1e10 x2 <- 1450; pbeta(x2/(df + x2), 0.5, df/2, lower.tail=FALSE, log.p=TRUE) would take about 5 sec.. I have gone through much of bratio(), notably got rid of many more 'goto' (in order to allow translation to more modern comp languages), and have introduced more careful rescaling, notably working in log scale in even more places. Finally this bug is fixed... I'd be happy if other pbeta() related bugs were reported now, rather than in a few months time... Martin

I'm currently in the middle of code review on toms708.c that I took from R-alpha_2013-03-13_r62247.tar.gz. I noticed that in function bup, there are some lines: ret_val = log_p ? brcmp1(mu, a, b, x, y, TRUE) - log(a) : brcmp1(mu, a, b, x, y, FALSE) / a; if (n == 1 || (log_p && ret_val == ML_NEGINF) || (!log_p && ret_val == 0.)) return ret_val; These "log_p"s are undefined. Shouldn't it be "give_log" instead?

Also, another question: How about my proposed change to pt.c that addressed the apparent discrepancy between pf and pt? I would expect the output of this: > pt(50, df=5253, lower.tail=FALSE, log.p=TRUE) to be the same with that of this: > pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE) It does not seem to be applied in the latest alpha source (R-alpha_2013-03-13_r62247.tar.gz)

(In reply to comment #5) > I'm currently in the middle of code review on toms708.c that I took from > R-alpha_2013-03-13_r62247.tar.gz. > > I noticed that in function bup, there are some lines: > ret_val = log_p > ? brcmp1(mu, a, b, x, y, TRUE) - log(a) > : brcmp1(mu, a, b, x, y, FALSE) / a; > if (n == 1 || (log_p && ret_val == ML_NEGINF) || (!log_p && ret_val == 0.)) > return ret_val; > > These "log_p"s are undefined. Shouldn't it be "give_log" instead? Well, they are identical (hint: the compiler would have complained !!), but you are right that --- for better readability --- we should use give_log there. As the consequence 'nil' (identical compiled code!), I'll not commit such a change now... but only once there is need for another tweak.

(In reply to comment #6) > Also, another question: How about my proposed change to pt.c that addressed the > apparent discrepancy between pf and pt? I would expect the output of this: > > pt(50, df=5253, lower.tail=FALSE, log.p=TRUE) > > to be the same with that of this: > > pf(50^2, df1=1, df2=5253, lower.tail=FALSE, log.p=TRUE) > > It does not seem to be applied in the latest alpha source > (R-alpha_2013-03-13_r62247.tar.gz) Hopefully not! Your statement and your proposed patch were completely wrong. You have to redo your math, to see that the above should *NOT* be the same (but differ by a somewhat well known mathematical constant) !!

Ah, you are right. They should differ by a factor of 0.5. Thanks.