Bug 15988 - Bug in function hist with wide extreme breakpoints
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Graphics
Version: R 3.1.0
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance: P5 major
Assignee: R-core
Reported: 2014-09-21 13:26 UTC by lucatrv
Modified: 2014-10-01 02:20 UTC
Description lucatrv 2014-09-21 13:26:34 UTC
I believe I hit a bug in function hist. This happens when setting the extreme breakpoints very wide (for instance to +/- Inf, or to +/- 1e9) with a number of bins between 2 and 4. I summarize below how to reproduce the issue. I am using R 3.1.0 64-bit under Windows 64-bit.

x <- runif(100)

hist(x, breaks = c(-10, 10), plot = FALSE)$counts
[1] 100
OK

hist(x, breaks = c(-Inf, Inf), plot = FALSE)$counts
[1] 100
OK

hist(x, breaks = c(-10, 0.5, 10), plot = FALSE)$counts
[1] 48 52
OK

hist(x, breaks = c(-Inf, 0.5, Inf), plot = FALSE)$counts
[1] 100   0
BUG?

hist(x, breaks = c(-10, 0.33, 0.66, 10), plot = FALSE)$counts
[1] 30 30 40
OK

hist(x, breaks = c(-Inf, 0.33, 0.66, Inf), plot = FALSE)$counts
[1] 100   0   0
BUG?

hist(x, breaks = c(-10, 0.25, 0.5, 0.75, 10), plot = FALSE)$counts
[1] 25 23 24 28
OK

hist(x, breaks = c(-Inf, 0.25, 0.5, 0.75, Inf), plot = FALSE)$counts
[1] 100   0   0   0
BUG?

hist(x, breaks = c(-10, 0.2, 0.4, 0.6, 0.8, 10), plot = FALSE)$counts
[1] 21 14 20 23 22
OK

hist(x, breaks = c(-Inf, 0.2, 0.4, 0.6, 0.8, Inf), plot = FALSE)$counts
[1] 21 14 20 23 22
OK

The issue does not happen with a number of bins greater than 4.

The issue happens also when setting the extreme breakpoints to a large finite value (for instance to +/- 1e9).
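
As an independent cross-check of the expected counts, the same bins can be tabulated without going through hist. The following is a minimal sketch, assuming base R's cut and table; the seed, the helper count_bins and the variable names are illustrative and not part of the original report.

```r
set.seed(42)  # illustrative seed, only for reproducibility
x <- runif(100)

# Count observations per bin with cut(), which uses (a, b] intervals by default
# (hypothetical helper, introduced only for this example)
count_bins <- function(x, breaks) as.vector(table(cut(x, breaks = breaks)))

finite_counts   <- count_bins(x, c(-10, 0.5, 10))
infinite_counts <- count_bins(x, c(-Inf, 0.5, Inf))

print(finite_counts)
print(infinite_counts)
print(identical(finite_counts, infinite_counts))
```

Both calls should give the same counts, because every value of x lies strictly inside (0, 1), so the outer breakpoints cannot change which bin a value falls into.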
Comment 1 Duncan Murdoch 2014-09-25 11:23:11 UTC
This is an unfortunate consequence of documented behaviour: "A numerical tolerance of 1e-7 times the median bin size is applied when counting entries on the edges of bins." In the case of 2 or 3 bins, your median bin size is infinite, so the tolerance is infinite as well.
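
The arithmetic behind this can be sketched as follows. The helper name bin_tolerance is hypothetical; it simply restates the quoted sentence from the documentation and is not taken from the source of hist.

```r
# Tolerance as described in the quoted documentation:
# 1e-7 times the median bin width (hypothetical helper, not R's source)
bin_tolerance <- function(breaks) 1e-7 * median(diff(breaks))

bin_tolerance(c(-10, 0.5, 10))                   # finite and tiny
bin_tolerance(c(-Inf, 0.5, Inf))                 # Inf: both bins are infinitely wide
bin_tolerance(c(-Inf, 0.33, 0.66, Inf))          # Inf: median of (Inf, 0.33, Inf)
bin_tolerance(c(-Inf, 0.25, 0.5, 0.75, Inf))     # Inf: median averages 0.25 and Inf
bin_tolerance(c(-Inf, 0.2, 0.4, 0.6, 0.8, Inf))  # finite: median is an interior width
bin_tolerance(c(-1e9, 0.5, 1e9))                 # roughly 100, far wider than the data range
```

Under this reading, the tolerance only becomes finite and small once the interior bins outnumber the two outer ones, which matches the reporter's observation that the problem appears with 2 to 4 bins and also with outer breakpoints of +/- 1e9.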
Comment 2 Martin Maechler 2014-09-30 16:51:01 UTC
Although, very strictly speaking, this is documented behavior rather than a bug, we have decided after quite a bit of offline discussion to fix the problem, so that all your cases (and more) work the same for outer boundaries (-10, 10), (-1e9, 1e9), or (-Inf, Inf).
Comment 3 lucatrv 2014-10-01 02:20:21 UTC
Thanks for your feedback, I think this is a wise decision.