Bug 15810 - Negative zero in labels returned by cut()
Status: UNCONFIRMED
Alias: None
Product: R
Classification: Unclassified
Component: I/O
Version: R 3.1.0
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance: P5 minor
Assignee: R-core
Reported: 2014-05-24 18:06 UTC by J. R. M. Hosking
Modified: 2015-12-24 14:02 UTC

Description J. R. M. Hosking 2014-05-24 18:06:33 UTC
> levels(cut(0, breaks=c(-1, round(-0.1), 1)))
[1] "(-1,-0]" "(-0,1]"

Is the "-0" in the labels intended?  It is surprising, given that
printing 'round(-0.1)' gives "0" rather than "-0".

The "-0" occurs because the numeric values in the labels are obtained
by formatC().  If this behaviour is deemed to be a bug, one possible
fix is, in cut.default(), in the line

   ch.br <- formatC(breaks, digits = dig, width = 1L)

replace 'breaks' by 'breaks + 0'.
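
A minimal sketch of what the proposed change would do, calling formatC() directly rather than going through cut(). The value digits = 3 is an assumption that mirrors cut()'s default dig.lab; the variable names are illustrative only.

```r
# Breaks from the report; round(-0.1) is the value that was labelled "-0".
breaks <- c(-1, round(-0.1), 1)

# Current behaviour: the zero is formatted as stored, sign included
# (the report shows "-0" in the resulting labels).
before <- formatC(breaks, digits = 3, width = 1L)

# Proposed behaviour: adding a positive zero turns -0 into +0 under the
# default IEEE rounding mode and leaves the other breaks unchanged.
after <- formatC(breaks + 0, digits = 3, width = 1L)

print(before)
print(after)
```

This only illustrates the arithmetic behind the suggestion; whether cut.default() should be changed is the open question of this report.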

[The use case that exhibited this behaviour was an attempt to generate
groups of approximately equal size with round numbers for the breaks,
as in

  z <- diff(faithful$eruptions)
  qz <- quantile(z)
  table(cut(z, c(floor(qz[1]), round(qz[2:4]), ceiling(qz[5]))))

]

> sessionInfo()
R version 3.1.0 (2014-04-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


R-devel [R Under development (unstable) (2014-05-20 r65701)]
gives the same result.
Comment 1 Brian Ripley 2014-05-26 16:29:11 UTC
Most implementations of C follow IEC60559 and have signed zeros.  Although R does not make use of them, it is still a C program and here you are seeing the results of (C-level) sprintf("%.0f", -0).

Given the description

     Formatting numbers individually and flexibly, using ‘C’ style
     format specifications.

it is intended for formatC, at least.
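
For readers who want to see the signed zero from within R, here is a small sketch. It assumes a platform with IEC 60559 arithmetic; the name neg_zero is illustrative, and the exact strings returned by the C-style formatters may depend on the platform's C library.

```r
# In R, unary minus applied to 0 gives a negative zero on IEC 60559 platforms.
neg_zero <- -0

# Comparison treats the two zeros as equal, so the sign is easy to miss.
neg_zero == 0

# The reciprocal exposes the sign: it is -Inf for a negative zero.
1 / neg_zero

# print() and format() use R's own number formatting.
print(neg_zero)
format(neg_zero)

# sprintf() and formatC() hand the value to C-style formatting, which is
# where a "-0" can appear.
sprintf("%.0f", neg_zero)
formatC(neg_zero, digits = 3, width = 1L)
```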
Comment 2 Sander Maijers 2015-12-24 14:02:11 UTC
This is not an I/O bug, so the Component is wrong.