Bug 15774 - addition should return NA and integer overflow warning, but does not when compiled with clang
Summary: addition should return NA and integer overflow warning, but does not when com...
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Accuracy (show other bugs)
Version: R 3.1.0
Hardware: Other All
: P5 critical
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2014-04-23 23:07 UTC by Dan Tenenbaum
Modified: 2014-05-05 20:20 UTC (History)
5 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dan Tenenbaum 2014-04-23 23:07:59 UTC
Apologies for duplication, since I posted this on R-sig-mac. But nobody there has said they would fix this and it seems to me to be rather serious. 

Also, since this bug affects a released version, I'd appreciate knowing what the plan is to fix it, i.e., will you make a point release that addresses it, or just ask people to use R-patched? 

This bug affects some Bioconductor packages built on Mavericks (the binaries are not yet publically available but the build report is) and I'm not sure if we should start distributing them until we know the way forward from this.

Thank you! Here's my bug report:

On R-3.1.0, Mavericks build:

> 1980000020L + 222000000L
[1] -2092967276

On R-3.1.0, Snow Leopard build:

> 1980000020L + 222000000L
[1] NA
Warning message:
In 1980000020L + 222000000L : NAs produced by integer overflow

A bug, perhaps?

Mavericks sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

Snow Leopard sessionInfo():

R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
Comment 1 Dan Tenenbaum 2014-04-27 17:14:34 UTC
I'm sorry to reopen this, but this is a very obvious and reproducible bug, and it's a very serious one that should not be closed with "works for me" and no comment. I have reproduced it on multiple machines.

I should also mention that you need to install the Mavericks binary:

http://cran.r-project.org/bin/macosx/R-3.1.0-mavericks.pkg

Building from source may or may not reproduce the bug because it may be due to the build tools used.

If you install this binary and then do this:

1980000020L + 222000000L

If you get the result:
[1] -2092967276

then you have just reproduced the bug. If you cannot reproduce the bug, I would appreciate you posting your sessionInfo() here.
Comment 2 Martin Maechler 2014-04-29 07:50:23 UTC
Notes mostly to Dan:
- Simon Urbanek seems to have assigned this to himself alone; so R-core does not even get e-mails.
- The person who *closed* your report as "WORKSFORME" has gmail address (vikramk<n>@gmail.com for an <n>) unknown to me.
- On R-SIG-Mac,  Luke Tierney
Comment 3 Martin Maechler 2014-04-29 07:59:19 UTC
(In reply to Martin Maechler from comment #2)
> Notes mostly to Dan:
> - Simon Urbanek seems to have assigned this to himself alone; so R-core does
> not even get e-mails.
> - The person who *closed* your report as "WORKSFORME" has gmail address
> (vikramk<n>@gmail.com for an <n>) unknown to me.

Actually he is one of the recent spammers. That these guys now *close* important bug reports is something new.  Very malicious!  (are they paid by ???)

- On R-SIG-Mac,  Luke Tierney  proposed that it must be a compiler bug that one hopefully could work around when building the binary.
I agree with Dan that this is severe and so am adding R-core to the recipient list.
Comment 4 Simon Urbanek 2014-04-29 14:32:31 UTC
I did't assign anything - this was filed as Mac GUI bug so those come to me. I can change the assignment now that more people care about Macs ;). And I suspect this is really a clang issue but I didn't have time to verify yet.
Comment 5 Simon Urbanek 2014-04-29 18:42:42 UTC
I can confirm that this is a clang bug - I can replicate it on Linux (Ubuntu clang version 3.3-5ubuntu4~precise1).
Comment 6 Luke Tierney 2014-04-29 19:37:13 UTC
It looks to me like the compiler is probably arranging for intermediete calculartions to be done in 64 bit integers so the sign changes that would happen with 32 bit overflow aren't being seen. There is code in arithmetic.c
to use doubles for the overflow check, unfortunately not in a way that can be turned on with a compiler flag. Using doubles used to be slower buut might no longer be. If there is no longer a significant speed penalty then we can use doubpes all the time; oherwise we could use a check for clang and, for now, use doubles if clang is being used to compile. A configure check for this behavior is another option but seems a lot of trouble.
Comment 7 Hervé Pagès 2014-05-04 04:50:53 UTC
Might not be easy to convince the clang developpers that this is a bug in their
optimizer since R_integer_plus() in src/main/arithmetic.c assumes wrapped
arithmetic on signed integers to detect overflow. According to the C language
specs, signed integer overflow is undefined behavior which means that code
relying on wrapped arithmetic on signed integers is not portable. Even if a
compiler seems to support it by default (like it's the case for gcc and clang),
it's not committed to it and anything could happen when some level of
optimization is requested.

Following the CERT C Coding Standard to detect overflow, it's easy to modify
R_integer_plus() to detect overflow in a portable way. See patch below. It
doesn't rely on wrapped arithmetic and doesn't use the "double trick" either
(which slows down R_integer_plus() by > 20% on my system).

Some quick benchmarks with 3.1.0 Patched (2014-05-03 r65519) on my system
(64-bit Ubuntu 12.04.4 laptop, gcc 4.6.3):

  set.seed(33)
  x <- sample(2147483647L, 88000000L, replace=TRUE)
  y <- sample(2147483647L, 88000000L, replace=TRUE)

Then:

- Original R_integer_plus() (assumes wrapped arithmetic on signed integers
  to detect overflow):

    > system.time(for (i in 1:10) z <- x + y)
       user  system elapsed 
      6.076   0.832   6.927 

- Patched R_integer_plus() (follows CERT C Coding Standard to detect
  overflow):

    > system.time(for (i in 1:10) z <- x + y)
       user  system elapsed 
      6.084   0.756   6.856 

  It's not slower. It's even a little bit faster. But most importantly, it
  doesn't break clang optimizer.

Here is the patch:

hpages@thinkpad:~/svn/R$ svn diff R-3-1-branch
Index: R-3-1-branch/src/main/arithmetic.c
===================================================================
--- R-3-1-branch/src/main/arithmetic.c	(revision 65519)
+++ R-3-1-branch/src/main/arithmetic.c	(working copy)
@@ -345,20 +345,20 @@
 	}							\
     } while(0)
 
+#define R_INT_MAX  INT_MAX
+#define R_INT_MIN -INT_MAX
+
 static R_INLINE int R_integer_plus(int x, int y, Rboolean *pnaflag)
 {
     if (x == NA_INTEGER || y == NA_INTEGER)
 	return NA_INTEGER;
-    else {
-	int z = x + y;
-	if (GOODISUM(x, y, z) && z != NA_INTEGER)
-	    return z;
-	else {
-	    if (pnaflag != NULL)
-		*pnaflag = TRUE;
-	    return NA_INTEGER;
-	}
+    if (((y > 0) && (x > (R_INT_MAX - y)))
+     || ((y < 0) && (x < (R_INT_MIN - y)))) {
+	if (pnaflag != NULL)
+	    *pnaflag = TRUE;
+	return NA_INTEGER;
     }
+    return x + y;
 }
 
 static R_INLINE int R_integer_minus(int x, int y, Rboolean *pnaflag)
Comment 8 Martin Maechler 2014-05-05 12:38:51 UTC
I've been able to reproduce the bug as well via clang on Fedora 19 (clang 3.3),
and confirmed that  Herve's patch fixes the problem.

Thank you!

Fix is svn rev 65526.
Comment 9 Hervé Pagès 2014-05-05 20:20:46 UTC
Thanks Martin for applying the patch.

Here is a follow-up (don't know how to re-opening that bug, should I open
a new one for this?)

R_integer_minus() in src/main/arithmetic.c also assumes wrapped arithmetic
on signed integers to detect overflow.

See patch below for a "CERT C Coding Standard" compliant R_integer_minus().
The patch also cleans up some macros in src/main/arithmetic.c that are no more
needed.

Here are some quick benchmarks with 3.1.0 Patched (2014-05-05 r65530) on my
system (64-bit Ubuntu 12.04.4 laptop, gcc 4.6.3):

  set.seed(33)
  x <- sample(2147483647L, 555000L, replace=TRUE)
  y <- - sample(2147483647L, 555000L, replace=TRUE)

Then:

- Original R_integer_minus() (assumes wrapped arithmetic on signed integers
  to detect overflow):

    > system.time(for (i in 1:100) z <- x - y)
       user  system elapsed 
      1.192   0.016   1.211 

- Patched R_integer_minus() (follows CERT C Coding Standard to detect
  overflow):

    > system.time(for (i in 1:100) z <- x - y)
       user  system elapsed 
      1.128   0.008   1.141 

  Like with R_integer_plus(), patched R_integer_minus() is slightly faster than
  the original. Some simple testing on Mavericks seem to indicate that the
  speedup is even more significant there.

Here is the patch:

hpages@thinkpad:~/svn/R$ svn diff R-3-1-branch
Index: R-3-1-branch/src/main/arithmetic.c
===================================================================
--- R-3-1-branch/src/main/arithmetic.c	(revision 65530)
+++ R-3-1-branch/src/main/arithmetic.c	(working copy)
@@ -317,22 +317,6 @@
    To be safe, we signal a compiler error if int is not 32 bits. */
 # error code requires that int have 32 bits
 #else
-/* Just to be on the safe side, configure ought to check that the
-   mashine uses two's complement. A define like
-#define USES_TWOS_COMPLEMENT (~0 == (unsigned) -1)
-   might work, but at least one compiler (CodeWarrior 6) chokes on it.
-   So for now just assume it is true.
-*/
-#define USES_TWOS_COMPLEMENT 1
-
-#if USES_TWOS_COMPLEMENT
-# define OPPOSITE_SIGNS(x, y) ((x < 0) ^ (y < 0))
-# define GOODISUM(x, y, z) (((x) > 0) ? ((y) < (z)) : ! ((y) < (z)))
-# define GOODIDIFF(x, y, z) (!(OPPOSITE_SIGNS(x, y) && OPPOSITE_SIGNS(x, z)))
-#else
-# define GOODISUM(x, y, z) ((double) (x) + (double) (y) == (z))
-# define GOODIDIFF(x, y, z) ((double) (x) - (double) (y) == (z))
-#endif
 #define GOODIPROD(x, y, z) ((double) (x) * (double) (y) == (z))
 #define INTEGER_OVERFLOW_WARNING _("NAs produced by integer overflow")
 #endif
@@ -367,16 +351,14 @@
 {
     if (x == NA_INTEGER || y == NA_INTEGER)
 	return NA_INTEGER;
-    else {
-	int z = x - y;
-	if (GOODIDIFF(x, y, z) && z != NA_INTEGER)
-	    return z;
-	else {
-	    if (pnaflag != NULL)
-		*pnaflag = TRUE;
-	    return NA_INTEGER;
-	}
+
+    if (((y < 0) && (x > (R_INT_MAX + y))) ||
+	((y > 0) && (x < (R_INT_MIN + y)))) {
+	if (pnaflag != NULL)
+	    *pnaflag = TRUE;
+	return NA_INTEGER;
     }
+    return x - y;
 }
 
 static R_INLINE int R_integer_times(int x, int y, Rboolean *pnaflag)