Bug 16925 - segfault / ASAN error on call to sort()
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Low-level
Version: R 3.3.*
Hardware: All
OS: OS X El Capitan
Importance: P5 major
Assignee: R-core
 
Reported: 2016-05-27 18:40 UTC by Kevin Ushey
Modified: 2016-10-10 15:42 UTC

Description Kevin Ushey 2016-05-27 18:40:44 UTC
I see an ASAN error when calling `sort()` with a version of R-devel built from source on OS X with clang-3.8 and the associated UBSAN / ASAN sanitizers.

Given the code:

    data <- c(2147483645L, 2147483646L, 2147483647L, 2147483644L)
    sort(data, decreasing = TRUE, method = "radix")

I see:

> data <- c(2147483645L, 2147483646L, 2147483647L, 2147483644L)
> sort(data, decreasing = TRUE, method = "radix")
radixsort.c:238:43: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'
SUMMARY: AddressSanitizer: undefined-behavior radixsort.c:238:43 in
radixsort.c:276:17: runtime error: signed integer overflow: -2147483648 + -2147483645 cannot be represented in type 'int'
SUMMARY: AddressSanitizer: undefined-behavior radixsort.c:276:17 in
radixsort.c:295:10: runtime error: signed integer overflow: -2147483648 + -2147483644 cannot be represented in type 'int'
SUMMARY: AddressSanitizer: undefined-behavior radixsort.c:295:10 in
[1] 2147483647 2147483646 2147483645 2147483644
>
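
For reference, the three sanitizer messages above all report an `int` addition whose result falls outside the representable range. The following is a minimal sketch, not the code in radixsort.c, of how such an addition can be tested before it is performed. The helper names `add_would_overflow` and `checked_add` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns 1 if a + b cannot be represented in int.
   Only comparisons against INT_MAX / INT_MIN are used, so the check
   itself never overflows. */
static int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* a + b would exceed INT_MAX */
    if (b < 0)
        return a < INT_MIN - b;   /* a + b would drop below INT_MIN */
    return 0;                     /* adding zero is always safe */
}

/* Hypothetical wrapper: performs the addition only after asserting
   that the result is representable. */
static int checked_add(int a, int b)
{
    assert(!add_would_overflow(a, b));
    return a + b;
}
```

With the operands from the log, `add_would_overflow(2147483647, 1)` and `add_would_overflow(INT_MIN, -2147483645)` both report an overflow, which matches what the sanitizer flags at runtime.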

This manifests as a segfault on some versions of R built from source on OS X; for example, those distributed through Homebrew. (See e.g. https://github.com/rstudio/shiny/issues/1200#issuecomment-222208173 for an example in the wild.)

Relevant compilation flags:

CC = clang-3.8 -std=gnu99 -fsanitize=address,undefined -fno-omit-frame-pointer -fno-sanitize=float-divide-by-zero
CFLAGS = -g -Wall -pedantic $(LTO)

---

> sessionInfo()
R Under development (unstable) (2016-03-17 r70572)
Platform: x86_64-apple-darwin15.4.0 (64-bit)
Running under: OS X 10.11.5 (El Capitan)

For reference, R was configured with these flags (from Makeconf):

# configure  '--with-blas=-L/usr/local/opt/openblas/lib -lopenblas' '--with-lapack=-L/usr/local/opt/lapack/lib -llapack' '--with-cairo' '--disable-R-framework' '--enable-R-shlib' '--with-readline' '--enable-R-profiling' '--enable-memory-profiling' '--with-valgrind-instrumentation=2' '--without-internal-tzcode' '--prefix=/Users/kevin/r/r-devel-sanitizers' 'PKG_CONFIG_PATH=/opt/X11/lib/pkgconfig'
Comment 1 Michael Lawrence 2016-05-27 19:39:11 UTC
Thanks for the report. Yes, there is a bug when decreasing=TRUE, na.last=TRUE, and the vector contains the maximum integer (2147483647). The easiest fix would be to add a check that falls back to the general radix sort instead of taking the count sort shortcut. Another option is to defer the guilty addition to inside a loop, which might slow things down in general. I've alerted Matt Dowle to the issue, as I am interested in his opinion.
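
The first option in Comment 1 (a check that falls back to the general radix sort) could look roughly like the sketch below. This is an illustration under stated assumptions, not the patch that was committed: the function name `can_use_count_sort` and the cutoff `MAX_COUNT_RANGE` are hypothetical, and the real radixsort.c may structure the decision differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical cutoff for the counting-sort shortcut; not taken from R. */
#define MAX_COUNT_RANGE 100000

/* Returns 1 if a counting sort over the values [xmin, xmax] may be used,
   0 if the caller should fall back to the general radix sort. */
static int can_use_count_sort(int xmin, int xmax)
{
    assert(xmin <= xmax);

    /* Assumed guard: when the maximum integer is present, any later
       "xmax + 1" in int arithmetic would overflow, so skip the shortcut. */
    if (xmax == INT_MAX)
        return 0;

    /* Compute the range in 64 bits so the subtraction and the +1
       cannot overflow, even for xmin == INT_MIN. */
    int64_t range = (int64_t)xmax - (int64_t)xmin + 1;
    return range <= MAX_COUNT_RANGE;
}
```

For the vector in the report, `can_use_count_sort(2147483644, 2147483647)` returns 0, so the input would be routed to the general radix sort. The cost is one extra comparison per call rather than extra work inside the sorting loop, which is the trade-off Comment 1 weighs against the second option.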
Comment 2 Michael Lawrence 2016-05-28 04:48:08 UTC
Matt has fixed data.table (thanks!), and I have ported his patch to base R. Will commit soon.
Comment 3 Suharto Anggono 2016-10-10 15:42:58 UTC
Currently, the fix is in R devel only, not in R patched.