Bug 17132 - segfault when calling 'grepRaw(..., fixed = TRUE)'
Alias: None
Product: R
Classification: Unclassified
Component: Low-level
Version: R-devel (trunk)
Hardware: All All
Importance: P5 minor
Assignee: R-core
Depends on:
Reported: 2016-08-19 03:58 UTC by Kevin Ushey
Modified: 2016-11-05 20:32 UTC

Proposed patch (1.03 KB, patch)
2016-08-22 14:07 UTC, Mikko Korpela

Description Kevin Ushey 2016-08-19 03:58:31 UTC
Executing the following causes a segfault relatively consistently for me:

    grepRaw("abcdefghijkl", "a", all = TRUE, fixed = TRUE)

It seems that the segfault occurs only when 'pattern' is longer than the input string and `fixed = TRUE` has been specified.

Running a recent-ish R-devel built with clang-3.9 + sanitizers:

> grepRaw("abcdefghijkl", "a", all = TRUE, fixed = TRUE)
==18741==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61d00685f430 at pc 0x00010d7c020c bp 0x7fff541c9bd0 sp 0x7fff541c9bc8
READ of size 1 at 0x61d00685f430 thread T0
    #0 0x10d7c020b in fgrepraw1 (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x42c20b)
    #1 0x10d7b898b in do_grepraw (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x42498b)
    #2 0x10d709b7f in bcEval (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x375b7f)
    #3 0x10d6f61d9 in Rf_eval (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x3621d9)
    #4 0x10d74c44f in Rf_applyClosure (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x3b844f)
    #5 0x10d6f6d9e in Rf_eval (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x362d9e)
    #6 0x10d7f56b5 in Rf_ReplIteration (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x4616b5)
    #7 0x10d7fa090 in R_ReplConsole (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x466090)
    #8 0x10d7f9e87 in run_Rmainloop (/Users/kevin/r/r-devel-sanitizers/lib/R/lib/libR.dylib+0x465e87)
    #9 0x10ba33ea4 in main (/Users/kevin/r/r-devel-sanitizers/lib/R/bin/exec/R+0x100000ea4)
LLVMSymbolizer: error reading file: No object file for requested architecture
    #10 0x7fff99faa5ac  (/usr/lib/system/libdyld.dylib+0x35ac)
    #11 0x3  (<unknown module>)

> utils::sessionInfo()
R Under development (unstable) (2016-08-12 r71086)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: OS X El Capitan 10.11.6
Comment 1 Mikko Korpela 2016-08-22 14:07:40 UTC
Created attachment 2142
Proposed patch

I can reproduce the issue (on Linux). The problem occurs on R-devel (r71129) and various release versions of R (tested on 3.0.2, 3.2.5 and 3.3.1 patched r71063).

I think the cause of the segfault is an underflow of the unsigned variable 'n' in fgrepraw1(), defined in src/main/grep.c. This can be avoided by checking if 'pat' is longer than 'text'. The attached patch returns the "no match" solution in such cases.
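
To illustrate the underflow described above, here is a minimal, self-contained C sketch. It is not the code from src/main/grep.c and not the attached patch: the names find_fixed and unguarded_bound are hypothetical, and the treatment of an empty pattern as "no match" is an assumption made only to keep the example small. It shows why subtracting a larger length from a smaller unsigned length gives a huge bound, and how a length check up front avoids that.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a fixed-pattern byte search (not R's code).
   Returns the 0-based offset of the first match, or (size_t) -1 if none. */
static size_t find_fixed(const unsigned char *text, size_t n,
                         const unsigned char *pat, size_t ncmp)
{
    /* Guard: when ncmp > n, the subtraction below would wrap around to a
       huge value and the loop would read past the end of text.
       Treating an empty pattern as "no match" is an assumption here. */
    if (ncmp == 0 || ncmp > n)
        return (size_t) -1;

    size_t last = n - ncmp;   /* last offset at which pat still fits */
    for (size_t offset = 0; offset <= last; offset++) {
        if (memcmp(text + offset, pat, ncmp) == 0)
            return offset;
    }
    return (size_t) -1;
}

/* The unguarded subtraction: unsigned arithmetic wraps modulo 2^N
   instead of going negative. */
static size_t unguarded_bound(size_t n, size_t ncmp)
{
    return n - ncmp;
}
```

With a 1-byte text and a 12-byte pattern, as in the original report, find_fixed returns (size_t) -1 without touching memory outside the text, whereas unguarded_bound(1, 12) yields a value close to the maximum of size_t.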

Another problem fixed in the patch is the failure to match in some cases when a match is expected. This happens if 'pat' has more than 3 bytes (the "default" branch of fgrepraw1) and there is nothing in 'text' following the match.

Example with R-devel r71129:

> grepRaw("abcd", "abcd", fixed = TRUE)
> grepRaw("abcd", "abcde", fixed = TRUE)
[1] 1

Example with the patch applied:

> grepRaw("abcd", "abcd", fixed = TRUE)
[1] 1
> grepRaw("abcd", "abcde", fixed = TRUE)
[1] 1
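
The missed match at the end of 'text' looks like an off-by-one in the loop bound. The following C sketch shows that kind of error under the assumption that the loop stops one start position too early; the function scan and its 'inclusive' flag are hypothetical and are not taken from grep.c or from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration only (not R's code). Assumes 0 < ncmp <= n.
   inclusive != 0 selects the bound that includes the final start position.
   Returns the 0-based offset of the first match, or (size_t) -1 if none. */
static size_t scan(const unsigned char *text, size_t n,
                   const unsigned char *pat, size_t ncmp, int inclusive)
{
    /* There are n - ncmp + 1 candidate start offsets. Using n - ncmp
       as the limit drops the last one, so a match with nothing after
       it in text is never tested. */
    size_t limit = n - ncmp + (inclusive ? 1 : 0);
    for (size_t offset = 0; offset < limit; offset++) {
        if (memcmp(text + offset, pat, ncmp) == 0)
            return offset;
    }
    return (size_t) -1;
}
```

With the exclusive bound, searching for "abcd" in "abcd" finds nothing while searching in "abcde" succeeds, which mirrors the two calls shown above; with the inclusive bound both succeed.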
Comment 2 Martin Maechler 2016-11-05 20:32:21 UTC
confirmed; incl. your patch.

Thank you very much ... fixed in R-devel and soon in 'R 3.3.2 patched'.