Bug 15885 - encodeString segfaults with long character strings
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Low-level
Version: R-devel (trunk)
Hardware: x86_64/x64/amd64 (64-bit) Linux
Importance: P5 normal
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2014-07-16 15:21 UTC by Robert McGehee
Modified: 2017-08-01 13:59 UTC
Description Robert McGehee 2014-07-16 15:21:27 UTC
In R 3.1.1 (also in R 3.0.3), the function encodeString causes a 'memory not mapped' segfault if the character string has more than about 5e8 characters. I first hit this segfault when running str on a large string generated by serialize, and reduced the problem to encodeString alone.

Here is how to reproduce:

> r <- rep("test ", 1e8)
> txt <- paste(r, collapse="")
> nchar(txt)
[1] 500000000
> en <- encodeString(txt)

 *** caught segfault ***
address 0x5f95000, cause 'memory not mapped'

 *** caught bus error ***
address (nil), cause 'unknown'
Segmentation fault

> R.version
               _                           
platform       x86_64-unknown-linux-gnu    
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          1.1                         
year           2014                        
month          07                          
day            10                          
svn rev        66115                       
language       R                           
version.string R version 3.1.1 (2014-07-10)
nickname       Sock it to Me
Comment 1 Martin Maechler 2017-08-01 13:26:55 UTC
This bug / infelicity is still present in today's R (including R-devel).

The reproduction takes a few seconds to run.

As usual, it is an integer overflow.

I'm fixing it currently; it does not seem to be too hard.
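
To illustrate why a string of about 5e8 characters could trigger an integer overflow, here is a minimal sketch in R. It assumes, purely for illustration, that the output buffer size is computed as roughly 5 * n + 8 in a 32-bit signed int; that factor and offset are assumptions and are not stated in this report.

```r
# Assumption: buffer size ~ 5 * n + 8, held in a 32-bit signed int.
# The factor 5 and offset 8 are illustrative, not taken from this report.
n <- 500000000L                        # nchar(txt) from the reproduction
int_max <- .Machine$integer.max       # largest value of a 32-bit signed int

# Compute the required size in double precision, where it cannot overflow.
needed <- 5 * as.double(n) + 8
needed > int_max                       # TRUE: the size does not fit in an int

# R's integer arithmetic detects the overflow and yields NA with a warning;
# plain C int arithmetic performs no such check.
overflowed <- suppressWarnings(n * 5L)
is.na(overflowed)                      # TRUE

# Largest string length that still fits under the assumed formula.
max_safe_n <- floor((int_max - 8) / 5)
```

Under this assumption the limit falls a little below 5e8 characters, which would be consistent with the threshold observed in the description.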
Comment 2 Martin Maechler 2017-08-01 13:59:30 UTC
Fix committed to R-devel (svn r73008) and R 3.4.1-patched (r73009).