Bug 14362 - writeBin silently produces incorrect output for enormous objects
Alias: None
Product: R
Classification: Unclassified
Component: I/O
Version: R 2.11.1 patched
Hardware: ix86 (32-bit) All
Importance: P5 normal
Assignee: R-core
Depends on:
Reported: 2010-08-18 18:03 UTC by Richard Bourgon
Modified: 2010-08-19 17:04 UTC
Description Richard Bourgon 2010-08-18 18:03:33 UTC
It seems that the writeBin C code makes a memcpy call without checking for overflow when multiplying the length of an object ("len") by the number of bytes per element ("size"). 

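For objects large enough to cause this overflow, the outcome depends on the sign of "size * len" once it has wrapped as a signed int. If the wrapped value is negative, you get a seg fault and at least know you have a problem. If it's non-negative, though, writeBin succeeds but silently leaves a portion of the output as 0. With 8-byte doubles, for instance, a length of 2^28 gives a product of 2^31, which wraps to a negative value, while a length of 2^29 gives 2^32, which wraps to 0.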
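
The arithmetic behind these two outcomes can be sketched in C. This is a minimal illustration, not the actual writeBin source: the helper names wrapped_to_int32 and checked_nbytes are hypothetical, and a 32-bit int is assumed. The first helper shows what "size * len" becomes once truncated to a signed 32-bit value; the second shows one common way to reject the multiplication before it can overflow.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from the R sources): the value that
 * size * len takes after being truncated to a signed 32-bit integer.
 * The product is computed in a wider unsigned type so that this
 * function itself never overflows. */
long long wrapped_to_int32(int size, int len)
{
    assert(size > 0 && len >= 0);
    unsigned long long p = (unsigned long long)size * (unsigned long long)len;
    unsigned long long low = p & 0xFFFFFFFFULL;      /* keep the low 32 bits */
    /* Reinterpret the low 32 bits as a two's-complement signed value. */
    return low >= 0x80000000ULL ? (long long)low - 0x100000000LL
                                : (long long)low;
}

/* Hypothetical helper: compute size * len as an int only when the
 * result is representable. Returns 1 and stores the product on
 * success, 0 if the inputs are invalid or the product would overflow. */
int checked_nbytes(int size, int len, int *nbytes)
{
    if (size <= 0 || len < 0)
        return 0;
    if (len > INT_MAX / size)   /* size * len would exceed INT_MAX */
        return 0;
    *nbytes = size * len;
    return 1;
}
```

Under these assumptions, wrapped_to_int32(8, 1 << 28) is negative and wrapped_to_int32(8, 1 << 29) is 0, matching the two failure modes described above, while checked_nbytes returns 0 for both lengths.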

An example:

# Assumes .Machine$integer.max is 2^31 - 1 and 8-byte doubles.
# Setting n to 2^28 instead will generate a segfault in memcpy().

n <- as.integer( 2^29 )
x <- rep( 1.0, n )
writeBin( x, "test_data.bin" )
y <- readBin( "test_data.bin", numeric(), n )
all( y == 0 ) # TRUE

First noticed on R version 2.11.1 (2010-05-31), x86_64-unknown-linux-gnu.
Comment 1 Brian Ripley 2010-08-19 17:04:36 UTC
Such attempts are disallowed in 2.12.0.  They cannot work for RAW output,
and for a connection there is no need to write more than 2GB in a single step.
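
As a rough illustration of writing in bounded steps, the sketch below writes an array of doubles to a C stream in chunks, so that no single call handles more than a chosen number of elements. It is not R's connection code: write_doubles_chunked and its chunk parameter are hypothetical names, and only the standard fwrite call is assumed.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write n doubles from x to fp, at most `chunk`
 * elements per fwrite call. Returns the number of elements actually
 * written, which is less than n if a write fails part-way. */
size_t write_doubles_chunked(FILE *fp, const double *x, size_t n, size_t chunk)
{
    size_t done = 0;

    if (chunk == 0)
        return 0;               /* a zero chunk size could never make progress */

    while (done < n) {
        size_t todo = n - done;
        if (todo > chunk)
            todo = chunk;       /* cap this step at the chunk size */

        size_t wrote = fwrite(x + done, sizeof(double), todo, fp);
        done += wrote;
        if (wrote < todo)
            break;              /* short write: stop and report progress */
    }
    return done;
}
```

A caller would compare the return value with n to detect a failed or partial write; the byte count of each step stays below the chunk size times sizeof(double), whatever the total length.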