Bug 14267 - R crash with unique on large data frame containing POSIXct values
Status: RESOLVED FIXED
Product: R
Classification: Unclassified
Component: Low-level
Version: R 2.11.0
Hardware: ix86 (32-bit) Windows 32-bit
Importance: P5 critical
Assigned To: R-core
Duplicates: 14270 14290
Reported: 2010-04-23 11:41 UTC by Simon Carne
Modified: 2014-02-03 13:14 UTC (History)
Description Simon Carne 2010-04-23 11:41:20 UTC
The following code crashes R 2.11.0 on my system, even when started with --vanilla. There is no error message; the program just terminates.


--- CUT HERE ---
dates <- seq(from = as.POSIXct("2004-01-01"),to = as.POSIXct("2010-01-01"),by = "day")
ints <- seq(10000)

set.seed(10203040)
nElements <- 1e6
k <- data.frame(A = sample(dates,nElements,replace = TRUE),B = sample(ints,nElements,replace = TRUE))
l <- unique(k)
-- CUT HERE ---

sessionInfo says (non-vanilla)
-- CUT HERE ---
R version 2.11.0 (2010-04-22) 
i386-pc-mingw32 

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base 
-- CUT HERE --
While unique() is running, Task Manager shows total memory use rising by several hundred MB, which is still well below the 2 GB of memory installed in this computer.

Thank you very much and best wishes
Comment 1 Peter Dalgaard 2010-04-23 13:27:04 UTC
Confirmed on OSX Snow Leopard with 2.11 test build, but not 2.10.1 binary.

A little more information:

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00007fff5f3fffe0
0x00007fff8183c479 in __sfvwrite ()
(gdb) up
#1  0x00007fff8183c0c0 in __vfprintf ()
(gdb) 
#2  0x00007fff818a274f in sprintf_l ()
(gdb) 
#3  0x00007fff818a23be in _st_fmt ()
(gdb) 
#4  0x00007fff818a149d in strftime_l ()
(gdb) 
#5  0x000000010005ebf6 in do_formatPOSIXlt (call=<value temporarily unavailable, due to optimizations>, op=<value temporarily unavailable, due to optimizations>, args=<value temporarily unavailable, due to optimizations>, env=<value temporarily unavailable, due to optimizations>) at ../../../R/src/main/datetime.c:809
809			strftime(buff, 256, buf2, &tm);
(gdb) 
#6  0x00000001000dad66 in do_internal (call=<value temporarily unavailable, due to optimizations>, op=<value temporarily unavailable, due to optimizations>, args=0x1011b0670, env=0x101059e60) at ../../../R/src/main/names.c:1185
1185	    ans = PRIMFUN(INTERNAL(fun)) (s, INTERNAL(fun), args, env);
(gdb) 
#7  0x00000001000991b6 in Rf_eval (e=0x101059a38, rho=0x101059e60) at ../../../R/src/main/eval.c:464
464		    tmp = PRIMFUN(op) (e, op, CDR(e), rho);
(gdb) 
#8  0x000000010009deac in do_begin (call=0x1017d9fb8, op=0x100859990, args=0x101059a00, rho=0x101059e60) at ../../../R/src/main/eval.c:1245
1245		    s = eval(CAR(args), rho);
Comment 2 Duncan Murdoch 2010-04-23 15:38:43 UTC
This only affected the 2.11 branch; in R-devel the C99 local allocation was used and it was fine.
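
For readers unfamiliar with the distinction, here is a minimal C sketch of why a block-scoped C99 array can be safe inside a loop. It assumes that "C99 local allocation" refers to a variable-length array (the comment does not name the construct); total_length and scratch are hypothetical names for illustration, not taken from the R sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sums the lengths of n strings, copying each one into a per-iteration
 * scratch buffer first. Hypothetical helper, for illustration only. */
size_t total_length(const char *const *strs, size_t n)
{
    size_t total = 0;

    assert(strs != NULL);
    for (size_t i = 0; i < n; i++) {
        size_t len = strlen(strs[i]);
        /* C99 variable-length array: its lifetime ends when execution
         * leaves the loop body, so the storage can be reused by the
         * next iteration instead of accumulating. */
        char scratch[len + 1];
        memcpy(scratch, strs[i], len + 1);
        total += strlen(scratch);
    }
    return total;
}
```

Calling total_length on an array holding "2004-01-01" and "2010-01-01" processes both strings with at most one scratch buffer live at a time. Memory obtained from alloca(), by contrast, is released only when the enclosing function returns.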
Comment 3 Duncan Murdoch 2010-04-23 16:02:15 UTC
I can confirm the bug in a current R-patched.  Running under gdb, I see 
a segfault somewhere in the C runtime lib, but I don't get a stack trace 
so it'll take a bit more time to locate it.  It might be a C stack 
overflow...

Duncan Murdoch


Comment 4 Duncan Murdoch 2010-04-23 16:44:31 UTC
This can be simplified:  it's actually a problem with formatting, not 
unique.  I get the same crash from

dates <- seq(from = as.POSIXct("2004-01-01"),to = as.POSIXct("2010-01-01"),by = "day")

set.seed(10203040)
nElements <- 1e6
A <- sample(dates,nElements,replace = TRUE)
l <- format(A)

and it appears to happen in the internal format.POSIXlt code.

Duncan Murdoch

Comment 5 Duncan Murdoch 2010-04-23 18:20:26 UTC
This was a stack overflow caused by using alloca() in a loop.  I'll 
commit a fix soon, after testing.

Duncan Murdoch
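
The failure mode named above can be sketched in C as follows. This is an illustration under stated assumptions, not the code in datetime.c and not the committed fix: format_all and FMT_BUF_SIZE are hypothetical names, and the 256-byte size is borrowed from the strftime call shown in the backtrace.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define FMT_BUF_SIZE 256   /* assumed size, matching the strftime call in the trace */

/* Problematic pattern (kept in a comment so it is never compiled or run):
 *
 *     for (i = 0; i < n; i++) {
 *         char *buff = alloca(FMT_BUF_SIZE);
 *         strftime(buff, FMT_BUF_SIZE, fmt, &tms[i]);
 *     }
 *
 * Every alloca() call grows the current stack frame, and none of that
 * memory is released until the function returns. At 256 bytes per
 * iteration, 1e6 iterations ask for roughly 256 MB of stack.
 */

/* Safer pattern: a single buffer, allocated once and reused.
 * Returns the total number of characters produced by strftime. */
size_t format_all(const struct tm *tms, size_t n, const char *fmt)
{
    char buff[FMT_BUF_SIZE];
    size_t total = 0;

    assert(tms != NULL && fmt != NULL);
    for (size_t i = 0; i < n; i++) {
        /* strftime returns 0 if the result does not fit in buff */
        total += strftime(buff, sizeof buff, fmt, &tms[i]);
    }
    return total;
}
```

With the buffer hoisted out of the loop, stack use stays constant regardless of n, which is why vector length no longer matters. A heap allocation freed on each iteration would be another option where the required size varies.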


Comment 6 Peter Dalgaard 2010-04-25 09:21:52 UTC
*** Bug 14270 has been marked as a duplicate of this bug. ***
Comment 7 Peter Dalgaard 2010-05-10 09:19:32 UTC
*** Bug 14290 has been marked as a duplicate of this bug. ***