Bug 14368 - when using task callbacks, sourcing empty file crashes R
when using task callbacks, sourcing empty file crashes R
Status: RESOLVED FIXED
Product: R
Classification: Unclassified
Component: Misc
R 2.11.1 patched
All All
: P3 normal
Assigned To: R-core
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-08-27 04:45 UTC by Tony Plate
Modified: 2010-09-08 19:49 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Tony Plate 2010-08-27 04:45:06 UTC
This (admittedly obscure) sequence of commands seems to reliably crash Rgui.exe --vanilla under windows, and has crashed R --vanilla under Linux:

sessionInfo()
cat("", file="empty.R")
cb <- function(expr, value, ok, visible) {cat("in callback\n"); browser(); TRUE}
addTaskCallback(cb)

source("empty.R")

The essential elements of the above to cause the crash are, as far as I can tell:
(1) a task callback function
(2) source()'ing an empty file
The above sequence of commands runs ok if the call to browser() is omitted from the callback function, but I've been seeing crashes of this nature when using a different callback function with call to browser().  The above is the result of my attempts to distill this down to a minimum reproducible example.

Transcript:

> sessionInfo()
R version 2.11.1 Patched (2010-08-17 r52799)
Platform: i386-pc-mingw32 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252  
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                         
[5] LC_TIME=English_United States.1252   

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base    
> cat("", file="empty.R")
> cb <- function(expr, value, ok, visible) {cat("in callback\n"); browser(); TRUE}
> addTaskCallback(cb)
1
1
in callback
Called from: function (expr, value, ok, visible)
{
    cat("in callback\n")
    browser()
    TRUE
}(quote(addTaskCallback(cb)), 1L, TRUE, TRUE)
Browse[1]>
> source("empty.R")

now Rgui.exe crashed with a dialog box "R for Windows GUI front-end has encountered a problem and needs to close.  We are sorry for the inconvenience. ... Please tell Microsoft about this problem...

The same behavior also occurs in a much older version of R (2.6.0).

Under Linux, R crashes with a seg fault.
Comment 1 Tony Plate 2010-08-28 01:20:44 UTC
Some debugger output:

$ gdb Rgui
GNU gdb 6.8.0.20080328-cvs (cygwin-special)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-cygwin"...
(gdb) run
Starting program: /cygdrive/c/Rbuild/R-2.11.1/bin/Rgui
[New thread 4708.0x240c]

Program received signal SIGSEGV, Segmentation fault.
0x6c8159de in deparse2buff (s=0x0, d=0xe1d398) at deparse.c:663
663         if (IS_S4_OBJECT(s)) d->isS4 = TRUE;
(gdb) where
#0  0x6c8159de in deparse2buff (s=0x0, d=0xe1d398) at deparse.c:663
#1  0x6c8159de in deparse2buff (s=0x3f5358, d=0x0) at deparse.c:663
Comment 2 Tony Plate 2010-08-30 12:58:27 UTC
A slightly simpler callback function that produces the crash just tries to deparse() the expression instead of calling browser():

> cat("", file="empty.R")
> cb <- function(expr, value, ok, visible) {cat("expr is", deparse(expr), "\n"); TRUE}cb <- function(expr, value, ok, visible) {cat("in callback\n"); browser();
> addTaskCallback(cb)
> source("empty.R")
Comment 3 Tony Plate 2010-08-30 13:00:30 UTC
[Editing got messed up in last comment]

A slightly simpler callback function that produces the crash just tries to
deparse() the expression instead of calling browser():

cat("", file="empty.R")
cb <- function(expr, value, ok, visible) {cat("expr is", deparse(expr), "\n"); TRUE}
addTaskCallback(cb)
source("empty.R")
Comment 4 Brian Ripley 2010-08-31 11:55:39 UTC
The crucial fact was not in the subject line ....
Comment 5 Tony Plate 2010-08-31 12:33:34 UTC
(In reply to comment #4)
> The crucial fact was not in the subject line ....

ok, ..., so what's the "crucial fact"?  I see "sourcing", "callback", & "crash".  "deparsing" is also involved, but it works OK to deparse the result of sourcing an empty file.
Comment 6 Tony Plate 2010-08-31 12:39:03 UTC
Ok, I see now that "callback" was added to the subject line.
Using a callback does seem to be crucial to triggering this bug.

(In reply to comment #5)
> (In reply to comment #4)
> > The crucial fact was not in the subject line ....
> 
> ok, ..., so what's the "crucial fact"?  I see "sourcing", "callback", &
> "crash".  "deparsing" is also involved, but it works OK to deparse the result
> of sourcing an empty file.
Comment 7 Tony Plate 2010-09-04 05:41:09 UTC
Here's a gdb dump from R under ubuntu after the call to source("empty.R").
It shows that Rf_callToplevelHandlers is being called with expr=0x0

Program received signal SIGSEGV, Segmentation fault.
0x081a147d in Rf_eval (e=0x83b1618, rho=0x870f0c4) at eval.c:434
434		    SET_NAMED(tmp, 2);
(gdb) where
#0  0x081a147d in Rf_eval (e=0x83b1618, rho=0x870f0c4) at eval.c:434
#1  0x081a120f in forcePromise (e=0x870f498) at eval.c:331
#2  0x081a14eb in Rf_eval (e=0x870f498, rho=0x870f5b0) at eval.c:445
#3  0x081a1477 in Rf_eval (e=0x83b1618, rho=0x870f5b0) at eval.c:433
#4  0x081a43ab in Rf_evalList (el=0x8691bac, rho=0x870f5b0, call=0x8691c00, n=1) at eval.c:1583
#5  0x08070153 in do_internal (call=0x8691c00, op=0x83792f4, args=0x8691bac, env=0x870f5b0) at names.c:1181
#6  0x081a163f in Rf_eval (e=0x8691c00, rho=0x870f5b0) at eval.c:464
#7  0x081a1f68 in Rf_applyClosure (call=0x877bff8, op=0x8691f48, arglist=0x870f4b4, rho=0x870f0c4, 
    suppliedenv=0x838d574) at eval.c:699
#8  0x081a18b1 in Rf_eval (e=0x877bff8, rho=0x870f0c4) at eval.c:508
#9  0x081a120f in forcePromise (e=0x870f0fc) at eval.c:331
#10 0x081a14eb in Rf_eval (e=0x870f0fc, rho=0x870f2a0) at eval.c:445
#11 0x081a4279 in Rf_evalList (el=0x8a277c8, rho=0x870f2a0, call=0x8a277ac, n=1) at eval.c:1567
#12 0x081a171d in Rf_eval (e=0x8a277ac, rho=0x870f2a0) at eval.c:483
#13 0x081a43ab in Rf_evalList (el=0x8a27790, rho=0x870f2a0, call=0x8a2773c, n=1) at eval.c:1583
#14 0x08070153 in do_internal (call=0x8a2773c, op=0x83792f4, args=0x8a27790, env=0x870f2a0) at names.c:1181
#15 0x081a163f in Rf_eval (e=0x8a2773c, rho=0x870f2a0) at eval.c:464
#16 0x081a36b0 in do_begin (call=0x8a27fb4, op=0x8379700, args=0x8a27720, rho=0x870f2a0) at eval.c:1245
#17 0x081a163f in Rf_eval (e=0x8a27fb4, rho=0x870f2a0) at eval.c:464
#18 0x081a1f68 in Rf_applyClosure (call=0x877c0d8, op=0x8a27e9c, arglist=0x870f118, rho=0x870f0c4, 
    suppliedenv=0x838d574) at eval.c:699
#19 0x081a18b1 in Rf_eval (e=0x877c0d8, rho=0x870f0c4) at eval.c:508
#20 0x081a36b0 in do_begin (call=0x877c0f4, op=0x8379700, args=0x877c110, rho=0x870f0c4) at eval.c:1245
#21 0x081a163f in Rf_eval (e=0x877c0f4, rho=0x870f0c4) at eval.c:464
#22 0x081a1f68 in Rf_applyClosure (call=0x870fe98, op=0x877c228, arglist=0x870ff24, rho=0x838d558, 
    suppliedenv=0x838d574) at eval.c:699
#23 0x081a18b1 in Rf_eval (e=0x870fe98, rho=0x838d558) at eval.c:508
#24 0x0815290e in protectedEval (d=0xbfffe1c0) at context.c:748
#25 0x08152895 in R_ToplevelExec (fun=0x81528d3 <protectedEval>, data=0xbfffe1c0) at context.c:704
#26 0x08152950 in R_tryEval (e=0x870fe98, env=0x0, ErrorOccurred=0xbfffe1f8) at context.c:762
#27 0x0805e50a in R_taskCallbackRoutine (expr=0x0, value=0x836a5c0, succeeded=TRUE, visible=FALSE, 
    userData=0x8d720a0) at main.c:1467
#28 0x0805e2f5 in Rf_callToplevelHandlers (expr=0x0, value=0x836a5c0, succeeded=TRUE, visible=FALSE)
    at main.c:1414
#29 0x0805c131 in Rf_ReplIteration (rho=0x838d558, savestack=0, browselevel=0, state=0xbfffe2b8) at main.c:269
#30 0x0805c288 in R_ReplConsole (rho=0x838d558, savestack=0, browselevel=0) at main.c:311
#31 0x0805d546 in run_Rmainloop () at main.c:972
#32 0x0805d55d in Rf_mainloop () at main.c:979
#33 0x0805bc8e in main (ac=1, av=0xbffff3d4) at Rmain.c:32
#34 0x00293bd6 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6
#35 0x0805bbd1 in _start ()
(gdb)
Comment 8 Tony Plate 2010-09-04 06:10:52 UTC
Here's my hypothesis about what's happening:

* main.c:Rf_ReplIteration() sets variable R_CurrentExpr to the expression of the parsed command.  After evaluating that expression, it calls Rf_callToplevelHandlers(R_CurrentExpr, ...)
* however, R_CurrentExpr is a global variable, and Rf_ReplIteration can call itself recursively, thus clobbering the value of R_CurrentExpr
* in the case of sourcing an empty file, R_CurrentExpr gets set to 0x0, and this causes a crash when Rf_callToplevelHandlers(0x0, ...) is called from the top level (and the calling function tries to deparse(expr))

A possible fix is to use a local variable in main.c:Rf_ReplIteration() to remember the value of R_CurrentExpr while evaluating the expression.

I tried this with 3 changes:
(1) add a new local var at the beginning of Rf_ReplIteration(), e.g.:
    SEXP thisExpr;
(2) set thisExpr to the value of R_CurrentExpr at the appropriate place, e.g:
	PROTECT(R_CurrentExpr);
-->	thisExpr = R_CurrentExpr;
	R_Busy(1);
	value = eval(R_CurrentExpr, rho);
(3) call Rf_callToplevelHandlers() with this value instead of R_CurrentExpr, i.e.:
	Rf_callToplevelHandlers(thisExpr, value, TRUE, wasDisplayed);

After recompiling R with these changes, it runs through a test example without crashing.  Additionally, the task callback also now receives the last top-level expression in its first arg, even when the top-level expression was source().  Prior to these changes, when the top-level expression was source(), the callback would receive the last expression in the source()'d file as its first arg.  Here's some test code, and output from an unpatched and patched version of R:

cb <- function(expr, value, ok, visible) {cat("cb: expr=", deparse(expr), "\n"); TRUE}
addTaskCallback(cb)
cat("", file="empty.R")
cat("print(1+1)\nprint(2+2)\nprint(3+\n3)\n", file="three.R")
# Without the change, the callback receives the last expression from
# the file as expr, while it should receive the 'source(...)' expression.
source("three.R")
source("empty.R")

################################################################################
> # Result of running in standard R-2.11.1:
> sessionInfo()
R version 2.11.1 (2010-05-31) 
i486-pc-linux-gnu 

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     
> cb <- function(expr, value, ok, visible) {cat("cb: expr=", deparse(expr), "\n"); TRUE}
> addTaskCallback(cb)
1 
1 
cb: expr= addTaskCallback(cb) 
> cat("", file="empty.R")
cb: expr= cat("", file = "empty.R") 
> cat("print(1+1)\nprint(2+2)\nprint(3+\n3)\n", file="three.R")
cb: expr= cat("print(1+1)\nprint(2+2)\nprint(3+\n3)\n", file = "three.R") 
> # Without the change, the callback receives the last expression from
> # the file as expr, while it should receive the 'source(...)' expression.
> source("three.R")
[1] 2
[1] 4
[1] 6
cb: expr= print(3 + 3) 
> source("empty.R")

 *** caught segfault ***
address (nil), cause 'memory not mapped'
Segmentation fault

################################################################################
> # Result of running in patched version of R
> cb <- function(expr, value, ok, visible) {cat("cb: expr=", deparse(expr), "\n"); TRUE}
> addTaskCallback(cb)
1 
1 
cb: expr= addTaskCallback(cb) 
> cat("", file="empty.R")
cb: expr= cat("", file = "empty.R") 
> cat("print(1+1)\nprint(2+2)\nprint(3+\n3)\n", file="three.R")
cb: expr= cat("print(1+1)\nprint(2+2)\nprint(3+\n3)\n", file = "three.R") 
> # Without the change, the callback receives the last expression from
> # the file as expr, while it should receive the 'source(...)' expression.
> source("three.R")
[1] 2
[1] 4
[1] 6
cb: expr= source("three.R") 
> source("empty.R")
cb: expr= source("empty.R") 
> 
(session continues without apparent problems)

Here's a diff of the changes I made:
$ diff -c main.c.orig main.c
*** main.c.orig	2010-09-04 00:58:29.877737309 -0400
--- main.c	2010-09-04 00:13:55.424926205 -0400
***************
*** 203,208 ****
--- 203,209 ----
  {
      int c, browsevalue;
      SEXP value;
+     SEXP thisExpr;
      Rboolean wasDisplayed = FALSE;
  
      if(!*state->bufp) {
***************
*** 258,263 ****
--- 259,265 ----
  	R_EvalDepth = 0;
  	resetTimeLimits();
  	PROTECT(R_CurrentExpr);
+ 	thisExpr = R_CurrentExpr;
  	R_Busy(1);
  	value = eval(R_CurrentExpr, rho);
  	SET_SYMVALUE(R_LastvalueSymbol, value);
***************
*** 266,272 ****
  	    PrintValueEnv(value, rho);
  	if (R_CollectWarnings)
  	    PrintWarnings();
! 	Rf_callToplevelHandlers(R_CurrentExpr, value, TRUE, wasDisplayed);
  	R_CurrentExpr = value; /* Necessary? Doubt it. */
  	UNPROTECT(1);
  	R_IoBufferWriteReset(&R_ConsoleIob);
--- 268,274 ----
  	    PrintValueEnv(value, rho);
  	if (R_CollectWarnings)
  	    PrintWarnings();
! 	Rf_callToplevelHandlers(thisExpr, value, TRUE, wasDisplayed);
  	R_CurrentExpr = value; /* Necessary? Doubt it. */
  	UNPROTECT(1);
  	R_IoBufferWriteReset(&R_ConsoleIob);

The patched version of R passes "make check" under Ubuntu.  I do not know whether "thisExpr" needs to be PROTECT()'d (I suspect not, as its referent is PROTECTED, but I don't know enough about PROTECTion to be sure.)  I also do not know if it is a potential source of other crashes that source() can leave R_CurrentExpr set to 0x0 rather than R_NilValue.
Comment 9 Tony Plate 2010-09-04 16:23:25 UTC
On further analysis, I see that main.c:Rf_ReplIteration is not being called recursively, but the global variable R_CurrentExpr is set in many places in the code.  In particular, it is set in xxvalue() in gram.y, so any time the parser is called, R_CurrentExpr may get set to the value of the most recently parsed expression.  Consequently, R_CurrentExpr is not suitable for remembering the value of the expression over a call to evaluate an expression (as it is currently used in main.c:Rf_ReplIteration).  Under this analysis, the fix I proposed is still valid.

Here is a smaller example illustrating this behavior.  The callback function cb() is intended to print the unevaluated value of the expression just evaluated.  Note that when the expression evaluated from the top level is a call to parse(), the callback function sees the parsed expression, not the top-level expression.

> cb <- function(expr, value, ok, visible) {cat("cb: expr=", deparse(expr), "\n"); TRUE}
> addTaskCallback(cb)
1 
1 
cb: expr= addTaskCallback(cb) 
> 1+2
[1] 3
cb: expr= 1 + 2 
> parse(text="3+4")
expression(3+4)
attr(,"srcfile")
<text> 
cb: expr= 3 + 4 
>
Comment 10 Duncan Murdoch 2010-09-08 18:55:30 UTC
I'll look into this.  At first glance the patch looks like the right approach.
Comment 11 Duncan Murdoch 2010-09-08 19:49:10 UTC
I used Tony's patch, thanks!

We're using R_CurrentExpr in way too many places, so there might be other conflicts, but I couldn't spot any in main.c.  At some point in the future those should be cleaned up.