Bug 14035

Summary: Internal error in 'ls' for pathological environments
Product: R Reporter: Jitterbug compatibility account <jitterbug-import>
Component: Low-levelAssignee: Jitterbug compatibility account <jitterbug-import>
Status: CLOSED FIXED    
Severity: normal    
Priority: P5    
Version: old   
Hardware: ix86 (32-bit)   
OS: Windows 32-bit   

Description Jitterbug compatibility account 2009-11-01 09:59:11 UTC
From: Stavros Macrakis <macrakis@alum.mit.edu>
nchar(with(list(2),ls())) gives an internal error. This is of course
a peculiar call (no names in the list), but the error is not caught
cleanly.

It is not clear from the documentation whether with(list(2)...) is
allowable; if it is not, it should presumably give an error. If it is, then
ls
shouldn't have problems with the resulting environment.

> qq <- with(list(2),ls())                         # An incorrect call (no
names in list)
> nchar(qq)
Error in nchar(qq) : 'getEncChar' must be called on a CHARSXP  # ls returned
a bad object
> qq
[1]Error: 'getEncChar' must be called on a CHARSXP
> qq[1]
[1]Error: 'getEncChar' must be called on a CHARSXP
> qq[2]
[1] NA

Apparently related:

> with(list(a=1,2),ls())
Error in ls() : 'getEncChar' must be called on a CHARSXP



--please do not edit the information below--

Version:
 platform = i386-pc-mingw32
 arch = i386
 os = mingw32
 system = i386, mingw32
 status =
 major = 2
 minor = 9.2
 year = 2009
 month = 08
 day = 24
 svn rev = 49384
 language = R
 version.string = R version 2.9.2 (2009-08-24)

Windows Vista x64 (build 6002) Service Pack 2

Locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

Search Path:
 .GlobalEnv, package:stats, package:graphics, package:grDevices,
package:utils, package:datasets, package:methods, Autoloads, package:base

	[[alternative HTML version deleted]]

Comment 1 Jitterbug compatibility account 2009-11-01 17:02:25 UTC
From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
macrakis@alum.mit.edu wrote:
> nchar(with(list(2),ls())) gives an internal error. This is of course
> a peculiar call (no names in the list), but the error is not caught
> cleanly.
> 
> It is not clear from the documentation whether with(list(2)...) is
> allowable; if it is not, it should presumably give an error. If it is, then
> ls
> shouldn't have problems with the resulting environment.
> 
>> qq <- with(list(2),ls())                         # An incorrect call (no
> names in list)
>> nchar(qq)
> Error in nchar(qq) : 'getEncChar' must be called on a CHARSXP  # ls returned
> a bad object
>> qq
> [1]Error: 'getEncChar' must be called on a CHARSXP
>> qq[1]
> [1]Error: 'getEncChar' must be called on a CHARSXP
>> qq[2]
> [1] NA
> 
> Apparently related:
> 
>> with(list(a=1,2),ls())
> Error in ls() : 'getEncChar' must be called on a CHARSXP

Thanks, yes, this looks wrong.

Also, closer to the root cause:

 > eval(quote(ls()),list(a=1,2))
Error in ls() : 'getEncChar' must be called on a CHARSXP


 > e <- evalq(environment(),list(2))
 > ls(e)
[1]Error: 'getEncChar' must be called on a CHARSXP

It is not quite clear that it should be allowed to have unnamed elements 
in lists used by eval (which is what with() uses internally). I suppose 
it should, since the intended semantics are unaffected in most cases. 
(I.e., there is nothing really wrong with eval(quote(a+b), 
list(a=1,b=2,3,4)), and people may have been using such code unwittingly 
all over.)

However, it IS a bug that we are creating ill-formed environments. The 
culprit seems to be that NewEnvironment (memory.c) is getting called in 
violation of its assumption that

" This definition allows
   the namelist argument to be shorter than the valuelist; in this
   case the remaining values must be named already.  (This is useful
   in cases where the entire valuelist is already named--namelist can
   then be R_NilValue.)
"

Removing the assumption from NewEnvironment looks like an efficiency 
sink, so I would suggest that we fix do_eval (eval.c) instead, 
effectively doing l <- l[names(l) != ""].  So in

     case LISTSXP:
	env = NewEnvironment(R_NilValue, duplicate(CADR(args)), encl);
	PROTECT(env);
	break;

we need to replace the duplicate() call with something that skips the 
unnamed elements.

The only side effect I can see from such a change is the resulting 
environments get a different length() than before. I'd say that if there 
are coder who actually rely on the length of an invalid environment, 
then they'd deserve what they'd get...

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907

Comment 2 Jitterbug compatibility account 2009-11-18 03:21:00 UTC
NOTES:
 VECSXP case fixed for 2.11.0 (this was a vector list, not a pairlist)
Comment 3 Jitterbug compatibility account 2009-11-18 03:21:28 UTC
Audit (from Jitterbug):
Tue Nov 17 21:21:28 2009	ripley	changed notes
Tue Nov 17 21:21:28 2009	ripley	moved from incoming to Low-level-fixed