Bug 14435 - reshape function (package: stats) ignores 'sep' argument.
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Language
Version: R 2.11.1
Hardware: x86_64/x64/amd64 (64-bit) Mac OS X v10.6
Importance: P5 trivial
Assigned To: R-core
Reported: 2010-11-16 21:20 UTC by Ryne
Modified: 2010-11-19 15:18 UTC (History)
Attachments
changes to reshape function as described above (edits: lines 78, 85, 126) (8.54 KB, application/octet-stream)
2010-11-16 21:20 UTC, Ryne
Description Ryne 2010-11-16 21:20:53 UTC
Created attachment 1142
changes to reshape function as described above (edits: lines 78, 85, 126)

The reshape function (package: stats) currently ignores the sep argument when converting from tall data to wide data, and instead uses the default separator ('.') regardless of input. The problem appears to lie in lines 78, 85 and 126 of the function code, which hardcode the default value rather than using the function's sep argument.

line 78:
row.names(rval) <- paste(d[, idvar], times[1L], sep = ".")

should read:

row.names(rval) <- paste(d[, idvar], times[1L], sep = sep)

line 85:
row.names(d) <- paste(d[, idvar], times[i], sep = ".")

should read:
row.names(d) <- paste(d[, idvar], times[i], sep = sep)

line 126:
varying <- outer(v.names, times, paste, sep = ".")

should read:

varying <- outer(v.names, times, paste, sep = sep)
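
To make the effect of the line 126 change concrete, here is a minimal standalone sketch of how outer() combined with paste() builds the wide-format column names. It only illustrates the two base R calls; the names default_names and custom_names are introduced here for the example and are not part of reshape().

```r
# Illustration only: how paste() inside outer() builds wide-format names.
v.names <- c("x", "y")
times <- 1:2

# Hardcoded separator, as in the line quoted above.
default_names <- outer(v.names, times, paste, sep = ".")

# Separator supplied by the caller, as in the proposed change.
custom_names <- outer(v.names, times, paste, sep = "")

print(default_names)
print(custom_names)
```

Each result is a 2 x 2 character matrix with one row per variable name and one column per time.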


Here's the code I'm using to assess the problem. I've also attached an edited version of the function that fixes the problem as described above. However, it may be useful to add some error checks, as one could set sep=NULL and break the function.

test <- data.frame(x=rnorm(100), 
	y=rnorm(100), 
	famid=rep(1:50, each=2), 
	time=rep(1:2, 50))

wide <- reshape(data=test, 
	v.names=c("x", "y"), 
	idvar="famid", 
	timevar="time",
	sep="",
	direction="wide")
	
names(wide)
# returns [1] "famid" "x.1"   "y.1"   "x.2"   "y.2"
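
The error check suggested in the description could look something like the sketch below. check_sep is a hypothetical helper written for this example; it is not part of the stats package, and the error message wording is an assumption.

```r
# Hypothetical helper (not in stats): validate 'sep' before it reaches paste().
check_sep <- function(sep) {
  # Reject NULL, non-character values, vectors of length != 1, and NA.
  if (!is.character(sep) || length(sep) != 1L || is.na(sep)) {
    stop("'sep' must be a single non-missing character string")
  }
  invisible(sep)
}

check_sep("")   # accepted: empty string is a valid separator
check_sep(".")  # accepted: the default

# NULL is rejected with a clear message instead of failing later.
result <- tryCatch(check_sep(NULL), error = function(e) conditionMessage(e))
print(result)
```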
Comment 1 Brian Ripley 2010-11-19 14:21:10 UTC
This is as documented: have you read the help?

     sep: A character vector of length 1, indicating a separating
          character in the variable names in the wide format. This is
          used for guessing ‘v.names’ and ‘times’ arguments based on
          the names in ‘varying’. If ‘sep==""’, the split is just
          before the first numeral that follows an alphabetic
          character.

It certainly does not apply to row names: one could argue that it should
apply to generating names (as well as guessing them) for the wide format,
and I've altered that for R 2.12.1.  But that is not what the documentation said it did.

Please don't provide edits on the deparse, but on the original sources:
your attachment was very hard to disentangle.
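
For reference, the documented use of sep (guessing v.names and times from the names in varying) applies when going from wide to long. A minimal sketch, using a small made-up data frame whose column names x1 and x2 are assumptions chosen for the example:

```r
# Made-up wide data: one id column and a measurement 'x' at times 1 and 2.
wide <- data.frame(id = 1:2, x1 = c(5, 6), x2 = c(7, 8))

# With sep = "", the names in 'varying' are split just before the first
# numeral that follows an alphabetic character: "x1" -> v.name "x", time 1.
long <- reshape(wide,
                direction = "long",
                varying = c("x1", "x2"),
                sep = "",
                idvar = "id")

print(long)
```

Here reshape() is not told v.names or times explicitly; both are inferred from the column names via sep.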
Comment 2 Ryne 2010-11-19 15:18:33 UTC
I misread the documentation, specifically the portion of the 'Details' section that stated "To have alphabetic followed by numeric times use sep="".". On a second read, you're right that the functionality I assumed was there did not exist. I'll be sure to make any future edits against the original sources.

I appreciate the help, and look forward to the change in 2.12.1.