Bug 6899 - Wishlist: nchar() for factors
Status: RESOLVED FIXED
Product: R
Classification: Unclassified
Component: Wishlist
Version: R 2.14.0
Hardware: ix86 (32-bit) Windows 32-bit
Importance: P5 normal
Assigned To: Jitterbug compatibility account
Reported: 2004-05-21 09:57 UTC by Jitterbug compatibility account
Modified: 2011-11-11 08:35 UTC (History)
Description Jitterbug compatibility account 2004-05-21 09:57:56 UTC
From: "Vadim Ogranovich" <vograno@evafunds.com>
Hi,

The R FAQ doesn't explicitly say how to submit patches, so I assume they
can be treated as wishlist items and should therefore be mailed to R-bugs.
Please correct me if I am wrong.


Currently (R 1.8.1), nchar() for factors is NOT equivalent to
nchar(as.character(myFactor)). This is counterintuitive; see the example
below:

> z <- as.factor("ZZZ")
> z=="ZZZ"
[1] TRUE
> nchar(z)
[1] 1


Here is a simple patch to fix it.


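nchar <- function(x) {
  # For a factor, count the characters of the level labels, not the codes.
  if (is.factor(x)) {
    ncharLevels <- .Internal(nchar(levels(x)))
    # Indexing by a factor uses its integer codes: one count per element.
    return(ncharLevels[x])
  }

  # Everything else goes to the internal implementation unchanged.
  .Internal(nchar(x))
}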


# simple testing
> nchar(c("ZZZ","A", "A", "BB"))
[1] 3 1 1 2
> nchar(as.factor(c("ZZZ","A", "A", "BB")))
[1] 3 1 1 2



Thank you for your consideration,
Vadim
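
The idea behind the patch can also be tried without redefining nchar() or calling .Internal directly. The following is only a sketch: the name nchar_factor is hypothetical and not part of base R. It counts the characters of each level once and then expands the counts to one per element.

```r
# Hypothetical helper (not part of base R): character counts for a factor,
# computed from its level labels rather than its integer codes.
nchar_factor <- function(x) {
  if (is.factor(x)) {
    n_levels <- nchar(levels(x))       # count each distinct label once
    return(n_levels[as.integer(x)])    # one count per element, via the codes
  }
  nchar(x)                             # non-factors: plain nchar()
}

x <- c("ZZZ", "A", "A", "BB")
print(nchar_factor(factor(x)))
print(identical(nchar_factor(factor(x)), nchar(x)))
```

Counting per level rather than per element mirrors the patch above; for a long factor with few levels it avoids converting every element to a string.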

Comment 1 Jitterbug compatibility account 2004-05-23 01:07:00 UTC
NOTES:
 nchar is not said to work for factors, so why would anyone expect it to?
Comment 2 Jitterbug compatibility account 2004-05-23 03:02:01 UTC
Audit (from Jitterbug):
Sat May 22 22:02:01 2004	ripley	changed notes
Sat May 22 22:02:01 2004	ripley	moved from incoming to wishlist
Comment 3 K Wright 2011-11-08 23:32:23 UTC
I second the wish of the original poster.  Additional comments below.


1. R 2.14.0 has

R> nchar(factor(LETTERS))
 [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

Both Martin M. and Peter D. say nchar(x) is working as documented, if the documentation is read very carefully.


2. As noted by Martin in an email to R-core, R-0.2 had:

> nchar(factor(LETTERS))
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1


3. substring(text) is not explicitly documented for factors, but the help page does say:

For the extraction functions, x or text will be converted to a character vector by as.character if it is not already one.

Thus we get the expected:

R> substring(factor(state.name[1:3]), 2)
[1] "labama" "laska"  "rizona"

4. With some similarities to substring, the help page for nchar(x) says:
  
x 	character vector, or a vector to be coerced to a character vector.

and

The internal equivalent of the default method of as.character is performed on x (so there is no method dispatch).

Both of these comments seem to suggest that 'as.character' could be added prior to the call to nchar without changing the result.  However,

R> nchar(as.character(factor(LETTERS)))
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

R> nchar(factor(LETTERS))
 [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

5. If nchar is not documented to work on factors, wouldn't it be better to throw an error instead of the current behavior?
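
The two outputs in points 1 and 4 are consistent with nchar() having counted the characters of the factor's integer codes rather than its labels. That is an interpretation of the "no method dispatch" sentence, not something the help page states outright. A minimal sketch that reproduces both vectors using explicit conversions only, so that it does not depend on how nchar() itself treats factors:

```r
f <- factor(LETTERS)

# Integer codes stored inside the factor: 1, 2, ..., 26.
codes <- as.integer(f)

# Counting the characters of the codes written as text ("1", ..., "26")
# gives nine 1s followed by seventeen 2s, matching point 1.
n_codes <- nchar(as.character(codes))

# Converting to the level labels first gives 1 for every letter,
# matching the first output in point 4.
n_labels <- nchar(as.character(f))

print(n_codes)
print(n_labels)
```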
Comment 4 Brian Ripley 2011-11-08 23:38:37 UTC
It clearly is documented only to work on vectors.
Comment 5 Brian Ripley 2011-11-11 08:35:23 UTC
Changed to be an error in R-devel (the change affects 4 CRAN packages)
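
A short check of the resolved behavior, assuming an R release that includes the change described in Comment 5. The variable name raises_error is only illustrative, and the exact error message is deliberately not tested.

```r
f <- factor(c("ZZZ", "A", "A", "BB"))

# TRUE if nchar() refuses the factor, FALSE if it returns a value.
raises_error <- tryCatch({
  nchar(f)
  FALSE
}, error = function(e) TRUE)
print(raises_error)

# Converting explicitly counts the characters of the labels.
print(nchar(as.character(f)))
```

Code that needs character counts for a factor should therefore call as.character() first.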