Bugzilla – Full Text Bug Listing
Summary: grepl does not work for some characters in Japanese Windows
Product: R
Reporter: George Yoshida <dynkin>
Description George Yoshida 2011-07-02 04:40:12 UTC
The following regex pattern does not work:

    > grepl("ー", c("a", "b"))
    Error in grepl("ー", c("a", "b")) :
      invalid regular expression 'ー', reason 'Missing ']''

OS/locale info:
    OS: Windows 7 64-bit, Japanese environment
    Charset: CP932

In the CP932 charset, "ー" (a single double-byte character) is encoded as '\x81\x5b', and '\x5b' in ASCII is '['.
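The encoding claim above can be checked outside R. The following is a minimal Python sketch, not R's or TRE's code: it encodes "ー" with Python's `cp932` codec and compares a plain byte-wise scan for '[' against a scan that skips the trail byte of each double-byte character. The helper names (`is_cp932_lead_byte`, `naive_bracket_offsets`, `dbcs_aware_bracket_offsets`) are hypothetical and exist only for this illustration. The assumption is that a byte-oriented regex parser would behave like the naive scan, which would be consistent with the 'Missing ']'' error.

```python
# Illustration only: these helpers are hypothetical and are not taken from R or TRE.
PATTERN = "ー"  # U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK

# Bytes of the pattern as they would appear in a CP932 locale.
encoded = PATTERN.encode("cp932")


def is_cp932_lead_byte(b):
    """Return True if byte b starts a double-byte CP932 character."""
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def naive_bracket_offsets(data):
    """Byte-wise scan: report every 0x5B byte as a '[' metacharacter."""
    return [i for i, b in enumerate(data) if b == 0x5B]


def dbcs_aware_bracket_offsets(data):
    """Scan that steps over both bytes of each double-byte character."""
    offsets = []
    i = 0
    while i < len(data):
        if is_cp932_lead_byte(data[i]) and i + 1 < len(data):
            i += 2  # skip lead byte and trail byte together
            continue
        if data[i] == 0x5B:
            offsets.append(i)
        i += 1
    return offsets


print(encoded.hex())                        # 815b
print(naive_bracket_offsets(encoded))       # [1]
print(dbcs_aware_bracket_offsets(encoded))  # []
```

The naive scan sees a '[' at offset 1, i.e. the trail byte of "ー", while the lead-byte-aware scan sees none.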
Comment 1 Brian Ripley 2011-11-04 12:46:42 UTC
Seems specific to DBCS character sets, and is using third-party code (TRE). Would need a Japanese-language-enabled Windows to reproduce.
Comment 2 Brian Ripley 2011-11-05 06:48:27 UTC
So this can be reproduced in European Windows by

    env LC_CTYPE=ja Rterm
    > grepl("\x81\x5b", c("a", "b"))

Fixed for 2.14.0 patched.