Bug 15180 - encodeString() is inconsistent with print.default() for some UTF-8 characters on Windows
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: Windows GUI / Window specific
Version: R 2.15.1
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance: P5 trivial
Assignee: R-core
Depends on:
Reported: 2013-01-21 16:30 UTC by Jochen Wilhelm
Modified: 2013-01-21 16:57 UTC
Description Jochen Wilhelm 2013-01-21 16:30:20 UTC
If the level labels of a factor contain rare Unicode characters, those characters are displayed correctly when the values are printed, but not when the levels are printed. After a call to levels(), however, the labels are displayed correctly again (see the "less-than-or-equal" sign in the example below):

> v <- factor(1:2)
> levels(v) <- c("\u22641",">1")
> v
[1] ≤1 >1
Levels: =1 >1
> levels(v)
[1] "≤1" ">1"

> sessionInfo()
R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
Comment 1 Simon Urbanek 2013-01-21 16:47:38 UTC
This comes from encodeString() on Windows:

> encodeString("\u2264")
[1] "="
> print("\u2264")
[1] "≤"

The documentation for encodeString() warns that determining which characters are printable is unreliable on Windows, so I'm not sure whether this is a bug or just one of the many quirks of Windows. Arguably, one would expect print.default() and encodeString() to give comparable results, though.
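
For anyone who wants to check whether their own setup is affected, here is a minimal sketch. It assumes, as the factor output in the description suggests, that the "Levels:" line is produced via encodeString(). check_encode() is a hypothetical helper written for this comment, not a base R function, and its result depends on platform and locale.

```r
# Hypothetical helper (not part of base R): for each string, report the
# code point of its first character and whether encodeString() returns
# the string unchanged.
check_encode <- function(x) {
  enc <- encodeString(x)
  data.frame(
    codepoint = vapply(
      x,
      function(s) sprintf("U+%04X", utf8ToInt(s)[1L]),
      character(1),
      USE.NAMES = FALSE
    ),
    unchanged = enc == x,  # FALSE means encodeString() altered the string
    stringsAsFactors = FALSE
  )
}

# Same factor as in the description.
v <- factor(1:2)
levels(v) <- c("\u22641", ">1")

res <- check_encode(levels(v))
print(res)
```

A FALSE in the `unchanged` column for U+2264 would correspond to the substitution shown above. In a UTF-8 locale both rows would be expected to come out unchanged.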