Bug 16101 - When printing list with multibyte UTF-8 names on Windows GUI, the first name is printed as characters, the rest with <U+xxxx> format.
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: I/O
Version: R 3.1.2
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance: P5 minor
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2014-12-08 18:29 UTC by Bill Dunlap
Modified: 2014-12-08 18:29 UTC
Description Bill Dunlap 2014-12-08 18:29:21 UTC
In the Windows GUI for R-3.1.2, create and print a list with some Cyrillic characters in the names. Note that the name for the first list item is printed in Cyrillic (where appropriate) but subsequent names are printed using the <U+xxxx> format.

a <- "One is \u043E\u0434\u0438\u043D\nTwo is \u0434\u0432\u0430\n"
Encoding(a) # expect "UTF-8"
sapply(strsplit(a, "\n")[[1]], charToRaw)[c(1,1,2)]
$`One is один`
 [1] 4f 6e 65 20 69 73 20 d0 be d0 b4 d0 b8 d0 bd

$`One is <U+043E><U+0434><U+0438><U+043D>`
 [1] 4f 6e 65 20 69 73 20 d0 be d0 b4 d0 b8 d0 bd

$`Two is <U+0434><U+0432><U+0430>`
 [1] 54 77 6f 20 69 73 20 d0 b4 d0 b2 d0 b0

If I start non-GUI R for Windows in a cmd.exe window then all names are printed in <U+xxxx> format. On Linux, in a putty.exe window, all are printed in Cyrillic.
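
The following is a minimal sketch, not part of the original report, for inspecting the names without going through print(). It assumes only base R; the variable names x and nms are introduced here for illustration. If the first two names compare as identical, the difference seen above would have to arise while printing rather than in the stored strings.

```r
# Rebuild the list from the report and inspect its names directly,
# bypassing print().
a <- "One is \u043E\u0434\u0438\u043D\nTwo is \u0434\u0432\u0430\n"
x <- sapply(strsplit(a, "\n")[[1]], charToRaw)[c(1, 1, 2)]
nms <- names(x)

Encoding(nms)               # declared encoding of each name
identical(nms[1], nms[2])   # elements 1 and 2 were selected from the same entry
identical(x[[1]], x[[2]])   # and therefore hold the same raw bytes
```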

> l10n_info()
$MBCS
[1] FALSE

$`UTF-8`
[1] FALSE

$`Latin-1`
[1] TRUE

$codepage
[1] 1252
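
The l10n_info() output above reports a Latin-1 session using code page 1252. As a rough illustration, not taken from the original report, the sketch below tries to convert the Cyrillic text to that code page with iconv(). The encoding name "CP1252" and the variables s and converted are assumptions made for this example, and the exact result may depend on the platform's iconv implementation. Where the conversion is refused, iconv() returns NA with its default sub = NA, which is consistent with the text not being representable in the native encoding.

```r
# Cyrillic text from the report, as UTF-8.
s <- enc2utf8("\u043E\u0434\u0438\u043D")

# Attempt conversion to the code page reported by l10n_info().
# With the default sub = NA, iconv() gives NA for input it cannot convert.
converted <- iconv(s, from = "UTF-8", to = "CP1252")
is.na(converted)

# Plain ASCII text converts unchanged, for comparison.
ascii_converted <- iconv("One is", from = "UTF-8", to = "CP1252")
ascii_converted
```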

> Sys.info()
                     sysname                      release 
                   "Windows"                      "7 x64" 
                     version                     nodename 
"build 7601, Service Pack 1"               "WDUNLAP-W520" 
                     machine                        login 
                    "x86-64"                    "wdunlap" 
                        user               effective_user 
                   "wdunlap"                    "wdunlap"