Bug 16584 - scan() may add "\n" to UTF-8 string in some locale settings on Windows
Status: UNCONFIRMED
Alias: None
Product: R
Classification: Unclassified
Component: Windows GUI / Window specific
Version: R 3.2.2
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance: P5 normal
Assignee: R-core
Reported: 2015-10-28 12:59 UTC by Qin Wenfeng
Modified: 2015-10-28 12:59 UTC

Description Qin Wenfeng 2015-10-28 12:59:15 UTC
> sessionInfo()
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 8 x64 (build 9200)

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 
[2] LC_CTYPE=Chinese (Simplified)_China.936   
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C                              
[5] LC_TIME=Chinese (Simplified)_China.936    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.2.2

> scan(text = "个", what = "character")
Read 1 item
[1] "个\n"
> scan(text = "一", what = "character")
Read 1 item
[1] "一"


> Sys.setlocale(locale= "Korean_Korea")
[1] "LC_COLLATE=Korean_Korea.949;LC_CTYPE=Korean_Korea.949;LC_MONETARY=Korean_Korea.949;LC_NUMERIC=C;LC_TIME=Korean_Korea.949"
> scan(text = "个", what = "character")
Read 1 item
[1] "个\n"
> scan(text = "一", what = "character")
Read 1 item
[1] "一"


> Sys.setlocale(locale="English")
[1] "LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252"
> scan(text = "个", what = "character")
Read 1 item
[1] "个"
> scan(text = "一", what = "character")
Read 1 item
[1] "一"
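
A minimal sketch for checking the behaviour above on another machine or locale, without depending on how the script file itself is encoded. The helper name `has_trailing_newline` is hypothetical and not part of R. The sketch only inspects bytes; it does not establish the cause. The two characters differ only in their last UTF-8 byte (0xAA versus 0x80), which may matter when the bytes are interpreted in a double-byte locale, but that is an assumption and is not confirmed in this report.

```r
# Build the two test strings from Unicode escapes so the result does not
# depend on the encoding of this source file.
ge <- "\u4e2a"  # the character that shows the extra "\n" in the report
yi <- "\u4e00"  # the character that does not

# UTF-8 bytes of each character.
charToRaw(enc2utf8(ge))  # [1] e4 b8 aa
charToRaw(enc2utf8(yi))  # [1] e4 b8 80

# Hypothetical helper: TRUE for each element whose last byte is a newline (0x0a).
# Working on raw bytes avoids regex errors on strings that are invalid in
# the current locale.
has_trailing_newline <- function(x) {
  vapply(x, function(s) {
    b <- charToRaw(s)
    length(b) > 0 && b[length(b)] == as.raw(0x0a)
  }, logical(1), USE.NAMES = FALSE)
}

# Repeat the calls from the report; TRUE would indicate the reported behaviour.
res_ge <- scan(text = ge, what = "character", quiet = TRUE)
res_yi <- scan(text = yi, what = "character", quiet = TRUE)
has_trailing_newline(res_ge)
has_trailing_newline(res_yi)
```

Running this after each `Sys.setlocale()` call used above would show whether the trailing newline appears only under the code page 936 and 949 locales.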