Bug 17181 - png() doesn't handle UTF-8 encoded filenames
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Windows GUI / Window specific
Version: R-devel (trunk)
Hardware: Other Other
Importance: P5 minor
Assignee: R-core
Reported: 2016-11-14 21:38 UTC by Kevin Ushey
Modified: 2018-02-22 16:42 UTC

Description Kevin Ushey 2016-11-14 21:38:18 UTC
To reproduce:

> setwd(tempdir())
> path <- iconv("brûlée.png", to = "UTF-8")
> png(filename = path)
> dev.off()
null device 
          1 
> list.files()
[1] "brÃ»lÃ©e.png"

Note that the string 'brûlée' can be represented in the system locale in this case (latin1); I believe the issue is a missing translation from UTF-8 to the system encoding when generating the file. Note that 'brÃ»lÃ©e' is the correct byte sequence for the above UTF-8 encoded string; those bytes are simply being misinterpreted in the system encoding.

Perhaps there's a missing 'translateChar()' call somewhere?
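
The misinterpretation described above can be reproduced without creating any file. This is a minimal sketch using base R only; the variable names are illustrative and not part of the original report. It takes the UTF-8 bytes of the file name and decodes them as latin1, which mimics what happens when the name reaches the OS untranslated:

```r
# UTF-8 encoded file name, written with \u escapes so the source stays ASCII.
path <- enc2utf8("br\u00fbl\u00e9e.png")

# The raw UTF-8 bytes; each accented letter occupies two bytes.
bytes <- charToRaw(path)
print(bytes)

# Decode the very same bytes as latin1 instead of UTF-8 (assumption: this
# models the missing translation step; it is not the code path png() uses).
misread <- iconv(rawToChar(bytes), from = "latin1", to = "UTF-8")
print(misread)
```

The second `print()` should show the same garbled name that `list.files()` reported, since both come from reading UTF-8 bytes in a single-byte encoding.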

---

> sessionInfo()
R Under development (unstable) (2016-11-13 r71655)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] compiler_3.4.0
Comment 1 Tomas Kalibera 2018-02-22 16:42:07 UTC
Fixed in 74289. The filename is now translated to the native encoding (as was already the case for the "cairo" and "cairo-png" types), so whether the example works depends on the current locale/native encoding. If the UTF-8 version of the filename contains characters that cannot be represented in the native encoding, the resulting file name can still be surprising, but fixing this would not be feasible (e.g. cairo does not have an API for this).
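
For callers who want to detect the remaining "surprising" case up front, here is a rough sketch, assuming base R only. `is_native_representable` is a hypothetical helper, not an R API; it relies on `iconv()` returning `NA` when a string cannot be converted to the current locale's encoding:

```r
# Hypothetical helper: TRUE if every character of x survives conversion
# from UTF-8 to the native encoding of the current session.
is_native_representable <- function(x) {
  # to = "" requests the current locale's encoding; with the default
  # sub = NA, input that cannot be converted yields NA.
  !is.na(iconv(enc2utf8(x), from = "UTF-8", to = ""))
}

path <- "br\u00fbl\u00e9e.png"
ok <- is_native_representable(path)

if (ok) {
  # Safe to hand to a graphics device, e.g. png(filename = enc2native(path)).
  message("file name is representable in the native encoding")
} else {
  # The device would still create a file, but under a different name.
  message("file name is NOT representable in the native encoding")
}
```

Which branch runs depends on the session locale: in a latin1 or UTF-8 locale the name above should convert, while in a plain C locale it likely will not.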