Created attachment 2031
This bug report is a follow-up to a thread on the R-devel mailing list: https://stat.ethz.ch/pipermail/r-devel/2016-February/072323.html
When a text file has null bytes, e.g. a UTF-16 file with ASCII code points, file.show() may fail to show it correctly.
As an example, the (system dependent) result of the following code is a pager showing "<66>" (quotes not included) followed by several empty lines.
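foobar <- charToRaw("foo\r\nbar\r\n")
foobar_utf16 <- c(as.raw(c("0xff", "0xfe")), rbind(foobar, as.raw(0L)))
filename <- tempfile()
# Assumed remaining steps (absent from this copy; the encoding value is a guess):
writeBin(foobar_utf16, filename)
file.show(filename, encoding = "UTF-16")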
This was tested on a Linux computer running "R version 3.2.4 beta (2016-02-29 r70247)" and R-devel r70247.
With the suggested patch applied, the result is as expected: a pager showing the lines "foo" and "bar" (followed by an empty line). The following is an almost verbatim copy of what I wrote earlier on the mailing list, describing the patch.
The idea is to read the input file as raw bytes in order to avoid problems with null bytes. The input then needs to be split into lines after iconv(); alternatively, it could be written to the output file with cat() if the style of the line termination characters does not matter. The 'perl = TRUE' argument is there only for an assumed performance advantage. It can be removed, or one might want to test whether it makes a significant difference one way or the other.
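The approach described above can be illustrated with a minimal sketch. This is not the attached patch: the helper name read_lines_raw, the line-splitting regular expression and the choice of "UTF-16" as the source encoding are assumptions made for illustration only. It relies on iconv() accepting a list of raw vectors, so the null bytes never pass through a character string before conversion.

```r
# Hypothetical helper (not part of R or of the attached patch):
# read a file as raw bytes, convert the encoding, then split into lines.
read_lines_raw <- function(path, encoding) {
  # Reading raw bytes sidesteps the embedded-null problem of text-mode reads
  bytes <- readBin(path, what = "raw", n = file.size(path))
  # iconv() accepts a list of raw vectors and returns a character vector
  text <- iconv(list(bytes), from = encoding, to = "UTF-8")
  if (is.na(text)) stop("conversion from ", encoding, " failed")
  # Split on CRLF, LF or CR; 'perl = TRUE' is optional here
  strsplit(text, "\r\n|\n|\r", perl = TRUE)[[1L]]
}

# Build the same kind of test file as in the report: UTF-16 with a
# little-endian byte order mark and a null byte after every ASCII byte
foobar <- charToRaw("foo\r\nbar\r\n")
foobar_utf16 <- c(as.raw(c(0xff, 0xfe)), rbind(foobar, as.raw(0L)))
filename <- tempfile()
writeBin(foobar_utf16, filename)

lines <- read_lines_raw(filename, "UTF-16")
print(lines)
```

If the style of line termination does not matter, the converted text could presumably be passed to cat() with a 'file' argument instead of being split. How the byte order mark is handled depends on the platform's iconv implementation, so the encoding name may need adjusting.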
R Under development (unstable) (2016-02-29 r70247)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.4 LTS
locale:
 LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
 LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 LC_PAPER=en_US.UTF-8 LC_NAME=C
 LC_ADDRESS=C LC_TELEPHONE=C
 LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
 stats graphics grDevices utils datasets methods base