Bug 15227 - cat with bad UTF8 strings from intToUtf8 can make R GUI hang
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Windows GUI / Window specific
Version: R-devel (trunk)
Hardware: All Windows 64-bit
Importance: P5 minor
Assigned To: R-core
Reported: 2013-03-04 22:33 UTC by Richard Cotton
Modified: 2013-03-07 13:00 UTC (History)

Description Richard Cotton 2013-03-04 22:33:02 UTC
Using the intToUtf8 function, it is possible to create UTF-8 strings that cause R to hang when they are printed to the R GUI console with cat.

If the string contains an end-of-text character (code point 3) followed by any non-ASCII character, cat hangs the GUI instead of failing gracefully.

To reproduce:

bad_string <- intToUtf8(c(3, 128))
cat(bad_string)

The non-ASCII characters (code points > 127) are necessary; without them, R marks the encoding of the string as "unknown" rather than "UTF-8".

Writing the string to a file with cat works OK.
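
The encoding mark can be checked without printing the string to the console. This is a minimal sketch; the variable names ascii_only and with_non_ascii are illustrative and not part of the original report.

```r
# Build one string with only ASCII code points and one that adds
# a non-ASCII code point (128) after the end-of-text character (3).
ascii_only     <- intToUtf8(c(3, 65))
with_non_ascii <- intToUtf8(c(3, 128))

# Inspect the declared encoding of each string.
print(Encoding(ascii_only))      # "unknown"
print(Encoding(with_non_ascii))  # "UTF-8"

# Inspect the underlying bytes instead of cat()-ing the string,
# so nothing problematic is sent to the GUI console.
print(charToRaw(with_non_ascii)) # 03 c2 80
```

Only the second string carries the UTF-8 mark, and only that one triggers the hang when passed to cat in the GUI.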

cat(bad_string, file = "test.txt") #ok

I've reproduced the problem under Windows 7 with the 32- and 64-bit versions of R 2.15.2 and a recent R 3.0.0-devel.

The problem doesn't occur when using R from the command line or from other IDEs (tested with RStudio).
Comment 1 Duncan Murdoch 2013-03-06 14:39:04 UTC
I can reproduce this and will investigate, but it might turn out to be a Windows bug rather than an R bug.  It doesn't happen on other platforms.
Comment 2 Duncan Murdoch 2013-03-07 13:00:08 UTC
Turned out it was our bug, now fixed in R-devel and 3.0.0-to-be.