Bug 16046 - Unicode nuls are allowed in strings
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Low-level
Version: R-devel (trunk)
Hardware: All All
Importance: P5 minor
Assignee: R-core
Reported: 2014-10-27 11:14 UTC by Richard Cotton
Modified: 2014-10-27 15:26 UTC

Description Richard Cotton 2014-10-27 11:14:25 UTC
For a long time, embedding a nul character in a string using \0 has thrown an error.

"\0"
## Error: embedded nul in string: '\0'

However, it is still possible to enter a nul character using Unicode syntax.

"abc\u0000def"
## [1] "abc"

R's behaviour should be consistent between the two specifications of nul.  That is, attempting to create strings containing "\u0000" should throw an error.
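
The consistency asked for here can be checked with a small sketch like the one below. It parses the source text of each literal and records whether the parser rejects it. The helper name `literal_is_rejected` is hypothetical (it is not part of R), and the result for the Unicode form depends on the R version in use: on a build without the fix it should report FALSE for that literal.

```r
# Hypothetical helper (not part of R): returns TRUE if parsing the given
# source text raises an error, FALSE if it parses cleanly.
literal_is_rejected <- function(src) {
  tryCatch({
    parse(text = src, keep.source = FALSE)
    FALSE
  }, error = function(e) TRUE)
}

# Source text of three string literals. The doubled backslashes mean the
# parser sees "\0" and "abc\u0000def" exactly as typed at the console.
literals <- c(
  octal   = '"\\0"',
  unicode = '"abc\\u0000def"',
  plain   = '"abcdef"'
)

rejected <- vapply(literals, literal_is_rejected, logical(1))
print(rejected)

# Consistent behaviour means both nul spellings get the same treatment.
consistent <- identical(rejected[["octal"]], rejected[["unicode"]])
```

If `consistent` is FALSE, the two spellings of nul are still being treated differently by the parser.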
Comment 1 Duncan Murdoch 2014-10-27 14:16:21 UTC
I agree that these two cases should be handled similarly.  The difference arises because both cases are currently handled by the string-building code, which differs for byte-sized chars versus wide chars; the detection should probably happen earlier.
Comment 2 Duncan Murdoch 2014-10-27 15:26:52 UTC
Fixed in R-devel; will port to R-patched after 3.1.2 is released.