Bug 15990 - Patch for R to fix some buffer overruns and add a missing PROTECT().
Summary: Patch for R to fix some buffer overruns and add a missing PROTECT().
Alias: None
Product: R
Classification: Unclassified
Component: Low-level (show other bugs)
Version: R-devel (trunk)
Hardware: Other Linux
: P5 normal
Assignee: R-core
Depends on:
Reported: 2014-09-23 19:55 UTC by Karl Millar
Modified: 2014-09-24 17:41 UTC (History)
2 users (show)

See Also:

Patch to fix errors. (11.80 KB, patch)
2014-09-23 19:55 UTC, Karl Millar
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Karl Millar 2014-09-23 19:55:18 UTC
Created attachment 1667 [details]
Patch to fix errors.

This patch is against current svn and contains three classes of fix:
   - Ensure the result is properly terminated after calls to strncpy()
   - Replace calls of sprintf() with snprintf()
   - Added a PROTECT() call in do_while which could cause memory
errors if evaluating the condition results in a warning.

The missing PROTECT() was observed on CXXR while running:
  'while(c(FALSE, TRUE) 1'
with aggressive GC enabled and address sanitizer.  The code here is the same, so doing something similar should produce the error in R.

I don't have code to reproduce the strncpy() and sprintf() fixes, but they're elementary bugs with elementary fixes.
Comment 1 Duncan Murdoch 2014-09-24 11:16:19 UTC
I've reviewed these changes, and they all look fine.  Thanks!  I'll apply the patch today to R-devel.  I haven't checked whether it will all work on R-patched as well, but I'll apply as much of it as possible there as well.
Comment 2 Radford Neal 2014-09-24 14:14:03 UTC
For the patch in platform.c, note that there is a separate copy of the do_filecopy function for Windows in the source above the copy that the patch modifies, which also needs modification (including putting the fix for handling "" in that code, which was previously fixed only for the non-windows code).  Also, the strncpy calls with destimations of 'name' and 'from' shouldn't really be strncpy calls at all, but should just be strcpy.
Comment 3 Duncan Murdoch 2014-09-24 14:24:24 UTC
I agree with part of your comment (both versions of do_filecopy need the changes), but I don't see what harm is caused by using strnc
Comment 4 Duncan Murdoch 2014-09-24 14:26:30 UTC
I agree with part of your comment (both versions of do_filecopy need the changes), but I don't see what harm is caused by using strncpy even in cases where we currently know strcpy would work.  It seems to protect against future bugs, or reasoning bugs where what we "know" isn't actually true.
Comment 5 Radford Neal 2014-09-24 14:33:43 UTC
Regarding the fix to 'while' in eval.c, note that it makes more sense to modify asLogicalNoNA to protect its argument in the rare situation when it actually produces a warning message, avoiding the need to protect all the time in 'while' (and in 'if').  That is what is done in pqR.

Regarding strncpy vs. strcpy, it's generally a bad idea to write code that does things that make no sense, even if they don't actually damage anything, since it's harder for someone reading the code to figure it out when they're distracted by trying to figure out why something is being done when there isn't actually any reason it's being done. This code is generally not really handling the case of a long path well - for one thing MAX_PATH is not really the maximum path length - but it should at least be written so that the initial truncation is the only truncation.  Obfuscating that it is not truncating later doesn't help anything.
Comment 6 Duncan Murdoch 2014-09-24 14:49:12 UTC
There's one other error I didn't notice:  in the snprintf changes in gram.y (and so gram.c), the order of arguments was wrong.  This is already committed, so I'll need another commit for the fixes.

I'll make the asLogicalNoNA change as Radford suggests.
Comment 7 Karl Millar 2014-09-24 15:19:04 UTC
My apologies for the wrong argument order bug, and thanks Radford for pointing out improvements.

On the question of strcpy vs strncpy, personally, I think strcpy should never be used, whether or not it's needed.

Using strcpy, when the author knows that the string fits, works fine if the code is written once and never updated, but when you start updating the code it's simply too easy for bugs to creep in.  Furthermore, it's hard to quickly find the issues, since grepping the code for potentially dangerous calls returns many results, which have to be manually inspected to tell if they're buggy or not.

As Radford points out, this makes the code a little less clear, but IMHO we generally ought to favour correctness and robustness over readability in the occasional cases like this where they're at odds with each other, and if it's done consistently, the cognitive overhead is fairly low.

Frankly, strncpy should never be used either, as it's also pretty error prone.  The right thing to do would be to define a 'safe_strncpy' or similar that ensures the result is always terminated, and use that exclusively.  Or use C++ and std::string which obviates these types of issues entirely ;-)
Comment 8 Duncan Murdoch 2014-09-24 17:41:16 UTC
These changes (i.e. Karl's original submission, revised as noted in comment 6) are now committed to both R-devel and R-patched.