Bug 16714 - na.omit and other na.actions incur a copy even if no change
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: Misc
Version: R-devel (trunk)
Hardware: All All
Importance: P5 enhancement
Assignee: R-core
Reported: 2016-02-17 09:08 UTC by Simen Gaure
Modified: 2016-02-17 09:36 UTC
Description Simen Gaure 2016-02-17 09:08:35 UTC
The na.action functions in library/stats/all.R have the side effect of copying the data even if no action is taken.
For example, with something like

a <- b <- rnorm(10)
d <- na.omit(data.frame(a,b))

d$a will be a copy of a.
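
One way to check whether two names refer to the same underlying vector is to compare the identifiers returned by tracemem(). The sketch below is only illustrative: same_object() is a hypothetical helper, not part of R, and it assumes an R build with memory profiling enabled (tested via capabilities("profmem")). What it prints may depend on the R version, so no expected output is shown.

```r
# Hypothetical helper: TRUE if x and y are the very same R object.
# tracemem() returns a string identifying the object; equal strings
# mean no copy was made between the two.
same_object <- function(x, y) {
    id_x <- tracemem(x)
    id_y <- tracemem(y)
    untracemem(x)
    untracemem(y)
    identical(id_x, id_y)
}

a <- b <- rnorm(10)

if (capabilities("profmem")) {
    # Column taken straight from data.frame(), without na.omit()
    print(same_object(a, data.frame(a, b)$a))
    # Column after na.omit(), the case described in this report
    d <- na.omit(data.frame(a, b))
    print(same_object(a, d$a))
}
```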
 
This does not affect the semantics in any way, but is annoying for very large datasets since it also affects things like model.frame(). The culprit is indexing with an all-TRUE logical vector, e.g. in na.omit.data.frame:

    xx <- object[!omit, , drop = FALSE]

which could be changed to avoid the copy in this special case:

    xx <- if(any(omit)) object[!omit, , drop = FALSE] else object

Alternatively, the low-level indexing can be amended, but that could be a more involved task.
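
Until something like this is changed in stats, the same special case can be approximated at user level. The following is a minimal sketch, not the proposed patch: na_omit_if_needed() is a hypothetical wrapper, and it assumes a data frame made of atomic columns, where !complete.cases() should flag the same rows that na.omit() would drop.

```r
# Hypothetical wrapper, not part of stats: skip na.omit() entirely
# when there is nothing to omit, so the input is returned as is.
na_omit_if_needed <- function(object) {
    # Rows with at least one NA (assumes atomic columns only)
    omit <- !stats::complete.cases(object)
    if (!any(omit)) return(object)   # no NA: hand back the input untouched
    stats::na.omit(object)           # otherwise defer to the usual method
}

clean <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
dirty <- data.frame(a = c(1, NA, 3), b = c(4, 5, NA))

nrow(na_omit_if_needed(clean))   # 3
nrow(na_omit_if_needed(dirty))   # 1
```

The cost of this workaround is one extra pass over the data to find the NAs when some are present, which is why a fix inside na.omit.data.frame itself would be preferable.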
Comment 1 Peter Dalgaard 2016-02-17 09:36:47 UTC
This looks like a sensible request. Possibly easier to just

   if(!any(omit)) return(object)

and then drop the if() from the later code.

Incidentally, why do we do 

    if (any(omit > 0L)) {

when (AFAICS) omit cannot be anything but logical?
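
As a quick illustration of that last point: for a logical vector, the comparison with 0L coerces TRUE and FALSE to 1 and 0 and so gives back the same values. The vector below is made up for the example.

```r
omit <- c(FALSE, TRUE, FALSE)

# Comparing a logical vector with 0L reproduces the vector itself
identical(omit > 0L, omit)     # TRUE

# so both forms of the test agree
any(omit > 0L) == any(omit)    # TRUE
```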