Bug 14393 - clash between na.action=na.pass and drop.unused.levels=TRUE in model.frame.default
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Models
Version: R 2.11.1 patched
Hardware: Other Linux
Importance: P5 trivial
Assigned To: R-core
Reported: 2010-10-01 17:34 UTC by Heather Turner
Modified: 2010-10-02 02:16 UTC (History)
Description Heather Turner 2010-10-01 17:34:51 UTC
model.frame.default does not honour drop.unused.levels=TRUE when na.action=na.pass and a factor contains NAs and has exactly one unused level.

For example:
> f <- factor(c(NA, 1, 2), levels = 1:3, labels = 1:3)
> mf <- model.frame(~ f, na.action = na.pass, drop.unused.levels = TRUE)
> levels(mf$f)
[1] "1" "2" "3"
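
The result above can be traced to the pretest shown in the patch context below. A minimal sketch, assuming that `length(unique(x)) < length(levels(x))` is the only check deciding whether levels are dropped; the variable names are illustrative only:

```r
# Same factor as in the report: levels "1", "2", "3"; level "3" is unused
# and one observation is NA.
f <- factor(c(NA, 1, 2), levels = 1:3, labels = 1:3)

# unique() keeps NA as a distinct value, so it reports three values ...
n_unique <- length(unique(f))
# ... which equals the number of declared levels.
n_levels <- length(levels(f))

# The pretest is therefore FALSE and the drop step is skipped.
pretest <- n_unique < n_levels

# Applying the drop step directly does remove the unused level.
dropped <- f[, drop = TRUE]
levels(dropped)
```

The NA stands in for the missing level in the count, which is why the problem only shows up when exactly one level is unused.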

The following patch would fix the bug:

Index: src/library/stats/R/models.R
===================================================================
--- src/library/stats/R/models.R	(revision 53094)
+++ src/library/stats/R/models.R	(working copy)
@@ -424,7 +424,7 @@
 	for(nm in names(data)) {
 	    x <- data[[nm]]
 	    if(is.factor(x) &&
-	       length(unique(x)) < length(levels(x)))
+	       length(unique(na.omit(x))) < length(levels(x)))
 		data[[nm]] <- data[[nm]][, drop = TRUE]
 	}
     }
Comment 1 Brian Ripley 2010-10-01 19:37:51 UTC
Thanks, but I think na.omit is overkill here (and this pretest is being done to be efficient).  x[!is.na(x)] seems to suffice for a factor.

Changed for 2.12.0.
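
For comparison, a small sketch of the three variants of the pretest discussed in this report, evaluated on the factor from the description. `needs_drop` is a hypothetical helper written for illustration; it is not a function in stats and is not claimed to be the code committed for 2.12.0.

```r
f <- factor(c(NA, 1, 2), levels = 1:3, labels = 1:3)

# Pretest as in revision 53094: NA is counted as a value.
old_test <- length(unique(f)) < length(levels(f))

# Pretest from the proposed patch: NAs removed with na.omit().
patch_test <- length(unique(na.omit(f))) < length(levels(f))

# Pretest using the cheaper subsetting suggested in comment 1.
fix_test <- length(unique(f[!is.na(f)])) < length(levels(f))

# Hypothetical helper wrapping the comment 1 form of the test.
needs_drop <- function(x) {
    is.factor(x) && length(unique(x[!is.na(x)])) < length(levels(x))
}

# A factor that uses all its levels should not trigger the drop step,
# even when it contains NA.
g <- factor(c(NA, 1, 2, 3), levels = 1:3)
c(old = old_test, patch = patch_test, fix = fix_test,
  f = needs_drop(f), g = needs_drop(g))
```

Both na.omit(x) and x[!is.na(x)] give the same answer here; the subsetting form avoids the extra work na.omit does to record which elements were removed.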
Comment 2 Heather Turner 2010-10-02 02:16:58 UTC
Thanks for the fix. Hopefully my sloppy code at least pinpointed the bug.

On 01/10/10 19:37, r-bugs@r-project.org wrote:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14393
> 
> Brian Ripley <ripley@stats.ox.ac.uk> changed:
> 
>            What    |Removed                     |Added
> ----------------------------------------------------------------------------
>              Status|NEW                         |CLOSED
>          Resolution|                            |FIXED