Bug 16648 - complete.cases() gives a misleading message
Reported: 2015-12-30 01:01 UTC by Jinsong Zhao
Modified: 2016-01-03 21:30 UTC
Description Jinsong Zhao 2015-12-30 01:01:58 UTC
I have a data frame:

> df
        Time Treat
1 2015-03-28     1
2 2015-03-28     1
3 2015-03-28     1
> dput(df)
structure(list(Time = structure(list(sec = c(0, 0, 0), min = c(0L, 
0L, 0L), hour = c(0L, 0L, 0L), mday = c(28L, 28L, 28L), mon = c(2L, 
2L, 2L), year = c(115L, 115L, 115L), wday = c(6L, 6L, 6L), yday = c(86L, 
86L, 86L), isdst = c(0L, 0L, 0L), zone = c("CST", "CST", "CST"
), gmtoff = c(NA_integer_, NA_integer_, NA_integer_)), .Names = c("sec", 
"min", "hour", "mday", "mon", "year", "wday", "yday", "isdst", 
"zone", "gmtoff"), class = c("POSIXlt", "POSIXt")), Treat = c(1L, 
1L, 1L)), .Names = c("Time", "Treat"), row.names = c(NA, 3L), class = "data.frame")

When I apply complete.cases() to it, I get an error message:

> complete.cases(df)
Error in complete.cases(df) : not all arguments have the same length

It seems that complete.cases() cannot handle the first column. I think it should say that the class/type is not supported, rather than give an error message with no hint about the actual cause.

> version
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
major          3                           
minor          2.3                         
year           2015                        
month          12                          
day            10                          
svn rev        69752                       
language       R                           
version.string R version 3.2.3 (2015-12-10)
nickname       Wooden Christmas-Tree
Comment 1 Duncan Murdoch 2016-01-02 16:01:54 UTC
The problem here is that complete.cases doesn't use the R function length() to determine the length of the Time column. Instead it uses the C macro LENGTH, which ignores the fact that Time is of class POSIXlt and sees it as a length 11 list.
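
The mismatch between the two notions of length can be seen from R itself. The following is a minimal sketch; the variable name tm and the use of tz = "UTC" are illustrative assumptions, and the exact number of components in the underlying list may differ between R versions, so no count is shown.

```r
# Build a POSIXlt vector holding three date-times (tz chosen for illustration).
tm <- as.POSIXlt(rep("2015-03-28", 3), tz = "UTC")

# Class-aware length: dispatches on POSIXlt and counts the date-times.
length(tm)

# Length of the underlying list of components (sec, min, hour, ...),
# which is what code that ignores the class would see.
length(unclass(tm))

# The component names of the underlying list.
names(unclass(tm))
```

The first call reports the number of rows a data frame would expect, while the second reports the number of list components, which is where the "not all arguments have the same length" error comes from.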

A workaround is to convert the time to POSIXct instead of POSIXlt.
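
A minimal sketch of that workaround, assuming a small data frame similar to the one in the report (the construction via `$<-` and the tz = "UTC" setting are illustrative assumptions, not taken from the original data):

```r
# Recreate a data frame with a POSIXlt column.
df <- data.frame(Treat = c(1L, 1L, 1L))
df$Time <- as.POSIXlt(rep("2015-03-28", 3), tz = "UTC")  # `$<-` keeps the POSIXlt class

# Workaround: store the column as POSIXct, an atomic vector with one
# value per row, before calling complete.cases().
df$Time <- as.POSIXct(df$Time)

class(df$Time)
complete.cases(df)
```

Because POSIXct is a plain numeric vector underneath, its internal length matches the number of rows.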

It looks quite messy to fix this properly, so I'll just add a note to the documentation warning about the problem.
Comment 2 Duncan Murdoch 2016-01-03 21:30:41 UTC
It actually looks pretty straightforward to translate the C code to R code that pays attention to the classes.  If that doesn't slow it down too much, I'll make that change.
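
As an illustration of what a class-aware check could look like at the R level, here is a rough sketch. It is not the actual change made to R: complete_cases_sketch is a hypothetical helper, and it only handles data frames whose columns are plain vectors (no matrix or nested data frame columns).

```r
# Hypothetical helper, not part of R: a class-aware completeness check.
complete_cases_sketch <- function(df) {
  # is.na() is generic, so each column is tested by the method for its
  # class (for example the POSIXlt method for a POSIXlt column).
  na_by_col <- lapply(df, function(col) is.na(col))
  # A row is complete when no column reports NA for it.
  !Reduce(`|`, na_by_col, rep(FALSE, nrow(df)))
}

# Example with a POSIXlt column and one missing Treat value.
df2 <- data.frame(Treat = c(1L, NA, 1L))
df2$Time <- as.POSIXlt(rep("2015-03-28", 3), tz = "UTC")
res <- complete_cases_sketch(df2)
res
```

Relying on is.na() dispatch keeps the logic short, at the cost of the speed concern mentioned above, since each column goes through method dispatch rather than a single C loop.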