Bug 14720 - Wishlist: Include NA's in summary methods for date-time classes
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Wishlist
Version: R 2.14.0
Platform: All
OS/Version: All
Priority: P5
Severity: enhancement
Assigned To: R-core
Reported: 2011-11-04 01:23 UTC by Mike Toews
Modified: 2014-02-16 11:43 UTC (History)
Description Mike Toews 2011-11-04 01:23:47 UTC
Methods for summary (including summary.Date, summary.POSIXct and summary.POSIXlt) do not summarize the count of NA values. E.g.:

> some_dates <- as.Date(c("2011-02-04", NA, "2011-03-14"))
> summary(some_dates)
        Min.      1st Qu.       Median         Mean      3rd Qu.         Max. 
"2011-02-04" "2011-02-13" "2011-02-23" "2011-02-23" "2011-03-04" "2011-03-14"

However, it would be beneficial to have this information, as other summary methods provide NA counts:

> summary(as.numeric(some_dates))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  15010   15020   15030   15030   15040   15050       1
> summary(factor(some_dates))
2011-02-04 2011-03-14       NA's 
         1          1          1
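
Until the date-time methods report it, the count can be computed alongside the summary. A minimal sketch, assuming a hypothetical helper named `summary_with_na` (it is not part of R): it works on the underlying day counts, formats the six statistics as dates, and appends the NA count as a separate character entry so that the count is never coerced to a date.

```r
# Hypothetical helper (not part of base R): summarise a Date vector and
# append the number of missing values as its own, character-valued entry.
summary_with_na <- function(x) {
  stopifnot(inherits(x, "Date"))
  days <- unclass(x)                        # days since 1970-01-01
  n_na <- sum(is.na(days))                  # count of missing dates
  q <- quantile(days, names = FALSE, na.rm = TRUE)   # min, Q1, median, Q3, max
  stats <- c(q[1:3], mean(days, na.rm = TRUE), q[4:5])
  out <- format(as.Date(stats, origin = "1970-01-01"))
  names(out) <- c("Min.", "1st Qu.", "Median", "Mean", "3rd Qu.", "Max.")
  c(out, "NA's" = as.character(n_na))       # count kept as text, not as a Date
}

some_dates <- as.Date(c("2011-02-04", NA, "2011-03-14"))
res <- summary_with_na(some_dates)
print(res, quote = FALSE)
```

The same idea should carry over to POSIXct input by swapping the `as.Date()` conversion for `as.POSIXct()`, though that variant is not shown here.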
Comment 1 Peter Dalgaard 2011-11-04 08:57:30 UTC
Actually, I have been wanting them at times too. Whether times are frequently missing depends on what they are times of (e.g. death).

The real issue here is that summary() on numeric data is a bit numb-skulled in that it returns a vector of the same class as x, to be printed via format(). While it is only a minor annoyance to get "NA's: 4.000" for a numeric vector, displaying "NA's: 1970-01-05" would be rather confusing. So it takes a redesign of summary() and summary.data.frame(). And if we mess with that, we need to find out whether back-compatibility is a real issue.
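
The confusing display is easy to reproduce. A small sketch, assuming an NA count of 4 as in the example above (the variable names are illustrative only): giving the count the same class as the data makes it print as the fourth day after the epoch.

```r
n_na <- 4                                      # illustrative NA count
# Forcing the count into class "Date" reinterprets it as days since 1970-01-01.
count_as_date <- structure(n_na, class = "Date")
shown <- format(count_as_date)
shown                                          # "1970-01-05"
```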
Comment 2 Brian Ripley 2011-11-04 12:01:18 UTC
Please explain why it would be 'beneficial'.

After 10 years, no one else has requested them, and you can easily compute them for yourself. Times are rarely missing ....

Comment 3 Brian Ripley 2011-11-04 12:44:02 UTC
But no promises that it will be done soon.
Comment 4 Brian Ripley 2011-11-13 16:07:19 UTC
There is a version of this now in R-devel (and like numerics, it now prints the number of NAs as an integer).  But summary.data.frame was quite inconsistent with summary.default, so some further tweaking may be desirable before release.

PD commented about times re deaths: it is almost always date of death that is missing, not time, and I had chosen my words carefully.