Bug 14073 - Surprising length() of POSIXlt vector
Status: RESOLVED FIXED
Product: R
Classification: Unclassified
Component: Wishlist
Version: old
Hardware: All Linux
Importance: P5 normal
Assigned To: Jitterbug compatibility account
Reported: 2009-11-20 05:20 UTC by Jitterbug compatibility account
Modified: 2010-03-22 08:41 UTC (History)

Description Jitterbug compatibility account 2009-11-20 05:20:58 UTC
From: Mark White <mark@celos.net>
Arrays of POSIXlt dates always return a length of 9.  This
is correct (they're really lists of vectors of seconds,
hours, and so forth), but other methods disguise them as
flat vectors, giving superficially surprising behaviour:

  strings <- paste('2009-1-', 1:31, sep='')
  dates <- strptime(strings, format="%Y-%m-%d")

  print(dates)
  #  [1] "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" "2009-01-05"
  #  [6] "2009-01-06" "2009-01-07" "2009-01-08" "2009-01-09" "2009-01-10"
  # [11] "2009-01-11" "2009-01-12" "2009-01-13" "2009-01-14" "2009-01-15"
  # [16] "2009-01-16" "2009-01-17" "2009-01-18" "2009-01-19" "2009-01-20"
  # [21] "2009-01-21" "2009-01-22" "2009-01-23" "2009-01-24" "2009-01-25"
  # [26] "2009-01-26" "2009-01-27" "2009-01-28" "2009-01-29" "2009-01-30"
  # [31] "2009-01-31"

  print(length(dates))
  # [1] 9
  
  str(dates)
  # POSIXlt[1:9], format: "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" ...

  print(dates[20])
  # [1] "2009-01-20"

  print(length(dates[20]))
  # [1] 9

I've since realised that POSIXct makes date vectors easier,
but could we also have something like:

  length.POSIXlt <- function(x) { length(x$sec) }

in datetime.R, to avoid breaking functions (like the
str.POSIXt method) which use length() in this way?
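
A minimal sketch of the idea, using a hypothetical toy class `dt_toy` rather than POSIXlt itself, so that it behaves the same on any R version. The class, its constructor and its "[" method are assumptions made for illustration; only the shape of the length method follows the suggestion above.

```r
# Hypothetical toy class: a list of parallel vectors, like POSIXlt in miniature.
new_dt_toy <- function(hour, min) {
  structure(list(hour = hour, min = min), class = "dt_toy")
}

# Vector-like subsetting: apply the index to every component.
"[.dt_toy" <- function(x, i) {
  structure(lapply(unclass(x), function(comp) comp[i]), class = "dt_toy")
}

x <- new_dt_toy(hour = c(9, 12, 18), min = c(0, 30, 45))

# No length method yet: length() counts the list components.
n_before     <- length(x)      # 2
n_sub_before <- length(x[2])   # still 2, although x[2] holds one element

# Method modelled on the one suggested in the report.
length.dt_toy <- function(x) length(unclass(x)$hour)

n_after     <- length(x)       # 3
n_sub_after <- length(x[2])    # 1
```

With the method in place, length() and "[" agree on what an element is; without it, they describe two different views of the same object.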

Thanks,
Mark <><

------

Version:
 platform = i686-pc-linux-gnu
 arch = i686
 os = linux-gnu
 system = i686, linux-gnu
 status = 
 major = 2
 minor = 10.0
 year = 2009
 month = 10
 day = 26
 svn rev = 50208
 language = R
 version.string = R version 2.10.0 (2009-10-26)

Locale:
C

Search Path:
 .GlobalEnv, package:stats, package:graphics, package:grDevices, package:utils, package:datasets, package:methods, Autoloads, package:base

Comment 1 Jitterbug compatibility account 2009-11-20 14:54:34 UTC
From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
mark@celos.net wrote:
> I've since realised that POSIXct makes date vectors easier,
> but could we also have something like:
> 
>   length.POSIXlt <- function(x) { length(x$sec) }
> 
> in datetime.R, to avoid breaking functions (like the
> str.POSIXt method) which use length() in this way?


[You need "wishlist" in the title for this sort of stuff.]

I'd be wary of this. Just the other day we found that identical() broke 
on some objects because a package had length() redefined as a class 
method. I.e. the danger is that something wants to use length() with its 
original low-level interpretation.


-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907

Comment 2 Jitterbug compatibility account 2009-11-20 16:53:43 UTC
From: Mark White <mark@celos.net>
Benilton Carvalho writes:
> I'm no expert on this, but my understanding is that the choice was
> to stick to the definition.
> 
> The help file for length() [1] says:
> 
> "For vectors (including lists) and factors the length is the number
> of elements."
> 
> The help file for POSIXlt [2] (for example) says:
> 
> "Class 'POSIXlt' is a named list of vectors representing (...)"
> 
> and then lists the 9 elements (sec / min / hour / mday / mon / year
> / wday / yday / isdst).
> 
> So, by [1] length of POSIXlt objects is 9, because it "is a named
> list of vectors representing (...)".

Thanks, all.  Yes, I'd already read both, and it's obviously
true that a length() of 9 is correct (as I said up-front).

The difficulty is that some functions -- importantly
including "[" -- already have methods which make POSIXlt
behave like a vector.  The documentation for POSIXlt just
says it's a list of 9 elements: it mentions methods for
addition etc, but AFAICT it doesn't say that subsetting won't
behave as "["'s help says for a list-like object.

In the end, "[" sees a different length to "[[" and "$"
here, so a length.POSIXlt() just shuffles the issue around.
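
To make the mismatch concrete, here is a small sketch built on the example from the original report. The `tz = "UTC"` argument is an addition, assumed here only to keep the result independent of the local time zone.

```r
strings <- paste("2009-1-", 1:31, sep = "")
dates <- strptime(strings, format = "%Y-%m-%d", tz = "UTC")

# "[" treats the object as a vector of 31 date-times.
one <- dates[20]
format(one)          # "2009-01-20"

# "$" reaches into the underlying list of components instead.
mday <- dates$mday
length(mday)         # 31
mday[20]             # 20
```

So `dates[20]` selects the twentieth date-time, while `dates$mday` returns a whole component vector that has to be indexed separately.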

Anyhow, I somehow missed that there have been other PRs on this,
including discussion on r-devel of "[" and logical vs physical
length() under PR#10507.  I'm sorry for being repetitive.

Mark <><

Comment 3 Jitterbug compatibility account 2009-11-21 00:03:20 UTC
From: Martin Maechler <maechler@stat.math.ethz.ch>
>>>>> "PD" == Peter Dalgaard <p.dalgaard@biostat.ku.dk>
>>>>>     on Fri, 20 Nov 2009 09:54:34 +0100 writes:

    PD> [You need "wishlist" in the title for this sort of stuff.]

    PD> I'd be wary of this. Just the other day we found that identical() broke 
    PD> on some objects because a package had length() redefined as a class 
    PD> method. I.e. the danger is that something wants to use length() with its 
    PD> original low-level interpretation.

Yes, of course, and Romain mentioned str().  Note that we have
needed to define a "POSIXt" method for str(), partly just
*because* of the current anomaly.
As Tony Plate, e.g., has argued, entirely correctly in my view,
the anomaly is that length() and "[" are not compatible;
and while I think no R language definition says that they should
be, I still believe that you need very good reasons for them to
be incompatible, as they are for POSIXlt.

In the current case, for me the only good reason is backwards
compatibility.
My personal taste would be to change it and see what happens.
I would be willing to clean up after that change within R 'base'
and all packages I am coauthoring (quite a few), but of course
there are still a thousand more R packages.
My strong bet would be that less than 1% would be affected,
and my point guess for the percentage affected would be
rather on the order of 1/1000.

The question is if we (you too!), the R community, are willing to
bear the load of cleanup, after such a change which would really
*improve* consistency of that small corner of R.
For me, as I indicated above, I am willing to bear my share
(and actually have got it ready for R-devel)

Martin Maechler, ETH Zurich (and R Core Team)

Comment 4 Jitterbug compatibility account 2009-11-21 14:32:44 UTC
Audit (from Jitterbug):
Sat Nov 21 08:32:44 2009	ripley	moved from incoming to wishlist
Comment 5 Jitterbug compatibility account 2009-11-22 22:21:33 UTC
From: Tony Plate <tplate@acm.org>
maechler@stat.math.ethz.ch wrote:
> The question is if we (you too!), the R community, are willing to
> bear the load of cleanup, after such a change which would really
> *improve* consistency of that small corner of R.
> For me, as I indicated above, I am willing to bear my share
> (and actually have got it ready for R-devel)
>   
Would be great to see this change!  Surely the right way to do things is 
that functions that wish to examine the low level structure of S3 
objects should use unclass() before looking at length and elements, so 
there's no reason for a class such as POSIXlt to not provide a 
logical-level length method.
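
As a rough illustration of that convention, the sketch below compares the logical length with the low-level one obtained after unclass(). It assumes an R version that already has a length method for POSIXlt (2.11.0 or later, per this report); the `tz = "UTC"` argument is added only to make the example independent of the local time zone.

```r
dates <- strptime(paste("2009-1-", 1:31, sep = ""),
                  format = "%Y-%m-%d", tz = "UTC")

# Logical length: the number of date-times (assumes R >= 2.11.0).
logical_len <- length(dates)

# Low-level length: the number of list components. This was 9 at the
# time of the thread; later R versions may carry extra components.
physical_len <- length(unclass(dates))
component_names <- names(unclass(dates))
```

Code that really needs the list structure can work on `unclass(dates)` and is then unaffected by whatever length method the class provides.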

At a broader level, when I've designed vector/array classes, I've 
wondered what methods I should define, but have been unable to find any 
specification of a set of methods.  When one thinks about it, there is 
actually quite a set of strongly-connected methods with quite a lot of 
behaviors to implement, e.g., length, '[' (with logical, numeric & 
character indices, including 0 and NA possibilities), '[[', 'c', and 
then optionally 'names', and then for multi-dim objects, 'dim', 
'dimnames', etc.  Consequently, last time this discussion on length and 
'[' methods for POSIXlt came up, I wrote a function that automatically 
tests the behavior of all these methods on a specified class and summarizes 
the behavior.  If anyone is interested in such a thing, I'd be happy to 
dig it up and distribute it (I'd attach it to this message, but I'm on 
vacation and don't have access to the computer that I think it's on).
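
The function mentioned here is not part of this thread, so the following is only a hypothetical sketch of what a much smaller check along the same lines might look like. The helper name `check_vector_like` and the choice of checks are assumptions; it covers only length, '[' with a few index types, and 'c'.

```r
# Hypothetical helper (not the function described above): checks that
# length() agrees with "[" and c() for a vector-like object x.
check_vector_like <- function(x) {
  n <- length(x)
  k <- min(n, 2L)
  c(
    positive_index = length(x[seq_len(k)]) == k,
    zero_index     = length(x[0L]) == 0L,
    negative_index = n == 0L || length(x[-1L]) == n - 1L,
    logical_index  = length(x[rep(TRUE, n)]) == n,
    concatenation  = length(c(x, x)) == 2L * n
  )
}

# tz = "UTC" is assumed here only to avoid local time zone effects.
dates <- strptime(paste("2009-1-", 1:31, sep = ""),
                  format = "%Y-%m-%d", tz = "UTC")

results <- list(
  integer = check_vector_like(1:5),
  factor  = check_vector_like(factor(c("a", "b", "a"))),
  POSIXlt = check_vector_like(dates)
)
```

Each entry of `results` is a named logical vector, so a FALSE points at the specific method combination that disagrees with length() for that class.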

-- Tony Plate


Comment 6 Jitterbug compatibility account 2009-11-30 19:07:10 UTC
From: Martin Maechler <maechler@stat.math.ethz.ch>
>>>>> Tony Plate <tplate@acm.org>
>>>>>     on Sun, 22 Nov 2009 10:21:33 -0600 writes:

    > Would be great to see this change!  Surely the right way to do things is 
    > that functions that wish to examine the low level structure of S3 
    > objects should use unclass() before looking at length and elements, so 
    > there's no reason for a class such as POSIXlt to not provide a 
    > logical-level length method.

I have now committed such a change to R-devel (only!), revision 50616.
Thank you and Gabor and others for supporting this.

As said here earlier in this thread:  We must be ready to see
that this change can break other code that implicitly assumed
the "old" behavior, i.e., that of R versions before R-devel (2.11.x).

As I also said earlier, I'm prepared to help package authors to
fix their code accordingly,
but I'd be grateful to be notified *if* problems surface from
this.

Martin Maechler, ETH Zurich



Comment 7 Jitterbug compatibility account 2009-11-30 19:41:41 UTC
From: Benilton Carvalho <bcarvalh@jhsph.edu>
Thank you Martin, for putting this together. Cheers, b

Comment 8 Brian Ripley 2010-03-22 08:41:13 UTC
changed by MM for 2.11.0 (but there were many cons, too)