Bug 17195 - pmin/pmax issue with ordered factors
Summary: pmin/pmax issue with ordered factors
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Misc (show other bugs)
Version: R-devel (trunk)
Hardware: All All
: P5 normal
Assignee: R-core
URL: https://stat.ethz.ch/pipermail/r-deve...
Depends on:
Blocks:
 
Reported: 2016-12-17 05:20 UTC by Suharto Anggono
Modified: 2017-01-09 16:26 UTC (History)
1 user (show)

See Also:


Attachments
patch: use internal method for "ordered" (1016 bytes, patch)
2016-12-17 05:20 UTC, Suharto Anggono
Details | Diff
'as.vector' on 'is.na' and comparison results (3.09 KB, patch)
2016-12-24 15:21 UTC, Suharto Anggono
Details | Diff
'as.vector' on logical-like; not put attributes on classed object (3.28 KB, patch)
2016-12-28 15:22 UTC, Suharto Anggono
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Suharto Anggono 2016-12-17 05:20:09 UTC
Created attachment 2196 [details]
patch: use internal method for "ordered"

Quoting https://stat.ethz.ch/pipermail/r-devel/2016-November/073433.html :

---------------
     Dear all,

pmin/pmax used to work with ordered factors but fail now since this 
summer when those functions have tried to handle more cases.

A simple way to trigger the issue is:

 > min(ordered(c(1,5,6)))
[1] 1
Levels: 1 < 5 < 6
 > pmin(ordered(c(1,5,6)), ordered(c(1,5,6)))
Error in `mostattributes<-`(`*tmp*`, value = attributes(elts[[1L]])) :
   adding class "factor" to an invalid object

A simple fix is to explicitly test for the ordered class and use the 
internal method in that case as proposed in the attached patch.

     Yours,

         Erwan

-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.diff
Type: text/x-patch
Size: 1016 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-devel/attachments/20161130/71dc22b6/attachment.bin>
---------------


Relevant NEWS entry for R 3.3.2:
pmax() and pmin() now work with (more ?) classed objects, such as "Matrix" from
the Matrix package, as documented for a long time.

Looking at the code of function 'pmax' in R 3.3.2, 'as.vector' is the culprit. Further, because of that, comparison method of class "ordered" is not called.
Comment 1 Suharto Anggono 2016-12-24 15:21:41 UTC
Created attachment 2198 [details]
'as.vector' on 'is.na' and comparison results

The original 'pmax' works correctly on data frames without NA. With this change, it no longer does.
Comment 2 Martin Maechler 2016-12-27 12:41:24 UTC
I'm confused (I think) about the two attached patches... From reading I guessed you that 2198 was to be used instead of 2196...
but really  2198 removes all the improvements committed in the summer which enabled   pmax() and pmin() to work correctly, e.g., with Matrix::Diagonal() matrices.

Your first (small) patch (#2196) does rather address the problem with ordered() objects mentioned by Erwan on R-devel (== the message you cited).

I now think you had wanted a different version of your 2nd patch.  
Can you comment?

-----------------

Note:  Ewan is right that the new pmin() / pmax()  do not work at all anymore with ordered factors .. and that is clearly a regression  (some may say it is not a bug, as the help page does not mention factors at all and talks about numeric or character vectors only ... but I'm not going there).

However, I now see that it did work somewhat "strangely" before, when 'ordered' was combined with e.g. simple numeric, and that did not contain levels of the factor:

> of <- ordered(c(1,5,6))
> pmin(ordered(c(1,5,6)), 3) # should give error, warning, or *different* result
[1] 1 5 6
Levels: 1 < 5 < 6
> pmin(ordered(c(1,5,6)), 5) # fine
[1] 1 5 5
Levels: 1 < 5 < 6
> 

The help file clearly says that the first argument determines the "properties" (my language, vague on purpose here) those of the result. ... and it is not so clear what should happen in the first case above... but I think that the behavior should rather change - at least eventually, in a separate "issue".
Comment 3 Martin Maechler 2016-12-27 12:59:29 UTC
(In reply to Suharto Anggono from comment #1)
> Created attachment 2198 [details]
> 'as.vector' on 'is.na' and comparison results
> 
> The original 'pmax' works correctly on data frames without NA. With this
> change, it no longer does.

Can you give a data frame example?   
In  R 3.3.1,  
 pmin(USJudgeRatings, 3)

does not work sensibly, but the d.f. has no NAs.
Comment 4 Suharto Anggono 2016-12-27 16:17:46 UTC
(In reply to Martin Maechler from comment #2)
> I'm confused (I think) about the two attached patches... From reading I
> guessed you that 2198 was to be used instead of 2196...
> but really  2198 removes all the improvements committed in the summer which
> enabled   pmax() and pmin() to work correctly, e.g., with Matrix::Diagonal()
> matrices.
> 
> Your first (small) patch (#2196) does rather address the problem with
> ordered() objects mentioned by Erwan on R-devel (== the message you cited).
> 
> I now think you had wanted a different version of your 2nd patch.  
> Can you comment?

2196 is really a patch attached to Erwan's message.

Yes, 'i4' part could be retained. But, with 2198 as is, Matrix::Diagonal matrices is fine.

Working of example in reg-S4.R :

R> # 'pmax' and 'pmin' are changed according to 2198
R> library(Matrix)
R> (D5. <- Diagonal(x = 5:1))
5 x 5 diagonal matrix of class "ddiMatrix"
     [,1] [,2] [,3] [,4] [,5]
[1,]    5    .    .    .    .
[2,]    .    4    .    .    .
[3,]    .    .    3    .    .
[4,]    .    .    .    2    .
[5,]    .    .    .    .    1
R> pmax(D5., -1)
5 x 5 diagonal matrix of class "ddiMatrix"
     [,1] [,2] [,3] [,4] [,5]
[1,]    5    .    .    .    .
[2,]    .    4    .    .    .
[3,]    .    .    3    .    .
[4,]    .    .    .    2    .
[5,]    .    .    .    .    1
R> pmin(D5., 7)
5 x 5 diagonal matrix of class "ddiMatrix"
     [,1] [,2] [,3] [,4] [,5]
[1,]    5    .    .    .    .
[2,]    .    4    .    .    .
[3,]    .    .    3    .    .
[4,]    .    .    .    2    .
[5,]    .    .    .    .    1
R> pmin(D5.+2, 3)
5 x 5 Matrix of class "dgeMatrix"
     [,1] [,2] [,3] [,4] [,5]
[1,]    7    2    2    2    2
[2,]    2    6    2    2    2
[3,]    2    2    5    2    2
[4,]    2    2    2    4    2
[5,]    2    2    2    2    3
R> pmin(1, D5.)
Note: method with signature ‘numeric#dMatrix’ chosen for function ‘>’,
 target signature ‘numeric#ddiMatrix’.
 "ANY#ddiMatrix" would also be valid
<sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient
 [1] 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 1
R> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: i386-w64-mingw32/i386 (32-bit)
Running under: Windows XP (build 2600) Service Pack 2

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Matrix_1.2-7.1

loaded via a namespace (and not attached):
[1] grid_3.3.2      lattice_0.20-34
Comment 5 Suharto Anggono 2016-12-27 16:29:24 UTC
(In reply to Martin Maechler from comment #3)
> (In reply to Suharto Anggono from comment #1)
> > Created attachment 2198 [details]
> > 'as.vector' on 'is.na' and comparison results
> > 
> > The original 'pmax' works correctly on data frames without NA. With this
> > change, it no longer does.
> 
> Can you give a data frame example?   
> In  R 3.3.1,  
>  pmin(USJudgeRatings, 3)
> 
> does not work sensibly, but the d.f. has no NAs.

Two arguments, both are data frames with the same number of rows and columns.
pmin(USJudgeRatings, 11 - USJudgeRatings)
Comment 6 Martin Maechler 2016-12-27 19:53:32 UTC
(In reply to Suharto Anggono from comment #4)
> (In reply to Martin Maechler from comment #2)
> > I'm confused (I think) about the two attached patches... From reading I
> > guessed you that 2198 was to be used instead of 2196...
> > but really  2198 removes all the improvements committed in the summer which
> > enabled   pmax() and pmin() to work correctly, e.g., with Matrix::Diagonal()
> > matrices.
> > 
> > Your first (small) patch (#2196) does rather address the problem with
> > ordered() objects mentioned by Erwan on R-devel (== the message you cited).
> > 
> > I now think you had wanted a different version of your 2nd patch.  
> > Can you comment?
> 
> 2196 is really a patch attached to Erwan's message.
> 
> Yes, 'i4' part could be retained. But, with 2198 as is, Matrix::Diagonal
> matrices is fine.
> 
> Working of example in reg-S4.R :

It only *looks* as if it is working:  It works *wrongly* for the case
   pmin(D.+2, 3)

i.e., the non-diagonal case: if you look,
the values should be <= 3,  but the [1,1] entry is 7 !!!
Comment 7 Suharto Anggono 2016-12-28 15:13:20 UTC
(In reply to Martin Maechler from comment #6)
> (In reply to Suharto Anggono from comment #4)
> > Yes, 'i4' part could be retained. But, with 2198 as is, Matrix::Diagonal
> > matrices is fine.
> > 
> > Working of example in reg-S4.R :
> 
> It only *looks* as if it is working:  It works *wrongly* for the case
>    pmin(D.+2, 3)
> 
> i.e., the non-diagonal case: if you look,
> the values should be <= 3,  but the [1,1] entry is 7 !!!

Ah, I was wrong. Sorry, I didn't realize it. For S4 object, putting original attributes brings the original object back.

I avoided special treatment for S4 object, if possible.
Comment 8 Suharto Anggono 2016-12-28 15:22:06 UTC
Created attachment 2205 [details]
'as.vector' on logical-like; not put attributes on classed object

This is revised 2198.
Comment 9 Martin Maechler 2016-12-30 15:59:20 UTC
A fix for this (and the "length 0" case) from your patches -- thank you!! --
has been committed, in svn rev 71860, to R-devel.  
The plan is to port it to R-patched next week (if no problems surface).
Comment 10 Suharto Anggono 2017-01-07 06:12:14 UTC
With r71886, this fails.
library(Matrix); pmax(Diagonal(x = c(5:2, NA)), -1)

If the example is intended to succeed, a rather blind, but consistent, approach to be used in function 'pmax' is applying the following function to is.na(mmm), is.na(each) and (mmm < each).
function(x) if(isS4(x)) methods::as(x, "logical") else x
Comment 11 Martin Maechler 2017-01-07 12:13:30 UTC
(In reply to Suharto Anggono from comment #10)
> With r71886, this fails.
> library(Matrix); pmax(Diagonal(x = c(5:2, NA)), -1)

Indeed, thank you for reporting this.   The above *does* work in R 3.3.2 and in  R-patched where the  if i4) { .. }  clauses are present.

So this is indeed a bug (in R-devel only) and needs to be addressed.

> If the example is intended to succeed, a rather blind, but consistent,
> approach to be used in function 'pmax' is applying the following function to
> is.na(mmm), is.na(each) and (mmm < each).
> function(x) if(isS4(x)) methods::as(x, "logical") else x

something like that, yes.   
We'd really like a version that is both elegant and efficient notably for the non-S4 cases.

I have liked your original approach of using as little S4 special casing as possible,  but it seems we need to move even a bit further away from that.
Comment 12 Martin Maechler 2017-01-09 16:26:11 UTC
(In reply to Martin Maechler from comment #11)
> (In reply to Suharto Anggono from comment #10)
> > With r71886, this fails.
> > library(Matrix); pmax(Diagonal(x = c(5:2, NA)), -1)
> 
> Indeed, thank you for reporting this.   The above *does* work in R 3.3.2 and
> in  R-patched where the  if i4) { .. }  clauses are present.
> 
> So this is indeed a bug (in R-devel only) and needs to be addressed.
> 
> > If the example is intended to succeed, a rather blind, but consistent,
> > approach to be used in function 'pmax' is applying the following function to
> > is.na(mmm), is.na(each) and (mmm < each).
> > function(x) if(isS4(x)) methods::as(x, "logical") else x
> 
> something like that, yes.   
> We'd really like a version that is both elegant and efficient notably for
> the non-S4 cases.
> 
> I have liked your original approach of using as little S4 special casing as
> possible,  but it seems we need to move even a bit further away from that.

committed svn r 71946 -- addressing the problem closely to what you proposed in comment 10.   ==> I'm again closing the bug report -- at least  "for now"    ;-)