Bug 16404 - Adding incomparables to duplicated.data.frame
Summary: Adding incomparables to duplicated.data.frame
Alias: None
Product: R
Classification: Unclassified
Component: Wishlist (show other bugs)
Version: R 3.2.0
Hardware: Other Other
: P5 enhancement
Assignee: R-core
Depends on:
Reported: 2015-06-01 02:00 UTC by Steven Mortimer
Modified: 2015-06-22 19:16 UTC (History)
3 users (show)

See Also:

Update to duplicated.data.frame (6.99 KB, text/plain)
2015-06-01 02:00 UTC, Steven Mortimer
Documentation update (duplicated.Rd) (7.23 KB, text/x-matlab)
2015-06-01 03:15 UTC, Steven Mortimer
test cases for using incomparables with duplicated.data.frame (2.02 KB, text/plain)
2015-06-01 03:18 UTC, Steven Mortimer
Proposed patch to duplicated.data.frame with documentation (4.80 KB, patch)
2015-06-01 23:07 UTC, Steven Mortimer
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Steven Mortimer 2015-06-01 02:00:23 UTC
Created attachment 1835 [details]
Update to duplicated.data.frame

The duplicated method for data.frames was not yet configured to handle incomparables values as an input parameter. This pull request is for a code change that would enhance the function to accept a list of values. I've attached the code which I believe would fix the issue and I have test cases and performance timings if needed to help ensure the code is working properly. Please contact me with any questions, comments or concerns.

Thank you for your time and consideration,
Steven Mortimer
Comment 1 Steven Mortimer 2015-06-01 03:15:28 UTC
Created attachment 1836 [details]
Documentation update (duplicated.Rd)
Comment 2 Steven Mortimer 2015-06-01 03:18:19 UTC
Created attachment 1837 [details]
test cases for using incomparables with duplicated.data.frame
Comment 3 Steven Mortimer 2015-06-01 16:30:46 UTC
In error I submitted a pull request on a read-only copy of the source code, but I'll provide links to those commits if it's easier to see the diff I'm proposing.


Comment 4 Duncan Murdoch 2015-06-01 16:41:14 UTC
R isn't hosted on Github; that repository is maintained by someone else.  If you would like to submit a patch, please check out the trunk sources from svn.r-project.org, use "svn diff" to calculate the patch, and attach it here.
Comment 5 Martin Maechler 2015-06-01 17:14:59 UTC
I've quickly looked at your proposal --- an enhancement for the data.frame method of duplicated().

One thing: you propose to change 'incomparables' from a vector to a list.
  That is not okay for compatibility and consistency reasons.

What you could do is to allow it to be *either* a vector {==> compatible with current behavior}  *or* a list {==> new behavior}.

This will have a 2nd advantage that would make such a change more acceptable:
the new R code could work a bit along the outline

if(is.list(incomparables)) {
   .... new stuff ...
} else {
   ... __unchanged__ current stuff ...
Comment 6 Steven Mortimer 2015-06-01 23:07:49 UTC
Created attachment 1838 [details]
Proposed patch to duplicated.data.frame with documentation

This is a proposed patch with documentation from the base directory where R/duplicated.R and man/duplicated.Rd is located. The patch is intended to enhance the duplicated.data.frame function by allowing it to accept a list of incomparables
Comment 7 Steven Mortimer 2015-06-01 23:15:46 UTC
Thank you Martin for your suggestion. It was very good guidance. 

I did add a message to the unchanged code portion in case the user doesn't use a list, but meant to use the new functionality. I hope that the message is okay to include in the unchanged code portion, but if it is not okay, then we can remove it and leave the original code completely unchanged.

The new structure is 
if(is.list(incomparables)) {
   .... new stuff ...
} else {
   ... __unchanged__ current stuff ... + plus message hint to use a list

Comment 8 Steven Mortimer 2015-06-22 19:16:15 UTC
Please let me know if there is any additional work required on my part. I haven't heard a response since the calculated patch was submitted (https://bugs.r-project.org/bugzilla3/attachment.cgi?id=1838). 

Thank you,