Created attachment 2288 [details]
Diff to package utils implementing packageDate (plus NAMESPACE and Rd)

Attached is a small (tested) diff against current SVN which adds a function 'packageDate()' -- I find myself using 'packageVersion(somePkg)' a lot and sometimes wish we had 'packageDate()'. If you consider it to be too trivial I can of course stick it into a local helper package.

As an aside, CRAN lets the Date field be free-format, which sadly prevents us from doing (easy) date arithmetic. So here I just return the character string, and not a Date object (as we would for symmetry with packageVersion()).

Illustration using the wonderful CRAN_package_db() follows:

    R> db <- tools::CRAN_package_db()
    R> summary(as.Date(db[,"Date"]))
            Min.      1st Qu.       Median         Mean
       "4-12-20" "2014-10-08" "2016-04-16" "2014-07-17"
         3rd Qu.         Max.         NA's
    "2017-03-08" "2017-12-03"       "2698"
    R> summary(anytime::anydate(db[,"Date"]))
            Min.      1st Qu.       Median         Mean
    "2004-01-03" "2014-09-16" "2016-04-08" "2015-09-10"
         3rd Qu.         Max.         NA's
    "2017-03-06" "2017-12-03"       "2768"
    R>

Regards, Dirk
I take it there is no interest in this?
(In reply to Dirk Eddelbuettel from comment #1)
> I take it there is no interest in this?

There's some, from me. I think it would make sense _if_ we additionally returned a "Date" object (possibly NA), and if the function also got an option (i.e. an optional argument which, when flipped, takes the 'Built:' date if that's available).

Maybe some simple (R only, no C(++)) heuristics from anydate() could be used for the NN/NN/NN and NN/NN/NNNN dates?
I agree on Date being preferable, but see the analysis I included in the initial post: too few packages "do it right".

Now, over time CRAN could enforce this.

Heuristics are fine--what anytime and anydate do internally can be done in R as well. It "simply" tries a bunch of formats.

BTW Gabor Csardi has a package that ported the (much more powerful) Date parser from Linus himself (IIRC). I can look that up.

But how do we return "either a Date or a character" ?
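To make the "tries a bunch of formats" idea concrete, here is a minimal R-only sketch. It is not what anytime/anydate actually do internally, and neither the helper name tryDateFormats() nor the format list is taken from the attached diff; both are assumptions for illustration only.

```r
# Hypothetical helper, not part of the attached diff: try several formats in
# turn and return the first one that parses, else an NA of class "Date".
tryDateFormats <- function(x,
                           formats = c("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")) {
    x <- as.character(x)                # also copes with a logical NA input
    for (f in formats) {
        # With an explicit 'format', as.Date() yields NA instead of an error
        # when the string does not match.
        d <- as.Date(x, format = f)
        if (!is.na(d)) return(d)
    }
    as.Date(NA)
}

tryDateFormats("2017-12-03")
tryDateFormats("03.12.2017")
tryDateFormats("not a date")
```

Since the result is always of class "Date" (possibly NA), this would also sidestep the "either a Date or a character" question. The order of the formats matters: strptime() ignores trailing characters, so an early format can "succeed" on a string it was not meant for.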
(In reply to Dirk Eddelbuettel from comment #3)
> I agree on Date being preferable, but see the analysis I included in the
> initial post: too few packages "do it right".

I know... and the new function could entice *some* to do better.

> Now, over time CRAN could enforce this.
>
> Heuristics are fine--what anytime and anydate do internally can be done in R
> as well. It "simply" tries a bunch of formats.

Good. What I should have said in 'Comment 2' was that the 'Packaged: ' field should be looked at if the regular 'Date:' does not give a valid result; then maybe a 'Date/Publication: ' as CRAN adds, and only then a possible 'Built: ' field.

> BTW Gabor Csardi has a package that ported the (much more powerful) Date
> parser from Linus himself (IIRC). I can look that up.

and you'd propose that full parser to be added to R?

> But how do we return "either a Date or a character" ?

I don't understand, where did you cite the "............................" from?
Good comments, and good suggestion re alternate fields for fallback. My last comment was mostly because I did not understand how you suggested to not return a character fallback if no Date was found.

But now, with your suggestion, a quick test:

    packageDate <- function(pkg, lib.loc = NULL) {
        res <- suppressWarnings(packageDescription(pkg, lib.loc=lib.loc,
                                                   fields = "Date"))
        if (is.na(res))
            stop(gettextf("package %s not found", sQuote(pkg)), domain = NA)
        res <- as.Date(res)
        if (!is.na(res)) return(res)
        for (fld in c("Date/Publication", "Built", "Packaged")) {
            res <- suppressWarnings(packageDescription(pkg, lib.loc=lib.loc,
                                                       fields = fld))
            res <- as.Date(res)
            if (!is.na(res)) return(res)
        }
        res # default NA value
    }

Happy to update the formal diff with something like this.

The aforementioned package by Gabor is 'parsedate', which is on CRAN but also contains C code. It may be overkill here.
(In reply to Dirk Eddelbuettel from comment #5)

and then you, DE, entered a better version into a completely unrelated bug report (https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16496)

Here it is, modified in the order of fields to be tried, according to my own proposal above:

    packageDate <- function(pkg, lib.loc = NULL) {
        for (fld in c("Date", "Packaged", "Date/Publication", "Built")) {
            res <- suppressWarnings(packageDescription(pkg, lib.loc=lib.loc,
                                                       fields = fld))
            res <- as.Date(res)
            if (!is.na(res)) return(res)
        }
        res # default NA value
    }

This looks quite good now in my view. I don't like the use of suppressWarnings() and may want to add an argument to packageDescription() instead.
Yes, sorry. For me (Chrome, Linux) bugzilla jumps to a _different_ bug report of mine after I save an update.

I think the suppressWarnings() use was copied over from packageVersion. In r-devel right now:

    packageVersion <- function(pkg, lib.loc = NULL) {
        res <- suppressWarnings(packageDescription(pkg, lib.loc=lib.loc,
                                                   fields = "Version"))
        if (!is.na(res)) package_version(res)
        else stop(gettextf("package %s not found", sQuote(pkg)), domain = NA)
    }

I am fine either way and happy to update the suggested patch (including the help page mentioning that we now try multiple fields).
I have now committed a "first proposal" of a new packageDate() to R-devel (svn rev 73925). It starts by trying the "Date" field and *some* date formats, but both of these are arguments to the function and could still get different defaults before release.

After applying it to the result of installed.packages() for a huge library of (almost) "all" of CRAN and 100s more packages (mostly Bioconductor), I think it could become smarter: if the formats it tried for "Date" give a date with a year before 2000 (say, or 1950), it should drop it and try other formats or other fields. Notably "Packaged" would be quite reliable ... but has not always been present in my huge list of packages.

Let's do some experiments with this definition and get proposals for improvements, ideally before release next spring.

As "PR" I now close this... which does not preclude us adding feature requests etc here.
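For experimenting with the "drop implausibly early dates" idea, a rough sketch follows. It is not the committed code: the function name pickPlausibleDate(), its arguments, and the 2000-01-01 cutoff are assumptions, and it works on a plain named character vector of DESCRIPTION fields rather than on an installed package, so it can be tried without touching any library.

```r
# Hypothetical sketch, not the code committed in r73925: return the first
# plausible date found among several DESCRIPTION fields.
pickPlausibleDate <- function(desc,
                              fields = c("Date", "Packaged",
                                         "Date/Publication", "Built"),
                              formats = c("%Y-%m-%d", "%Y/%m/%d"),
                              earliest = as.Date("2000-01-01")) {
    for (fld in fields) {
        val <- unname(desc[fld])        # NA if the field is absent
        if (is.na(val)) next
        if (fld == "Built") {
            # 'Built' looks like "R x.y.z; <platform>; <timestamp>; <OS type>",
            # so the timestamp is the third ';'-separated part.
            val <- trimws(strsplit(val, ";", fixed = TRUE)[[1]][3])
        }
        for (f in formats) {
            d <- as.Date(val, format = f)   # NA when the format does not match
            # Dates before 'earliest' are treated as misparsed and skipped.
            if (!is.na(d) && d >= earliest) return(d)
        }
    }
    as.Date(NA)
}

desc <- c(Date = "4-12-20", Packaged = "2017-12-03 10:11:12 UTC; someone")
pickPlausibleDate(desc)
```

With these inputs the odd "Date" value is rejected, whether it fails to parse or parses to an early year, and the "Packaged" timestamp is used instead. Whether a fixed cutoff is the right rule, and what it should be, is exactly the kind of thing the proposed experiments would have to settle.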