Bug 16921 - Rscript does not load the methods package by default
Summary: Rscript does not load the methods package by default
Status: UNCONFIRMED
Alias: None
Product: R
Classification: Unclassified
Component: Startup
Version: R 3.3.*
Hardware: All All
Importance: P5 enhancement
Assignee: R-core
 
Reported: 2016-05-23 16:21 UTC by Jim Hester
Modified: 2016-05-25 16:55 UTC

Attachments
Patch to include the methods package in Rscript defaults (1.75 KB, patch)
2016-05-23 16:21 UTC, Jim Hester

Description Jim Hester 2016-05-23 16:21:15 UTC
Created attachment 2093
Patch to include the methods package in Rscript defaults

This is a proposal to make Rscript load the methods package by default. Code that runs in an interactive session, or when calling R directly, frequently fails under Rscript, and it is not obvious to users why.

In addition, the performance impact is now largely attenuated due to lazy loading of package functions in recent R versions.

    > for i in {1..5}; do time R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e 'invisible()';done
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.05s system 95% cpu 0.137 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.05s system 95% cpu 0.132 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.05s system 95% cpu 0.132 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.08s user 0.04s system 95% cpu 0.128 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats' Rscript -e   0.09s user 0.05s system 94% cpu 0.139 total

    > for i in {1..5}; do time R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript -e 'invisible()';done
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.14s user 0.05s system 96% cpu 0.195 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.14s user 0.05s system 95% cpu 0.193 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.13s user 0.05s system 96% cpu 0.187 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.13s user 0.05s system 96% cpu 0.191 total
    R_DEFAULT_PACKAGES='datasets,utils,grDevices,graphics,stats,methods' Rscript   0.13s user 0.05s system 96% cpu 0.191 total

The attached patch changes the default to 'datasets,utils,grDevices,graphics,stats,methods' and updates the documentation in `?Rscript` and in R-admin. If changes are necessary in other places, I am happy to provide updated patches.
Comment 1 Dirk Eddelbuettel 2016-05-23 20:27:55 UTC
That is (was?) a conscious design decision, at least when Rscript was released.

[ Disclaimer: I am not unbiased here as littler came out a little earlier, always loaded the methods package, and was (and still is) quicker to start. ]
Comment 2 Michael Lawrence 2016-05-24 17:44:03 UTC
Thanks for raising this issue. Rscript was designed for running short snippets of R code in the context of programs like R CMD check. It sounds like you've encountered use cases where more complex programs are invoked via the command line. Would you please provide some examples of such cases? Thanks in advance.
Comment 3 Dirk Eddelbuettel 2016-05-24 17:50:48 UTC
Easy:  _Any_ job you may want to run from, say, cron that involves S4.  

Works interactively in R (as methods is loaded), bombs immediately via Rscript.  I never understood how that is supposed to make sense, and I don't even program with S4.
Comment 4 Michael Lawrence 2016-05-24 18:19:20 UTC
Yes, that is pretty obvious, but a lot of times scripts use S4 through packages that they are already attaching. Only things like as() might cause a problem. Personally, I think consistency should win here, but others have a different opinion.
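
A minimal sketch of the failure mode being discussed, assuming Rscript is on the PATH. The package lists are the ones from the benchmark in the description; the variable names and the one-line expression are hypothetical and chosen only for illustration. The first call is expected to fail because as() lives in methods, which is not attached; the second is expected to succeed.

```shell
# Package list from the benchmark in the description (no "methods").
NO_METHODS='datasets,utils,grDevices,graphics,stats'
WITH_METHODS="$NO_METHODS,methods"

# Hypothetical expression that relies on as(), which is exported by "methods".
EXPR='print(as(1L, "character"))'

if command -v Rscript >/dev/null 2>&1; then
  # methods is not attached here, so as() should not be found.
  R_DEFAULT_PACKAGES="$NO_METHODS" Rscript -e "$EXPR" || echo "failed without methods"
  # With methods in the default list, the same expression should run.
  R_DEFAULT_PACKAGES="$WITH_METHODS" Rscript -e "$EXPR"
else
  echo "Rscript not found; skipping"
fi
```

Pinning R_DEFAULT_PACKAGES in both calls keeps the comparison independent of whatever default a given R installation ships with.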
Comment 5 Jim Hester 2016-05-24 18:49:30 UTC
I have run into this a number of times personally. Two recent cases were a user seeing different coverage results locally than on Travis (https://github.com/jimhester/covr/issues/180) and coverage of the testthat package being lower than expected (https://github.com/hadley/testthat/pull/475).

A simple GitHub search (https://github.com/search?utf8=%E2%9C%93&q=Rscript+methods+is%3Aissue) turns up more examples of people running into similar issues:
 - https://github.com/tudo-r/BatchJobs/issues/27
 - https://github.com/mllg/batchtools/issues/15
 - https://github.com/hadley/readr/issues/347
 - https://github.com/andrie/version.compare/issues/2
 - https://github.com/hadley/dplyr/issues/1760
 - https://github.com/csgillespie/efficientR/issues/21
 - https://github.com/Bioconductor/BiocParallel/issues/44
 - https://github.com/COMBINE-lab/wasabi/issues/1
 - https://github.com/dhimmel/elevcan/issues/1

I am not familiar enough with S4 dispatch to know the exact cases when this poses a problem, but anecdotally it seems to cause problems anytime you call a package using S4 from Rscript as Dirk mentioned.

My feeling is more time has been wasted by users trying to debug the inconsistency than has been gained by the <60ms difference in startup time (from my benchmarks). This may have been a more reasonable trade-off when package loading was slower, but the slowdown with modern R seems slight to me.
Comment 6 Dirk Eddelbuettel 2016-05-24 19:23:37 UTC
I have also explained this maybe half a dozen times on StackOverflow.

The unintuitive nature of this is, as Jim argued, indeed less than helpful.  I would welcome a reversal which makes Rscript closer to standard R.
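
For reference, a sketch of ways a script can opt in to methods without any change to Rscript itself. It assumes Rscript is on the PATH; the script path `./job.R` and the variable names are hypothetical. It is not a record of what was actually posted on StackOverflow.

```shell
# Hypothetical script path, used only for illustration.
SCRIPT=./job.R
# Default package list proposed in the description.
DEFAULTS='datasets,utils,grDevices,graphics,stats,methods'

if command -v Rscript >/dev/null 2>&1 && [ -f "$SCRIPT" ]; then
  # Option 1: environment variable, as in the benchmark in the description.
  R_DEFAULT_PACKAGES="$DEFAULTS" Rscript "$SCRIPT"
  # Option 2: Rscript's --default-packages option.
  Rscript --default-packages="$DEFAULTS" "$SCRIPT"
fi
# Option 3: put library(methods) at the top of the script itself,
# which works no matter how the script is launched.
```

Option 3 travels with the script, so it also covers cron jobs and CI runners where the caller does not control the Rscript command line.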
Comment 7 Michael Lawrence 2016-05-24 19:29:48 UTC
Thanks for these examples. While some of them have helpfully exposed bugs in the methods package or other packages, many of them do argue for consistency.
Comment 8 Benjamin Tyner 2016-05-25 16:55:22 UTC
I agree that interactive and non-interactive modes ought to use the same defaultPackages.

I suppose removing methods from the list of default packages in interactive mode would be too disruptive, so that implies the lesser evil is to add it for non-interactive mode.

On the other hand, the fact that some bugs would have gone undetected, had this already been the default, is something to consider.