Bug 16884 - codetools::findGlobals misses global variable
Status: CLOSED Works as documented
Alias: None
Product: R
Classification: Unclassified
Component: Add-ons
Version: R 3.2.4
Hardware: x86_64/x64/amd64 (64-bit) OS X Mavericks
Importance: P5 minor
Assignee: R-core
Reported: 2016-05-05 22:49 UTC by Dan Sullivan
Modified: 2017-05-19 03:38 UTC

Description Dan Sullivan 2016-05-05 22:49:21 UTC
Here is a case where the variable "x" is missed by findGlobals.

Steps to reproduce:

x <- 1
fn2 <- function() {
  x <- x + y
  x
}
codetools::findGlobals(fn2)

produces

[1] "{"  "+"  "<-" "y"
Comment 1 Yihui Xie 2017-05-17 21:26:19 UTC
I just want to add another (similar) case originally reported at https://github.com/yihui/knitr/issues/1403

> codetools::findGlobals(function(){x<-2*x})
[1] "{"  "*"  "<-"

I'm not sure if it is expected that `x` was not recognized as a global variable.
Comment 2 Tomas Kalibera 2017-05-18 07:18:38 UTC
Please keep in mind that codetools performs (simple) static analysis of R code. Such analysis can in principle only be approximate, and the documentation notes that:

"
The result is an approximation. R semantics only allow variables that might be local to be identified
"

This means that whenever the analysis thinks that a variable may be local, it treats it as local. In fact the variable may not be local, either because of inherent limitations of the analysis or because it is local at some locations in the function code but global at others. The latter is the case in your example: variable "x" is actually local in part of the function code.

If the analysis does not decide that a variable may be local, it treats it as global. So the analysis is written to err on the "local" side, so that when it says "global", the variable is very likely to be global. AFAIK this is used to produce warnings, so some errors are still acceptable (saying something is a global when in fact it cannot be).
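
The "err on the local side" behavior can be seen in a small experiment. This is only an illustrative sketch: the function names `reads_only` and `reads_and_assigns` are made up for this example, and it assumes the codetools package is installed.

```r
# Illustrative sketch; function names are hypothetical, not from the report.

# "x" is only read, never assigned, so nothing suggests it might be local.
reads_only <- function() x + y

# "x" is assigned somewhere in the body, so the analysis treats it as
# local everywhere, including the read on the right-hand side.
reads_and_assigns <- function() {
  x <- x + y
  x
}

# merge = FALSE separates global functions from global variables.
vars_read_only <- codetools::findGlobals(reads_only, merge = FALSE)$variables
vars_assigned  <- codetools::findGlobals(reads_and_assigns, merge = FALSE)$variables

"x" %in% vars_read_only   # TRUE: reported as a global variable
"x" %in% vars_assigned    # FALSE: treated as local
"y" %in% vars_assigned    # TRUE: never assigned, so still global
```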

In knitr you would need a different analysis. You want to be able to conclude, at least in some cases, that code chunk B following code chunk A does not depend on code chunk A. For correct behavior, it is not enough for the analysis to err on the "there are dependencies" side: it must always be right when it says that there are no dependencies. That would be extremely hard to guarantee with the full R language, and findGlobals is definitely the wrong analysis for this task.
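
One reason a "no dependency" answer is hard to guarantee in full R is that a variable name can be computed at run time. The sketch below uses a hypothetical function `lookup` and assumes codetools is installed; it shows a function that reads the global `x` without the symbol `x` ever appearing in its code.

```r
# Illustrative sketch; `lookup` is a hypothetical function.
x <- 1

# The variable to read is chosen at run time, so no static analysis of
# the function body can see which global is used.
lookup <- function(name) get(name)

lookup("x")                          # reads the global x at run time
found <- codetools::findGlobals(lookup)
"x" %in% found                       # FALSE: only `get` is visible statically
```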
Comment 3 Luke Tierney 2017-05-18 19:15:15 UTC
codetools is a recommended package but not a base package, so issues should be sent to the maintainer (me).

As Tomas points out, the analysis is only approximate. For historical reasons, findGlobals identifies only variables that are certain, within the limitations of the analysis, to be global. If there are ambiguities, or if a variable is used both as a local and as a global, then it is considered not global.

For your purpose you would want ambiguities and multiple uses resolved the other way: if it might be used as a global you want to know. I don't think it would be too hard to add an option or an alternate version of findGlobals to codetools that did this. If you want to give it a try and provide a patch I'm happy to take a look.
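
As a rough illustration of resolving ambiguities "the other way", the sketch below lists every variable symbol in a function body except the function's own arguments. The helper `maybe_globals` is hypothetical and not part of codetools. It uses base R's `all.vars` rather than the codetools walker, so it is a crude over-approximation: it also reports variables that are purely local, and it still cannot see names that are computed at run time.

```r
# Hypothetical helper, not part of codetools: report every variable that
# *might* be global, at the cost of also reporting purely local ones.
maybe_globals <- function(fn) {
  used <- all.vars(body(fn))          # all variable symbols in the body
  setdiff(used, names(formals(fn)))   # drop the function's own arguments
}

fn2 <- function() {
  x <- x + y
  x
}

maybe_globals(fn2)   # includes both "x" and "y"
```

A real implementation inside codetools would presumably track where each use occurs relative to local assignments, which this sketch does not attempt.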
Comment 4 Yihui Xie 2017-05-19 03:38:34 UTC
Thank you both for the advice! It makes perfect sense to me, and I'll follow it in the next version of knitr.