Bug 16689 - Add debugCall and debugMethod to the base and methods packages, resp.
Summary: Add debugCall and debugMethod to the base and methods packages, resp.
Status: ASSIGNED
Alias: None
Product: R
Classification: Unclassified
Component: Misc (show other bugs)
Version: R-devel (trunk)
Hardware: All All
: P5 enhancement
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2016-01-31 02:39 UTC by Gabriel Becker
Modified: 2016-03-31 11:21 UTC (History)
4 users (show)

See Also:


Attachments
Code and documentation patch for added debugging functions (15.27 KB, patch)
2016-01-31 02:39 UTC, Gabriel Becker
Details | Diff
Reworked patch, extending debug/debugonce and pushing complexity into methods (19.00 KB, patch)
2016-02-10 18:53 UTC, Gabriel Becker
Details | Diff
Reworked patch v2 (20.83 KB, patch)
2016-03-30 17:28 UTC, Gabriel Becker
Details | Diff
test script to (interactively) test debug patch (1.32 KB, text/plain)
2016-03-30 17:30 UTC, Gabriel Becker
Details
Reworked patch v3 (22.08 KB, patch)
2016-03-30 20:30 UTC, Gabriel Becker
Details | Diff
updated test script (1.63 KB, text/plain)
2016-03-30 20:31 UTC, Gabriel Becker
Details
Reworked patch v4 (21.51 KB, patch)
2016-03-30 21:50 UTC, Gabriel Becker
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Gabriel Becker 2016-01-31 02:39:12 UTC
Created attachment 2013 [details]
Code and documentation patch for added debugging functions

This patch implements two new debugging functions (along with corresponding undebug versions) - debugCall and debugMethod.

debugMethod is added to the methods package and provides a convenient way to debug S4 methods by hiding the .local-based machinery required to get dispatch to work under the current paradigm. If the .local mechanism is in play, debugging occurs on that function (the "actual" method, as it were), otherwise it occurs on the method definition. In both cases, existing trace() based machinery is used to implement this. In the course of this, after discussions with Michael Lawrence, I also added an isRematched function to methods to detect when rematching was used, and refactored unRematchDefinition to use the new function. 

debugCall is added to base. It provides a way for users to debug whatever closure/method will be called by an actual call expression, including S3 methods, and S4 methods (including those implicit in primitives like c() ). I also added a helper function to detect whether a function is a "standard" S3 generic (i.e. the first expression in the body is a call to UseMethod() ).

The code is commented and documentation has been added/updated.
Comment 1 Martin Morgan 2016-01-31 13:29:34 UTC
Too bad that this introduced two new functions, rather than integrating with trace() (which has a signature argument, and useful flexibility) or debug()?
Comment 2 Michael Lawrence 2016-01-31 13:48:53 UTC
Agreed. The .local handling should be shared with trace. The debugMethod function should become debug (signature=).
Comment 3 Michael Lawrence 2016-01-31 14:19:59 UTC
Also, debugCall should return the call invisibly, to support eval (debugCall  ()). Drop the do.eval arg. Add example of that usage to docs.
Comment 4 Gabriel Becker 2016-01-31 18:25:50 UTC
(In reply to Michael Lawrence from comment #3)
> Also, debugCall should return the call invisibly, to support eval (debugCall
> ()). Drop the do.eval arg. Add example of that usage to docs.

I can do this if it's the consensus, but I would argue that computing on the language (i.e., explicit calls to eval) is a step beyond what we would want to require users to know and think about. If the once and do.eval argument order was switched, I think

    res <- debugCall(c(x,y), TRUE)

Would be much easier for them to remember.


(In reply to Martin Morgan from comment #1)
> Too bad that this introduced two new functions, rather than integrating with 
> trace() (which has a signature argument, and useful flexibility) or debug()?

I'll look at rolling .local handling into trace. It may be complicated by the fact that I actually used calls to the current version of trace to implement it. The issue is that the closure to be debugged doesn't actually exist until the method is called. I should be able to figure something out, though.

I could add call handling to debug(), though that would change the semantics of that function pretty radically in certain corner cases, i.e.:

 > f2 = function() print("what?")
 > debug(f())
 > f2()
 debugging in: f2()
 debug: print("what?")
 Browse[2]> 

vs

 > f = function() f2
 > f2 = function() print("what")
 > debugCall(f())
 > f2()
 [1] "what"
 > f()
 debugging in: f()
 debug: f2
 Browse[2]> 

I'm not sure we care too much about that case, but I know R's philosophy is to preserve backwards compatibility unless there's a really good reason not to, so I figured a separate function for debugCall was safer.

Thanks for considering the patch, and for all your work on R. I'll submit a revised patch soon.
Comment 5 Michael Lawrence 2016-02-01 01:51:05 UTC
I remain in favor of the eval(debugCall()) syntax. If clearly documented (including an example), it should be understood by any user who would understand the meaning of "do.eval". I'm not sure which syntax is easier to remember, but debugCall(c(x,y), TRUE) is certainly not very explicit about what is happening. Using eval() ensures that the syntax is explicit, and the user might learn something about eval(). Arguments have a cost. Why have one that just calls a function on (what could be) the result?

Whether the do.eval argument is dropped or not, the function should at least return the call. Much more useful than NULL.

Perhaps the name should be debugcall; do we need a capital letter there? It would be consistent with debugonce().

I agree that having debug() play the direct role of debugCall() is a bad idea. That's not what Martin was suggesting though. An alternative syntax would be: debug(call=). Then, no new functions are added, and the syntax is very similar. Could also have debugonce(call=). Just bringing it up for discussion.
Comment 6 Martin Maechler 2016-02-01 08:15:38 UTC
(In reply to Michael Lawrence from comment #5)
> I remain in favor of the eval(debugCall()) syntax. If clearly documented
> (including an example), it should be understood by any user who would
> understand the meaning of "do.eval". I'm not sure which syntax is easier to
> remember, but debugCall(c(x,y), TRUE) is certainly not very explicit about
> what is happening. Using eval() ensures that the syntax is explicit, and the
> user might learn something about eval(). Arguments have a cost. Why have one
> that just calls a function on (what could be) the result?
> 
> Whether the do.eval argument is dropped or not, the function should at least
> return the call. Much more useful than NULL.

I agree with that (and agree that it should return it via  `invisible(.)` )
> 
> Perhaps the name should be debugcall; do we need a capital letter there? It
> would be consistent with debugonce().

I agree with ML's preference.

> 
> I agree that having debug() play the direct role of debugCall() is a bad
> idea. That's not what Martin was suggesting though. An alternative syntax
> would be: debug(call=). Then, no new functions are added, and the syntax is
> very similar. Could also have debugonce(call=). Just bringing it up for
> discussion.

Good point.  However, I would have considered  'debugCall()' to be a  "once only" operation (for the current call in the current environment) in any case, wouldn't it ?
Comment 7 Michael Lawrence 2016-02-01 16:32:43 UTC
Agreed. The typical usage would be eval(debugcall()). A minor use would be just as a shortcut for debug(generic, signature=). 

Maybe we want an evald() function that does eval(debugcall()) and debugs only once?
Comment 8 Gabriel Becker 2016-02-02 14:50:37 UTC
(In reply to Michael Lawrence from comment #7)
> Agreed. The typical usage would be eval(debugcall()). A minor use would be
> just as a shortcut for debug(generic, signature=). 
> 
> Maybe we want an evald() function that does eval(debugcall()) and debugs
> only once?

I still feel pretty strongly that novice and intermediate users (those who arguably will benefit most from not having to manually track down which closure will be called themselves) should never be expected/have a need to invoke the evaluator directly. To us it's as simple as calling any other function, but none of us have been in that category of users for a very long time. 

I'm fine with debugcall.

> Good point.  However, I would have considered  'debugCall()' to be a  "once 
> only" operation (for the current call in the current environment) in any case, 
> wouldn't it ?

That is a good point. I can change the default of once to TRUE, I agree this is more intuitive in this case. Should the default also be to evaluate to further match this intuition? so you would only need to do

  debugcall(myfun(5, 4))

To actually step into the debugging without lasting debugging effects?
Comment 9 Michael Lawrence 2016-02-02 15:00:08 UTC
Well, you can either tell them to pass do.eval=TRUE or tell them to call eval(). If they don't understand evaluation, then it's just different-looking copy-and-paste magic to them. For the rest of us, calling eval() would seem preferable.

Having debugcall() evaluate the call deviates from the current behavior of the debug*() functions. evald()  would just be a short-cut for eval(debugcall()) that aligns nicely with evalq().
Comment 10 Gabriel Becker 2016-02-10 18:53:38 UTC
Created attachment 2023 [details]
Reworked patch, extending debug/debugonce and pushing complexity into methods

I've reworked this patch based on the discussions here and in person with ML. The new patch is as follows:

debug, debugonce, and undebug now take a signature argument. When specified, it uses trace to debug the specified S4 method (if fun was an S4 generic, including implicit primitive generics). The methods-based machinery for this is pushed "down" to the methods package, analogous to what was done to support the signature argument in trace, leaving only (relatively) simple, straightforward code in base.

trace with non-null signature now automatically masks the method rematching .local-based machinery, including when called via debug. In this case, the at argument is applied to the body of .local, not the rematched body of the "actual" method definition.

isdebugged now detects whether debug/debugonce was called with a signature (these do not use the debug bit, as that doesn't handle rematching) as a second check after checking for the debug bit. 

debugcall (renamed from debugCall) behaves largely as before, but now calls down to debug/debugonce, rather than debugMethod or trace directly. Also the do.eval argument has been removed, and debugcall always invisibly returns the call expression. undebugcall also now calls directly down to undebug after determining the closure/signature pair.

debugMethod has been removed from the patch, as the signature argument for debug/debugonce renders it obsolete.

I also reset the patch to the latest commit, though no changes in the files I have been working in have been made since the initial patch creation
Comment 11 Michael Lawrence 2016-03-28 17:25:32 UTC
Thanks for revising the patch. A few comments:

Why is it called .debugWithSignature() instead of .debugMethod()? Despite being more to the point, it would be more consistent with .isMethodDebuggged().

undebug() calls untrace(). That exposes an implementation detail. It should call a wrapper like .undebugMethod(). 

The check for .isMethodsDispatchOn() seems unnecessary, since you ultimately require the methods package to be loaded. Why not just check for that? The error message about methods not being loaded is a lot easier to understand how to fix compared to saying that methods dispatch is turned off.

IMPORTANT: debugcall() and undebugcall() should only use the formal arguments that are part of the generic signature.

Have you actually tested these functions without the methods package? I see at leaast one case where getGeneric() is called without the methods:: prefix.

The code and documentation needs some general cleanup but I can do that when integrating the patch.
Comment 12 Gabriel Becker 2016-03-30 17:28:31 UTC
Created attachment 2046 [details]
Reworked patch v2

I have implemented the changes suggested by ML. Particularly,

methods::.debugWithSignature now named methods::.debugMethod, and I created methods::.undebugMethod which base::undebug calls rather than untrace.

debugcall and undebugcall now correctly only use the defined signature, rather than the classes of all named arguments

Some changes have been made to ensure non methods-related portions of the code will work without the methods package loaded, and that all code will run without it attached.

Removed calls to .isMethodsDispatchOn, as redundant with requireNamespace calls. 

trace (and thus debug with non-null signature) no longer emits an erroneous warning of the form 

  Warning: Tracing only in the namespace; to untrace you will need:
  untrace("start", where = getNamespace("stats"))

when tracing or debugging S4 methods.
Comment 13 Gabriel Becker 2016-03-30 17:30:27 UTC
Created attachment 2047 [details]
test script to (interactively) test debug patch

This script covers all the various cases for the new debug functionality. It cannot be run in batch, as it debugs things, but can be walked through to ensure the functionality is working.
Comment 14 Michael Lawrence 2016-03-30 17:49:26 UTC
Thanks for your persistence and the improvements. It looks like undebug() will undebug the generic (if debugged) even if a signature is specified. Am I just reading that wrong? Sure, users are unlikely to debug a generic, but this behavior would be confusing.
Comment 15 Duncan Murdoch 2016-03-30 18:09:40 UTC
In S3, debugging a generic is a sensible thing to do, and it steps into the method.  So I think debugging a generic in S4 could happen fairly frequently.
Comment 16 Gabriel Becker 2016-03-30 20:30:51 UTC
Created attachment 2048 [details]
Reworked patch v3

Fix issue so that there is no incorrect crosstalk between debug(f) and debug(f, signature=) or between undebug(f) and undebug(f, signature=)

Changed an accidental ::: to ::, as intended.

Fixed some grammar and clarity problems in documentation. Added usage for methods::isRematched
Comment 17 Gabriel Becker 2016-03-30 20:31:53 UTC
Created attachment 2049 [details]
updated test script

Added test cases to check for incorrrect crosstalk between undebug(f) and undebug(f, signature=)
Comment 18 Gabriel Becker 2016-03-30 21:50:55 UTC
Created attachment 2050 [details]
Reworked patch v4

Remove errant use of '=' for assignment

rework conditional logic in .TraceWithMethods and undebug to reduce code duplication and increase clarity.
Comment 19 Michael Lawrence 2016-03-30 22:17:17 UTC
Looks good. Thanks for responding to my comments (online and offline). I will massage this a little and incorporate it tonight or early tomorrow.
Comment 20 Michael Lawrence 2016-03-31 10:59:09 UTC
Upon closer inspection, it looks like debugcall() and undebugcall() share about 25 lines of code (after I simplified a lot). I'm going to move debugcall(), undebugcall() and isS3stdGeneric() to utils. They fit better there, and we can hide things behind the namespace.
Comment 21 Michael Lawrence 2016-03-31 11:21:48 UTC
Moving to utils will also allow sharing code with .helpForCall(), which unsurprisingly has very similar logic. It handles many cases that this patch forgot, such as generics with "..." as the signature, handling missing arguments with the "missing" class, etc.