Bug 16333 - Allow different way to specify vignette engine
Summary: Allow different way to specify vignette engine
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: Misc (show other bugs)
Version: R 3.2.0
Hardware: All All
: P5 normal
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2015-04-25 12:40 UTC by flying-sheep
Modified: 2015-04-26 12:35 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description flying-sheep 2015-04-25 12:40:40 UTC
Hi, I was puzzled on how to make a custom vignette engine work, until i was told that i can’t: https://github.com/flying-sheep/nbconvertR/issues/1#issuecomment-96026789

the problem is that the VignetteEngine regex is too restrictive. The IPython notebook format I want to create a vignette from is based on JSON.

there’s no way to get a single backslash into it (all backslashes are part of an escape sequence) but the regex requires a single backslash:

see .get_vignette_metadata in https://svn.r-project.org/R/tags/R-3-2-0/src/library/tools/R/Vignettes.R

therefore you either have to expand the regex to allow more than one backslashes (to fix my specific case) or think of something else.
Comment 1 Duncan Murdoch 2015-04-25 12:49:22 UTC
The problem is that the JSON format doesn't allow comments.  So why not simply have a second file that contains all the information that R needs in the format R is looking for, and have your driver start with that file?
Comment 2 flying-sheep 2015-04-25 13:46:44 UTC
the problem is that your way of specifying this is *very* restrictive.

it requires that the line appears, on its own(!).

that means either latex-style % comments or block comments like

<!--
% \VignetteEngine{knitr::knitr}
-->

it also requires that the whole line may appear like this in the format, with % sign, backslash, and two braces.

as you can see, knitr got lucky that markdown supports HTML block comments.

e.g. http://www.patoline.com/ doesn’t have those, and think about other languages that might have only line comments started with -- or # or sth.

your idea is OK, but it needs a silly extra file just for metadata.

how about changing the regex instead? maybe to

paste0("\\W*\\\\+Vignette", tag, "\\{([^}]*)\\}.*")

this would allow any non-word character in front of it, and any character after it. (thus allowing \n after it)
Comment 3 Duncan Murdoch 2015-04-25 14:19:00 UTC
Your suggestion introduces a new problem:  vignettes that work now will break, if they happen to have text matching your regex, e.g. in a description of how to put together a vignette.  

It would solve the problem that JSON doesn't allow the current pattern to be embedded, but it doesn't solve the general problem for other input formats, e.g. binary formats like .odf or .docx.  My suggestion is much more general, and has the additional benefit of not requiring any additional work by anyone:  JSON users will need to modify their documents in any case, so why not do it in a separate human readable metadata file?
Comment 4 flying-sheep 2015-04-25 16:11:07 UTC
because

1. notebooks already have a metadata section, so this is redundant
2. it’s an extra file that has to be manually excluded using .Rbuildignore
3. it will be opened when clicking on “source” in the vignette browser page

but i have a better idea than changing the regex:

currently tools:::vignetteInfo and tools:::getVignetteEngine are the only metadata-fetching functions, and they depend on those regexes.

vignette workflow is: load the VignetteBuilder package(s) specified in DESCRIPTION. they’ll register vignette engines. then loop all files and look if any registered engine would handle it. for each file, retrieve the engine it actually wants via above metadata functions.

let’s change this, and allow every engine to specify a metadata extraction function <meta>. if unspecified, this falls back to tools:::vignetteInfo.

which <meta> is used depends on the engines whose pattern matches the filename: the <meta> function of the matching engine will be used for creating the file and retrieving metadata.

uses of tools:::getVignetteEngine(...) will be removed and replaced with <meta>(...)$engine, and uses of tools:::vignetteInfo(...) with <meta>(...).

TL;DR: allow registered engines to specify their own metadata retrieving function, and retrieve metadata with the function from the engine whose pattern matches the file name.
Comment 5 Duncan Murdoch 2015-04-25 16:58:52 UTC
That sounds like a sensible suggestion.  As long as it's backwards compatible, I think it could even make it into 3.2.0-patched.  Do you want to submit a patch against R-devel to implement it?
Comment 6 flying-sheep 2015-04-26 12:35:55 UTC
i think i can do this. it will be backwards-compatible because it will only add an optional parameter and restructure internal functions without changing their signatures. any special formatting for the patch (e.g. git compatible)?

—————————————————————————————————————————————————————————————————————————————————

also we have to agree on what’s happening in the rare case that a file matching the pattern of multiple engines whose metadata functions return different stuff.

e.g. let’s say yihui falls on his head and in his daze chooses to supply a metadata function “metaknit” to the knitr::knitr engine that is incompatible with the default one: now vignetteInfo and metaknit ar both run on on a .Rnw file and return different info. which one do we choose?

this problem is mostly academic because it requires either a .Rnw vignette engine author to willingly be incompatible with sweave metadata, or a package author to load two VignetteBuilders that both accept the same files *and* have incompatible metadata functions.

i think either choosing the first and emitting a warning or providing a detailed error message and stopping would be appropriate.