Bug 17185 - utils::vignette does not recognize vignette names correctly if one name is the prefix of another
Alias: None
Product: R
Classification: Unclassified
Component: Misc
Version: R 3.3.*
Hardware: All Windows 64-bit
Importance: P5 minor
Assignee: R-core
Reported: 2016-11-27 12:06 UTC by paul.buerkner@gmail.com
Modified: 2016-12-05 01:38 UTC
Description paul.buerkner@gmail.com 2016-11-27 12:06:02 UTC
I initially encountered this problem with the vignettes of the brms package, but Duncan Murdoch suspects it is a bug in utils::vignette.

One vignette (called brms.pdf in the inst/doc directory) is not correctly indexed as "brms", and is thus not accessible via vignette("brms"). Instead, it takes over the name of another of the package's vignettes, "brms_distreg". I suspect this is because 'brms' is a prefix of 'brms_distreg'.

This only happens when calling the 'vignette' function, not when navigating to the vignettes via 'help.start()'. Also, it does not happen when the package is built locally, only when it is retrieved from CRAN.
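
To check whether a set of vignette names is exposed to this kind of clash, something like the following rough sketch may help. The helper name find_prefix_clashes is hypothetical (it is not part of utils or tools), and the topic names are taken from the vignette listing of brms 1.2.0 in this report.

```r
# Hypothetical helper (not part of utils or tools): for each name, list the
# other names that start with it, i.e. names it is a strict prefix of.
find_prefix_clashes <- function(topics) {
  clashes <- list()
  for (short in topics) {
    longer <- topics[topics != short & startsWith(topics, short)]
    if (length(longer) > 0) {
      clashes[[short]] <- longer
    }
  }
  clashes
}

# Vignette names of brms 1.2.0 as listed in this report.
topics <- c("brms", "brms_distreg", "brms_families", "brms_monotonic",
            "brms_nonlinear", "brms_phylogenetics")

clashes <- find_prefix_clashes(topics)
print(clashes)
```

Here only "brms" is reported, since it is a prefix of all five other names. Note that startsWith() requires R >= 3.3.0.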
Comment 1 Duncan Murdoch 2016-11-27 12:21:28 UTC
I am seeing a difference between what the HTML help shows and what vignette() shows for version 1.2.0 of brms.  HTML help:

Vignettes from package 'brms'

brms::brms_distreg		Fit Distributional Models with brms	HTML	source	R code
brms::brms_families		Parameterization of response distributions in brms	HTML	source	
brms::brms_monotonic		Estimate monotonic effects with brms	HTML	source	R code
brms::brms_nonlinear		Fit Non-Linear Models with brms	HTML	source	R code
brms::brms_phylogenetics		Fit phylogenetic models with brms	HTML	source	R code
brms::brms		Overview of the brms Package	PDF	source	R code

vignette(package = "brms"):

Vignettes in package ‘brms’:

brms_monotonic        Estimate monotonic effects with brms
                      (source, html)
brms_distreg          Fit Distributional Models with brms (source,
brms_nonlinear        Fit Non-Linear Models with brms (source,
brms_phylogenetics    Fit phylogenetic models with brms (source,
brms_distreg          Overview of the brms Package (source, pdf)
brms_families         Parameterization of response distributions
                      in brms (source, html)

I won't have time to look into this for several weeks.
Comment 2 Henrik Bengtsson 2016-12-05 01:38:43 UTC
Kurt Hornik has already reached out to me to provide a fix for this in the R.rsp package, which I will do.  However, for the record, as the author of R.rsp and parts of the vignette code in tools, I'll try to clarify what is causing this problem:

One of the brms vignettes, brms/vignettes/brms.ltx, uses the R.rsp::tex vignette engine.  In R.rsp (<= 0.30.0), this vignette engine does not output any brms.R file during tangling. The rationale for this is that the vignette is a pure LaTeX-based vignette and therefore it makes little sense to generate an R script for it. BTW, this also affects the R.rsp::asis vignette engine.

Moreover, in R one can have more than one tangled output per vignette if "splitting" is enabled (honestly I don't know how / when splitting is used, but the vignette machinery assumes it can be done).  To support such multiple tangle output files, any tangled R scripts are basically located by the pattern "^brms.*[.]R" in this case.  Now, since there are other vignettes in brms with the same name prefix "brms", the tangle output of those vignettes is also picked up by the brms.ltx vignette (just as Paul observed / concluded).  The obvious ad hoc fix is for vignettes to use unique name prefixes (as Paul has already done in his devel version on GitHub).
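
The effect of that prefix pattern can be illustrated with a small sketch. The file listing below is hypothetical (loosely based on which brms vignettes have R code), and the exact-name pattern is only shown for contrast; it is not claimed to be what tools uses internally.

```r
# Hypothetical contents of a doc directory; brms.ltx produced no brms.R.
files <- c("brms.pdf", "brms_distreg.R", "brms_families.html",
           "brms_monotonic.R", "brms_nonlinear.R", "brms_phylogenetics.R")

# Prefix pattern as quoted above: matches any tangled script whose
# name merely starts with "brms".
prefix_matches <- grep("^brms.*[.]R", files, value = TRUE)

# Exact-name pattern, for contrast: matches only a file called "brms.R".
exact_matches <- grep("^brms[.]R$", files, value = TRUE)

print(prefix_matches)
print(exact_matches)
```

With this listing, the prefix pattern returns the four brms_*.R scripts that belong to other vignettes, whereas the exact-name pattern returns nothing because no brms.R exists.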

Now, I do believe these types of file name clashes / ambiguities could and should be resolved in R itself to allow for zero tangle files.  I even made a FIXME note back in 2013 for this in tools:::find_vignette_product():

    # FIXME: we should check a timestamp or something to see that
    #          these were produced by tangling for the "name" vignette,
    #          they aren't just coincidentally similar names.

Source: https://svn.r-project.org/R/branches/R-3-3-branch/src/library/tools/R/Vignettes.R

However, an easier fix is to output an empty tangle file in these cases, which is what R.rsp (> 0.30.0) will do.  When doing so, there is code in `R CMD build` that will actually drop those files immediately such that they are not part of the package *.tar.gz file.  This code is located in src/library/tools/R/build.R, cf. https://github.com/wch/r-source/blob/tags/R-3-3-2/src/library/tools/R/build.R#L321-L332.

For an example where this is used, see the devtools package:

> packageVersion("devtools")
[1] '1.12.0'

> tools::getVignetteInfo("devtools")
     Package    Dir                                                  
[1,] "devtools" "/home/hb/R/x86_64-pc-linux-gnu-library/3.3/devtools"
     Topic          File               Title                   R 
[1,] "dependencies" "dependencies.Rmd" "Devtools dependencies" ""
     PDF                
[1,] "dependencies.html"


> dir(system.file("doc", package = "devtools"))
[1] "dependencies.html" "dependencies.Rmd"  "index.html"

That this happens during the build of a package can be seen if one inspects the content of https://cran.r-project.org/src/contrib/devtools_1.12.0.tar.gz.
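
To find installed vignettes that have no tangled R script, one can filter the matrix returned by tools::getVignetteInfo() on its "R" column, as in the sketch below. The package "utils" is used only as a stand-in because it ships with R; replace it with "devtools" or any other installed package. The variable name no_tangle is purely illustrative.

```r
# "utils" is a stand-in package that ships with R; substitute any
# installed package name, e.g. "devtools".
info <- tools::getVignetteInfo("utils")

# Rows with an empty "R" column have no tangled R script installed.
no_tangle <- info[info[, "R"] == "", c("Package", "Topic", "PDF"),
                  drop = FALSE]
print(no_tangle)
```

For devtools 1.12.0 as shown above, the "dependencies" vignette would be expected to appear in such a listing, since its "R" entry is empty.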

I've fixed this in R.rsp devel, which I will submit to CRAN as soon as I've run through all reverse package dependency checks etc.