Bug 17138 - latexToUtf8 hangs for certain unrecognized LaTeX macros
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: Analyses
Version: R 3.3.0
Hardware: Other Linux
Importance: P5 normal
Assignee: R-core
Reported: 2016-09-02 07:37 UTC by Matt
Modified: 2016-09-05 11:16 UTC

Description Matt 2016-09-02 07:37:18 UTC
For example:
tools::latexToUtf8(tools::parseLatex("{\\a'\\i}"))

or
tools::latexToUtf8(tools::parseLatex("{\\'\\i}"))

The hang occurs because, when latexToUtf8 is processing a "MACRO" tag, the switch statement at line 100 of https://svn.r-project.org/R/branches/R-3-3-branch/src/library/tools/R/parseLatex.R has no case for nexttag being "MACRO".
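
To try either example without locking up the session, the call can be wrapped in a time limit. This is only a sketch, assuming base R's setTimeLimit() interrupts the loop as documented; try_with_limit is a hypothetical helper and not part of the tools package. On an affected version the call should be stopped by the limit; on a version where the loop terminates it should simply finish.

```r
# Hypothetical helper (not part of tools): evaluate an expression under an
# elapsed-time limit and report whether it finished or was stopped.
try_with_limit <- function(expr, seconds = 2) {
    setTimeLimit(elapsed = seconds, transient = TRUE)
    # Always clear the limit again, whatever happens below.
    on.exit(setTimeLimit(elapsed = Inf, transient = FALSE))
    tryCatch({
        expr  # the promise is forced here, inside the time limit
        "finished"
    }, error = function(e) paste("stopped:", conditionMessage(e)))
}

# The second example from this report, guarded by the limit.
try_with_limit(tools::latexToUtf8(tools::parseLatex("{\\'\\i}")))
```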
Comment 1 Peter Dalgaard 2016-09-02 14:29:46 UTC
I can reproduce this.

Running with debug(tools::latexToUtf8) I see something that looks like internal code damage:

debug: k <- 1L
Browse[3]> 
debug: while (k <= numargs) {
    if (getNext) {
        j <- j + 1L
        if (j > length(x)) {
            warning("argument for ", c(a), " not found", domain = NA)
            nextobj <- latex_tag("", "TEXT")
            nexttag <- "TEXT"
            nextchars <- ""
...
Browse[3]> 
debug: k <- k + 1L
Browse[3]> 
debug: (while) k <= numargs
Browse[3]> 
debug: if (getNext) {
    j <- j + 1L
    if (j > length(x)) {
        warning("argument for ", c(a), " not found", domain = NA)
        nextobj <- latex_tag("", "TEXT")
        nexttag <- "TEXT"
        nextchars <- ""
    }
    else {
        nextobj <- x[[j]]
.....


Notice the garbled while() construct. This may be only cosmetic; at any rate the diagnosis is correct that the switch() 

switch(nexttag, TEXT = {
    args[[k]] <- latex_tag(nextchars[1L], "TEXT")
    nextchars <- nextchars[-1L]
    if (!length(nextchars)) getNext <- TRUE
    if (args[[k]] %in% whitespace) next
    k <- k + 1L
}, COMMENT = getNext <- TRUE, BLOCK = , ENVIRONMENT = , MATH = {
    args[[k]] <- latexToUtf8(nextobj)
    k <- k + 1L
    getNext <- TRUE
}, `NULL` = stop("Internal error:  NULL tag", domain = NA))

encounters nexttag=="MACRO" and does nothing (in particular, does not increase k) so the loop does not terminate.

This was with R 3.3.0 on an aged iMac running Mavericks.
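
The mechanism can be shown in isolation. The sketch below is not the code from parseLatex.R; handled and step_loop are hypothetical names, and the loop carries an iteration cap (max_iter, also hypothetical) so that the demonstration itself cannot hang. It relies only on the documented behaviour that switch() with a character selector, no matching arm and no default returns NULL invisibly.

```r
# switch() with no matching arm and no default returns NULL and signals nothing.
handled <- function(nexttag) {
    result <- switch(nexttag,
        TEXT = "text",
        COMMENT = "comment",
        BLOCK = , ENVIRONMENT = , MATH = "recursive")
    !is.null(result)
}

handled("MATH")   # TRUE
handled("MACRO")  # FALSE

# Simplified stand-in for the argument loop: k only advances inside an arm,
# so an unmatched tag leaves k unchanged on every pass.
step_loop <- function(nexttag, numargs = 1L, max_iter = 50L) {
    k <- 1L
    iter <- 0L
    while (k <= numargs && iter < max_iter) {
        iter <- iter + 1L
        switch(nexttag,
            TEXT = , BLOCK = , ENVIRONMENT = , MATH = {
                k <- k + 1L
            })
    }
    iter  # number of passes made before the loop ended
}

step_loop("MATH")   # 1
step_loop("MACRO")  # 50, i.e. stopped only by the cap
```

Any fix presumably has to make the "MACRO" case either advance k or signal an error; which of the two is appropriate for latexToUtf8 is not settled in this report.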