Bug 15735 - Weird behavior of update.formula
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Models
Version: R 3.0.1
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance: P5 major
Assignee: R-core
Reported: 2014-04-01 11:40 UTC by Mathias
Modified: 2014-05-19 12:49 UTC
Description Mathias 2014-04-01 11:40:32 UTC
R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

When I update a formula by removing a variable that is not in it, the result is identical to the original formula:
> update(y ~ x, . ~ . - w)
y ~ x
This makes sense to me!
> 
Now, I do the same with a more complex formula:
> (myFormula <- as.formula(paste(c("y ~ x0", paste0("x", 1:30)), collapse = "+")))
y ~ x0 + x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + 
    x12 + x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20 + x21 + 
    x22 + x23 + x24 + x25 + x26 + x27 + x28 + x29 + x30
> (update(myFormula, . ~ . - w1))
y ~ x0 + x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + 
    x12 + x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20 + x21 + 
    x22 + x23 + x24 + x25 + x26 + x27 + x28 + x29 + x30 - 1
# Intercept is removed -> ???
> 
Worse:
> (updateArgument <- as.formula(paste(c(". ~ . ", paste0("w", 1:20)), collapse = " - ")))
. ~ . - w1 - w2 - w3 - w4 - w5 - w6 - w7 - w8 - w9 - w10 - w11 - 
    w12 - w13 - w14 - w15 - w16 - w17 - w18 - w19 - w20
> update(myFormula, updateArgument)
y ~ x0 + x1 + x2 + x3 + x4 + x5 + x6 + x8 + x9 + x11 + x12 + 
    x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20 + x23 + x24 + 
    x25 + x26 + x27 + x28 + x29 + x30 + x7:w1 + x10:w1 + x21:w2:w8:w9:w10:w12:w14:w17:w18 + 
    x22:w2:w8:w9:w10:w12:w14:w17:w18 - 1
> 
 
# When I run the code repeatedly, the results may differ, but they are always wrong
Comment 1 Peter Dalgaard 2014-04-01 12:25:32 UTC
I see this on OS X as well. Obviously, there's a bug, apparently inside terms.formula, specifically in the call

terms <- .External(C_termsform, x, specials, data, keep.order, 
        allowDotAsName)

where x is

y ~ (x0 + x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + 
    x11 + x12 + x13 + x14 + x15 + x16 + x17 + x18 + x19 + x20 + 
    x21 + x22 + x23 + x24 + x25 + x26 + x27 + x28 + x29 + x30) - 
    w1 - w2 - w3 - w4 - w5 - w6 - w7 - w8 - w9 - w10 - w11 - 
    w12 - w13 - w14 - w15 - w16 - w17 - w18 - w19 - w20
attr(,".Environment")
<environment: R_GlobalEnv>
Comment 2 Brian Ripley 2014-05-19 12:49:33 UTC
It's quite prosaic.  That formula has 32 variables, and the code forgot about the intercept in its internal bitmap allocation.  So the 32nd variable got aliased with the intercept.
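
The sizing arithmetic behind this kind of off-by-one can be sketched in C as follows. This is an illustration only and is not taken from the R sources: the 32-bit word size, the slot layout (slot 0 for the intercept, slots 1..nvar for the variables) and every function name here are assumptions.

```c
#include <assert.h>

/* Hypothetical layout, NOT the actual R implementation:
   slot 0 is the intercept, slots 1..nvar are the variables,
   and slots are packed into words of WORD_BITS bits each. */
enum { WORD_BITS = 32 };

/* Words allocated when only the nvar variables are counted
   (the intercept slot is forgotten). */
int words_ignoring_intercept(int nvar)
{
    return (nvar + WORD_BITS - 1) / WORD_BITS;
}

/* Words needed once the intercept slot is counted as well. */
int words_including_intercept(int nvar)
{
    return (nvar + 1 + WORD_BITS - 1) / WORD_BITS;
}

/* Position of a slot inside whichever word holds it. */
int offset_in_word(int slot)
{
    return slot % WORD_BITS;
}

/* Walks through the 32-variable case under the assumed layout. */
void check_example(void)
{
    /* 31 variables plus the intercept still fit in one word. */
    assert(words_ignoring_intercept(31) == 1);
    assert(words_including_intercept(31) == 1);

    /* 32 variables plus the intercept need 33 slots, so two words,
       but the undersized count allocates only one. */
    assert(words_ignoring_intercept(32) == 1);
    assert(words_including_intercept(32) == 2);

    /* Slot 32 lands at the same in-word position as slot 0. */
    assert(offset_in_word(32) == offset_in_word(0));
}
```

Under these assumptions the two counts agree for every nvar that is not a multiple of the word size, which would explain why shorter formulas such as y ~ x are unaffected.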