Created attachment 1999
The function dummy.coef.lm fails in more complex cases, notably when terms
include variables that are transformed in the formula of the model.
r.lm <- lm(Fertility ~ cut(Agriculture, breaks=4) + Infant.Mortality,
Error in model.frame.default(Terms, dummy, na.action = function(x) x, :
factor cut(Agriculture, breaks = 4) has new level (0.9995,1]
The problem is that it works with all.vars, which returns untransformed
variables. This is fixed by using model.frame instead -- which is needed
later in the function anyway.
The function dummy.coef.fix does this.
Thus, dummy.coef.lm should be replaced by dummy.coef.fix.
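To illustrate the difference described above, here is a minimal sketch using the built-in swiss data. The object names f and mf are chosen for illustration only and are not taken from the attachment. all.vars returns the names of the untransformed variables, whereas model.frame keeps the transformed term as a column of its own.

```r
# Formula with a variable that is transformed inside the model formula
f <- Fertility ~ cut(Agriculture, breaks = 4) + Infant.Mortality

# all.vars() sees only the raw variable names
all.vars(f)

# model.frame() evaluates the transformation and keeps it as one column
mf <- model.frame(f, data = swiss)
names(mf)

is.factor(mf[[2]])   # the cut() term is stored as a factor
nlevels(mf[[2]])     # one level per interval
```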
In the function, there is a warning
warning("some terms will have NAs due to the limits of the method")
I wonder why this is a "limit" (-> limitation) of the method.
If some interaction coefficients are undetermined because the respective
combination of levels is not available, NA is the appropriate result.
Are there other cases?
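As a small sketch of the situation meant here, the following made-up data set (d, a, b and y are hypothetical names) has one combination of levels missing, so lm cannot determine the corresponding interaction coefficient and reports it as NA. Whether dummy.coef issues the warning for this fit is not claimed here.

```r
# Two factors, two replicates per cell, with the cell (a2, b3) removed
d <- expand.grid(a = factor(c("a1", "a2")),
                 b = factor(c("b1", "b2", "b3")),
                 rep = 1:2)
d <- d[!(d$a == "a2" & d$b == "b3"), ]

set.seed(1)
d$y <- rnorm(nrow(d))   # arbitrary response, only the design matters

fit <- lm(y ~ a * b, data = d)
coef(fit)               # the coefficient for a2:b3 is undetermined
```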
I have extended the function to include confidence intervals and t-tests
and call the extended function allcoef.
The latter are what is shown by summary.lm, except for the (dummy)
variable that is eliminated by the contrasts. For treatment contrasts,
the added information is trivial (0 with 0 standard error), but for
sum (or weighted sum) contrasts, it is not, and for other contrasts, it may
still recover more useful information.
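For sum contrasts, the idea can be sketched by hand as follows. This is not the allcoef function from the attachment, only an illustration under the assumption of a one-way model on the built-in PlantGrowth data. The effect of the eliminated level is minus the sum of the estimated effects, and its standard error follows from the covariance matrix of those estimates.

```r
fit <- lm(weight ~ group, data = PlantGrowth,
          contrasts = list(group = "contr.sum"))

b <- coef(fit)[-1]        # estimated effects of the first k-1 levels
V <- vcov(fit)[-1, -1]    # their covariance matrix

# Recover the eliminated (last) level: effect = -sum(b), variance = 1' V 1
est <- c(b, -sum(b))
se  <- c(sqrt(diag(V)), sqrt(sum(V)))
names(est) <- names(se) <- levels(PlantGrowth$group)

tval <- est / se
q    <- qt(0.975, df.residual(fit))
cbind(estimate = est, se = se, t = tval,
      lower = est - q * se, upper = est + q * se)
```

Because this design is balanced, the recovered effects equal the group means minus the grand mean, and all three standard errors coincide.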
The function would need some polishing to work in general contexts.
Let me know if you are interested.
Werner Stahel, Jan 4, 2016
Thank you, Werner.
I can confirm that your version works for the example where the current `stats` package one fails.
Your version also fixes the similar problem reported to R-help
"bug in dummy.coef?"
It took me a bit of time, because your version had quite a few changes that were not necessary (you renamed three of the internal variables). It must also have come from simply print()ing the function definition in an older version of R, so your code lacks the comments from the source code and, e.g., the newer use of anyNA().
Note that the most current source (of "R-devel") is always (for this function)
(but to find this file, it is easiest to get a source "tarball" from one of the places linked from https://www.r-project.org/sources.html; note the daily versions provided by "SfS"!) or, if you prefer the web, you can use the 'site:svn.r-project.org/R' trick:
Your question about the warning: I also find it a bit "strange".
One could replace "due to the limits of the method"
by "due to the design" (meaning the linear model design matrix),
but I think you are suggesting that *no* warning should be given there, right?
I did not easily find a case that triggers the warning. Do you have one?