16665
2016-01-11 12:32:10 +0000
dummy.coef fails when transformations are included in formula
2016-01-22 07:45:43 +0000
1
1
1
Unclassified
R
Analyses
R 3.2.3
Other
Linux
ASSIGNED
P5
enhancement
---
1
stahel
R-core
maechler
oldest_to_newest
91498
0
1999
stahel
2016-01-11 12:32:10 +0000
Created attachment 1999
dummy.coef.fix.R
The function dummy.coef.lm fails in more complex cases, notably when terms
include variables that are transformed in the formula of the model.
r.lm <- lm(Fertility ~ cut(Agriculture, breaks=4) + Infant.Mortality,
data=swiss)
dummy.coef(r.lm)
Error in model.frame.default(Terms, dummy, na.action = function(x) x, :
factor cut(Agriculture, breaks = 4) has new level (0.9995,1]
The problem is that ii works with all.vars , which returns untransformed
variables. This is fixed by using model.frame instead -- which is needed
later in the function anyway.
The function dummy.coef.fix does this.
dummy.coef.fix(r.lm)
Thus, dummy.coef.lm should be replaced by dummt.coef.fix .
In the function, there is a warning
warning("some terms will have NAs due to the limits of the method")
I wonder why this is a "limit' (->limitation) of the method.
If some interaction coefficients are undetermined because the respective
combination of levels is not available, NA is the appropriate result.
Are there other cases?
I have extended the function to include confidence intervals and t-tests
and call the extended function allcoef .
The latter are what is shown by summary.lm, except that for the (dumy)
variable that is eliminated by the contrasts . For treatment contrasts,
the added information is trivial (0 with 0 standard error), but for
sum (or weighted sum) contrasts, it is not, and for other contrasts, it may
still recover more useful information.
The function would need some polishing to work in general contexts.
Let me know if you are interested.
Werner Stahel, Jan 4, 2016
91551
1
maechler
2016-01-22 07:45:43 +0000
Thank you, Werner.
I can confirm that your version works for the example where the current `stats` package one fails.
Your version also fixes the similar problem reported to R-help
"bug in dummy.coef?"
https://stat.ethz.ch/pipermail/r-help/2013-October/362106.html
I've spent a bit of time because your version had quite a few changes that were not necessary (you renamed three of the internal variables) and your version must have come from simple "print()"ing of the function definition in an older version of R, so your code misses the comments from the source code and e.g., the newer anyNA() use.
Note that the most current source (of "R-devel") is always (for this function)
https://svn.r-project.org/R/trunk/src/library/base/R/dummy.coef.R
((but to find this file, you most easly get a source "tarball" from one of the places linked from https://www.r-project.org/sources.html -- note the daily versions provided by "SfS"!) or if you prefer the web, you can use the 'site:svn.r-project.org/R' trick :
https://www.google.ch/search?q=site:svn.r-project.org/R++%27dummy.coef%27&ie=utf-8&oe=utf-8&gws_rd=cr&ei=1t2hVqqGDoXxUt_spugL))
Your question about the warning: I also find it a bit "strange".
One could replace "due to the limits of the method"
by "due to the design" (meaning the linear model design matrix),
but I think you are suggesting that *no* warning should be given there, right?
I did not easily find a case that triggers the warning. Do you have one?
Best regards,
Martin
1999
2016-01-11 12:32:10 +0000
2016-01-11 12:32:10 +0000
dummy.coef.fix.R
file_16665.txt
text/plain
2656
stahel
ZHVtbXkuY29lZi5maXggPC0gZnVuY3Rpb24gKG9iamVjdCwgdXNlLm5hID0gRkFMU0UsIC4uLikg
DQp7DQogICAgeGwgPC0gb2JqZWN0JHhsZXZlbHMNCiAgICBpZiAoIWxlbmd0aCh4bCkpIA0KICAg
ICAgICByZXR1cm4oYXMubGlzdChjb2VmKG9iamVjdCkpKQ0KICAgIFRlcm1zIDwtIHRlcm1zKG9i
amVjdCkNCiAgICB0bCA8LSBhdHRyKFRlcm1zLCAidGVybS5sYWJlbHMiKQ0KICAgIGludCA8LSBh
dHRyKFRlcm1zLCAiaW50ZXJjZXB0IikNCiAgICBmYWNzIDwtIGF0dHIoVGVybXMsICJmYWN0b3Jz
IilbLTEsICwgZHJvcCA9IEZBTFNFXQ0KICAgIFRlcm1zIDwtIGRlbGV0ZS5yZXNwb25zZShUZXJt
cykNCiAgICBtZiA8LSBvYmplY3QkbW9kZWwNCiAgICBpZiAoaXMubnVsbChtZikpIG1mIDwtIG1v
ZGVsLmZyYW1lKG9iamVjdCkNCiAgICB4dG5tIDwtIGRpbW5hbWVzKGZhY3MpW1sxXV0gICMjIG5h
bWVzDQogICAgeHRsdiA8LSBsYXBwbHkobWZbLHh0bm0sIGRyb3A9RkFMU0VdLGZ1bmN0aW9uKHgp
IGxldmVscyh4KSkgIyMgbGV2ZWxzDQogICAgeHRubCA8LSBwbWF4KHNhcHBseSh4dGx2LGxlbmd0
aCksMSkgICMjIG51bWJlciBvZiBsZXZlbHMNCiAgICBsdGVybXMgPC0gYXBwbHkoZmFjcywgMkws
IGZ1bmN0aW9uKHgpIHByb2QoeHRubFt4ID4gMF0pKQ0KICAgIG5sIDwtIHN1bShsdGVybXMpDQog
ICAgIyMgZGYuZHVtbXk6IGRhdGEgZnJhbWUgb2YgdmFycw0KICAgIGFyZ3MgPC0gc2V0TmFtZXMo
dmVjdG9yKCJsaXN0IiwgbGVuZ3RoKHh0bm0pKSwgeHRubSkNCiAgICBmb3IgKGkgaW4geHRubSkN
CiAgICAgICAgYXJnc1tbaV1dIDwtIGlmICh4dG5sW1tpXV0gPT0gMSkgIHJlcC5pbnQoMSwgbmwp
ICAgIGVsc2UNCiAgICAgICAgICBmYWN0b3IocmVwLmludCh4dGx2W1tpXV1bMUxdLCBubCksIGxl
dmVscyA9IHh0bHZbW2ldXSkNCiAgICBkZi5kdW1teSA8LSBhcy5kYXRhLmZyYW1lKGFyZ3MpICMg
ZG8uY2FsbCgiZGF0YS5mcmFtZSIsIGFyZ3MpDQogICAgbmFtZXMoZGYuZHVtbXkpIDwtIHh0bm0N
CiAgICAjIyBybm46IG5hbWVzIG9mIHJvd3MNCiAgICBwb3MgPC0gMA0KICAgIHJuIDwtIHJlcC5p
bnQodGwsIGx0ZXJtcykNCiAgICBybm4gPC0gcmVwLmludCgiIiwgbmwpDQogICAgZm9yIChqIGlu
IHRsKSB7DQogICAgICAgIGkgPC0gdW5saXN0KHh0bm1bZmFjc1ssIGpdID4gMF0pDQogICAgICAg
IGlmYWMgPC0gaVt4dG5sW2ldID4gMV0NCiAgICAgICAgaWYgKGxlbmd0aChpZmFjKSA9PSAwTCkg
ew0KICAgICAgICAgICAgcm5uW3BvcyArIDFdIDwtIGoNCiAgICAgICAgfQ0KICAgICAgICBlbHNl
IGlmIChsZW5ndGgoaWZhYykgPT0gMUwpIHsNCiAgICAgICAgICAgIGRmLmR1bW15W3BvcyArIDFM
Omx0ZXJtc1tqXSwgaWZhY10gPC0geHRsdltbaWZhY11dDQogICAgICAgICAgICBybm5bcG9zICsg
MUw6bHRlcm1zW2pdXSA8LSBhcy5jaGFyYWN0ZXIoeHRsdltbaWZhY11dKQ0KICAgICAgICB9DQog
ICAgICAgIGVsc2Ugew0KICAgICAgICAgICAgdG1wIDwtIGV4cGFuZC5ncmlkKHh0bHZbaWZhY10p
DQogICAgICAgICAgICBkZi5kdW1teVtwb3MgKyAxTDpsdGVybXNbal0sIGlmYWNdIDwtIHRtcA0K
ICAgICAgICAgICAgcm5uW3BvcyArIDFMOmx0ZXJtc1tqXV0gPC0gYXBwbHkoYXMubWF0cml4KHRt
cCksIA0KICAgICAgICAgICAgICAgIDFMLCBmdW5jdGlvbih4KSBwYXN0ZSh4LCBjb2xsYXBzZSA9
ICI6IikpDQogICAgICAgIH0NCiAgICAgICAgcG9zIDwtIHBvcyArIGx0ZXJtc1tqXQ0KICAgIH0N
CiAgICBhdHRyKGRmLmR1bW15LCJ0ZXJtcyIpIDwtIGF0dHIobWYsInRlcm1zIikNCiAgICBsY29u
dHIgPC0gb2JqZWN0JGNvbnRyYXN0cw0KICAgIGxjaSA8LSBzYXBwbHkoZGYuZHVtbXksaXMuZmFj
dG9yKQ0KICAgIGxjb250ciA8LSBsY29udHJbbmFtZXMobGNpKVtsY2ldXSAjIyBmYWN0b3JzIHdp
dGggMSBsZXZlbCBoYXZlIGRpc2FwcGVhcmVkICg/KSANCiAgICBtbSA8LSBtb2RlbC5tYXRyaXgo
VGVybXMsIGRmLmR1bW15LCBsY29udHIsIHhsKQ0KICAgIGlmIChhbnkoaXMubmEobW0pKSkgew0K
ICAgICAgICB3YXJuaW5nKCJzb21lIHRlcm1zIHdpbGwgaGF2ZSBOQXMgZHVlIHRvIHRoZSBsaW1p
dHMgb2YgdGhlIG1ldGhvZCIpDQogICAgICAgIG1tW2lzLm5hKG1tKV0gPC0gTkENCiAgICB9DQog
ICAgY29lZiA8LSBvYmplY3QkY29lZmZpY2llbnRzDQogICAgaWYgKCF1c2UubmEpIA0KICAgICAg
ICBjb2VmW2lzLm5hKGNvZWYpXSA8LSAwDQogICAgYXNnbiA8LSBhdHRyKG1tLCAiYXNzaWduIikN
CiAgICByZXMgPC0gc2V0TmFtZXModmVjdG9yKCJsaXN0IiwgbGVuZ3RoKHRsKSksIHRsKQ0KICAg
IGZvciAoaiBpbiBzZXFfYWxvbmcodGwpKSB7DQogICAgICAgIGtlZXAgPC0gYXNnbiA9PSBqDQog
ICAgICAgIGlqIDwtIHJuID09IHRsW2pdDQogICAgICAgIHJlc1tbal1dIDwtIHNldE5hbWVzKGRy
b3AobW1baWosIGtlZXAsIGRyb3AgPSBGQUxTRV0gJSolIA0KICAgICAgICAgICAgY29lZltrZWVw
XSksIHJubltpal0pDQogICAgfQ0KICAgIGlmIChpbnQgPiAwKSB7DQogICAgICAgIHJlcyA8LSBj
KGxpc3QoYChJbnRlcmNlcHQpYCA9IGNvZWZbaW50XSksIHJlcykNCiAgICB9DQogICAgY2xhc3Mo
cmVzKSA8LSAiZHVtbXlfY29lZiINCiAgICByZXMNCn0NCg==