Bug 15386 - F statistic in stats:::add1.glm()
F statistic in stats:::add1.glm()
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Analyses
R 3.0.1
All All
: P5 major
Assigned To: R-core
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-07-09 20:01 UTC by Steven McKinney
Modified: 2013-07-10 12:32 UTC (History)
1 user (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Steven McKinney 2013-07-09 20:01:20 UTC
Reported by Bill Dunlap:

Should the F statistic be the same when using add1() on models created by lm and glm(family=gaussian)?
They are in the single-degree-of-freedom case but not in the multiple-degree-of-freedom case.
MASS:addterm shows the same discrepancy.  It looks like the deviance (==residual sum of squares) gets
divided by the number of degrees of freedom for the term twice in add1.glm.  Using anova() on the output
of lm and glm(family=gaussian) gives the save F statistic as add1.lm gives.

> # factor(carb) consumes 5 degrees of freedom, am 1, compare their F values.
> fit <- lm(mpg ~ hp, data=mtcars) ; add1(fit, ~ hp + factor(carb) + am, test="F")
Single term additions

Model:
mpg ~ hp
             Df Sum of Sq    RSS    AIC F value   Pr(>F)   
<none>                    447.67 88.427                    
factor(carb)  5    103.54 344.13 90.009  1.5044   0.2241   
am            1    202.24 245.44 71.194 23.8952 3.46e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> fit <- glm(mpg ~ hp, data=mtcars) ; add1(fit, ~ hp + factor(carb) + am, test="F")
Single term additions

Model:
mpg ~ hp
             Df Deviance    AIC F value   Pr(>F)   
<none>            447.67 181.24                    
factor(carb)  5   344.13 182.82  0.3009   0.9077   
am            1   245.44 164.01 23.8952 3.46e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> stats:::add1.glm
function (object, scope, scale = 0, test = c("none", "Rao", "LRT",
    "Chisq", "F"), x = NULL, k = 2, ...)
{
    Fstat <- function(table, rdf) {
        dev <- table$Deviance
        df <- table$Df
        diff <- pmax(0, (dev[1L] - dev)/df)
        Fs <- (diff/df)/(dev/(rdf - df))

Is this where the double division is happening?
diff has df in the denominator, then Fs assignment sees
diff divided by df again.  if df is 1, the double division
will go unnoticed.

Proposed fix:
Replace
   diff <- pmax(0, (dev[1L] - dev)/df)
with
   diff <- pmax(0, (dev[1L] - dev))
Comment 1 Peter Dalgaard 2013-07-10 12:32:03 UTC
Yes, that's a bug. Will fix shortly.