Bugzilla – Bug 15386

F statistic in stats:::add1.glm()

Last modified: 2013-07-10 12:32:03 UTC

Reported by Bill Dunlap: Should the F statistic be the same when using add1() on models created by lm and glm(family=gaussian)? They are in the single-degree-of-freedom case but not in the multiple-degree-of-freedom case. MASS:addterm shows the same discrepancy. It looks like the deviance (==residual sum of squares) gets divided by the number of degrees of freedom for the term twice in add1.glm. Using anova() on the output of lm and glm(family=gaussian) gives the save F statistic as add1.lm gives. > # factor(carb) consumes 5 degrees of freedom, am 1, compare their F values. > fit <- lm(mpg ~ hp, data=mtcars) ; add1(fit, ~ hp + factor(carb) + am, test="F") Single term additions Model: mpg ~ hp Df Sum of Sq RSS AIC F value Pr(>F) <none> 447.67 88.427 factor(carb) 5 103.54 344.13 90.009 1.5044 0.2241 am 1 202.24 245.44 71.194 23.8952 3.46e-05 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > fit <- glm(mpg ~ hp, data=mtcars) ; add1(fit, ~ hp + factor(carb) + am, test="F") Single term additions Model: mpg ~ hp Df Deviance AIC F value Pr(>F) <none> 447.67 181.24 factor(carb) 5 344.13 182.82 0.3009 0.9077 am 1 245.44 164.01 23.8952 3.46e-05 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com > stats:::add1.glm function (object, scope, scale = 0, test = c("none", "Rao", "LRT", "Chisq", "F"), x = NULL, k = 2, ...) { Fstat <- function(table, rdf) { dev <- table$Deviance df <- table$Df diff <- pmax(0, (dev[1L] - dev)/df) Fs <- (diff/df)/(dev/(rdf - df)) Is this where the double division is happening? diff has df in the denominator, then Fs assignment sees diff divided by df again. if df is 1, the double division will go unnoticed. Proposed fix: Replace diff <- pmax(0, (dev[1L] - dev)/df) with diff <- pmax(0, (dev[1L] - dev))

Yes, that's a bug. Will fix shortly.