This is an example.

> set.seed(100)
> y <- rnorm(9)
> x <- factor(c(
+ "a", "a", "a", "b", "b", "b", "b", "c", "c"
+ ), levels = c("a", "b", "c"))
> sel <- c(
+ TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE
+ )
> table(x, sel)
   sel
x   FALSE TRUE
  a     1    2
  b     1    3
  c     2    0

So, every observation where x is "c" has sel equal to FALSE.

I want "b" as the base category for x, so I set the contrasts.

> contrasts(x) <-
+ contr.treatment(levels(x), contrasts=FALSE)[, -2, drop=FALSE]
> contrasts(x)
  a c
a 1 0
b 0 0
c 0 1
> options("contrasts")
$contrasts
        unordered           ordered
"contr.treatment"      "contr.poly"

> lm(y ~ x, subset=sel)

Call:
lm(formula = y ~ x, subset = sel)

Coefficients:
(Intercept)           xb
    -0.1853       0.6261

> lm(y ~ x, subset=sel, singular.ok=FALSE)

Call:
lm(formula = y ~ x, subset = sel, singular.ok = FALSE)

Coefficients:
(Intercept)           xb
    -0.1853       0.6261

The result of 'lm' above has a coefficient for category "b" of x. So, "b" is not the base category. The first time I encountered this, I was surprised.

I see that it happens because 'lm' calls 'model.frame' with drop.unused.levels = TRUE. In function 'model.frame.default', if drop.unused.levels is TRUE and not all levels are present in a factor, [, drop = TRUE] is applied. In this example, after applying subset=sel, category "c" is not present. In function '[.factor', if drop is TRUE, the result doesn't have the "contrasts" attribute. When no contrasts are specified, options("contrasts") is used, in this case "contr.treatment". So "a", the first category, becomes the base category.

I can understand that '[.factor' with drop = TRUE drops the "contrasts" attribute. If the number of levels is reduced, the original contrasts matrix is no longer valid.

What I grumble about is what is done by 'lm'. I specify contrasts with a purpose, but 'lm' doesn't respect my specification, and it does so silently.

In this example, a way to achieve what I want is to use 'relevel'. But still, I wish 'lm' would not silently ignore user-specified contrasts.
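The mechanism described above, and the 'relevel' workaround, can be sketched as follows. This is only an illustration under assumptions: the helper 'has_contrasts' and the names 'x_plain' and 'x_b' are hypothetical and not part of the original session, and the behaviour shown relies on '[.factor' keeping the "contrasts" attribute unless drop = TRUE.

```r
# Rebuild the data from the example.
set.seed(100)
y <- rnorm(9)
x_plain <- factor(c("a", "a", "a", "b", "b", "b", "b", "c", "c"),
                  levels = c("a", "b", "c"))
sel <- c(TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)

# Copy of the factor carrying the user-specified contrasts ("b" as base).
x <- x_plain
contrasts(x) <- contr.treatment(levels(x), contrasts = FALSE)[, -2, drop = FALSE]

# Hypothetical helper: is a "contrasts" attribute stored on the factor?
has_contrasts <- function(f) !is.null(attr(f, "contrasts"))

print(has_contrasts(x))                     # contrasts set explicitly above
print(has_contrasts(x[sel]))                # plain subsetting, no level dropping
print(has_contrasts(x[sel, drop = TRUE]))   # what happens inside model.frame

# Workaround: make "b" the first level, so that the default
# contr.treatment uses it as the base even after "c" is dropped.
x_b <- relevel(x_plain, ref = "b")
fit <- lm(y ~ x_b, subset = sel)
print(names(coef(fit)))
```

With 'relevel', the base category no longer depends on the "contrasts" attribute surviving the subsetting, which is why it works here. It only covers treatment-style coding, though, and does not help when a more general contrasts matrix was specified.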
Function 'glm' also silently ignores the contrasts in this situation.

> sessionInfo()
R version 3.1.2 (2014-10-31)
Platform: i386-w64-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

I'll add a warning to model.frame() in R-devel and R-patched.