Bug 16975 - lme fixed sigma - inconsistent REML estimation
Summary: lme fixed sigma - inconsistent REML estimation
Status: UNCONFIRMED
Alias: None
Product: R
Classification: Unclassified
Component: Models (show other bugs)
Version: R 3.3.0
Hardware: x86_64/x64/amd64 (64-bit) OS X El Capitan
: P5 major
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2016-06-29 05:06 UTC by Maciej
Modified: 2016-06-29 11:47 UTC (History)
0 users

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Maciej 2016-06-29 05:06:23 UTC
I would like to estimate Fay-Herriot class models in nlme (small area models). Basically, these models have fixed random error which is assumed to be known from sample survey (sampling error). Hence, the model I would like to specify should have sigma = 1 (it is not estimated).

I have checked new version nlme package (3.1-128) however results are different from those from sae package and proc mixed when it comes to fitting a small area model (in particular a Fay-Herriot area model). 

I am not sure why these results differ. They should be the same because sae::eblupFH fits s mixed model with one random effect and fixed residual variance). 

Moreover, prof. Viechtbauer Wolfgang also shown that the problem is probably related to REML estimation because ML estimates are the same as in metafor package (the last part).

############################## 
## preparation
############################## 

library(sae)
library(nlme)
library(metafor)
data(milk)
milk$var <- milk$SD^2


############################## 
### nlme results
############################## 

> m2 <- lme(fixed = yi ~ as.factor(MajorArea),
           random = ~ 1 | SmallArea,
           control = lmeControl(sigma = 1, 
                                apVar = T),
           weights = varFixed(~var),
           data = milk)


## variance (not the same as in sae and proc mixed)
0.1332918^2 = 0.0177667

> summary(m2)
Linear mixed-effects model fit by REML
 Data: milk 
        AIC       BIC   logLik
  -10.69175 -2.373943 10.34588

Random effects:
 Formula: ~1 | SmallArea
        (Intercept) Residual
StdDev:   0.1332918        1

Variance function:
 Structure: fixed weights
 Formula: ~var 
Fixed effects: yi ~ as.factor(MajorArea) 
                           Value  Std.Error DF   t-value p-value
(Intercept)            0.9680768 0.06849017 39 14.134537  0.0000
as.factor(MajorArea)2  0.1316132 0.10183884 39  1.292367  0.2038
as.factor(MajorArea)3  0.2269008 0.09126952 39  2.486053  0.0173
as.factor(MajorArea)4 -0.2415905 0.08058755 39 -2.997863  0.0047

############################## 
### results based on sae package 
############################## 

library(sae)
va <- eblupFH(formula = yi ~ as.factor(MajorArea), vardir = var, data = milk, method = "REML")

> va$fit
$method
[1] "REML"

$convergence
[1] TRUE

$iterations
[1] 4

$estcoef
                            beta  std.error    tvalue       pvalue
(Intercept)            0.9681890 0.06936208 13.958476 2.793443e-44
as.factor(MajorArea)2  0.1327801 0.10300072  1.289119 1.973569e-01
as.factor(MajorArea)3  0.2269462 0.09232981  2.457995 1.397151e-02
as.factor(MajorArea)4 -0.2413011 0.08161707 -2.956503 3.111496e-03

$refvar
[1] 0.01855022

$goodness
   loglike        AIC        BIC        KIC       AICc      AICb1      AICb2       KICc 
 12.677478 -15.354956  -6.548956 -10.354956         NA         NA         NA         NA 
     KICb1      KICb2 nBootstrap 
        NA         NA   0.000000 

############################## 
### Proc mixed results are consistent with sae 
############################## 

proc SmallArea data= milk order=data method=reml;
class SmallArea;
weight var;
model y= MajorArea2 MajorArea3  MajorArea4 / cl solution outp=predicted;
random SmallArea;
parms (1) (1) / hold=2;
run;
 

## Covariance Parameter Estimates
Cov Parm Estimate
SmallArea 0.01855
Residual 1.0000

### fixed effects
Intercept 0.9682 
majorarea2 0.1328
majorarea3 0.2269
majorarea3 -0.2413


############################## 
### metafor package REML vs ML
##############################

 
milk$var <- milk$SD^2
res <- rma(yi ~ as.factor(MajorArea), var, data=milk)
res
res$tau2

Yields:

Mixed-Effects Model (k = 43; tau^2 estimator: REML)

tau^2 (estimated amount of residual heterogeneity):     0.0185 (SE = 0.0079)
tau (square root of estimated tau^2 value):             0.1362
I^2 (residual heterogeneity / unaccounted variability): 55.21%
H^2 (unaccounted variability / sampling variability):   2.23
R^2 (amount of heterogeneity accounted for):            65.85%

Test for Residual Heterogeneity: 
QE(df = 39) = 86.1840, p-val < .0001

Test of Moderators (coefficient(s) 2,3,4): 
QM(df = 3) = 46.5699, p-val < .0001

Model Results:

                      estimate      se     zval    pval    ci.lb    ci.ub     
intrcpt                  0.9682  0.0694  13.9585  <.0001   0.8322   1.1041  ***
as.factor(MajorArea)2    0.1328  0.1030   1.2891  0.1974  -0.0691   0.3347     
as.factor(MajorArea)3    0.2269  0.0923   2.4580  0.0140   0.0460   0.4079    *
as.factor(MajorArea)4   -0.2413  0.0816  -2.9565  0.0031  -0.4013  -0.0813   **

---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 

res$tau2
[1] 0.01854996

Same as in 'sae' (rounded to 6 decimals) and SAS.

Not a huge difference to lme(), but larger than one would expect due to numerical differences.

If you switch to method="ML" (for both lme() and rma()), then you get 0.1245693^2 = 0.01551751 for lme() and 0.0155174 for rma(), so that's basically the same.

——————
Maciej Beręsewicz
Comment 1 Maciej 2016-06-29 11:47:23 UTC
Please find below code provided by prof. from S-PLUS version. Results are the same as in sae, metafor and proc mixed.


Linear mixed-effects model fit by REML
 Data: dat
      AIC      BIC   logLik
  6.02487 14.34268 1.987565

Random effects:
 Formula:  ~ 1 | SmallArea
        (Intercept) Residual
StdDev:   0.1361994        1

Variance function:
 Structure: fixed weights
 Formula:  ~ var
Fixed effects: yi ~ as.factor(MajorArea)
                           Value  Std.Error DF   t-value p-value
          (Intercept)  0.9977953 0.03179340 39  31.38373  <.0001
as.factor(MajorArea)1  0.0663901 0.05150041 39   1.28912  0.2050
as.factor(MajorArea)2  0.0535187 0.02659572 39   2.01230  0.0511
as.factor(MajorArea)3 -0.0903025 0.01466646 39  -6.15707  <.0001
 Correlation:
                      (Intr) a(MA)1 a(MA)2
as.factor(MajorArea)1  0.075
as.factor(MajorArea)2 -0.157 -0.060
as.factor(MajorArea)3 -0.392 -0.054  0.113

Standardized Within-Group Residuals:
       Min         Q1       Med        Q3     Max
 -1.702152 -0.2304439 0.2200099 0.3981538 1.22415

Number of Observations: 43
Number of Groups: 43
> 0.1361994^2
[1] 0.01855028