Bug 15000 - stat:::logLik.lm() for multivariate lm uses wrong likelihood definition; lack of logLik.mlm
stat:::logLik.lm() for multivariate lm uses wrong likelihood definition; lack...
Status: CLOSED WISHLIST
Product: R
Classification: Unclassified
Component: Models
R 2.15.0 patched
x86_64/x64/amd64 (64-bit) Windows 64-bit
: P5 normal
Assigned To: R-core
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-07-24 11:43 UTC by boecka
Modified: 2014-02-16 11:43 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description boecka 2012-07-24 11:43:58 UTC
Session info at the end.
For a multivariate linear model (matrix valued y-matrix) logLik() produces an incorrect result. Example says more,
thank you, best
Andi Böck

set.seed(1)
n <- 50
y1 <- rnorm(n)
y <- cbind(y1, y2 <- y1+rnorm(n, sd=.5)) 
x1 <- rnorm(n)
x2 <- rnorm(n)

mod <- lm(y~x1+x2)
logLik(mod)
# 'log Lik.' -125.3841 (df=4)


# bivariate log-lik
require(mvtnorm)
resids <- residuals(mod)
Sigma_ML <- crossprod(resids) / n
Sigma_wrong <- Sigma_ML
Sigma_wrong[c(2,3)] <- 0 # covariances to zero

# "correct" logLik
sum(dmvnorm(resids, mean=c(0,0), sigma=Sigma_ML, log=T))
# [1] -93.20981

# zero covariance logLik, which is equal to...
sum(dmvnorm(resids, mean=c(0,0), sigma=Sigma_wrong, log=T))
# [1] -124.9283


# ...sum of univariate logLiks
logLik(lm(y1~x1+x2)) + logLik(lm(y2~x1+x2))
# 'log Lik.' -124.9283 (df=4)




R version 2.15.1 (2012-06-22)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252   
[3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C                   
[5] LC_TIME=German_Germany.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] mvtnorm_0.9-9992

loaded via a namespace (and not attached):
[1] tools_2.15.1
Comment 1 Peter Dalgaard 2012-07-26 14:53:14 UTC
I have to disagree there, Brian. 

It's OK to trap the mlm case for now, but I think its valid to wish for logLik.mlm, and I'd call it a toss-up between user error and programmer error when wrong methods get used via inheritance.

There's a bit of trickiness in implementing it, though. dmvnorm is in a contributed package  on which we don't want a dependency, and I'm unsure I'd get the multivariate REML code right on the first try.

Moved to wishlist for now. Patches could be accepted.
Comment 2 Brian Ripley 2012-07-26 16:30:06 UTC
Actually, this is multiple-repsponse not multivariate lm.  The 
difference being what is assumed for the covariance matrix of the error 
terms.

It is user error, which we now trap.



On 24/07/2012 11:43, r-bugs@r-project.org wrote:
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15000
>
>             Summary: stat:::logLik.lm() for multivariate lm uses wrong
>                      likelihood definition; lack of logLik.mlm
>             Product: R
>             Version: R 2.15.0 patched
>            Platform: x86_64/x64/amd64 (64-bit)
>          OS/Version: Windows 64-bit
>              Status: NEW
>            Severity: normal
>            Priority: P5
>           Component: Models
>          AssignedTo: R-core@R-project.org
>          ReportedBy: boecka@gmx.de
>     Estimated Hours: 0.0
>
>
> Session info at the end.
> For a multivariate linear model (matrix valued y-matrix) logLik() produces an
> incorrect result. Example says more,
> thank you, best
> Andi Böck
>
> set.seed(1)
> n <- 50
> y1 <- rnorm(n)
> y <- cbind(y1, y2 <- y1+rnorm(n, sd=.5))
> x1 <- rnorm(n)
> x2 <- rnorm(n)
>
> mod <- lm(y~x1+x2)
> logLik(mod)
> # 'log Lik.' -125.3841 (df=4)
>
>
> # bivariate log-lik
> require(mvtnorm)
> resids <- residuals(mod)
> Sigma_ML <- crossprod(resids) / n
> Sigma_wrong <- Sigma_ML
> Sigma_wrong[c(2,3)] <- 0 # covariances to zero
>
> # "correct" logLik
> sum(dmvnorm(resids, mean=c(0,0), sigma=Sigma_ML, log=T))
> # [1] -93.20981
>
> # zero covariance logLik, which is equal to...
> sum(dmvnorm(resids, mean=c(0,0), sigma=Sigma_wrong, log=T))
> # [1] -124.9283
>
>
> # ...sum of univariate logLiks
> logLik(lm(y1~x1+x2)) + logLik(lm(y2~x1+x2))
> # 'log Lik.' -124.9283 (df=4)
>
>
>
>
> R version 2.15.1 (2012-06-22)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
>
> locale:
> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
> [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
> [5] LC_TIME=German_Germany.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] mvtnorm_0.9-9992
>
> loaded via a namespace (and not attached):
> [1] tools_2.15.1
>



-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Comment 3 Jackie Rosen 2014-02-16 11:43:23 UTC
(spam comment removed)