Bug 14682 - getQ0() returns a non-positive covariance matrix
Summary: getQ0() returns a non-positive covariance matrix
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Analyses (show other bugs)
Version: R 3.0.0
Hardware: All Linux
: P5 normal
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2011-09-20 17:06 UTC by Raphael Rossignol
Modified: 2015-12-14 13:46 UTC (History)
2 users (show)

See Also:


Attachments
This is pure C implementation of the getQ0bis. (5.14 KB, patch)
2013-05-10 16:01 UTC, Matwey V. Kornilov
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Raphael Rossignol 2011-09-20 17:06:15 UTC
The function makeARIMA(), designed to construct some state space representation of an ARIMA model, uses a C function called getQ0, which can be found at the end of arima.c in R source files (library stats). getQ0 takes two arguments, phi and theta, and returns the covariance matrix of the state prediction error at time zero. The reference for getQ0 (cited by help(arima)) is:
Gardner, G, Harvey, A. C. and Phillips, G. D. A. (1980) Algorithm
     AS154. An algorithm for exact maximum likelihood estimation of
     autoregressive-moving average models by means of Kalman filtering.
     _Applied Statistics_ *29*, 311-322.
where it is called subroutine STARMA (and coded in fortran 77).

My problem is that getQ0 returns incorrect covariance matrices in certain cases. Indeed, below is an example of a SARIMA(1,0,1)x(1,0,0)_12 where getQ0 gives a covariance matrix which possess negative eigenvalues ! It seems also that the answer may differ from one machine to another. 
Example: the following instructions:

 s <- 12
 phis <- 0.99
 phi1 <- 0.0001
 phi <- c(phi1,rep(0,s-2),phis,-phi1*phis)
 theta <- 0.7
 out <- makeARIMA(phi,theta,NULL)
 min(eigen(out$Pn)$value)

return a negative value on three different machines.
1) It gives:
[1] -83.45901

on Ubuntu 10.04.3 LTS 64 bits, with an Intel(R) Core(TM) i7 CPU 870 with sessionInfo():
R version 2.10.1 (2009-12-14) 
x86_64-pc-linux-gnu 

locale:
 [1] LC_CTYPE=fr_FR.utf8       LC_NUMERIC=C             
 [3] LC_TIME=fr_FR.utf8        LC_COLLATE=fr_FR.utf8    
 [5] LC_MONETARY=C             LC_MESSAGES=fr_FR.utf8   
 [7] LC_PAPER=fr_FR.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=fr_FR.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

2) It gives:

[1] -53.60753

on a Debian 5.0.8 (32 bits) with an Intel Xeon 5150 with sessionInfo():
R version 2.7.1 (2008-06-23) 
i486-pc-linux-gnu 

locale:
LC_CTYPE=fr_FR.utf8;LC_NUMERIC=C;LC_TIME=fr_FR.utf8;LC_COLLATE=fr_FR.utf8;LC_MONETARY=C;LC_MESSAGES=fr_FR.utf8;LC_PAPER=fr_FR.utf8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=fr_FR.utf8;LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     


3) It gives also:
[1] -53.60753

on Ubuntu 10.04.3 LTS on a Dell latitude D430 (32 bits, Intel(R) Core(TM)2 CPU U7600) with sessionInfo():
R version 2.10.1 (2009-12-14) 
i486-pc-linux-gnu 

locale:
 [1] LC_CTYPE=fr_FR.utf8       LC_NUMERIC=C             
 [3] LC_TIME=fr_FR.utf8        LC_COLLATE=fr_FR.utf8    
 [5] LC_MONETARY=C             LC_MESSAGES=fr_FR.utf8   
 [7] LC_PAPER=fr_FR.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=fr_FR.utf8 LC_IDENTIFICATION=C      

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base  


--------
Above, I obtained getQ0 results through makeARIMA(), but of course, the result is the same invoking "out <- .Call(stats:::R_getQ0,phi,theta)" instead of "out <- makeARIMA(phi,theta,NULL)$Pn".

There are serious consequences of this "bug" in the functions KalmanLike() and arima(). Indeed, arima() in its default behaviour uses first CSS method to get the initial value for an MLE search through optim. To compute the likelihood, it uses getQ0 at the initialization of the Kalman Filter. It may happen that getQ0 returns a covariance matrix which possess negative eigenvalues and that this gives a negative gain in the Kalman filter, which in turn gives a likelihood equal to NaN. Here is a reproducible example where I forced the use of 'ML'.
> set.seed(1)
> x <- arima.sim(100,model=list(ar=phi,ma=theta))
> KalmanLike(x,mod=out,fast=FALSE)
$Lik
ssq 
NaN 

$s2
     ssq 
0.8482354 

> arima(x,order=c(1,0,1),seasonal=list(period=12,order=c(1,0,0)),include.mean=FALSE,init=c(phi1,theta,phis),method='ML')

Erreur dans optim(init[mask], armafn, method = optim.method, hessian = TRUE,  : 
  valeur non-finie fournie par optim

This error message ("Error in optim ... non-finite finite-difference value") was already noted in the following message, which remained without answer:
https://stat.ethz.ch/pipermail/r-devel/2009-February/052009.html

Below are my attempts to resolve the problem.

I tried to replace getQ0 in two ways. The first one is to compute first the covariance matrix of (X_{t-1},...,X_{t-p},Z_t,...,Z_{t-q}) and this is achieved through the method of difference equations (page 93 of Brockwell and Davis). This way was apparently suggested by a referee to Gardner et al. paper (see page 314 of their paper).
Q0bis <- function(phi,theta){
  ## Computes the initial covariance matrix for the state space representation of Gardner et al.
  p <- length(phi)
  q <- length(theta)
  r <- max(p,q+1)
  ttheta <- c(1,theta,rep(0,max(p,q+1)-q-1))
  A1 <- matrix(0,r,p)
  B <- (col(A1)+row(A1)<=p+1)
  C <- (col(A1)+row(A1)-1)
  A1[B] <- phi[C[B]]
  A2 <- matrix(0,r,q+1)
  C <- (col(A2)+row(A2)-1)
  B <- (C<=q+1)
  A2[B] <- ttheta[C[B]]
  A <- cbind(A1,A2) 
  if (p==0) {
    S <- diag(q+1)
  }  
  else {
    ## Compute the autocovariance function of U, the AR part of X
    r2 <- max(p+q,p+1)
    tphi <- c(1,-phi)
    C1 <- matrix(0,r2,r2)
    F <- row(C1)-col(C1)+1
    E <- (1<=F)&(F<=p+1)
    C1[E] <- tphi[F[E]]
    C2 <- matrix(0,r2,r2)
    F <- col(C2)+row(C2)-1
    E <- (F<=p+1) & col(C2)>=2
    C2[E] <- tphi[F[E]]
    Gam <- (C1+C2) 
    g <- matrix(0,r2,1)
    g[1] <- 1
    rU <- solve(Gam,g)
    SU <- toeplitz(rU[1:(p+q),1])
    ## End of the difference equations method
    ## Then, compute correlation matrix of X
    A2 <- matrix(0,p,p+q)
    C <- col(A2)-row(A2)+1
    B <- (1<=C)&(C<=q+1)
    A2[B] <- ttheta[C[B]]
    SX <- A2%*%SU%*%t(A2)
    ## Now, compute correlation matrix between X and Z    
    C1 <- matrix(0,q,q)
    F <- row(C1)-col(C1)+1
    E <- (F>=1) & (F<=p+1)
    C1[E] <- tphi[F[E]]
    g <- matrix(0,q,1)
    if (q) g[1:q,1] <- ttheta[1:q]
    rXZ <- forwardsolve(C1,g)
    SXZ <- matrix(0, p, q+1)
    F <- col(SXZ)-row(SXZ)
    E <- F>=1
    SXZ[E] <- rXZ[F[E]]
    S <- rbind(cbind(SX,SXZ),cbind(t(SXZ),diag(q+1)))
  }
  Q0 <- A%*%S%*%t(A)
}

The second way is to resolve brutally the equation of Gardner et al. in the form (12), page 314 of their paper. 

Q0ter <- function(phi,theta){
  p <- length(phi)
  q <- length(theta)
  r <- max(p,q+1)
  T <- matrix(0,r,r)
  if (p) T[1:p,1] <- phi
  if (r) T[1:(r-1),2:r] <- diag(r-1)
  V <- matrix(0,r,r)
  ttheta <- c(1,theta)
  V[1:(q+1),1:(q+1)] <- ttheta%x%t(ttheta)
  V <- matrix(V,ncol=1)
  S <- diag(r*r)-T%x%T
  Q0 <- solve(S,V)
  Q0 <- matrix(Q0,ncol=r)
}

My conclusion for now is that these two other ways give the same answer (as returned by all.equal) and this answer is sometimes different than the one returned by getQ0. It may happen that they give a Q0 with negative eigenvalues, but they are very very small, and then, the likelihood computed by KalmanLike is a number (and not NaN). Here is a function allowing to compare the three methods
test <- function(phi,theta){
  out <- makeARIMA(phi,theta,NULL)
  Q0bis <- Q0bis(phi,theta)
  Q0ter <- Q0ter(phi,theta)
  mod <- out
  modbis <- out
  modter <- out
  modbis$Pn <- Q0bis
  modter$Pn <- Q0ter
  set.seed(1)
  x <- arima.sim(100,model=list(ar=phi,ma=theta))
  s <- KalmanLike(x,mod=mod,fast=FALSE)
  sbis <- KalmanLike(x,modbis)
  ster <- KalmanLike(x,modter)
  test12 <- all.equal(out$Pn,Q0bis)
  test13 <- all.equal(out$Pn,Q0ter)
  test23 <- all.equal(Q0bis,Q0ter)
  list(eigen=min(eigen(out$Pn)$value),eigenbis=min(eigen(Q0bis)$value),eigenter=min(eigen(Q0ter)$value),test12=test12,test13=test13,test23=test23,s=s,sbis=sbis,ster=ster)
}

Here is the result of the test with the values of phi and theta above, on the Dell latitude listed above:
> test(phi,theta)
$eigen
[1] -53.60753

$eigenbis
[1] 8.271806e-24

$eigenter
[1] 1.136868e-13

$test12
[1] "Mean relative difference: 0.2892978"

$test13
[1] "Mean relative difference: 0.2892978"

$test23
[1] TRUE

$s
$s$Lik
ssq 
NaN 

$s$s2
      ssq 
0.8482354 


$sbis
$sbis$Lik
      ssq 
0.1646761 

$sbis$s2
      ssq 
0.8631637 


$ster
$ster$Lik
      ssq 
0.1646761 

$ster$s2
      ssq 
0.8631637 

The results are extremely similar on the other 32-bits system listed above, and similar on the 64-bits system. 

By the way, Q0bis is only twice slower than makeARIMA() (but it is not optimized), and Q0ter is 50 times slower than Q0bis. Then, if the numerical instability of getQ0() is confirmed, I would suggest to trade the algorithm for the one in Q0bis().

Regards,

Raphael Rossignol

Assistant Professor of Mathematics
Univ. Grenoble 1
Comment 1 Brian Ripley 2011-11-04 12:53:01 UTC
There was nothing reproducible in the referenced posting.
Comment 2 Raphael Rossignol 2011-11-09 09:32:49 UTC
Maybe my post was too long ... I apologize.
To be more concise, on all 32-bits systems I tried it (four different machines, all with linux ubuntu and R's version ranging from 2.7 to 2.11), the following code returned a negative value (namely -53.60753)

 s <- 12
 phis <- 0.99
 phi1 <- 0.0001
 phi <- c(phi1,rep(0,s-2),phis,-phi1*phis)
 theta <- 0.7
 out <- makeARIMA(phi,theta,NULL)
 min(eigen(out$Pn)$value)

Mr Ripley, you say that nothing is reproducible. Does that mean that you obtain a non-negative answer to the lines of code above ? Could you please tell us what do you obtain on your own computer, what is your operating system and what R version are you using, so that we can understand what happens here ?

And is there anyone in the R-core team obtaining the same result as me ?

Raphael Rossignol
Comment 3 Martin Maechler 2011-11-10 19:53:09 UTC
>>>>>   <r-bugs@r-project.org>
>>>>>     on Wed, 9 Nov 2011 03:32:51 -0500 writes:


    > https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14682
    > --- Comment #2 from Raphael Rossignol
    > <raphaelrossignol@free.fr> 2011-11-09 03:32:49 EST ---
    > Maybe my post was too long ... I apologize.  To be more
    > concise, on all 32-bits systems I tried it (four different
    > machines, all with linux ubuntu and R's version ranging
    > from 2.7 to 2.11), the following code returned a negative
    > value (namely -53.60753)

 > s <- 12
 > phis <- 0.99
 > phi1 <- 0.0001
 > phi <- c(phi1,rep(0,s-2),phis,-phi1*phis)
 > theta <- 0.7
 > out <- makeARIMA(phi,theta,NULL)
 > min(eigen(out$Pn)$value)


    > Mr Ripley, you say that nothing is reproducible. Does that
    > mean that you obtain a non-negative answer to the lines of
    > code above ? Could you please tell us what do you obtain
    > on your own computer, what is your operating system and
    > what R version are you using, so that we can understand
    > what happens here ?

    > And is there anyone in the R-core team obtaining the same
    > result as me ?

Yes, I seem to get the same as you:

-53.6.   on 32-bit Linux (Fedora F15)
-85...  on 64-bit Linux (Fedora F15)

in more details:

## on 64-bit :
> round(eigen(out$Pn)$values, 2)

 [1] 230.86 230.21 116.55  95.07  95.01  73.67  73.46  52.23  51.85  30.53 
[11]   0.00 -83.38 -83.46

## on 32-bit :
>  round(eigen(out$Pn)$values, 2)

 [1] 200.65 137.31 136.87  83.53  82.92  73.65  73.47  63.97  63.91   9.98
[11]   9.94   0.00 -53.61

So I think we can confirm the behavior you are seeing.

Indeed, the example is relatively close to the stationarity border,
and the reciprocal condition number

  > rcond(out$Pn) 
  [1] 4.494902e-06 
  > # for 64-bit  
  and   2.77..e-6   for 32-bit.

Also suggest that we are close to where one expects numerical
problems.
OTOH, I think you demonstrated alternative code which produced
"better"  such  var/cov - matrices,
so I tend to agree that indeed the current algorithm can be
improved.

To do so in a reliable way which is uniformly better (i.e. not
worse) than the current code is quite a bit harder though...
and I cannot volunteer to get involved here for the next few weeks.

Ideally, however I think "we" (the R users and developers interested
and knowledgable on the topic) should start using e-mail or
another more convenient way to discuss and resolve the issue,
rather than the R-bugs interface...
even though you were okay to use it to get started.

Best regards,
Martin Maechler


Comment 4 Brian Ripley 2011-11-11 12:30:18 UTC
I did *not* say this report was not reproducible, and did *not* mark it as such.

I commented
'There was nothing reproducible in the referenced posting'

Professor Ripley
Comment 5 Matwey V. Kornilov 2013-05-10 16:01:33 UTC
Created attachment 1448 [details]
This is pure C implementation of the getQ0bis.

Hi, the bug is annoying me, so I've implemented getQ0bis in C, it is ready to replace initial getQ0 implementation in R stats.
Comment 6 Martin Maechler 2014-01-15 10:41:49 UTC
(In reply to Matwey V. Kornilov from comment #5)
> Created attachment 1448 [details]
> This is pure C implementation of the getQ0bis.
> 
> Hi, the bug is annoying me, so I've implemented getQ0bis in C, it is ready
> to replace initial getQ0 implementation in R stats.

Thank you Matwey... your C code had one logical error  which gave "random" results occasionally ... and took me more time to find and fix than it should have  {rrx[.] was indexed with *negative* indices, for q >= 1}.

I have added (a slightly modified version of) your C code, but in the R interface allowed both now, with a new argument  
SSinit (:= State-Space initialization).

This is in R-devel  svn revision  64777.
Also, I've added   src/library/stats/tests/arimaML.R
which contains Rossignol's R functions Q0bis() and Q0ter()  [also slightly edited], his example above and some systematic variations of his example.
mailto:matwey.kornilov@gmail.com
?KalmanLike  in its  "Details" now has

     The state-space initialization has used Gardner et al's method
     (‘SSinit = "Gardner1980"’), as only method for years.  However,
     that suffers sometimes from deficiencies when close to
     non-stationarity.  For this reason, it may be replaced as default
     in the future and only kept for reproducibility reasons.  Explicit
     specification of ‘SSinit’ is therefore recommended, notably also
     in ‘arima()’.
     ............................

so I do mention that we may replace the default by the new method.

With thanks to Raphael Rossignol and Matvey Kornilov.
Comment 7 Matwey V. Kornilov 2014-01-15 11:13:32 UTC
(In reply to Martin Maechler from comment #6)
> Thank you Matwey... your C code had one logical error  which gave "random"
> results occasionally ... and took me more time to find and fix than it
> should have  {rrx[.] was indexed with *negative* indices, for q >= 1}.

Argh, sorry for that.

> I have added (a slightly modified version of) your C code, but in the R
> interface allowed both now, with a new argument  
> SSinit (:= State-Space initialization).
> 
> This is in R-devel  svn revision  64777.
> Also, I've added   src/library/stats/tests/arimaML.R

Really great, thank you.