Bug 14962 - Incorrect SVD results
Incorrect SVD results
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Accuracy
R 2.15.0 patched
All All
: P5 major
Assigned To: R-core
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-27 00:45 UTC by Robert McGehee
Modified: 2012-07-21 09:56 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Robert McGehee 2012-06-27 00:45:35 UTC
Hi,
My colleagues and I have found that R occasionally gives very incorrect svd/La.svd results for matrices that are not full rank and have a few very small eigenvalues. We've managed to reproduce the problem on every version of R between R 2.9.1 and R 2.15.1 using different versions of BLAS (ATLAS, R internal & vecLib), different gcc versions (4.1.x-4.5.x) and on both Mac (32- and 64-bit) and Linux (64-bit). We also verified the problem exists in the current (2.15.1) pre-compiled OS X binary, and on Linux compiled out-of-the-box with no added compiler flags or libraries. Normally I'd assume that SVD inaccuracies were due to BLAS bugs or an over-aggressive compiler, but that doesn't seem to be the case here given (unless all three BLAS libraries share the bug).

As a linear algebra reminder, the singular values of a correlation matrix are the same as the eigenvalues, and the sum of the singular values of a NxN correlation matrix is N. The below code produces a 600x600 correlation matrix with rank 400 and checks that the sum of the singular values is 600. If the value is correct, the random seed is incremented, a new matrix is generated and tested. Every installation we've tested fails by the 7th iteration, and most by the first or second.

Comments, confirmations and suggestions appreciated.
Thanks, Robert

PS. Version 2.15.1 was not listed as an available version in Bugzilla, so I selected R 2.15.0-patched instead.

################################
## Test whether svd calculation is correct
## 'ev' is eigenvalues of the matrix
ev <- exp(c(5.606, 5.009, 4.304, 3.954, 3.858, 3.782, 3.6, 3.504, 3.489,
            3.269, 3.086, 2.817, 2.588, 2.322, 2.016, 1.67, 1.324, 0.877,
            0.367, -0.127, -0.538, -0.83, -1.039, -1.289, -1.63, -1.804,
            -1.861, -2.221, -2.278, -2.402, -2.458, -2.578, -2.615, -2.828,
            -2.938, -2.97, -3.143, -3.221, -3.32, -3.417, -3.545, -3.789,
            -3.809, -3.944, -4.074, -4.21, -4.301, -4.375, -4.463, -4.496,
            -4.556, -4.578, -4.693, -4.711, -4.784, -4.844, -4.955, -5.048,
            -5.098, -5.214, -5.287, -5.373, -5.408, -5.442, -5.497, -5.584,
            -5.662, -5.749, -5.776, -5.837, -5.895, -5.935, -5.97, -6.048,
            -6.153, -6.168, -6.258, -6.279, -6.319, -6.331, -6.405, -6.467,
            -6.54, -6.574, -6.605, -6.639, -6.698, -6.72, -6.827, -6.852,
            -6.887, -6.948, -6.984, -7.056, -7.081, -7.098, -7.156, -7.223,
            -7.247, -7.311, -7.335, -7.358, -7.395, -7.433, -7.481, -7.492,
            -7.572, -7.591, -7.612, -7.673, -7.694, -7.748, -7.777, -7.798,
            -7.825, -7.848, -7.888, -7.907, -7.976, -8, -8.047, -8.083, -8.124,
            -8.149, -8.177, -8.205, -8.246, -8.26, -8.276, -8.314, -8.355,
            -8.373, -8.389, -8.401, -8.451, -8.501, -8.514, -8.572, -8.601,
            -8.641, -8.688, -8.707, -8.715, -8.734, -8.753, -8.768, -8.802,
            -8.849, -8.873, -8.899, -8.914, -8.965, -8.98, -9.007, -9.027,
            -9.07, -9.083, -9.127, -9.157, -9.168, -9.205, -9.215, -9.227,
            -9.257, -9.282, -9.308, -9.343, -9.351, -9.392, -9.408, -9.439,
            -9.462, -9.469, -9.494, -9.521, -9.54, -9.558, -9.593, -9.627,
            -9.674, -9.684, -9.702, -9.729, -9.759, -9.778, -9.79, -9.823,
            -9.862, -9.87, -9.885, -9.905, -9.91, -9.937, -9.962, -9.988,
            -9.994, -10.037, -10.051, -10.065, -10.103, -10.123, -10.142,
            -10.157, -10.162, -10.196, -10.216, -10.228, -10.243, -10.277,
            -10.295, -10.321, -10.341, -10.354, -10.385, -10.406, -10.413,
            -10.436, -10.446, -10.471, -10.497, -10.516, -10.529, -10.548,
            -10.559, -10.572, -10.603, -10.613, -10.638, -10.647, -10.653,
            -10.681, -10.694, -10.7, -10.733, -10.768, -10.778, -10.793,
            -10.818, -10.853, -10.866, -10.888, -10.896, -10.93, -10.93,
            -10.96, -10.972, -10.981, -10.995, -11.008, -11.037, -11.046,
            -11.077, -11.09, -11.108, -11.128, -11.144, -11.165, -11.177,
            -11.192, -11.215, -11.229, -11.263, -11.281, -11.298, -11.31,
            -11.335, -11.351, -11.354, -11.389, -11.4, -11.404, -11.432,
            -11.461, -11.472, -11.485, -11.506, -11.533, -11.56, -11.561,
            -11.597, -11.62, -11.633, -11.656, -11.659, -11.68, -11.693,
            -11.716, -11.73, -11.743, -11.767, -11.777, -11.794, -11.815,
            -11.844, -11.864, -11.869, -11.897, -11.907, -11.931, -11.944,
            -11.947, -11.984, -12.014, -12.02, -12.03, -12.069, -12.084,
            -12.101, -12.11, -12.134, -12.141, -12.181, -12.197, -12.21,
            -12.228, -12.229, -12.256, -12.275, -12.284, -12.3, -12.324,
            -12.341, -12.372, -12.378, -12.404, -12.426, -12.443, -12.469,
            -12.474, -12.484, -12.501, -12.528, -12.535, -12.549, -12.564,
            -12.599, -12.618, -12.644, -12.651, -12.672, -12.705, -12.715,
            -12.738, -12.747, -12.796, -12.824, -12.853, -12.857, -12.863,
            -12.899, -12.912, -12.925, -12.966, -12.981, -12.991, -13.028,
            -13.049, -13.077, -13.081, -13.103, -13.123, -13.142, -13.166,
            -13.21, -13.212, -13.223, -13.266, -13.306, -13.326, -13.358,
            -13.399, -13.437, -13.46, -13.49, -13.502, -13.523, -13.561,
            -13.626, -13.653, -13.664, -13.753, -13.788, -13.842, -13.864,
            -13.947, -13.955, -14.032, -14.038, -14.121, -14.154, -30.019,
            -30.376, -31.213, -32.392, -33.689, -33.79, -33.869, -33.893,
            -33.926, -34.05))

x1 <- 600
x2 <- 400
## Create the matrix eigenvectors from a covariance matrix of random values
R <- matrix(rnorm(x1*x2), x1, x2)
ee <- eigen(tcrossprod(R))

## Create the matrix eigenvectors from a covariance matrix of random values
for (i in 1:10) {
    set.seed(i)
    R <- matrix(rnorm(x1*x2), x1, x2)
    ee <- eigen(tcrossprod(R))

    ## 'cx' is a positive semi-definite matrix with eigenvalues of 'ev'
    cx <- ee$vectors[, 1:x2] %*% tcrossprod(diag(ev), ee$vectors[,1:x2])
    ## Scale 'cx' to the corresponding correlation matrix and verify diag=1
    cx <- cov2cor(cx)
    stopifnot(all(diag(cx)==1))

    ## Calculate sum of singular values (should be 'x1')
    sv <- sum(svd(cx)$d)
    if (abs(sv-x1) > 1e-2) {
        cat("Seed=", i, "\n")
        stop("Average singular value of correlation matrix is ", round(sv, 3), ". Should be ", x1)
    }
}
#################################
Comment 1 Peter Dalgaard 2012-06-27 12:53:51 UTC
I almost had this down to wrong user expectation, but then again, perhaps not...

Watch this:


> cx2 <- crossprod(sqrt(ev)* t(ee$vectors[,1:x2]))
> cx2<-cov2cor(cx2)
> range(cx-cx2)
[1] -5.551115e-16  6.661338e-16
> sum(svd(cx2)$d)
[1] 600
> sum(svd(cx)$d)
[1] 3923.904

> sx<-La.svd(cx)
> range(sx$u%*%diag(sx$d)%*%sx$vt-cx)
[1] -8.249147 11.896032

> sx<-La.svd(cx2)
> range(sx$u%*%diag(sx$d)%*%sx$vt-cx2)
[1] -4.951595e-14  8.870682e-14

So, the computation is extremely sensitive to small changes in the input. That may just be a fact of life, but it does worry me that we are returning something that clearly isn't a decomposition, without any error indications.
Comment 2 Peter Dalgaard 2012-06-28 13:24:04 UTC
The issue goes away when relinking against LAPACK 3.4.1.

R's generic LAPACK is extracted from version 3.1.1 and could be upgraded with some effort (it isn't a straightforward copy). The situation when linking against OS libraries may be trickier (e.g. OSX Lion's Accelerate.framework appears to be based on 3.2.1)
Comment 3 Robert McGehee 2012-06-28 20:00:08 UTC
(In reply to comment #2)
> The issue goes away when relinking against LAPACK 3.4.1.

That did it, thank you! I should have tried that right away.
Comment 4 Robert Almgren 2012-07-12 00:43:34 UTC
(In reply to comment #1)
> I almost had this down to wrong user expectation, but then again,
> perhaps not... it does worry me that we are returning something
> that clearly isn't a decomposition, without any error indications.

Not only that, some of the singular values are negative.
Comment 5 Brian Ripley 2012-07-21 09:56:34 UTC
Depends on the platform's LAPACK.

R's has been upgraded to 3.4.1 in R-devel, but still fails in some places where the system lapack is used (e.g. OS X Lion).