Bugzilla – Bug 14962

Incorrect SVD results

Last modified: 2012-07-21 09:56:34 UTC

Hi, My colleagues and I have found that R occasionally gives very incorrect svd/La.svd results for matrices that are not full rank and have a few very small eigenvalues. We've managed to reproduce the problem on every version of R between R 2.9.1 and R 2.15.1 using different versions of BLAS (ATLAS, R internal & vecLib), different gcc versions (4.1.x-4.5.x) and on both Mac (32- and 64-bit) and Linux (64-bit). We also verified the problem exists in the current (2.15.1) pre-compiled OS X binary, and on Linux compiled out-of-the-box with no added compiler flags or libraries. Normally I'd assume that SVD inaccuracies were due to BLAS bugs or an over-aggressive compiler, but that doesn't seem to be the case here given (unless all three BLAS libraries share the bug). As a linear algebra reminder, the singular values of a correlation matrix are the same as the eigenvalues, and the sum of the singular values of a NxN correlation matrix is N. The below code produces a 600x600 correlation matrix with rank 400 and checks that the sum of the singular values is 600. If the value is correct, the random seed is incremented, a new matrix is generated and tested. Every installation we've tested fails by the 7th iteration, and most by the first or second. Comments, confirmations and suggestions appreciated. Thanks, Robert PS. Version 2.15.1 was not listed as an available version in Bugzilla, so I selected R 2.15.0-patched instead. ################################ ## Test whether svd calculation is correct ## 'ev' is eigenvalues of the matrix ev <- exp(c(5.606, 5.009, 4.304, 3.954, 3.858, 3.782, 3.6, 3.504, 3.489, 3.269, 3.086, 2.817, 2.588, 2.322, 2.016, 1.67, 1.324, 0.877, 0.367, -0.127, -0.538, -0.83, -1.039, -1.289, -1.63, -1.804, -1.861, -2.221, -2.278, -2.402, -2.458, -2.578, -2.615, -2.828, -2.938, -2.97, -3.143, -3.221, -3.32, -3.417, -3.545, -3.789, -3.809, -3.944, -4.074, -4.21, -4.301, -4.375, -4.463, -4.496, -4.556, -4.578, -4.693, -4.711, -4.784, -4.844, -4.955, -5.048, -5.098, -5.214, -5.287, -5.373, -5.408, -5.442, -5.497, -5.584, -5.662, -5.749, -5.776, -5.837, -5.895, -5.935, -5.97, -6.048, -6.153, -6.168, -6.258, -6.279, -6.319, -6.331, -6.405, -6.467, -6.54, -6.574, -6.605, -6.639, -6.698, -6.72, -6.827, -6.852, -6.887, -6.948, -6.984, -7.056, -7.081, -7.098, -7.156, -7.223, -7.247, -7.311, -7.335, -7.358, -7.395, -7.433, -7.481, -7.492, -7.572, -7.591, -7.612, -7.673, -7.694, -7.748, -7.777, -7.798, -7.825, -7.848, -7.888, -7.907, -7.976, -8, -8.047, -8.083, -8.124, -8.149, -8.177, -8.205, -8.246, -8.26, -8.276, -8.314, -8.355, -8.373, -8.389, -8.401, -8.451, -8.501, -8.514, -8.572, -8.601, -8.641, -8.688, -8.707, -8.715, -8.734, -8.753, -8.768, -8.802, -8.849, -8.873, -8.899, -8.914, -8.965, -8.98, -9.007, -9.027, -9.07, -9.083, -9.127, -9.157, -9.168, -9.205, -9.215, -9.227, -9.257, -9.282, -9.308, -9.343, -9.351, -9.392, -9.408, -9.439, -9.462, -9.469, -9.494, -9.521, -9.54, -9.558, -9.593, -9.627, -9.674, -9.684, -9.702, -9.729, -9.759, -9.778, -9.79, -9.823, -9.862, -9.87, -9.885, -9.905, -9.91, -9.937, -9.962, -9.988, -9.994, -10.037, -10.051, -10.065, -10.103, -10.123, -10.142, -10.157, -10.162, -10.196, -10.216, -10.228, -10.243, -10.277, -10.295, -10.321, -10.341, -10.354, -10.385, -10.406, -10.413, -10.436, -10.446, -10.471, -10.497, -10.516, -10.529, -10.548, -10.559, -10.572, -10.603, -10.613, -10.638, -10.647, -10.653, -10.681, -10.694, -10.7, -10.733, -10.768, -10.778, -10.793, -10.818, -10.853, -10.866, -10.888, -10.896, -10.93, -10.93, -10.96, -10.972, -10.981, -10.995, -11.008, -11.037, -11.046, -11.077, -11.09, -11.108, -11.128, -11.144, -11.165, -11.177, -11.192, -11.215, -11.229, -11.263, -11.281, -11.298, -11.31, -11.335, -11.351, -11.354, -11.389, -11.4, -11.404, -11.432, -11.461, -11.472, -11.485, -11.506, -11.533, -11.56, -11.561, -11.597, -11.62, -11.633, -11.656, -11.659, -11.68, -11.693, -11.716, -11.73, -11.743, -11.767, -11.777, -11.794, -11.815, -11.844, -11.864, -11.869, -11.897, -11.907, -11.931, -11.944, -11.947, -11.984, -12.014, -12.02, -12.03, -12.069, -12.084, -12.101, -12.11, -12.134, -12.141, -12.181, -12.197, -12.21, -12.228, -12.229, -12.256, -12.275, -12.284, -12.3, -12.324, -12.341, -12.372, -12.378, -12.404, -12.426, -12.443, -12.469, -12.474, -12.484, -12.501, -12.528, -12.535, -12.549, -12.564, -12.599, -12.618, -12.644, -12.651, -12.672, -12.705, -12.715, -12.738, -12.747, -12.796, -12.824, -12.853, -12.857, -12.863, -12.899, -12.912, -12.925, -12.966, -12.981, -12.991, -13.028, -13.049, -13.077, -13.081, -13.103, -13.123, -13.142, -13.166, -13.21, -13.212, -13.223, -13.266, -13.306, -13.326, -13.358, -13.399, -13.437, -13.46, -13.49, -13.502, -13.523, -13.561, -13.626, -13.653, -13.664, -13.753, -13.788, -13.842, -13.864, -13.947, -13.955, -14.032, -14.038, -14.121, -14.154, -30.019, -30.376, -31.213, -32.392, -33.689, -33.79, -33.869, -33.893, -33.926, -34.05)) x1 <- 600 x2 <- 400 ## Create the matrix eigenvectors from a covariance matrix of random values R <- matrix(rnorm(x1*x2), x1, x2) ee <- eigen(tcrossprod(R)) ## Create the matrix eigenvectors from a covariance matrix of random values for (i in 1:10) { set.seed(i) R <- matrix(rnorm(x1*x2), x1, x2) ee <- eigen(tcrossprod(R)) ## 'cx' is a positive semi-definite matrix with eigenvalues of 'ev' cx <- ee$vectors[, 1:x2] %*% tcrossprod(diag(ev), ee$vectors[,1:x2]) ## Scale 'cx' to the corresponding correlation matrix and verify diag=1 cx <- cov2cor(cx) stopifnot(all(diag(cx)==1)) ## Calculate sum of singular values (should be 'x1') sv <- sum(svd(cx)$d) if (abs(sv-x1) > 1e-2) { cat("Seed=", i, "\n") stop("Average singular value of correlation matrix is ", round(sv, 3), ". Should be ", x1) } } #################################

I almost had this down to wrong user expectation, but then again, perhaps not... Watch this: > cx2 <- crossprod(sqrt(ev)* t(ee$vectors[,1:x2])) > cx2<-cov2cor(cx2) > range(cx-cx2) [1] -5.551115e-16 6.661338e-16 > sum(svd(cx2)$d) [1] 600 > sum(svd(cx)$d) [1] 3923.904 > sx<-La.svd(cx) > range(sx$u%*%diag(sx$d)%*%sx$vt-cx) [1] -8.249147 11.896032 > sx<-La.svd(cx2) > range(sx$u%*%diag(sx$d)%*%sx$vt-cx2) [1] -4.951595e-14 8.870682e-14 So, the computation is extremely sensitive to small changes in the input. That may just be a fact of life, but it does worry me that we are returning something that clearly isn't a decomposition, without any error indications.

The issue goes away when relinking against LAPACK 3.4.1. R's generic LAPACK is extracted from version 3.1.1 and could be upgraded with some effort (it isn't a straightforward copy). The situation when linking against OS libraries may be trickier (e.g. OSX Lion's Accelerate.framework appears to be based on 3.2.1)

(In reply to comment #2) > The issue goes away when relinking against LAPACK 3.4.1. That did it, thank you! I should have tried that right away.

(In reply to comment #1) > I almost had this down to wrong user expectation, but then again, > perhaps not... it does worry me that we are returning something > that clearly isn't a decomposition, without any error indications. Not only that, some of the singular values are negative.

Depends on the platform's LAPACK. R's has been upgraded to 3.4.1 in R-devel, but still fails in some places where the system lapack is used (e.g. OS X Lion).