14438
2010-11-18 14:26:35 +0000
qr.X has wrong column names with pivoting
2010-11-23 08:10:07 +0000
1
1
1
Unclassified
R
Analyses
R 2.12.0
ix86 (32-bit)
Linux
CLOSED
FIXED
P5
trivial
---
1
jari.oksanen
R-core
oldest_to_newest
84917
0
jari.oksanen
2010-11-18 14:26:35 +0000
qr.X() mixes column names when there is pivoting. The returned data columns are in the same order as in the original data, but the column names are for pivoted columns.
A simple example. First we create a tiny matrix with duplicated columns to guarantee redundant data and pivoting:
> X <- matrix(1:10, ncol=2)
> X <- X[, c(1,1,2,2)]
> colnames(X) <- c("X1","Dup1", "X2", "Dup2")
> X
     X1 Dup1 X2 Dup2
[1,]  1    1  6    6
[2,]  2    2  7    7
[3,]  3    3  8    8
[4,]  4    4  9    9
[5,]  5    5 10   10
Here X1 is equal to Dup1, and X2 equal to Dup2.
Then the QR decomposition:
> Q <- qr(X)
> Q$rank
[1] 2
> Q$pivot
[1] 1 3 2 4
All OK, but here qr.X():
> qr.X(Q)
     X1 X2 Dup1 Dup2
[1,]  1  1    6    6
[2,]  2  2    7    7
[3,]  3  3    8    8
[4,]  4  4    9    9
[5,]  5  5   10   10
Here the column data are in the original order of X, but the column names were reordered: column 'X2' actually contains the data of 'Dup1' (and column 'Dup1' mirrors the confusion).
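A possible user-level workaround, sketched here under the assumption that the default LINPACK qr() stores the column names of Q$qr in pivoted order, is to undo that permutation and relabel the result of qr.X() by hand. The names Xhat and nm are only illustrative:

```r
# Rebuild the example matrix from this report.
X <- matrix(1:10, ncol = 2)
X <- X[, c(1, 1, 2, 2)]
colnames(X) <- c("X1", "Dup1", "X2", "Dup2")

Q <- qr(X)        # default method; pivoting occurs because of the duplicates
Xhat <- qr.X(Q)   # data come back in the original column order

# Assumption: colnames(Q$qr) are in pivoted order, so assigning them
# through Q$pivot puts each name back at its original position.
nm <- character(ncol(Xhat))
nm[Q$pivot] <- colnames(Q$qr)
colnames(Xhat) <- nm

# Labels and data should now agree with the input matrix.
stopifnot(identical(colnames(Xhat), colnames(X)),
          isTRUE(all.equal(Xhat, X, check.attributes = FALSE)))
```

Because the names are taken from the qr object itself, this relabelling should be harmless on versions where qr.X() already returns the names in the right order.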
Obviously this happens at the end of the function qr.X. Here is the end of the file trunk/src/library/base/R/qr.R (from rev 53629):
    if(pivoted) # res may have more columns than length(qr$pivot)
        res[, qr$pivot] <- res[, ip]
    res
}
Where
res[, qr$pivot] <- res[, ip]
reorders column data without touching the original column names.
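The effect can be reproduced outside qr.X with a small stand-in matrix. This is only a sketch: res, pivot and ip mimic the names used in qr.R, but the values are made up for illustration, and the last step shows one way the names could be permuted along with the data, not necessarily the change that should go into R:

```r
# A small named matrix standing in for 'res' inside qr.X.
res <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))
pivot <- c(1L, 3L, 2L)     # hypothetical pivot vector
ip <- seq_along(pivot)

before <- colnames(res)
res[, pivot] <- res[, ip]  # moves the column data only
stopifnot(identical(colnames(res), before))  # the names did not move

# Permuting the names the same way keeps labels and data together.
colnames(res)[pivot] <- before[ip]
```

After the last line the column labelled "b" again holds the values it was created with, which is the invariant that qr.X breaks when pivoting occurs.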
Cheers, Jari Oksanen
84939
1
ripley
2010-11-23 08:10:07 +0000
changed in 2.12.0 patched.