Bug 16638

Summary: Spearman correlation occasionally returns values below -1
Product: R
Reporter: Keith Hughitt <keith.hughitt>
Component: Accuracy
Assignee: R-core <R-core>
Status: CLOSED FIXED
Severity: normal
CC: murdoch
Priority: P5    
Version: R-devel (trunk)   
Hardware: x86_64/x64/amd64 (64-bit)   
OS: Linux   
Attachments: Small reproducible example matrix.

Description Keith Hughitt 2015-12-19 20:47:33 UTC
Created attachment 1948
Small reproducible example matrix.

While working with a Spearman-based correlation matrix, I noticed that some of the values of the matrix were outside of the range [-1, 1].

Upon further inspection, I found some values that at first glance appeared to be equal to -1, but actually were slightly lower.

I've attached a small matrix which can be used to reproduce the issue:

> load("test.RData")
> mat
         6961        6962       6963       6964       6966       6967       6968       6969
a  0.44052070 -0.44598314 -1.4587333  1.4641958 -0.7586740 -1.0223099  1.0590426  0.7219413
b -0.19429487  0.29103209  0.8106627 -0.9073999  0.3305041  0.3665862 -0.4325788 -0.2645114
c -0.03715713  0.09068725  0.4146191 -0.4681492  0.1675740  0.4106130 -0.3456950 -0.2324920
> sprintf("%0.50f", min(cor(t(mat), method='spearman')))
[1] "-1.00000000000000022204460492503130808472633361816406"

R Under development (unstable) (2015-12-04 r69737)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] setwidth_1.0-4 colorout_1.1-0

This was reproducible across several recent versions of R running on different Linux systems.
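
For readers without the attachment, the following is a minimal sketch that rebuilds the matrix from the values printed above. It assumes the printed (rounded) digits are sufficient, which should hold because Spearman's coefficient depends only on the within-row ordering. The variable names `ranks`, `rank_sums` and `same_bc` are illustrative only. It also shows why exactly -1 is the expected minimum: the ranks of row a are the mirror image of the ranks of rows b and c.

```r
# Values copied from the printed matrix above (rounded to the digits shown);
# rounding does not change the within-row ordering, so the ranks are unchanged.
mat <- matrix(
  c( 0.44052070, -0.44598314, -1.4587333,  1.4641958, -0.7586740, -1.0223099,  1.0590426,  0.7219413,
    -0.19429487,  0.29103209,  0.8106627, -0.9073999,  0.3305041,  0.3665862, -0.4325788, -0.2645114,
    -0.03715713,  0.09068725,  0.4146191, -0.4681492,  0.1675740,  0.4106130, -0.3456950, -0.2324920),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("a", "b", "c"),
                  c("6961", "6962", "6963", "6964", "6966", "6967", "6968", "6969"))
)

# Rank each row; apply() returns one column per row of mat, so transpose back.
ranks <- t(apply(mat, 1, rank))

# Row a mirrors row b: in every column the two ranks sum to n + 1 = 9.
rank_sums <- ranks["a", ] + ranks["b", ]

# Rows b and c have identical ranks.
same_bc <- identical(ranks["b", ], ranks["c", ])

# The exact Spearman coefficients are therefore -1 (a vs b, a vs c) and +1 (b vs c).
# On an affected build the computed minimum can fall just below -1.
rho <- cor(t(mat), method = "spearman")
sprintf("%0.20f", min(rho))
```

The exact digits printed by the last line depend on the R build, so no expected output is given here.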
Comment 1 Duncan Murdoch 2015-12-22 17:18:02 UTC
I'll fix this to guarantee that cor() always returns a value in the [-1, 1] interval.
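
On builds that predate the fix, a user-side workaround is to clamp the result. The sketch below is not the patch committed to R-devel, which is not shown in this report; `clamp_cor` is a hypothetical helper name, and the test matrix is constructed by hand to mimic the value reported above.

```r
# Hypothetical helper (not part of R): clamp correlations into [-1, 1].
clamp_cor <- function(r) {
  # Assigning into r[] keeps dim and dimnames; NA entries stay NA.
  r[] <- pmax(-1, pmin(1, r))
  r
}

# A hand-built matrix whose off-diagonal entries sit one representable
# step below -1, the same magnitude of error as in the report above.
r_bad <- matrix(c(1, -1 - 2^-52, -1 - 2^-52, 1), nrow = 2)
r_ok <- clamp_cor(r_bad)
```

Clamping matters mainly when the coefficient feeds a function that is only defined on [-1, 1]: for example, `acos()` returns NaN for the unclamped off-diagonal value in `r_bad`.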
Comment 2 Duncan Murdoch 2015-12-22 21:36:50 UTC
Now committed in R-devel.
Comment 3 Keith Hughitt 2015-12-23 12:11:58 UTC
Thanks for the quick fix!