Bug 16672 - quantile produces decreasing output
Summary: quantile produces decreasing output
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: Language
Version: R 3.2.3
Hardware: x86_64/x64/amd64 (64-bit) Windows 64-bit
Importance: P5 normal
Assignee: R-core
Reported: 2016-01-18 00:52 UTC by Andre Mikulec
Modified: 2017-01-04 21:15 UTC

Attachments
Fix attempt: not use arithmetic if interpolated values are equal (1.19 KB, patch)
2016-03-27 14:07 UTC, Suharto Anggono

Description Andre Mikulec 2016-01-18 00:52:36 UTC
stats::quantile produces output that 
starts going up, then goes down, then resumes going up again 

options(width = 60)     
options(digits = 22) 
options(max.print=99999)
options(scipen=255) 
options(warn=2)

x <- c(NA, 10.5999999999999996, NA, NA, NA, 10.5999999999999996,
 NA, NA, NA, NA, NA, 11.3000000000000007, NA, NA,
NA, NA, NA, NA, NA, 5.2000000000000002)

> print(x)
 [1]                  NA 10.5999999999999996
 [3]                  NA                  NA
 [5]                  NA 10.5999999999999996
 [7]                  NA                  NA
 [9]                  NA                  NA
[11]                  NA 11.3000000000000007
[13]                  NA                  NA
[15]                  NA                  NA
[17]                  NA                  NA
[19]                  NA  5.2000000000000002


options(width = 30)

y <-  quantile(x = x, probs = seq(0,1,1/10), na.rm = TRUE)

names(y) <- NULL # easier to see output

> print(y)
 [1]  5.2000000000000002
 [2]  6.8200000000000003
 [3]  8.4399999999999995
 [4] 10.0600000000000005
 [5] 10.6000000000000014 # (CURRENT) HIGH VALUE
 [6] 10.5999999999999996 # STARTS DECREASING
 [7] 10.5999999999999996 # SAME
 [8] 10.6699999999999999 # RE-BEGINS INCREASING
 [9] 10.8799999999999990
[10] 11.0899999999999999
[11] 11.3000000000000007

> is.unsorted(y) # decreases somewhere
[1] TRUE

> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics
[3] grDevices utils
[5] datasets  methods
[7] base

Andre Mikulec
Andre_Mikulec@Hotmail.com
Comment 1 Andre Mikulec 2016-01-20 00:17:24 UTC
A possible fix (in pseudo-code) may be the following:

loop over the x vector values (position index is [i], starting at i = 2)

  if (x[i-1] <= x[i]) then skip ahead to the next iteration
   # otherwise
   set x[i] <- x[i-1]   # do not bother to decrease
                        # just keep the previous high
      
Andre Mikulec
Andre_Mikulec@Hotmail.com
Comment 2 Andre Mikulec 2016-01-30 18:29:35 UTC
I do not (currently) have a package 'base' example.

But anyway, here is how I am (currently) patching it in my code.

vec_interval <- y
zoo::rollapply(data = vec_interval
   , width = length(vec_interval)
   , FUN = function(x) { 
      max(x) 
     }
   , partial = TRUE, align = "right") -> vec_interval

 [1]  5.2000000000000002
 [2]  6.8200000000000003
 [3]  8.4399999999999995
 [4] 10.0600000000000005
 [5] 10.6000000000000014 # (CURRENT) HIGH VALUE
 [6] 10.6000000000000014 # SAME
 [7] 10.6000000000000014 # SAME
 [8] 10.6699999999999999 # INCREASING
 [9] 10.8799999999999990
[10] 11.0899999999999999
[11] 11.3000000000000007

Andre Mikulec
Andre_Mikulec@Hotmail.com
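
A base R counterpart of the rollapply() workaround above is a running maximum. The following is only a minimal sketch, not part of the original report. It assumes y holds the quantile() output printed in the description, and it takes the value shown there as 10.6000000000000014 to be one ulp above 10.6 (an assumption made so the example does not depend on how a long decimal literal is parsed).

```r
# Quantile output from the description, retyped by hand.
# Assumption: element 5 is exactly one ulp (2^-49) above 10.6.
y <- c(5.2, 6.82, 8.44, 10.06,
       10.6 + 2^-49,
       10.6, 10.6, 10.67, 10.88, 11.09, 11.3)

# cummax() returns, at each position, the largest value seen so far,
# so a small dip is replaced by the previous high.
y_fixed <- cummax(y)

is.unsorted(y)        # TRUE: the raw output decreases at position 6
is.unsorted(y_fixed)  # FALSE: the running maximum never decreases
```

Like the rollapply() version, this only masks the symptom after the fact; it does not change how quantile() computes the values.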
Comment 3 Suharto Anggono 2016-03-27 14:07:34 UTC
Created attachment 2045
Fix attempt: not use arithmetic if interpolated values are equal

For integer input, output storage mode may change because of this change.
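
The idea named in the attachment title can be sketched as follows. This is an illustration only, not the attached patch: interp_old and interp_guarded are hypothetical helper names, and the formula is ordinary linear interpolation between two neighbouring order statistics, assumed here to match what quantile() does for its default type. The index formula 1 + (n - 1) * p is likewise an assumption about that default.

```r
# Hypothetical helpers for illustration; not the code in attachment 2045.

# Unconditional linear interpolation between neighbours lo_val and hi_val.
interp_old <- function(lo_val, hi_val, h) {
  (1 - h) * lo_val + h * hi_val
}

# Guarded version: when both neighbours are equal, return that value
# directly instead of recomputing it with floating-point arithmetic.
interp_guarded <- function(lo_val, hi_val, h) {
  if (hi_val != lo_val) (1 - h) * lo_val + h * hi_val else lo_val
}

a <- 10.6
h <- (1 + 3 * 0.4) - 2   # fractional part of the index for p = 0.4, n = 4

print(interp_old(a, a, h), digits = 22)      # may differ from a in the last bits
print(interp_guarded(a, a, h), digits = 22)  # always exactly a
identical(interp_guarded(a, a, h), a)
```

The guard uses '!=' rather than '<' or '>' so that the same test also works for inputs that have no ordering.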
Comment 4 Andre Mikulec 2016-12-27 19:53:54 UTC
(In reply to Suharto Anggono from comment #3)
> Created attachment 2045 [details]
> Fix attempt: not use arithmetic if interpolated values are equal
> 
> For integer input, output storage mode may change because of this change.

Yes, Suharto Anggono's suggestion (see attachment) sounds like a good idea to me.

Looks like Tony Plate saw this problem eleven (11) years ago.

[Rd] bug? quantile() can return decreasing sample quantiles for increasing probabilities
Tony Plate 
https://stat.ethz.ch/pipermail/r-devel/2005-February/032282.html
Comment 5 Martin Maechler 2016-12-30 22:00:40 UTC
I tend to agree with the patch, and my tests show no negative effects.
Note that Duncan Murdoch also suggested that a change in this direction should only be beneficial (in the 2005 R-devel thread mentioned).

I will commit my changes next year (when more people are back at work).
Comment 6 Martin Maechler 2017-01-04 07:58:38 UTC
Fixed for R-devel with svn rev 71880 | 2017-01-03 09:06:14 +0100.
Thank you both, Andre, for the report,
and Suharto, for the patch!
Comment 7 Suharto Anggono 2017-01-04 16:32:37 UTC
(In reply to Suharto Anggono from comment #3)
> Created attachment 2045 [details]
> Fix attempt: not use arithmetic if interpolated values are equal
> 

In the added parts, maybe use '!=' instead of '>' or '<', so that 'quantile' continues to work on vectors of mode "complex".

R help on 'quantile', in "Details" section, mentions "'quantile' can be applied to complex vectors".
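
The difference between the two kinds of comparison can be checked directly. A minimal sketch, using a made-up complex vector z:

```r
# z is an arbitrary example vector, not taken from the report.
z <- c(1+2i, 1+2i)

# Equality tests are defined for complex numbers.
z[1] != z[2]                            # FALSE

# Ordering comparisons are not; R signals an error here.
res <- try(z[1] < z[2], silent = TRUE)
inherits(res, "try-error")              # TRUE
```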
Comment 8 Martin Maechler 2017-01-04 21:15:50 UTC
(In reply to Suharto Anggono from comment #7)
> (In reply to Suharto Anggono from comment #3)
> > Created attachment 2045 [details]
> > Fix attempt: not use arithmetic if interpolated values are equal
> > 
> 
> In the added parts, maybe use '!=' instead of '>' or '<', so that 'quantile'
> continues to work on vectors of mode "complex".
> 
> R help on 'quantile', in "Details" section, mentions "'quantile' can be
> applied to complex vectors".

Very good point, thank you very much! Committed as 71906 (plus an example).