Bug 14032 - weighted.mean uses zero when na.rm=TRUE
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Analyses
Version: old
Hardware: All All
Importance: P5 normal
Assigned To: Jitterbug compatibility account
Reported: 2009-10-29 22:59 UTC by Jitterbug compatibility account
Modified: 2009-10-30 20:07 UTC (History)

Description Jitterbug compatibility account 2009-10-29 22:59:09 UTC
From: Arni Magnusson <arnima@hafro.is>
The weighted.mean() function replaces NA values with 0.0 when the user 
specifies na.rm=TRUE:

   x <- c(101, 102, NA)
   mean(x, na.rm=TRUE)                         # 101.5, correct
   weighted.mean(x, na.rm=TRUE)                # 67.66667, wrong
   weighted.mean(x, w=c(1,1,1), na.rm=TRUE)    # 67.66667, wrong
   weighted.mean(x, w=c(1,1,1)/3, na.rm=TRUE)  # 67.66667, wrong

The weights are normalized w<-w/sum(w) before removing the NA values, 
effectively replacing x[is.na(x)]<-0. This bug was introduced between 
versions 2.9.2 and 2.10.0.

Thanks,

Arni
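
The order-of-operations problem described above can be reproduced with a minimal sketch. The two functions below, wmean_buggy() and wmean_fixed(), are hypothetical names written for illustration only; they are not the actual R sources before or after the fix, just an assumption about the simplest code that shows the same behaviour.

```r
# Hypothetical illustration: normalizing the weights BEFORE dropping NAs
# keeps the weight of the dropped elements in the denominator.
wmean_buggy <- function(x, w = rep(1, length(x)), na.rm = FALSE) {
  w <- w / sum(w)              # normalized over all elements, NA ones included
  if (na.rm) {
    keep <- !is.na(x)
    x <- x[keep]
    w <- w[keep]               # remaining weights no longer sum to 1
  }
  sum(x * w)
}

# Hypothetical illustration: drop NAs first, then divide by the sum of the
# weights that are actually used.
wmean_fixed <- function(x, w = rep(1, length(x)), na.rm = FALSE) {
  if (na.rm) {
    keep <- !is.na(x)
    x <- x[keep]
    w <- w[keep]
  }
  sum(x * w) / sum(w)
}

x <- c(101, 102, NA)
wmean_buggy(x, na.rm = TRUE)   # 67.66667, i.e. (101 + 102 + 0) / 3
wmean_fixed(x, na.rm = TRUE)   # 101.5, same as mean(x, na.rm = TRUE)
```

With equal weights the buggy variant divides by 3 instead of 2, which is why the result matches what one would get after setting the NA element to zero.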

Comment 1 Jitterbug compatibility account 2009-10-30 20:07:46 UTC
From: Peter Dalgaard <p.dalgaard@biostat.ku.dk>
arnima@hafro.is wrote:
> The weighted.mean() function replaces NA values with 0.0 when the user 
> specifies na.rm=TRUE:
> 
>    x <- c(101, 102, NA)
>    mean(x, na.rm=TRUE)                         # 101.5, correct
>    weighted.mean(x, na.rm=TRUE)                # 67.66667, wrong
>    weighted.mean(x, w=c(1,1,1), na.rm=TRUE)    # 67.66667, wrong
>    weighted.mean(x, w=c(1,1,1)/3, na.rm=TRUE)  # 67.66667, wrong
> 
> The weights are normalized w<-w/sum(w) before removing the NA values, 
> effectively replacing x[is.na(x)]<-0. This bug was introduced between 
> versions 2.9.2 and 2.10.0.

Yes,

r48644 on May 27, specifically.

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard@biostat.ku.dk)              FAX: (+45) 35327907

Comment 2 Jitterbug compatibility account 2009-10-30 20:45:00 UTC
NOTES:
 Fixed in 2.10.0 patched
Comment 3 Jitterbug compatibility account 2009-10-30 20:45:02 UTC
Audit (from Jitterbug):
Fri Oct 30 15:45:01 2009	ripley	changed notes
Fri Oct 30 15:45:02 2009	ripley	moved from incoming to Analyses-fixed
Comment 4 Jitterbug compatibility account 2009-10-31 14:06:02 UTC
From: Henrik Bengtsson <hb@stat.berkeley.edu>
Here are some redundancy tests that may be useful (I use similar ones for
aroma.light::weightedMedian):

n <- 10
x <- 1:n

# No weights
m1 <- mean(x)
m2 <- weighted.mean(x)
stopifnot(all.equal(m1, m2))

# Equal weights on different scales
w1 <- rep(1, n)
m1 <- weighted.mean(x, w1)
w2 <- rep(100, n)
m2 <- weighted.mean(x, w2)
stopifnot(all.equal(m1,m2))

# Pull the mean towards first value
w1[1] <- 5000
m1 <- weighted.mean(x, w1)
w2[1] <- 500000
m2 <- weighted.mean(x, w2)
stopifnot(all.equal(m1,m2))

# Zero weights
x <- 1:n
w <- rep(1, n)
w[8:n] <- 0
m1 <- weighted.mean(x, w)
m2 <- mean(x[1:7])
stopifnot(all.equal(m1,m2))

# All weights set to zero (both sides are NaN, which all.equal() treats as equal)
x <- 1:n
w <- rep(0, n)
m1 <- weighted.mean(x, w)
m2 <- mean(x[w > 0])
stopifnot(all.equal(m1,m2))

# Missing values
x <- 1:n
w <- rep(1, n)
x[4:5] <- NA
m1 <- weighted.mean(x, w, na.rm=TRUE)
m2 <- mean(x, na.rm=TRUE)
stopifnot(all.equal(m1,m2))

/Henrik
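
One further check that may be worth adding, as a sketch rather than part of the list above: the missing-value test uses equal weights, so it would also be useful to combine NA values with unequal weights and confirm that the weight attached to a missing value is dropped from the denominator. The values below are made up for illustration, and the check assumes a version of R that includes the fix.

```r
# Missing values combined with unequal weights (illustrative values)
x <- c(1, 2, NA)
w <- c(1, 3, 10)               # the large weight belongs to the NA element

# Reference value computed by hand from the non-missing elements only
keep <- !is.na(x)
expected <- sum(x[keep] * w[keep]) / sum(w[keep])   # (1*1 + 2*3) / (1 + 3)

m <- weighted.mean(x, w, na.rm = TRUE)
stopifnot(all.equal(m, expected))
```

If the weights were normalized before the NA removal, the result here would be 7/14 instead of 7/4, so this case separates the two behaviours more sharply than the equal-weights test.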

