Bug 15101 - Performance bug - assignment inside an object is much slower
Status: NEW
Alias: None
Product: R
Classification: Unclassified
Component: Low-level
Version: R 2.15.1 patched
Hardware: x86_64/x64/amd64 (64-bit) Mac OS X v10.8
Importance: P5 enhancement
Assignee: R-core
Reported: 2012-11-07 19:30 UTC by Michael Neale
Modified: 2012-11-08 21:01 UTC

Description Michael Neale 2012-11-07 19:30:19 UTC
The following code demonstrates that assignment into a slot of an object is much slower (up to 6x) than the same assignment outside an object.  This has serious performance consequences for our statistical package.  The test script accesses a single slot of the object for simplicity, but our package inserts data into several possible slots of an object.  Which slots are accessed, and in what order, is not known until runtime.

# Comparison of cpu performance of assignment inside vs. outside an object
#---------------------------------------------------------------#
veclen <- c(10^1, 10^2, 10^3, 10^4, 2 * 10^4)

cat("Outside an object\n")

for(i in veclen) {
   v <- as.numeric(1:i)
   names(v) <- as.character(v)
   runtime <- system.time(for(ii in v) { v[[as.character(ii)]] <- 0 })
   cat(i, runtime["elapsed"], "\n")
}

setClass(Class = "Foo",
    representation = representation(
        data = "numeric"))

cat("Inside an object\n")

for(i in veclen) {
    obj <- new("Foo")
    v <- as.numeric(1:i)
    names(v) <- as.character(v)
    obj@data <- v
    runtime <- system.time(for(ii in v) { obj@data[[as.character(ii)]] <- 0 })
    cat(i, runtime["elapsed"], "\n")
}

#---------------------------------------------------------------#
Results: 

Outside an object
10 0 
100 0.001 
1000 0.017 
10000 1.068 
20000 3.978 

Inside an object
10 0.062 
100 0.003 
1000 0.139 
10000 5.128 
20000 23.911
#---------------------------------------------------------------#
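To make the size of the gap easier to read off, the elapsed times above can be turned into inside/outside ratios. This is only arithmetic on the numbers already reported; the vector names (n, outside, inside, slowdown) are made up for this sketch, and the row for length 10 is left out because its "outside" time is reported as 0.

```r
# Elapsed times (seconds) copied from the results above.
# The length-10 row is omitted: its "outside" time is reported as 0.
n       <- c(100, 1000, 10000, 20000)
outside <- c(0.001, 0.017, 1.068, 3.978)
inside  <- c(0.003, 0.139, 5.128, 23.911)

# Ratio of slot assignment time to plain vector assignment time.
slowdown <- inside / outside

print(data.frame(n = n, outside = outside, inside = inside,
                 slowdown = round(slowdown, 1)))
```

On these figures the ratio ranges from about 3x to about 8x, and is about 6x at the largest length tested. The rows for small lengths rest on very short absolute times and are probably the least reliable.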

Any ideas on how to fix this problem?
Comment 1 Marek Gagolewski 2012-11-08 21:01:18 UTC
Confirmed on R version 2.15.2 (2012-10-26), platform x86_64-redhat-linux-gnu,
and also with different examples (vectors without names).

Interestingly, writing into l[["data"]], where l <- list(data=...), is also much faster than changing obj@data.
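A minimal sketch of that list-versus-slot write comparison might look like the following. The class name FooSketch and the length n are assumptions chosen for illustration, not taken from the original test, and the timings will depend on the R version and machine, so no expected numbers are given.

```r
library(methods)  # provides setClass() and new(); attached by default in interactive R

# Hypothetical class, named differently from "Foo" to avoid clashing with it.
setClass("FooSketch", representation(data = "numeric"))

n <- 2000                      # assumed size, small enough to run quickly
v <- as.numeric(1:n)

# Element-wise writes into a vector held in a list component.
l <- list(data = v)
list_time <- system.time(for (j in seq_len(n)) { l[["data"]][[j]] <- 0 })

# The same element-wise writes into a vector held in an S4 slot.
obj <- new("FooSketch", data = v)
s4_time <- system.time(for (j in seq_len(n)) { obj@data[[j]] <- 0 })

cat("list", list_time["elapsed"], "\n")
cat("S4  ", s4_time["elapsed"], "\n")
```

Both loops perform the same number of unnamed element assignments, so any difference in elapsed time should come from where the vector is stored.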


Even more interesting, READING from an S4 object is, on the other hand, faster than reading from a list:


veclen <- 10^(4:7)

cat("direct\n")

for(i in veclen) {
   v <- 1
   total <- 0   # named "total" to avoid shadowing the sum() function used below
   runtime <- system.time(for(j in 1:i) { total <- total + v })
   cat(i, sum(runtime[1:2]), "\n")
}



cat("List [[\"name\"]]\n")

for(i in veclen) {
   obj <- list(data=1)
   total <- 0   # named "total" to avoid shadowing the sum() function used below
   runtime <- system.time(for(j in 1:i) { total <- total + obj[["data"]] })
   cat(i, sum(runtime[1:2]), "\n")
}



setClass(Class = "Foo",
         representation = representation(
            data = "numeric"))


cat("S4\n")

for(i in veclen) {
   obj <- new("Foo")
   obj@data <- 1
   total <- 0   # named "total" to avoid shadowing the sum() function used below
   runtime <- system.time(for(j in 1:i) { total <- total + obj@data })
   cat(i, sum(runtime[1:2]), "\n")
}



## RESULTS:

direct
10000 0.008 
1e+05 0.085 
1e+06 0.899 
1e+07 9.377 
List [["name"]]
10000 0.011 
1e+05 0.128 
1e+06 1.282 
1e+07 13.204 
S4
10000 0.009 
1e+05 0.098 
1e+06 1.009 
1e+07 10.317
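For the slow write path reported in the description, one possible workaround (a sketch, not a fix for the underlying issue) is to copy the slot into a local variable, do all the element assignments there, and write the result back to the slot once. The class name FooLocal, the variable tmp and the length n are assumptions made for this example. It only applies when the code knows which slot it is updating for the duration of the loop, which may not hold for the package described above.

```r
library(methods)  # provides setClass() and new(); attached by default in interactive R

# Hypothetical class mirroring "Foo" from the report.
setClass("FooLocal", representation(data = "numeric"))

n <- 2000                      # assumed size, small enough to run quickly
v <- as.numeric(1:n)
names(v) <- as.character(v)
obj <- new("FooLocal", data = v)

runtime <- system.time({
  tmp <- obj@data                                  # read the slot once
  for (ii in v) { tmp[[as.character(ii)]] <- 0 }   # update the local copy by name
  obj@data <- tmp                                  # write the slot back once
})
cat(n, runtime["elapsed"], "\n")
```

The loop body is the same named assignment as in the "outside an object" test, so this should cost roughly what that case costs plus one slot read and one slot write.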