Created attachment 1682
modified apply.c file from R 3.1.1 defining vapply2
vapply has unnecessary overhead in the way it copies result data into the final result array. The copying loop checks the type of the result on every iteration rather than once, outside the loop. Additionally, using memcpy where possible simplifies the code and lets the compiler choose a more efficient copying strategy.
Speed improvements with trivial functions and large inputs have been in the 8-20% range on various systems with gcc 4.3 and 4.4.
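To illustrate the difference, here is a minimal standalone sketch of the two copying strategies. It is not the code from apply.c: the `elt_type` enum and both function names are hypothetical stand-ins for R's internal type tags and copy loop, and only two element types are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for R's internal type tags (assumption, not R's API). */
typedef enum { ELT_INT, ELT_REAL } elt_type;

/* Per-element variant: the element type is re-tested on every iteration. */
static void copy_checked_each_iter(void *dst, const void *src,
                                   size_t n, elt_type type)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        switch (type) {
        case ELT_INT:
            ((int *) dst)[i] = ((const int *) src)[i];
            break;
        case ELT_REAL:
            ((double *) dst)[i] = ((const double *) src)[i];
            break;
        }
    }
}

/* Hoisted variant: the type is tested once, then the block is copied
 * with a single memcpy call. */
static void copy_hoisted(void *dst, const void *src,
                         size_t n, elt_type type)
{
    size_t elt_size = 0;

    assert(dst != NULL && src != NULL);
    switch (type) {
    case ELT_INT:  elt_size = sizeof(int);    break;
    case ELT_REAL: elt_size = sizeof(double); break;
    }
    memcpy(dst, src, n * elt_size);
}
```

Both functions produce the same destination contents; the second one just moves the type dispatch out of the loop and leaves the copying strategy to memcpy.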
Created attachment 1683
Vignette showing relative speed of vapply and tests.
Your vignette uses the extreme case where length(FUN.VALUE) is 2e5, where using memcpy() instead of a for() loop is of course very much more efficient.
OTOH, the typical use case has length(FUN.VALUE) == 1,
so I have decided to use memcpy() whenever the length differs from one,
and, in the very common case of length 1, to use a simplified version of the current code.
This bloats the source code only very mildly and should be (slightly) more efficient even in the very common use case (length == 1).
svn rev 67091:
Committed the version that special-cases "length 1" and otherwise uses your solution, though with switch() instead of if(.) .. else if(..) .. else if(..).
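A rough sketch of the shape described above, for readers who do not have the commit at hand. This is not the committed code: `elt_type` and `store_result` are hypothetical names, only two element types are shown, and the answer is assumed to be laid out as one column of `len` elements per call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for R's internal type tags (assumption, not R's API). */
typedef enum { ELT_INT, ELT_REAL } elt_type;

/* Store the result of the i-th call (len elements in val) into column i
 * of the answer array ans. */
static void store_result(void *ans, const void *val,
                         size_t i, size_t len, elt_type type)
{
    assert(ans != NULL && val != NULL);
    if (len == 1) {
        /* Very common case: a plain scalar assignment, no memcpy call. */
        switch (type) {
        case ELT_INT:
            ((int *) ans)[i] = *(const int *) val;
            break;
        case ELT_REAL:
            ((double *) ans)[i] = *(const double *) val;
            break;
        }
    } else {
        /* General case: one type dispatch, then a block copy. */
        switch (type) {
        case ELT_INT:
            memcpy((int *) ans + i * len, val, len * sizeof(int));
            break;
        case ELT_REAL:
            memcpy((double *) ans + i * len, val, len * sizeof(double));
            break;
        }
    }
}
```

The length-1 branch avoids a function call for a single element, while every other length pays one switch per result and then copies the whole column at once.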