Bugzilla – Bug 15261
cat() with more than one argument and more than one separator cycles separators improperly
Last modified: 2013-04-09 05:50:48 UTC
This is part of the "Details" section of the documentation for the function 'cat' in R 3.0.0.
'cat' is useful for producing output in user-defined functions.
It converts its arguments to character vectors, concatenates them
to a single character vector, appends the given 'sep = ' string(s)
to each element and then outputs them.
If this is true, the result should be the same whether 'cat' is given several character vectors or a single character vector that is their concatenation. In reality, the two cases can give different results when 'sep' contains more than one element. Here is an example.
> cat("a", "b", "c", "d", sep=c("-", "+", "x")); cat("|\n")
a-b-c-d|
> cat(c("a", "b", "c", "d"), sep=c("-", "+", "x")); cat("|\n")
a-b+cxd|
Here is another test case for the behavior of 'cat'.
> cat(c("a", "b", "c"), c(1, 2, 3),
+ sep=c("-", "+", "x", "?", "@")); cat("|\n")
a-b+c-1?2@3|
In the output above, element "x" from 'sep' is not used. The "-" is used instead. It happens at the boundary between the two objects given to 'cat'. So, it seems that
(1) between adjacent objects, the first element of 'sep' is always used;
(2) between adjacent elements within each object, elements of 'sep' are used as if the objects are concatenated after being converted to character vectors.
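The two rules above can be written down as a small model. This is only a sketch of the behavior as described in this report, not the actual C code behind 'cat'. The helper names model_observed and model_documented are hypothetical, and neither of them calls cat(); they only build the string that each reading predicts.

```r
# Hypothetical helper: string predicted by rules (1) and (2) above.
model_observed <- function(..., sep = " ") {
  objs <- lapply(list(...), as.character)
  out <- character(0)
  k <- 0L  # running element index across all objects (0-based)
  for (j in seq_along(objs)) {
    x <- objs[[j]]
    for (i in seq_along(x)) {
      out <- c(out, x[i])
      last_in_obj <- i == length(x)
      last_overall <- last_in_obj && j == length(objs)
      if (!last_overall) {
        # Rule (1): at an object boundary, always the first element of 'sep'.
        # Rule (2): inside an object, cycle 'sep' by the running index.
        s <- if (last_in_obj) sep[1L] else sep[k %% length(sep) + 1L]
        out <- c(out, s)
      }
      k <- k + 1L
    }
  }
  paste(out, collapse = "")
}

# Hypothetical helper: string predicted by the "Details" text, i.e.
# concatenate everything first, then cycle 'sep' between elements.
model_documented <- function(..., sep = " ") {
  x <- unlist(lapply(list(...), as.character))
  n <- length(x)
  if (n == 0L) return("")
  seps <- c(rep_len(sep, n - 1L), "")  # no separator after the last element
  paste0(x, seps, collapse = "")
}

s <- c("-", "+", "x", "?", "@")
model_observed(c("a", "b", "c"), c(1, 2, 3), sep = s)    # "a-b+c-1?2@3"
model_documented(c("a", "b", "c"), c(1, 2, 3), sep = s)  # "a-b+cx1?2@3"
```

The two models differ only at the boundary between the objects, which is where the "x" goes missing.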
I also have an issue with the "Note" section:
If any element of 'sep' contains a newline character, it is
treated as a vector of terminators rather than separators, an
element being output after every vector element _and_ a newline
after the last. Entries are recycled as needed.
I think 'sep' is always treated as a vector of separators. However, in the special case where any element of 'sep' contains a newline character, a final newline is added to the output. Here is an example.
> cat(c("a", "b", "c"), sep=c("-", "+\n")); cat("|\n")
In the output above, 'c' is directly followed by newline. There is no other character in between.
R version 3.0.0 (2013-04-03)
Platform: i386-w64-mingw32/i386 (32-bit)
 LC_COLLATE=English_United States.1252
 LC_CTYPE=English_United States.1252
 LC_MONETARY=English_United States.1252
 LC_TIME=English_United States.1252
attached base packages:
 stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
For consideration, this is the description of argument 'sep' in http://www.uni-muenster.de/ZIV.BennoSueselbeck/s-html/helpfiles/cat.html, which I believe to be the help page of 'cat' in S-PLUS 3.4.
vector of character strings to insert between successive data items of each object. This argument is used cyclically and if it contains a newline, the output will contain a final newline.
Recycling of arguments is standard in R functions.
(In reply to comment #1)
> Recycling of arguments is standard in R functions.
OK. Let me state this more clearly. I am not complaining about the recycling of arguments.
Take a look at
cat(c("a", "b", "c"), c(1, 2, 3), sep=c("-", "+", "x", "?", "@"))
I would expect the output to be
a-b+cx1?2@3
But the actual output is
a-b+c-1?2@3
Notice that, between c and 1, the separator is -, not x.
To take it to the extreme, the output of
cat("a", "b", "c", "d", sep=c("-", "+", "x"))
is
a-b-c-d
Notice that only - (the first element of 'sep') is used.
This behavior is not stated in the help page. I hope the help page can state that, between objects, the _first_ element of 'sep' is always used.
There are two bugs: a) the sep index (ntot) is advanced even after the last element (that's why "x" doesn't appear) and b) between arguments, index 0 is always used. I have fixed both, so that cat(x, y, sep=z) and cat(c(x, y), sep=z) have the same effect (as suggested by the documentation). The only remaining exception is a zero-length vector [other than NULL], which behaves like "" (this would be easy to change, but looking at the code it seems this was intentional, and it is intuitive to force a separator between arguments in all cases).
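A quick way to check the equivalence described above on a given build is to capture the output of both calls and compare them. This is a minimal sketch; cat_string is a hypothetical helper name, and the x, y, z values are simply the ones from the report.

```r
# Hypothetical helper: capture what cat() writes as a single string.
cat_string <- function(...) paste(capture.output(cat(...)), collapse = "\n")

x <- c("a", "b", "c")
y <- c(1, 2, 3)
z <- c("-", "+", "x", "?", "@")

separate <- cat_string(x, y, sep = z)     # two arguments
combined <- cat_string(c(x, y), sep = z)  # one concatenated argument

# Expected to be TRUE on a build that includes the fix;
# R 3.0.0 is reported above to give different results for the two calls.
print(identical(separate, combined))
```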