Bug 15452 - read.csv renames column "X" to "X.1". Suggest changing make.names
Status: CLOSED FIXED
Alias: None
Product: R
Classification: Unclassified
Component: I/O
Version: R 2.15.3
Hardware: All All
Importance: P5 minor
Assignee: R-core
Reported: 2013-09-12 21:57 UTC by Tim Hesterberg
Modified: 2013-09-27 18:09 UTC

Description Tim Hesterberg 2013-09-12 21:57:42 UTC
> data <- data.frame(X = 1:3, Y = 2:4)
> write.csv(data, "~/temp.csv")
> 
> data2 <- read.csv("~/temp.csv", row.names = 1)
> names(data2)
[1] "X.1" "Y"  

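write.csv writes the row names as a first column with an empty
header, so the header read back is c("", "X", "Y").
read.table calls make.names(col.names, unique = TRUE)
before it removes the first column from the list, and
make.names turns c("", "X", "Y") into c("X", "X.1", "Y").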
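
The sequence of steps can be sketched as below. This is only an illustration of the behaviour described in this report, not the actual read.table code: it emulates make.names(..., unique = TRUE) by calling make.names and then make.unique, and the variable names valid, fixed and result are made up for the example.

```r
# Header that write.csv produces when row names are included:
# the row-name column has an empty name.
col.names <- c("", "X", "Y")

# Step 1: make every name syntactically valid ("" becomes "X").
valid <- make.names(col.names)

# Step 2: de-duplicate from left to right, so the later (original) "X"
# is the one that receives a suffix.
fixed <- make.unique(valid)

# Step 3: the row-name column is dropped only afterwards (row.names = 1).
result <- fixed[-1]

print(fixed)
print(result)
```

By the time the first column is removed, the original "X" has already been renamed, which is why the suffix survives in names(data2).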

The particular example I gave could be fixed by changing
read.table. But it might be better to change make.names instead.
I'd argue that make.names should change the "" and not the "X".
The following version of make.names preserves names that don't
need to be changed:

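make.names2 <-
function (names, unique = FALSE, allow_ = TRUE)
{
    # Make each name syntactically valid; duplicates may remain.
    names2 <- .Internal(make.names(as.character(names), allow_))
    if (unique) {
        # Prefer to leave names that were originally OK alone.
        # FALSE sorts before TRUE, so unchanged names come first
        # and make.unique adds suffixes to the changed names instead.
        o <- order(names != names2)
        names2[o] <- make.unique(names2[o])
    }
    names2
}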

# Some examples
# Two cases show problems with existing make.names
names1 <- c("", "X")
make.names(names1, unique = TRUE)  # Bad: "X" becomes "X.1"
make.names2(names1, unique = TRUE) # "X" is kept; "" becomes "X.1"

names2 <- c("X 1", "X.1")
make.names(names2, unique = TRUE)  # Bad: "X.1" becomes "X.1.1"
make.names2(names2, unique = TRUE) # "X.1" is kept; "X 1" becomes "X.1.1"

# A couple more tests, as sanity checks
names3 <- c("X", "", "X.1")
make.names(names3, unique = TRUE)  # OK: "" becomes "X.2", others untouched
make.names2(names3, unique = TRUE) # Same result

names4 <- c("a", "", "d", "", "X.1")
make.names(names4, unique = TRUE)  # OK: blanks become "X" and "X.2", others untouched
make.names2(names4, unique = TRUE) # Same result
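
The four cases above can also be checked mechanically. The sketch below is an assumption-laden restatement, not the proposed patch itself: make_names_keep is a hypothetical helper that applies the same ordering idea as make.names2 but uses only exported functions, so it does not depend on .Internal or on which version of make.names is installed.

```r
# Hypothetical helper: same ordering idea as make.names2 with unique = TRUE,
# written with exported functions only.
make_names_keep <- function(names, allow_ = TRUE) {
    names <- as.character(names)
    valid <- make.names(names, allow_ = allow_)
    # FALSE sorts before TRUE, so names that were already valid come first
    # and keep their spelling; changed names receive the suffixes.
    o <- order(names != valid)
    valid[o] <- make.unique(valid[o])
    valid
}

r1 <- make_names_keep(c("", "X"))
r2 <- make_names_keep(c("X 1", "X.1"))
r3 <- make_names_keep(c("X", "", "X.1"))
r4 <- make_names_keep(c("a", "", "d", "", "X.1"))

# Names that were valid to begin with are left untouched in every case.
stopifnot(
    identical(r1, c("X.1", "X")),
    identical(r2, c("X.1.1", "X.1")),
    identical(r3, c("X", "X.2", "X.1")),
    identical(r4, c("a", "X", "d", "X.2", "X.1"))
)
```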
Comment 1 Duncan Murdoch 2013-09-27 18:09:38 UTC
That seems reasonable, but I can imagine some code relying on the current behaviour.  I'll put it into R-devel for now; if it turns out to be harmless, I'll backport to R-patched later.