Bug 17293 - utils::tar fails on large folders if "tar" argument is used
Status: UNCONFIRMED
Alias: None
Product: R
Classification: Unclassified
Component: I/O
Version: 3.4.0
Hardware: x86_64/x64/amd64 (64-bit) Linux
Importance: P5 major
Assignee: R-core
URL:
Depends on:
Blocks:
 
Reported: 2017-06-20 19:37 UTC by meik michalke
Modified: 2017-10-05 19:37 UTC

Description meik michalke 2017-06-20 19:37:36 UTC
if you use the tar() function from utils in R 3.4 on a directory with many files (e.g., one that contains a git repository) and set the "tar" argument to anything other than an empty string, the function does nothing; it does not even throw a warning or an error.

the problem seems to be that tar() internally appends every single file, recursively, to the system call, which then becomes so long that the system tar command simply fails.

i managed to cat() a system call generated by tar() to a temporary file -- it was 158 kilobytes(!) long.
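
to get a feel for the size of such a call, the length of a command line that names every file individually can be estimated with the sketch below. estimate_cmd_length() and the "tar -cf archive.tar" prefix are hypothetical; they only approximate what tar() builds internally, since the exact command string is not shown in this report.

```r
# sketch: estimate the size of a command line that lists every file
# of a directory individually. estimate_cmd_length() is a hypothetical
# helper for illustration, not part of utils.
estimate_cmd_length <- function(dir, prefix = "tar -cf archive.tar") {
  # collect all files below 'dir', including hidden ones (e.g. in .git)
  files <- list.files(dir, recursive = TRUE, all.files = TRUE,
                      full.names = TRUE)
  # quote each path for the shell and join everything into one string
  cmd <- paste(c(prefix, shQuote(files)), collapse = " ")
  # report the length in bytes, which is what the operating system counts
  nchar(cmd, type = "bytes")
}
```

for comparison, Linux limits a single argument string to 128 KiB (with 4 KiB pages), and a command that is run through "sh -c" is passed as one such string. whether this limit is what is hit here is an assumption, but 158 kilobytes would exceed it.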
Comment 1 meik michalke 2017-06-22 13:39:04 UTC
here are instructions to reproduce the problem:

demoDir <- file.path(tempdir(), "a_rather_long_path_name_to_make_tar_faint_more_quickly_because_it_will_repeat_this_string_for_each_file")
dir.create(demoDir)
# let's create 1000 empty dummy files
# (file.create() is vectorized and does not depend on an external "touch" command)
file.create(file.path(demoDir, paste0("a_long_file_name_to_make_tar_faint_more_quickly_because_it_will_append_each_single_file_internally_", 1:1000)))
# doesn't work (no archive is created, no warning or error is shown):
tar(file.path(tempdir(), "tar_fail.tar"), files=demoDir, tar="/bin/tar")
dir(tempdir())
# still works (an empty string selects R's internal implementation):
tar(file.path(tempdir(), "tar_nofail.tar"), files=demoDir, tar="")
dir(tempdir())


on my system, 600 files were enough to trigger the bug. when i shorten the file names, more files are accepted.
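
as a possible workaround until this is fixed, the system tar can be called directly with only the directory name, so that the command line stays short no matter how many files the directory holds. this is a minimal sketch, not how utils::tar() works: tar_dir() is a hypothetical helper, and it assumes a tar binary on the PATH that understands the common -c, -f and -C options.

```r
# sketch of a workaround: let the system tar walk the directory itself
# instead of naming every file on the command line.
# tar_dir() is a hypothetical helper, not part of utils.
tar_dir <- function(tarfile, dir, tar_bin = Sys.which("tar")) {
  stopifnot(nzchar(tar_bin), dir.exists(dir))
  dir <- normalizePath(dir)
  args <- c("-cf", shQuote(tarfile),
            "-C", shQuote(dirname(dir)),  # change into the parent directory
            shQuote(basename(dir)))       # archive the directory by name only
  # system2() returns the exit status of the command (0 means success)
  system2(tar_bin, args)
}
```

a call such as tar_dir(file.path(tempdir(), "tar_workaround.tar"), demoDir) should return 0 on success. note that the paths stored in the archive are then relative to the parent of the directory, which may differ from what utils::tar() stores.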
Comment 2 meik michalke 2017-07-02 12:03:23 UTC
still broken in R 3.4.1
Comment 3 meik michalke 2017-10-05 19:37:40 UTC
still broken in R 3.4.2