Bug 14654 - spurious warning from "R CMD INSTALL" / internal untar2
Status: CLOSED FIXED
Product: R
Classification: Unclassified
Component: Add-ons
Version: R-devel (trunk)
Hardware: Other Linux
Importance: P5 enhancement
Assigned To: R-core
Reported: 2011-08-10 02:46 UTC by Hin-Tak Leung
Modified: 2014-02-16 11:43 UTC (History)

Attachments
source tar ball example showing this problem (6.78 MB, application/x-compressed-tar)
2011-08-10 02:46 UTC, Hin-Tak Leung
a very small tar.gz that shows this problem (140 bytes, application/octet-stream)
2011-08-10 02:53 UTC, Hin-Tak Leung

Description Hin-Tak Leung 2011-08-10 02:46:40 UTC
Created attachment 1218 [details]
source tar ball example showing this problem

"R CMD INSTALL" emits spurious warnings for the attached tar ball, one per file entry as far as I can see, but extracts and installs the package anyway.
(GNU tar does not emit any warnings for the same file.)
 
----------------------
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/DESCRIPTION'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/NAMESPACE'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/compare.R'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/contingency.table.R'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/convert.R'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/glm-test.R'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/imputation.R'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/indata.R'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/ld.R'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'snpStats/R/long.R'
Warning in untar2(tarfile, files, list, exdir) :
....
--------------

This is a follow-up to a discussion about pax headers on R-devel in April, and to a discussion of git-archive's documentation on the kernel.org git mailing list over the last couple of weeks.
Comment 1 Hin-Tak Leung 2011-08-10 02:53:09 UTC
Created attachment 1219 [details]
a very small tar.gz that shows this problem

Minimum tar ball to show this problem.

R CMD INSTALL -l /tmp test2.tar.gz

------------
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'test2/'
Warning in untar2(tarfile, files, list, exdir) :
  checksum error for entry 'test2/empty2'
ERROR: cannot extract package from ‘test-pax/test2.tar.gz’
-------------

The R internal untar/untar2 implementation also seems to have problems with files which are smaller than 512 bytes (as this is), when run interactively with (file("...", open="rb"), list=TRUE). That's just a 2nd problem I noticed, and seems to be unrelated to the spurious warnings.
Comment 2 Simon Urbanek 2011-08-10 03:36:42 UTC
Fixed in R-devel and patched (your checksum field has more than 6 digits which is highly unusual [since it can't be larger than 6 digits] but technically allowable).


> The R internal untar/untar2 implementation also seems to have problems with
> files which are smaller than 512 bytes (as this is), when run interactively with
> (file("...", open="rb"), list=TRUE). That's just a 2nd problem I noticed, and
> seems to be unrelated to the spurious warnings.

I'm pretty sure you forgot to use gzfile() or chain it through gzcon() since tar files cannot have less than 512 bytes.
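
A minimal illustration of the size point, written in Python rather than R and using only the standard library. The entry name test2/empty2 is borrowed from the warnings above; the helper name and everything else are assumptions made for this sketch. It builds a one-entry archive, compresses it, and compares sizes: the .tar.gz on disk can be far smaller than 512 bytes even though the tar stream inside it is made of whole 512-byte blocks.

```python
import gzip
import io
import tarfile


def make_small_tar_gz() -> bytes:
    """Build a gzip-compressed tar archive holding one empty file (hypothetical helper)."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        tar.addfile(tarfile.TarInfo("test2/empty2"))  # empty regular file
    return gzip.compress(raw.getvalue())


compressed = make_small_tar_gz()
uncompressed = gzip.decompress(compressed)

# The compressed file starts with the gzip magic bytes, not with a tar header.
print(compressed[:2] == b"\x1f\x8b")
# The decompressed stream is a whole number of 512-byte blocks.
print(len(compressed), len(uncompressed), len(uncompressed) % 512)
```

Reading the compressed bytes as if they were a tar stream would therefore look like a "tar file shorter than 512 bytes", which is consistent with the suggestion to go through gzfile() or gzcon() on the R side.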
Comment 3 Simon Urbanek 2011-08-10 03:48:03 UTC
I should add that the original tar format mandated that checksums are terminated by NUL SPACE, so in that sense your tar file is invalid (there can't be more than 6 digits since the checksum field consists of 8 bytes). untar2 will now be more forgiving, but whatever program created that tar file should be fixed.
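
The checksum convention described here can be sketched as follows, in Python rather than R. This is an illustration under stated assumptions, not the code in untar2: the function names are made up, the "strict" reader that looks only at the first six bytes is a guess at how a spurious mismatch could arise, and the seven-digit field is a hypothetical reconstruction of what the offending archive might contain. The field offsets (bytes 148 to 155 of the 512-byte header) and the rule of counting the field as eight spaces while summing are the usual tar conventions.

```python
import tarfile

CHKSUM_START, CHKSUM_END = 148, 156  # the checksum field is 8 bytes of the 512-byte header


def compute_checksum(header: bytes) -> int:
    """Sum all header bytes, counting the checksum field itself as 8 spaces."""
    if len(header) != 512:
        raise ValueError("a tar header block is exactly 512 bytes")
    return sum(header[:CHKSUM_START]) + 8 * ord(" ") + sum(header[CHKSUM_END:])


def parse_checksum_strict(field: bytes) -> int:
    """Assumed strict reader: only the first six bytes are treated as octal digits."""
    return int(field[:6], 8)


def parse_checksum_lenient(field: bytes) -> int:
    """Forgiving reader: take every octal digit, ignoring NUL and space padding."""
    digits = field.strip(b"\0 ")
    return int(digits, 8) if digits else 0


# Conventional header: six octal digits, NUL, space (this is what Python's tarfile writes).
header = tarfile.TarInfo("test2/empty2").tobuf(format=tarfile.USTAR_FORMAT)
expected = compute_checksum(header)

# Hypothetical variant: the same checksum written as seven octal digits plus NUL.
seven_digit_field = ("%07o\0" % expected).encode("ascii")
variant = header[:CHKSUM_START] + seven_digit_field + header[CHKSUM_END:]

for label, block in (("conventional", header), ("seven-digit", variant)):
    field = block[CHKSUM_START:CHKSUM_END]
    print(label,
          parse_checksum_strict(field) == compute_checksum(block),
          parse_checksum_lenient(field) == compute_checksum(block))
```

In this sketch the strict reader accepts only the conventional layout, while the lenient reader accepts both, which is the kind of forgiveness the fix is described as adding.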
Comment 4 Hin-Tak Leung 2011-08-10 04:03:53 UTC
(In reply to comment #3)
> I should add that the original tar format mandated that checksums are
> terminated by NUL SPACE, so in that sense your tar file is invalid (there can't
> be more than 6 digits since the checksum field consists of 8 bytes). untar2
> will now be more forgiving, but whatever program created that tar file should
> be fixed.

Thanks for the quick response. I shall forward your comments to the git people.
Comment 5 Hin-Tak Leung 2011-08-10 04:20:10 UTC
(In reply to comment #3)
> ...but whatever program created that tar file should
> be fixed.

FWIW, last week in the middle of the discussion, the git people looked at library/utils/R/tar.R briefly and commented that:

------------
For reference, the documentation of the pax format including a
suggestion to treat unknown types like regular files can be found here
(search for "typename"):

  http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html
------------

While it is documented that the R internal untar2() won't handle unknown entry types, it appears that the git people disagree on how they should be handled: according to the POSIX standard, unknown entry types should apparently be extracted as regular files rather than causing a bail-out.
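
The two behaviours being contrasted can be sketched like this, in Python rather than R. It is only an illustration of the policy choice, not the logic in tar.R: the table, the function name entry_kind and the strict switch are assumptions of this sketch. The type flag is the single byte at offset 156 of the 512-byte header.

```python
import tarfile

# Common type flag values found at byte 156 of a ustar/pax header.
KNOWN_TYPES = {
    b"0": "regular file",
    b"\0": "regular file",  # older archives use NUL for regular files
    b"1": "hard link",
    b"2": "symbolic link",
    b"3": "character device",
    b"4": "block device",
    b"5": "directory",
    b"6": "FIFO",
    b"x": "pax extended header",
    b"g": "pax global extended header",
}


def entry_kind(header: bytes, strict: bool = False) -> str:
    """Classify a header block; unknown types either raise or fall back to a regular file."""
    typeflag = header[156:157]
    if typeflag in KNOWN_TYPES:
        return KNOWN_TYPES[typeflag]
    if strict:
        raise ValueError(f"unsupported entry type {typeflag!r}")
    return "regular file"  # fallback suggested in the quoted pax documentation


# A directory header produced by Python's tarfile, then a copy with a made-up type flag.
info = tarfile.TarInfo("test2/")
info.type = tarfile.DIRTYPE
dir_header = info.tobuf(format=tarfile.USTAR_FORMAT)
odd_header = dir_header[:156] + b"Z" + dir_header[157:]

print(entry_kind(dir_header))  # known type
print(entry_kind(odd_header))  # unknown type, lenient policy
```

With strict=True the second call raises instead, which corresponds to the bail-out behaviour described above.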

You can see the rest of the discussion on the kernel.org git mailing list archive under "git-archive's wrong documentation: really write pax rather than tar".
Comment 6 Brian Ripley 2011-08-10 06:44:46 UTC
tar is not pax: references to standards for pax are not relevant
(and POSIX no longer sets a standard for tar)
Comment 7 Hin-Tak Leung 2011-08-10 07:09:23 UTC
(In reply to comment #6)
> tar is not pax: references to standards for pax are not relevant
> (and POSIX no longer sets a standard for tar)

Just for the sake of argument, in the absence of a standard (as you rightly wrote), then wouldn't it be defined by 'what _a_ unix tar utility read/write'?

These are what GNU tar currently does:  

       gnu    GNU tar 1.13.x format
       oldgnu GNU format as per tar <= 1.12
       pax    POSIX 1003.1-2001 (pax) format
       posix  same as pax
       ustar  POSIX 1003.1-1988 (ustar) format
       v7     old V7 tar format
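
One visible difference between some of these formats is the magic field of the header. The sketch below uses Python's tarfile module, not GNU tar, so it says nothing about GNU tar's actual behaviour; the helper name magic_field is an assumption, and only the three formats that Python can write are covered (it does not write v7).

```python
import tarfile

FORMATS = {
    "ustar": tarfile.USTAR_FORMAT,
    "gnu": tarfile.GNU_FORMAT,
    "pax": tarfile.PAX_FORMAT,
}


def magic_field(fmt: int) -> bytes:
    """Return the 8 bytes at offset 257 (magic plus version) of a header written in fmt."""
    header = tarfile.TarInfo("test2/empty2").tobuf(format=fmt)
    return header[257:265]


for name, fmt in FORMATS.items():
    print(name, magic_field(fmt))
```

For a short ASCII name like this one, Python writes the same header block for ustar and pax, while the GNU variant pads the magic with spaces instead of a NUL and a version number.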
Comment 8 Simon Urbanek 2011-08-10 13:35:16 UTC
Obviously that is entirely irrelevant - what have GNU tar's capabilities to do with this bug report? GNU tar doesn't produce the broken tar files, so your arguments are pointless. Can you move your questions about tar formats to R-devel, please?
Comment 9 Brian Ripley 2011-08-14 12:28:41 UTC
On Wed, 10 Aug 2011, r-bugs@r-project.org wrote:

> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14654
>
> --- Comment #7 from Hin-Tak Leung <htl10@users.sourceforge.net> 2011-08-10 02:09:23 EDT ---
> (In reply to comment #6)
>> tar is not pax: references to standards for pax are not relevant
>> (and POSIX no longer sets a standard for tar)


> Just for the sake of argument, in the absence of a standard (as you rightly
> wrote), then wouldn't it be defined by 'what _a_ unix tar utility read/write'?


I "rightly wrote", *but* you read inaccurately.

> These are what GNU tar currently does:
>
>       gnu    GNU tar 1.13.x format
>       oldgnu GNU format as per tar <= 1.12
>       pax    POSIX 1003.1-2001 (pax) format
>       posix  same as pax
>       ustar  POSIX 1003.1-1988 (ustar) format
>       v7     old V7 tar format


No, it doesn't.  That is what it claims to do.  There are bugs in the 
GNU implementation of ustar (that we work around).  And in fact GNU 
tar and libarchive tar (sometimes known as bsdtar) don't interoperate: 
neither can read the default format of the other.
