How to get the deterministic result for FreeBSD tar(1)?

Fabian Keil freebsd-listen at fabiankeil.de
Tue Dec 8 17:42:49 UTC 2015


Yuri <yuri at rawbw.com> wrote:

> I have two identical directories (no diffs, all identical mtime
> attributes) compressed by this command:
> find dir -print0 | LC_ALL=C sort -z | tar cf archive.tgz --format=bsdtar \
>     --no-recursion --null -T -
> 
> The results are different: 3 files out of 10,000 have pax attributes set 
> that are different:
> -27 ctime=1449566560.642715
> +27 ctime=1449566903.167521
[...] 
> So I have two questions:
> 1. How do I actually achieve the output determinism for tar(1)?

You can use an mtree(5) spec as tar's input to set fake timestamps
and other metadata.

For an example see patch 12 in this set:
https://www.fabiankeil.de/sourcecode/electrobsd/reproducible-build-goo-r291706-29246dc.diff
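
To illustrate the idea (this is not the code from the patch above), here
is a minimal sketch that writes an mtree spec with a fixed owner and
timestamp for every regular file and directory in a tree. The helper name
make_spec and the chosen keywords are my own assumptions. It also assumes
file names without whitespace or other characters that mtree would need
escaped, and it skips symlinks and special files. Attributes the spec does
not list, such as the mode, are expected to come from the file on disk.

```shell
#!/bin/sh
# Sketch only: emit an mtree(5) spec with fixed metadata for a directory tree.
# make_spec is a hypothetical helper name, not part of tar(1) or mtree(8).
set -eu

# Usage: make_spec DIR EPOCH_SECONDS
# Assumes file names need no mtree escaping (no whitespace, backslashes, ...).
# Only regular files and directories are listed.
make_spec() (
    cd "$1"
    stamp=$2
    echo '#mtree'
    # List everything below DIR except "." itself, in a locale-independent order.
    find . ! -name . \( -type f -o -type d \) -print | LC_ALL=C sort |
    while IFS= read -r path; do
        if [ -d "$path" ]; then
            type=dir
        else
            type=file
        fi
        # Fixed owner, group and modification time for every entry.
        printf '%s type=%s uid=0 gid=0 time=%s.000000000\n' \
            "$path" "$type" "$stamp"
    done
)

# Example (not run here; needs bsdtar, i.e. tar(1) on FreeBSD):
#   make_spec dir 0 > spec.mtree
#   (cd dir && tar -cf ../archive.tar @../spec.mtree)
```

The commented example relies on bsdtar's @archive argument, which accepts
an input file in mtree format and reads the file contents from disk. I
have not checked that this removes every pax attribute in the case
described above, so compare the resulting archives before relying on it.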

Patch 5 contains a script to regenerate tar files with normalized
timestamps (and some other attributes), but of course generating the
files twice is a bit silly if it can be avoided.

> 2. Is there an agreement that this is a bug that too long or non-ASCII 
> path name triggers the leakage of ctime into a tar file?

My general impression is that large parts of tar's behaviour are
undefined (due to lack of documentation) and it's not obvious to
me that this isn't one of them.

Fabian