Date: Sat, 17 Feb 2024 22:21:00 -0700
Message-ID: <51badc56-7e4d-4527-81ae-b665bd51d90a@dreamchaser.org>
Reply-To: freebsd@dreamchaser.org
List-Archive: https://lists.freebsd.org/archives/freebsd-questions
Subject: Re: hard link pointing to itself?
From: Gary Aitken
To: "Greg 'groggy' Lehey"
Cc: FreeBSD Mailing List

On 2/17/24 18:14, Greg 'groggy' Lehey wrote:
> On Saturday, 17 February 2024 at 8:28:53 -0700, Gary Aitken wrote:
>> running 13.2-release, created a tar archive, went to extract on
>> another 13.2-release system, and got several messages of the form:
>>
>> $ tar xf tmp.tar
>> path-to-file/filename.jpg: Skipping hardlink pointing to itself: path-to-file/filename.jpg: File exists
>
> Fascinating.  I'm rearranging the rest of your message to hopefully
> explain things better.
>
> Yes, it's confusing.  It confused me too.  I went and took a look at
> the sources (in this case the file
> /usr/src/contrib/libarchive/libarchive/archive_write_disk_posix.c),
> and found what's going on--I think.  The hard links aren't in the
> file system, they're in the tar archive.  And one of the more
> obscure things about a tar archive is that it needs to keep track of
> files with multiple links (names).  It stores the file under one
> name, and if there are any more, it creates a reference to the same
> file.
> It seems that this somewhat confusing message is saying that
> it discovered some inconsistency that it (and the author of
> libarchive) wasn't expecting.  From the source to
> archive_write_disk_header() (round line 563):
>
>     /*
>      * Extract this entry to disk.
>      *
>      * TODO: Validate hardlinks.  According to the standards, we're
>      * supposed to check each extracted hardlink and squawk if it refers
>      * to a file that we didn't restore.  I'm not entirely convinced this
>      * is a good idea, but more importantly: Is there any way to validate
>      * hardlinks without keeping a complete list of filenames from the
>      * entire archive??  Ugh.
>      */
>
> Without going into too much detail, this looks like some kind of
> bug.  I've tried to think of a number of scenarios, but I can't at
> the moment.

bummer :-(
Thanks for the deep dive.  I understand the difference between hard
and symlinks, and the inode # and count.  I forgot about the different
hard links being able to have different names, which may be my problem
somehow.

> It would be interesting to know what you were trying to
> do.  Does it happen when you try to extract the entire archive to an
> empty hierarchy?  Does this file have multiple links?

Yes, the files have multiple links; count is 2.
Ah... thank you for the hints, found them.

Problem was:
file system looked like:
   /home/me/A/B/C/D/file.jpg
   /X/file.jpg
tarball created:
   cd /home/me/A/B/C
   tar cf foo.tar D
so the other file was not in the tarball.

located by doing:
   $ ls -il foo.jpg     # get inode # of bad file
   $ df | grep home     # make sure /home is a mount point
   $ cd /home
   $ find -xX . -type f | xargs -L 1 ls -il | grep inode#

> Another thing that might be interesting would be to try GNU tar
> (gtar, in the ports).  It might accept the archive, or it might
> produce a different error result.

Thanks, may try that.

> My guess is that there might be two different issues here.  The
> message you show is a warning, though it does mean that the file
> doesn't get restored.
> Was there another message at the end?

The job was backgrounded, but terminated after a few errors.
As far as I know there was no other message.

So I guess a question now is, is there a way to get tar to somehow
ignore the inode count / force it to 1 when tar creates the tarball?
Or maybe just tell tar to ignore hard links when extracting.
It's too late tonight to read the tar manpage with all those args and
work on this further; will poke at it tomorrow.

Thanks again,
Gary

> Greg
> --
> When replying to this message, please copy the original recipients.
> If you don't, I may ignore the reply or reply to the original
> recipients.  For more information, see
> http://www.lemis.com/questions.html
> Sent from my desktop computer.
> See complete headers for address and phone numbers.
> This message is digitally signed.  If your Microsoft mail program
> reports problems, please read http://lemis.com/broken-MUA.php
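
For reference, the "find the other name" step can probably be done
without piping every file through ls.  The following is only a sketch
under stated assumptions: it rebuilds the layout described above in a
scratch directory from mktemp, and the paths (A/B/C/D/file.jpg,
X/file.jpg) are hypothetical stand-ins for the real ones.  It relies
only on the find primaries -links, -xdev and -samefile, which exist in
both FreeBSD find and GNU find.  It does not change what tar stores in
the archive; it just lists multiply linked files before archiving.

```shell
#!/bin/sh
# Sketch: reproduce the layout from the message in a scratch directory.
# All paths below are hypothetical stand-ins for the real ones.
top=$(mktemp -d)
mkdir -p "$top/A/B/C/D" "$top/X"
echo data > "$top/A/B/C/D/file.jpg"
ln "$top/A/B/C/D/file.jpg" "$top/X/file.jpg"   # second name, same inode

# Regular files under D whose link count is greater than 1.
# These are the ones worth checking before running "tar cf foo.tar D".
multi=$(cd "$top/A/B/C" && find D -type f -links +1)

# Every name on the same file system that refers to the same inode.
names=$(cd "$top" && find . -xdev -samefile A/B/C/D/file.jpg | sort)

printf '%s\n' "$multi" "$names"

# Remove the scratch directory.
rm -rf "$top"
```

On a real system the second find would be run from the mount point
(/home in the case above), since hard links cannot cross file systems.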