Date: Thu, 18 May 2023 10:53:01 +0100
From: Kaya Saman
To: questions@freebsd.org
Subject: Re: Tool to compare directories and delete duplicate files from one directory
In-Reply-To: <3e2b4ee6-c098-456a-bb3a-4b1f45e4d888@holgerdanske.com>
List-Id: User questions
List-Archive: https://lists.freebsd.org/archives/freebsd-questions

On 5/18/23 01:35, David Christensen wrote:
> On 5/17/23 00:55, Kaya Saman wrote:
>>
>> On 5/15/23 23:26, Sysadmin Lists wrote:
>>>> ----------------------------------------
>>>> From: David Christensen
>>>> Date: May 15, 2023, 1:43:38 AM
>>>> To:
>>>> Subject: Re: Tool to compare directories and delete duplicate files
>>>> from one directory
>>>>
>>>>
>>>> It looks like your script only finds duplicates when the subpath is
>>>> identical (?):
>>>>
>>> Yeah. Wasn't that the original problem description? I went off the
>>> example given by Paul earlier in this thread, and it looked like only
>>> files with matching subpaths were being considered (because the OP
>>> accidentally rsync'd files from a source to a bunch of destination
>>> dirs).
>>>
>>
>> Glad to see this thread has turned into an interesting discussion....
>>
>>
>> Just as the OP :-) I will clarify....
>>
>> There was no accidental rsync in place.
>>
>>
>> Due to lack of storage my files were basically all over the place on
>> different zpools. The problem is that most of those were on iSCSI
>> drives (all running FreeBSD), so I needed to get them in a single
>> place. Of course, as the files were all over, things became a mess.
>>
>> I bought a few new drives and created a new zpool just for this case.
>> So virtually I had to sync the multiple directories to a single
>> destination. *but* of course I didn't use the --remove-source-files
>> option as I didn't want things to be destructive.
>>
>>
>> But then I needed the extra space too and that's where this post came
>> from.
>>
>>
>> Regards,
>>
>>
>> Kaya
>
>
> I seem to recall that you decided to run a Perl script posted by a
> reader.  How has that worked out?

Very well.

>
>
> My first response presupposed that you wanted to delete /dir1, /dir2,
> and /dir3.  Further messages indicated that you wanted to keep those
> directories and any unique files they contain.  Please clarify your
> plans for those directories and their contents.

Nope..... I wanted to delete the duplicate files within /dir1/path...
/dir2/path... and /dir3/path.... while keeping any files that differ.

>
>
> How do you plan to validate the consolidation process when it is
> complete?

The consolidation process is already finished. Rsync already took care
of that.

I used:

    rsync -avvc --progress --ignore-existing src dst

The script I was given then simply deleted the duplicates from the
source directories <- in fact this is really specific to me; as I just
wanted to make my life easier in order to find the files that have the
same names but different sizes.

Now that I have only the different files left, I can merge them by
changing the directory name and adding a .1 or so to the end and then
simply rsync those directories over in addition.

Again, it's just a really specific use case for this particular merge to
me at the moment.

>
>
> David
>
>

Regards,

Kaya
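
For readers following along, the cleanup step described above (delete
from the source whatever already exists unchanged in the destination,
keep whatever differs) could be sketched roughly as below. This is not
the Perl script from the thread, which is not shown here. It is a
minimal Python illustration that assumes duplicates live at the same
relative path under both trees and that "duplicate" means byte-identical
content. The function name prune_duplicates and its dry_run parameter
are made up for this example.

```python
import filecmp
import os
import tempfile  # not used by the function; handy for trying it on scratch dirs


def prune_duplicates(src, dst, dry_run=True):
    """Compare every file under src with the file at the same relative
    path under dst (hypothetical helper, not from the thread).

    Byte-identical files are removed from src unless dry_run is true.
    Files that exist in both trees but differ are left alone.
    Returns (duplicates, differing) as sorted lists of relative paths.
    """
    duplicates, differing = [], []
    for root, _dirs, files in os.walk(src):
        for name in files:
            src_path = os.path.join(root, name)
            rel = os.path.relpath(src_path, src)
            dst_path = os.path.join(dst, rel)
            if not os.path.isfile(dst_path):
                continue  # only present in src: nothing to compare against
            # shallow=False forces a content comparison, not just a stat() check
            if filecmp.cmp(src_path, dst_path, shallow=False):
                duplicates.append(rel)
                if not dry_run:
                    os.remove(src_path)
            else:
                differing.append(rel)
    return sorted(duplicates), sorted(differing)
```

Calling prune_duplicates("/dir1", "/pool/dst") with the default
dry_run=True only reports what would be removed (both paths are
placeholders). The second list it returns holds the same-name,
different-content files, which are the ones left to merge by hand.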