Re: Tool to compare directories and delete duplicate files from one directory
- In reply to: David Christensen : "Re: Tool to compare directories and delete duplicate files from one directory"
Date: Thu, 18 May 2023 09:53:01 UTC
On 5/18/23 01:35, David Christensen wrote:
> On 5/17/23 00:55, Kaya Saman wrote:
>>
>> On 5/15/23 23:26, Sysadmin Lists wrote:
>>>> ----------------------------------------
>>>> From: David Christensen <dpchrist@holgerdanske.com>
>>>> Date: May 15, 2023, 1:43:38 AM
>>>> To: <questions@freebsd.org>
>>>> Subject: Re: Tool to compare directories and delete duplicate files
>>>> from one directory
>>>>
>>>>
>>>> It looks like your script only finds duplicates when the subpath is
>>>> identical (?):
>>>>
>>> Yeah. Wasn't that the original problem description? I went off the
>>> example given by Paul earlier in this thread, and it looked like
>>> only files with matching subpaths were being considered (because the
>>> OP accidentally rsync'd files from a source to a bunch of
>>> destination dirs).
>>>
>>
>> Glad to see this thread has turned into an interesting discussion....
>>
>>
>> Just as the OP :-) I will clarify....
>>
>> There was no accidental rsync in place.
>>
>>
>> Due to lack of storage, my files were basically all over the place
>> on different zpools. The problem is that most of those were on iSCSI
>> drives (all running FreeBSD), so I needed to get them into a single
>> place. Of course, as the files were all over, things became a mess.
>>
>> I bought a few new drives and created a new zpool just for this
>> case. So in effect I had to sync the multiple directories to a
>> single destination, *but* of course I didn't use the
>> --remove-source-files option, as I didn't want things to be
>> destructive.
>>
>>
>> But then I needed the extra space too, and that's where this post
>> came from.
>>
>>
>> Regards,
>>
>>
>> Kaya
>
>
> I seem to recall that you decided to run a Perl script posted by a
> reader. How has that worked out?

Very well.

>
> My first response presupposed that you wanted to delete /dir1, /dir2,
> and /dir3. Further messages indicated that you wanted to keep those
> directories and any unique files they contain. Please clarify your
> plans for those directories and their contents.

Nope..... I wanted to delete the duplicate files within /dir1/path...,
/dir2/path..., and /dir3/path... while keeping any files that differ.

>
> How do you plan to validate the consolidation process when it is
> complete?

The consolidation process is already finished. Rsync already took care
of that. I used:

rsync -avvc --progress --ignore-existing src dst

The script I was given then simply deleted the duplicates from the
source directories <- in fact this is really specific to me, as I just
wanted to make my life easier in order to find the files that have the
same names but different sizes.

Now that I have only the differing files left, I can merge them by
changing the directory name, adding a .1 or so to the end, and then
simply rsyncing those directories over in addition. Again, it's just a
really specific use case for this particular merge on my end at the
moment.

>
> David

Regards,


Kaya
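
The Perl script referred to above was posted earlier in the thread and
is not reproduced here. Purely as an illustration of the idea, a
minimal sh sketch of the duplicate-deletion step might look like the
following: it removes a file from a source tree only when the file at
the same relative path in the consolidated destination has an
identical checksum. The SRC/DST paths and the use of FreeBSD's
sha256(1) are assumptions made for the example, not details from the
actual script.

    #!/bin/sh
    # Sketch only -- not the Perl script from this thread.
    # SRC and DST are made-up example paths; adjust before use, and
    # leave the "echo" in place until the output looks right.
    SRC=/dir1/path
    DST=/tank/consolidated/path

    find "$SRC" -type f | while IFS= read -r f; do
        rel=${f#"$SRC"/}                 # path relative to $SRC
        dup="$DST/$rel"
        if [ -f "$dup" ] &&
           [ "$(sha256 -q "$f")" = "$(sha256 -q "$dup")" ]; then
            echo rm "$f"                 # drop "echo" for real deletion
        fi
    done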
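
Similarly, the rename-and-merge step described at the end of the
message could be sketched as below, assuming a made-up destination of
/tank/consolidated: each leftover source tree is copied alongside the
consolidated one under a .1, .2, ... suffix so nothing is overwritten,
and the trees can then be compared and merged at leisure.

    i=1
    for d in /dir1/path /dir2/path /dir3/path; do
        rsync -avvc --progress "$d/" "/tank/consolidated/path.$i/"
        i=$((i + 1))
    done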