Re: Tool to compare directories and delete duplicate files from one directory
Date: Fri, 05 May 2023 03:01:54 UTC
On Thu, May 4, 2023 at 10:30 PM Kaya Saman <kayasaman@optiplex-networks.com> wrote:

> On 5/5/23 03:08, Paul Procacci wrote:
>
> > There are multiple reasons why it may not work. My guess is that it's
> > because of characters that could be showing up within the filenames
> > and whatnot.
> >
> > This can be solved with an interpreted language that's a bit more
> > forgiving. Take the following perl script. It does the same thing as
> > the shell script (almost). It renames the source file instead of
> > making a copy of it.
> >
> > run as: ./test.pl /absolute/path/to/master_dir /absolute_path_to_dir_x
> >
> > ###################################################################################
> >
> > #!/usr/bin/env perl
> >
> > use strict;
> > use warnings;
> >
> > # Print a message (the usage line by default) and exit with status $ret.
> > sub msgDie
> > {
> >   my ($ret) = shift;
> >   my ($msg) = shift // "$0 dir_base dir\n";
> >   print $msg;
> >   exit($ret);
> > }
> >
> > msgDie(1) unless(scalar @ARGV == 2);
> >
> > my $base = $ARGV[0];
> > my $dir  = $ARGV[1];
> >
> > msgDie(1, "base directory doesn't exist\n") unless -d $base;
> > msgDie(1, "source directory doesn't exist\n") unless -d $dir;
> >
> > opendir(my $dh, $dir) or msgDie(1, "Unable to open directory: $dir\n");
> > while(readdir $dh)
> > {
> >   next if($_ eq '.' || $_ eq '..');
> >   # Not a regular file in base yet: move it over.
> >   if( ! -f "$base/$_" ){
> >     rename("$dir/$_", "$base/$_");
> >     next;
> >   }
> >
> >   # Present in both: delete the source copy when the sizes match.
> >   my ($ref) = (stat("$base/$_"))[7];
> >   my ($src) = (stat("$dir/$_"))[7];
> >   unlink("$dir/$_") if($ref == $src);
> > }
> >
> > ###################################################################################
> >
> > ~Paul
>
> This didn't seem to work :-(
>
> What exactly happened is this:
>
> I created a set of test directories in /tmp
>
> So, I have /tmp/test1 and /tmp/test2
>
> To mimic the structure of the directories I intend to run this thing
> on, I did this:
>
> create a subdir called: dupdir in /tmp/test1 and /tmp/test2
>
> /tmp/test2/dupdir contains these files: dup and dup1
>
> /tmp/test1/dupdir contains a modified 'dup' file but copied dup1 file.
> However, now things get interesting as dup from test1 contains "1234567"
> and dup from test2 contains "111" <- this is to simulate the file size
> difference.

Worked for me! Regardless, use rsync then.

  rsync --ignore-existing --remove-source-files /src /dest

This would at the very least move the files that don't yet exist in the dest over from the source AND remove those source files AFTER the transfer happens. You'll be halfway there doing that. What you'll be left with are files that exist in BOTH src AND dest.

~Paul
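
For the other half (files that exist in BOTH src and dest), here is a minimal sketch in plain sh, not a tested tool. The function name prune_dupes is made up for illustration. It also swaps the size check from the perl script for a byte-for-byte cmp, on the assumption that same-size files with different contents should be kept. It walks subdirectories, so a layout like /tmp/test1/dupdir is covered. Note that rsync only descends into directories when given -r or -a, so the rsync pass would likely need one of those for the same layout.

```shell
#!/bin/sh
# Sketch only: delete files under src that also exist under dest at the same
# relative path with identical contents. prune_dupes is a hypothetical name.
prune_dupes() {
    src=${1%/}     # strip one trailing slash so relative paths line up
    dest=${2%/}
    # -type f: regular files only; directories themselves are left in place.
    find "$src" -type f -exec sh -c '
        src=$1; dest=$2; shift 2
        for f in "$@"; do
            rel=${f#"$src"/}
            # cmp -s exits 0 only when the two files match byte for byte.
            if [ -f "$dest/$rel" ] && cmp -s "$f" "$dest/$rel"; then
                rm -- "$f"
            fi
        done
    ' sh "$src" "$dest" {} +
}
```

Run it after the rsync pass as: prune_dupes /src /dest. Anything still left in src afterwards has the same name as a file in dest but different contents, and needs a manual look.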