Re: Tool to compare directories and delete duplicate files from one directory
Date: Thu, 04 May 2023 22:32:04 UTC
On Thu, May 4, 2023 at 5:47 PM Kaya Saman <kayasaman@optiplex-networks.com> wrote:

> On 5/4/23 17:29, Paul Procacci wrote:
>
> > On Thu, May 4, 2023 at 11:53 AM Kaya Saman <kayasaman@optiplex-networks.com> wrote:
> >
> >> Hi,
> >>
> >> I'm wondering if anyone knows of a tool like diff or so that can also
> >> delete files based on name and size from either left/right or
> >> source/destination directory?
> >>
> >> Basically what I have done is performed an rsync without using the
> >> --remove-source-files option onto a newly bought and created disk pool
> >> (yes zpool) that i am trying to consolidate my data - as it's currently
> >> spread out over multiple pools with the same folder name.
> >>
> >> The issue I am facing mainly is that I perform another rsync and use the
> >> --remove-source-files option, rsync will delete files based on name
> >> while there are some files that have the same name but not same size and
> >> I would like to retain these files.
> >>
> >> Right now I have looked at many different options in both rsync and
> >> other tools but found nothing suitable. I even tested using a few test
> >> dirs and files that I put into /tmp and whatever I tried, the files of
> >> different size either got transferred or deleted.
> >>
> >> How would be a good way to approach this problem?
> >>
> >> Even if I create some kind of shell script and use diff, I think it will
> >> only compare names and not file sizes.
> >>
> >> I'm really lost here....
> >>
> >> Regards,
> >>
> >> Kaya
> >
> > It sounds like you want fdupes. It's in the ports tree.
> >
> > ~Paul
> >
> > --
> > __________________
> >
> > :(){ :|:& };:
>
> I tried fdupes and installed it a while back. For me it felt like it only
> works on a single directory.
>
> My dir structure is that I have:
>
> /dir <- main directory where everything has now been rsync'ed to
>
> /dir_1 <- old directory with partial content
>
> /dir_2 <- more partial content
>
> /dir_3 <- more partial content
>
> The key thing here is that I need to compare:
>
> /dir_(x) with /dir
>
> if the files are different sizes in /dir_(x) then leave them, otherwise
> delete if both name and file size are the same.

Then a tiny shell script does the job, assuming your file names contain no
spaces or other unusual characters:

#!/bin/sh

# a is the main directory; b, c and d are the old ones.
for i in b c d; do
    ls $i/ | while read file; do
        # Not in a/ yet: copy it over and move on.
        [ ! -f a/$file ] && cp $i/$file a/$file && continue
        ref=`stat -f '%z' a/$file`
        src=`stat -f '%z' $i/$file`
        # Same name and same size: remove the old copy.
        [ $ref -eq $src ] && rm -f $i/$file
    done
done

Change the paths accordingly and back up your stuff first. ;)

~Paul

--
__________________

:(){ :|:& };:
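
P.S. If some of the file names do contain spaces, the same idea can be sketched with quoting and a glob instead of parsing ls. This is only a rough sketch under a few assumptions: the function name prune_dupes and the DRY_RUN variable are made up for illustration, only the top level of each old directory is examined, and "duplicate" still means nothing more than same name and same size. It uses wc -c rather than stat because stat's flags differ between BSD and GNU userlands.

```shell
#!/bin/sh
# Sketch only: prune_dupes and DRY_RUN are hypothetical names, not from the thread.
# Usage: prune_dupes MAIN_DIR OLD_DIR...
prune_dupes() {
    ref_dir=$1
    shift
    for old_dir in "$@"; do
        for src in "$old_dir"/*; do
            # Skip subdirectories and the literal pattern left by an empty directory.
            [ -f "$src" ] || continue
            name=${src##*/}
            ref=$ref_dir/$name
            if [ ! -f "$ref" ]; then
                # Not in the main directory yet: copy it over, keep the source.
                if [ -n "${DRY_RUN:-}" ]; then
                    echo "would copy $src"
                else
                    cp -p -- "$src" "$ref"
                fi
                continue
            fi
            # Byte counts; tr strips the padding some wc implementations add.
            ref_size=$(wc -c < "$ref" | tr -d ' ')
            src_size=$(wc -c < "$src" | tr -d ' ')
            # Same name and same size: remove the old copy. Otherwise leave it.
            if [ "$ref_size" -eq "$src_size" ]; then
                if [ -n "${DRY_RUN:-}" ]; then
                    echo "would remove $src"
                else
                    rm -f -- "$src"
                fi
            fi
        done
    done
}
```

Something like `DRY_RUN=1 prune_dupes /dir /dir_1 /dir_2 /dir_3` should only print what would happen; dropping DRY_RUN makes it act. Two files of equal size can still differ in content, so a checksum comparison would be the safer test if that matters for your data.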