Re: diff(1) goes into cpu-hogging endless loop
Date: Sat, 25 Mar 2023 22:29:47 UTC
On Sat, Mar 25, 2023 at 09:55:14PM +0000, Jamie Landeg-Jones wrote: > Hi, A "diff" of 2 files: > > 1 77,933,904 bytes > 2 63,013,818 bytes > > , goes into an endless loop, whilst "gdiff" completes the operation in > about 5 seconds. > > I tested using the latest "diff" from current, and get the same result. > > Splitting both files into 10Mb chunks, and diffing these was successful. > > A ktrace of the "diff" actually stops producing any output after about > 5 seconds, whilst the cpu looping continues. > > Any ideas on what to do next? Does anyone else get the same result? > > The files are just utf-8 freebsd git logs, and are available here if > anyone would like to test: > > http://www.catflap.org/jamie/1.xz (13,282,864 bytes) > http://www.catflap.org/jamie/2.xz (12,221,164 bytes) > > Cheers, Jamie My guess is that you are hitting a worst case in the stone algorithm. I have a WIP review to integrate the Myers algorithm from libdiff here: https://reviews.freebsd.org/D36860 - Tom