zfs diff

From: Eugene M. Zheganin <eugene_at_zheganin.net>
Date: Wed, 12 Feb 2025 16:41:17 UTC
Hello,

I have a 13.2-RELEASE-p3 system with a large storage attached:

===Cut===

NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP DEDUP    HEALTH  ALTROOT
tank    135T   122T  12.8T        -         -    66%    90% 1.27x    ONLINE  -
zroot  31.5G  27.8G  3.74G        -         -    70%    88% 1.00x    ONLINE  -

===Cut===

To process newly arrived files I'd like to use zfs diff to get the list
of files created or modified. So I wrote a simple script
(/root/periodic/zfsdiff) that diffs yesterday's and today's snapshots of
a dataset. Most of these runs simply work, but not all of them: some
(roughly 15%) wait indefinitely for something while seemingly doing
nothing:

===Cut===

39935  -  I      18101:52,29 zfs diff tank/data/tank2@2025-01-20 tank/data/tank2@2025-01-21
46118  -  Is         0:00,00 /bin/sh /root/periodic/zfsdiff
46126  -  I        354:34,75 zfs diff tank/data/tank0@2025-02-03 tank/data/tank0@2025-02-04
49620  -  I       2155:14,42 zfs diff tank/data/tank1@2025-02-10 tank/data/tank1@2025-02-11
53243  -  Is         0:00,00 /bin/sh /root/periodic/zfsdiff
53255  -  I       3607:34,83 zfs diff tank/data/tank0@2025-02-09 tank/data/tank0@2025-02-10
56849  -  Is         0:00,00 /bin/sh /root/periodic/zfsdiff
59725  -  I       3630:23,01 zfs diff tank/data/tank2@2025-01-27 tank/data/tank2@2025-01-28
65460  -  I       1425:25,55 zfs diff tank/data/tank1@2025-02-03 tank/data/tank1@2025-02-04
82371  -  I        111:25,63 zfs diff tank/data/tank3@2025-02-11 tank/data/tank3@2025-02-12
98172  -  Is         0:00,00 /bin/sh /root/periodic/zfsdiff
98223  -  I       4792:11,99 zfs diff tank/data/tank3@2025-02-04 tank/data/tank3@2025-02-05
40589  2  IN     18108:48,07 zfs diff tank/data/tank2@2025-01-20 tank/data/tank2@2025-01-21
28649  6  I+       471:24,81 zfs diff tank/data/tank1@2025-02-03

===Cut===

Surprisingly, this has little to no correlation with the size of the
snapshot. For instance, here is a relatively small snapshot pair whose
diff fails to complete (note the idle process above):

===Cut===

tank/data/tank1@2025-02-03                       31.6M      - 16.0T  -
tank/data/tank1@2025-02-04                       32.5M      - 16.0T  -

===Cut===
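
Hung runs like these could at least be bounded in time so that they do
not pile up. The following is only a sketch, assuming timeout(1) is
available; the helper name bounded and the DIFF_TIMEOUT variable are
hypothetical, and a process blocked inside the kernel may not react to
the signal at all:

```shell
#!/bin/sh
# Hypothetical helper: run a command under timeout(1) and report when
# the limit was hit. DIFF_TIMEOUT is an assumed setting, in seconds.
DIFF_TIMEOUT="${DIFF_TIMEOUT:-3600}"

bounded() {
    # All arguments form the command to run.
    timeout "$DIFF_TIMEOUT" "$@"
    bounded_rc=$?
    # timeout(1) exits with status 124 when the time limit was reached.
    if [ "$bounded_rc" -eq 124 ]; then
        echo "timed out after ${DIFF_TIMEOUT}s: $*" >&2
    fi
    return "$bounded_rc"
}

# Example with snapshot names from the listing above:
# bounded zfs diff tank/data/tank1@2025-02-03 tank/data/tank1@2025-02-04
```

This does not explain why a run hangs; it only keeps a stuck run from
staying around for days.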

Also, some of these runs leave no output, with no trace of the script
having been killed or having crashed, which is very suspicious as well.
You could say this probably means there were no changes, but the
snapshot sizes suggest there were some.
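
To tell "no changes" apart from a silent failure, the exit status and
stderr of each run could be recorded separately from the diff output. A
minimal sketch, not the actual /root/periodic/zfsdiff script; the
function name run_diff, the ZFS_CMD variable and the file suffixes are
hypothetical:

```shell
#!/bin/sh
# Hypothetical wrapper: keep stdout and stderr in separate files and log
# the exit status, so an empty result can be told apart from a failure.
# ZFS_CMD is an assumption that lets the command be replaced for testing.
ZFS_CMD="${ZFS_CMD:-zfs}"

run_diff() {
    # $1 = older snapshot, $2 = newer snapshot, $3 = output file prefix
    "$ZFS_CMD" diff "$1" "$2" >"$3.out" 2>"$3.err"
    diff_rc=$?
    # One log line per run: both snapshot names and the exit status.
    printf '%s %s rc=%d\n' "$1" "$2" "$diff_rc" >>"$3.log"
    return "$diff_rc"
}

# Example with snapshot names from the listing above:
# run_diff tank/data/tank1@2025-02-03 tank/data/tank1@2025-02-04 /tmp/tank1
```

An empty .out file together with rc=0 and an empty .err file would then
point to a diff that really found nothing, while a missing log line
would mean the run never finished.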

Is there a trick to this? Does this look like a race condition? Do I
have to run these sequentially, one diff at a time? Can they interfere
only with other diffs, or also with snapshot creation?
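
If running one diff at a time turns out to be necessary, it could be
enforced with a simple lock. Below is a sketch using an atomic mkdir as
the lock; the helper name with_lock and the lock path are hypothetical,
and whether concurrency is the actual cause is not established here:

```shell
#!/bin/sh
# Hypothetical helper: run a command only while holding a lock directory.
# mkdir either creates the directory or fails, so only one caller at a
# time gets past it.
with_lock() {
    # $1 = lock directory, remaining arguments = command to run
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "lock $lockdir is held, skipping: $*" >&2
        return 75   # arbitrary "try again later" status
    fi
    "$@"
    lock_rc=$?
    rmdir "$lockdir"
    return "$lock_rc"
}

# Example with snapshot names from the listing above:
# with_lock /var/run/zfsdiff.lock \
#     zfs diff tank/data/tank1@2025-02-03 tank/data/tank1@2025-02-04
```

A run that is killed while holding the lock leaves the directory behind,
so a stale lock has to be removed by hand in this version.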


Thanks.

Eugene.