Re: Inconsistency between space used by ZFS snapshots reported by zfs list

From: Miroslav Lachman <000.fbsd_at_quip.cz>
Date: Sat, 03 Aug 2024 14:15:55 UTC
On 02/08/2024 23:29, Eric Borisch wrote:
> On Fri, Aug 2, 2024 at 4:05 PM Miroslav Lachman <000.fbsd@quip.cz 
> <mailto:000.fbsd@quip.cz>> wrote:
> 
>     Many times I have tried to find out where the used space in the pool
>     was allocated and could not get a consistent result. Listing with
>     "zfs list -o space" reports that snapshots take up the most space,
>     but when I list the snapshots for a given filesystem
>     (zfs list -r -t snapshot), the sum of their used space does not match
>     the previously reported size.
> 
>     # zfs list -o space ssdtank1/vol1/db/postgres
>     NAME                   AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
>     ank1/vol1/db/postgres  21.8G   169G      102G   67.0G             0B         0B
> 
>     # zfs list -H -p -r -t snapshot tank1/vol1/db/postgres | awk 'BEGIN {
>     used=0 } { used=used+$2 } END { print used/1024/1024/1024"GB" }'
>     41.9309GB
> 
>     USEDSNAP from zfs list -o space: 102 G
>     sum of snapshots sizes: 41.9309 GB
> 
>     Why doesn't it match?
>     What is the real space used by snapshots: the value from
>     zfs list -o space, or the sum of the snapshot sizes listed by
>     zfs list -r -t snapshot?
> 
>     The machine is FreeBSD 13.3-p4 amd64.
> 
> 
>     Kind regards
>     Miroslav Lachman
> 
> 
>  From zfsprops(7):
> 
>     The used space of a snapshot (see the Snapshots section of
>     zfsconcepts(7)) is space that is *referenced exclusively by this
>     snapshot*.  If this snapshot is destroyed, the amount of used space
>     will be freed.  *Space that is shared by multiple snapshots isn't
>     accounted for in this metric.*  When a snapshot is destroyed, space
>     that was previously shared with this snapshot can *become unique to
>     snapshots adjacent to it, thus changing the used space of those
>     snapshots*.  The used space of the latest snapshot can also be
>     affected by changes in the file system.  Note that the used space of
>     a snapshot is a subset of the written space of the snapshot.
> 
> 
> So you have roughly 102-42 = 60GB of space that is referenced by /*more 
> than one*/ snapshot, *and not* by the current (live filesystem) dataset.
> 
> When you ask "what is the real space", it depends on what you're asking. 
> All of the snapshots together take up 102G. For how much space you would 
> recover for removing a given snapshot /now/, the /used/ property 
> provides that, but note that each snapshot's /used/ property may 
> increase as other snapshots are deleted. If you were to delete snapshots 
> one by one, and check the /used/ property of the to-be-deleted snapshot 
> just before deleting each one, adding those up should total close to 102G.
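
The two views described above can be queried directly. This is only a
sketch: the function names are made up for illustration, while the zfs
subcommands, flags and the usedbysnapshots property are the standard ones.
The dry run changes nothing on the pool.

```shell
# Hypothetical helper names; the zfs subcommands and flags are standard.

# Total space held by all snapshots of a dataset, in bytes.
# This is the value shown in the USEDSNAP column of "zfs list -o space".
snap_total_bytes() {
    zfs get -H -p -o value usedbysnapshots "$1"
}

# Dry run: report what destroying the snapshot range first%last would
# reclaim. -n makes no changes, -v prints the snapshots and the total.
snap_range_reclaim() {
    zfs destroy -n -v "$1@$2%$3"
}
```

For example, "snap_total_bytes ssdtank1/vol1/db/postgres" should print the
102G figure in bytes, and snap_range_reclaim with the dataset and two
snapshot names reports the space a whole range would free, including
blocks shared only among the snapshots in that range.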

My problem is mainly with the following case. At 21:47 the Postgres
database deletion was started. After the internal cleanup job (vacuum)
ran, FREE on the pool dropped from 61GB to 15GB, so 46GB of raw space
was gone.
zfs list showed 29GB AVAIL before the deletion and only 7.6GB AVAIL
after the delete and vacuum.
In the meantime, snapshots are created from cron every 15 minutes, and
that is where I would expect to see the 21GB or so by which the
available space changed.

(I shortened the path to avoid line wrapping)
NAME                                               USED AVAIL REFER
vol1/db/postgres@zfsnapsg_2024-08-02_21.00.00--M  40.8M     - 66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.02.00--H  53.8M     - 66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.15.00--M   496M     - 66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.30.00--M   379M     - 66.5G
vol1/db/postgres@zfsnapsg_2024-08-02_21.45.00--M   454M     - 66.5G
vol1/db/postgres@zfsnapsg_2024-08-02_22.00.00--M  1.13G     - 67.0G
vol1/db/postgres@zfsnapsg_2024-08-02_22.02.00--H   613M     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_22.15.00--M  2.29G     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_22.30.00--M  6.12G     - 67.0G
vol1/db/postgres@zfsnapsg_2024-08-02_22.45.00--M  1012M     - 66.9G
vol1/db/postgres@zfsnapsg_2024-08-02_23.00.00--M  30.2M     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_23.02.00--H  18.8M     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_23.15.00--M   129M     - 66.6G

Deletion and cleanup happened between 21:45 and 22:45.
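
As a rough cross-check, the USED column of the listing above can be added
up. The sketch below pastes the listing into a here-document and assumes
only M and G suffixes occur, as they do here; it is not a general parser
for zfs output.

```shell
# Sum the USED column (field 2) of the snapshot listing, in GiB.
# Assumption: every value ends in M or G, as in the listing above.
used_sum=$(awk '
{
    unit = substr($2, length($2), 1)
    num  = substr($2, 1, length($2) - 1) + 0
    if (unit == "G")      total += num * 1024   # convert to MiB
    else if (unit == "M") total += num
}
END { printf "%.2f", total / 1024 }' <<'EOF'
vol1/db/postgres@zfsnapsg_2024-08-02_21.00.00--M  40.8M     - 66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.02.00--H  53.8M     - 66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.15.00--M   496M     - 66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.30.00--M   379M     - 66.5G
vol1/db/postgres@zfsnapsg_2024-08-02_21.45.00--M   454M     - 66.5G
vol1/db/postgres@zfsnapsg_2024-08-02_22.00.00--M  1.13G     - 67.0G
vol1/db/postgres@zfsnapsg_2024-08-02_22.02.00--H   613M     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_22.15.00--M  2.29G     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_22.30.00--M  6.12G     - 67.0G
vol1/db/postgres@zfsnapsg_2024-08-02_22.45.00--M  1012M     - 66.9G
vol1/db/postgres@zfsnapsg_2024-08-02_23.00.00--M  30.2M     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_23.02.00--H  18.8M     - 66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_23.15.00--M   129M     - 66.6G
EOF
)
echo "${used_sum}G"   # prints 12.69G
```

Even with all thirteen snapshots included, the per-snapshot USED values
add up to about 12.7G, well short of the roughly 21GB drop in AVAIL.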

So if "zfs list -t snapshot" shows only the exclusive, unshared data of
each snapshot, then you can never estimate how much space the snapshots
take up together. If snapshot A takes up 3GB and snapshot B takes up 5GB,
but they share 2GB of data, then snapshot A will show 1GB and snapshot B
will show 3GB, which adds up to only 4GB, even though the total space
taken up by both snapshots (including the shared space) is
1 + 2 + 3 = 6GB. Am I getting it right?
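
The A/B example can be checked with a toy model. This is a sketch under
stated assumptions: each block stands for 1GB, the block names are made
up, and none of the snapshot blocks are still referenced by the live
filesystem. It counts the space held by snapshots overall, the sum of the
per-snapshot unique space, and the space freed when the snapshots are
destroyed one after another.

```shell
# Toy model: each line is "<name> <block> <block> ...", one block = 1GB.
# A and B are the two snapshots; "live" is the current filesystem.
result=$(awk '
$1 == "live" { for (i = 2; i <= NF; i++) live[$i] = 1; next }
{
    n++
    for (i = 2; i <= NF; i++) {
        holds[n, $i] = 1                     # snapshot n references block
        if (refs[$i]++ == 0) block[++nb] = $i
    }
}
END {
    for (b = 1; b <= nb; b++) {
        if (block[b] in live) continue
        usedsnap++                           # referenced by snapshots only
        if (refs[block[b]] == 1) listed++    # unique to a single snapshot
    }
    # Destroy snapshots oldest first, adding up what each one frees.
    for (s = 1; s <= n; s++)
        for (b = 1; b <= nb; b++)
            if ((s, block[b]) in holds && --refs[block[b]] == 0 &&
                !(block[b] in live))
                freed++
    printf "usedsnap=%d listed=%d freed=%d", usedsnap, listed, freed
}' <<'EOF'
A a1 s1 s2
B s1 s2 b1 b2 b3
live l1
EOF
)
echo "$result"   # prints usedsnap=6 listed=4 freed=6
```

In this model the listed values sum to 4GB while the snapshots hold 6GB
in total, and deleting them one by one frees the full 6GB because the
shared blocks become unique to B once A is gone.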

> (Side note: awk auto-initializes variables to 0, so you don't need the
> BEGIN clause.)

Yeah, I normally don't use BEGIN, I put it there by accident :)

Kind regards
Miroslav Lachman