Date: Sat, 3 Aug 2024 16:15:55 +0200
From: Miroslav Lachman <000.fbsd@quip.cz>
To: Eric Borisch
Cc: "freebsd-fs@freebsd.org"
Subject: Re: Inconsistency between space used by ZFS snapshots reported by zfs list
References: <181aa62d-a940-4bd1-a057-89766f095edf@quip.cz>
List-Id: Filesystems
List-Archive: https://lists.freebsd.org/archives/freebsd-fs

On 02/08/2024 23:29, Eric Borisch wrote:
> On Fri, Aug 2, 2024 at 4:05 PM Miroslav Lachman <000.fbsd@quip.cz> wrote:
>
>     Many times it happened to me that I was looking for where the used
>     space in the pool were allocated and I couldn't get the right result.
>     Listing "zfs list -o space" lists info that snapshots are taking up
>     the most space, but when I list the snapshots for a given filesystem
>     (zfs list -r -t snapshot), the sum of the occupied space does not
>     match previously reported used size.
>
>     # zfs list -o space ssdtank1/vol1/db/postgres
>     NAME                   AVAIL   USED  USEDSNAP  USEDDS  USEDREFRESERV  USEDCHILD
>     ank1/vol1/db/postgres  21.8G   169G      102G   67.0G             0B         0B
>
>     # zfs list -H -p -r -t snapshot tank1/vol1/db/postgres | awk 'BEGIN { used=0 } { used=used+$2 } END { print used/1024/1024/1024"GB" }'
>     41.9309GB
>
>     USEDSNAP from zfs list -o space: 102 G
>     sum of snapshots sizes: 41.9309 GB
>
>     Why it doesn't match?
>     What is the real space used by snapshots? zfs list -o space or the
>     sum of snapshots sizes listed by zfs list -r -t snapshot?
>
>     The machine is FreeBSD 13.3-p4 amd64.
>
>     Kind regards
>     Miroslav Lachman
>
> From zfsprops(7):
>
>     The used space of a snapshot (see the Snapshots section of
>     zfsconcepts(7)) is space that is *referenced exclusively by this
>     snapshot*.  If this snapshot is destroyed, the amount of used space
>     will be freed.  *Space that is shared by multiple snapshots isn't
>     accounted for in this metric.*  When a snapshot is destroyed, space
>     that was previously shared with this snapshot *can become unique to
>     snapshots adjacent to it, thus changing the used space of those
>     snapshots*.  The used space of the latest snapshot can also be
>     affected by changes in the file system.  Note that the used space of
>     a snapshot is a subset of the written space of the snapshot.
>
> So you have roughly 102-42 = 60GB of space that is referenced by /*more
> than one*/ snapshot, *and not* by the current (live filesystem) dataset.
>
> When you ask "what is the real space", it depends on what you're asking.
> All of the snapshots together take up 102G.
> For how much space you would
> recover for removing a given snapshot /now/, the /used/ property
> provides that, but note that each snapshot's /used/ property may
> increase as other snapshots are deleted. If you were to delete snapshots
> one by one, and check the /used/ property of the to-be-deleted snapshot
> just before deleting each one, adding those up should total close to 102G.

My problem is mainly in the following case.

At 21:47 the Postgres database deletion was started. After the internal
cleanup job (vacuum) ran, FREE on the pool dropped from 61GB to 15GB, so
46GB of raw space is gone. zfs list showed 29GB AVAIL before the
deletion, and only 7.6GB AVAIL after the delete and vacuum.

In the meantime, snapshots are created from cron every 15 minutes, and
that's where I would expect to see the 21GB or so that the available
space has changed by.

(I shortened the path to avoid line wrapping)

NAME                                               USED  AVAIL  REFER
vol1/db/postgres@zfsnapsg_2024-08-02_21.00.00--M  40.8M      -  66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.02.00--H  53.8M      -  66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.15.00--M   496M      -  66.4G
vol1/db/postgres@zfsnapsg_2024-08-02_21.30.00--M   379M      -  66.5G
vol1/db/postgres@zfsnapsg_2024-08-02_21.45.00--M   454M      -  66.5G
vol1/db/postgres@zfsnapsg_2024-08-02_22.00.00--M  1.13G      -  67.0G
vol1/db/postgres@zfsnapsg_2024-08-02_22.02.00--H   613M      -  66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_22.15.00--M  2.29G      -  66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_22.30.00--M  6.12G      -  67.0G
vol1/db/postgres@zfsnapsg_2024-08-02_22.45.00--M  1012M      -  66.9G
vol1/db/postgres@zfsnapsg_2024-08-02_23.00.00--M  30.2M      -  66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_23.02.00--H  18.8M      -  66.8G
vol1/db/postgres@zfsnapsg_2024-08-02_23.15.00--M   129M      -  66.6G

Deletion and cleanup happened between 21:45 and 22:45.

So if this "zfs list -t snapshot" shows only the exclusive, "not shared"
data in each snapshot, then you can never estimate how much space the
snapshots take up together.
If snapshot A takes up 3GB, snapshot B takes up 5GB, but they have 2GB
of shared data, then snapshot A will show 1GB and snapshot B will show
3GB, which adds up to only 4GB, even though the total space taken up by
both snapshots (after including the shared space) will be
1 + 2 + 3 = 6GB. Do I get it right?

> (Side note: awk, auto-initializes variables to 0, so you don't need the
> BEGIN clause.)

Yeah, I normally don't use BEGIN, I put it there by coincidence :)

Kind regards
Miroslav Lachman
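
P.S. A minimal sketch of the A/B arithmetic above, in case it helps
anyone checking their own pool. It subtracts the sum of the per-snapshot
"used" values from the dataset's USEDSNAP total, which should leave the
space shared between snapshots. The helper name shared_gib and the fs@A /
fs@B sample lines are made up for illustration; only the input shape
(tab-separated "zfs list -H -p" output) comes from the commands in this
thread.

```shell
#!/bin/sh
# shared_gib USEDSNAP_BYTES
# Reads tab-separated "name<TAB>used" lines on stdin and prints how many
# GiB of the USEDSNAP total are not attributed to any single snapshot.
# Hypothetical helper for illustration; it is not part of zfs.
shared_gib() {
    awk -F '\t' -v total="$1" '
        { exclusive += $2 }   # per-snapshot "used" = exclusive space only
        END { printf "%.2f\n", (total - exclusive) / (1024 * 1024 * 1024) }
    '
}

# Made-up numbers matching the A/B example: 1 GiB and 3 GiB exclusive,
# 6 GiB held by both snapshots together.
printf 'fs@A\t1073741824\nfs@B\t3221225472\n' | shared_gib 6442450944
# prints 2.00

# On a real system the inputs would come from zfs (pool/fs is a placeholder):
#   zfs list -H -p -r -o name,used -t snapshot pool/fs |
#       shared_gib "$(zfs get -H -p -o value usedbysnapshots pool/fs)"
```

With the numbers from this thread (102G USEDSNAP, 41.93GB summed) the
same subtraction gives the roughly 60GB Eric mentioned. For a subset of
snapshots, a dry run such as "zfs destroy -nv pool/fs@first%last" should,
as far as I know, report what the whole range would reclaim together,
without destroying anything.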