lockup during zfs destroy
Freddie Cash
fjwcash at gmail.com
Wed Oct 4 17:58:03 UTC 2017
On Wed, Oct 4, 2017 at 9:27 AM, Freddie Cash <fjwcash at gmail.com> wrote:
> On Wed, Oct 4, 2017 at 9:15 AM, javocado <javocado at gmail.com> wrote:
>
>> I am trying to destroy a dense, large filesystem and it's not going well.
>>
>> Details:
>> - zpool is a raidz3 with 3 x 12 drive vdevs.
>> - target filesystem to be destroyed is ~2T with ~63M inodes.
>> - OS: FreeBSD 10.3 amd64 with 192 GB of RAM.
>> - 120 GB of swap (90GB recently added as swap-on-disk)
>>
>
> Do you have dedupe enabled on any filesystems in the pool? Or was it
> enabled at any point in the past?
>
> This is a common occurrence when destroying large filesystems or lots of
> filesystems/snapshots on pools that have/had dedupe enabled and there's not
> enough RAM/L2ARC to contain the DDT. The kernel wires down nearly all usable
> memory and the system locks up. Adding more RAM and/or being patient with the
> boot-wait-lockup-repeat cycle will (usually) eventually allow it to finish
> the destroy.
>
> There was a loader.conf tunable (or sysctl) added in the 10.x series that
> mitigates this by limiting the number of delete operations that occur in a
> transaction group, but I forget the details on it.
>
> Not sure if this affects pools that never had dedupe enabled or not.
>
> (We used to suffer through this at least once a year until we enabled a
> delete-oldest-snapshot-before-running-backups process to limit the number
> of snapshots.)
>
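If you're not sure whether dedupe is (or ever was) in play, or how large the
DDT has grown, something along these lines should tell you ("tank" is just a
placeholder for your pool name):

zpool list -o name,size,dedup tank    # pool-wide dedup ratio
zfs get -r dedup tank                 # per-dataset dedup property
zdb -DD tank                          # DDT histogram, entry count and size

A dedup ratio above 1.00x, or a non-empty DDT report from zdb, means the pool
is still carrying dedup metadata even if the property has since been switched
off.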
Found it. You can set vfs.zfs.free_max_blocks in /etc/sysctl.conf. That
will limit the number of to-be-freed blocks in a single transaction group.
You can play with that number until you find a value that won't run the
system out of kernel memory trying to free all those blocks in a single
transaction.
On our problem server, running dedupe with only 64 GB of RAM for a 53 TB
pool, we set it to 200,000 blocks:
vfs.zfs.free_max_blocks=200000
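If the box is still up, the same limit can be applied on the fly and then made
persistent; a minimal sketch, assuming the same 200,000-block value (tune it
for your RAM):

sysctl vfs.zfs.free_max_blocks=200000                        # apply immediately
echo 'vfs.zfs.free_max_blocks=200000' >> /etc/sysctl.conf    # persist across reboots

The destroy itself carries on asynchronously (even across reboots), so you can
watch it drain with something like "zpool get freeing tank"; the value counts
down as blocks are released.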
--
Freddie Cash
fjwcash at gmail.com