CFT: TRIM Consolidation on UFS/FFS filesystems
Kirk McKusick
mckusick at mckusick.com
Mon Aug 20 19:35:40 UTC 2018
I have recently added TRIM consolidation support for the UFS/FFS
filesystem. This feature consolidates large numbers of TRIM commands
into a much smaller number of commands covering larger blocks of
disk space. It is best described by the commit message:
Author: mckusick
Date: Sun Aug 19 16:56:42 2018
New Revision: 338056
URL: https://svnweb.freebsd.org/changeset/base/338056
Log:
Add consolidation of TRIM / BIO_DELETE commands to the UFS/FFS filesystem.
When deleting files on filesystems that are stored on flash-memory
(solid-state) disk drives, the filesystem notifies the underlying
disk of the blocks that it is no longer using. The notification
allows the drive to avoid saving these blocks when it needs to
flash (zero out) one of its flash pages. These notifications of
no-longer-being-used blocks are referred to as TRIM notifications.
In FreeBSD these TRIM notifications are sent from the filesystem
to the drive using the BIO_DELETE command.
Until now, the filesystem would send a separate message to the drive
for each block of the file that was deleted. Each Gigabyte of file
size resulted in over 3000 TRIM messages being sent to the drive.
This burst of messages can overwhelm the drive's task queue, causing
multi-second delays for read and write requests.
This implementation collects runs of contiguous blocks in the file
and then consolidates them into a single BIO_DELETE command to the
drive. The BIO_DELETE command describes the run of blocks as a
single large block being deleted. Each gigabyte of file size can
result in as few as two BIO_DELETE commands and typically results in
fewer than ten. Though these larger BIO_DELETE commands take longer to
run, they do not clog the drive task queue, so read and write
commands can intersperse effectively with them.
Though this new feature has been thoroughly reviewed and tested, it
is being added disabled by default so as to minimize the possibility
of disrupting the upcoming 12.0 release. It can be enabled by running
``sysctl vfs.ffs.dotrimcons=1''. Users are encouraged to test it.
If no problems arise, we will consider requesting that it be enabled
by default for 12.0.
Reviewed by: kib
Tested by: Peter Holm
Sponsored by: Netflix
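As a rough illustration of the consolidation idea in the commit message
(this is not the kernel code in ffs_alloc.c), the sketch below merges a
list of freed block numbers into (start, length) runs, each of which
would correspond to one BIO_DELETE. The function name `coalesce_runs`
and the block layout in the example are hypothetical and chosen only
for illustration.

```python
def coalesce_runs(blocks):
    """Merge block numbers into (start, length) runs of contiguous blocks.

    Illustrative only: each returned run stands for one delete request
    covering `length` blocks beginning at block `start`.
    """
    runs = []
    for blk in sorted(set(blocks)):
        if runs and blk == runs[-1][0] + runs[-1][1]:
            # Block directly follows the current run, so extend it.
            start, length = runs[-1]
            runs[-1] = (start, length + 1)
        else:
            # Gap found, so start a new run.
            runs.append((blk, 1))
    return runs


# Hypothetical layout: 952 freed blocks that happen to lie in three extents.
freed = list(range(0, 400)) + list(range(500, 800)) + list(range(1000, 1252))
runs = coalesce_runs(freed)
print(len(freed), "blocks ->", len(runs), "delete requests:", runs)
```

Without consolidation, each of the 952 blocks in this made-up layout
would be sent as its own request. With it, only three requests are
needed, one per contiguous extent.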
This support is off by default, but I am hoping to get enough testing
to establish that it (a) works and (b) is helpful, so that it will be
reasonable to turn it on by default in 12.0. The cutoff for turning it
on by default in 12.0 is September 19th, so I am requesting your
testing feedback in the near term. Please let me know whether you have
managed to use it successfully (or not), and whether it made any
performance difference (good or bad).
To enable TRIM consolidation either use `sysctl vfs.ffs.dotrimcons=1'
or just set the `dotrimcons' variable in sys/ufs/ffs/ffs_alloc.c to 1.
Everything you need to test TRIM consolidation is obtained by setting
the above sysctl. However, if you want to collect statistics on how
effectively the TRIM consolidation is working, the attached diff will
allow you to easily get them. With the diff applied, compile your
kernel and the mount command. Note that if you do not do a buildworld,
you will need to copy /sys/sys/mount.h to /usr/include/sys/mount.h to
get the patched mount command to compile. Then run `mount -v'
(or `mount -v | grep /mnt' to get just the statistics for /mnt).
Removing a 30MB file without TRIM consolidation:
/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 952 total blocks 7616)
Removing the same file with TRIM consolidation:
/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, reads: sync 7 async 0, fsid d43f795b6a7d34fb, TRIM: total 3 total blocks 7616)
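To compare the two runs numerically, a small script along these lines
can pull the TRIM counters out of the patched `mount -v' output. It is
only a sketch: it assumes the "TRIM: total N total blocks M" format
shown in the two lines above, and the helper name `trim_stats` is
hypothetical.

```python
import re

# Matches the TRIM counters as printed in the sample lines above.
TRIM_RE = re.compile(r"TRIM: total (\d+) total blocks (\d+)")


def trim_stats(mount_line):
    """Return (commands, blocks) from one `mount -v` line, or None if absent."""
    m = TRIM_RE.search(mount_line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


before = ("/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, "
          "reads: sync 7 async 0, fsid d43f795b6a7d34fb, "
          "TRIM: total 952 total blocks 7616)")
after = ("/dev/md0 on /mnt (ufs, local, writes: sync 10 async 482, "
         "reads: sync 7 async 0, fsid d43f795b6a7d34fb, "
         "TRIM: total 3 total blocks 7616)")

b_cmds, b_blocks = trim_stats(before)
a_cmds, a_blocks = trim_stats(after)
print(f"blocks per BIO_DELETE: {b_blocks / b_cmds:.1f} -> {a_blocks / a_cmds:.1f}")
print(f"command reduction: {b_cmds / a_cmds:.0f}x")
```

For the two sample lines, the same 7616 blocks are released in both
cases, but the average request grows from 8 blocks to roughly 2539
blocks, and the number of requests drops by a factor of about 317.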
The patched mount output also tracks pending blocks and pending files.
These numbers are printed only when they are non-zero. Here is an
example running with soft updates right after a file has been rm'ed,
but before its blocks have been released:
/dev/md0 on /mnt (ufs, local, soft-updates, writes: sync 2 async 251, reads: sync 5 async 0, fsid 303f795b1be0c459, pending blocks 7616, pending files 1)
Finally, it tracks in-flight BIO_DELETEs and the total blocks
represented by those in-flight BIO_DELETEs. These numbers are also
printed only when they are non-zero. These statistics let you see how
large a backlog of BIO_DELETEs has built up at/in the disk drive, and
you can track how quickly it drains.
Kirk McKusick