Interesting: ZFS scrub prefetch hurting sequential scrub performance?
Borja Marcos
borjam at sarenet.es
Thu Jan 3 10:34:40 UTC 2019
Hi,
I have noticed that my scrubs have become painfully slow. I am wondering whether I’ve just hit some worst case or whether
there is some bad interaction between the new ZFS sequential scrub and scrub prefetch. I don’t recall seeing this behavior
before the sequential scrub code was committed.
Did I hit some worst case, or should scrub prefetch be disabled when the sequential scrub code is in use?
# zpool status
  pool: pool
 state: ONLINE
  scan: scrub in progress since Sat Dec 29 03:56:02 2018
        133G scanned at 309K/s, 129G issued at 300K/s, 619G total
        0 repaired, 20.80% done, no estimated completion time
When this happened last month I rebooted the server and restarted the scrub, and everything ran much better.
The first graph shows the disk I/O bandwidth history for the last week. When the scrub started, disk I/O “busy percent”
reached almost 100%. And curiously, the transfer rates looked rather healthy, at around 10 MBps of read activity.
At first I suspected a misbehaving disk slowing down the whole process with retries, but all the disks show a similar
service time pattern. One is attached for reference.
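If anyone wants to watch the same thing live rather than from graphs, the per-disk busy percentages and service times can be sampled with the usual tools; something like

# gstat -p -I 1s

or

# iostat -x -w 1 da3 da9

should show the same pattern.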
Looking at the rest of the stats for hints of misbehavior, I saw arcstats prefetch_metadata misses rising to
about 2000 per second, and arcstats l2_misses following the same pattern.
Could it be prefetch spending a lot of time writing to the L2ARC, only to have the data evicted, hence the misses?
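For reference, the raw counters behind those graphs should be the FreeBSD kstat sysctls, something like:

# sysctl kstat.zfs.misc.arcstats.prefetch_metadata_misses
# sysctl kstat.zfs.misc.arcstats.l2_misses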
I have tried disabling scrub prefetch (vfs.zfs.no_scrub_prefetch=1) and, voilà, everything picked up speed. Now
with zpool iostat I see bursts of 100+ MBps of read activity and proper scrub progress.
Disk busy percent has gone down to around 50% and the cache stats have become much better. It turns out that
most of the I/O activity was just pointless writes to the L2ARC.
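In case someone wants to try the same on a stalled scrub, it was simply:

# sysctl vfs.zfs.no_scrub_prefetch=1
# zpool iostat -v pool 5

I assume adding vfs.zfs.no_scrub_prefetch=1 to /etc/sysctl.conf would make it persist across reboots, although I haven't needed that yet.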
Now, the hardware configuration.
The server has only 8 GB of memory with a maximum configured ARC size of 4 GB.
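The ARC cap is just the usual loader tunable in /boot/loader.conf, if I recall correctly:

vfs.zfs.arc_max="4G"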
As for the controller, it's an LSI2008 card with IR firmware. I didn't bother to cross-flash it, but I am not using the RAID facilities anyway;
it's just configured as a plain HBA.
mps0: <Avago Technologies (LSI) SAS2008> port 0x9000-0x90ff mem 0xdfff0000-0xdfffffff,0xdff80000-0xdffbffff irq 17 at device 0.0 numa-domain 0 on pci4
mps0: Firmware: 20.00.07.00, Driver: 21.02.00.00-fbsd
mps0: IOCCapabilities: 185c<ScsiTaskFull,DiagTrace,SnapBuf,EEDP,TransRetry,IR>
# zpool status
  pool: pool
 state: ONLINE
  scan: scrub in progress since Sat Dec 29 03:56:02 2018
        323G scanned at 742K/s, 274G issued at 632K/s, 619G total
        0 repaired, 44.32% done, no estimated completion time
config:

        NAME        STATE     READ WRITE CKSUM
        pool        ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            da12    ONLINE       0     0     0
            da13    ONLINE       0     0     0
            da14    ONLINE       0     0     0
            da9     ONLINE       0     0     0
            da15    ONLINE       0     0     0
            da3     ONLINE       0     0     0
          raidz1-1  ONLINE       0     0     0
            da10    ONLINE       0     0     0
            da4     ONLINE       0     0     0
            da5     ONLINE       0     0     0
            da6     ONLINE       0     0     0
            da7     ONLINE       0     0     0
            da8     ONLINE       0     0     0
        logs
          da11p2    ONLINE       0     0     0
        cache
          da11p3    ONLINE       0     0     0

errors: No known data errors
Yes, both ZIL and L2ARC are on the same disk (an SSD). I know it's not optimal, but I guess it's still better
than the high latency of the conventional disks.
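For the record, a log/cache layout like this would have been added with the standard zpool commands, along the lines of:

# zpool add pool log da11p2
# zpool add pool cache da11p3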
# camcontrol devlist
<SEAGATE ST914603SSUN146G 0868> at scbus6 target 11 lun 0 (pass0,da0)
<SEAGATE ST914603SSUN146G 0868> at scbus6 target 15 lun 0 (pass1,da1)
<SEAGATE ST9146803SS FS03> at scbus6 target 17 lun 0 (pass2,da2)
<SEAGATE ST914603SSUN146G 0868> at scbus6 target 18 lun 0 (pass3,da3)
<SEAGATE ST9146803SS FS03> at scbus6 target 20 lun 0 (pass4,da4)
<SEAGATE ST914603SSUN146G 0868> at scbus6 target 21 lun 0 (pass5,da5)
<SEAGATE ST9146803SS FS03> at scbus6 target 22 lun 0 (pass6,da6)
<SEAGATE ST914603SSUN146G 0868> at scbus6 target 23 lun 0 (pass7,da7)
<SEAGATE ST914603SSUN146G 0868> at scbus6 target 24 lun 0 (pass8,da8)
<SEAGATE ST9146803SS FS03> at scbus6 target 25 lun 0 (pass9,da9)
<SEAGATE ST9146803SS FS03> at scbus6 target 26 lun 0 (pass10,da10)
<LSILOGIC SASX28 A.0 5021> at scbus6 target 27 lun 0 (ses0,pass11)
<ATA Samsung SSD 850 2B6Q> at scbus6 target 28 lun 0 (pass12,da11)
<SEAGATE ST9146803SS FS03> at scbus6 target 29 lun 0 (pass13,da12)
<SEAGATE ST9146802SS S229> at scbus6 target 30 lun 0 (pass14,da13)
<SEAGATE ST9146803SS FS03> at scbus6 target 32 lun 0 (pass15,da14)
<SEAGATE ST9146802SS S22B> at scbus6 target 33 lun 0 (pass16,da15)
<TSSTcorp CD/DVDW TS-T632A SR03> at scbus13 target 0 lun 0 (pass17,cd0)
Hope the attachments reach the list; otherwise I will mail them to anyone interested.
Cheers,
Borja.