NVME aborting outstanding i/o and controller resets
Warner Losh
imp at bsdimp.com
Fri Apr 12 14:43:58 UTC 2019
On Fri, Apr 12, 2019 at 6:00 AM Patrick M. Hausen <hausen at punkt.de> wrote:
> Hi all,
>
> my problems seem not to be TRIM related after all … and I can now
> quickly reproduce it.
>
> =====
> root at freenas01[~]# sysctl vfs.zfs.trim.enabled
> vfs.zfs.trim.enabled: 0
> =====
> root at freenas01[~]# cd /mnt/zfs
> root at freenas01[/mnt/zfs]# dd if=/dev/urandom of=hurz bs=10m
> ^C — system freezes temporarily
>
This does one I/O at a time to the filesystem, which then repackages the
I/Os such that multiple I/Os are going on with the NVMe card.
> =====
> Apr 12 13:42:16 freenas01 nvme6: resetting controller
>
OK. This means that whatever I/O workload we've done has caused the NVME
card to stop responding for 30s, so we reset it.
> Apr 12 13:42:16 freenas01 nvme6: aborting outstanding i/o
> Apr 12 13:42:16 freenas01 nvme6: WRITE sqid:1 cid:117 nsid:1 lba:981825104
> len:176
> Apr 12 13:42:16 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1
> cid:117 cdw0:0
>
But only one request was in flight... And we keep doing it over and over
again, but to different LBAs suggesting that we're stuttering: a few go
through and then things wedge again. This happens every 30ish seconds.
> Apr 12 13:42:49 freenas01 nvme6: resetting controller
> Apr 12 13:42:50 freenas01 nvme6: aborting outstanding i/o
> Apr 12 13:42:50 freenas01 nvme6: WRITE sqid:1 cid:127 nsid:1 lba:984107936
> len:96
> Apr 12 13:42:50 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1
> cid:127 cdw0:0
> Apr 12 13:43:35 freenas01 nvme6: resetting controller
> Apr 12 13:43:35 freenas01 nvme6: aborting outstanding i/o
> Apr 12 13:43:35 freenas01 nvme6: WRITE sqid:1 cid:112 nsid:1 lba:976172032
> len:176
> Apr 12 13:43:35 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1
> cid:112 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: resetting controller
>
And then this one goes wonkies.
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:111 nsid:1 lba:976199176
> len:248
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:111 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:102 nsid:1 lba:976199432
> len:248
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:102 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:112 nsid:1 lba:976199680
> len:8
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:112 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:105 nsid:1 lba:976199752
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:105 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:122 nsid:1 lba:976199816
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:122 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:103 nsid:1 lba:976199688
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:103 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:126 nsid:1 lba:976200136
> len:56
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:126 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:106 nsid:1 lba:976200192
> len:8
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:106 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:107 nsid:1 lba:976200200
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:107 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:127 nsid:1 lba:976200264
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:127 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:113 nsid:1 lba:976200328
> len:120
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:113 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:108 nsid:1 lba:976200448
> len:72
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:108 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:116 nsid:1 lba:976200520
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:116 cdw0:0
> =====
> root at freenas01[~]# nvmecontrol identify nvme6
> Controller Capabilities/Features
> ================================
> Vendor ID: 8086
> Subsystem Vendor ID: 8086
> Serial Number: BTLJ90230EC61P0FGN
> Model Number: INTEL SSDPE2KX010T8
>
So it's an intel card.
> Firmware Version: VDV10131
> Recommended Arb Burst: 0
> IEEE OUI Identifier: e4 d2 5c
> Multi-Interface Cap: 00
> Max Data Transfer Size: 131072
> Controller ID: 0x00
>
> Admin Command Set Attributes
> ============================
> Security Send/Receive: Not Supported
> Format NVM: Supported
> Firmware Activate/Download: Supported
> Namespace Managment: Supported
> Abort Command Limit: 4
> Async Event Request Limit: 4
> Number of Firmware Slots: 1
> Firmware Slot 1 Read-Only: No
> Per-Namespace SMART Log: No
> Error Log Page Entries: 64
> Number of Power States: 1
>
> NVM Command Set Attributes
> ==========================
> Submission Queue Entry Size
> Max: 64
> Min: 64
> Completion Queue Entry Size
> Max: 16
> Min: 16
> Number of Namespaces: 1
> Compare Command: Not Supported
> Write Uncorrectable Command: Supported
> Dataset Management Command: Supported
> Volatile Write Cache: Not Present
>
> Namespace Drive Attributes
> ==========================
> NVM total cap: 1000204886016
> NVM unallocated cap: 0
> =====
> root at freenas01[~]# zpool status
> pool: freenas-boot
> state: ONLINE
> scan: scrub repaired 0 in 0 days 00:00:03 with 0 errors on Sun Apr 7
> 03:45:03 2019
> config:
>
> NAME STATE READ WRITE CKSUM
> freenas-boot ONLINE 0 0 0
> mirror-0 ONLINE 0 0 0
> nvd0p2 ONLINE 0 0 0
> nvd1p2 ONLINE 0 0 0
>
> errors: No known data errors
>
> pool: zfs
> state: ONLINE
> scan: scrub repaired 0 in 0 days 00:01:53 with 0 errors on Fri Mar 22
> 19:53:37 2019
> config:
>
> NAME STATE READ
> WRITE CKSUM
> zfs ONLINE 0
> 0 0
> raidz2-0 ONLINE 0
> 0 0
> gptid/97d0a7ce-44e5-11e9-982e-0025905f99ac ONLINE 0
> 0 0
> gptid/98053880-44e5-11e9-982e-0025905f99ac ONLINE 0
> 0 0
> gptid/983a9468-44e5-11e9-982e-0025905f99ac ONLINE 0
> 0 0
> gptid/987100f2-44e5-11e9-982e-0025905f99ac ONLINE 0
> 0 0
> gptid/98aa6e88-44e5-11e9-982e-0025905f99ac ONLINE 0
> 0 0
> gptid/98f20b8c-44e5-11e9-982e-0025905f99ac ONLINE 0
> 0 0
>
> errors: No known data errors
> =====
>
> The problem only appears on the data pool built from 6 INTEL
> SSDPE2KX010T8. The system
> pool built from two KXG50ZNV256G TOSHIBA does not show any problem with
> write load.
>
OK. That suggests Intel has a problem with their firmware.
> All the Intel drives have the latest firmware according to the Intel
> support website.
>
> Could it possibly help to tweak dev.nvme.7.ioq0.num_entries and similar
> entries?
>
It might. Maybe. I'm unsure what the intel firmware bug is.
> What about switching to the nda device instead of nvd?
>
I doubt that would have any effect. They both throw as much I/O onto the
card as possible in the default config.
There's been some minor improvements in -current here. Any chance you could
experimentally try that with this test? You won't get as many I/O abort
errors (since we don't print those), and we have a few more workarounds for
the reset path (though honestly, it's still kinda stinky).
Warner
More information about the freebsd-stable
mailing list