nvme detached

Dan Langille dan at langille.org
Wed Aug 4 17:16:05 UTC 2021


Yesterday I had an NVMe stick detach.  This degraded a zpool, but zpool status indicated the device was still online. Yet it was not visible in /dev/.

More details are at https://gist.github.com/dlangille/bc8af0f5a098d3a106fa5fbf40a88d42

I first noticed the issue when multiple ssh sessions froze up.

Then Nagios started alerting. A reboot cleared this up. Scrubs did not find any errors.

The /var/log/messages entries are below.

Thank you.

Aug  3 15:06:02 knew kernel: nvme0: Resetting controller due to a timeout.
Aug  3 15:06:02 knew kernel: nvme0: resetting controller
Aug  3 15:06:32 knew kernel: nvme0: controller ready did not become 0 within 30500 ms
Aug  3 15:06:32 knew kernel: nvme0: failing queued i/o
Aug  3 15:06:32 knew kernel: nvme0: IDENTIFY (06) sqid:0 cid:0 nsid:0 cdw10:00000001 cdw11:00000000
Aug  3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:0 cid:0 cdw0:0
Aug  3 15:06:32 knew kernel: nvme0: failing outstanding i/o
Aug  3 15:06:32 knew kernel: nvme0: READ sqid:2 cid:123 nsid:1 lba:250153507 len:5
Aug  3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:2 cid:123 cdw0:0
Aug  3 15:06:32 knew kernel: nvme0: failing outstanding i/o
Aug  3 15:06:32 knew kernel: nvme0: WRITE sqid:3 cid:118 nsid:1 lba:454009346 len:1
Aug  3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:3 cid:118 cdw0:0
Aug  3 15:06:32 knew kernel: nvme0: failing outstanding i/o
Aug  3 15:06:32 knew kernel: nvme0: WRITE sqid:4 cid:122 nsid:1 lba:454009345 len:1
Aug  3 15:06:32 knew kernel: nvme0: ABORTED - BY REQUEST (00/07) sqid:4 cid:122 cdw0:0
Aug  3 15:06:32 knew kernel: nvd0: detached

-- 
  Dan Langille
  dan at langille.org


More information about the freebsd-questions mailing list