NVMe timeout / aborting outstanding i/o (again)

Warner Losh imp at bsdimp.com
Wed Nov 13 18:41:36 UTC 2019


On Wed, Nov 13, 2019 at 11:34 AM Theron <theron.tarigo at gmail.com> wrote:

> With the latest 12.1-STABLE (r354687), I again see 30-second hangs on file
> access when resuming from suspend, with dmesg:
>
> nvme0: Resetting controller due to a timeout.
> nvme0: resetting controller
> nvme0: aborting outstanding i/o
> nvme0: aborting outstanding i/o
> (...)
> nvme0: aborting outstanding i/o
> nvme0: aborting outstanding i/o
> nvme0: nvme0: aborting outstanding i/o
> async event occurred (type 0x0, info 0x00, page 0x01)
> nvme0: aborting outstanding i/o
> nvme0: aborting outstanding i/o
> (...)
>
> I thought this was fixed with r351914 "MFC r351747: Implement nvme
> suspend / resume for pci attachment."
>

There were three causes of timeouts that have been fixed. First, a prior MFC
fixed a missed interrupt caused by differences in how some drives implement
read-modify-write of the MSI registers. Second, during suspend, we weren't
properly shutting down the controller. Finally, the commit also fixed an
issue with restoring a controller that might have had its power removed.


> Latest change, r354074 "MFC r352630: Make nvme(4) driver some more NUMA
> aware.", looks suspicious, I'll test before vs. after that change when I
> can.
>
> Is anyone else seeing this?
>

I've had no other reports of this, so it's a good one to test. I don't
think it will matter, but I'm not sure of that.
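
If it helps with the before/after comparison, here is a minimal sketch for
tallying the reset and abort messages per controller. It assumes the dmesg
output from each kernel has been saved as text; the function name
summarize_nvme_log is hypothetical, and the patterns are copied from the
excerpt quoted above rather than taken from the driver source.

```python
import re
from collections import Counter

# Message patterns taken from the dmesg excerpt quoted in this thread.
RESET_RE = re.compile(r"(nvme\d+): Resetting controller due to a timeout")
ABORT_RE = re.compile(r"(nvme\d+): aborting outstanding i/o")


def summarize_nvme_log(text):
    """Count timeout-triggered resets and aborted I/Os per controller.

    `text` is the saved dmesg output as one string. Returns two Counters
    keyed by controller name (e.g. "nvme0"): (resets, aborts).
    """
    resets = Counter()
    aborts = Counter()
    for line in text.splitlines():
        # finditer (not match) so interleaved kernel output such as
        # "nvme0: nvme0: aborting outstanding i/o" is still counted once.
        for m in RESET_RE.finditer(line):
            resets[m.group(1)] += 1
        for m in ABORT_RE.finditer(line):
            aborts[m.group(1)] += 1
    return resets, aborts
```

Running this over the dmesg captured after a resume on each revision gives
two sets of counts that can be compared directly.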

Warner

