svn commit: r290199 - in head/sys/dev: nvd nvme
Steven Hartland
steven at multiplay.co.uk
Tue Dec 8 22:01:24 UTC 2015
On 08/12/2015 21:17, Jim Harris wrote:
>
>
> On Tue, Dec 8, 2015 at 1:48 PM, Steven Hartland
> <steven at multiplay.co.uk> wrote:
>
> Hi Jim, could you let me know the use case for exposing the
> controller stripe size as the disk stripe size, as done by this commit?
>
> I ask as it actually causes problems for ZFS, which has checks to
> ensure zpools perform optimally by correctly configuring ashift to
> match the stripesize if one is reported.
>
> This is usually fine, as stripe size typically reports the physical
> block size of the device, whereas sectorsize is the logical block
> size. Unfortunately ashift is currently limited to 13 (8KB), so
> when nvme reports 128KB it is limited to 8KB, and hence every
> subsequent zpool status reports a warning about optimal performance.
>
> Before I look to fix one or the other, I wanted to fully
> understand the reasoning behind how nvme behaves here.
>
>
> Some Intel NVMe controllers have a slow path for I/Os that span a
> 128KB stripe boundary. The FreeBSD NVMe driver checks for this
> condition, and will split the I/O inside of the NVMe driver in these
> cases, to ensure we do not hit this slow path.
>
> The idea behind reporting the stripe size up through GEOM was to
> provide a hint to upper layers, especially for file system layout - in
> hopes of reducing the number of I/Os that need to be split.
>
> Based on your findings, limiting the stripe size reported up through
> GEOM to 4KB would be OK. This may result in a small number of
> additional I/Os requiring splitting, but the NVMe I/O path is very
> efficient, so these additional I/Os would cause very minimal (if any)
> difference in performance or CPU utilization.
>
Thanks for the fast reply Jim, most appreciated. I've created a review
for the change here: https://reviews.freebsd.org/D4446
If you're happy I'll get that committed.
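
For anyone following along, here is a minimal sketch of the two behaviours
discussed above: checking whether an I/O spans a stripe boundary, and
deriving an ashift from the reported stripe size with an upper limit. This
is illustration only, not the code in the review or in the nvme/ZFS
sources. The function names spans_stripe() and ashift_for() and their
parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 if the I/O [offset, offset + len)
 * crosses a stripe boundary, 0 otherwise. The stripe size must be
 * non-zero; all values are in bytes.
 */
static int
spans_stripe(uint64_t offset, uint64_t len, uint64_t stripe)
{
	if (len == 0)
		return (0);
	/* Compare the stripe index of the first and the last byte. */
	return ((offset / stripe) != ((offset + len - 1) / stripe));
}

/*
 * Hypothetical helper: pick an ashift (log2 of the block size) from
 * the larger of the logical sector size and the reported stripe
 * size, clamped to max_ashift. A stripe size of 0 means "not
 * reported".
 */
static unsigned
ashift_for(uint64_t sectorsize, uint64_t stripesize, unsigned max_ashift)
{
	uint64_t bs = (stripesize > sectorsize) ? stripesize : sectorsize;
	unsigned shift = 0;

	/* Smallest shift such that (1 << shift) >= bs. */
	while (shift < 63 && ((uint64_t)1 << shift) < bs)
		shift++;
	return ((shift > max_ashift) ? max_ashift : shift);
}
```

With a 512 byte sector size and a reported stripe size of 128KB, the
unclamped result would be 17, so a limit of 13 clamps it to 8KB, which is
the mismatch behind the zpool status warning. Reporting 4KB instead gives
an ashift of 12, which fits under the limit.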
Regards
Steve