NVMe performance 4x slower than expected
Konstantin Belousov
kostikbel at gmail.com
Thu Apr 2 10:15:27 UTC 2015
On Thu, Apr 02, 2015 at 12:24:45AM +0200, Tobias Oberstein wrote:
> Am 01.04.2015 um 23:23 schrieb Konstantin Belousov:
> > On Wed, Apr 01, 2015 at 10:52:18PM +0200, Tobias Oberstein wrote:
> >>> > FreeBSD 11 Current with patches (DMAR and ZFS patches, otherwise the box
> >>> > doesn't boot at all .. because of 3TB RAM and the amount of periphery).
> >>>
> >>> Do you still have WITNESS and INVARIANTS turned on in your kernel
> >>> config? They're turned on by default for Current, but they do have
> >>> some performance impact. To turn them off, just build a
> >>> GENERIC-NODEBUG kernel.
> >>
> >> WITNESS is off, INVARIANTS is still on.
> > INVARIANTS are costly.
>
> ah, ok. will rebuild without this option.
>
> > I have had the following patch for a long time; it allowed increasing
> > the pps in iperf and similar tests when DMAR is enabled. In your case
> > it could reduce the rate of the DMAR interrupts.
>
> You mean these lines from vmstat?
>
> irq257: dmar0:qi 22312 0
> irq259: dmar1:qi 22652 0
> irq261: dmar2:qi 261874194 6911
> irq263: dmar3:qi 124939 3
>
> So these dmar2 interrupts come from DMAR region 2 which is used by nvd7?
Dmar unit 2.
In modern machines there is one translation unit (sometimes two) per
CPU package, which handles the devices on the PCIe buses rooted in that
socket.
The interrupt stats above mean that the load on your machine is
unbalanced with respect to the PCIe buses: most of the DMA transfers
were performed by devices attached to the bus(es) on the socket where
DMAR 2 is located.
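One way to spot such an imbalance is to compare the per-unit qi
(invalidation-queue) interrupt counters from "vmstat -i". A minimal
sketch, using the counters quoted above as a canned sample (the file
path is just for illustration):

```shell
# Snapshot of the dmar interrupt counters quoted above, in the format
# printed by "vmstat -i" on FreeBSD.
cat > /tmp/dmar_irqs.txt <<'EOF'
irq257: dmar0:qi 22312 0
irq259: dmar1:qi 22652 0
irq261: dmar2:qi 261874194 6911
irq263: dmar3:qi 124939 3
EOF
# Column 3 is the total interrupt count; report the busiest DMAR unit.
awk '{ if ($3 + 0 > max) { max = $3; unit = $2 } }
     END { print unit, max }' /tmp/dmar_irqs.txt
# → dmar2:qi 261874194
```

On a live system the same idea applies to "vmstat -i | grep dmar": one
unit carrying several orders of magnitude more interrupts than its
siblings means the DMA-heavy devices all sit on that unit's socket.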
>
> From dmesg:
>
> dmar0: <DMA remap> iomem 0xc7ffc000-0xc7ffcfff on acpi0
> dmar1: <DMA remap> iomem 0xe3ffc000-0xe3ffcfff on acpi0
> dmar2: <DMA remap> iomem 0xfbffc000-0xfbffcfff on acpi0
> dmar3: <DMA remap> iomem 0xabffc000-0xabffcfff on acpi0
>
> mpr0: dmar3 pci0:4:0:0 rid 400 domain 4 mgaw 48 agaw 48 re-mapped
> mpr1: dmar2 pci0:195:0:0 rid c300 domain 2 mgaw 48 agaw 48 re-mapped
>
> nvme0: dmar0 pci0:65:0:0 rid 4100 domain 0 mgaw 48 agaw 48 re-mapped
> nvme1: dmar0 pci0:67:0:0 rid 4300 domain 1 mgaw 48 agaw 48 re-mapped
> nvme2: dmar0 pci0:69:0:0 rid 4500 domain 2 mgaw 48 agaw 48 re-mapped
> nvme3: dmar1 pci0:129:0:0 rid 8100 domain 0 mgaw 48 agaw 48 re-mapped
> nvme4: dmar1 pci0:131:0:0 rid 8300 domain 1 mgaw 48 agaw 48 re-mapped
> nvme5: dmar1 pci0:132:0:0 rid 8400 domain 2 mgaw 48 agaw 48 re-mapped
> nvme6: dmar2 pci0:193:0:0 rid c100 domain 0 mgaw 48 agaw 48 re-mapped
> nvme7: dmar2 pci0:194:0:0 rid c200 domain 1 mgaw 48 agaw 48 re-mapped
>
> unknown: dmar3 pci0:0:29:0 rid e8 domain 0 mgaw 48 agaw 48 re-mapped
> unknown: dmar3 pci0:0:26:0 rid d0 domain 1 mgaw 48 agaw 48 re-mapped
>
> ix0: dmar3 pci0:1:0:0 rid 100 domain 2 mgaw 48 agaw 48 re-mapped
> ix1: dmar3 pci0:1:0:1 rid 101 domain 3 mgaw 48 agaw 48 re-mapped
>
> ix0: Using MSIX interrupts with 49 vectors
> ix1: Using MSIX interrupts with 49 vectors
>
> --
>
> So the LSI HBAs, Intel NICs and NVMe are all using DMAR, but only the
> NICs use MSI-X?
MSI-X is a method of delivering interrupt requests to CPUs.
DMARs are engines that translate the addresses of DMA requests (and
also remap interrupts).
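Whether a device uses MSI-X is visible in its capability list, which
"pciconf -lc" prints on FreeBSD. A minimal sketch of pulling the vector
count out of such a capability line (the sample line below is
fabricated for illustration, not taken from this machine):

```shell
# A hypothetical MSI-X capability line as "pciconf -lc <device>" would
# print it; the table size ("supports N messages") is the vector count.
line='    cap 11[b0] = MSI-X supports 49 messages'
echo "$line" | sed -n 's/.*supports \([0-9]*\) messages.*/\1/p'
# → 49
```

The drivers decide how many of the advertised vectors to actually
allocate, which is why dmesg reports the in-use count separately.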
>
> But 2 * 49 = 98, and that is smaller than the 191 which Jim mentions.
>
> And what are those "unknown" devices on dmar3?
0:26:0 and 0:29:0 are most likely USB controllers; those bus/device/function
numbers are typical for the Intel PCH. "unknown" is displayed when a PCI
device has no driver attached; you probably do not have the USB drivers
loaded. DMAR still has to enable a translation context for the USB
controllers, since the BIOS performs transfers behind the OS's back and
instructs the DMAR driver to enable the mappings.
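Driverless devices are easy to pick out of "pciconf -l" output, where
they show up with a "noneN" handle instead of a driver name. A sketch
over a fabricated excerpt (the class codes and addresses below are
illustrative, not from this box):

```shell
# Hypothetical excerpt of "pciconf -l": entries named "noneN@" have no
# driver attached, which is what the DMAR code reports as "unknown".
cat > /tmp/pciconf.txt <<'EOF'
none0@pci0:0:26:0: class=0x0c0320 card=0x00000000 chip=0x8d2d8086
none1@pci0:0:29:0: class=0x0c0320 card=0x00000000 chip=0x8d268086
nvme0@pci0:65:0:0: class=0x010802 card=0x00000000 chip=0x09538086
EOF
# Count the devices without a driver.
grep -c '^none' /tmp/pciconf.txt
# → 2
```

Class 0x0c0320 is a USB EHCI controller, which matches the guess above
that the two unknown dmar3 contexts belong to the PCH USB controllers.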
More information about the freebsd-hackers mailing list