Issues with GTX960 on CentOS7 using bhyve PCI passthru (FreeBSD 11-RC2)
Peter Grehan
grehan at freebsd.org
Fri Jan 13 02:42:51 UTC 2017
Hi,
> First, `nvidia-smi -q` output diff [0] is interesting. It suggests that
> the card may be in some incompletely initialized state: notice the
> "Unknown Error" instead of real UUID, and the P8 power state. Could it
> be that the driver doesn't put the card's BIOS in the right state?
That is extremely likely. bhyve itself doesn't have a BIOS, though
bhyve/UEFI could be modified to handle option ROMs (see
http://awilliam.github.io/presentations/KVM-Forum-2014/#/).
> The command was run in both host and guest without Xorg loaded.
Thanks for the diff; this is very useful.
> - GPU UUID : GPU-f6c71b8e-f6c8-5a42-260d-1164720bf4f2
> + GPU UUID : Unknown Error
That implies some type of h/w access isn't working: either MMIO
register access or the response to a DMA command.
> - Board ID : 0x100
> + Board ID : 0x4
Probably the same cause.
> PCIe Generation
> Max : 2
> - Current : 2
> + Current : 1
bhyve's emulated PCI hostbridge only advertises gen1; that could
easily be changed to gen2, which could make a difference for some of
the clock issues below
(source is pci_emul.c:pci_emul_add_pciecap()).
> Link Width
> Max : 16x
> Current : 16x
That's a bit unexpected since the hostbridge only advertises 1x, but
the driver is probably exporting the host value here.
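
For reference, the generation and width figures discussed above come from
fields of the PCI Express Link Status register: bits 3:0 hold the current
link speed code and bits 9:4 the negotiated link width. The following is a
minimal sketch of that decoding, not code from bhyve or the NVIDIA driver;
the function name and the two sample values are made up for illustration.

```python
# Field layout of the PCI Express Link Status register (16 bits):
#   bits 3:0  current link speed code (1 = 2.5 GT/s, 2 = 5.0 GT/s, 3 = 8.0 GT/s)
#   bits 9:4  negotiated link width (number of lanes)
SPEED_CODE_TO_GEN = {
    1: "gen1 (2.5 GT/s)",
    2: "gen2 (5.0 GT/s)",
    3: "gen3 (8.0 GT/s)",
}


def decode_link_status(value):
    """Return (generation string, lane count) for a Link Status value.

    Hypothetical helper for illustration only.
    """
    speed_code = value & 0xF
    width = (value >> 4) & 0x3F
    gen = SPEED_CODE_TO_GEN.get(speed_code, "unknown code %d" % speed_code)
    return gen, width


# Illustrative values only; not read from real hardware or from bhyve.
for raw in (0x0011, 0x0102):
    gen, width = decode_link_status(raw)
    print("0x%04x -> %s, x%d" % (raw, gen, width))
```

On a Linux guest the raw value could be compared against what
`lspci -vv` reports in the LnkSta line for the passed-through device.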
> - Performance State : P0
> + Performance State : P8
Not sure what's happening here.
> Clocks
> - Graphics : 625 MHz
> - SM : 1251 MHz
> - Memory : 1304 MHz
> - Video : 540 MHz
> + Graphics : 405 MHz
> + SM : 810 MHz
> + Memory : 324 MHz
> + Video : 405 MHz
This may be related to the gen1 vs gen2 issue above.
> When rebooting, I get this:
> nvidia-modeset: ERROR: GPU:0: Idling display engine timed out: 0x0000857d:0:0:0x00000040
This may be DMA not working.
A general issue with PCI passthrough is that often MMIO from the guest
works, since that is just VT-x remapping, but DMA doesn't work due to
issues with IOMMU programming (or incorrect mappings being used). This
gives a device that partially works in that registers can be read, but
data transfer doesn't work.
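
To make the "partially works" case concrete, here is a toy model, assuming
nothing about bhyve's actual data structures: guest CPU accesses and device
DMA are translated through two separate tables, so one can be populated
while the other is missing entries. The class name, method names, and
addresses are all hypothetical.

```python
# Toy model only: two independent translation tables for one passthrough
# device. This does not reflect bhyve's real implementation.
PAGE = 0x1000


class ToyPassthrough:
    def __init__(self):
        # guest-physical page -> host-physical page, CPU side (e.g. VT-x EPT)
        self.cpu_map = {}
        # guest-physical page -> host-physical page, device side (IOMMU)
        self.iommu_map = {}

    def map_cpu(self, gpa, hpa):
        self.cpu_map[gpa // PAGE] = hpa // PAGE

    def map_iommu(self, gpa, hpa):
        self.iommu_map[gpa // PAGE] = hpa // PAGE

    def guest_mmio_read(self, gpa):
        """Guest CPU touching a device register: uses the CPU-side table."""
        return self._translate(self.cpu_map, gpa)

    def device_dma(self, gpa):
        """Device touching guest memory: uses the IOMMU table."""
        return self._translate(self.iommu_map, gpa)

    @staticmethod
    def _translate(table, gpa):
        page = table.get(gpa // PAGE)
        if page is None:
            return None  # no translation: the access faults
        return page * PAGE + gpa % PAGE


vm = ToyPassthrough()
vm.map_cpu(0xC0000000, 0xF0000000)   # device BAR mapped for the guest CPU
# The guest RAM buffer at 0x10000000 was never entered into the IOMMU table.

print(vm.guest_mmio_read(0xC0000010))  # register read translates
print(vm.device_dma(0x10000000))       # DMA has no translation
```

In this sketch the register read resolves while the DMA lookup returns
None, which mirrors a device whose registers respond but whose data
transfers never complete.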
> Jan 11 11:34:49 fbsd12tst kernel: nvidia-modeset: ERROR: GPU:0: Display engine push buffer channel allocation failed
> Jan 11 11:34:49 fbsd12tst kernel: nvidia-modeset: ERROR: GPU:0: Failed to allocate display engine core DMA push buffer
Not sure what's happening with those.
Would it be possible to try the nouveau driver? At least the source
is available, so it may be easier to determine what is broken.
later,
Peter.