[Bug 283815] listing dev.vgapci.X.%iommu hangs indefinitely on NVIDIA card

From: <bugzilla-noreply_at_freebsd.org>
Date: Fri, 03 Jan 2025 16:40:50 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=283815

Anton Saietskii <vsasjason@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|listing sysctl hangs        |listing dev.vgapci.X.%iommu
                   |indefinitely on             |hangs indefinitely on
                   |dev.vgapci.X while it's     |NVIDIA card
                   |NVIDIA card                 |

--- Comment #3 from Anton Saietskii <vsasjason@gmail.com> ---
Narrowed down the issue to a single OID.

I have the following GPU:
vgapci0@pci0:1:0:0:     class=0x030000 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1bb7 subvendor=0x1028 subdevice=0x07b1
    vendor     = 'NVIDIA Corporation'
    device     = 'GP104GLM [Quadro P4000 Mobile]'
    class      = display
    subclass   = VGA
    cap 01[60] = powerspec 3  supports D0 D3  current D0
    cap 05[68] = MSI supports 1 message, 64 bit
    cap 10[78] = PCI-Express 2 legacy endpoint max data 256(256) RO NS
                 max read 512
                 link x16(x16) speed 8.0(8.0) ASPM L0s/L1(L0s/L1) ClockPM disabled
    ecap 0002[100] = VC 1 max VC0
    ecap 0018[250] = LTR 1
    ecap 0004[128] = Power Budgeting 1
    ecap 0001[420] = AER 2 0 fatal 0 non-fatal 1 corrected
    ecap 000b[600] = Vendor [1] ID 0001 Rev 1 Length 36
                 0b 00 01 90 01 00 41 02 02 00 41 01 01 18 00 00
                 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
                 00 00 00 00
    ecap 0019[900] = PCIe Sec 1 lane errors 0

It's unused in my system, so it is turned off via sysutils/acpi_call and xmj@'s
turn_off_gpu.sh from TuningPowerConsumption [0]. Afterwards it is listed as:
vgapci0@pci0:1:0:0:     class=0x030000 rev=0xa1 hdr=0x00 vendor=0x10de device=0x1bb7 subvendor=0x1028 subdevice=0x07b1
    vendor     = 'NVIDIA Corporation'
    device     = 'GP104GLM [Quadro P4000 Mobile]'
    class      = display
    subclass   = VGA
(Turned off using the '\_SB.PCI0.PEG0.PEGP._OFF' method.)

After the NVIDIA card is turned off, any sysctl call that tries to read
dev.vgapci.X.%iommu hangs. Reading other OIDs, e.g. dev.vgapci.0.%driver, works
fine.
NOTE: this isn't a DRM issue, as I don't have any NVIDIA-related modules loaded.
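
For anyone trying to reproduce this without losing a terminal to the hung
process, here is a minimal sketch, assuming Python 3 is installed. The helper
name probe() and the 5 second default timeout are my own choices for
illustration, not part of any FreeBSD tool. It runs a command with a timeout
and reports whether it finished, failed, or appears to hang.

```python
import subprocess
import sys


def probe(argv, timeout=5.0):
    """Run argv and classify the result.

    Returns ("ok", stdout), ("error", stderr) or ("hang", None).
    probe() is a hypothetical helper written for this report.
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
    )
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Deliberately do not wait for the child after kill(): if it is
        # blocked inside the kernel it may never exit, and waiting for it
        # would hang this script as well.
        proc.kill()
        return ("hang", None)
    if proc.returncode != 0:
        return ("error", err.strip())
    return ("ok", out.strip())
```

Calling probe(["sysctl", "-n", "dev.vgapci.0.%driver"]) and then
probe(["sysctl", "-n", "dev.vgapci.0.%iommu"]) should separate the working OID
from the hanging one. If the sysctl process is stuck in the kernel, the kill
may have no effect and the process can stay around until reboot.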

[0]: https://wiki.freebsd.org/TuningPowerConsumption
