Re: git: 0e33c2e6df7a - main - pci: Only re-route IRQs based on firmware on x86

From: John Baldwin <jhb_at_FreeBSD.org>
Date: Tue, 01 Apr 2025 16:21:16 UTC
On 3/31/25 14:14, Jessica Clarke wrote:
> On 29 Mar 2025, at 20:18, Colin Percival <cperciva@FreeBSD.org> wrote:
>>
>> The branch main has been updated by cperciva:
>>
>> URL: https://cgit.FreeBSD.org/src/commit/?id=0e33c2e6df7a5de65db40c7cc0fc97f66da28ccd
>>
>> commit 0e33c2e6df7a5de65db40c7cc0fc97f66da28ccd
>> Author:     Colin Percival <cperciva@FreeBSD.org>
>> AuthorDate: 2025-03-28 18:07:01 +0000
>> Commit:     Colin Percival <cperciva@FreeBSD.org>
>> CommitDate: 2025-03-29 20:17:29 +0000
>>
>>     pci: Only re-route IRQs based on firmware on x86
>>
>>     There is a (very historical) call to pci_assign_interrupt for the
>>     purpose of routing IRQs which may have been set up wrong by x86 BIOS
>>     or firmware.  On non-x86 systems, this is unnecessary; and on INTRNG
>>     systems it results in a (synthetic) IRQ leak and ultimately a kernel
>>     panic after many hotplug/unplug cycles.
> 
> This is in fact incorrect. Whilst there may well be a leak that needs
> fixing, the rerouting is absolutely needed on non-x86 platforms. See
> 5884fab46153dee52bda660bcabca95c3cc97f44 and
> 7de649170fd804668da78f005c7966941b402106 for some of the history behind
> this. As a result of this commit the problem described in the second
> commit is reintroduced.

Hmm, this is my bad as I should have remembered those.  However, as the
second commit log alludes to, INTRng does not really match the design of
what is intended here and what the PCI bus currently expects.  The
intr_map[] entry should probably only be allocated when the INTx IRQ is
allocated via bus_alloc_resource() with a rid of 0, not when the IRQ is
routed.  In particular, there is no "unroute" callback which is where
the leak comes from.  The intention of routing is to confirm which pin
on a PIC a PCI INTx line is connected to.  It is not to actually allocate
the interrupt as the routing may never get used if either MSI is used
instead or if no driver attaches.  The 0x20-0x23 you mention in the
second commit log should probably be the IRQ value used instead of a
dynamically allocated value, and the dynamically allocated value should
be allocated as a result of bus_alloc_resource and stored internally.
x86 manages to do this today with the mapping of (local APIC, IDT vector)
tuples not encoded in the IRQ values.

-- 
John Baldwin