Re: debugging a msi interrupt panic
- In reply to: John Hay : "debugging a msi interrupt panic"
- Go to: [ bottom of page ] [ top of archives ] [ this month ]
Date: Mon, 25 Sep 2023 08:30:16 UTC
Hi, I found the cause of the panics and thought I should give feedback here. MSI interrupt vectors have to be aligned to the size of the allocation, which have to be in powers of 2 up to a maximum of 32 vectors. native_apic_alloc_vectors(), which is used by pci_alloc_msi(), does not properly align the vectors that it allocates. It does an aligned check, but then after that adds APIC_IO_INTS, which is 48, and so the alignment is lost. So allocating smaller numbers of vectors would work because 48 is a multiple of 16. If you allocate 32 MSI interrupts, the pcie card just ignores the bottom 5 bits of the vector that was written to its MSI_DATA register, so in my case where native_apic_alloc_vectors() allocated a base vector of 0x50, the card ignores bit 4 and generates interrupt vectors from 0x40 to 0x5f. I have opened a bug report about it: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=274074 Regards John On Fri, 22 Sept 2023 at 13:00, John Hay <john@sanren.ac.za> wrote: > Hi, > > How do one debug msi interrupts on FreeBSD when writing a device driver? > > I'm writing a driver for a OCP TimeCard. The card has several different > functions, each "tied" to a specific msi interrupt. So it would be useful > to use multiple msi interrupts. The PCIe functionality is a Xilinx AXI PCIe > blob, with 32 MSI interrupts. It does not have MSI-X capability. > > <snip> > # pciconf -lbc none1@pci0:1:0:0: > none0@pci0:1:0:0: class=0x058000 rev=0x00 hdr=0x00 vendor=0x1d9b > device=0x0400 subvendor=0x10ee subdevice=0x0007 > bar [10] = type Memory, range 32, base 0xf4000000, size 33554432, > enabled > cap 01[40] = powerspec 3 supports D0 D3 current D0 > cap 05[48] = MSI supports 32 messages, 64 bit > cap 10[60] = PCI-Express 2 endpoint max data 256(256) NS > max read 512 > link x1(x1) speed 2.5(2.5) ASPM disabled(L0s) > ecap 0003[100] = Serial 1 0000000000000000 > </snip> > > Currently in attach, I'm doing the following (error checking removed): > > <snip> > pci_enable_busmaster(dev); /* msi irq needs it enabled, at least for the > AXI core */ > sc->sc_msi = pci_msi_count(dev); > pci_alloc_msi(dev, &sc->sc_msi) > loop for each function /* currently just a single iteration to get one > function working */ > port->pp_irid = port->intr_num + 1; /* msi irq no starts at one */ > port->pp_ires = bus_alloc_resource_any(dev, SYS_RES_IRQ, > &port->pp_irid, RF_ACTIVE | RF_SHAREABLE); > bus_setup_intr(dev, port->pp_ires, INTR_TYPE_TTY | INTR_MPSAFE, > timecard_intr, NULL, port, &port->p_ihandle) > </snip> > > I did define a variable timecard_intr_count that timecard_intr() should > increment, but it does not change, so timecard_intr() probably does not get > called. > > Here is the console output of the panic and a show apic and show lapic. > The driver is not compiled in, but loaded later manually. > > <snip> > pci0: driver added > found-> vendor=0x8086, dev=0x1e3a, revid=0x04 > domain=0, bus=0, slot=22, func=0 > class=07-80-00, hdrtype=0x00, mfdev=1 > cmdreg=0x0006, statreg=0x0010, cachelnsz=0 (dwords) > lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) > intpin=a, irq=16 > powerspec 3 supports D0 D3 current D0 > MSI supports 1 message, 64 bit > pci0:0:22:0: reprobing on driver added > timecard0: TimeCard Probe > Vendor ID : 0x8086 > Device ID : 0x1e3a > pci1: driver added > found-> vendor=0x1d9b, dev=0x0400, revid=0x00 > domain=0, bus=1, slot=0, func=0 > class=05-80-00, hdrtype=0x00, mfdev=0 > cmdreg=0x0002, statreg=0x0010, cachelnsz=16 (dwords) > lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns) > powerspec 3 supports D0 D3 current D0 > MSI supports 32 messages, 64 bit > pci0:1:0:0: reprobing on driver added > timecard0: TimeCard Probe > Vendor ID : 0x1d9b > Device ID : 0x400 > We've got the TimeCard, probe successful! > timecard0: TimeCard Probe > Vendor ID : 0x1d9b > Device ID : 0x400 > We've got the TimeCard, probe successful! > timecard0: <TimeCard> mem 0xf4000000-0xf5ffffff at device 0.0 on pci1 > TimeCard Attach for : deviceID : 0x4001d9b > Address of timecard_intr_count 0xffffffff8314e400 > timecard0: attempting to allocate 32 MSI vectors (32 supported) > msi: routing MSI IRQ 38 to local APIC 4 vector 80 > msi: routing MSI IRQ 39 to local APIC 4 vector 81 > msi: routing MSI IRQ 40 to local APIC 4 vector 82 > msi: routing MSI IRQ 41 to local APIC 4 vector 83 > msi: routing MSI IRQ 42 to local APIC 4 vector 84 > msi: routing MSI IRQ 43 to local APIC 4 vector 85 > msi: routing MSI IRQ 44 to local APIC 4 vector 86 > msi: routing MSI IRQ 45 to local APIC 4 vector 87 > msi: routing MSI IRQ 46 to local APIC 4 vector 88 > msi: routing MSI IRQ 47 to local APIC 4 vector 89 > msi: routing MSI IRQ 48 to local APIC 4 vector 90 > msi: routing MSI IRQ 49 to local APIC 4 vector 91 > msi: routing MSI IRQ 50 to local APIC 4 vector 92 > msi: routing MSI IRQ 51 to local APIC 4 vector 93 > msi: routing MSI IRQ 52 to local APIC 4 vector 94 > msi: routing MSI IRQ 53 to local APIC 4 vector 95 > msi: routing MSI IRQ 54 to local APIC 4 vector 96 > msi: routing MSI IRQ 55 to local APIC 4 vector 97 > msi: routing MSI IRQ 56 to local APIC 4 vector 98 > msi: routing MSI IRQ 57 to local APIC 4 vector 99 > msi: routing MSI IRQ 58 to local APIC 4 vector 100 > msi: routing MSI IRQ 59 to local APIC 4 vector 101 > msi: routing MSI IRQ 60 to local APIC 4 vector 102 > msi: routing MSI IRQ 61 to local APIC 4 vector 103 > msi: routing MSI IRQ 62 to local APIC 4 vector 104 > msi: routing MSI IRQ 63 to local APIC 4 vector 105 > msi: routing MSI IRQ 64 to local APIC 4 vector 106 > msi: routing MSI IRQ 65 to local APIC 4 vector 107 > msi: routing MSI IRQ 66 to local APIC 4 vector 108 > msi: routing MSI IRQ 67 to local APIC 4 vector 109 > msi: routing MSI IRQ 68 to local APIC 4 vector 110 > msi: routing MSI IRQ 69 to local APIC 4 vector 111 > timecard0: using IRQs 38-69 for MSI > TimeCard msi 32 > rman iomem start f4000000, end f5ffffff, size 2000000 > timecard_corelist_read did not find list, using fb list 1 > nports 1 > Start with uart 0 0 > irq 41, irid 4 > > > Fatal trap 30: reserved (unknown) fault while in kernel mode > cpuid = 2; apic id = 04 > port irq rid 4 > instruction pointer = 0x20:0xffffffff804dda2d > stack pointer = 0x28:0xfffffe001b1d8de0 > TimeCard device loaded. > pci2: driver added > pci3: driver added > pci4: driver added > frame pointer = 0x28:0xfffffe001b1d8e10 > code segment = base 0x0, limit 0xfffff, type 0x1b > pci5: driver added > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, IOPL = 0 > current process = 11 (idle: cpu2) > trap number = 30 > panic: reserved (unknown) fault > cpuid = 2 > time = 1695375919 > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame > 0xfffffe001b1d8c00 > vpanic() at vpanic+0x151/frame 0xfffffe001b1d8c50 > panic() at panic+0x43/frame 0xfffffe001b1d8cb0 > trap_fatal() at trap_fatal+0x387/frame 0xfffffe001b1d8d10 > calltrap() at calltrap+0x8/frame 0xfffffe001b1d8d10 > --- trap 0x1e, rip = 0xffffffff804dda2d, rsp = 0xfffffe001b1d8de0, rbp = > 0xfffffe001b1d8e10 --- > acpi_cpu_idle() at acpi_cpu_idle+0x28d/frame 0xfffffe001b1d8e10 > cpu_idle_acpi() at cpu_idle_acpi+0x4d/frame 0xfffffe001b1d8e30 > cpu_idle() at cpu_idle+0xac/frame 0xfffffe001b1d8e50 > sched_idletd() at sched_idletd+0x4b1/frame 0xfffffe001b1d8ef0 > fork_exit() at fork_exit+0x80/frame 0xfffffe001b1d8f30 > fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe001b1d8f30 > --- trap 0xe0c2545f, rip = 0x8f4a0c76246ffed9, rsp = 0x670c1cf091ea5e5b, > rbp = 0x57639a151e2b2985 --- > KDB: enter: panic > [ thread pid 11 tid 100005 ] > Stopped at kdb_enter+0x37: movq $0,0x129840e(%rip) > db> x/wx timecard_intr_count > timecard_intr_count: 0 > db> show apic > Interrupts bound to lapic 0 > vec 0x30 -> IRQ 9 > vec 0x31 -> IRQ 28 > vec 0x32 -> IRQ 31 > vec 0x33 -> IRQ 24 > vec 0x34 -> IRQ 29 > vec 0x35 -> IRQ 30 > vec 0x36 -> IRQ 8 > vec 0x37 -> IRQ 16 > vec 0x38 -> IRQ 36 > vec 0x39 -> IRQ 1 > vec 0xef -> lapic timer > Interrupts bound to lapic 2 > vec 0x30 -> IRQ 25 > vec 0x31 -> IRQ 33 > vec 0x32 -> IRQ 23 > vec 0x33 -> IRQ 18 > vec 0xef -> lapic timer > Interrupts bound to lapic 4 > vec 0x30 -> IRQ 26 > vec 0x31 -> IRQ 0 > vec 0x32 -> IRQ 34 > vec 0x33 -> IRQ 37 > vec 0x50 -> IRQ 38 > vec 0x51 -> IRQ 39 > vec 0x52 -> IRQ 40 > vec 0x53 -> IRQ 41 > vec 0x54 -> IRQ 42 > vec 0x55 -> IRQ 43 > vec 0x56 -> IRQ 44 > vec 0x57 -> IRQ 45 > vec 0x58 -> IRQ 46 > vec 0x59 -> IRQ 47 > vec 0x5a -> IRQ 48 > vec 0x5b -> IRQ 49 > vec 0x5c -> IRQ 50 > vec 0x5d -> IRQ 51 > vec 0x5e -> IRQ 52 > vec 0x5f -> IRQ 53 > vec 0x60 -> IRQ 54 > vec 0x61 -> IRQ 55 > vec 0x62 -> IRQ 56 > vec 0x63 -> IRQ 57 > vec 0x64 -> IRQ 58 > vec 0x65 -> IRQ 59 > vec 0x66 -> IRQ 60 > vec 0x67 -> IRQ 61 > vec 0x68 -> IRQ 62 > vec 0x69 -> IRQ 63 > vec 0x6a -> IRQ 64 > vec 0x6b -> IRQ 65 > vec 0x6c -> IRQ 66 > vec 0x6d -> IRQ 67 > vec 0x6e -> IRQ 68 > vec 0x6f -> IRQ 69 > vec 0xef -> lapic timer > Interrupts bound to lapic 6 > vec 0x30 -> IRQ 27 > vec 0x31 -> IRQ 32 > vec 0x32 -> IRQ 35 > vec 0x33 -> IRQ 4 > vec 0xef -> lapic timer > db> show lapic > lapic ID = 4 > version = 1.5 > max LVT = 6 > SVR = ff (enabled) > TPR = 00 > In-service Interrupts: > isr2: 43 > TMR Interrupts: > IRR Interrupts: > irr1: 32 33 > irr2: 43 > irr7: ef > db> > </snip> > > Regards > > John > >