Re: git: 11778fca4a83 - main - Fix mpr(4) panic during a firmware update.

From: Alan Somers <asomers_at_freebsd.org>
Date: Thu, 20 Oct 2022 16:48:49 UTC
On Thu, Oct 20, 2022 at 10:46 AM Ken Merry <ken@freebsd.org> wrote:
>
>
> > On Oct 20, 2022, at 12:37, Alan Somers <asomers@freebsd.org> wrote:
> >
> > On Thu, Oct 20, 2022 at 10:29 AM Ken Merry <ken@freebsd.org> wrote:
> >>
> >>
> >> On Oct 20, 2022, at 12:23, Alan Somers <asomers@freebsd.org> wrote:
> >>
> >> On Mon, Oct 17, 2022 at 10:53 AM Kenneth D. Merry <ken@freebsd.org> wrote:
> >>
> >>
> >> The branch main has been updated by ken:
> >>
> >> URL: https://cgit.FreeBSD.org/src/commit/?id=11778fca4a83f5e3b597c75785aa5c0ee0dc518e
> >>
> >> commit 11778fca4a83f5e3b597c75785aa5c0ee0dc518e
> >> Author:     Kenneth D. Merry <ken@FreeBSD.org>
> >> AuthorDate: 2022-10-17 16:48:34 +0000
> >> Commit:     Kenneth D. Merry <ken@FreeBSD.org>
> >> CommitDate: 2022-10-17 16:48:34 +0000
> >>
> >>   Fix mpr(4) panic during a firmware update.
> >>
> >>   Issue Description:
> >>   The RequestCredits field of IOCFacts changed between the Phase23
> >>   firmware and the Phase24 firmware. As part of a firmware update, the
> >>   driver therefore has to free the resources and pools that were created
> >>   from the Phase23 firmware's IOCFacts data (i.e. at driver load time) and
> >>   reallocate them using Phase24's IOCFacts data. The driver freed the
> >>   interrupts but failed to reallocate them, so the config page read
> >>   operation timed out and the controller went into recursive reinit
> >>   (controller reset) operations, leading to a kernel panic.
> >>
> >>   Fix:
> >>   Reallocate the interrupts if the interrupts are disabled as part of
> >>   firmware update/downgrade operation.
> >>
> >>   Submitted by:   Sreekanth Reddy <sreekanth.reddy@broadcom.com>
> >>   Tested by:      ken
> >>   MFC after:      3 days
> >>
> >>
> >> Would this commit fix the panic in bug 252575?
> >>
> >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252575
> >>
> >>
> >> Unfortunately, no.
> >>
> >> This is the panic I ran into on a firmware upgrade using storcli64 on a Broadcom 9600-16i card going from Phase 23 to Phase 24:
> >>
> >> mpr0: Reinitializing controller
> >> mpr0: Firmware: 24.00.00.00, Driver: 23.00.00.00-fbsd
> >> mpr0: IOCCapabilities: 2fa84c<ScsiTaskFull,DiagTrace,EEDP,TransRetry,EventReplay,MSIXIndex,HostDisc,FastPath,RDPQArray,AtomicReqDesc>
> >> mpr0: Calling Reinit from mpr_wait_command, timeout=30, elapsed=30
> >> mpr0: Reinitializing controller
> >> mpr_config_get_ioc_pg8: request for header completed with error 0
> >> mpr_config_get_ioc_pg8: request for header completed with error 16
> >> mpr_config_get_ioc_pg8: request for header completed with error 16
> >> mpr_config_get_ioc_pg8: request for header completed with error 16
> >> mpr_config_get_ioc_pg8: request for header completed with error 16
> >> mpr_config_get_ioc_pg8: request for header completed with error 16
> >> mpr_config_get_iounit_pg8: request for header completed with error 16
> >> mpr_config_get_iounit_pg8: request for header completed with error 16
> >> mpr_config_get_iounit_pg8: request for header completed with error 16
> >> mpr_config_get_iounit_pg8: request for header completed with error 16
> >> mpr_config_get_iounit_pg8: request for header completed with error 16
> >> mpr_config_get_iounit_pg8: request for header completed with error 16
> >> mpr_config_get_man_pg11: request for header completed with error 16
> >> mpr_config_get_man_pg11: request for header completed with error 16
> >> mpr_config_get_man_pg11: request for header completed with error 16
> >> mpr_config_get_man_pg11: request for header completed with error 16
> >> mpr_config_get_man_pg11: request for header completed with error 16
> >> mpr_config_get_man_pg11: request for header completed with error 16
> >> mpr_config_get_dpm_pg0: request for header completed with error 16
> >> mpr_config_get_dpm_pg0: request for header completed with error 16
> >> mpr_config_get_dpm_pg0: request for header completed with error 16
> >> mpr_config_get_dpm_pg0: request for header completed with error 16
> >> mpr0: Unfreezing SIM queue
> >> mpr0: fault_state(0x40001500)!
> >> mpr0: Timeout while writing doorbell
> >> panic: mpr_iocfacts_allocate failed to get IOC Facts with error 6
> >>
> >> cpuid = 1
> >> time = 1663771142
> >> KDB: stack backtrace:
> >> db_trace_self_wrapper() at 0xffffffff8040a9ab = db_trace_self_wrapper+0x2b/frame 0xfffffe02e759d6c0
> >> vpanic() at 0xffffffff805f7ad1 = vpanic+0x151/frame 0xfffffe02e759d710
> >> panic() at 0xffffffff805f7973 = panic+0x43/frame 0xfffffe02e759d770
> >> mpr_iocfacts_allocate() at 0xffffffff8154bdad = mpr_iocfacts_allocate+0x15dd/frame 0xfffffe02e759d8e0
> >> mpr_reinit() at 0xffffffff81549e0c = mpr_reinit+0x14c/frame 0xfffffe02e759d930
> >> mpr_wait_command() at 0xffffffff8154fb69 = mpr_wait_command+0x1d9/frame 0xfffffe02e759d9b0
> >> mpr_ioctl() at 0xffffffff8155a113 = mpr_ioctl+0x1de3/frame 0xfffffe02e759db40
> >> devfs_ioctl() at 0xffffffff804bab8f = devfs_ioctl+0xaf/frame 0xfffffe02e759db90
> >> vn_ioctl() at 0xffffffff806e9664 = vn_ioctl+0x1a4/frame 0xfffffe02e759dca0
> >> devfs_ioctl_f() at 0xffffffff804bb21e = devfs_ioctl_f+0x1e/frame 0xfffffe02e759dcc0
> >> kern_ioctl() at 0xffffffff80667b2d = kern_ioctl+0x26d/frame 0xfffffe02e759dd30
> >> sys_ioctl() at 0xffffffff80667811 = sys_ioctl+0x101/frame 0xfffffe02e759de00
> >> amd64_syscall() at 0xffffffff8092ae4c = amd64_syscall+0x10c/frame 0xfffffe02e759df30
> >> fast_syscall_common() at 0xffffffff8090369b = fast_syscall_common+0xf8/frame 0xfffffe02e759df30
> >> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x8dfe3a, rsp = 0x821837048, rbp = 0x4 ---
> >> Uptime: 18h2m58s
> >> Dumping 5338 out of 130903 MB:..1%..11%..21%..31%..41%..51%..61%..71%..81%..91%
> >>
> >> Ken
> >> —
> >> Ken Merry
> >> ken@FreeBSD.ORG
> >
> > Ahh, too bad.  Is Broadcom even aware of the other one?
>
> I don’t know.
>
> I’d suggest sending email to Sreekanth (email address in the commit above) and asking.
>
> He’s handling mpr(4) bugs from what they’ve said.
>
> Ken
> —
> Ken Merry
> ken@FreeBSD.ORG

Ok, I'll do it.
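
For anyone following along, here is a rough toy model of the failure mode the
commit message describes. It is a sketch only, not the mpr(4) code: the class,
the function names, and the credit values are all hypothetical. It only
illustrates why freeing the interrupts on an IOCFacts change without
reallocating them leaves the controller unable to complete a config page read.

```python
from dataclasses import dataclass


@dataclass
class ToyController:
    """Hypothetical stand-in for driver state; not the real mpr(4) softc."""
    request_credits: int
    pools_allocated: bool = True
    interrupts_allocated: bool = True


def reinit(ctrl: ToyController, new_request_credits: int,
           reallocate_interrupts: bool = True) -> None:
    """Model a reinit after the firmware reports new IOCFacts data."""
    if new_request_credits == ctrl.request_credits:
        return  # IOCFacts unchanged, nothing to resize

    # Free everything that was sized from the old IOCFacts data.
    ctrl.pools_allocated = False
    ctrl.interrupts_allocated = False

    # Reallocate the pools from the new value.
    ctrl.request_credits = new_request_credits
    ctrl.pools_allocated = True

    # The step the commit message says was missing before the fix.
    if reallocate_interrupts:
        ctrl.interrupts_allocated = True


def can_read_config_page(ctrl: ToyController) -> bool:
    """In this model a config page read completes only with interrupts set up."""
    return ctrl.pools_allocated and ctrl.interrupts_allocated


# Hypothetical credit values, chosen only so that they differ.
before_fix = ToyController(request_credits=100)
reinit(before_fix, 200, reallocate_interrupts=False)

after_fix = ToyController(request_credits=100)
reinit(after_fix, 200)
```

In this model, can_read_config_page(before_fix) is False and
can_read_config_page(after_fix) is True, which corresponds to the config page
read timeouts described in the commit message and to the fix of reallocating
the interrupts.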