[Bug 272469] Broadcom mpi3mr driver: MSIX allocation fail on DELL PowerEdge R7625 system
Date: Wed, 12 Jul 2023 11:48:01 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=272469

            Bug ID: 272469
           Summary: Broadcom mpi3mr driver: MSIX allocation fail on DELL
                    PowerEdge R7625 system
           Product: Base System
           Version: 13.2-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: chandrakanth.patil@broadcom.com

Created attachment 243352
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=243352&action=edit
msix_table_dump

mpi3mr avenger driver:

System details:
1. Dell PowerEdge R7625 with 196 physical cores and 256 logical cores

During the initial load phase, the mpi3mr driver allocates a single MSI-X
vector for the initial handshake using the pci_alloc_msix() API. After
allocating that vector, the driver sends the IOC_FACTS command to the
firmware to fetch the controller properties.

The problem is that the driver never receives the interrupt for the
IOC_FACTS completion. The command times out, which in turn causes the
driver load to fail. However, if the driver polls the reply queue, it can
see that the firmware has completed the command.

After the driver creates the single MSI-X vector, vmstat -i should show
the interrupt, but it does not, so the interrupt binding appears to be
failing. Ideally pci_alloc_msix() should return an error during allocation
in this case, but it does not.

Note:
1. This issue happens only on this specific server, where the number of
   CPUs is greater than 128 (256 CPUs in total).
2. When we reduce the number of cores to 24 in the BIOS, the driver works
   without any issues.

We dumped the MSI-X table before the allocation of the single MSI-X vector,
after the allocation, and after the command timed out. Please find the dump
in the attachment.

I would like to understand whether there is any OS limitation on MSI-X
allocation on systems with a large number of cores.

Please find attached the driver logs and the MSI-X table dump.
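The symptom reported here (a reply is present in the reply queue, but no
interrupt is delivered) can be told apart from a genuine firmware timeout
by checking the reply queue once the interrupt wait expires. The following
is a minimal user-space sketch of that check in C. It is not code from the
mpi3mr driver: every name in it is hypothetical, and the tick callback only
stands in for hardware and firmware behavior so that the logic can be
exercised outside the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes; not taken from the mpi3mr sources. */
enum wait_result {
	WAIT_DONE_BY_INTR,	/* the interrupt handler signalled completion */
	WAIT_DONE_BY_POLL,	/* reply was queued, but no interrupt arrived */
	WAIT_TIMED_OUT		/* neither an interrupt nor a reply was seen */
};

/* Hypothetical stand-in for the state a driver would inspect. */
struct cmd_state {
	volatile bool intr_seen;	/* set by the interrupt handler */
	volatile bool reply_posted;	/* firmware posted a reply descriptor */
};

/* Simulation hook: called once per tick to model the hardware. */
typedef void (*tick_fn)(struct cmd_state *st, unsigned int tick);

/*
 * Wait up to max_ticks for the interrupt. If it never arrives, look at
 * the reply queue once, so that a lost interrupt is reported separately
 * from a command the firmware never completed.
 */
enum wait_result
wait_for_cmd(struct cmd_state *st, unsigned int max_ticks, tick_fn tick)
{
	for (unsigned int t = 0; t < max_ticks; t++) {
		if (tick != NULL)
			tick(st, t);
		if (st->intr_seen)
			return (WAIT_DONE_BY_INTR);
	}
	if (st->reply_posted)
		return (WAIT_DONE_BY_POLL);
	return (WAIT_TIMED_OUT);
}

/* Model: firmware completes at tick 3 and the interrupt is delivered. */
void
fw_completes_with_intr(struct cmd_state *st, unsigned int tick)
{
	if (tick == 3) {
		st->reply_posted = true;
		st->intr_seen = true;
	}
}

/* Model: firmware completes at tick 3, but the interrupt is lost. */
void
fw_completes_no_intr(struct cmd_state *st, unsigned int tick)
{
	if (tick == 3)
		st->reply_posted = true;
}
```

Under these assumptions, the behavior described in this report corresponds
to the WAIT_DONE_BY_POLL outcome: the command itself succeeded, which
points at interrupt delivery rather than at the firmware.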