[Bug 281177] 13.2 works, 13.3 and 14.x installers panic on older qlogic isp card

From: <bugzilla-noreply_at_freebsd.org>
Date: Mon, 11 Nov 2024 11:45:12 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=281177

--- Comment #27 from Joerg Pulz <Joerg.Pulz@frm2.tum.de> ---
Some background on isp(4) default firmware handling:

Every card has firmware in flash and we have firmware for every supported
isp(4) card generation in ispfw(4).

The driver reads the FLT (flash layout table) to get the flash address of the
firmware stored on the card.
The firmware header is loaded from flash at this address to get the firmware
version.
ispfw(4) is loaded and the firmware header is parsed to get the ispfw(4)
firmware version.
After comparison of the available versions the newer one is loaded into the RAM
of the card and the card is instructed to execute the loaded firmware.

There are some hints (hint.isp.N.fwload_disable and hint.isp.N.fwload_force) to
change the above behavior.
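
A minimal sketch of the selection described above, written as standalone C.
It is not taken from the isp(4) source: the struct, the enum and the function
names are hypothetical, the three-part version layout is an assumption, and
the way the two hints are applied here (fwload_disable keeps the flash
firmware, fwload_force prefers the ispfw(4) image regardless of version) is
only inferred from their names. Keeping the flash firmware when both versions
are equal is likewise an assumption.

```c
#include <stdbool.h>

/* Hypothetical types for illustration; not the real isp(4) structures. */
struct fw_version {
	unsigned int major;
	unsigned int minor;
	unsigned int sub;
};

enum fw_source {
	FW_FLASH,	/* run the firmware stored in the card's flash */
	FW_ISPFW	/* load the image provided by ispfw(4) into RAM */
};

/* Returns <0, 0 or >0 if a is older than, equal to or newer than b. */
static int
fw_version_cmp(const struct fw_version *a, const struct fw_version *b)
{
	if (a->major != b->major)
		return (a->major < b->major ? -1 : 1);
	if (a->minor != b->minor)
		return (a->minor < b->minor ? -1 : 1);
	if (a->sub != b->sub)
		return (a->sub < b->sub ? -1 : 1);
	return (0);
}

/*
 * Pick the firmware to run. The handling of the two hint flags is an
 * assumption based on their names, not on the driver code.
 */
static enum fw_source
fw_select(const struct fw_version *flash, const struct fw_version *ispfw,
    bool ispfw_available, bool fwload_disable, bool fwload_force)
{
	if (!ispfw_available || fwload_disable)
		return (FW_FLASH);
	if (fwload_force)
		return (FW_ISPFW);
	/* Default: the newer version wins; on a tie keep the flash copy. */
	return (fw_version_cmp(ispfw, flash) > 0 ? FW_ISPFW : FW_FLASH);
}
```

With a flash version older than the ispfw(4) version and no hints set, this
sketch selects the ispfw(4) image; with broken FLT data the flash version fed
into such a comparison would already be garbage, which is the failure mode
discussed further down.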

On a running system, three read-only sysctl(8) values provide firmware version
information:

dev.isp.N.fw_version_flash
  The version of the flash firmware in the active region of the controller.

dev.isp.N.fw_version_ispfw
  The version of the firmware provided by ispfw(4).

dev.isp.N.fw_version_run
  The version of the firmware currently executing on the controller.

As the behavior hasn't changed between 14.1 and 14.2, I wonder what happens
here.

It may be that there are some old cards with broken flash and reading the FLT
fails or gives bad data.
If that's the case, then reflashing the card may help.
But if that's the case, it should happen on 14.1 systems too.

About disabling isp(4) in GENERIC:
Actually we are talking about two/three people with the 24xx (4Gbit/s) and 25xx
(8Gbit/s) cards that seem to be problematic. The latest firmware for the 25xx
cards is dated 2019.
We seem to have no issues with the 26xx (16Gbit/s), 27xx (32Gbit/s) and 28xx
(32 and 64Gbit/s) cards.
Is that enough to justify disabling isp(4) in GENERIC at all? There is no
requirement to enable ispfw(4) in loader.conf(5): isp(4) loads and uses it
automatically if it is available, unless a hint instructs it to do otherwise.
So disabling isp(4) in GENERIC will most probably hit everyone who has no
explicit isp_load="YES" in loader.conf(5).

We could think about changing the code to skip FLT reading entirely for those
old card generations and revert the load and exec behavior to the state before
all my changes.
That's probably not going to happen in time for 14.2.

Binary firmware loading instead of .ko is possible. I already have some code
for this. But it would be a complete move away from ispfw(4).
Anyway, I doubt that this would solve the problem we see here.
Again, probably not going to happen for 14.2.


To make further progress on this, I need detailed data on where it breaks on
14.2.
Unfortunately my test system is currently running without the 25xx card. I will
have physical access to this system tomorrow to plug in a 25xx card and run
some tests myself.

In the meantime:
@Vladimir

Can you please boot your panicking 14.2 memstick using

  hint.isp.0.debug="0x3f"

That should reveal all details about reading and parsing the FLT and firmware
header and about firmware loading and execution.
I need all the console output until it panics.
