[Bug 263906] MFI driver fails with "Fatal firmware error" line 1155 in ../../dm/src/dm.c

From: <bugzilla-noreply_at_freebsd.org>
Date: Tue, 10 May 2022 23:39:31 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263906

            Bug ID: 263906
           Summary: MFI driver fails with "Fatal firmware error" line 1155
                    in ../../dm/src/dm.c
           Product: Base System
           Version: 13.1-STABLE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: greg@teamworkweb.com

Testing out FreeBSD 13.1-RC6 on server hardware which has been 100% stable for
the past 3+ months on 13.0. One HBA is an LSI 9361 SAS in JBOD mode. This HBA
was pulled from another system where it saw 99.99% uptime for over 3 years, so
I am fairly sure the HBA itself is not the issue. Honestly not certain it is on
the latest firmware; I have not updated it since it has been stable.

Was unable to install 13.1-RC6: the mfi driver would load and get through most
of creating the JBOD objects, then die with:

mfi0: 39952: (705350772s/0x0020/DEAD) - Fatal firmware error: Line 1155 in ../../dm/src/dm.c

Tried switching it back into RAID mode, same error.

I was able to work around this by setting the following at the loader prompt:

set hw.mfi.mrsas_enable="1"
set hint.hw.mfi.mrsas_enable="1"

Honestly not sure which format was needed, so I did both. I had intended to use
the mrsas driver anyway, since I like having drives populated under CAM control.
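
To make the workaround persist across reboots, a minimal sketch follows. It
assumes the tunable goes in /boot/loader.conf, which is where mrsas(4)
describes setting hw.mfi.mrsas_enable; loader.conf entries drop the "set"
keyword used at the loader prompt. The hint.-prefixed variant is probably not
needed, but that is an assumption, since the two forms were not tried
separately.

```sh
# /boot/loader.conf (fragment)
# Ask mfi(4) to step aside so mrsas(4) attaches to controllers that both
# drivers support.
# Assumption: this single line is sufficient on its own.
hw.mfi.mrsas_enable="1"
```

After a reboot, "kenv hw.mfi.mrsas_enable" should print 1 if the loader picked
the setting up.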

However, I wanted to report this, since I did not have any issues with the MFI
driver under 13.0. This seems to be a new issue with the driver build in
13.1-RC6.

Will try to honor requests for more details, since I have some time to play
with this hardware for a bit yet. However, I am currently having another fairly
serious issue with the 13.1-RC6 install, and I am still trying to figure out
how to even report it.

After getting everything set up basically identical to the test environment I
had running for 3+ months under 13.0, I started running aggressive back-to-back
fio and iozone benchmarks on a zpool. These are the same processes I have been
running for months without issues under 13.0. The next morning I went to check
the results and was getting "No more processes" in my shell. Somehow the system
had been flooded with 37,000+ processes that were all instances of "sh", most
of them sleeping! I was able to grab some lines out of /var/log/messages, but
then rebooted and it has been downhill ever since. The system won't fully boot
to a login prompt. I can get into single-user mode, but then /var/log is empty!
Not sure if that is normal? I haven't done enough debugging or diagnostics in
FreeBSD. Anyway, long story short, I am now dealing with other serious issues
which might prevent further testing of the MFI driver in the immediate future.
But I will do what I can to help test this new build.

Apologies in advance for any missed steps or details; I am moving over to
FreeBSD after a decade on OmniOS, so I am still picking things up. Thanks!

-Greg-
