Re: Dell perc card corrupted by FreeBSD driver?

From: Dan Mahoney (Ports) <freebsd_at_gushi.org>
Date: Thu, 21 Mar 2024 17:08:18 UTC
On Thu, 21 Mar 2024, Dan Mahoney (Ports) wrote:

> All,
>
> We have a Dell Perc h330 mini that we’ve been using with the older
> “mfi” driver, not the newer mrsas.  Recently, it started giving us an
> error on boot claiming that it was "Disabling writes to flash as the
> flash part has gone bad"
>
> In the Dell Forums, a dell staffer very suspiciously saw this error and
> said “are you running FreeBSD?”
> (https://www.dell.com/community/en/conversations/rack-servers/disabling-writes-to-flash-as-the-flash-part-has-gone-bad/647f9010f4ccf8a8de13d06e)
>
> Is there something about the driver we use that’s known to cause
> breakage?  Is there a known firmware that fixes this?  Is there a better
> place to ask this than the general “questions” list?
>
> This system is out of warranty, but as we have a lot of Dell machines
> out in the field doing Important DNS Things, I may open a generic case
> with a system that *is* under warranty to ask more about this.  In the
> meantime, if anyone who maintains the driver knows something, I’d love
> to hear from you.

Replying to myself, I've heard back from Dell and am posting their
response here.  (The TLDR?  If you're not using mrsas, you've got a
ticking time bomb).

They reported a bug in 2020 and nobody's been willing to look at it. :(

Perhaps the correct answer here would be to make mrsas the default.  This
almost feels like something that is worthy of an errata notification.
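
For anyone wanting to check whether a given box already prefers mrsas, here is a minimal sketch. It assumes the `hw.mfi.mrsas_enable` loader tunable described in mrsas(4) is the mechanism in use, and that it is set in a loader.conf-style file; the function name `mrsas_preferred` is made up for illustration. Note that, as I understand it, moving from mfi to mrsas changes the disk device names from mfid* to da*, so /etc/fstab would need attention before rebooting.

```python
def mrsas_preferred(loader_conf_text: str) -> bool:
    """Return True if the given loader.conf text sets hw.mfi.mrsas_enable to 1.

    Assumption: the tunable is set in a plain key="value" file such as
    /boot/loader.conf; settings made elsewhere are not seen by this check.
    """
    enabled = False
    for raw in loader_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "hw.mfi.mrsas_enable":
            # A later assignment overrides an earlier one.
            enabled = value.strip().strip('"') == "1"
    return enabled
```

Feeding it the contents of /boot/loader.conf (for example `mrsas_preferred(open("/boot/loader.conf").read())`) gives a quick yes/no per machine; it does not tell you which driver is actually attached right now.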

===

Greetings from Dell EMC Server Support team! This is with reference to the
recent issue reported : "disabling writes to flash as the flash part has
gone bad".

The root cause is inbox MFI driver. The Raid map sync method in MFI driver
is causing a sync loop with PERC firmware resulting upto 128 writes per
second into PERC internal flash storage space, thereby wearing it out.
PERC would need to be replaced when such errors are encountered.

Dell and Broadcom have reported this issue to FreeBSD in Bugzilla. FreeBSD
have provided a patch that removes Raid map sync functionality from mfi
driver.


Since you have already engaged FreeBSD support, please check on below
link. https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=248352

FreeBSD is an unsupported Operating System. Dell has no engineering or
sustaining relationship with FreeBSD.

MRSAS driver does not have this issue.

===
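
To put Dell's "up to 128 writes per second" figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the worst-case rate is sustained around the clock, which Dell's note does not actually say, and it makes no claim about the endurance of the flash part, which isn't given anywhere in their response.

```python
WRITES_PER_SECOND = 128          # upper bound quoted in Dell's response
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400


def flash_writes(seconds: int, rate: int = WRITES_PER_SECOND) -> int:
    """Total flash writes over `seconds`, assuming a constant write rate."""
    return seconds * rate


per_day = flash_writes(SECONDS_PER_DAY)
per_year = flash_writes(365 * SECONDS_PER_DAY)
print(f"{per_day:,} writes/day, {per_year:,} writes/year")
```

At the quoted ceiling that is about 11 million writes a day, or roughly 4 billion a year, which makes it easy to see how a controller that has been up for a few years could wear its flash out.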

I'm going to go poke that bug.

-Dan

-- 

"No mowore webooting!!!"

-Paul, 10-16-99, 10 PM

--------Dan Mahoney--------
Techie,  Sysadmin,  WebGeek
Gushi on efnet/undernet IRC
FB:  fb.com/DanielMahoneyIV
LI:   linkedin.com/in/gushi
Site:  http://www.gushi.org
---------------------------