AOC-USAS2-L8i zfs panics and SCSI errors in messages
Kenneth D. Merry
ken at freebsd.org
Wed Oct 26 14:56:40 UTC 2011
On Wed, Oct 26, 2011 at 10:05:33 -0400, Douglas Gilbert wrote:
> On 11-10-26 06:16 AM, Jeremy Chadwick wrote:
> >On Wed, Oct 26, 2011 at 11:36:44AM +0200, Karli Sjöberg wrote:
> >>Hi all,
> >>
> >>I tracked down what causes the panics!
> >>
> >>I got a tip from aragon and phoenix at the forum about
> >>/etc/periodic/security/100.chksetuid
> >>
> >>And to put:
> >>daily_status_security_chksetuid_enable="NO"
> >>into /etc/periodic.conf
> >
> >This is not truly the cause of the panic; it simply exacerbates it.
> >
> >Many of the periodic scripts will do things like iterate over all files
> >on the filesystem looking for specific attributes, etc. This tends to
> >stress filesystems heavily. This script isn't the only one that does so. :-)
> >
> >>I can now run periodic daily without any panics. I'm still wondering
> >>about the cause of this, the explanation from the forum was that that
> >>phase is too demanding for multi TB systems. But I have several multi
> >>TB servers with FreeBSD and ZFS, and none of them has ever behaved
> >>this way. Besides, the panic is instantaneous, not degenerative. I
> >>imagine that a run like that would start out OK and then just get
> >>worse and worse, getting gradually slower and slower until it just
> >>wouldn't cope any more and hang. This feels more like hitting a wall.
> >>As if it found something that it couldn't deal with and has no choice
> >>but to panic immediately.
> >
> >It may be possible that you have some underlying filesystem corruption
> >that triggers this situation. Have you actually tried doing a "zpool
> >scrub" of your pools and seeing if any errors happen or if the panic
> >occurs there?
> >
> >I'm inclined to think what you're experiencing is probably a bug or
> >"quirk" in the storage controller driver you're using. There are other
> >drivers that have had fixes applied to them "to make them work decently
> >with ZFS", meaning the kind of stressful I/O ZFS puts on them results in
> >the controller driver behaving oddly or freaking out, case in point. It
> >could also be a controller firmware bug/quirk/design issue. Seriously.
> >
> >I believe the AOC-USAS2-L8i controller has been discussed on
> >freebsd-stable, re: mps(4) driver problems or equivalent, but I'm not
> >going to CC that list given that there would be 3 cross-posted lists
> >involved and that is liable to upset some folks. You should search the
> >mailing lists for discussion of Supermicro controllers that work
> >reliably with FreeBSD.
> >
> >It would be worthwhile to discuss this condition on -stable, mainly with
> >something like "Anyone else using the AOC-USAS2-L8i reliably with ZFS?"
> >You get the idea.
>
> There is a steady stream of patches from LSI staff to
> both the mptsas and mpt2sas drivers on the Linux SCSI
> list (e.g. the most recent patch set to mpt2sas was on
> 20111019 and contained 7 separate "fixes").
>
> I don't see these patches appearing on this list. Is there
> a mechanism to get driver corrections incorporated into
> the relevant FreeBSD drivers?
>
> LSI do keep FreeBSD drivers on their site. For example for
> the 9212-4i4e HBA, see this page:
> http://www.lsi.com/products/storagecomponents/Pages/LSISAS9212-4i4e.aspx
> That FreeBSD zip is dated 20110808 and has mps drivers for
> FreeBSD 7.2.0, 7.4.0, and 8.2.0.

They do have a developer working on their version of the mps driver.
Release of the driver has been hung up by LSI's legal department since
February, unfortunately. I'm not sure what the issue is, but that is why
it isn't in FreeBSD.

The plan, once LSI's legal department approves it, is to hopefully give
their developer commit access so he can just check fixes in to the driver.

For now, though, their binary-only drivers may fix things for some folks.
For example, those drivers should support their Integrated RAID features.
(The driver in the tree doesn't support them.)

The error recovery code in their driver is a bit better (the error recovery
part was written by Isilon), but I'm not sure whether it would fix this
particular problem. This really looks like a ZFS issue.

Ken
--
Kenneth Merry
ken at FreeBSD.ORG
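
For reference, the workaround quoted earlier in this message (setting
daily_status_security_chksetuid_enable="NO" in /etc/periodic.conf) could be
applied with a small sh helper along these lines. This is only a sketch and
not something posted in the thread: the function name set_periodic_knob is
hypothetical, and it assumes the file holds simple key="value" lines. As noted
in the quoted reply, the setting only avoids one trigger of the panic and does
not address the underlying cause.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): set key="value" in a
# periodic.conf-style file, replacing any earlier assignment of the key.
set_periodic_knob() {
    conf="$1"
    key="$2"
    value="$3"

    # Make sure the file exists so grep does not fail on a missing path.
    touch "$conf"

    # Copy every line except existing assignments of this key.
    # grep exits non-zero when it prints nothing, which is fine here.
    grep -v "^${key}=" "$conf" > "${conf}.tmp" || true

    # Append the new assignment in the usual key="value" form.
    printf '%s="%s"\n' "$key" "$value" >> "${conf}.tmp"

    # Write back in place so the file's owner and mode are preserved.
    cat "${conf}.tmp" > "$conf"
    rm -f "${conf}.tmp"
}

# Example (run as root on the affected machine):
#   set_periodic_knob /etc/periodic.conf daily_status_security_chksetuid_enable NO
```

Running the helper twice leaves a single assignment in the file, so it is
safe to re-run.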