mpt errors - UNIT ATTENTION asc:29,0
Wes Morgan
morganw at chemikals.org
Fri Aug 7 19:42:39 UTC 2009
On Fri, 7 Aug 2009, Artem Belevich wrote:
> Hi,
>
> I'm running 8.0-BETA2 on Asus p5BV/SAS with built-in LSI1068
> controller with 8 SATA ports. Six of the ports are hooked up to 1TB WD Green
> drives. The drives are used as a single raidz2 ZFS pool:
>
>   NAME        STATE     READ WRITE CKSUM
>   z2          ONLINE       0     0     0
>     raidz2    ONLINE       0     0     0
>       da1     ONLINE       0     0     0
>       da0     ONLINE       0     0     0
>       da2     ONLINE       0     0     0
>       da3     ONLINE       0     0     0
>       da4     ONLINE       0     0     0
>       da5     ONLINE       0     0     0
>
> I'm running a simple stress test that copies a 10GB file until it fills
> the volume and then runs "zpool scrub" on it.
>
> dd if=/dev/urandom of=/z2/f.0 bs=1m count=10240
> for f in {1..350}; do echo $f; cp /z2/f.$((f-1)) /z2/f.$f; done;
> zpool scrub z2
>
> What concerns me is that I'm periodically getting error messages from
> the mpt driver. They usually start a few hours after the start of the script
> and by the end of it they are happening every few minutes seemingly
> randomly on all six drives.
>
> Aug 7 10:25:32 buz kernel: mpt0: mpt_cam_event: 0x16
> Aug 7 10:25:32 buz kernel: mpt0: mpt_cam_event: 0x16
> Aug 7 10:25:32 buz kernel: (da4:mpt0:0:4:0): READ(10). CDB: 28 0 46 32 97 c0 0 0 80 0
> Aug 7 10:25:32 buz kernel: (da4:mpt0:0:4:0): CAM Status: SCSI Status Error
> Aug 7 10:25:32 buz kernel: (da4:mpt0:0:4:0): SCSI Status: Check Condition
> Aug 7 10:25:32 buz kernel: (da4:mpt0:0:4:0): UNIT ATTENTION asc:29,0
> Aug 7 10:25:32 buz kernel: (da4:mpt0:0:4:0): Power on, reset, or bus device reset occurred
> Aug 7 10:25:32 buz kernel: (da4:mpt0:0:4:0): Retrying Command (per Sense Data)
>
> ZFS scrub does not seem to report any issues so far - no checksum or
> read/write errors. WD's hard drive diagnostics tools didn't find any
> issues with the drives either.
>
> Could somebody shed some light on why such errors would happen? Is that
> some sort of hardware issue? Driver bug? Issue with compatibility
> between controller and the drives? System configuration issue (some
> sysctl/tunable needs tweaking, perhaps)?
I have that same board with 8 500GB drives in a raidz2. I used to
use a SATA backplane and I would see those timeouts fairly regularly
when moving lots of data around. To eliminate the cable mess I switched to
a SAS backplane with fanout cables, and since then I have not seen the
timeouts.
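
For anyone reading the quoted log, the CDB bytes on the READ(10) line can be
decoded by hand to see which block range the retried command was for. Below is
a minimal sketch in POSIX sh, assuming the standard READ(10) layout (opcode
0x28, big-endian LBA in bytes 2-5, big-endian transfer length in bytes 7-8).
decode_read10 is a made-up helper name for illustration, not part of FreeBSD
or camcontrol.

```sh
#!/bin/sh
# decode_read10: hypothetical helper, takes the ten CDB bytes as hex arguments
# exactly as CAM prints them (e.g. "28 0 46 32 97 c0 0 0 80 0").
decode_read10() (
    # A READ(10) CDB is always ten bytes long.
    [ "$#" -eq 10 ] || { echo "expected 10 CDB bytes" >&2; exit 1; }
    # Byte 0 is the opcode; READ(10) is 0x28 (40 decimal).
    [ "$((0x$1))" -eq 40 ] || { echo "not a READ(10) CDB" >&2; exit 1; }
    # Bytes 2-5: logical block address, most significant byte first.
    lba=$(( (0x$3 << 24) | (0x$4 << 16) | (0x$5 << 8) | 0x$6 ))
    # Bytes 7-8: transfer length in blocks, most significant byte first.
    len=$(( (0x$8 << 8) | 0x$9 ))
    echo "lba=$lba blocks=$len"
)

decode_read10 28 0 46 32 97 c0 0 0 80 0
```

For the da4 line quoted above this gives LBA 1177720768 and 128 blocks, which
is a 64 KiB read if the drive uses 512-byte sectors.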