SATA READ_DMA timeouts - SOLVED?
Mel
fbsd.questions at rachie.is-a-geek.net
Tue Sep 30 17:29:51 UTC 2008
On Tuesday 30 September 2008 18:54:12 Reid Linnemann wrote:
> Jeremy Chadwick wrote:
> > (I'm not subscribed to freebsd-questions, so please CC me on replies.
> > I'm also not sure how I ended up getting this mail in the first place;
> > it looks like someone BCC'd my koitsu at freebsd.org address).
>
> Yes, I BCC'd you since you are maintaining a page on the wiki
> documenting SATA DMA problems.
>
> > Furthermore, one of the most common reports on the FreeBSD lists is the
> > exact opposite -- users complaining that "their disks are SATA300 but
> > only operate at SATA150" (caused by that jumper). Users are told to
> > remove the jumper, and are reminded that the reason the jumper is
> > enabled by default is said chipset incompatibilities.
> >
> > That said, your mail confuses me for one reason:
> >
> > Were you receiving DMA errors with the jumper REMOVED (e.g. SATA300
> > operation), or with the jumper ENABLED (SATA150 operation)? Your below
> > description does not state what exactly you did with the jumper to make
> > your drives work reliably, only "that the jumper capability on your
> > disks was available".
>
> I should have been more clear.
>
> My disks came with no cap on the SATA150 jumper, although FreeBSD
> reported that they were in SATA150 mode. The system would be unusable
> from READ_DMA timeouts if the system was ever powered off and brought
> back up. I had to do some voodoo of booting in single user mode with
> ACPI turned off to repair filesystems and rebuild my gmirror, then load
> ACPI and drop back into multi-user mode. I even had to do this if the
> system was powered off gracefully. So far, since I capped the jumpers
> this has not been the case. I still get them periodically if I do
> something like rebuild a gmirror component, so I can no longer say my
> problem is completely resolved.
Is this on 7.x? Sounds very similar to my experience described in:
http://www.freebsd.org/cgi/query-pr.cgi?pr=122572&cat=kern
The machine is now operational and running in UDMA33 mode with two gmirror'ed
SATA disks, on 6.3-p4. Unfortunately, I can't risk "trying 7.x" anymore, since
it's the emergency storage for the main fileserver, so data loss is
unacceptable :/. I don't know the jumper state at the moment. If a maintenance
window opens up soon, I will check the jumpers and report back.
ATA info:
# atacontrol list
ATA channel 0:
    Master: acd0 <HL-DT-STDVD-ROM GDR-T10N/1.02> ATA/ATAPI revision 5
    Slave:       no device present
ATA channel 1:
    Master:      no device present
    Slave:       no device present
ATA channel 2:
    Master:  ad4 <WDC WD6400AAKS-65A7B0/01.03B01> Serial ATA II
    Slave:       no device present
ATA channel 3:
    Master:  ad6 <WDC WD6400AAKS-65A7B0/01.03B01> Serial ATA II
    Slave:       no device present
# atacontrol cap ad4
Protocol              Serial ATA II
device model          WDC WD6400AAKS-65A7B0
serial number         WD-WMASY1885186
firmware revision     01.03B01
cylinders             16383
heads                 16
sectors/track         63
lba supported         268435455 sectors
lba48 supported       1250263728 sectors
dma supported
overlap not supported
Feature                        Support  Enable  Value     Vendor
write cache                    yes      yes
read ahead                     yes      yes
Native Command Queuing (NCQ)   yes      -       31/0x1F
Tagged Command Queuing (TCQ)   no       no      31/0x1F
SMART                          yes      yes
microcode download             yes      yes
security                       no       no
power management               yes      yes
advanced power management      no       no      0/0x00
automatic acoustic management  yes      yes     128/0x80  128/0x80
# atacontrol mode ad4
current mode = UDMA33
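For anyone who wants to watch for this kind of fallback across reboots, here
is a rough Python sketch that pulls the mode out of `atacontrol mode` output
like the above and flags a disk that is running in a UDMA mode. The function
names are hypothetical, and the check assumes that a normally negotiated SATA
link is reported as a string starting with "SATA" (SATA150 or SATA300, as
mentioned in the quoted discussion). It only parses text; it does not run
atacontrol itself.

```python
import re


def parse_mode(output: str) -> str:
    """Return the mode from `atacontrol mode` output, e.g. 'UDMA33'."""
    match = re.search(r"current mode\s*=\s*(\S+)", output)
    if match is None:
        raise ValueError("no 'current mode' line found")
    return match.group(1)


def is_udma_fallback(mode: str) -> bool:
    """True if the mode is a UDMA mode rather than a SATA link speed.

    Assumption: SATA link speeds are reported with a 'SATA' prefix.
    """
    return mode.upper().startswith("UDMA")


# Sample text copied from the output shown above.
sample = "current mode = UDMA33"
mode = parse_mode(sample)
if is_udma_fallback(mode):
    print("disk is running in %s, not a SATA mode" % mode)
```

To use it on a live system, the output of `atacontrol mode ad4` could be
captured (for example with subprocess.run) and passed to parse_mode().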
--
Mel
Problem with today's modular software: they start with the modules
and never get to the software part.