SMP system not running SMP
Arno J. Klaassen
arno at heho.snv.jussieu.fr
Thu Jun 29 20:43:18 UTC 2006
"UEMURA (fka. MAENAKA) Tetsuya" <maenaka at pluto.dti.ne.jp> writes:
> Posted on Tue, 27 Jun 2006 15:06:51 +0100
> By default, FreeBSD couldn't start: it dumped the ahd state when probing
> the da and simply stopped. So I set the SCSI BIOS to restrict the device
> speed to at most 80MB/s and the problem went away. After that, the machine
> has run flawlessly for 8 months.
I have a Tyan S2882 which I cannot keep up for more than a couple of
days under moderate load, and the symptoms seem related:
config :
- tracking -stable
- 8G RAM
- latest BIOS
- 3ware 9500S-12 with 1.1T data
- RAID-1 MAXTOR ATLAS10K5_73WLS as system-disk on ahd0
- running nothing but some test scripts that generate fairly
moderate NFS traffic (i.e. scripts served via NFS, (rarely needed) data
either on NFS or on the RAID; the scripts themselves are CPU-intensive)
symptom :
- system cold-boots fine (SMP dual opteron 248)
- runs OK for a couple of minutes/hours/days
- then total freeze; *never* a panic in 9 months
- warm reset either does not detect da0 or indeed dumps ahd state
when probing it
- even a cold reboot sometimes has to be repeated once or twice before
da0 is detected correctly
tried :
- changed scsi-cables and termination three times : no deal
- decreased device speed to 80Mhz : seems to eliminate the "minutes"
part from "runs OK for a couple of minutes/hours/days" ...
observations :
- this week I downloaded the latest manual from tyan and came across
the following jumper setting (dunno if it was in the original
version or whether I overlooked it; the printed manual is at the
customer's site) :
"Set PCI-X Bridge A (PCI 3 & PCI 4 & SCSI7902 & BCM5704) to operate at
a maximum 66MHz;
Note: Due to the PCI-X specifications it will be necessary to set
this bus to 66MHz if a 133/100MHz PCI-X card is
added to this bus."
Since I do have a 100MHz PCI-X card (the 3ware), I set this jumper. The
system has been up for three days now. I cannot yet confirm this was the
culprit, but other AMD811X based systems might have the same issue.
- this board has dual ahd and dual bge :
vmstat -i (I just rebooted for an upgrade -stable + linux_base) :
irq24: bge0 ahd0      16826      2
irq25: bge1 ahd1    1305665    157
The network is attached to bge1, the disk is on ahd0. Interestingly, when I
provoke insane swapping, it is the "irq25:" process which consumes
50-90%! of cpu-time, but when I stop the program provoking the swapping
and redo vmstat -i, it does report slightly increased irq24
activity but no noticeable change in irq25 activity ...
( I put hint.ahd.1.disabled="1" in /boot/loader.conf since
I do not need ahd1 but that does not seem to do anything )
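
The before/after comparison of vmstat -i described above can be made
less error-prone with a small helper. What follows is only a minimal
sketch: the function name irq_delta and the file names in the usage
comment are hypothetical, and it assumes that the last two columns of
each vmstat -i line are the total and the rate, as in the output
quoted above.

```shell
#!/bin/sh
# irq_delta BEFORE AFTER  (hypothetical helper, not part of FreeBSD)
# Prints, for every interrupt source present in both saved
# `vmstat -i` outputs, how much its "total" counter grew.
irq_delta() {
    awk '
        # Skip blank/short lines, the header line and the summary line.
        NF < 3 || $1 == "interrupt" || $1 == "Total" { next }
        {
            # The source name may contain spaces ("irq24: bge0 ahd0");
            # assumption: the last two fields are total and rate.
            name = $1
            for (i = 2; i <= NF - 2; i++) name = name " " $i
            total = $(NF - 1)
        }
        # First file: remember the totals.
        FNR == NR { before[name] = total; next }
        # Second file: print the growth for sources seen before.
        (name in before) { printf "%s %d\n", name, total - before[name] }
    ' "$1" "$2"
}

# Usage (file names are placeholders):
#   vmstat -i > /tmp/irq.before
#   ... run the workload that provokes swapping ...
#   vmstat -i > /tmp/irq.after
#   irq_delta /tmp/irq.before /tmp/irq.after
```

Comparing counter deltas rather than the rate column avoids the
averaging since boot that the rate column implies, so a short burst on
a shared line such as irq25 shows up directly.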
FYI.
I can test on this box for a couple more weeks, feel free to
contact me for more information.
Thanx, regards, Arno
--
Arno J. Klaassen
SCITO S.A.
8 rue des Haies
F-75020 Paris, France
http://scito.com