7.2-STABLE i386 box crashing -- clues?
David Wolfskill
david at catwhisker.org
Thu Nov 12 12:59:05 UTC 2009
On Thu, Nov 12, 2009 at 05:27:09PM +1100, Peter Jeremy wrote:
> I can't offer any solutions but I have some more questions...
I appreciate the help!
> ...
> >Every once in a while, it just crashes -- hard. It loses video output
> >at that point; Ctl+Alt+Esc doesn't appear to change anything; entering
> >(say) "reset" blindly at that point has no apparent effect.
>
> Roughly how often?
For the current month:
albert(7.2-S)[8] last reboot shutdown
reboot ~ Thu Nov 12 03:04
reboot ~ Wed Nov 11 20:06
reboot ~ Wed Nov 11 14:42
shutdown ~ Wed Nov 11 14:40
reboot ~ Wed Nov 11 14:35
reboot ~ Wed Nov 11 10:05
reboot ~ Wed Nov 11 09:09
reboot ~ Wed Nov 11 04:25
reboot ~ Tue Nov 10 12:49
reboot ~ Mon Nov 9 14:52
reboot ~ Sun Nov 8 17:42
reboot ~ Sat Nov 7 04:22
reboot ~ Fri Nov 6 21:43
reboot ~ Fri Nov 6 19:00
reboot ~ Fri Nov 6 16:20
shutdown ~ Fri Nov 6 16:17
reboot ~ Fri Nov 6 16:03
reboot ~ Fri Nov 6 13:07
reboot ~ Fri Nov 6 09:46
reboot ~ Thu Nov 5 16:41
reboot ~ Thu Nov 5 13:32
reboot ~ Thu Nov 5 12:59
reboot ~ Thu Nov 5 10:17
reboot ~ Thu Nov 5 04:26
reboot ~ Wed Nov 4 20:32
reboot ~ Wed Nov 4 15:48
reboot ~ Wed Nov 4 10:37
reboot ~ Tue Nov 3 13:15
reboot ~ Tue Nov 3 10:55
reboot ~ Tue Nov 3 04:16
reboot ~ Mon Nov 2 18:13
reboot ~ Sun Nov 1 20:03
shutdown ~ Sun Nov 1 20:01
reboot ~ Sun Nov 1 17:10
reboot ~ Sun Nov 1 13:51
shutdown ~ Sun Nov 1 13:48
wtmp begins Sun Nov 1 05:08:18 PST 2009
albert(7.2-S)[9]
The "solo reboots" are crashes; those paired with "shutdown" entries are
controlled.
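For anyone wanting to tally these automatically, here is a minimal sketch in Python that applies the same rule to saved "last reboot shutdown" output. The function name classify_reboots and the SAMPLE excerpt are illustrative only, and the sketch assumes the newest-first, "~"-separated layout shown above.

```python
def classify_reboots(last_output):
    """Split `last reboot shutdown` output into (crashes, controlled) boot times."""
    # Keep only reboot/shutdown records; prompts and the "wtmp begins" trailer are skipped.
    records = [line.split() for line in last_output.splitlines()]
    records = [r for r in records if r and r[0] in ("reboot", "shutdown")]

    crashes, controlled = [], []
    for i, rec in enumerate(records):
        if rec[0] != "reboot":
            continue
        when = " ".join(rec[2:])  # the fields after the "~" placeholder
        # Output is newest first, so the event just before this boot is the NEXT record.
        # A boot with no older record at all is counted as a crash (an assumption).
        prev = records[i + 1][0] if i + 1 < len(records) else None
        (controlled if prev == "shutdown" else crashes).append(when)
    return crashes, controlled


# Hypothetical excerpt, copied from the first few lines of the listing above.
SAMPLE = """\
reboot ~ Thu Nov 12 03:04
reboot ~ Wed Nov 11 20:06
reboot ~ Wed Nov 11 14:42
shutdown ~ Wed Nov 11 14:40
reboot ~ Wed Nov 11 14:35
wtmp begins Sun Nov 1 05:08:18 PST 2009
"""

if __name__ == "__main__":
    crashes, controlled = classify_reboots(SAMPLE)
    print(f"{len(crashes)} crashes, {len(controlled)} controlled reboots")
```

By the same rule, the full listing above has 32 reboot entries: 4 paired with a shutdown and 28 solo.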
> Has anything unusual happened lately? Brownout, blackout, power surge,
> lightning, heatwave, ...
Nothing linked to the crashes. I pulled the UPS out of service
some weeks ago because it needs new batteries; I need to get those
ordered. But the crashes were happening before that, in any case.
> >accordingly, had attached a SCSI host adaptor via PCI riser card. Since
> >I had nothing actually connected to the card, I pulled it out of the
> >machine before bringing it back up.
>
> Did you also pull the riser card? Riser cards don't have a spectacularly
> high reputation.
That's actually what I pulled. The SCSI card itself is still physically
in the chassis, merely with an air gap between itself and the system
board (because the riser card is now in a closet).
> > (I also felt around for
> >excessively warm spots; nothing. All fans spin up, as well.)
>
> I don't suppose you also studied the capacitors on the motherboard.
> Are any showing any signs of bulges?
I'll take another look for those; I recall that electrolytics exhibit
that as a sign of failure -- thanks for the reminder.
> Have you tried reseating everything?
The memory, yeah (even before replacing it); also swapped the DIMMs.
Only other thing that can be re-seated (desktop system board, so most
everything is built-in) would be the CPU, and I'm not quite sure how
that heat sink works. I did re-seat some power connectors.
> >Flaky CPU? Flaky power supply? How might I tell?
>
> CPU shouldn't go flaky unless it's been overheated. In my experience,
> PSUs are the least reliable part of consumer-grade hardware but about
> the only way to check is to swap it.
:-}
> If you've got a DMM, you could check all the rails but there are
> lots of failure modes that won't show up that way.
Yeah, I kinda figured that. I do have a DMM (used to have a VTVM), but
figured the meter wouldn't show transient dips or whatever too well.
> Have you checked the voltage/temperature screen in the BIOS? Does
> anything look abnormal?
Did a couple of reality checks that way as detours during some of the
reboots. Nothing interesting there at all. (And I have seen a case in
the past -- though with a 1U box -- where that test definitely showed
something wrong: CPU temp climbing about 1C every 30 seconds, IIRC.)
> Are you using a PS/2 or USB keyboard?
PS/2 via KVM. I don't have any USB keyboards. :-}
> Are you running X?
Yes; the machine is configured to start xdm on transition to
multi-user, as my spouse used to use it as a desktop. (She's gone back
to using its predecessor, a 4.11-STABLE machine, in frustration.)
> At this stage, my suggestion would be to try swapping the PSU.
Thanks. I'll discuss it with the "family CFO."
Peace,
david
--
David H. Wolfskill david at catwhisker.org
Depriving a girl or boy of an opportunity for education is evil.
See http://www.catwhisker.org/~david/publickey.gpg for my public key.
More information about the freebsd-hardware
mailing list