FreeBSD Machines dieing, we've tried so much....
Chad Leigh -- Shire.Net LLC
chad at shire.net
Wed Jun 22 15:20:42 GMT 2005
On Jun 22, 2005, at 3:07 AM, Ted Mittelstaedt wrote:
>
>
>
>> -----Original Message-----
>> From: Matt Juszczak [mailto:matt at atopia.net]
>> Sent: Monday, June 20, 2005 10:49 AM
>> To: Ted Mittelstaedt
>> Cc: freebsd-questions at freebsd.org
>> Subject: RE: FreeBSD Machines dieing, we've tried so much....
>>
>>
>>
>>
>> On Mon, 20 Jun 2005, Ted Mittelstaedt wrote:
>>
>>
>>
>>
>>> Please post dmesg output from both systems.
>>>
>>
>> The systems end up crashing, so I can't run dmesg... or do you mean a
>> general dmesg from when they are stable?
>>
>>
>
> Yes. Matt, please slow down and quit panicking for just a second here
> - you haven't even told us what processor these are on, let alone what
> the hardware manufacturer is. It's like you're calling to schedule a
> doctor's appointment and you aren't even telling them if the patient
> is a man, woman, child, or for that matter, family dog!
>
> The vast majority of panics are hardware-related. It is rare nowadays
> for a usermode program to make the system panic. In particular, you
> said the problem happens more under load. That points even more
> strongly to a hardware problem: bad CPU cache RAM, bad RAM, SCSI
> termination, that sort of thing.
>
> Ted

Just as an example of what Ted is saying: about 3 or 4 years ago I
installed some new "server" main boards for AMD CPUs. They used a split
chipset, with the northbridge from one vendor and the southbridge from
another: one was an AMD chip and one was a VIA chip. (The AMD chip
supported ECC etc., unlike the other brands of chip with that same
function.) Under load (using Adaptec RAID controllers) the machine
would freeze up. Finally, after much testing and ridiculous amounts of
cooling (on the assumption that it was a heat problem), I replaced the
main boards with new ones that used AMD chips for both the northbridge
and the southbridge. The problem went away. The original boards work
fine, including under load, with Windows, for example, and a test Linux
install also did not have problems (though the Linux install was not
tested very thoroughly).

My point is that you can have some sort of HW problem that shows up
under load, and it may not be an obvious one.

Test your RAM first, using something like memtest86, and think about
what other HW is in your machine(s) and whether you can swap it out
for test purposes, etc.
---
Chad Leigh -- Shire.Net LLC
Your Web App and Email hosting provider
chad at shire.net