Weird messages output
Gavin Atkinson
gavin.atkinson at ury.york.ac.uk
Tue Mar 27 22:03:07 UTC 2007
On Tue, 27 Mar 2007, Eirik Øverby wrote:
> On 27. mar. 2007, at 15.33, Gavin Atkinson wrote:
>> On Tue, 2007-03-27 at 15:00 +0200, Eirik Øverby wrote:
>>> Hi all,
>>>
>>> running 6.1-RELEASE on several HP DL385 servers (identically
>>> configured), one of them has recently spat the following out in the
>>> /var/log/messages file:
>>>
>>> ..........
>>> Mar 10 03:51:24 apphost02 ntpd[445]: kernel time sync enabled 2001
>>> Mar 10 05:02:01 apphost02 kernel: NMI ISA 30, EISA ff
>>> ..........
>>
>> I suspect you'll find your (ECC) memory has problems.
>
> You are absolutely correct. Further investigation using the ProLiant
> management tools for FreeBSD revealed serious RAM trouble. Two banks were
> degraded, so we have now had the modules replaced on-site.
Glad to be of help!
> Thanks for the tip!
> Do you happen to know if there are any "generic" tools/daemons available to
> decipher such NMIs? Perhaps be able to send SNMP traps or something?
I don't, to be honest. There is some code in /usr/src/sys/i386/isa/nmi.c
that tries to detect the cause of an NMI, although I don't remember ever
seeing the messages when a parity error was detected. I guess it's
possible that (to some chipset vendor at least) 0x20 and 0x30 indicate
parity error, but neither our code nor Linux's (see
http://fxr.watson.org/fxr/source/arch/i386/kernel/traps.c?v=linux-2.6#L743 )
knows those codes to mean parity error.
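
As a rough illustration of the kind of decoding discussed above, here is a
minimal userland sketch in Python that picks the "NMI ISA xx, EISA yy" line
out of a log and names the bits that are set. It is not the kernel's code.
The bit meanings are an assumption based on the conventional PC/AT layout of
the status byte at I/O port 0x61; a particular chipset may use them
differently. The names `ISA_BITS`, `NMI_LINE` and `decode_nmi` are made up
for this example.

```python
import re

# Assumed meanings of the ISA status byte (I/O port 0x61) on a conventional
# PC/AT-style chipset. This is an assumption, not vendor documentation.
ISA_BITS = {
    0x80: "RAM parity error / SERR#",
    0x40: "I/O channel check (IOCHK#)",
    0x20: "timer 2 output (not an error source)",
    0x10: "refresh toggle (not an error source)",
}

# Matches the kernel message format seen in the log excerpt above.
NMI_LINE = re.compile(r"NMI ISA ([0-9a-fA-F]+), EISA ([0-9a-fA-F]+)")


def decode_nmi(line):
    """Return a dict describing an NMI log line, or None if it is not one."""
    m = NMI_LINE.search(line)
    if m is None:
        return None
    isa = int(m.group(1), 16)
    eisa = int(m.group(2), 16)
    # Collect the names of all bits set in the ISA status byte.
    flags = [name for mask, name in ISA_BITS.items() if isa & mask]
    return {
        "isa": isa,
        "eisa": eisa,
        "isa_flags": flags,
        # Assumption: an EISA value of 0xff means no EISA status port exists.
        "eisa_present": eisa != 0xFF,
    }


if __name__ == "__main__":
    sample = "Mar 10 05:02:01 apphost02 kernel: NMI ISA 30, EISA ff"
    print(decode_nmi(sample))
```

Under that assumed layout, the value 0x30 from the log has neither the parity
bit (0x80) nor the I/O channel check bit (0x40) set, which would be consistent
with no "parity error" message having been logged. A script along these lines
could be run from cron or a log watcher and hooked up to whatever alerting is
already in place.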
Gavin