8.1-RC2 - PCI fatal error or MCE triggered by USB/ehci on Sun X4100M2?
John Baldwin
jhb at freebsd.org
Fri Jul 9 20:03:45 UTC 2010
On Friday, July 09, 2010 11:26:00 am Markus Gebert wrote:
> --
> MCA: Bank 4, Status 0xb400004000030c2b
> MCA: Global Cap 0x0000000000000105, Status 0x0000000000000007
> MCA: Vendor "AuthenticAMD", ID 0x40f13, APIC ID 2
> MCA: CPU 2 UNCOR BUSLG Observer WR I/O
> MCA: Address 0xfd00000000
Using my local port of mcelog, this is what I get for this check:
CPU 2 4 northbridge
ADDR fd00000000
Northbridge Master abort
link number = 4
bit61 = error uncorrected
bus error 'local node observed, request didn't time out
generic write mem transaction
i/o access, level generic'
STATUS b400004000030c2b MCGSTATUS 7
MCGCAP 105 APICID 2 SOCKETID 0
CPUID Vendor AMD Family 15 Model 65
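
For anyone who wants to cross-check the decoding by hand, here is a minimal Python sketch that pulls the same fields out of the raw values. The function names are hypothetical and are not part of mcelog or the FreeBSD kernel. The bit positions assume the usual x86 MCi_STATUS layout and the bus-error code format (0000 1PPT RRRR IILL); the position of the link field (bits 39:36) is an assumption chosen to match the "link number = 4" line above.

```python
# Hypothetical helpers: a sketch for checking the output above by hand,
# not code taken from mcelog or from the kernel.

PP = ["originated", "responded", "observed", "generic"]
RRRR = ["generic error", "generic read", "generic write", "data read",
        "data write", "instruction fetch", "prefetch", "evict", "snoop"]
II = ["memory", "reserved", "I/O", "generic"]
LL = ["L0", "L1", "L2", "generic"]


def bit(value, n):
    """Return True if bit n of value is set."""
    return bool((value >> n) & 1)


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into named fields."""
    code = status & 0xffff
    fields = {
        "valid": bit(status, 63),
        "overflow": bit(status, 62),
        "uncorrected": bit(status, 61),
        "enabled": bit(status, 60),
        "misc_valid": bit(status, 59),
        "addr_valid": bit(status, 58),
        "context_corrupt": bit(status, 57),
        # Assumed field positions for the K8 northbridge bank (bank 4).
        "link": (status >> 36) & 0xf,
        "ext_error": (status >> 16) & 0xf,
        "error_code": code,
    }
    # Bus errors have the form 0000 1PPT RRRR IILL.
    if (code & 0xf800) == 0x0800:
        rrrr = (code >> 4) & 0xf
        fields["participation"] = PP[(code >> 9) & 0x3]
        fields["timeout"] = bit(code, 8)
        fields["request"] = RRRR[rrrr] if rrrr < len(RRRR) else "reserved"
        fields["target"] = II[(code >> 2) & 0x3]
        fields["level"] = LL[code & 0x3]
    return fields


def decode_cpuid_signature(sig):
    """Return (family, model) from a CPUID signature such as 0x40f13."""
    family = (sig >> 8) & 0xf
    model = (sig >> 4) & 0xf
    if family == 0xf:
        # Extended family and model only apply when the base family is 0xf.
        family += (sig >> 20) & 0xff
        model |= ((sig >> 16) & 0xf) << 4
    return family, model


if __name__ == "__main__":
    for name, value in decode_mci_status(0xb400004000030c2b).items():
        print(name, value)
    print(decode_cpuid_signature(0x40f13))
```

With the status from this report, the sketch gives an uncorrected bus error with a valid address, extended error code 3 (the value mcelog labels "Master abort"), link 4, and an observed generic write to I/O without a timeout. The ID 0x40f13 comes out as family 15, model 65.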
I don't know what to tell you offhand. Did you buy this hardware from Sun
directly? If so, I would try bugging them about this, especially given the
error that the BIOS is logging. It does sound like a hardware issue, but in
the chipset, not in the RAM, so you might need to swap out the main board
rather than the RAM.
I'm curious whether the machine still dies if you disable USB legacy support
in the BIOS, even with ehci not loaded. If so, then the SMI# for the ehci
controller must somehow prevent the issue, perhaps by triggering frequently
enough to slow down the rate of I/O requests?
--
John Baldwin