variable hang when starting APs on Westmere processors
Mike Karels
mike at karels.net
Mon May 2 21:29:07 UTC 2011
Looks like freebsd-smp is gone... not sure of the right target for this.
I just picked up a problem from another developer at work who had the good
fortune to have scheduled a vacation this week. The short description is
that the start_ap() routine sometimes hangs for anywhere from 10 minutes to
3 hours while starting up CPUs. This is with a much-modified system based on
FreeBSD 7.2. A stock 8.2 CD hangs at the same spot almost all the time,
although the code in the two versions appears identical.
More details: This is amd64, using an Intel S5520HCR 2-socket motherboard
with two Xeon X5660 2.8GHz Westmere hex-core CPUs. The problem happens
somewhat less often with two Xeon E5620 quad-core 2.4GHz CPUs. The hang
seems to happen with higher-numbered CPUs, so the hex-core with SMT has more
chances to hit the problem.
We added KTRs to the code, and found that the hang happens in the
lapic_ipi_wait() call after de-asserting RESET.
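For anyone not familiar with that wait, here is a rough user-space model of
what such a wait could look like if it were bounded rather than open-ended.
This is only a sketch, not the FreeBSD source: it assumes the wait is a poll
on the delivery-status bit (bit 12) of the local APIC ICR low word, and the
names fake_icr_lo and ipi_wait_bounded() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 12 of the ICR low word: 0 = idle, 1 = send pending. */
#define ICR_DELSTAT_PENDING 0x00001000u

/*
 * Hypothetical stand-in for the memory-mapped ICR low register, so the
 * polling logic can be exercised outside the kernel.
 */
static volatile uint32_t fake_icr_lo;

/*
 * Poll the delivery-status bit at most max_polls times.
 * Returns 1 if the register went idle, 0 if the poll budget ran out.
 * *polls_used is set to the number of polls that saw "send pending".
 */
static int
ipi_wait_bounded(volatile uint32_t *icr_lo, unsigned long max_polls,
    unsigned long *polls_used)
{
	unsigned long n;

	assert(icr_lo != NULL && polls_used != NULL);
	for (n = 0; n < max_polls; n++) {
		if ((*icr_lo & ICR_DELSTAT_PENDING) == 0) {
			*polls_used = n;
			return (1);
		}
	}
	*polls_used = n;
	return (0);
}
```

With fake_icr_lo set to 0 the call returns 1 immediately; with the pending
bit stuck on, it returns 0 after max_polls iterations instead of spinning
forever. Recording the poll count in a KTR entry would show how long the
bit actually stays set on the CPUs that hang.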
Of course, Linux doesn't exhibit the problem.
Has anyone else seen a problem like this? Any ideas how to fix it, or
debug further?
Please copy me on responses; I'm not subscribed to this list currently.
Mike