Please help me diagnose this crazy VMWare/FreeBSD 8.x crash
Mark Felder
feld at feld.me
Mon May 21 20:09:57 UTC 2012
On Mon, 21 May 2012 13:47:45 -0500, Michael Powell
<nightrecon at hotmail.com> wrote:
> Very curious how 'irq 22 at device 22.0' and 'dev.mpt.0.%location: slot=22'
> all match with a '22'.
Strangely here in ESXi that doesn't work the same. Emulated BIOS must be
considerably different... :/
$ vmstat -i
interrupt                          total       rate
irq1: atkbd0                           6          0
irq6: fdc0                             9          0
irq15: ata1                           34          0
irq16: em1                            62          0
irq18: em0                        178079         17
cpu0: timer                      4136470        400
irq256: mpt0                      112544         10
Total                            4427204        428
$ sysctl -a | grep mpt
kern.sched.preemption: 1
kern.sched.preempt_thresh: 64
dev.mpt.0.%desc: LSILogic SAS/SATA Adapter
dev.mpt.0.%driver: mpt
dev.mpt.0.%location: slot=0 function=0 handle=\_SB_.PCI0.PE40.S1F0
dev.mpt.0.%pnpinfo: vendor=0x1000 device=0x0054 subvendor=0x15ad subdevice=0x1976 class=0x010700
dev.mpt.0.%parent: pci3
dev.mpt.0.debug: 3
dev.mpt.0.role: 1
dev.mpt.0.wake: 0
irq256 and slot ... 0. Interesting.
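
For anyone wanting to check the same thing on their own guest, here is a rough sketch in Python that reads `vmstat -i` text and flags lines that look shared or MSI-based. Treat it as a sketch under stated assumptions: the function names are made up for illustration, the "shared" heuristic (more than one device name on a line, or a trailing '+') is a guess at how the output marks sharing, and the cutoff of 256 for MSI interrupt numbers is an assumption about FreeBSD's numbering that nothing in this thread confirms.

```python
import re

# Assumption: interrupt numbers at or above this value are MSI/MSI-X
# rather than legacy (possibly shared) interrupt lines.
MSI_FIRST_IRQ = 256

# The vmstat -i output quoted above, whitespace-separated.
SAMPLE = """\
interrupt total rate
irq1: atkbd0 6 0
irq6: fdc0 9 0
irq15: ata1 34 0
irq16: em1 62 0
irq18: em0 178079 17
cpu0: timer 4136470 400
irq256: mpt0 112544 10
Total 4427204 428
"""


def parse_vmstat_i(text):
    """Return one dict per interrupt source, skipping header and Total."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data line is "<source>: <device>... <total> <rate>".
        if len(fields) < 4 or not fields[0].endswith(":"):
            continue
        rows.append({
            "source": fields[0][:-1],
            "devices": fields[1:-2],
            "total": int(fields[-2]),
            "rate": int(fields[-1]),
        })
    return rows


def looks_shared(row):
    """Heuristic (assumption): several names, or a name ending in '+'."""
    devices = row["devices"]
    return len(devices) > 1 or any(d.endswith("+") for d in devices)


def looks_like_msi(row):
    """True when the source is irqN with N at or above the assumed cutoff."""
    match = re.fullmatch(r"irq(\d+)", row["source"])
    return bool(match) and int(match.group(1)) >= MSI_FIRST_IRQ


if __name__ == "__main__":
    for row in parse_vmstat_i(SAMPLE):
        flags = []
        if looks_shared(row):
            flags.append("shared?")
        if looks_like_msi(row):
            flags.append("msi?")
        print(row["source"], " ".join(row["devices"]), " ".join(flags))
```

If the 256 cutoff holds, a device flagged "msi?" is not sitting on a legacy interrupt line at all, which would be worth knowing before blaming IRQ sharing.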
> The obvious thing here is we are comparing a userland Vbox guest to a VMWare
> hypervisor. From what little I know concerning any of this, to me it sounds
> vaguely like an APIC, LAPIC, and I/O APIC bug. There are known bugs wrt
> BIOS setting up IRQ routing incorrectly, and/or providing incorrect ACPI
> and/or IMS tables to operating systems.
FWIW, VirtualBox and ESXi are nearly the same, except that ESXi runs on as
minimal an OS as possible for performance reasons. And what you're
describing is exactly what I've been thinking for a long time; I just
haven't had the proof.
> The parallel in this case would be the logical or synthetic so-called "BIOS"
> that the VMWare hypervisor presents to the FreeBSD guest at guest boot time.
> In this case the truest fix for the problem would fall to VMWare, e.g. if the
> hypervisor is setting up tables in such a way as to create the shared IRQ
> problem in the first place.
> If my idea/theory/potential hypothesis has any merit. I do not understand
> why any of this would be different depending upon which guest is installed,
> but I also know absolutely nothing about VMWare hypervisor internals.
I don't know enough about how it's supposed to work, but hopefully we're
getting close to nailing down the real VMWare bug so we can finally tell
their engineering to fix it.