kern/186051: [vmware] [panic] FreeBSD 8.4+, 9.x+, 10.0 guest panic with VMWare Server on boot
Steven Spence
freebsd at stratum16.com
Wed Apr 30 18:00:01 UTC 2014
The following reply was made to PR kern/186051; it has been noted by GNATS.
From: Steven Spence <freebsd at stratum16.com>
To: John Baldwin <jhb at freebsd.org>
Cc: bug-followup at freebsd.org
Subject: Re: kern/186051: [vmware] [panic] FreeBSD 8.4+, 9.x+, 10.0 guest
panic with VMWare Server on boot
Date: Wed, 30 Apr 2014 11:58:35 -0600
On 04/30/2014 11:17 AM, John Baldwin wrote:
> On Wednesday, April 30, 2014 12:47:31 pm Steven Spence wrote:
>> On 04/30/2014 10:09 AM, John Baldwin wrote:
>>> On Tuesday, April 29, 2014 10:13:20 pm Steven Spence wrote:
>>>> On 04/29/2014 01:43 PM, John Baldwin wrote:
>>>>> On Monday, April 28, 2014 11:04:40 pm Steven Spence wrote:
>>>>>> On 04/28/2014 08:32 AM, John Baldwin wrote:
>>>>>>> On Monday, April 21, 2014 01:45:10 PM Steven Spence wrote:
>>>>>>>
>>>>>>>> Output of "sysctl machdep.idle"
>>>>>>>> machdep.idle: amdc1e
>>>>>>>> This is from a 8.3-RELEASE-p15 box.
>>>>>>> Hummm. We really shouldn't be doing anything differently. However, we do a
>>>>>>> bit more (including a wrmsr) during idle halt on your machine. Can you build
>>>>>>> a stable/8 kernel with debug symbols in an 8.3 guest and capture the panic
>>>>>>> messages from booting that kernel?
>>>>>>>
>>>>>> Here is a capture of the panic from a stable/8 kernel. Is the only
>>>>>> debugging option you are looking for in the kernel config
>>>>>> "makeoptions DEBUG=-g"? I still have the 8.3 kernel on there I can
>>>>>> boot if I need to get in and recompile the stable/8 kernel differently.
>>>>>> I am not sure how much use the information below will be to you.
>>>>>>
>>>>>> kernel trap 1 with interrupts disabled
>>>>>> Fatal trap 1: privileged instruction fault while in kernel mode
>>>>>> cpuid = 0; apic id = 00
>>>>>> instruction pointer = 0x20:0xffffffff809c342e
>>>>>> stack pointer = 0x28:0xffffff8000211b40
>>>>>> acd0: CDROM <VMware Virtual IDE CDROM Drive/00000001> at ata1-master UDMA33
>>>>>> frame pointer = 0x28:0xffffff8000211b60
>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>> processor eflags = resume, IOPL = 0
>>>>>> current process = 11 (idle: cpu0)
>>>>>> trap number = 1
>>>>>> panic: privileged instruction fault
>>>>>> cpuid = 0
>>>>>> KDB: stack backtrace:
>>>>>> #0 0xffffffff8067c0b6 at kdb_backtrace+0x66
>>>>>> #1 0xffffffff8064861e at panic+0x1ce
>>>>>> #2 0xffffffff809d3750 at trap_fatal+0x290
>>>>>> #3 0xffffffff809d3ce5 at trap+0x105
>>>>>> #4 0xffffffff809ba944 at calltrap+0x8
>>>>>> #5 0xffffffff8066e08f at sched_idletd+0x11f
>>>>>> #6 0xffffffff8061ceaf at fork_exit+0x11f
>>>>>> #7 0xffffffff809bae8e at fork_trampoline+0xe
>>>>>> Uptime: 1s
>>>>>> Cannot dump. Device not defined or unavailable.
>>>>>> Automatic reboot in 15 seconds - press a key on the console to abort
>>>>>>
>>>>>> I have also tried to dump the panic to a swap device but I don't think
>>>>>> it is getting far enough in the kernel boot to initialize any hard drive
>>>>>> storage devices.
>>>>>>
>>>>>> If there is anything else I can try to get more information out of this
>>>>>> let me know.
>>>>> If you have the result of this kernel build, can you find the kernel.debug
>>>>> file it generated and run 'gdb kernel.debug' and then 'l *0xffffffff809c342e'?
>>>>> That will (hopefully) identify the exact line it panic'd on. It might also
>>>>> be useful to do 'x/i 0xffffffff809c342e' in gdb as well.
>>>>>
>>>> Below are the results of the two gdb commands:
>>>>
>>>> (gdb) l *0xffffffff809c342e
>>>> 0xffffffff809c342e is in cpu_idle_mwait (cpufunc.h:470).
>>>> 465 }
>>>> 466
>>>> 467 static __inline void
>>>> 468 cpu_monitor(const void *addr, int extensions, int hints)
>>>> 469 {
>>>> 470 __asm __volatile("monitor;"
>>>> 471 : :"a" (addr), "c" (extensions), "d"(hints));
>>>> 472 }
>>>> 473
>>>> 474 static __inline void
>>>>
>>>> (gdb) x/i 0xffffffff809c342e
>>>> 0xffffffff809c342e <cpu_idle_mwait+62>: monitor %eax,%ecx,%edx
>>> That's interesting. It's dying on monitor, not hlt.
>>>
>>> Can you capture the CPU lines from dmesg from a working kernel? I want to see
>>> if VMWare is advertising the ability to use monitor via cpuid.
>>>
>>> Also, try setting 'machdep.idle_mwait=0' at the loader prompt before booting to
>>> see if that fixes the panic.
>>>
>> Here is the requested information:
>>
>> CPU: Quad-Core AMD Opteron(tm) Processor 2384 (2726.06-MHz K8-class CPU)
>> Origin = "AuthenticAMD" Id = 0x100f42 Family = 10 Model = 4 Stepping = 2
>> Features=0x783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2>
>> Features2=0x802009<SSE3,MON,CX16,POPCNT>
> Looks like it is telling the guest here it is ok to use monitor ("MON"
> feature).
>
>> AMD Features=0xee500800<SYSCALL,NX,MMX+,FFXSR,Page1GB,RDTSCP,LM,3DNow!+,3DNow!>
>> AMD Features2=0x37e9<LAHF,ExtAPIC,ABM,SSE4A,MAS,Prefetch,OSVW,IBS,SKINIT,WDT>
>> TSC: P-state invariant
>>
>> Setting 'machdep.idle_mwait=0' did fix the panic. It successfully
>> booted into 8.4-STABLE with this option set. I am not sure what (if
>> any) ramifications this option causes but if there are little to none I
>> am fine with sticking this in my /boot/loader.conf and running with it.
>> If you feel there is a deeper/generic problem that still needs to be
>> worked out I can try to provide whatever information you need.
> It should be fine as a workaround. The remaining issues I can see are:
>
> 1) Should we disable monitor automatically for VMWare?
I am not sure on this one. Did FreeBSD start using this feature, or change
how it uses it, in kernels newer than 8.3? Everything worked fine up to
that kernel version, even with VMWare falsely advertising that it
supports the monitor flag. I looked at the flags the host (CentOS 5)
reports for the physical CPU and I don't see the 'monitor' flag in
there either, so I am not sure where VMWare is getting the idea it is
supported.
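
For reference, here is a minimal sketch of how the "MON" flag falls out of
the Features2 value in the dmesg lines quoted above. Features2 is the ECX
register returned by CPUID leaf 1, and MONITOR/MWAIT support is bit 3 of
that register. The macro and function names below are made up for this
illustration and are not taken from the kernel source; only the four bits
that appear in this particular dmesg are decoded.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bit positions in CPUID leaf 1, register ECX (printed as Features2). */
/* Names are illustrative only; they are assumptions, not kernel code. */
#define F2_SSE3    (1u << 0)
#define F2_MON     (1u << 3)   /* MONITOR/MWAIT advertised */
#define F2_CX16    (1u << 13)
#define F2_POPCNT  (1u << 23)

/* Returns 1 if the Features2 word advertises MONITOR/MWAIT, else 0. */
static int
has_monitor(uint32_t features2)
{
	return ((features2 & F2_MON) != 0);
}

/*
 * Writes a comma-separated list of the known flags set in features2
 * into buf. len must be at least 1; output is truncated to fit.
 */
static void
describe_features2(uint32_t features2, char *buf, size_t len)
{
	static const struct {
		uint32_t bit;
		const char *name;
	} names[] = {
		{ F2_SSE3, "SSE3" },
		{ F2_MON, "MON" },
		{ F2_CX16, "CX16" },
		{ F2_POPCNT, "POPCNT" },
	};
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if ((features2 & names[i].bit) == 0)
			continue;
		if (buf[0] != '\0')
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, names[i].name, len - strlen(buf) - 1);
	}
}
```

Feeding 0x802009 to has_monitor() returns 1, which matches the "MON" entry
the guest prints, so the guest kernel has no way to tell from cpuid alone
that the instruction will trap.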
>
> 2) This should be reported to the VMWare folks as it is ultimately their
> bug. If they don't support usage of 'monitor' by guest OS's, then they
> should hide it from the cpuid information.
>
> Would you be able to handle 2)? I would like to see what they say before
> adventuring too much further down the path of 1).
I don't mind contacting VMWare about it but I am almost positive they
are going to tell me that is not a product they support any more and
that I should upgrade to ESX, vSphere, or whatever their latest
incarnation is. Newer FreeBSDs appear to work with newer VMWare
products as I didn't run across anyone else having this problem when I
first went searching for a solution. I don't think disabling a feature
that appears to work for others just because of some old corner case is
a good idea. Doubly so since there is an option to bypass the problem
for people with older VMWare installs like mine. Let me know if you
still think contacting VMWare is worth pursuing.
This is probably just the kick in the butt I need to convert the VMs to
VirtualBox or something more recent and supported.
Thanks,
Steven