Opened 10 years ago
Last modified 9 years ago
#14075 new defect
Windows VM crashes Debian host, NMI for unknown reason points to "vboxdrv"
Reported by: | Denis Kozlov | Owned by: | |
---|---|---|---|
Component: | other | Version: | VirtualBox 4.3.26 |
Keywords: | Crash, Dump, NMI, Windows, Linux, Debian, Interrupts, vboxdrv | Cc: | |
Guest type: | Windows | Host type: | Linux |
Description
The first crash occurred about a month ago, when the host was running Debian 6 and the then-latest VirtualBox. Since then, the host has been wiped and Debian 7 installed with the latest VirtualBox, 4.3.26. A crash occurred again recently, so I started investigating.
Crashes seem random. The Windows VM was the only VM on the host at the time of the second crash, and it was just applying Windows updates. Reverting to a previous copy of the VM and reapplying the same Windows updates did not reproduce the crash. Nothing appears in either the Linux or the VirtualBox logs at the time of a crash.
The suspect lines extracted from dmesg:
[50414.144741] warning: `VBoxHeadless' uses 32-bit capabilities (legacy support in use)
[50415.090400] EXT4-fs (md0): Unaligned AIO/DIO on inode 4194333 by AioMgr0-N; performance will be poor.
...
[63477.985050] Uhhuh. NMI received for unknown reason 31 on CPU 4.
[63477.985073] Do you have a strange power saving mode enabled?
[63477.985094] Dazed and confused, but trying to continue
[68728.724996] Uhhuh. NMI received for unknown reason 21 on CPU 7.
[68728.725021] Do you have a strange power saving mode enabled?
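To pull these NMI events out of a longer dmesg capture, something like the following sketch may help. It assumes the message format shown in the excerpt above and that the reason value is printed as two hex digits. It also assumes (not confirmed in this ticket) that the reason byte is the value the kernel read from I/O port 0x61, where bit 0x80 indicates SERR and bit 0x40 indicates IOCHK. The function and constant names are hypothetical.

```python
import re

# Matches the "unknown NMI" message as it appears in the dmesg excerpt above.
NMI_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+Uhhuh\. NMI received for unknown reason "
    r"(?P<reason>[0-9a-fA-F]{2}) on CPU (?P<cpu>\d+)\."
)

# Assumption: the reason byte mirrors I/O port 0x61 (NMI status and control).
SERR_BIT = 0x80   # system error reported
IOCHK_BIT = 0x40  # I/O channel check reported


def find_unknown_nmis(log_text):
    """Return one dict per 'NMI received for unknown reason' message."""
    events = []
    for m in NMI_RE.finditer(log_text):
        reason = int(m.group("reason"), 16)  # printed in hex
        events.append({
            "timestamp": float(m.group("ts")),
            "cpu": int(m.group("cpu")),
            "reason": reason,
            "serr": bool(reason & SERR_BIT),
            "iochk": bool(reason & IOCHK_BIT),
        })
    return events


if __name__ == "__main__":
    sample = (
        "[63477.985050] Uhhuh. NMI received for unknown reason 31 on CPU 4.\n"
        "[68728.724996] Uhhuh. NMI received for unknown reason 21 on CPU 7.\n"
    )
    for event in find_unknown_nmis(sample):
        print(event)
```

Under that assumption, neither 0x31 nor 0x21 has the SERR or IOCHK bit set, which would be consistent with the kernel classifying these NMIs as "unknown".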
To trace the NMI for unknown reason, I enabled crash dumps using "kdump", set "kernel.unknown_nmi_panic=1" and "kernel.panic_on_unrecovered_nmi=1" in "/etc/sysctl.conf", and let the VM run under HeavyLoad from JAM Software (CPU, memory, file writes and disk access). The crash is now reproducible, but it can take anywhere from 1 hour to 1 day of running the VM for a crash to occur.
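Before starting a long reproduction run, it may be worth confirming that both sysctl settings are actually active on the running kernel. The sketch below reads them back from /proc/sys, assuming the usual mapping of a sysctl name such as kernel.unknown_nmi_panic to the file /proc/sys/kernel/unknown_nmi_panic. The helper names are hypothetical, and the proc_sys parameter exists only so the check can be pointed at a different directory.

```python
from pathlib import Path

# The two settings named in the report, with the values it sets.
REQUIRED = {
    "kernel.unknown_nmi_panic": "1",
    "kernel.panic_on_unrecovered_nmi": "1",
}


def read_sysctl(name, proc_sys="/proc/sys"):
    """Return the current value of a sysctl as a string, or None if unreadable."""
    path = Path(proc_sys).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except OSError:
        return None


def check_nmi_panic_settings(proc_sys="/proc/sys"):
    """Map each required sysctl name to True if it has the expected value."""
    return {
        name: read_sysctl(name, proc_sys) == wanted
        for name, wanted in REQUIRED.items()
    }


if __name__ == "__main__":
    for name, ok in check_nmi_panic_settings().items():
        print(f"{name}: {'ok' if ok else 'NOT set as expected'}")
```

If either setting is reported as not set, the host would log the NMI and try to continue rather than panic, so kdump would not capture a dump.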
Call trace from crash dump points to "vboxdrv":
[76034.059602] Uhhuh. NMI received for unknown reason 31 on CPU 3.
[76034.059686] Do you have a strange power saving mode enabled?
[76034.059766] Kernel panic - not syncing: NMI: Not continuing
[76034.059846] Pid: 19036, comm: EMT-4 Tainted: G O 3.2.0-4-amd64 #1 Debian 3.2.65-1+deb7u2
[76034.059937] Call Trace:
[76034.060006] <NMI> [<ffffffff8134a53c>] ? panic+0x95/0x1a2
[76034.060157] [<ffffffff81352056>] ? do_nmi+0x151/0x258
[76034.060235] [<ffffffff813517a0>] ? nmi+0x20/0x30
[76034.060312] <<EOE>> [<ffffffffa03f97c6>] ? rtR0MemAllocEx+0x17e/0x1de [vboxdrv]
[76034.060470] [<ffffffffa03f05a3>] ? supdrvIOCtlFast+0x75/0x79 [vboxdrv]
[76034.060555] [<ffffffffa03ed2a9>] ? VBoxDrvLinuxIOCtl_4_3_26+0x43/0x1eb [vboxdrv]
[76034.060645] [<ffffffff811087dd>] ? do_vfs_ioctl+0x459/0x49a
[76034.060728] [<ffffffff81039aa2>] ? finish_task_switch+0x4e/0xb9
[76034.060809] [<ffffffff8134fb09>] ? __schedule+0x5f9/0x610
[76034.060892] [<ffffffff81108869>] ? sys_ioctl+0x4b/0x72
[76034.060970] [<ffffffff81355f92>] ? system_call_fastpath+0x16/0x1b
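When comparing several crash dumps, it can help to list which frames of each call trace belong to a loadable module such as vboxdrv. The following is a minimal sketch that assumes the frame layout seen in the trace above ("[<address>] ? symbol+offset/size [module]"); the names are hypothetical. Note that frames marked with "?" are ones the kernel found on the stack without being able to verify them, so they are hints rather than a guaranteed call chain.

```python
import re

# One stack frame: "[<addr>] ? symbol+0xoff/0xsize", optionally followed by
# "[module]". A bracketed timestamp does not match the module group because
# it starts with a digit and contains a dot.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+\??\s*"
    r"(?P<symbol>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
    r"(?:\s+\[(?P<module>[A-Za-z_]\w*)\])?"
)


def parse_frames(trace_text):
    """Return (symbol, module) for every frame; module is None for core kernel."""
    return [(m.group("symbol"), m.group("module"))
            for m in FRAME_RE.finditer(trace_text)]


def frames_in_module(trace_text, module):
    """Return the symbols of all frames attributed to the given module."""
    return [sym for sym, mod in parse_frames(trace_text) if mod == module]


if __name__ == "__main__":
    sample = (
        "[76034.060312] <<EOE>> [<ffffffffa03f97c6>] ? rtR0MemAllocEx+0x17e/0x1de [vboxdrv]\n"
        "[76034.060470] [<ffffffffa03f05a3>] ? supdrvIOCtlFast+0x75/0x79 [vboxdrv]\n"
        "[76034.060645] [<ffffffff811087dd>] ? do_vfs_ioctl+0x459/0x49a\n"
    )
    print(frames_in_module(sample, "vboxdrv"))
```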
I tested the RAM with MemTest86 for days and found no problems. High CPU usage from interrupts is observed inside the VM, as described in ticket #10611, which might be relevant.
Summary of host specs:
- BIOS: VT-x and VT-d enabled, HT disabled
- Motherboard: Intel Server Board S5520HCT
- CPU: 2 x Intel Xeon E5620, 2.4GHz, 8 cores in total
- RAM: 12 x 4GB (DDR3 1333MHz ECC Unbuffered)
- HD: 2 x 600GB WD VelociRaptor 10Krpm
VM configuration:
- OS: Windows 7 SP1 (with latest updates)
- vCPU: 4-7 (it seems that the higher the number, the sooner a crash tends to occur)
- vRAM: 8GB
Attached are various logs, crash analyses and hardware info.
Host hardware info