Opened 14 years ago
Closed 13 years ago
#9305 closed defect (fixed)
VBox modules randomly cause kernel panic on computer shutdown -> fixed as of 28-Jul 2011
| Reported by: | Artem S. Tashkinov | Owned by: | |
|---|---|---|---|
| Component: | other | Version: | VirtualBox 4.1.0 |
| Keywords: | | Cc: | |
| Guest type: | other | Host type: | Linux |
Description
!!Assertion Failed!!
Expression idCpu == RTMpCpuId()
Location : /tmp/vbox.0/r0drv/linux/mpnotification-r0drv-linux.c(85) rtMpNotificationLinuxOnCurrentCpu
int3: 0000 [#1] PREEMPT SMP
The rest of it is in the attached screenshot.
Attachments (6)
Change History (31)
comment:1 by , 14 years ago
!!Assertion Failed!!
Expression idCpu == RTMpCpuId()
Location : /tmp/vbox.0/r0drv/linux/mpnotification-r0drv-linux.c(85) rtMpNotificationLinuxOnCurrentCpu
int3: 0000 [#1] PREEMPT SMP
The rest of it is in the attached screenshot.
I'm running Linux 3.0 i686 vanilla kernel. I observed the same problems on Linux kernel 2.6.39. I don't remember experiencing such problems with VirtualBox 4.0.x, so this issue is probably new to VirtualBox 4.1.x.
comment:4 by , 14 years ago
comment:5 by , 13 years ago
If anyone has the same problem, here's a temporary workaround (until the VBox developers identify and solve this issue). Put these lines into your halt/shutdown script just before the halt invocation:
rmmod `lsmod | grep ^vb | awk '{print $1}'` &> /dev/null
rmmod `lsmod | grep ^vb | awk '{print $1}'` &> /dev/null
comment:6 by , 13 years ago
It's most likely a dupe of bug #9253 - but at least my bug report contains full panic information (I run framebuffer at 1600x1200).
follow-up: 8 comment:7 by , 13 years ago
Many thanks for giving us the actual assertion. It seems our notification callback is not firing on the CPU we expect it to fire on. It works fine on my x64 2.6.38-8-generic kernel but I still can't find anything in our sources that restricts this to 32-bit only. Maybe 64-bit dual-core setups are just lucky to not hit the problem.
We noticed a slight difference in the Linux sources in smp_processor_id() between 32-bit and 64-bit, but nothing concrete enough to identify the real cause.
@birdie / anyone who can see the Assertion before the trace:
Could you try patching the sources and trying again to trigger the assertion? It would be good if we can get more information out of it.
Index: src/VBox/Runtime/r0drv/linux/mpnotification-r0drv-linux.c
===================================================================
--- src/VBox/Runtime/r0drv/linux/mpnotification-r0drv-linux.c (revision 73165)
+++ src/VBox/Runtime/r0drv/linux/mpnotification-r0drv-linux.c (revision 73166)
@@ -32,6 +32,7 @@
 #include "internal/iprt.h"
 
 #include <iprt/mp.h>
+#include <iprt/asm-amd64-x86.h>
 #include <iprt/err.h>
 #include <iprt/cpuset.h>
 #include <iprt/thread.h>
@@ -82,7 +83,8 @@
     NOREF(pvUser1);
 
     AssertRelease(!RTThreadPreemptIsEnabled(NIL_RTTHREAD));
-    AssertRelease(idCpu == RTMpCpuId()); /* ASSUMES iCpu == RTCPUID */
+    AssertReleaseMsg(idCpu == RTMpCpuId(), /* ASSUMES iCpu == RTCPUID */
+                     ("idCpu=%u RTMpCpuId=%d ApicId=%d\n", idCpu, RTMpCpuId(), ASMGetApicId() ));
 
     switch (ulNativeEvent)
     {
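For anyone unsure why the patch above is useful: a bare assertion only says that the two CPU ids differed, while an assertion with a message also records what they were, which helps tell a random value from a merely wrong one. The following is a minimal sketch of that idea in plain C. The helper name `sketchDescribeCpuMismatch` and its message format are assumptions made for illustration and are not part of IPRT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of IPRT: returns NULL when the two CPU ids
 * agree, otherwise a message recording both values. It uses a static buffer,
 * so this sketch is not reentrant.
 */
const char *sketchDescribeCpuMismatch(unsigned idCpuExpected, unsigned idCpuCurrent)
{
    static char s_szMsg[64];
    int cch;

    if (idCpuExpected == idCpuCurrent)
        return NULL;

    cch = snprintf(s_szMsg, sizeof(s_szMsg), "idCpu=%u current=%u",
                   idCpuExpected, idCpuCurrent);
    /* The format above always fits into 64 bytes for 32-bit values. */
    assert(cch > 0 && (size_t)cch < sizeof(s_szMsg));
    return s_szMsg;
}
```

A caller would log the returned string before bailing out, so the report shows the values involved rather than only the failed expression.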
comment:8 by , 13 years ago
Replying to ramshankar:
I've applied the patch and I will post the results as soon as I hit this problem again.
follow-up: 10 comment:9 by , 13 years ago
In fact, the host crashes every time I run any VM, so it should be easily reproducible.
I have a quad core CPU, 4GB of RAM and I run PAE enabled kernel in x86 mode.
follow-up: 12 comment:10 by , 13 years ago
Replying to birdie:
In fact, the host crashes every time I run any VM, so it should be easily reproducible.
I have a quad core CPU, 4GB of RAM and I run PAE enabled kernel in x86 mode.
Thanks for the revised assertion!
Could you provide us with the gcc version you're using to compile the vboxdrv sources, as well as the vboxdrv.ko binary compiled with it?
Our Linux expert suggests this is a calling convention bug, so the gcc version and the vboxdrv.ko binary would help us solve this issue. This would also explain why it only happens on 32-bit.
Please compress the binary before uploading (.zip or .tar.gz)
comment:11 by , 13 years ago
I have reported this issue in #9282
fm@thinkpad:~ $ LANG=C gcc --version
gcc (GCC) 4.6.0 20110603 (Red Hat 4.6.0-10)
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
fm@thinkpad:~ $ uname -a
Linux thinkpad 2.6.38.8-35.fc15.i686.PAE #1 SMP Wed Jul 6 14:29:06 UTC 2011 i686 i686 i386 GNU/Linux
comment:12 by , 13 years ago
Replying to ramshankar:
Could you provide us with the gcc version you're using to compile the vboxdrv sources, as well as the vboxdrv.ko binary compiled with it?
GCC 4.5.3 vanilla, i.e. with no patches applied ( ftp://gcc.gnu.org/pub/gcc/releases/gcc-4.5.3/gcc-4.5.3.tar.bz2 ):
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/gcc4/bin/../libexec/gcc/i686-pc-linux-gnu/4.5.3/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: /usr/src/gcc-4.5.3/configure --enable-shared --enable-threads=posix --disable-stage1-checking --with-system-zlib --enable-__cxa_atexit --enable-multilib --with-gnu-as --with-gnu-ld --enable-languages=c,c++ --without-x --prefix=/opt/gcc4 --disable-libunwind-exceptions --with-gmp=/usr
Thread model: posix
gcc version 4.5.3 (GCC)
Our Linux expert suggests this is a calling convention bug, so the gcc version and the vboxdrv.ko binary would help us solve this issue. This would also explain why it only happens on 32-bit.
Please compress the binary before uploading (.zip or .tar.gz)
I have attached the required module.
comment:13 by , 13 years ago
GCC locally uses these flags during compilation:
-DKERNEL -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Wno-format-security -fno-delete-null-pointer-checks -O2 -m32 -msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 -march=i686 -mtune=core2 -maccumulate-outgoing-args -Wa,-mtune=generic32 -ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -Wframe-larger-than=1024 -fno-stack-protector -fno-omit-frame-pointer -fno-optimize-sibling-calls -Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow -fconserve-stack -DCC_HAVE_ASM_GOTO
comment:14 by , 13 years ago
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i686-pc-linux-gnu/4.6.1/lto-wrapper
Target: i686-pc-linux-gnu
Configured with: /build/src/gcc-4.6.1/configure --prefix=/usr --libdir=/usr/lib --libexecdir=/usr/lib --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=https://bugs.archlinux.org/ --enable-languages=c,c++,ada,fortran,go,lto,objc,obj-c++ --enable-shared --enable-threads=posix --with-system-zlib --enable-cxa_atexit --disable-libunwind-exceptions --enable-clocale=gnu --enable-gnu-unique-object --enable-linker-build-id --with-ppl --enable-cloog-backend=isl --enable-lto --enable-gold --enable-ld=default --enable-plugin --with-plugin-ld=ld.gold --disable-multilib --disable-libstdcxx-pch --enable-checking=release
Thread model: posix
gcc version 4.6.1 (GCC)
follow-ups: 16 18 comment:15 by , 13 years ago
We hope that the following patch may fix this issue, if anyone would like to give it a shot:
--- src/VBox/Runtime/r0drv/linux/mpnotification-r0drv-linux.c (revision 73209)
+++ src/VBox/Runtime/r0drv/linux/mpnotification-r0drv-linux.c (revision 73210)
@@ -77,7 +77,7 @@
  * @param pvUser2 The notification event.
  * @remarks This can be invoked in interrupt context.
  */
-static void rtMpNotificationLinuxOnCurrentCpu(RTCPUID idCpu, void *pvUser1, void *pvUser2)
+static DECLCALLBACK(void) rtMpNotificationLinuxOnCurrentCpu(RTCPUID idCpu, void *pvUser1, void *pvUser2)
 {
     unsigned long ulNativeEvent = *(unsigned long *)pvUser2;
     NOREF(pvUser1);
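Some background on why a single macro can matter here: the flags listed in comment:13 include -mregparm=3, so on 32-bit x86 a plain function in the module expects its first arguments in registers. If the code invoking the callback passes arguments on the stack instead, the callee reads an arbitrary value for idCpu, which would be consistent with the failed assertion and with the problem appearing only on 32-bit hosts. The sketch below illustrates the principle with a hypothetical `SKETCH_CALLBACK` macro standing in for DECLCALLBACK. The macro body, the `SKETCH*` names, and the helper functions are assumptions for illustration, not VirtualBox's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a callback-declaration macro. On 32-bit x86 with
 * GCC it pins the function to stack-based argument passing, whatever
 * -mregparm value the rest of the build uses. Elsewhere it is a no-op.
 */
#if defined(__GNUC__) && defined(__i386__)
# define SKETCH_CALLBACK(type) type __attribute__((cdecl, regparm(0)))
#else
# define SKETCH_CALLBACK(type) type
#endif

typedef uint32_t SKETCHCPUID;

/* The callback type carries the convention, so callers use it as well. */
typedef SKETCH_CALLBACK(void) FNSKETCHWORKER(SKETCHCPUID idCpu, void *pvUser1, void *pvUser2);

/* The definition uses the same macro, so caller and callee agree. */
static SKETCH_CALLBACK(void) sketchOnCurrentCpu(SKETCHCPUID idCpu, void *pvUser1, void *pvUser2)
{
    (void)pvUser1;
    assert(pvUser2 != NULL);
    *(SKETCHCPUID *)pvUser2 = idCpu;   /* report the id this worker received */
}

/* Invokes the worker through the typed pointer, as a dispatcher would. */
static void sketchDispatch(FNSKETCHWORKER *pfnWorker, SKETCHCPUID idCpu, void *pvUser1, void *pvUser2)
{
    pfnWorker(idCpu, pvUser1, pvUser2);
}

/* Returns the CPU id as seen by the worker after a trip through the pointer. */
SKETCHCPUID sketchRoundTrip(SKETCHCPUID idCpu)
{
    SKETCHCPUID idSeen = ~(SKETCHCPUID)0;
    sketchDispatch(sketchOnCurrentCpu, idCpu, NULL, &idSeen);
    return idSeen;
}
```

Calling `sketchRoundTrip(3)` hands 3 through the function pointer and returns what the worker saw. With the same convention on the typedef and the definition, the value survives the call.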
comment:16 by , 13 years ago
Replying to michael:
We hope that the following patch may fix this issue, if anyone would like to give it a shot:
patch works for me
comment:17 by , 13 years ago
michael, the DECLCALLBACK patch works for me as well.
Before the patch, the system would always panic on suspend when the vboxdrv module was loaded.
Using Fedora 15, kernel 2.6.38.8-35.fc15.i686.PAE, gcc-4.6.0, VirtualBox-4.1-4.1.0_73009_fedora15-1.i686 on a Lenovo T420s. Thanks!
comment:18 by , 13 years ago
Replying to michael:
We hope that the following patch may fix this issue, if anyone would like to give it a shot:
This patch fixes the issue for me.
This bug report may now be closed as FIXED.
comment:19 by , 13 years ago
Patch provided by michael also solves suspend/hibernate issues described in #9260.
Are there any plans for fixed packages?
comment:21 by , 13 years ago
Summary: changed from "VBox modules randomly cause kernel panic on computer shutdown" to "VBox modules randomly cause kernel panic on computer shutdown -> fixed as of 28-Jul 2011"
The patch above was committed on 28 July and will be included in all future releases.
comment:22 by , 13 years ago
#9407 was marked as a duplicate of this ticket, but the symptoms described here differ from those in #9407 (happy to be corrected), which is about the host crashing when it is suspended. Shutdown goes through without any problem whatsoever. Regardless of whether a VM is running or not, the host crashes on suspend. After uninstalling VirtualBox, suspend/resume work normally.
comment:23 by , 13 years ago
Did you check whether the fix from above (2011-07-28 21:15:00 by michael) helps?
comment:24 by , 13 years ago
Sorry, yes, the fix above (2011-07-28 21:15:00 by michael) seems to have fixed the problem. Thanks!
A kernel panic screenshot