Opened 15 years ago
Closed 15 years ago
#6279 closed defect (invalid)
Kernel panic when running VMs with VBoxHeadless
Reported by: | Edoardo | Owned by: | |
---|---|---|---|
Component: | other | Version: | VirtualBox 3.1.4 |
Keywords: | kernel panic VBoxHeadless | Cc: | |
Guest type: | other | Host type: | Solaris |
Description
Hello to all.
I've got an issue using VirtualBox in VBoxHeadless mode on a Solaris host.
My host runs on Dell 400SC hardware with two 120GB disks in a ZFS mirror (rpool) and 6x 1TB external drives in ZFS raidz (datas). The host runs Solaris 10 10/08 s10x_u6wos_07b X86 with the 142901-04 kernel (32-bit), and I've tried both VirtualBox 3.1.2 and 3.1.4 with the same result.
I've got two linux VMs: IPCop with 2.4 kernel and Debian with 2.6 kernel
If I start and use the VMs with VirtualBox GUI over VNC or local console, all works perfectly.
However, if I run the VMs in VBoxHeadless mode, the host machine crashes with a kernel panic a few seconds after the guest OS has loaded. It does not happen immediately, but after a few seconds (during logon via console or ssh on the guest OS, though it also happens if I do not log on to the guest).
There is no error in the VM logs, but here is the crash log I found in /var/adm/messages:
Feb 24 00:22:35 srv unix: [ID 836849 kern.notice]
Feb 24 00:22:35 srv ^Mpanic[cpu1]/thread=cbb34dc0:
Feb 24 00:22:35 srv genunix: [ID 335743 kern.notice] BAD TRAP: type=e (#pf Page fault) rp=cbb34c2c addr=14a240cb occurred in module "genunix" due to an illegal access to a user address
Feb 24 00:22:35 srv unix: [ID 100000 kern.notice]
Feb 24 00:22:35 srv unix: [ID 839527 kern.notice] sched:
Feb 24 00:22:35 srv unix: [ID 753105 kern.notice] #pf Page fault
Feb 24 00:22:35 srv unix: [ID 532287 kern.notice] Bad kernel fault at addr=0x14a240cb
Feb 24 00:22:35 srv unix: [ID 243837 kern.notice] pid=0, pc=0xfe8afb35, sp=0xe2b1d9f8, eflags=0x10202
Feb 24 00:22:35 srv unix: [ID 211416 kern.notice] cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6d8<xmme,fxsr,pge,mce,pse,de>
Feb 24 00:22:35 srv unix: [ID 936844 kern.notice] cr2: 14a240cb cr3: db89000
Feb 24 00:22:35 srv unix: [ID 537610 kern.notice] gs: cc0d01b0 fs: e1070000 es: 160 ds: d520160
Feb 24 00:22:35 srv unix: [ID 537610 kern.notice] edi: 6 esi: fead4ffc ebp: cbb34c7c esp: cbb34c64
Feb 24 00:22:35 srv unix: [ID 537610 kern.notice] ebx: deb9cdf8 edx: 0 ecx: 14a240cb eax: 0
Feb 24 00:22:35 srv unix: [ID 537610 kern.notice] trp: e err: 0 eip: fe8afb35 cs: 158
Feb 24 00:22:35 srv unix: [ID 717149 kern.notice] efl: 10202 usp: e2b1d9f8 ss: cbb34ca4
Feb 24 00:22:35 srv unix: [ID 100000 kern.notice]
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34b8c unix:die+a7 (e, cbb34c2c, 14a240)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34c18 unix:trap+1130 (cbb34c2c, 14a240cb,)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34c2c unix:cmntrap+9b (cc0d01b0, e1070000,)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34c7c genunix:avl_walk+2d (cc0d6598, e2b1d9f8,)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34ca4 zfs:space_map_walk+45 (cc0d6598, fead4ffc,)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34cec zfs:metaslab_sync+1b5 (cc0d6340, 87d7a, 0)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34d14 zfs:vdev_sync+a8 (cb657200, 87d7a, 0)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34d5c zfs:spa_sync+38e (d209c680, 87d7a, 0)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34da8 zfs:txg_sync_thread+22c (d28fa080, 0)
Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] cbb34db8 unix:thread_start+8 ()
Feb 24 00:22:35 srv unix: [ID 100000 kern.notice]
Feb 24 00:22:35 srv genunix: [ID 672855 kern.notice] syncing file systems...
Feb 24 00:22:35 srv genunix: [ID 904073 kern.notice] done
Feb 24 00:22:36 srv genunix: [ID 111219 kern.notice] dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
Feb 24 00:23:25 srv genunix: [ID 409368 kern.notice] ^M100% done: 140661 pages dumped, compression ratio 2.00,
Feb 24 00:23:25 srv genunix: [ID 851671 kern.notice] dump succeeded
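The stack trace in the panic log above is easier to read once the frames are pulled out of the syslog noise. The following is only a rough sketch, not a VirtualBox or Solaris tool: the regular expression and the helper name `extract_frames` are assumptions made for illustration, and the sample text is copied from three lines of the log above.

```python
import re

# Assumed frame shape, based on the log above:
#   <8 hex digits frame pointer> <module>:<function>+<hex offset> (<args>)
FRAME_RE = re.compile(r"\b([0-9a-f]{8}) (\w+):(\w+)\+([0-9a-f]+) \(")


def extract_frames(log_text):
    """Return 'module:function+offset' strings in the order they appear."""
    return [
        f"{module}:{func}+{offset}"
        for _frame_ptr, module, func, offset in FRAME_RE.findall(log_text)
    ]


# Three lines copied from the panic log above.
sample = (
    "Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] "
    "cbb34c7c genunix:avl_walk+2d (cc0d6598, e2b1d9f8,)\n"
    "Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] "
    "cbb34ca4 zfs:space_map_walk+45 (cc0d6598, fead4ffc,)\n"
    "Feb 24 00:22:35 srv genunix: [ID 353471 kern.notice] "
    "cbb34cec zfs:metaslab_sync+1b5 (cc0d6340, 87d7a, 0)\n"
)

frames = extract_frames(sample)
zfs_frames = [f for f in frames if f.startswith("zfs:")]
for frame in frames:
    print(frame)
```

Run against the full log, this kind of filter makes it visible that the faulting thread is the ZFS transaction sync thread (txg_sync_thread down to avl_walk) rather than anything in a VirtualBox module.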
My Solaris version:
root@srv{~}> uname -a
SunOS srv 5.10 Generic_142901-04 i86pc i386 i86pc
root@srv{~}> cat /etc/release
Solaris 10 10/08 s10x_u6wos_07b X86
Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 27 October 2008
Attached are some logs and information about my system. YAMJ.xml.txt is the XML config file of the VM.
Attachments (8)
Change History (17)
by , 15 years ago
by , 15 years ago
Attachment: | VBox.log.txt added |
---|
by , 15 years ago
Attachment: | VBox.log.1.txt added |
---|
comment:1 by , 15 years ago
Could you please try using NAT for the VMs in question instead of bridged networking and see if it makes a difference? Additionally could you please enable core dumps as mentioned here: http://www.virtualbox.org/wiki/Core_dump ?
comment:2 by , 15 years ago
Hello.
I've tried with core dumps enabled (bridged network) as specified on that page, but no core dump files were created. I think the OS crashes before the core dump can be generated.
-bash-3.00$ id
uid=1001(vbox) gid=1(other)
-bash-3.00$ coreadm
global core file pattern: /var/core/core.%f.%p
global core file content: all
init core file pattern: %f.%p
init core file content: all
global core dumps: enabled
per-process core dumps: enabled
global setid core dumps: enabled
per-process setid core dumps: enabled
global core dump logging: enabled
-bash-3.00$ svcs | grep coreadm
online 11:44:41 svc:/system/coreadm:default
Now I'm running the VM with NAT networking and all seems to be OK, but I have no way to stress the VM. With NAT networking I can't mount the NFS share from the host to run the catalog program (yamj) and reproduce the same conditions as before.
Since the other VM is an IPCop, I think it is possible that the problem is with bridged networking.
I've also tried the PCnet guest network adapter instead of the Intel PRO/1000 MT, but the result is the same.
I attach the latest /var/adm/messages and the /var/crash/srv/unix.3 (the vmcore is too large, at 580MB).
Can I do other tests?
Thank you.
follow-up: 7 comment:3 by , 15 years ago
Sorry, I can't attach the /var/crash/srv/unix.3 to this ticket. The file is 1,7MB but the max attachment size is 400KB.
comment:5 by , 15 years ago
Yes, Solaris 10 :)
I have 4GB of swap space.
root@srv{~}> swap -l
swapfile dev swaplo blocks free
/dev/zvol/dsk/rpool/swap 181,1 8 4194288 4194288
root@srv{~}> swap -s
total: 142092k bytes allocated + 26500k reserved = 168592k used, 4839532k available
comment:6 by , 15 years ago
From the looks of it, this seems to be a ZFS issue with relinquishing ARC memory. Are you running the latest patches? Also, could you try limiting the ZFS ARC cache in /etc/system to ~1.5 GB:
set zfs:zfs_arc_max = 1610612736
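For anyone who wants a different limit, the value is simply the cap in bytes. Here is a small sketch of the arithmetic, assuming binary gigabytes (1 GiB = 1024^3 bytes) and the tunable name `zfs_arc_max`; the helper `arc_max_line` is hypothetical and only formats the line for /etc/system.

```python
GIB = 1024 ** 3  # bytes in one binary gigabyte


def arc_max_line(gib):
    """Format an /etc/system line capping the ZFS ARC at `gib` GiB.

    Assumes the tunable is named zfs_arc_max; the value is in bytes.
    """
    value = int(gib * GIB)
    return f"set zfs:zfs_arc_max = {value}"


line = arc_max_line(1.5)
print(line)
```

Settings in /etc/system are read at boot, so the host has to be rebooted before a new limit takes effect.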
comment:7 by , 15 years ago
Replying to EdoFede:
Sorry, I can't attach the /var/crash/srv/unix.3 to this ticket. The file is 1,7MB but the max attachment size is 400KB.
The proper core files (here "bounds", "unix.3", "vmcore.3") should be a few hundred megabytes, so it looks like the core is not valid.
comment:8 by , 15 years ago
Hello.
It's strange that this problem appears only in headless mode and only with bridged networking, isn't it? Anyway, over the weekend I upgraded to Solaris 10/09, fully patched the system with the latest kernel updates, and limited the ZFS ARC as suggested. Now the issue seems to be solved. I've run one of the VMs for one day without any problem.
Thanks for the suggestion.
Bye, Edoardo.
comment:9 by , 15 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Since this was solved by applying the latest patches to Solaris, I'll close the defect. Please reopen if required.
dmesg from host