#1670 closed defect (fixed)
VBoxDrv broken on opensolaris snv_91, any attempt to start a virtual machine core dumps
Reported by: | jkeil | Owned by: | |
---|---|---|---|
Component: | other | Version: | VirtualBox 1.6.0 |
Keywords: | | Cc: | |
Guest type: | other | Host type: | other |
Description
I have an amd64 box running Solaris Express Community Edition snv_85 X86; the box has been bfu'ed to newer kernel bits multiple times. VirtualBox 1.6.0 used to work fine until a week or two ago.
But since ~May 28th I have been unable to start any virtual machine: a window opens, but the VirtualBox process immediately core dumps. The following gets logged to /var/adm/messages:
```
Jun 5 16:27:33 tiger2 vboxdrv: [ID 147122 kern.notice] VBoxDrv: VBoxDrvSolarisOpen: Dev=0x0 pSession=ffffff031d195810 pid=1408 r0proc=ffffff031c3f7070 thread=ffffff02e5268d80
Jun 5 16:27:34 tiger2 vboxdrv: [ID 641266 kern.notice] NOTICE: rtR0MemObjNativeLockUser: as_pagelock failed to get shadow pages
Jun 5 16:27:48 tiger2 genunix: [ID 603404 kern.notice] NOTICE: core_log: VirtualBox[1408] core dumped: /cores/VirtualBox-1408
```
The core dump always looks the same; it is crashing in mmR3PagePoolTerm():
```
# pflags /cores/VirtualBox-1408
core '/cores/VirtualBox-1408' of 1408:  /opt/VirtualBox/VirtualBox -comment Solaris 11 -startvm 3ba70c22-b8fb-
        data model = _LP64  flags = ORPHAN|MSACCT|MSFORK
 /1:    flags = STOPPED  pollsys(0xfffffd7fffdf9c00,0x4,0xfffffd7fffdf9dc0,0x0)
        why = PR_SUSPENDED
        sigmask = 0x00002000,0x00000000
 /2:    flags = STOPPED  pollsys(0xfffffd7ffcb1fd10,0x2,0x0,0x0)
        why = PR_SUSPENDED
        sigmask = 0x00002000,0x00000000
 /3:    flags = STOPPED  lwp_park(0x0,0x0,0x0)
        why = PR_SUSPENDED
        sigmask = 0x00002000,0x00000000
 /4:    flags = DETACH|STOPPED  lwp_park(0x0,0x0,0x0)
        why = PR_SUSPENDED
        sigmask = 0x00002000,0x00000000
 /5:    flags = DETACH|STOPPED  lwp_park(0x0,0x0,0x0)
        why = PR_SUSPENDED
        sigmask = 0x00002000,0x00000000
 /6:    flags = DETACH
        sigmask = 0xffffbefc,0x0000ffff  cursig = SIGSEGV
 /7:    flags = DAEMON|STOPPED  lwp_park(0x0,0xfffffd7ffc70cf20,0x0)
        why = PR_SUSPENDED
        sigmask = 0xffbffeff,0x0000fff7
 /8:    flags = DETACH|STOPPED  lwp_park(0x0,0x0,0x0)
        why = PR_SUSPENDED
        sigmask = 0x00002000,0x00000000
 /9:    flags = DETACH|STOPPED  pollsys(0xfffffd7ffc438a60,0x1,0xfffffd7ffc438c40,0x0)
        why = PR_SUSPENDED
        sigmask = 0x00002000,0x00000000
```
```
# pstack /cores/VirtualBox-1408
core '/cores/VirtualBox-1408' of 1408:  /opt/VirtualBox/VirtualBox -comment Solaris 11 -startvm 3ba70c22-b8fb-
-----------------  lwp# 1 / thread# 1  --------------------
 fffffd7ffdf659ca __pollsys () + a
 fffffd7ffdf162c4 pselect () + 1d4
 fffffd7ffdf16611 select () + 71
 fffffd7ffebed939 _ZN10QEventLoop13processEventsEj () + 289
 fffffd7ffec52738 _ZN10QEventLoop9enterLoopEv () + 48
 000000000055e928 _ZN18VBoxProgressDialog3runEi () + 68
 000000000055ec62 _ZN19VBoxProblemReporter23showModalProgressDialogER9CProgressRK7QStringP7QWidgeti () + 312
 0000000000583fa9 _ZN14VBoxConsoleWnd16finalizeOpenViewEv () + 1f9
 00000000004b4d2f _ZN14VBoxConsoleWnd9qt_invokeEiP8QUObject () + 44f
 fffffd7ffec97006 _ZN7QObject15activate_signalEP15QConnectionListP8QUObject () + 136
 fffffd7ffeff6999 _ZN7QSignal6signalERK8QVariant () + 99
 fffffd7ffecaf4b8 _ZN7QSignal8activateEv () + 78
 fffffd7ffecb75aa _ZN16QSingleShotTimer5eventEP6QEvent () + 2a
 fffffd7ffec3d93d _ZN12QApplication14internalNotifyEP7QObjectP6QEvent () + 9d
 fffffd7ffec3daef _ZN12QApplication6notifyEP7QObjectP6QEvent () + 7f
 fffffd7ffec3183c _ZN10QEventLoop14activateTimersEv () + 2ac
 fffffd7ffebedf2d _ZN10QEventLoop13processEventsEj () + 87d
 fffffd7ffec52738 _ZN10QEventLoop9enterLoopEv () + 48
 fffffd7ffec5268a _ZN10QEventLoop4execEv () + 2a
 0000000000539e13 main () + 7a3
 00000000004afc4c _start () + 6c
-----------------  lwp# 2 / thread# 2  --------------------
 fffffd7ffdf659ca __pollsys () + a
 fffffd7ffdf11a43 poll () + 63
 fffffd7ffe3764be _pr_poll_with_poll () + 3b4
 fffffd7ffe376649 PR_Poll () + 9
 fffffd7ffcb980c1 _Z10ConnThreadPv () + 41
 fffffd7ffe37831a _pt_root () + 90
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 3 / thread# 3  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7ffe3779cd PR_WaitCondVar () + 6b
 fffffd7ffe377cd9 PR_Wait () + 46
 fffffd7ffcb94bba _ZN14DConnectWorker3RunEv () + 124
 fffffd7ffe347ffe _ZN8nsThread4MainEPv () + 2e
 fffffd7ffe37831a _pt_root () + 90
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 4 / thread# 4  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7fff2c3930 _Z19rtSemEventMultiWaitP23RTSEMEVENTMULTIINTERNALjb () + 1a0
 fffffd7ffca48b98 _ZN10HGCMThread6MsgGetEPP11HGCMMsgCore () + 38
 fffffd7ffca4908e _Z10hgcmMsgGetjPP11HGCMMsgCore () + 4e
 fffffd7ffca4aa63 _Z10hgcmThreadjPv () + 23
 fffffd7ffca48509 _Z20hgcmWorkerThreadFuncP11RTTHREADINTPv () + 39
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 5 / thread# 5  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7fff2c3394 _Z14rtSemEventWaitP18RTSEMEVENTINTERNALjb () + 1a4
 fffffd7ffe757e75 VMR3ReqWait () + a5
 fffffd7ffe758372 VMR3ReqQueue () + 132
 fffffd7ffe75852c VMR3ReqCallVU () + 18c
 fffffd7ffe758615 VMR3ReqCallU () + 85
 fffffd7ffe755886 VMR3Create () + 1a6
 fffffd7ffc9daed9 _ZN7Console13powerUpThreadEP11RTTHREADINTPv () + 1f9
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 6 / thread# 6  --------------------
 fffffd7ffe73f5b6 mmR3PagePoolTerm () + 16
 fffffd7ffe73d34f MMR3Term () + f
 fffffd7ffe73d488 MMR3Init () + a8
 fffffd7ffe755b09 _Z11vmR3CreateUP3UVMPFiP2VMPvES3_ () + 139
 fffffd7ffe75801b _Z18vmR3ReqProcessOneUP3UVMP5VMREQ () + 16b
 fffffd7ffe75820b VMR3ReqProcessU () + 6b
 fffffd7ffe75654d _Z19vmR3EmulationThreadP11RTTHREADINTPv () + bd
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 7 / thread# 7  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf58576 cond_wait_common () + 1d6
 fffffd7ffdf587cc __cond_timedwait () + 9c
 fffffd7ffdf58807 cond_timedwait () + 27
 fffffd7fff333893 umem_update_thread () + 193
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 8 / thread# 8  --------------------
 fffffd7ffdf5e2b7 __lwp_park () + 17
 fffffd7ffdf581f7 cond_wait_queue () + 47
 fffffd7ffdf586af __cond_wait () + 5f
 fffffd7ffdf586f3 cond_wait () + 23
 fffffd7ffdf58719 pthread_cond_wait () + 9
 fffffd7fff2c3930 _Z19rtSemEventMultiWaitP23RTSEMEVENTMULTIINTERNALjb () + 1a0
 fffffd7ffca48b98 _ZN10HGCMThread6MsgGetEPP11HGCMMsgCore () + 38
 fffffd7ffca4908e _Z10hgcmMsgGetjPP11HGCMMsgCore () + 4e
 fffffd7ffca496c0 _Z17hgcmServiceThreadjPv () + 30
 fffffd7ffca48509 _Z20hgcmWorkerThreadFuncP11RTTHREADINTPv () + 39
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
-----------------  lwp# 9 / thread# 9  --------------------
 fffffd7ffdf659ca __pollsys () + a
 fffffd7ffdf162c4 pselect () + 1d4
 fffffd7ffdf16611 select () + 71
 fffffd7ffd8c8369 IoWait () + 29
 fffffd7ffd8c79a0 _XtWaitForSomething () + 180
 fffffd7ffd8c7529 XtAppNextEvent () + 139
 fffffd7ffd8c73bb XtAppMainLoop () + 3b
 fffffd7ffc456521 _Z19vboxClipboardThreadP11RTTHREADINTPv () + 311
 fffffd7fff2a5884 rtThreadMain () + 34
 fffffd7fff2c444c _Z18rtThreadNativeMainPv () + 6c
 fffffd7ffdf5dfc9 _thr_setup () + 89
 fffffd7ffdf5e270 _lwp_start ()
```
I suspect that this is caused by onnv-gate changeset #6695:
```
changeset:   6695:12d7dd4459fd
parent:      7066e93e6b89
author:      aguzovsk
date:        Thu May 22 22:23:49 2008 -0700 (13 days ago)
permissions: -rw-r--r--
description:
6423097 segvn_pagelock() may perform very poorly
6526804 DR delete_memory_thread, AIO, and segvn deadlock
6557794 segspt_dismpagelock() and segspt_shmadvise(MADV_FREE) may deadlock
6557813 seg_ppurge_seg() shouldn't flush all unrelated ISM/DISM segments
6557891 softlocks/pagelocks of anon pages should not decrement availrmem for memory swapped pages
6559612 multiple softlocks on a DISM segment should decrement availrmem just once
6562291 page_mem_avail() is stuck due to availrmem overaccounting and lack of seg_preap() calls
6596555 locked anonymous pages should not have assigned disk swap slots
6639424 hat_sfmmu.c:hat_pagesync() doesn't handle well HAT_SYNC_STOPON_REF and HAT_SYNC_STOPON_MOD flags
6639425 optimize checkpage() optimizations
6662927 page_llock contention during I/O
```
The VBoxDrv kernel module uses as_pagelock() and expects it to return a list of shadow pages, but I suspect that after the above putback the returned shadow page list can now be NULL even when the lock succeeds. This seems to confuse the VBoxDrv module.
The result is that MMR3Init() -> mmR3PagePoolInit() fails in VirtualBox userland; judging from the stack of lwp 6, the error path in MMR3Init() then calls MMR3Term() -> mmR3PagePoolTerm() on a page pool that was never fully initialized, and that is where the process core dumps.
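For illustration, here is a minimal sketch of the suspected kernel-side pattern, assuming the standard as_pagelock()/as_pageunlock() contract. The wrapper name and error handling are invented for this example; it is not the actual rtR0MemObjNativeLockUser() code.

```c
#include <sys/types.h>
#include <sys/errno.h>
#include <vm/as.h>
#include <vm/seg.h>
#include <vm/page.h>

/*
 * Hypothetical sketch, not VirtualBox source. The suspicion above is
 * that before onnv changeset 6695 a successful as_pagelock() always
 * handed back a shadow page list in *pppl, while afterwards it may
 * succeed with *pppl == NULL, so a NULL list no longer implies failure.
 */
static int lockUserPages(struct as *pAs, caddr_t pv, size_t cb, page_t ***pppl)
{
    int rc = as_pagelock(pAs, pppl, pv, cb, S_WRITE);
    if (rc != 0)
        return rc;                      /* genuine locking failure */

    if (*pppl == NULL)
    {
        /*
         * On post-6695 kernels this branch can be taken even though the
         * pages are locked; bailing out here is what would produce the
         * "as_pagelock failed to get shadow pages" notice and the
         * subsequent userland failure.
         */
        as_pageunlock(pAs, *pppl, pv, cb, S_WRITE);
        return EFAULT;                  /* illustrative error code */
    }
    return 0;
}
```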
Change History (4)
follow-up: comment:3 | comment:1 by ramshankar, 17 years ago

This should be fixed in 1.6.2 where we've introduced a VirtualBox kernel interface to prevent such breakage. It will be out in a few days; please try with 1.6.2 then and see if the problem still persists.

Remember that while installing 1.6.2 you will have to install the VirtualBoxKern package first and then install VirtualBox. Hold on, it's due to be out very soon.
comment:2 by jkeil, 17 years ago
I was impatient and recompiled the vboxdrv kernel module from the VirtualBox-1.6.0_OSE sources, with the checks for "returned shadow page pointer != NULL" removed (see the sketch below).
That fixed the problem (for now).
(And I'll upgrade to 1.6.2 when it becomes available...)
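The change amounts to something like the following hypothetical before/after, reusing the names from the sketch in the description; this is not the actual 1.6.0_OSE diff.

```c
/* Before (1.6.0): a NULL shadow page list counts as a failure,
 * which now trips on post-6695 kernels. */
rc = as_pagelock(pAs, pppl, pv, cb, S_WRITE);
if (rc != 0 || *pppl == NULL)
    return EFAULT;          /* illustrative error path */

/* After the local rebuild: trust the as_pagelock() return code
 * alone and tolerate a NULL shadow page list. */
rc = as_pagelock(pAs, pppl, pv, cb, S_WRITE);
if (rc != 0)
    return EFAULT;
```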
comment:3 by jkeil, 17 years ago

Replying to ramshankar (comment:1):

> This should be fixed in 1.6.2 where we've introduced a VirtualBox kernel interface to prevent such breakage.
Yep, I can confirm that it doesn't core dump any more with VirtualBox 1.6.2 on snv_92.
So I guess this bug can be closed.
comment:4 by ramshankar, 17 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |