VirtualBox

Opened 16 years ago

Closed 15 years ago

#2258 closed defect (fixed)

Solaris 10u5 guest hangs and consumes 100% CPU

Reported by: Shane Hjorth
Owned by:
Component: VMM/HWACCM
Version: VirtualBox 2.0.6
Keywords:
Cc:
Guest type: Solaris
Host type: Linux

Description (last modified by Frank Mehnert)

I am running VirtualBox 2.0.2 on a Sun Ultra 24 with Ubuntu 8.04 (64-bit), and stability is a real problem when Intel VT-x is enabled, which is what allows me to run Solaris 10u5 x86 in 64-bit mode.

Running with VT-x disabled (same guest, now running in 32-bit mode) is stable, however.

After a period of time (30+ minutes, depending on activity in the guest itself) with VT-x enabled, the Solaris 10u5 (64-bit) VirtualBox process becomes non-responsive (hangs) and CPU usage shoots up to 100%.

The spinning thread is looping in the following call:

(gdb) where
#0  0x00007f727cbab3c7 in ioctl () from /lib/libc.so.6
#1  0x00007f727c414adf in ?? () from /usr/lib/virtualbox/VBoxRT.so
#2  0x00007f72744409ee in VMMR3HwAccRunGC ()
   from /usr/lib/virtualbox/VBoxVMM.so
#3  0x00007f727446ee6e in EMR3ExecuteVM () from /usr/lib/virtualbox/VBoxVMM.so
#4  0x00007f72744445d8 in ?? () from /usr/lib/virtualbox/VBoxVMM.so
#5  0x00007f727c3eee2c in ?? () from /usr/lib/virtualbox/VBoxRT.so
#6  0x00007f727c410c62 in ?? () from /usr/lib/virtualbox/VBoxRT.so
#7  0x00007f727d3523f7 in start_thread () from /lib/libpthread.so.0
#8  0x00007f727cbb2b2d in clone () from /lib/libc.so.6
#9  0x0000000000000000 in ?? ()

strace output

ioctl(33, 0x56c1, 0)                    = 0
ioctl(33, 0x56c1, 0)                    = 0
ioctl(33, 0x56c1, 0)                    = 0
ioctl(33, 0x56c1, 0)                    = 0
ioctl(33, 0x56c1, 0)                    = 0
ioctl(33, 0x56c1, 0)                    = 0
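
For readers who want to interpret the raw request number in the strace output, here is a minimal sketch that splits it into its fields. It assumes the generic Linux ioctl bit layout (number in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31), which applies to x86 hosts; the helper name `decode_ioctl` is hypothetical and not part of VirtualBox.

```python
# Decode a Linux ioctl request number, assuming the generic _IOC layout:
# nr in bits 0-7, type in bits 8-15, size in bits 16-29, direction in bits 30-31.

def decode_ioctl(request: int) -> dict:
    """Split an ioctl request number into its fields (hypothetical helper)."""
    return {
        "nr": request & 0xFF,                 # command number
        "type": chr((request >> 8) & 0xFF),   # "magic" type character
        "size": (request >> 16) & 0x3FFF,     # argument size in bytes
        "dir": (request >> 30) & 0x3,         # 0 means no data transfer
    }

# Request value taken from the strace output above.
decoded = decode_ioctl(0x56C1)
print(decoded)
```

Under that assumption the value decodes to type 'V', number 193, with no argument data, which is the same type and number that the truss output from a Solaris host shows later in this ticket as `_ION('V', 193, 0)`.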

lsof output

COMMAND     PID  USER   FD   TYPE             DEVICE       SIZE     NODE NAME
VirtualBo 13331 shane   33u   CHR              10,63               24235 /dev/vboxdrv

I have tried a number of troubleshooting steps to solve this issue:

  • Patched the Ultra 24 BIOS to the latest available.
  • Upgraded VirtualBox from 2.0.0 -> 2.0.2
  • Ran the Solaris 10u5 system with and without VBox tools installed on the guest.
  • Installed a fresh copy of Solaris 10u5 (64bit).

Nothing has helped so far.

This issue appears to be triggered by load within the guest itself. For example I had to re-attempt the installation of Sun Communication Suite 5 three times before it would complete without the VM hanging half-way through the process. Installing the same product when the same OS is running in 32bit mode completes without a problem.

I currently have 3 Solaris 10u5 guests running.

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
13413 shane     20   0 1329m 1.1g  16m S    5 18.6  10:07.66 VirtualBox
13331 shane     20   0 1313m 1.1g  15m S   94 18.3  44:35.35 VirtualBox
13362 shane     20   0  775m 540m  15m S    4  9.1   8:24.34 VirtualBox

13331 => problem Solaris 10u5 64-bit instance with a small amount of application load
13413 => no-problem running Solaris 10u5 64-bit with no application load
13362 => no-problem running Solaris 10u5 32-bit with same level of application load as 13331

13331 Session Information (Runtime Attributes):

Screen Resolution 720x400
VT-x/AMD-V        Enabled
Nested Paging     Disabled
Guest Additions   Version 1.4

Guest OS Type     Solaris

Hard Disk Statistics
IDE Primary Master
DMA Transfers      79,145

PIO Transfers      2,873
Data Read          576,124,416 B
Data Written       1,436,581,888 B

CD/DVD-ROM Statistics
DMA Transfers
PIO Transfers      4,094
Data Read          7,942,144 B
Data Written       0 B

Network Adapter Statistics
Adapter 0
Data Transmitted  476,950 B
Data Received     66,507,622 B

Attachments (9)

VBox.log (44.1 KB ) - added by Shane Hjorth 16 years ago.
VBox Log file after hung process forcefully stopped.
rasta_vbox_08-2009-04-07-12-33-36.log (50.0 KB ) - added by rasta 16 years ago.
VirtualBox 2.1.4 log - rasta
SolarisGuy-VBox-Hung-64bit-S10u6.log (41.4 KB ) - added by Geoff Ongley 16 years ago.
S10U6 Hung VM VBox.log on Solaris Express nv103, VBox 2.2.2
2009-05-30-14-55-34.086-VirtualBox-1612.log (225 bytes ) - added by Geoff Ongley 16 years ago.
VBOX PID 1612 on VBox 2.2.4
2009-05-30-14-55-21.072-VirtualBox-1611.log (225 bytes ) - added by Geoff Ongley 16 years ago.
VBOX PID 1611 on VBox 2.2.4
Vbox.log-clus0-PID-1611.log (39.1 KB ) - added by Geoff Ongley 16 years ago.
VBox.log for PID 1611
Vbox.log-clus1-PID-1612.log (38.8 KB ) - added by Geoff Ongley 16 years ago.
VBox.log for PID 1612
08-2009-06-01-11-06-59_rasta.log (47.7 KB ) - added by rasta 16 years ago.
Log of hung 64-bit Solaris 10 u5 guest, Vbox 2.2.4
08-2009-06-01-15-00-01_rasta.log (63.7 KB ) - added by rasta 16 years ago.
Vbox 2.2.4 log of hung Solaris 10 u5 guest after Power Off.


Change History (57)

by Shane Hjorth, 16 years ago

Attachment: VBox.log added

VBox Log file after hung process forcefully stopped.

comment:1 by Frank Mehnert, 16 years ago

Component: VMM → VMM/HWACCM
Description: modified
priority: major → critical

comment:2 by Michael Thayer, 16 years ago

For the record, the ioctl calls are just the places where the guest system is executed, so they are perfectly normal.

comment:3 by Sander van Leeuwen, 16 years ago

Try again with 2.0.6 please.

comment:4 by Shane Hjorth, 16 years ago

I've updated to VirtualBox 2.0.6, Ubuntu 8.10 and installed Solaris 10 update 6 afresh (so the latest of everything).

Initially I was able to run 3 Solaris 10u6 VMs in 64-bit mode (VT-x enabled), but overnight all three VMs returned to the hung state (non-responsive), each consuming 100% of one core of the host's CPU (4-core host system).

The 1xRedhat EL 4 guest that was also running at the same time did not experience the same symptoms.

Running with 32bit mode instead (VT-x disabled), the same VM's have been stable for two days now.

comment:5 by Frank Mehnert, 16 years ago

Version: VirtualBox 2.0.2 → VirtualBox 2.0.6

comment:6 by Sander van Leeuwen, 16 years ago

Resolution: fixed
Status: new → closed

Reopen if this still happens with 2.1.4 please.

comment:7 by Shane Hjorth, 16 years ago

Resolution: fixed
Status: closed → reopened

I just upgraded to the latest of everything and the problem persists:

=> Ubuntu 8.10 + latest patches.
=> Virtualbox 2.1.4
=> Solaris 10 update 6 (Generic_137138-09) (64bit when VT-x is enabled)

Runtime Attributes

Screen Resolution 720x400
VT-x/AMD-V        Enabled
Nested Paging     Disabled
Guest Additions   Not Detected
Guest OS Type     Solaris

General 

Name:             Solaris 10u6 (stewie)
OS Type:          Solaris
Base Memory:      1280 MB
Video Memory:     8 MB
Boot Order:       Floppy, CD/DVD-ROM, Hard Disk
ACPI:             Enabled
IO APIC:          Disabled
VT-x/AMD-V:       Enabled
Nested Paging:    Enabled
PAE/NX:           Enabled
3D Acceleration:  Disabled

Over-night two of the three Solaris 10 update 6 VBox instances hung, consuming 100% CPU:

top - 09:36:34 up 1 day, 20 min,  2 users,  load average: 2.04, 2.11, 2.09
Tasks: 143 total,   1 running, 142 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.6%us, 46.7%sy,  0.0%ni, 51.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   6053572k total,  6018704k used,    34868k free,   113968k buffers
Swap:  9936160k total,     5320k used,  9930840k free,  1387760k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                    
11162 shane     20   0 1591m 1.3g  18m S  100 22.5 793:19.80 VirtualBox                                                  
11460 shane     20   0 1562m 1.3g  17m S    8 22.3 200:41.92 VirtualBox                                                  
11480 shane     20   0 1562m 1.3g  17m S  100 22.3 746:55.31 VirtualBox                                                  

comment:8 by woboyle, 16 years ago

I am running Solaris 10 on an 8-core CentOS 5.2 (RHEL) system in 64-bit mode with VT-x enabled, using VBox 2.1.4 and I am experiencing the same problem with the guest (Solaris 10) hanging. I have to do a hard shutdown of the VM and restart to get it back. I'm not losing any data (so far), but this instability makes this an unacceptable situation. I haven't tried with VT-x disabled. FWIW, I am running on a custom server/workstation with an Intel S5000XVN motherboard, dual E5450 processors, and 8GB of RAM.

comment:9 by sej7278, 16 years ago

I find that a Solaris 10u6 guest in 64-Bit mode with VT-x enabled will stop responding within 1-2 hours of being booted. Nothing appears to be reported on the guest, the VM just stops working - no network activity or way to unlock the screen (if not running headless).

I tried to install a recent Solaris patch-cluster and the guest hung before the cluster could even finish installing!

Host: Fedora 10 64-Bit, VirtualBox 2.1.4 (also happened in 2.1.2), Core2Quad Q6600, 8Gb RAM.

OpenSolaris 2008.11 doesn't seem to have the problem - but then it also has working screen resizing and mouse integration unlike regular Solaris.....

comment:10 by Ramshankar Venkataraman, 16 years ago

"OpenSolaris 2008.11 doesn't seem to have the problem - but then it also has working screen resizing and mouse integration unlike regular Solaris....."

Solaris 10 and Solaris Nevada both have guest additions that work. Which version of guest additions are you using? Consider installing the latest from the 2.2.0 BETA1 release. Also after uninstalling/reinstalling make sure you reboot the guest.

comment:11 by Frank Mehnert, 16 years ago

I've deleted your last comment. Please use the Attach button for attaching your VBox.log file. Thank you!

comment:12 by rasta, 16 years ago

With Vbox 2.1.4 on a 32-bit WinXPsp3 host (dual quad-core INTEL Xeon E5345 processors, 4 GB RAM) with VT-x, ACPI, and IO APIC enabled, my 64-bit Solaris 10 u5 guest (fully patched as of 4/7/09) is hanging periodically (every 1-2 hours), as has been experienced by the other users above. After the hang, the host Vbox process is pegged at 100%, also as above. Sometimes the hang happens instantaneously, and sometimes guest desktop functionality slowly diminishes until it is completely disabled, followed by the hang.

I am also experiencing occasional sluggish desktop behavior of the guest, with applications slow to respond. Sometimes, a process will hang unless I start another JDS application, which seems to kick-start the sluggish process into activity.

I have not noticed the hang when running the guest in 32-bit, but have not tried that often (only once with Vbox 2.1.4).

comment:13 by rasta, 16 years ago

I forgot to add that I have installed the 2.1.4 Solaris guest additions. At least in 64-bit, the Solaris guest is too unstable to be used as a production machine with Vbox 2.1.4 and my setup.

by rasta, 16 years ago

VirtualBox 2.1.4 log - rasta

comment:14 by rasta, 16 years ago

I just attached a log file (rasta_vbox_08-2009-04-07-12-33-36.log) produced by the Solaris VM after a VM shutdown that I initiated after the guest was partially hung (most X windows features would not work, but xterms would).

comment:15 by rasta, 16 years ago

Please note that the hanging Solaris guests described above by various users are occurring on BOTH 32-bit and 64-bit hosts. The Known Issues of Vbox 2.2 beta 1 include instabilities of 64-bit guests on 32-bit hosts only.

comment:16 by Shane Hjorth, 16 years ago

The problem is still occurring with Solaris 10u6 in 64-bit mode and VirtualBox 2.2 (full) with VT-x enabled:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                          
17388 shane     20   0 1577m 1.3g  29m S   10 22.2  42:35.12 VirtualBox
17350 shane     20   0 1576m 1.3g  29m S    8 22.2  29:14.16 VirtualBox                                                        
16850 shane     20   0 1579m 1.3g  29m S  100 22.1 189:47.64 VirtualBox                                                        
17632 shane     20   0 1321m 1.0g  32m S    9 18.1  68:00.45 VirtualBox                                                        

17388/17350 are Solaris 10u6 (not yet hung)
16850 is Solaris 10u6 (hung and spinning at 100%)
17632 is Redhat Linux EL 4 (not hung)

comment:17 by rasta, 16 years ago

I can confirm that the problem recurred with Vbox 2.2, Solaris 10 u5 guest in 64-bit mode, WinXP 32-bit host with VT-x enabled. As in my post of 2009-04-07 20:40:41, the problem manifests itself by the guest windowing system dying, while xterms still work for a while. Once the problem occurs, there is no alternative but to reboot the guest, either by VM power off and restart, or by rebooting the guest from an xterm.

comment:18 by rasta2, 16 years ago

Same as above with Vbox 2.2.2.

comment:19 by Warren Strange, 16 years ago

This problem also appears to happen on OpenSolaris 2009/06 (dev) in 64 bit mode. VirtualBox version is 2.2.2

I am running a Java application (OpenSSO/glassfish) that is started when the system boots.

In 32 bit mode I never have any problems (from first boot to application up is about 3 minutes). Solaris run queue length never goes above 3.5

In 64-bit mode, the system becomes very sluggish. The Solaris run queue goes up to 65+. The system can spin like this for 10-20 minutes (or more - I have killed it a few times and not waited for it to finish). The JVM consumes 100% of CPU. Trussing it suggests it is making a lot of poll() calls.

Note that I was able to boot OK in 64-bit mode if I tune the JVM memory down (-Xmx reduced from 800 MB to 512 MB). In 32-bit mode this makes no difference at all. It would appear as if some memory demand crosses some threshold and then causes the system to thrash - but only in 64-bit mode. Note the virtual image is allocated 1.5 GB - so it has plenty of memory (my host has 4 GB). The RSS of the JVM is ~250 MB or so (no matter what the -Xmx setting is).

comment:20 by sej7278, 16 years ago

virtualbox 2.2.2, fedora10 64-bit 8gb q6600 host, vt-x/pae/nx enabled, 1gb ram allocated to each guest.

the crash still happens even with solaris 10u7, however not opensolaris 2008.11 (both running at the same time).

this is with no applications running, just gnome and a terminal window, guest additions installed (no 3d acceleration, vrdp or shared folders/clipboard).

comment:21 by bqbauer, 16 years ago

I just discovered this bug and reproduced it on a 2008.11 64-bit host with a 64-bit Solaris 10u6 guest under VB 2.2.2 with additions installed. I also was logged in with only a terminal window open. I then installed Solaris 10u7 with all the same settings in VB, and have not had any problems after several hours.

VB guest settings are 2048MB memory, ACPI, IO APIC, VT-x, PAE, and 3D enabled. IDE Controller type set to ICH6 instead of the default PIIX4. 25GB SATA hard disk, Intel PRO/1000 NAT. All other features disabled (USB, audio, etc).

Both guest OS installations were fresh installs with a ZFS root file system and the full distribution. No OS or filesystem tweaks or extra patches.

Host is a Q6600 quad core with 8GB memory (with 2008.11 64-bit as mentioned above).

comment:22 by rasta, 16 years ago

bqbauer's findings are potentially significant. Can anyone else confirm that the 64-bit guest problem does not occur with a Solaris 10 u7 guest?

Also, does enabling PAE do anything for Solaris guests? I thought not, and don't use it. Does everyone recommend enabling it with a WinXP 32-bit host with VT-x? Pros and cons?

comment:23 by bqbauer, 16 years ago

I wanted to see if something simple like IDE versus SATA drives on the guest had an impact on this stability issue. I have two u7 guest installations identical in every way except for the IDE controller and hard drive. I ran both of them together on the same host for almost four hours today, and both remained completely stable the entire time. I gave each only 1.5GB memory for the sake of my 8GB host. I will keep these guests for a while if anyone wants me to try different settings.

Regarding PAE, I only had it enabled because of something I read when 64-bit guest support was first announced, although I now think I misread it. I had thought PAE was recommended for all 64-bit guests even though they don't really need this, but it seems it is only an issue for very few.

comment:24 by Shane Hjorth, 16 years ago

I experienced a Solaris 10 guest hang over-night with a fresh installation of Solaris 10 update 7 using ZFS as the root-filesystem, Virtualbox 2.2.2 and a single IDE (primary-master) 20GB virtual disk.

Switching to 32-bit mode has stabilised the guest.

comment:25 by bqbauer, 16 years ago

shjorth, For my own curiosity, can you post your current host details, such as the specific CPU model and total memory? Are you still running the tests on 64-bit Ubuntu 8.10? Too bad the problem can't be created at will by running a specific app or script.

comment:26 by Shane Hjorth, 16 years ago

Following CPU in use (4 cores, so the following is repeated 4 times):

cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 Extreme CPU Q6850  @ 3.00GHz
stepping        : 11
cpu MHz         : 2003.000
cache size      : 4096 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good nopl pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 5993.98
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
<snip>

shane@quagmire:~$ cat /proc/meminfo 
MemTotal:      6053572 kB
MemFree:         42952 kB
Buffers:        109368 kB
Cached:         855980 kB
SwapCached:          0 kB
Active:        4794608 kB
Inactive:       802824 kB
SwapTotal:     9936160 kB
SwapFree:      9930788 kB
Dirty:            1820 kB
Writeback:           0 kB
AnonPages:     4632392 kB
Mapped:         206448 kB
Slab:           177808 kB
SReclaimable:    69060 kB
SUnreclaim:     108748 kB
PageTables:      22748 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
WritebackTmp:        0 kB
CommitLimit:  12962944 kB
Committed_AS:  4878484 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    344448 kB
VmallocChunk: 34359393811 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
HugePages_Surp:      0
Hugepagesize:     2048 kB
DirectMap4k:     13632 kB
DirectMap2M:   6211584 kB

shane@quagmire:~$ uname -a
Linux quagmire 2.6.27-11-generic #1 SMP Wed Apr 1 20:53:41 UTC 2009 x86_64 GNU/Linux
(Ubuntu 8.10)

comment:27 by bqbauer, 16 years ago

For whatever help this is, I installed a new u7 guest on my system at work and left it running over night, and that system is still operational.

One thing different about this one is that it is on an i7 CPU with Nested Paging (or its equivalent) enabled. I also have it configured as an NIS client. As with my other successful attempts, this has a SATA drive on the guest. I did notice an issue both at home and at work where the guest terminal window doesn't respond to my typing unless I click outside the guest (not just the terminal) and then back inside. I thought this was an issue with the additions, but maybe not.

Why is this discussions window so wide...?

comment:28 by bqbauer, 16 years ago

Update: I ran the above u7 guest for just over three days without any problems. The only time I had a u7 guest lock up was with an IDE drive (virtual drive). All of my tests with SATA in the guest have been without incident. Could this have something to do with my host being OpenSolaris instead of Linux?

Also, for no good reason I have been trying out the new ICH6 IDE controller type with these tests.

Hope this helps someone.

comment:29 by Geoff Ongley, 16 years ago

This Problem appears to happen for me as well on SXCE (Solaris Express) as the host (tried on b103 -> 109), and Solaris 10 Update 6 Guests, with ZFS disks in the guest OR with UFS disks in the guest. ZFS in the host.

Truss'ing the process thread that is chewing the core, you see:

2011/9:		yield()						= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		yield()						= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
2011/9:		yield()						= 0

Over and Over again.
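
To make the pattern in a trace like the one above easier to compare between processes, the number of ioctl() calls between consecutive yield() calls can be counted. The following is only a sketch: the function name `ioctl_runs` is hypothetical, and it assumes each truss line has the form `pid/lwp:` followed by the call, as in the output above.

```python
from typing import List, Optional

def ioctl_runs(lines: List[str]) -> List[int]:
    """Count ioctl() calls between consecutive yield() calls in truss output."""
    runs: List[int] = []
    current: Optional[int] = None  # stays None until the first yield() is seen
    for line in lines:
        # Drop the "pid/lwp:" prefix and surrounding whitespace.
        call = line.split(":", 1)[-1].strip()
        if call.startswith("yield("):
            if current is not None:
                runs.append(current)
            current = 0
        elif call.startswith("ioctl(") and current is not None:
            current += 1
    return runs

# Sample rebuilt from the trace above: yield, 7 ioctls, yield, 6 ioctls, yield.
IOCTL = "2011/9:\t\tioctl(45, _ION('V', 193, 0), 0x00000000)\t= 0"
YIELD = "2011/9:\t\tyield()\t\t\t\t\t\t= 0"
sample = [YIELD] + [IOCTL] * 7 + [YIELD] + [IOCTL] * 6 + [YIELD]

runs = ioctl_runs(sample)
print(runs)  # [7, 6]
```

Calls seen before the first yield() or after the last one are ignored, because the length of those partial runs cannot be known from a truncated trace.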

The file descriptor it's doing ioctls against is:

  45: S_IFCHR mode:0600 dev:328,0 ino:155713540 uid:0 gid:3 rdev:297,0
      O_RDWR|O_LARGEFILE FD_CLOEXEC
      /devices/pseudo/vboxdrv@0:vboxdrv

This problem occurs on my AMD64x2 box with AMD-V on, and on my Core 2 Duo with VT-x enabled. This has happened as far as I can tell in every release I have tried of 2.x, right up to the latest 2.2.2 but it may be an older problem than that.

Switching off AMD-v or VT-x (switching the guest to 32-bit), as most others have stated, is stable.

This bug is starting to hurt on guests I need to be 64-bit (Sun Cluster requires 64-bit S10).

Looking at the VBox.log for the busy/hung Solaris 10 guests, it appears to have stopped logging; the last entry is from when the guest was resumed after a snapshot.

What can I do to help resolve this?

comment:30 by Frank Mehnert, 16 years ago

SolarisGuy, could you please attach a VBox.log file from VirtualBox 2.2.2 taken when the guest produced that huge load? Note that the ioctls are completely normal, as they are used to switch to the guest.

sej7278, you are talking about a crash but this ticket is about guest hangs and 100% CPU load. Please open a separate bug report for your problem if you really observe a VBox crash.

by Geoff Ongley, 16 years ago

S10U6 Hung VM VBox.log on Solaris Express nv103, VBox 2.2.2

comment:31 by Geoff Ongley, 16 years ago

Done - attached as "SolarisGuy-VBox-Hung-64bit-S10u6.log".

Thanks for taking a look at this Frank. So knowing this thread is behaving as it's meant to (in terms of doing the ioctls) is good, thanks. I've verified that a behaving process is doing the same thing, as you suggest; it is busy ioctl-ing away.

Some more info will hopefully help.

Output of prstat -mL; looking at thread 9 on PID 2709 particularly (thread consuming the CPU). Note the 100% system time for this thread only (other threads for PID 2709 are doing near nothing).

  PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
  2709 geoff    0.2 100 0.0 0.0 0.0 0.0 0.0 0.1 193 340  1K   0 VirtualBox/9
  2710 geoff    0.9 3.3 0.0 0.0 0.0 0.0  96 0.4 539  40  4K   0 VirtualBox/9
   763 root     0.1 0.1 0.0 0.0 0.0  48  51 0.1  29   0 790   0 mountd/1
  2729 geoff    0.0 0.1 0.0 0.0 0.0 0.0 100 0.0  54   0 540   0 prstat/1
  2710 geoff    0.1 0.0 0.0 0.0 0.0 0.0  99 0.4 100   0 700   0 VirtualBox/1
  2709 geoff    0.1 0.0 0.0 0.0 0.0 0.0  99 0.6 102   0 700   0 VirtualBox/1
   438 daemon   0.0 0.1 0.0 0.0 0.0 0.0 100 0.0  20   0 400   0 rpcbind/1
  2709 geoff    0.0 0.0 0.0 0.0 0.0 0.0  99 0.6 501   0 501 501 VirtualBox/13
  2687 geoff    0.0 0.1 0.0 0.0 0.0 100 0.0 0.0   1   0   7   0 VBoxSVC/6
  2710 geoff    0.0 0.0 0.0 0.0 0.0 0.0  99 0.7 501   0 577 501 VirtualBox/13
  1694 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   6   0  62   0 named/2
   994 noaccess 0.0 0.0 0.0 0.0 0.0 0.0 100 0.4  83   0  83   0 java/14
   170 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  62   0 372   0 nscd/26
   765 daemon   0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  15   0   6   0 nfsd/10521
  2709 geoff    0.0 0.0 0.0 0.0 0.0 100 0.0 0.2  64   0  64   0 VirtualBox/7
   765 daemon   0.0 0.0 0.0 0.0 0.0 0.0 100 0.1  15   0   6   0 nfsd/10520
.
.
.

So let's take a deeper look - looking at the behaving process (doing all the ioctls too), vs this misbehaving process:

2710 - behaving

2709 - 100% System time, hung unusable VM

The behaving process is getting a boat load more of these to complete. Something is slowing this thread (PID 2709 thread 9) down, for some reason.

[root@lightning ~]# dtrace -n 'syscall::*ioctl:entry /execname=="VirtualBox"/ { @[pid] = count(); }'
dtrace: description 'syscall::*ioctl:entry ' matched 1 probe
^C

     2679              151
     2709            67253
     2710           235749

(this was run for a couple of mins, both VMs idle, both 64 bit).
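
As a rough illustration of the gap, the ratio between the two counts can be worked out directly. This sketch only uses the numbers from the dtrace output above; the variable names are hypothetical.

```python
# ioctl counts per PID, copied from the dtrace aggregation above.
counts = {2679: 151, 2709: 67253, 2710: 235749}

hung_pid, healthy_pid = 2709, 2710

# How many times more ioctls the behaving process completed in the same interval.
ratio = counts[healthy_pid] / counts[hung_pid]
print(f"PID {healthy_pid} completed about {ratio:.1f}x the ioctls of PID {hung_pid}")
```

So over the same sampling window the behaving VM completed roughly three and a half times as many ioctls as the hung one.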

A "regular" plain jane prstat shows the process and all its threads consuming "50%" of the box, or 100% of one CPU (Dual Core AMD 64 in this case)

   PID USERNAME  SIZE   RSS STATE  PRI NICE      TIME  CPU PROCESS/NLWP
  2709 geoff    1129M 1094M cpu0    20    0   1:10:59  50% VirtualBox/26
  2710 geoff    1129M 1094M run     54    0   0:26:01 2.9% VirtualBox/26
  2687 geoff      18M   11M sleep   59    0   0:00:18 0.1% VBoxSVC/10
   763 root     4080K 2624K sleep   59    0   0:04:18 0.1% mountd/9
   438 daemon   3356K 1488K sleep   59    0   0:01:09 0.1% rpcbind/1
.
.
.

Hope this helps!

comment:32 by Shane Hjorth, 16 years ago

"I wanted to see if something simple like IDE versus SATA drives on the guest had an impact on this stability issue."

I tried using a SATA disk instead of IDE with a fresh Solaris 10 u7 installation (VBox 2.2.2). Overnight the guest hung and consumed 100% CPU so this didn't help unfortunately.

comment:33 by rasta, 16 years ago

Given that we should only run 32-bit Solaris guests until this problem is fixed, is it recommended that we also disable VT-x when doing so?

Today, I experienced unstable Xorg/Gnome behavior with my Sol 10 u5 32-bit guest. VT-x was enabled.

comment:34 by Sander van Leeuwen, 16 years ago

2.2.4 contains a fix for 64-bit Solaris guests that use 100% CPU. The guest itself is still responding though. Check if this update resolves your issue.

comment:35 by Sander van Leeuwen, 16 years ago

Summary: Solaris 10u5 guest hangs and consumes 100% CPU → Solaris 10u5 guest hangs and consumes 100% CPU -> check with 2.2.4

comment:36 by Geoff Ongley, 16 years ago

Hi Frank,

I was very excited to see this update, thanks.

I have booted 2 64 bit Solaris 10 guests on 2.2.4, both are update 6, UFS based hosts; they are living on a nevada 103 box with ZFS root.

I left them on overnight.

Both 64 bit guests are hammering the CPU and are not responding as per previous releases.

prstat -mL:

 PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID 
  1612 geoff    0.2  99 0.0 0.0 0.0 0.0 0.0 0.6 194 573  1K   0 VirtualBox/9
  1611 geoff     26  72 0.0 0.0 0.0 0.0 0.0 2.0 176 856 .7M   0 VirtualBox/9

Both are thread 9 again; oddly, one hasn't been completely caught up in SYS time, but it's right up there. Perhaps this is some sort of change in behaviour?

Both are unresponsive none the less.

Dtrace output, we can see the second PID 1611 is getting more ioctls done:

[root@lightning ~]# dtrace -n 'syscall::*ioctl:entry /execname=="VirtualBox"/ { @[pid] = count(); }'
dtrace: description 'syscall::*ioctl:entry ' matched 1 probe
^C

     1612            26168
     1611         11971636

So even though 1611 is hung, it is still getting a lot more ioctl sys calls done, probably because it hasn't totally got stuck in system time only? (yet?)

Also, here's something odd. PID 1611 thread 9 (process with ~70% SYS, 30% USR), is doing a stack more ioctls before any yield() is performed:

1611/9 truss output:

1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		yield()						= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1611/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0

and 1612/9 (process at virtually 100% SYS, but slightly less this time, at 99%):

1612/9:		yield()						= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		yield()						= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		yield()						= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		ioctl(45, _ION('V', 193, 0), 0x00000000)	= 0
1612/9:		yield()						= 0

I will leave them running, see if they both end up in the same spot.

I will attach both VBox logs for these guests.

Hope this helps!

by Geoff Ongley, 16 years ago

VBOX PID 1612 on VBox 2.2.4

by Geoff Ongley, 16 years ago

VBOX PID 1611 on VBox 2.2.4

by Geoff Ongley, 16 years ago

Attachment: Vbox.log-clus0-PID-1611.log added

VBox.log for PID 1611

by Geoff Ongley, 16 years ago

Attachment: Vbox.log-clus1-PID-1612.log added

VBox.log for PID 1612

comment:37 by Nikolay Igotti, 16 years ago

The behavior you see with ioctl is normal: ioctl is used as the interface between userland and the kernel driver to perform guest execution. As this is a private VirtualBox interface, truss prints garbage.

Are your guests doing anything special?
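For anyone comparing the truss output with the Linux strace output in the ticket description: the type/number pair that truss decodes as `_ION('V', 193, 0)` appears to correspond to the raw request value `0x56c1` shown by strace. The sketch below assumes the Linux x86 `_IO(type, nr)` encoding (no direction bits, zero size); `linux_io` is a hypothetical helper name used only for illustration, and whether both hosts issue the same driver request is an inference, not something confirmed in this ticket.

```python
def linux_io(ioc_type: str, nr: int) -> int:
    """Encode an ioctl request the way Linux _IO(type, nr) does on x86.

    Assumption: direction bits and size are both zero, so only the
    type character (bits 8-15) and the command number (bits 0-7) remain.
    """
    return (ord(ioc_type) << 8) | nr


# 'V' and 193 are the values truss prints in _ION('V', 193, 0).
request = linux_io("V", 193)
print(hex(request))
```

If the assumption holds, the Solaris and Linux hosts are spinning on the same request number, which fits the explanation that this ioctl is the call that runs the guest.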

comment:38 by Geoff Ongley, 16 years ago

Frank has previously talked about ioctls; I know they are expected to happen - the point I was making is on the hung guests, this is almost the ONLY thing happening (note prstat output with microstate accounting - you can see this is the only thread doing work).

There is, however, a difference in behaviour between these processes and the previous outputs: note the difference in time before a yield() operation (i.e. "I'm voluntarily giving up the CPU") between the two hung processes on 2.2.4.

Previously, all hung processes I saw displayed the "few ioctls, then yield()" behaviour. But it's possible I just never saw the 70% SYS 30% balance of syscalls (ioctls) that I now see in one of the two hung VM processes.

Previously, what we had seen was that a hung process would be doing nothing more than a few ioctls, then a yield. I'm not sure if it's a new thing to see a hung process on 2.2.4 have a different behaviour in terms of what the process is doing under the hood, but still being hung (note the second process is NOT entirely caught up in system time doing these ioctls with this one thread).

I don't think the truss output is garbage or that it should be so easily discounted; when combined with the other information it paints a picture. Clearly we're tied up in one thread, and generally that thread gives up CPU time very shortly after it gets a chance to run, over and over again.
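To make the difference between the two hung processes easier to quantify, the truss output can be reduced to "how many ioctls does the thread issue between two yield() calls". The following is a minimal sketch, assuming truss lines of the form `PID/LWP:  syscall(...) = ret` as pasted above; the function name `ioctl_runs` and the `SAMPLE` list are made up for illustration.

```python
import re
from collections import defaultdict

# Matches truss lines such as "1612/9:\t\tyield()\t\t= 0".
LINE_RE = re.compile(r"^(\d+)/(\d+):\s+(\w+)\(")


def ioctl_runs(lines):
    """Summarise ioctl bursts per (pid, lwp).

    Returns (runs, trailing):
      runs     - ioctl counts for each burst bounded by a yield() on both sides
      trailing - ioctls seen since the last yield(), or since the start of
                 the trace if the thread never yielded
    """
    runs = defaultdict(list)
    pending = defaultdict(int)
    seen_yield = set()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        key = (int(m.group(1)), int(m.group(2)))
        call = m.group(3)
        if call == "ioctl":
            pending[key] += 1
        elif call == "yield":
            # Only count a burst once its starting yield() has been seen.
            if key in seen_yield:
                runs[key].append(pending[key])
            seen_yield.add(key)
            pending[key] = 0
    trailing = {k: v for k, v in pending.items() if v}
    return dict(runs), trailing


IOCTL = "ioctl(45, _ION('V', 193, 0), 0x00000000)\t= 0"
SAMPLE = (
    ["1611/9:\t\t" + IOCTL] * 3
    + ["1612/9:\t\tyield()\t\t\t\t\t\t= 0"]
    + ["1612/9:\t\t" + IOCTL] * 6
    + ["1612/9:\t\tyield()\t\t\t\t\t\t= 0"]
)

runs, trailing = ioctl_runs(SAMPLE)
print(runs, trailing)
```

A thread that only ever shows up in `trailing` with a growing count (like 1611/9 here) never yields during the trace, while short entries in `runs` correspond to the "few ioctls, then yield()" pattern.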

This I have found is often also triggered when you get towards or at the end of a JES installation/cluster install; that commonly could also cause a hang of this nature, but 50% or more of the time the JES install would complete successfully without a hang.

We've basically covered a lot of this earlier in the thread, my last post was an update of behaviour seen on 2.2.4. If you're interested in being of assistance with this problem, please read through the earlier posts as well.

comment:39 by rasta, 16 years ago

With 2.2.4, I have just had a completely hung 64-bit Solaris 10 u5 guest with pegged cpu on a WinXP 32-bit host. ACPI, IO APIC, and VT-x enabled. 1492 MB ram for guest, 128 MB vram. I will attach the log of the hung guest. The guest had been booted for less than 2 hours. No way to kill the hung guest peacefully, only brutal guest reset with Vbox crash and Windows core dump.

by rasta, 16 years ago

Log of hung 64-bit Solaris 10 u5 guest, Vbox 2.2.4

comment:40 by rasta, 16 years ago

I am about to attach another log showing results after I had to power off a hung guest with pegged cpu.

by rasta, 16 years ago

Vbox 2.2.4 log of hung Solaris 10 u5 guest after Power Off.

comment:41 by Sean, 16 years ago

Same thing for Solaris 10 U7 Guest, 64 bit, on OpenSolaris 2009.06 host, VB 2.2.4.

comment:42 by Sean, 16 years ago

Was trying to upload the gcore output of the VirtualBox process, but the upload limit is only 400K and the core file is a few hundred MB.

comment:43 by Frank Mehnert, 16 years ago

Please contact me via private E-mail at frank _dot_ mehnert _at_ sun _dot_ com either to tell me the address of your private server or I can tell you the address of an FTP server for uploading.

comment:44 by rasta, 16 years ago

Same problem with Solaris 10 u7 64-bit guest on 32-bit WinXP host. Let me know if I can supply you with any further helpful information (logs, gcore, etc.)

comment:45 by Geoff Ongley, 15 years ago

Hey again,

The issue continues with VirtualBox 3.0.

Is there anything further we can do to assist with solving this problem?

comment:46 by Frank Mehnert, 15 years ago

Summary: changed from "Solaris 10u5 guest hangs and consumes 100% CPU -> check with 2.2.4" to "Solaris 10u5 guest hangs and consumes 100% CPU"

We think we found and fixed the problem. Please contact me at frank _dot_ mehnert _at_ sun _dot_ com to request a VBox 3.0.5 test build.

comment:47 by Frank Mehnert, 15 years ago

You can try VirtualBox 3.0.6 Beta 1, see the announcement.

comment:48 by Frank Mehnert, 15 years ago

Resolution: fixed
Status: changed from reopened to closed