VirtualBox

Opened 14 years ago

Closed 14 years ago

#8231 closed defect (fixed)

X Segmentation fault on startx -> fixed on trunk as of 2011-02-04

Reported by: rbhkamal Owned by:
Component: guest additions Version: VirtualBox 3.2.12
Keywords: Cc:
Guest type: Linux Host type: Windows

Description

Occasionally, vboxvideo_drv.so crashes X with segmentation fault at address 0xc. Unfortunately, I haven't been able to reproduce the problem on a system that allows switching to the console (so I can copy the logs). However, I was able to take a screen shot of the error.

The workaround is: when X fails to start, restart the guest services, reload the vboxvideo_drv module, then try to start X again.

The guest is a *live* stripped down Ubuntu 10.04.1 running linux-kernel-x86-2.6.32-28-generic and X server version 1.7.6

The problem seems to get worse when the guest takes longer to boot. However, the workaround above always works.

See attached for crash trace.

FYI, I'm currently requesting permission to provide you with the ISO, or perhaps an export of the virtual machine.

Attachments (5)

vboxdrv_crash.png (31.8 KB ) - added by rbhkamal 14 years ago.
Screenshot of the segmentation fault
Xorg.0.log.old (27.4 KB ) - added by rbhkamal 14 years ago.
The first failed attempt to start X
Xorg.0.log (26.1 KB ) - added by rbhkamal 14 years ago.
Second attempt to start X works
logs.7z (13.9 KB ) - added by rbhkamal 14 years ago.
Some more logs (udev, casper.log) and output of lsmod and ps -ef after the second attempt to start X
startgui.sh (223 bytes ) - added by rbhkamal 14 years ago.
This shell script is started by another script, which is started by rc.local

Change History (28)

by rbhkamal, 14 years ago

Attachment: vboxdrv_crash.png added

Screenshot of the segmentation fault

comment:1 by rbhkamal, 14 years ago

Forgot to mention that this happens with 3.2.12, 3.2.10, and 3.2.8. Please let me know if you need the ISO, or if there is anything you'd like me to try.

Thanks, RK

comment:2 by Michael Thayer, 14 years ago

If you are able to continue using the machine without rebooting then the log should still be there.

comment:3 by rbhkamal, 14 years ago

Alright, I'm in the process of getting some logs. It's very tricky since the guest OS is locked down and doesn't allow switching to the console, and it has neither a file manager nor an X terminal.

by rbhkamal, 14 years ago

Attachment: Xorg.0.log.old added

The first failed attempt to start X

by rbhkamal, 14 years ago

Attachment: Xorg.0.log added

Second attempt to start X works

by rbhkamal, 14 years ago

Attachment: logs.7z added

Some more logs (udev, casper.log) and output of lsmod and ps -ef after the second attempt to start X

comment:4 by rbhkamal, 14 years ago

Please note that the date on the guest machine was not set correctly but I just made these logs today.

comment:5 by Michael Thayer, 14 years ago

Keywords: vboxdrv removed

This looks like the same issue as ticket #5788. That one is closed as fixed, but the "solution" may also have been that the updated X server hid the bug.

comment:6 by rbhkamal, 14 years ago

Hummm... this might explain why trying to start X again works. Seems like a race condition?

comment:7 by Michael Thayer, 14 years ago

What seems strange to me is that the log looks like the server actually started successfully and stopped again. The segfault is in code executed during startup, which the log suggests has already been executed. I know that the server has or had a "generation" mechanism which implied it starting and stopping several times during the lifetime of the server process - perhaps that is involved here.

comment:8 by rbhkamal, 14 years ago

I'm not sure I understand what you mean by "generation mechanism", but here is the lifecycle of the guest OS:

Start VM
  \> execute startx
    ---- If the startx returns, then startx again.
    ---- If X fails three times in a row, halt/power off the VM.

Please see startgui.sh below

by rbhkamal, 14 years ago

Attachment: startgui.sh added

This shell script is started by another script, which is started by rc.local

comment:9 by Michael Thayer, 14 years ago

Is there any way you can install debugging symbols for the server and get a backtrace in gdb? The automatic X server backtrace is nice, but not quite as good as a real one.

comment:10 by Michael Thayer, 14 years ago

And the generation mechanism is something internal to the X server. It is a way of starting and stopping the server without ending the server process or reprobing all hardware, but I don't know anything more about it myself, and I am not even sure if it ever worked.

comment:12 by Michael Thayer, 14 years ago

So based on that link this probably happens when the server terminates and automatically restarts without ending the server process.

comment:13 by Michael Thayer, 14 years ago

Reproduced by starting the X server as plain

$ X

then starting an xterm on it (from a virtual terminal) and exiting it.

comment:14 by Michael Thayer, 14 years ago

The faulting address looks to me like the line

            VGAHWPTR(pScrn)->IOBase = pScrn->domainIOBase;

in vboxvideo.c and VGAHWPTR(pScrn) is NULL.

comment:15 by Michael Thayer, 14 years ago

We call vgaHWFreeHWRec in VBOXCloseScreen, which is called at the end of each server generation, but we call vgaHWGetHWRec to allocate the record in VBOXPreInit, which is called at the start of the first generation only. I will change this tomorrow and see if it fixes the issue.
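
The mismatch described above can be modelled in a few lines of plain C. This is only a sketch under stated assumptions, not the driver code: `HwRec`, `Screen`, `get_hw_rec`, `free_hw_rec`, `screen_init`, `first_failing_generation` and the value `0x3d0` are hypothetical stand-ins for the X server structures and for vgaHWGetHWRec, vgaHWFreeHWRec and the per-generation screen setup. Where the real driver would dereference a NULL pointer, the model returns the generation number instead of crashing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real X server types. */
typedef struct { unsigned long IOBase; } HwRec;
typedef struct { HwRec *hw; unsigned long domainIOBase; } Screen;

/* Models vgaHWGetHWRec: allocate the record if it is not there yet. */
static bool get_hw_rec(Screen *s)
{
    if (s->hw == NULL)
        s->hw = calloc(1, sizeof *s->hw);
    return s->hw != NULL;
}

/* Models vgaHWFreeHWRec: release the record and clear the pointer. */
static void free_hw_rec(Screen *s)
{
    free(s->hw);
    s->hw = NULL;
}

/* Models the per-generation screen setup. Returns false at the point
   where the real driver would write through a NULL record pointer. */
static bool screen_init(Screen *s, bool alloc_here)
{
    if (alloc_here && !get_hw_rec(s))
        return false;
    if (s->hw == NULL)
        return false;               /* the segfault in the real driver */
    s->hw->IOBase = s->domainIOBase;
    return true;
}

/* Runs the given number of server generations. Returns the 1-based
   generation that would crash, 0 if none does, -1 if the one-off
   allocation fails. */
int first_failing_generation(int generations, bool fixed)
{
    Screen s = { NULL, 0x3d0 };     /* 0x3d0 is an arbitrary example value */

    assert(generations >= 0);
    if (!fixed && !get_hw_rec(&s))  /* old layout: allocate once, up front */
        return -1;
    for (int gen = 1; gen <= generations; gen++) {
        if (!screen_init(&s, fixed))
            return gen;
        free_hw_rec(&s);            /* freed at the end of every generation */
    }
    return 0;
}
```

In this model, `first_failing_generation(3, false)` returns 2 (allocate once, free every generation), while `first_failing_generation(3, true)` returns 0 because allocation and release are paired within each generation.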

comment:16 by rbhkamal, 14 years ago

I can test it as well. However, I can't find any instructions on how to install the open-source (self-compiled) guest additions manually; right now I just run the installer from the additions ISO.

I'm also still trying to set things up with gdb, but I'm having a hard time starting X under gdb on startup.

comment:17 by Michael Thayer, 14 years ago

Could you try this build, which is a test build from the 4.0 stable branch (please see the Testbuilds wiki page)?

comment:18 by rbhkamal, 14 years ago

Just to be sure I'm doing everything correctly:
1- Install the build
2- Get the guestAdditions.ISO and upgrade the additions for the guest
3- Test

comment:19 by rbhkamal, 14 years ago

Alright, I've launched the machine about 6 times with no crashes. However, I was only able to test it with the test build of VirtualBox installed on the host OS. If I try to test using VirtualBox 3.2.12 with the test build guest additions, the virtual machine crashes immediately when X is trying to start.
It seems like the problem is fixed. Is it possible to give me a patch so I can try patching the 3.2.12 guest additions? That way I can test it with minimal changes to the test bed.

Thanks

comment:20 by Michael Thayer, 14 years ago

Here is an untested backport of the change to 3.2:

Index: src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c
===================================================================
--- src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c	(revision 69858)
+++ src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c	(working copy)
@@ -802,10 +802,6 @@
     /* Framebuffer-related setup */
     pScrn->bitmapBitOrder = BITMAP_BIT_ORDER;
 
-    /* VGA hardware initialisation */
-    if (!vgaHWGetHWRec(pScrn))
-        return FALSE;
-
 #ifdef VBOX_DRI
     /* Load the dri module. */
     if (!xf86LoadSubModule(pScrn, "dri"))
@@ -857,6 +853,10 @@
     VisualPtr visual;
     unsigned flags;
 
+    /* VGA hardware initialisation */
+    if (!vgaHWGetHWRec(pScrn))
+        return FALSE;
+
     if (pVBox->mapPhys == 0) {
 #ifdef PCIACCESS
         pVBox->mapPhys = pVBox->pciInfo->regions[0].base_addr;
Index: src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c
===================================================================
--- src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c	(revision 69858)
+++ src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c	(working copy)
@@ -637,10 +637,6 @@
     /* Framebuffer-related setup */
     pScrn->bitmapBitOrder = BITMAP_BIT_ORDER;
 
-    /* VGA hardware initialisation */
-    if (!vgaHWGetHWRec(pScrn))
-        return FALSE;
-
     TRACE_EXIT();
     return (TRUE);
 }
@@ -668,6 +664,11 @@
     unsigned flags;
 
     TRACE_ENTRY();
+
+    /* VGA hardware initialisation */
+    if (!vgaHWGetHWRec(pScrn))
+        return FALSE;
+
     /* We make use of the X11 VBE code to save and restore text mode, in
        order to keep our code simple. */
     if ((pVBox->pVbe = VBEExtendedInit(NULL, pVBox->pEnt->index,

comment:21 by rbhkamal, 14 years ago

Perfect! It works! Thank you so much! But if I may ask, how were you able to tell vgaHWGetHWRec(pScrn) was null?

comment:22 by Michael Thayer, 14 years ago

Summary: changed from "X Segmentation fault on startx" to "X Segmentation fault on startx -> fixed on trunk as of 2011-02-04"

Actually it was VGAHWPTR(pScrn) which was NULL. I was able to match the object code in vboxvideo_drv.so with the source, and VGAHWPTR(pScrn)->IOBase became VGAHWPTR(pScrn) + 0x30 - and the invalid access was at address 0x30. Then I realised that we were initialising that pointer (with vgaHWGetHWRec(pScrn)) at the start of the first server generation but uninitialising it (with vgaHWFreeHWRec(pScrn)) at the end of every generation.

I will commit the backport, so the fix will be present in any future 3.2 releases. Thanks for verifying it.
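
The address arithmetic behind that diagnosis can be illustrated with `offsetof`. The sketch below uses a hypothetical structure, `ExampleHwRec`, padded so that its `IOBase` member lands at offset 0x30; it is not the real layout of the VGA hardware record, and `io_base_fault_address` is an illustrative helper. It only shows why a member access through a NULL base pointer faults at a small address equal to the member's offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real VGA hardware record: the padding
   is chosen so that IOBase sits at offset 0x30. */
typedef struct {
    unsigned char earlier_fields[0x30];
    unsigned long IOBase;
} ExampleHwRec;

static_assert(offsetof(ExampleHwRec, IOBase) == 0x30,
              "example layout must place IOBase at offset 0x30");

/* Address that a store to base->IOBase would touch, computed as plain
   integer arithmetic so that nothing is actually dereferenced. */
uintptr_t io_base_fault_address(uintptr_t base)
{
    return base + offsetof(ExampleHwRec, IOBase);
}
```

With a NULL base, `io_base_fault_address(0)` yields 0x30 for this example layout, which is the pattern to look for when a fault address is a small constant: it usually points at a member access through a NULL structure pointer.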

comment:23 by Frank Mehnert, 14 years ago

Resolution: fixed
Status: new → closed