Opened 14 years ago
Closed 14 years ago
#8231 closed defect (fixed)
X Segmentation fault on startx -> fixed on trunk as of 2011-02-04
Reported by: | rbhkamal | Owned by: | |
---|---|---|---|
Component: | guest additions | Version: | VirtualBox 3.2.12 |
Keywords: | | Cc: | |
Guest type: | Linux | Host type: | Windows |
Description
Occasionally, vboxvideo_drv.so crashes X with segmentation fault at address 0xc. Unfortunately, I haven't been able to reproduce the problem on a system that allows switching to the console (so I can copy the logs). However, I was able to take a screen shot of the error.
The workaround when X fails to start is to restart the guest services, reload the vboxvideo_drv module, and then try to start X again.
The guest is a *live* stripped down Ubuntu 10.04.1 running linux-kernel-x86-2.6.32-28-generic and X server version 1.7.6
The problem seems to get worse when the guest takes longer to boot; however, the workaround above always works.
See attached for crash trace.
FYI, I'm currently requesting permission to provide you with the ISO or maybe an export of the virtual machine.
Attachments (5)
Change History (28)
by , 14 years ago
Attachment: | vboxdrv_crash.png added |
---|
comment:1 by , 14 years ago
Forgot to mention that this happens with 3.2.12, 3.2.10 and 3.2.8. Please let me know if you need the ISO, or if there is anything you'd like me to try.
Thanks, RK
comment:2 by , 14 years ago
If you are able to continue using the machine without rebooting then the log should still be there.
comment:3 by , 14 years ago
Alright, I'm in the process of getting some logs. It's very tricky since the guest OS is locked down and doesn't allow switching to the console, and it has neither a file manager nor an X terminal.
by , 14 years ago
Some more logs (udev, casper.log) and output of lsmod and ps -ef after the second attempt to start X
comment:4 by , 14 years ago
Please note that the date on the guest machine was not set correctly but I just made these logs today.
comment:5 by , 14 years ago
Keywords: | vboxdrv removed |
---|
This looks like the same issue as ticket #5788. That one is closed as fixed, but the "solution" may also have been that the updated X server hid the bug.
comment:6 by , 14 years ago
Hummm... this might explain why trying to start X again works. Seems like a race condition?
comment:7 by , 14 years ago
What seems strange to me is that the log looks like the server actually started successfully and stopped again. The segfault is in code executed during startup, which the log suggests has already been executed. I know that the server has or had a "generation" mechanism which implied it starting and stopping several times during the lifetime of the server process - perhaps that is involved here.
comment:8 by , 14 years ago
I'm not sure that I understand what you mean by "generation mechanism", but here is the lifecycle of the guest OS:
1. Start the VM and execute startx.
2. If startx returns, run startx again.
3. If X fails three times in a row, halt/power off the VM.
Please see startgui.sh below
by , 14 years ago
Attachment: | startgui.sh added |
---|
This shell script is started by another shell script, which is started by rc.local.
comment:9 by , 14 years ago
Is there any way you can install debugging symbols for the server and get a backtrace in gdb? The automatic X server backtrace is nice, but not quite as good as a real one.
comment:10 by , 14 years ago
And the generation mechanism is something internal to the X server. It is a way of starting and stopping the server without ending the server process or reprobing all hardware, but I don't know anything more about it myself, and I am not even sure if it ever worked.
comment:11 by , 14 years ago
comment:12 by , 14 years ago
So based on that link this probably happens when the server terminates and automatically restarts without ending the server process.
comment:13 by , 14 years ago
Reproduced by starting the X server as plain
$ X
then starting an xterm on it (from a virtual terminal) and exiting it.
comment:14 by , 14 years ago
The faulting address looks to me like the line
VGAHWPTR(pScrn)->IOBase = pScrn->domainIOBase;
in vboxvideo.c and VGAHWPTR(pScrn) is NULL.
comment:15 by , 14 years ago
We call vgaHWFreeHWRec in VBOXCloseScreen, which is called at the end of each server generation, but we call vgaHWGetHWRec to allocate the record in VBOXPreInit, which is called at the start of the first generation only. I will change this tomorrow and see if it fixes the issue.
comment:16 by , 14 years ago
I can test it as well; however, I can't find any instructions on how to manually install self-compiled open-source guest additions. Right now I just run the installer from the additions ISO.
I'm also still trying to set things up with gdb, but I'm having a hard time starting X under gdb on startup.
comment:17 by , 14 years ago
Could you try this build, which is a test build from the 4.0 stable branch (please see the Testbuilds wiki page)?
comment:18 by , 14 years ago
Just to be sure I'm doing everything correctly:
1. Install the build.
2. Get the guestAdditions.ISO and upgrade the additions for the guest.
3. Test.
comment:19 by , 14 years ago
Alright, I've launched the machine about 6 times and no crashes. However, I was only able to test it with the testbuild virtualbox installed on the host OS. If I try to test using VirtualBox 3.2.12 with the testbuild guest additions, the virtual machine crashes immediately when X is trying to start.
It seems like the problem is fixed. Is it possible to give me a patch so I can try to patch the 3.2.12 guest additions? This way I can test it with minimal changes to the test bed.
Thanks
comment:20 by , 14 years ago
Here is an untested backport of the change to 3.2:
Index: src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c
===================================================================
--- src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c	(revision 69858)
+++ src/VBox/Additions/x11/vboxvideo/vboxvideo_13.c	(working copy)
@@ -802,10 +802,6 @@
     /* Framebuffer-related setup */
     pScrn->bitmapBitOrder = BITMAP_BIT_ORDER;
 
-    /* VGA hardware initialisation */
-    if (!vgaHWGetHWRec(pScrn))
-        return FALSE;
-
 #ifdef VBOX_DRI
     /* Load the dri module. */
     if (!xf86LoadSubModule(pScrn, "dri"))
@@ -857,6 +853,10 @@
     VisualPtr visual;
     unsigned flags;
 
+    /* VGA hardware initialisation */
+    if (!vgaHWGetHWRec(pScrn))
+        return FALSE;
+
     if (pVBox->mapPhys == 0) {
 #ifdef PCIACCESS
         pVBox->mapPhys = pVBox->pciInfo->regions[0].base_addr;
Index: src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c
===================================================================
--- src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c	(revision 69858)
+++ src/VBox/Additions/x11/vboxvideo/vboxvideo_70.c	(working copy)
@@ -637,10 +637,6 @@
     /* Framebuffer-related setup */
     pScrn->bitmapBitOrder = BITMAP_BIT_ORDER;
 
-    /* VGA hardware initialisation */
-    if (!vgaHWGetHWRec(pScrn))
-        return FALSE;
-
     TRACE_EXIT();
     return (TRUE);
 }
@@ -668,6 +664,11 @@
     unsigned flags;
 
     TRACE_ENTRY();
+
+    /* VGA hardware initialisation */
+    if (!vgaHWGetHWRec(pScrn))
+        return FALSE;
+
     /* We make use of the X11 VBE code to save and restore text mode, in
        order to keep our code simple. */
     if ((pVBox->pVbe = VBEExtendedInit(NULL, pVBox->pEnt->index,
comment:21 by , 14 years ago
Perfect! It works! Thank you so much! But if I may ask, how were you able to tell vgaHWGetHWRec(pScrn) was null?
comment:22 by , 14 years ago
Summary: | X Segmentation fault on startx → X Segmentation fault on startx -> fixed on trunk as of 2011-02-04 |
---|
Actually it was VGAHWPTR(pScrn) which was NULL. I was able to match the object code in vboxvideo_drv.so with the source, and VGAHWPTR(pScrn)->IOBase became VGAHWPTR(pScrn) + 0x30 - and the invalid access was at address 0x30. Then I realised that we were initialising that pointer (with vgaHWGetHWRec(pScrn)) at the start of the first server generation but uninitialising it (with vgaHWFreeHWRec(pScrn)) at the end of every generation.
I will commit the backport, so the fix will be present in any future 3.2 releases. Thanks for verifying it.
comment:23 by , 14 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Screenshot of the segmentation fault