Opened 7 years ago
Closed 5 years ago
#17608 closed defect (invalid)
Resuming a VM misses a heartbeat and gets reset occasionally
Reported by: | Jim Klimov | Owned by: | |
---|---|---|---|
Component: | other | Version: | VirtualBox 5.1.30 |
Keywords: | Cc: | ||
Guest type: | Solaris | Host type: | Windows |
Description
I have a Windows 7 host running several Linux and Solaris VMs with Guest Additions under VirtualBox 5.1.30. When I roam between work and home, I run a script to "savestate" the virtual machines and put the laptop to sleep; when it gets powered back on the VMs are resumed and get new IP addresses etc.
All is well except that about once in a week I end up with a freshly booted work VM, where I previously had lots of stuff running and ready to continue work (IDEs, dev-testing terminal tabs, etc). So such reset is quite inconvenient and can lose me some recent work (e.g. shell histories in terminals) as well as it costs me some downtime to re-start everything I needed.
I recently inspected the VirtualBox logs to see if the saved-state file was corrupted or something, and found it is not. In fact, the VM started to resume and then was killed by heartbeat tracker with these messages in the middle of the log:
00:00:22.025091 Changing the VM state from 'LOADING' to 'SUSPENDED' 00:00:22.025181 Changing the VM state from 'SUSPENDED' to 'RESUMING' 00:00:22.026442 Changing the VM state from 'RESUMING' to 'RUNNING' 00:00:22.026458 Console: Machine state changed to 'Running' 00:00:22.449411 GUI: UIMachineViewNormal::adjustGuestScreenSize: Adjust guest-screen size if necessary. 00:00:22.449982 GUI: 2D video acceleration is disabled 00:00:22.449997 GUI: HID LEDs sync is enabled 00:00:22.450553 GUI: UIMachineLogicNormal::sltCheckForRequestedVisualStateType: Requested-state=0, Machine-state=5 00:00:22.539429 AHCI#0: Port 0 reset 00:00:22.697286 VUSB: Attached 'HidMouse' to port 1 00:00:26.473715 VMMDev: vmmDevHeartbeatFlatlinedTimer: Guest seems to be unresponsive. Last heartbeat received 4 seconds ago 00:02:16.362691 OHCI: USB Reset 00:02:16.362971 OHCI: Software reset 00:02:16.363298 EHCI: USB Suspended 00:02:16.381140 EHCI: Hardware reset 00:02:16.381341 ACPI: Reset initiated by ACPI 00:02:16.381460 Changing the VM state from 'RUNNING' to 'RESETTING'
I am rather certain the machine did not take so long before resetting - I see it booting about as soon as it is un-paused after the load.
My primary "gut guess" is it may be that the host clock is synced in the new network after waking up, and/or the VM clock is resynced to the host after unpause, causing a jump in readings and a false-positive missed heartbeat. Or since there are several VMs resuming, something is indeed lagging under the burst of load.
Either way, the solution would be to not trust the heartbeat for some time (per monotonous host kernel clock) after VM resume, and/or not until the VM clock resync has happened.
And also some way to rule out misfiring when host clock changes abruptly might help, though I haven't seen issues when the system is just very busy with some compilation (so mouse is lagging and at least the GUI clocks in host and VMs are not updated for several minutes in a row).
No activity for 2 years and no complete VBox.log, reopen if still an issue and attach a completel VBox.log.