Changeset 71275 in vbox
- Timestamp: Mar 8, 2018 2:31:07 PM
- Files: 1 edited
trunk/src/VBox/VMM/VMMR3/NEMR3.cpp
r71040 r71275

 *
 * Later.
 *
 *
 * @section sec_nem_win Windows
 *
 * On Windows the Hyper-V root partition (dom0 in Xen terminology) does not have
 * nested VT-x or AMD-V capabilities.  For a while raw-mode worked inside it,
 * however we now \#GP when modifying CR4.  So, when Hyper-V is active on
 * Windows we have little choice but to use Hyper-V to run our VMs.
 *
 * @subsection subsec_nem_win_whv The WinHvPlatform API
 *
 * Since Windows 10 build 17083 there is a documented API for managing Hyper-V
 * VMs: header file WinHvPlatform.h and implementation in WinHvPlatform.dll.
 * This interface is a wrapper around the undocumented Virtualization
 * Infrastructure Driver (VID) API - VID.DLL and VID.SYS.  The wrapper is
 * written in C++, namespaced, and early versions (at least) used standard
 * container templates in several places.
 *
 * When creating a VM using WHvCreatePartition, it will only create the
 * WinHvPlatform structures for it, to which you get an abstract pointer.  The
 * VID API that actually creates the partition is first engaged when you call
 * WHvSetupPartition after first setting a lot of properties using
 * WHvSetPartitionProperty.  Since the VID API is just a very thin wrapper
 * around CreateFile and NtDeviceIoControl, it returns an actual HANDLE for the
 * partition to WinHvPlatform.  We fish this HANDLE out of the WinHvPlatform
 * partition structures because we need to talk directly to VID for reasons
 * we'll get to in a bit.  (Btw. we could also intercept the CreateFileW or
 * NtDeviceIoControl calls from VID.DLL to get the HANDLE should fishing in the
 * partition structures become difficult.)
 *
 * The WinHvPlatform API requires us to both set the number of guest CPUs before
 * setting up the partition and call WHvCreateVirtualProcessor for each of them.
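The creation sequence described above can be sketched with the documented WinHvPlatform calls. This is a compile-only illustration, not VirtualBox's actual code: the function name is made up, error handling is trimmed, and only the processor-count property is set.

```c
#include <windows.h>
#include <WinHvPlatform.h>

/* Sketch: WHvCreatePartition only builds WinHvPlatform-side structures;
 * VID.SYS is first engaged by WHvSetupPartition, after which the virtual
 * processors can be created. */
static HRESULT nemSketchCreatePartition(UINT32 cCpus, WHV_PARTITION_HANDLE *phPartition)
{
    WHV_PARTITION_HANDLE hPartition;
    HRESULT hrc = WHvCreatePartition(&hPartition);
    if (FAILED(hrc))
        return hrc;

    WHV_PARTITION_PROPERTY Property = {0};
    Property.ProcessorCount = cCpus;        /* must be set before WHvSetupPartition */
    hrc = WHvSetPartitionProperty(hPartition, WHvPartitionPropertyCodeProcessorCount,
                                  &Property, sizeof(Property));
    if (SUCCEEDED(hrc))
        hrc = WHvSetupPartition(hPartition); /* here VID actually creates the partition */

    for (UINT32 idCpu = 0; SUCCEEDED(hrc) && idCpu < cCpus; idCpu++)
        hrc = WHvCreateVirtualProcessor(hPartition, idCpu, 0 /*fFlags*/);

    if (FAILED(hrc))
        WHvDeletePartition(hPartition);
    else
        *phPartition = hPartition;
    return hrc;
}
```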
 * The CPU creation function boils down to a VidMessageSlotMap call that sets up
 * and maps a message buffer into ring-3 for async communication with Hyper-V
 * and/or the VID.SYS thread actually running the CPU.  When, for instance, a
 * VMEXIT is encountered, Hyper-V sends a message that the
 * WHvRunVirtualProcessor API retrieves (and later acknowledges) via
 * VidMessageSlotHandleAndGetNext.  It should be noted that
 * WHvDeleteVirtualProcessor doesn't do much, as there seems to be no partner
 * function to VidMessageSlotMap that reverses what it did.
 *
 * Memory is managed thru calls to WHvMapGpaRange and WHvUnmapGpaRange (GPA does
 * not mean grade point average here, but rather guest physical address space),
 * which correspond to VidCreateVaGpaRangeSpecifyUserVa and VidDestroyGpaRange
 * respectively.  As 'UserVa' indicates, the functions work on user process
 * memory.  The mappings are also subject to quota restrictions, so the number
 * of ranges is limited, and probably their total size as well.  Obviously
 * VID.SYS keeps track of the ranges, but so does WinHvPlatform, which means
 * there is a bit of overhead involved and the quota restrictions make sense.
 * For some reason, though, regions are lazily mapped on VMEXIT/memory access
 * by WHvRunVirtualProcessor.
 *
 * Running guest code is done thru the WHvRunVirtualProcessor function.  It
 * asynchronously starts or resumes Hyper-V CPU execution and then waits for a
 * VMEXIT message.  Other threads can interrupt the execution by using
 * WHvCancelVirtualProcessor, in which case the thread in
 * WHvRunVirtualProcessor is woken up via a dummy QueueUserAPC and will call
 * VidStopVirtualProcessor to asynchronously end execution.
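The run/cancel interplay above can be sketched as a minimal exit-dispatch loop. Again a compile-only illustration with an invented function name; note that the cancellation entry point in the shipping header is spelled WHvCancelRunVirtualProcessor, and only two exit reasons are handled here.

```c
#include <windows.h>
#include <WinHvPlatform.h>

/* Sketch: run one virtual CPU until it halts or another thread cancels the
 * run via WHvCancelRunVirtualProcessor. */
static HRESULT nemSketchRunLoop(WHV_PARTITION_HANDLE hPartition, UINT32 idCpu)
{
    for (;;)
    {
        WHV_RUN_VP_EXIT_CONTEXT ExitCtx = {0};
        HRESULT hrc = WHvRunVirtualProcessor(hPartition, idCpu, &ExitCtx, sizeof(ExitCtx));
        if (FAILED(hrc))
            return hrc;
        switch (ExitCtx.ExitReason)
        {
            case WHvRunVpExitReasonCanceled: /* another thread cancelled the run */
                return S_OK;
            case WHvRunVpExitReasonX64Halt:  /* terse message, e.g. HLT */
                return S_OK;
            default:                         /* MMIO, I/O ports, etc. would be handled here */
                break;
        }
    }
}
```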
 * The stop CPU call may not immediately succeed if the CPU encountered a
 * VMEXIT before the stop was processed, in which case the VMEXIT needs to be
 * processed first, and the pending stop will be processed in a subsequent call
 * to WHvRunVirtualProcessor.
 *
 * {something about registers}
 *
 * @subsubsection subsubsec_nem_win_whv_cons Issues / Disadvantages
 *
 * Here are some observations:
 *
 * - The WHvCancelVirtualProcessor API schedules a dummy usermode APC callback
 *   in order to cancel any current or future alertable wait in VID.SYS during
 *   the VidMessageSlotHandleAndGetNext call.
 *
 *   IIRC this will make the kernel schedule the callback thru
 *   NTDLL!KiUserApcDispatcher by modifying the thread context and quite
 *   possibly the userland thread stack.  When the APC callback returns to
 *   KiUserApcDispatcher, it will call NtContinue to restore the old thread
 *   context and resume execution from there.  The upshot is that this is a bit
 *   expensive.
 *
 *   Using an NtAlertThread call could do the same without the thread context
 *   modifications and the extra kernel call.
 *
 *
 * - Not sure if this is a thing, but WHvCancelVirtualProcessor seems to cause
 *   a lot more spurious WHvRunVirtualProcessor returns than what we get with
 *   the replacement code.  By spurious returns we mean that the subsequent
 *   call to WHvRunVirtualProcessor would return immediately.
 *
 *
 * - When WHvRunVirtualProcessor returns without a message, or on a terse
 *   VID message like HLT, it will make a kernel call to get some registers.
 *   This is potentially inefficient if the caller decides he needs more
 *   register state.
 *
 *   It would be better to just return what's available and let the caller
 *   fetch what is missing from his point of view in a single kernel call.
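The "single kernel call" point above is what the documented register interface already allows at the API level: WHvGetVirtualProcessorRegisters takes an array of names, so a caller can batch whatever it decides it needs into one call. A compile-only sketch with an invented function name and an arbitrary choice of registers:

```c
#include <windows.h>
#include <WinHvPlatform.h>

/* Sketch: fetch several registers with a single WHvGetVirtualProcessorRegisters
 * call instead of one round trip per register. */
static HRESULT nemSketchGetBasicRegs(WHV_PARTITION_HANDLE hPartition, UINT32 idCpu,
                                     WHV_REGISTER_VALUE aValues[3])
{
    static const WHV_REGISTER_NAME s_aNames[3] =
    { WHvX64RegisterRip, WHvX64RegisterRsp, WHvX64RegisterRflags };
    return WHvGetVirtualProcessorRegisters(hPartition, idCpu, s_aNames, 3, aValues);
}
```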
 *
 *
 * - The WHvRunVirtualProcessor implementation does lazy GPA range mappings
 *   when an unmapped-GPA message is received from Hyper-V.
 *
 *   Since MMIO is currently realized as unmapped GPA, this will slow down all
 *   MMIO accesses a tiny little bit, as WHvRunVirtualProcessor looks up the
 *   guest physical address and then checks if it's a pending lazy mapping.
 *
 *
 * - There is no API for modifying the protection of a page within a GPA range.
 *
 *   We're left with having to unmap the range and then remap it with the new
 *   protection.  For instance, we're actively using this to track dirty VRAM
 *   pages, which means there are occasional readonly->writable transitions at
 *   run time followed by bulk reversal to readonly when the display is
 *   refreshed.
 *
 *   Now, to work around the issue, we use page-sized GPA ranges.  In addition
 *   to adding a lot of tracking overhead to WinHvPlatform and VID.SYS, it also
 *   causes us to exceed our quota before we've even mapped a default-sized
 *   VRAM page-by-page.  So, to work around this quota issue we have to lazily
 *   map pages and actively restrict the number of mappings.
 *
 *   Our best workaround thus far is bypassing WinHvPlatform and VID when it
 *   comes to memory and instead using the hypercalls to do it
 *   (HvCallMapGpaPages, HvCallUnmapGpaPages).  (This also maps a whole lot
 *   better onto our own guest page management infrastructure.)
 *
 *
 * - Observed problems doing WHvUnmapGpaRange followed by WHvMapGpaRange.
 *
 *   As mentioned above, we've been forced to use this sequence when modifying
 *   page protection.  However, when upgrading from readonly to writable, we've
 *   ended up looping forever with the same write-to-readonly-memory exit.
 *
 *   Workaround: insert a WHvRunVirtualProcessor call and make sure to get a
 *   GPA-unmapped exit between the two calls.  Terrible for performance and
 *   code sanity.
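The "lazily map pages and actively restrict the number of mappings" idea above is essentially a capped mapping cache. Here is a portable, hypothetical illustration of that bookkeeping in plain C; the quota value and the round-robin eviction policy are made up for the example, and no real mapping calls are made.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_MAPPINGS 4   /* stand-in for the VID mapping quota */

typedef struct SKETCHMAPCACHE
{
    uint64_t aPages[SKETCH_MAX_MAPPINGS]; /* guest page numbers currently "mapped" */
    size_t   cUsed;                       /* entries in use */
    size_t   iNext;                       /* round-robin eviction cursor */
} SKETCHMAPCACHE;

/* Returns 1 if the page had to be (re)mapped, 0 if it was already mapped.
 * When the quota is exhausted, one existing mapping is evicted (i.e. would
 * be unmapped) to make room. */
static int sketchEnsureMapped(SKETCHMAPCACHE *pCache, uint64_t uPage)
{
    for (size_t i = 0; i < pCache->cUsed; i++)
        if (pCache->aPages[i] == uPage)
            return 0;                     /* hit: nothing to do */
    if (pCache->cUsed < SKETCH_MAX_MAPPINGS)
        pCache->aPages[pCache->cUsed++] = uPage;
    else
    {
        pCache->aPages[pCache->iNext] = uPage;
        pCache->iNext = (pCache->iNext + 1) % SKETCH_MAX_MAPPINGS;
    }
    return 1;                             /* miss: a mapping call was needed */
}
```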
 *
 *
 * - WHvRunVirtualProcessor wastes time converting VID/Hyper-V messages to its
 *   own defined format.
 *
 *   We understand this might be because Microsoft wishes to remain free to
 *   modify the VID/Hyper-V messages, but it's still rather silly and does
 *   slow things down.
 *
 *
 * - WHvRunVirtualProcessor would've benefited from using a callback interface:
 *   - The potential size changes of the exit context structure wouldn't be
 *     an issue, since the function could manage that itself.
 *   - State handling could be optimized and simplified (esp. cancellation).
 *
 *
 * - WHvGetVirtualProcessorRegisters and WHvSetVirtualProcessorRegisters
 *   internally convert register names, probably using temporary heap buffers.
 *
 *   From the looks of things, they convert from WHV_REGISTER_NAME to the
 *   HV_REGISTER_NAME that's documented in the "Virtual Processor Register
 *   Names" section of the "Hypervisor Top-Level Functional Specification".
 *   This feels like an awful waste of time.  We simply cannot understand why
 *   it wouldn't have sufficed to use HV_REGISTER_NAME here and simply check
 *   the input values if restrictions were desired.
 *
 *   To avoid the heap + conversion overhead, we're currently using the
 *   HvCallGetVpRegisters and HvCallSetVpRegisters calls directly.
 *
 *
 * - Why does WINHVR.SYS (or VID.SYS) only query/set 32 registers at a time
 *   thru the HvCallGetVpRegisters and HvCallSetVpRegisters hypercalls?
 *
 *   We've had no trouble getting/setting all the registers defined by
 *   WHV_REGISTER_NAME in one hypercall...
 *
 *
 * - .
 *
 */
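The 32-registers-at-a-time behaviour observed above amounts to simple batching on WINHVR.SYS's part: a set of N registers costs ceil(N/32) hypercalls instead of one. A portable, hypothetical sketch of that arithmetic (the constant reflects the observation; the function name is invented):

```c
#include <stddef.h>

#define SKETCH_REGS_PER_CALL 32 /* batch size observed in WINHVR.SYS/VID.SYS */

/* Number of HvCallGetVpRegisters/HvCallSetVpRegisters hypercalls needed to
 * cover cRegs registers when only 32 fit per call. */
static unsigned sketchCallsNeeded(size_t cRegs)
{
    return (unsigned)((cRegs + SKETCH_REGS_PER_CALL - 1) / SKETCH_REGS_PER_CALL);
}
```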