Changeset 80045 in vbox
- Timestamp: Jul 29, 2019 2:38:19 PM
- svn:sync-xref-src-repo-rev: 132482
- File: 1 edited
trunk/src/VBox/VMM/Docs-RawMode.cpp
--- trunk/src/VBox/VMM/Docs-RawMode.cpp (r76553)
+++ trunk/src/VBox/VMM/Docs-RawMode.cpp (r80045)
@@ -20,15 +20,129 @@
 /** @page pg_raw Raw-mode Code Execution
  *
- * This chapter describes the virtualization technique which we call raw-mode
- * and how VirtualBox makes use of it and implements it.
+ * VirtualBox 0.0 thru 6.0 implemented a mode of guest code execution that
+ * allowed executing mostly raw guest code directly on the host CPU but without
+ * any support from VT-x or AMD-V.  It was implemented before AMD64, AMD-V and
+ * VT-x were available (former) or even specified (latter two).  This mode was
+ * removed in 6.1 (code ripped out) as it was mostly unused by that point and
+ * not worth the effort of maintaining.
  *
- * @todo Write raw-mode chapter!
+ * A future VirtualBox version may reintroduce a new kind of raw-mode for
+ * emulating non-x86 architectures, making use of the host MMU to efficiently
+ * emulate the target MMU.  This is just a wild idea at this point.
  *
  *
- * @section sec_rawr3 Raw-Ring3
+ * @section sec_old_rawmode Old Raw-mode
+ *
+ * Running guest code unmodified on the host CPU is reasonably unproblematic for
+ * ring-3 code when it runs without IOPL=3.  There will be some information
+ * leaks thru CPUID, a bunch of 286 era unprivileged instructions revealing
+ * privileged information (like SGDT, SIDT, SLDT, STR, SMSW), and hypervisor
+ * selectors can probably be identified using VERR, VERW and such instructions.
+ * However, it generally works fine for half friendly software when the CPUID
+ * difference between the target and host isn't too big.
+ *
+ * Kernel code can be executed on the host CPU too, however it needs to be
+ * pushed up a ring (guest ring-0 to ring-1, guest ring-1 to ring-2) to let the
+ * hypervisor (VMMRC.rc) be in charge of ring-0.  Ring compression causes
+ * issues when CS or SS are pushed and inspected by the guest, since the values
+ * will have bit 0 set whereas the guest expects that bit to be cleared.  In
+ * addition there are problematic instructions like POPF and IRET that the guest
+ * code uses to restore/modify EFLAGS.IF state, however the CPU just silently
+ * ignores EFLAGS.IF when it isn't running in ring-0 (or with an appropriate
+ * IOPL), which causes major headaches.  The SIDT, SGDT, STR, SLDT and SMSW
+ * instructions also cause problems since they will return information about
+ * the hypervisor rather than the guest state and cannot be trapped.
+ *
+ * So, guest kernel code needed to be scanned (by CSAM) and problematic
+ * instructions or sequences patched or recompiled (by PATM).
+ *
+ * The raw-mode execution operates in a slightly modified guest memory context,
+ * so memory accesses can be done directly without any checking or masking.  The
+ * modification was to insert the hypervisor in an unused portion of the
+ * page tables, making it float around and require it to be relocated when the
+ * guest mapped code into the area it was occupying.
+ *
+ * The old raw-mode code was 32-bit only because its inception predates the
+ * availability of the AMD64 architecture and the promise of AMD-V and VT-x made
+ * it unnecessary to do a 64-bit version of the mode.  (A long-mode port of the
+ * raw-mode execution hypervisor could in theory have been used for both 32-bit
+ * and 64-bit guests, making the relocating unnecessary for 32-bit guests,
+ * however the fact that v8086 mode does not work when the CPU is operating in
+ * long-mode made it a little less attractive.)
  *
  *
- * @section sec_rawr0 Raw-Ring0
+ * @section sec_rawmode_v2 Raw-mode v2
+ *
+ * The vision for the reinvention of raw-mode execution is to put it inside
+ * VT-x/AMD-V and run non-native instruction sets via a recompiler.
+ *
+ * The main motivation is TLB emulation using the host MMU.  An added benefit
+ * would be that the non-native instruction sets would be add-ons put on top of
+ * the existing x86/AMD64 virtualization product and therefore not require a
+ * completely separate product build.
  *
  *
+ * Outline:
+ *
+ *  - Plug-in based, so the target architecture specific stuff is mostly in
+ *    separate modules (ring-3, ring-0 (optional) and raw-mode images).
+ *
+ *  - Only 64-bit mode code (no problem since VirtualBox requires a 64-bit host
+ *    since 6.0).  So, not reintroducing structure alignment pain from old RC.
+ *
+ *  - Map the RC-hypervisor modules as ROM, using the shadowing feature for the
+ *    data sections.
+ *
+ *  - Use MMIO2-like regions for all the memory that the RC-hypervisor needs,
+ *    all shared with the associated host side plug-in components.
+ *
+ *  - The ROM and MMIO2 regions do not directly end up in the saved state, the
+ *    state is instead saved by the ring-3 architecture module.
+ *
+ *  - Device access thru MMIO mappings could be done transparently thru to the
+ *    x86/AMD64 core VMM.  It would however be possible to reintroduce the RC
+ *    side device handling, as that will not be removed in the old-RC cleanup.
+ *
+ *  - Virtual memory managed by the RC-hypervisor, optionally with help of the
+ *    ring-3 and/or ring-0 architecture modules.
+ *
+ *  - The mapping of the RC modules and memory will probably have to be runtime
+ *    relocatable again, like it was in the old RC.  Though initially and for
+ *    32-bit target architectures, we will probably use a fixed mapping.
+ *
+ *  - Memory accesses must unfortunately be range checked before being issued,
+ *    in order to prevent the guest code from accessing the hypervisor.  The
+ *    recompiled code must be able to run, modify state, call ROM code, update
+ *    statistics and such, so we cannot use page table stuff to protect the
+ *    hypervisor code & data.  (If long mode implemented segment limits, we
+ *    could've used that, but it doesn't.)
+ *
+ *  - The RC-hypervisor will make hypercalls to communicate with the ring-0 and
+ *    ring-3 host code.
+ *
+ *  - The host side should be able to dig out the current guest state from
+ *    information (think AMD64 unwinding) stored in translation blocks.
+ *
+ *  - Non-atomic state updates outside TBs could be flagged so the host knows
+ *    how to roll them back.
+ *
+ *  - SMP must be taken into account early on.
+ *
+ *  - As must existing IEM-based recompiler ideas, preferably sharing code
+ *    (basically compiling IEM targeting the other architecture).
+ *
+ * The actual implementation will depend a lot on which architectures are
+ * targeted and how they can be mapped onto AMD64/x86.  It is possible that
+ * there are some significant roadblocks preventing us from using the host MMU
+ * efficiently even.  AMD64 is for instance rather low on virtual address space
+ * compared to several other 64-bit architectures, which means we'll generate a
+ * lot of \#GPs when the guest tries to access space reserved on AMD64.  The
+ * proposed 5-level page tables will help with this, of course, but that needs to
+ * get into silicon and into user computers for it to be really helpful.
+ *
+ * One thing that helps a lot is that we don't have to consider 32-bit x86 any
+ * more, meaning that the recompiler only needs to generate 64-bit code and can
+ * assume having 15-16 GPRs at its disposal.
+ *
  */
+
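Three of the mechanisms described in the comment above can be illustrated with a small C sketch. This is not code from the VirtualBox tree; every name in it (`compress_ring`, `is_canonical48`, `hv_window_t`, `access_allowed`) is hypothetical, and it assumes 4-level paging with 48-bit virtual addresses. It shows why a guest kernel sees bit 0 set in CS/SS after ring compression, why part of a 64-bit target address space is unreachable on AMD64, and what a software range check in front of a recompiled memory access might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The low two bits of a segment selector hold the requested privilege level. */
#define SEL_RPL_MASK 0x3u

/* Hypothetical helper: the selector value a guest observes once ring
 * compression has pushed ring-0 code to ring-1 and ring-1 code to ring-2.
 * Ring-2 and ring-3 selectors are returned unchanged. */
static inline uint16_t compress_ring(uint16_t guest_sel)
{
    uint16_t rpl = (uint16_t)(guest_sel & SEL_RPL_MASK);
    if (rpl == 0 || rpl == 1)
        rpl++;
    return (uint16_t)((guest_sel & ~SEL_RPL_MASK) | rpl);
}

/* With 48-bit virtual addresses, bits 63:47 must be all zeros or all ones.
 * Anything else is non-canonical and faults when dereferenced. */
static inline bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47;          /* the 17 bits 63:47 */
    return top == 0 || top == 0x1FFFFu;
}

/* Assumed layout: the hypervisor occupies one contiguous window. */
typedef struct {
    uint64_t base;
    uint64_t size;
} hv_window_t;

/* Range check a recompiler could emit before a guest memory access:
 * reject empty or wrapping ranges, ranges touching the non-canonical hole,
 * and ranges overlapping the hypervisor window. */
static inline bool access_allowed(const hv_window_t *hv, uint64_t addr, uint64_t len)
{
    if (len == 0 || addr + len < addr)
        return false;
    uint64_t last = addr + len - 1;
    if (!is_canonical48(addr) || !is_canonical48(last))
        return false;
    if ((addr >> 63) != (last >> 63))   /* range would span the hole */
        return false;
    return last < hv->base || addr >= hv->base + hv->size;
}
```

For instance, `compress_ring(0x0008)` yields `0x0009`, which is the CS value a ring-0 guest kernel would read back after being moved to ring-1. The range check is deliberately plain compare-and-branch logic, since the comment rules out page-table protection and long mode has no segment limits to lean on.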