Changeset 97281 in vbox for trunk/include/VBox/vmm
Timestamp: Oct 24, 2022 2:58:21 PM
File: 1 edited
trunk/include/VBox/vmm/cpumctx.h
--- trunk/include/VBox/vmm/cpumctx.h (r97232)
+++ trunk/include/VBox/vmm/cpumctx.h (r97281)
@@ -231,14 +231,36 @@
  * above this we use for storing internal state not visible to the guest.
  *
- * The initial plan was to use 24 or 22 here and keep bits that need clearing
- * on instruction boundary in the top of the first 32 bits, allowing us to use
- * an AND with a 32-bit immediate for clearing both RF and the interrupt shadow
- * bits. However, when using anything less than 32, there is a significant code
- * size increase: VMMR0.r0 is 2475709 bytes with 32 bits, 2482069 bytes with 24
- * bits, and 2482261 bytes with 22 bits.
- *
- * So, for now we're best off setting this to 32.
- */
-#define CPUMX86EFLAGS_HW_BITS 32
+ * Using a value less than 32 here means some code bloat when loading and
+ * fetching the hardware EFLAGS value. Comparing VMMR0.r0 text size when
+ * compiling a release build using gcc 11.3.1 on Linux:
+ *   - 32 bits: 2475709 bytes
+ *   - 24 bits: 2482069 bytes; +6360 bytes.
+ *   - 22 bits: 2482261 bytes; +6552 bytes.
+ * Same for Windows (virtual size of .text):
+ *   - 32 bits: 1498502 bytes
+ *   - 24 bits: 1502278 bytes; +3776 bytes.
+ *   - 22 bits: 1502198 bytes; +3696 bytes.
+ *
+ * In addition we pass a pointer to the 32-bit EFLAGS to a number of IEM assembly
+ * functions, so it would be safer to not store anything in the lower 32 bits.
+ * OTOH, we'd sooner discover buggy assembly code by doing so, as we've had one
+ * example of accidental EFLAGS trashing by these functions already.
+ *
+ * It would be more efficient for IEM to store the interrupt shadow bit (and
+ * anything else that needs to be cleared at the same time) in the 30:22 bit
+ * range, because that would allow using a simple AND imm32 instruction on x86
+ * and a MOVN imm16,16 instruction to load the constant on ARM64 (assuming the
+ * other flag needing clearing is RF (bit 16)). Putting it in the 63:32 range
+ * means that on x86 we'll either use a memory variant of AND or require a
+ * separate load instruction for the immediate, whereas on ARM we'll need more
+ * instructions to construct the immediate value.
+ *
+ * Comparing the instruction exit throughput via the bs2-test-1 testcase, there
+ * seems to be little difference between 32 and 24 here (best results out of 9
+ * runs on Linux/VT-x). So, unless the results are really wrong and there is a
+ * clear drop in throughput, it would on the whole make the most sense to use 24
+ * here.
+ */
+#define CPUMX86EFLAGS_HW_BITS 24
 /** Mask for the hardware EFLAGS bits, 64-bit version. */
 #define CPUMX86EFLAGS_HW_MASK_64 (RT_BIT_64(CPUMX86EFLAGS_HW_BITS) - UINT64_C(1))
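The encoding argument in the comment above can be checked with a small standalone sketch. It is not VirtualBox code: every `SKETCH_`/`sketch` name is hypothetical, and the two interrupt shadow bit positions (bit 24 and bit 32) are assumptions picked only to represent the 30:22 and 63:32 ranges. The sketch builds the mask that clears RF (bit 16) plus one internal bit, then tests whether that mask fits a sign-extended 32-bit immediate (as used by the x86-64 `AND r/m64, imm32` form) and whether its complement fits a 16-bit value shifted left by 16 (as produced by a single ARM64 `MOVN` with `LSL #16`).

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors CPUMX86EFLAGS_HW_BITS and CPUMX86EFLAGS_HW_MASK_64 at r97281. */
#define SKETCH_HW_BITS          24
#define SKETCH_HW_MASK_64       ((UINT64_C(1) << SKETCH_HW_BITS) - UINT64_C(1))

/* RF is bit 16 of EFLAGS, as stated in the comment. */
#define SKETCH_EFL_RF           (UINT64_C(1) << 16)

/* Hypothetical interrupt shadow bit positions (assumptions for illustration). */
#define SKETCH_SHADOW_IN_30_22  (UINT64_C(1) << 24)
#define SKETCH_SHADOW_IN_63_32  (UINT64_C(1) << 32)

/* AND mask that clears RF and the given internal bit, keeping everything else. */
static uint64_t sketchClearMask(uint64_t fInternalBit)
{
    return ~(SKETCH_EFL_RF | fInternalBit);
}

/* Returns 1 if the mask equals its low 32 bits sign-extended to 64 bits,
   i.e. it can be encoded as the imm32 of a 64-bit AND. */
static int sketchFitsAndImm32(uint64_t fMask)
{
    uint64_t const uLow = fMask & UINT64_C(0xffffffff);
    uint64_t const uExt = (uLow & UINT64_C(0x80000000))
                        ? (uLow | UINT64_C(0xffffffff00000000))
                        : uLow;
    return uExt == fMask;
}

/* Returns 1 if the mask is ~(imm16 << 16) for some 16-bit imm16,
   i.e. all cleared bits lie in bits 31:16. */
static int sketchFitsMovnLsl16(uint64_t fMask)
{
    return (~fMask & ~UINT64_C(0xffff0000)) == 0;
}

/* Exercises the helpers with the two hypothetical placements. */
static void sketchSelfCheck(void)
{
    assert(SKETCH_HW_MASK_64 == UINT64_C(0xffffff));
    assert(sketchFitsAndImm32(sketchClearMask(SKETCH_SHADOW_IN_30_22)));
    assert(!sketchFitsAndImm32(sketchClearMask(SKETCH_SHADOW_IN_63_32)));
    assert(sketchFitsMovnLsl16(sketchClearMask(SKETCH_SHADOW_IN_30_22)));
    assert(!sketchFitsMovnLsl16(sketchClearMask(SKETCH_SHADOW_IN_63_32)));
}
```

Calling `sketchSelfCheck()` from a test harness passes silently. Under these assumptions, a bit placed at position 24 yields a mask that both single-instruction forms can encode, while a bit placed at position 32 yields one that neither can, which matches the trade-off the comment describes.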