VirtualBox

Changeset 97281 in vbox for trunk/include/VBox/vmm


Timestamp:
Oct 24, 2022 2:58:21 PM
Author:
vboxsync
Message:

VMM/cpumctx.h: Set CPUMX86EFLAGS_HW_BITS to 24 as there seems to be no clear performance difference to 32. This should allow IEM and others to get away with more efficient encoding of RFLAGS/fIntInhibit updates later (see code comment).

File:
1 edited

  • trunk/include/VBox/vmm/cpumctx.h

    --- trunk/include/VBox/vmm/cpumctx.h (r97232)
    +++ trunk/include/VBox/vmm/cpumctx.h (r97281)
    @@ -231,14 +231,36 @@
      * above this we use for storing internal state not visible to the guest.
      *
    - * The initial plan was to use 24 or 22 here and keep bits that needs clearing
    - * on instruction boundrary in the top of the first 32 bits, allowing us to use
    - * a AND with a 32-bit immediate for clearing both RF and the interrupt shadow
    - * bits.  However, when using anything less than 32, there is a significant code
    - * size increase: VMMR0.ro is 2475709 bytes with 32 bits, 2482069 bytes with 24
    - * bits, and 2482261 bytes with 22 bits.
    - *
    - * So, for now we're best off setting this to 32.
    - */
    -#define CPUMX86EFLAGS_HW_BITS       32
    + * Using a value less than 32 here means some code bloat when loading and
    + * fetching the hardware EFLAGS value.  Comparing VMMR0.r0 text size when
    + * compiling release build using gcc 11.3.1 on linux:
    + *      - 32 bits: 2475709 bytes
    + *      - 24 bits: 2482069 bytes; +6360 bytes.
    + *      - 22 bits: 2482261 bytes; +6552 bytes.
    + * Same for windows (virtual size of .text):
    + *      - 32 bits: 1498502 bytes
    + *      - 24 bits: 1502278 bytes; +3776 bytes.
    + *      - 22 bits: 1502198 bytes; +3696 bytes.
    + *
    + * In addition we pass pointer the 32-bit EFLAGS to a number of IEM assembly
    + * functions, so it would be safer to not store anything in the lower 32 bits.
    + * OTOH, we'd sooner discover buggy assembly code by doing so, as we've had one
    + * example of accidental EFLAGS trashing by these functions already.
    + *
    + * It would be more efficient for IEM to store the interrupt shadow bit (and
    + * anything else that needs to be cleared at the same time) in the 30:22 bit
    + * range, because that would allow using a simple AND imm32 instruction on x86
    + * and a MOVN imm16,16 instruction to load the constant on ARM64 (assuming the
    + * other flag needing clearing is RF (bit 16)).  Putting it in the 63:32 range
    + * means we that on x86 we'll either use a memory variant of AND or require a
    + * separate load instruction for the immediate, whereas on ARM we'll need more
    + * instructions to construct the immediate value.
    + *
    + * Comparing the instruction exit thruput via the bs2-test-1 testcase, there
    + * seems to be little difference between 32 and 24 here (best results out of 9
    + * runs on Linux/VT-x).  So, unless the results are really wrong and there is
    + * clear drop in thruput, it would on the whole make the most sense to use 24
    + * here.
    + */
    +#define CPUMX86EFLAGS_HW_BITS       24
     /** Mask for the hardware EFLAGS bits, 64-bit version. */
     #define CPUMX86EFLAGS_HW_MASK_64    (RT_BIT_64(CPUMX86EFLAGS_HW_BITS) - UINT64_C(1))
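
The encoding argument in the new comment can be illustrated with a small, self-contained C sketch. It is not VirtualBox code: every EXAMPLE_/example_ name is hypothetical, and placing the interrupt shadow bit at bit 24 (the first bit above the hardware range) is an assumption made only for illustration. The sketch derives the hardware mask with the same arithmetic as CPUMX86EFLAGS_HW_MASK_64 and checks whether the constant needed to clear RF plus the internal bit is representable as a sign-extended 32-bit immediate, which is what an x86-64 AND with a 64-bit destination accepts.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the value selected in r97281 (hypothetical local name). */
#define EXAMPLE_HW_BITS     24
/* Same arithmetic as CPUMX86EFLAGS_HW_MASK_64: the low EXAMPLE_HW_BITS bits set. */
#define EXAMPLE_HW_MASK_64  ((UINT64_C(1) << EXAMPLE_HW_BITS) - UINT64_C(1))

/* Architectural resume flag: EFLAGS bit 16. */
#define EXAMPLE_EFL_RF      (UINT64_C(1) << 16)
/* ASSUMPTION: an internal interrupt shadow bit placed right above the hardware bits. */
#define EXAMPLE_INT_SHADOW  (UINT64_C(1) << EXAMPLE_HW_BITS)

/* Clears RF and the assumed shadow bit with a single AND. */
static inline uint64_t example_clear_on_boundary(uint64_t fRFlags)
{
    return fRFlags & ~(EXAMPLE_EFL_RF | EXAMPLE_INT_SHADOW);
}

/* Returns 1 if v equals its own low 32 bits sign-extended to 64 bits,
 * i.e. if v could be encoded as the imm32 of a 64-bit AND on x86-64. */
static inline int example_fits_simm32(uint64_t v)
{
    uint64_t low = v & UINT64_C(0xFFFFFFFF);
    uint64_t ext = (low & UINT64_C(0x80000000))
                 ? (low | UINT64_C(0xFFFFFFFF00000000))
                 : low;
    return ext == v;
}

/* Small self-check tying the pieces together. */
static inline void example_self_check(void)
{
    /* 24 hardware bits give a mask of 0xFFFFFF. */
    assert(EXAMPLE_HW_MASK_64 == UINT64_C(0x00FFFFFF));
    /* Guest-visible bits outside RF survive the clearing. */
    assert(example_clear_on_boundary(UINT64_C(0x01010202)) == UINT64_C(0x00000202));
    /* Internal bit below bit 31: the clear mask fits a sign-extended imm32. */
    assert(example_fits_simm32(~(EXAMPLE_EFL_RF | EXAMPLE_INT_SHADOW)));
    /* Internal bit in the 63:32 range: it does not. */
    assert(!example_fits_simm32(~(EXAMPLE_EFL_RF | (UINT64_C(1) << 32))));
}
```

Calling example_self_check() from a test program exercises all four checks. The last two show why the comment prefers internal bits below bit 31: the clear mask keeps bit 31 set, so sign extension reproduces the all-ones upper half, whereas a bit in the 63:32 range would force a separately loaded constant.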