VirtualBox

Changeset 85638 in vbox for trunk/src/VBox/Runtime/common


Timestamp: Aug 6, 2020 4:17:54 PM
Author: vboxsync
Message: IPRT/sha3: Some VS2019 performance tweaks. bugref:9734
File: 1 edited
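
The throughput and code-size figures quoted in the RTSHA3_FULL_UNROLL comment of this changeset can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the helper names (`speedup`, `delta`, `printUnrollFigures`) are hypothetical and not part of IPRT, and the inputs are simply the numbers copied from that comment.

```cpp
#include <cstdio>

// Hypothetical helpers for illustration; not part of IPRT.
constexpr double speedup(double before, double after) { return after / before; }
constexpr long   delta(long before, long after)       { return after - before; }

// Prints the ratios and differences implied by the figures in the comment.
void printUnrollFigures()
{
    // gcc 10.2.1: SHA3-512 throughput in KiB/s, text size in bytes.
    std::printf("gcc 10.2.1: x%.2f throughput, %+ld bytes of text\n",
                speedup(83532, 194942), delta(5913, 6929));
    // VS2019: same measurements with full unrolling enabled.
    std::printf("VS2019:     x%.2f throughput, %+ld bytes of text\n",
                speedup(147676, 192770), delta(4496, 5108));
    // VS2019 with only the Rho+Pi section left as a loop.
    std::printf("VS2019, Rho+Pi looped: %+ld KiB/s\n", delta(192770, 196591));
}
```

In other words, full unrolling is worth roughly 2.3x on the gcc build but only about 1.3x on the VS2019 build, which is consistent with the commit treating VS2019 as a special case.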

    --- trunk/src/VBox/Runtime/common/checksum/alt-sha3.cpp (r85624)
    +++ trunk/src/VBox/Runtime/common/checksum/alt-sha3.cpp (r85638)
    @@ -33,8 +33,14 @@
     
     /** @def RTSHA3_FULL_UNROLL
    - * Do full loop unrolling unless we're using VS2019 as it seems to degrate
    - * performances there for some reason.  With gcc 10.2.1 on a recent Intel system
    - * (10890XE), this results SHA3-512 throughput (tstRTDigest-2) increasing from
    - * 83532 KiB/s to 194942 KiB/s against a text size jump from 5913 to 6929 bytes.
    + * Do full loop unrolling.
    + *
    + * With gcc 10.2.1 on a recent Intel system (10890XE), this results SHA3-512
    + * throughput (tstRTDigest-2) increasing from 83532 KiB/s to 194942 KiB/s
    + * against a text size jump from 5913 to 6929 bytes, i.e. +1016 bytes.
    + *
    + * With VS2019 on a half decent AMD system (3990X), this results in SHA3-512
    + * speedup from 147676 KiB/s to about 192770 KiB/s.  The text cost is +612 bytes
    + * (4496 to 5108).  When disabling the unrolling of Rho+Pi we get a little
    + * increase 196591 KiB/s (+3821) for some reason, saving 22 bytes of code.
      *
      * For comparison, openssl 1.1.1g assembly code (AMD64) achives 264915 KiB/s,
    @@ -42,5 +48,5 @@
      * KECCAK_2X without ROL optimizations (they improve it to 203493 KiB/s).
      */
    -#if !defined(_MSC_VER) || defined(DOXYGEN_RUNNING)
    +#if !defined(IN_SUP_HARDENED_R3) || defined(DOXYGEN_RUNNING)
     # define RTSHA3_FULL_UNROLL
     #endif
    @@ -147,5 +153,5 @@
              */
             {
    -#ifndef RTSHA3_FULL_UNROLL
    +#if !defined(RTSHA3_FULL_UNROLL) || defined(_MSC_VER) /* VS2019 is slightly slow with this section unrolled. go figure */
                 static uint8_t const s_aidxState[] = {10,7,11,17,18,  3, 5,16, 8,21, 24, 4,15,23,19, 13,12, 2,20,14, 22, 9, 6, 1};
                 static uint8_t const s_acRotate[]  = { 1,3, 6,10,15, 21,28,36,45,55,  2,14,27,41,56,  8,25,43,62,18, 39,61,20,44};
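
For readers unfamiliar with the section guarded by the last hunk, here is a minimal sketch of how a table-driven Rho+Pi step is commonly written with these two tables. It is not the IPRT code, whose loop body is not shown in this changeset: the function names `rotl64` and `rhoPiLooped` are hypothetical, and only the table contents are taken from the diff. After this change, MSVC builds use the looped form of this section even when RTSHA3_FULL_UNROLL is defined.

```cpp
#include <cstddef>
#include <cstdint>

// Rotate a 64-bit lane left; cBits is always in 1..63 for the table below.
static inline uint64_t rotl64(uint64_t u, unsigned cBits)
{
    return (u << cBits) | (u >> (64 - cBits));
}

// Table contents copied from the diff: destination lane and rotate count per step.
static uint8_t const s_aidxState[] = {10,7,11,17,18,  3, 5,16, 8,21, 24, 4,15,23,19, 13,12, 2,20,14, 22, 9, 6, 1};
static uint8_t const s_acRotate[]  = { 1,3, 6,10,15, 21,28,36,45,55,  2,14,27,41,56,  8,25,43,62,18, 39,61,20,44};

// Hypothetical helper: combined Rho+Pi over the 25 Keccak lanes, done in place.
// Lane 0 is not touched; the other 24 lanes form a single cycle starting at lane 1.
void rhoPiLooped(uint64_t au64State[25])
{
    uint64_t u64Carry = au64State[1];
    for (size_t i = 0; i < sizeof(s_aidxState); i++)
    {
        size_t const   idx      = s_aidxState[i];
        uint64_t const u64Saved = au64State[idx];   // keep the lane we are about to overwrite
        au64State[idx] = rotl64(u64Carry, s_acRotate[i]);
        u64Carry = u64Saved;                        // it becomes the input of the next step
    }
}
```

A fully unrolled variant would spell out the same 24 assignments with constant indexes and rotate counts, trading code size for the removal of the table loads and loop overhead. The measurements in the comment suggest that trade pays off with gcc but not for this particular section with VS2019.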