Changeset 85638 in vbox for trunk/src/VBox/Runtime/common
Timestamp: Aug 6, 2020 4:17:54 PM
File: 1 edited
trunk/src/VBox/Runtime/common/checksum/alt-sha3.cpp
--- trunk/src/VBox/Runtime/common/checksum/alt-sha3.cpp (r85624)
+++ trunk/src/VBox/Runtime/common/checksum/alt-sha3.cpp (r85638)
@@ -33,8 +33,14 @@
 
 /** @def RTSHA3_FULL_UNROLL
- * Do full loop unrolling unless we're using VS2019 as it seems to degrate
- * performances there for some reason. With gcc 10.2.1 on a recent Intel system
- * (10890XE), this results SHA3-512 throughput (tstRTDigest-2) increasing from
- * 83532 KiB/s to 194942 KiB/s against a text size jump from 5913 to 6929 bytes.
+ * Do full loop unrolling.
+ *
+ * With gcc 10.2.1 on a recent Intel system (10890XE), this results SHA3-512
+ * throughput (tstRTDigest-2) increasing from 83532 KiB/s to 194942 KiB/s
+ * against a text size jump from 5913 to 6929 bytes, i.e. +1016 bytes.
+ *
+ * With VS2019 on a half decent AMD system (3990X), this results in SHA3-512
+ * speedup from 147676 KiB/s to about 192770 KiB/s. The text cost is +612 bytes
+ * (4496 to 5108). When disabling the unrolling of Rho+Pi we get a little
+ * increase 196591 KiB/s (+3821) for some reason, saving 22 bytes of code.
  *
  * For comparison, openssl 1.1.1g assembly code (AMD64) achives 264915 KiB/s,
@@ -42,5 +48,5 @@
  * KECCAK_2X without ROL optimizations (they improve it to 203493 KiB/s).
  */
-#if !defined(_MSC_VER) || defined(DOXYGEN_RUNNING)
+#if !defined(IN_SUP_HARDENED_R3) || defined(DOXYGEN_RUNNING)
 # define RTSHA3_FULL_UNROLL
 #endif
@@ -147,5 +153,5 @@
 */
 {
-#ifndef RTSHA3_FULL_UNROLL
+#if !defined(RTSHA3_FULL_UNROLL) || defined(_MSC_VER) /* VS2019 is slightly slow with this section unrolled. go figure */
 static uint8_t const s_aidxState[] = {10,7,11,17,18, 3, 5,16, 8,21, 24, 4,15,23,19, 13,12, 2,20,14, 22, 9, 6, 1};
 static uint8_t const s_acRotate[] = { 1,3, 6,10,15, 21,28,36,45,55, 2,14,27,41,56, 8,25,43,62,18, 39,61,20,44};
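
The last hunk keeps the table-driven (rolled) Rho+Pi step whenever RTSHA3_FULL_UNROLL is off or the compiler is MSVC. The changeset shows only the two tables, not the loop that consumes them, so the following is a minimal sketch of how such a rolled step is commonly written. The helper names rotl64 and rhoPiRolled and the loop body are assumptions for illustration, not code from alt-sha3.cpp.

```cpp
#include <cstddef>
#include <cstdint>

// Rotate a 64-bit lane left by c bits. Every table entry below is in 1..63,
// so the complementary shift (64 - c) is always well defined.
static inline uint64_t rotl64(uint64_t x, unsigned c)
{
    return (x << c) | (x >> (64 - c));
}

// Hypothetical helper: rolled Rho+Pi over the 25-lane Keccak state.
// The two tables are copied from the changeset; the loop is an assumed,
// conventional way of walking them and may differ from the real source.
static void rhoPiRolled(uint64_t st[25])
{
    static uint8_t const s_aidxState[] = {10, 7,11,17,18,   3, 5,16, 8,21,  24, 4,15,23,19,  13,12, 2,20,14,  22, 9, 6, 1};
    static uint8_t const s_acRotate[]  = { 1, 3, 6,10,15,  21,28,36,45,55,   2,14,27,41,56,   8,25,43,62,18,  39,61,20,44};

    // Follow the permutation cycle starting at lane 1: each step writes the
    // rotated carry into the next lane and picks up that lane's old value.
    uint64_t carry = st[1];
    for (size_t i = 0; i < sizeof(s_aidxState); i++)
    {
        size_t const   idx   = s_aidxState[i];
        uint64_t const saved = st[idx];
        st[idx] = rotl64(carry, s_acRotate[i]);
        carry   = saved;
    }
}
```

A fully unrolled variant would spell out the same 24 assignments with constant indices and rotate counts. That trade of code size for throughput is what the numbers in the @def comment measure.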