Changeset 72669 in vbox for trunk/src/VBox/VMM/VMMR3
- Timestamp: Jun 22, 2018 8:02:59 PM
- File: 1 edited
trunk/src/VBox/VMM/VMMR3/NEMR3Native-win.cpp
r72634 r72669
@@ -2899,4 +2899,114 @@
  *
  *
- */
-
+ * @subsection sec_nem_win_benchmarks Benchmarks.
+ *
+ * @subsubsection subsect_nem_win_benchmarks Bootsector2-test1
+ *
+ * This is ValidationKit/bootsectors/bootsector2-test1.asm as of 2018-06-22
+ * (internal r123172) running the release build of VirtualBox from the same
+ * source, though with exit optimizations disabled. Host is AMD Threadripper 1950X
+ * running an up-to-date 64-bit Windows 10 build 17134.
+ *
+ * The baseline column uses the official WinHv API for everything but physical
+ * memory mapping. The 2nd column is the default NEM/win configuration where we
+ * put the main execution loop in ring-0, using hypercalls when we can and VID for
+ * managing execution. The 3rd column is regular VirtualBox using AMD-V directly,
+ * with Hyper-V disabled and the main execution loop in ring-0.
+ *
+ * @verbatim
+TESTING...                                                WinHv API            Hypercalls + VID    VirtualBox AMD-V
+  32-bit paged protected mode, CPUID                    :     108 874 ins/sec  113% / 123 602       1198% / 1 305 113
+  32-bit pae protected mode, CPUID                      :     106 722 ins/sec  115% / 122 740       1232% / 1 315 201
+  64-bit long mode, CPUID                               :     106 798 ins/sec  114% / 122 111       1198% / 1 280 404
+  16-bit unpaged protected mode, CPUID                  :     106 835 ins/sec  114% / 121 994       1216% / 1 299 665
+  32-bit unpaged protected mode, CPUID                  :     105 257 ins/sec  115% / 121 772       1235% / 1 300 860
+  real mode, CPUID                                      :     104 507 ins/sec  116% / 121 800       1228% / 1 283 848
+CPUID EAX=1                                             : PASSED
+  32-bit paged protected mode, RDTSC                    :  99 581 834 ins/sec  100% / 100 323 307     93% / 93 473 299
+  32-bit pae protected mode, RDTSC                      :  99 620 585 ins/sec  100% / 99 960 952      84% / 83 968 839
+  64-bit long mode, RDTSC                               : 100 540 009 ins/sec  100% / 100 946 372     93% / 93 652 826
+  16-bit unpaged protected mode, RDTSC                  :  99 688 473 ins/sec  100% / 100 097 751     76% / 76 281 287
+  32-bit unpaged protected mode, RDTSC                  :  98 385 857 ins/sec  102% / 100 510 404     94% / 93 379 536
+  real mode, RDTSC                                      : 100 087 967 ins/sec  101% / 101 386 138     93% / 93 234 999
+RDTSC                                                   : PASSED
+  32-bit paged protected mode, Read CR4                 :   2 156 102 ins/sec   98% / 2 121 967    17114% / 369 009 009
+  32-bit pae protected mode, Read CR4                   :   2 163 820 ins/sec   98% / 2 133 804    17469% / 377 999 261
+  64-bit long mode, Read CR4                            :   2 164 822 ins/sec   98% / 2 128 698    18875% / 408 619 313
+  16-bit unpaged protected mode, Read CR4               :   2 162 367 ins/sec  100% / 2 168 508    17132% / 370 477 568
+  32-bit unpaged protected mode, Read CR4               :   2 163 189 ins/sec  100% / 2 169 808    16768% / 362 734 679
+  real mode, Read CR4                                   :   2 162 436 ins/sec  100% / 2 164 914    15551% / 336 288 998
+Read CR4                                                : PASSED
+  real mode, 32-bit IN                                  :     104 649 ins/sec  118% / 123 513       1028% / 1 075 831
+  real mode, 32-bit OUT                                 :     107 102 ins/sec  115% / 123 660        982% / 1 052 259
+  real mode, 32-bit IN-to-ring-3                        :     105 697 ins/sec   98% / 104 471        201% / 213 216
+  real mode, 32-bit OUT-to-ring-3                       :     105 830 ins/sec   98% / 104 598        198% / 210 495
+  16-bit unpaged protected mode, 32-bit IN              :     104 855 ins/sec  117% / 123 174       1029% / 1 079 591
+  16-bit unpaged protected mode, 32-bit OUT             :     107 529 ins/sec  115% / 124 250        992% / 1 067 053
+  16-bit unpaged protected mode, 32-bit IN-to-ring-3    :     106 337 ins/sec  103% / 109 565        196% / 209 367
+  16-bit unpaged protected mode, 32-bit OUT-to-ring-3   :     107 558 ins/sec  100% / 108 237        191% / 206 387
+  32-bit unpaged protected mode, 32-bit IN              :     106 351 ins/sec  116% / 123 584       1016% / 1 081 325
+  32-bit unpaged protected mode, 32-bit OUT             :     106 424 ins/sec  116% / 124 252        995% / 1 059 408
+  32-bit unpaged protected mode, 32-bit IN-to-ring-3    :     104 035 ins/sec  101% / 105 305        202% / 210 750
+  32-bit unpaged protected mode, 32-bit OUT-to-ring-3   :     103 831 ins/sec  102% / 106 919        205% / 213 198
+  32-bit paged protected mode, 32-bit IN                :     103 356 ins/sec  119% / 123 870       1041% / 1 076 463
+  32-bit paged protected mode, 32-bit OUT               :     107 177 ins/sec  115% / 124 302        998% / 1 069 655
+  32-bit paged protected mode, 32-bit IN-to-ring-3      :     104 491 ins/sec  100% / 104 744        200% / 209 264
+  32-bit paged protected mode, 32-bit OUT-to-ring-3     :     106 603 ins/sec   97% / 103 849        197% / 210 219
+  32-bit pae protected mode, 32-bit IN                  :     105 923 ins/sec  115% / 122 759       1041% / 1 103 261
+  32-bit pae protected mode, 32-bit OUT                 :     107 083 ins/sec  117% / 126 057       1024% / 1 096 667
+  32-bit pae protected mode, 32-bit IN-to-ring-3        :     106 114 ins/sec   97% / 103 496        199% / 211 312
+  32-bit pae protected mode, 32-bit OUT-to-ring-3       :     105 675 ins/sec   96% / 102 096        198% / 209 890
+  64-bit long mode, 32-bit IN                           :     105 800 ins/sec  113% / 120 006       1013% / 1 072 116
+  64-bit long mode, 32-bit OUT                          :     105 635 ins/sec  113% / 120 375        997% / 1 053 655
+  64-bit long mode, 32-bit IN-to-ring-3                 :     105 274 ins/sec   95% / 100 763        197% / 208 026
+  64-bit long mode, 32-bit OUT-to-ring-3                :     106 262 ins/sec   94% / 100 749        196% / 209 288
+NOP I/O Port Access                                     : PASSED
+  32-bit paged protected mode, 32-bit read              :      57 687 ins/sec  119% / 69 136        1197% / 690 548
+  32-bit paged protected mode, 32-bit write             :      57 957 ins/sec  118% / 68 935        1183% / 685 930
+  32-bit paged protected mode, 32-bit read-to-ring-3    :      57 958 ins/sec   95% / 55 432         276% / 160 505
+  32-bit paged protected mode, 32-bit write-to-ring-3   :      57 922 ins/sec  100% / 58 340         304% / 176 464
+  32-bit pae protected mode, 32-bit read                :      57 478 ins/sec  119% / 68 453        1141% / 656 159
+  32-bit pae protected mode, 32-bit write               :      57 226 ins/sec  118% / 68 097        1157% / 662 504
+  32-bit pae protected mode, 32-bit read-to-ring-3      :      57 582 ins/sec   94% / 54 651         268% / 154 867
+  32-bit pae protected mode, 32-bit write-to-ring-3     :      57 697 ins/sec  100% / 57 750         299% / 173 030
+  64-bit long mode, 32-bit read                         :      57 128 ins/sec  118% / 67 779        1071% / 611 949
+  64-bit long mode, 32-bit write                        :      57 127 ins/sec  118% / 67 632        1084% / 619 395
+  64-bit long mode, 32-bit read-to-ring-3               :      57 181 ins/sec   94% / 54 123         265% / 151 937
+  64-bit long mode, 32-bit write-to-ring-3              :      57 297 ins/sec   99% / 57 286         294% / 168 694
+  16-bit unpaged protected mode, 32-bit read            :      58 827 ins/sec  118% / 69 545        1185% / 697 602
+  16-bit unpaged protected mode, 32-bit write           :      58 678 ins/sec  118% / 69 442        1183% / 694 387
+  16-bit unpaged protected mode, 32-bit read-to-ring-3  :      57 841 ins/sec   96% / 55 730         275% / 159 163
+  16-bit unpaged protected mode, 32-bit write-to-ring-3 :      57 855 ins/sec  101% / 58 834         304% / 176 169
+  32-bit unpaged protected mode, 32-bit read            :      58 063 ins/sec  120% / 69 690        1233% / 716 444
+  32-bit unpaged protected mode, 32-bit write           :      57 936 ins/sec  120% / 69 633        1199% / 694 753
+  32-bit unpaged protected mode, 32-bit read-to-ring-3  :      58 451 ins/sec   96% / 56 183         273% / 159 972
+  32-bit unpaged protected mode, 32-bit write-to-ring-3 :      58 962 ins/sec   99% / 58 955         298% / 175 936
+  real mode, 32-bit read                                :      58 571 ins/sec  118% / 69 478        1160% / 679 917
+  real mode, 32-bit write                               :      58 418 ins/sec  118% / 69 320        1185% / 692 513
+  real mode, 32-bit read-to-ring-3                      :      58 072 ins/sec   96% / 55 751         274% / 159 145
+  real mode, 32-bit write-to-ring-3                     :      57 870 ins/sec  101% / 58 755         307% / 178 042
+NOP MMIO Access                                         : PASSED
+SUCCESS
+ * @endverbatim
+ *
+ * What we see here is:
+ *
+ *  - The WinHv API approach is 10 to 12 times slower for exits we can
+ *    handle directly in ring-0 in the VBox AMD-V code.
+ *
+ *  - The WinHv API approach is 2 to 3 times slower for exits we have to
+ *    go to ring-3 to handle with the VBox AMD-V code.
+ *
+ *  - By using hypercalls and VID.SYS from ring-0 we gain between
+ *    13% and 20% over the WinHv API on exits handled in ring-0.
+ *
+ *  - Exits requiring ring-3 handling are between 6% slower and 3% faster
+ *    than the WinHv API.
+ *
+ *
+ * As a side note, it looks like Hyper-V doesn't let the guest read CR4 but
+ * triggers exits all the time. This isn't all that important these days since
+ * OSes like Linux cache the CR4 value specifically to avoid these kinds of exits.
+ *
+ */
+
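A note on reading the percentage columns in the benchmark table above: the figures look consistent with each rate being divided by the WinHv API baseline rate of the same row and the fractional part truncated. That is an inference from the numbers, not something the changeset states. A minimal sketch under that assumption follows; `percentOfBaseline` is a hypothetical helper, not a function in NEMR3Native-win.cpp.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the changeset): expresses a measured rate
// as a whole percentage of the baseline rate for the same test row.
// Assumption: the fraction is truncated, which matches the rows checked below.
constexpr std::uint64_t percentOfBaseline(std::uint64_t insPerSec,
                                          std::uint64_t baselineInsPerSec)
{
    // Guard against a zero baseline; otherwise plain integer division.
    return baselineInsPerSec == 0 ? 0 : insPerSec * 100 / baselineInsPerSec;
}

// Spot checks against the "32-bit paged protected mode" rows of the table.
static_assert(percentOfBaseline(123602, 108874) == 113,
              "CPUID, Hypercalls + VID vs WinHv API");
static_assert(percentOfBaseline(1305113, 108874) == 1198,
              "CPUID, VirtualBox AMD-V vs WinHv API");
static_assert(percentOfBaseline(93473299, 99581834) == 93,
              "RDTSC, VirtualBox AMD-V vs WinHv API");
```

Read this way, 1198% for CPUID means the AMD-V path retires roughly 12 times as many instructions per second as the WinHv API path, which is where the "10 to 12 times slower" summary comes from.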