VirtualBox

source: vbox/trunk/src/VBox/VMM/VMMR0/GMMR0.cpp@ 43361

Last change on this file since 43361 was 43235, checked in by vboxsync, 12 years ago

GMMR0.cpp: Fixed bug in GMMR0CleanupVM/gmmR0CleanupVMScanChunk affecting bound mode on all 32-bit hosts + 64-bit darwin. The problem was caused by unnecessary scanning of chunks bound to other VMs and accidentally relinking them into the set of the VM about to die. Once the GVM structure was finally fried, almost all pChunk->pSet members would point to the dead VM's GVM::gmm.s.Private member.

Also fixed a missing redo-from-start when someone else freed a chunk while we were scanning the list. This is expected to occur only rarely, but it should be reproducible when many VMs are doing cleanups at the same time in unbound mode.

  • Property svn:eol-style set to native
  • Property svn:keywords set to Id Revision
File size: 187.1 KB
1/* $Id: GMMR0.cpp 43235 2012-09-06 23:53:40Z vboxsync $ */
2/** @file
3 * GMM - Global Memory Manager.
4 */
5
6/*
7 * Copyright (C) 2007-2012 Oracle Corporation
8 *
9 * This file is part of VirtualBox Open Source Edition (OSE), as
10 * available from http://www.virtualbox.org. This file is free software;
11 * you can redistribute it and/or modify it under the terms of the GNU
12 * General Public License (GPL) as published by the Free Software
13 * Foundation, in version 2 as it comes in the "COPYING" file of the
14 * VirtualBox OSE distribution. VirtualBox OSE is distributed in the
15 * hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
16 */
17
18
19/** @page pg_gmm GMM - The Global Memory Manager
20 *
21 * As the name indicates, this component is responsible for global memory
22 * management. Currently only guest RAM is allocated from the GMM, but this
23 * may change to include shadow page tables and other bits later.
24 *
25 * Guest RAM is managed as individual pages, but allocated from the host OS
26 * in chunks for reasons of portability / efficiency. To minimize the memory
27 * footprint all tracking structures must be as small as possible without
28 * unnecessary performance penalties.
29 *
30 * The allocation chunks have a fixed size, defined at compile time
31 * by the #GMM_CHUNK_SIZE \#define.
32 *
33 * Each chunk is given a unique ID. Each page also has a unique ID. The
34 * relationship between the two IDs is:
35 * @code
36 * GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
37 * idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
38 * @endcode
39 * Where iPage is the index of the page within the chunk. This ID scheme
40 * permits efficient chunk and page lookup, but it relies on the chunk size
41 * to be set at compile time. The chunks are organized in an AVL tree with their
42 * IDs being the keys.
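 *
 * To illustrate (a sketch derived from the formula above, not code taken
 * from elsewhere in this file), the reverse decomposition of a page ID is:
 * @code
 * idChunk = idPage >> GMM_CHUNK_SHIFT;
 * iPage   = idPage & ((GMM_CHUNK_SIZE / PAGE_SIZE) - 1);
 * @endcode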
43 *
44 * The physical address of each page in an allocation chunk is maintained by
45 * the #RTR0MEMOBJ and obtained using #RTR0MemObjGetPagePhysAddr. There is no
46 * need to duplicate this information (it would cost 8 bytes per page if we did).
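 *
 * For example (an illustrative sketch using the IPRT API named above):
 * @code
 * RTHCPHYS HCPhys = RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);
 * @endcode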
47 *
48 * So what do we need to track per page? Most importantly we need to know
49 * which state the page is in:
50 * - Private - Allocated for (eventually) backing one particular VM page.
51 * - Shared - Readonly page that is used by one or more VMs and treated
52 * as COW by PGM.
53 * - Free - Not used by anyone.
54 *
55 * For the page replacement operations (sharing, defragmenting and freeing)
56 * to be somewhat efficient, private pages need to be associated with a
57 * particular page in a particular VM.
58 *
59 * Tracking the usage of shared pages is impractical and expensive, so we'll
60 * settle for a reference counting system instead.
61 *
62 * Free pages will be chained on LIFOs
63 *
64 * On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit
65 * systems a 32-bit bitfield will have to suffice because of address space
66 * limitations. The #GMMPAGE structure shows the details.
67 *
68 *
69 * @section sec_gmm_alloc_strat Page Allocation Strategy
70 *
71 * The strategy for allocating pages has to take fragmentation and shared
72 * pages into account, or we may end up with 2000 chunks with only
73 * a few pages in each. Shared pages cannot easily be reallocated because
74 * of the inaccurate usage accounting (see above). Private pages can be
75 * reallocated by a defragmentation thread in the same manner that sharing
76 * is done.
77 *
78 * The first approach is to manage the free pages in two sets depending on
79 * whether they are mainly for the allocation of shared or private pages.
80 * In the initial implementation there will be almost no possibility for
81 * mixing shared and private pages in the same chunk (only if we're really
82 * stressed on memory), but when we implement forking of VMs and have to
83 * deal with lots of COW pages it'll start getting kind of interesting.
84 *
85 * The sets are lists of chunks with approximately the same number of
86 * free pages. Say the chunk size is 1MB, meaning 256 pages, and a set
87 * consists of 16 lists. So, the first list will contain the chunks with
88 * 1-7 free pages, the second covers 8-15, and so on. The chunks will be
89 * moved between the lists as pages are freed up or allocated.
90 *
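 * As a hypothetical sketch (the divisor of 8 follows from the example ranges
 * above and is not necessarily the constant the code actually uses):
 * @code
 * iList = RT_MIN(pChunk->cFree / 8, cLists - 1);
 * @endcode
 *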
91 *
92 * @section sec_gmm_costs Costs
93 *
94 * The per page cost in kernel space is 32 bits plus whatever RTR0MEMOBJ
95 * entails. In addition there is the chunk cost of approximately
96 * (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page.
97 *
98 * On Windows the per page #RTR0MEMOBJ cost is 32 bits on 32-bit Windows
99 * and 64 bits on 64-bit Windows (a PFN_NUMBER in the MDL). So, 64 bits per page.
100 * The cost on Linux is identical, but here it's because of sizeof(struct page *).
101 *
102 *
103 * @section sec_gmm_legacy Legacy Mode for Non-Tier-1 Platforms
104 *
105 * In legacy mode the page source is locked user pages and not
106 * #RTR0MemObjAllocPhysNC, which means that a page can only be allocated
107 * by the VM that locked it. We will make no attempt at implementing
108 * page sharing on these systems, just do enough to make it all work.
109 *
110 *
111 * @subsection sub_gmm_locking Serializing
112 *
113 * One simple fast mutex will be employed in the initial implementation, not
114 * two as mentioned in @ref subsec_pgmPhys_Serializing.
115 *
116 * @see @ref subsec_pgmPhys_Serializing
117 *
118 *
119 * @section sec_gmm_overcommit Memory Over-Commitment Management
120 *
121 * The GVM will have to do the system wide memory over-commitment
122 * management. My current ideas are:
123 * - Per VM oc policy that indicates how much to initially commit
124 * to it and what to do in an out-of-memory situation.
125 * - Prevent overtaxing the host.
126 *
127 * There are some challenges here, the main ones are configurability and
128 * security. Should we for instance permit anyone to request 100% memory
129 * commitment? Who should be allowed to do runtime adjustments of the
130 * config? And how do we prevent these settings from being lost when the last
131 * VM process exits? The solution is probably to have an optional root
132 * daemon that will keep VMMR0.r0 in memory and enable the security measures.
133 *
134 *
135 *
136 * @section sec_gmm_numa NUMA
137 *
138 * NUMA considerations will be designed and implemented a bit later.
139 *
140 * The preliminary guess is that we will have to try to allocate memory as
141 * close as possible to the CPUs the VM is executed on (EMT and additional CPU
142 * threads). This means it's mostly about allocation and sharing policies.
143 * Both the scheduler and the allocator interface will have to supply some NUMA info
144 * and we'll need a way to calculate access costs.
145 *
146 */
147
148
149/*******************************************************************************
150* Header Files *
151*******************************************************************************/
152#define LOG_GROUP LOG_GROUP_GMM
153#include <VBox/rawpci.h>
154#include <VBox/vmm/vm.h>
155#include <VBox/vmm/gmm.h>
156#include "GMMR0Internal.h"
157#include <VBox/vmm/gvm.h>
158#include <VBox/vmm/pgm.h>
159#include <VBox/log.h>
160#include <VBox/param.h>
161#include <VBox/err.h>
162#include <iprt/asm.h>
163#include <iprt/avl.h>
164#ifdef VBOX_STRICT
165# include <iprt/crc.h>
166#endif
167#include <iprt/list.h>
168#include <iprt/mem.h>
169#include <iprt/memobj.h>
170#include <iprt/mp.h>
171#include <iprt/semaphore.h>
172#include <iprt/string.h>
173#include <iprt/time.h>
174
175
176/*******************************************************************************
177* Structures and Typedefs *
178*******************************************************************************/
179/** Pointer to set of free chunks. */
180typedef struct GMMCHUNKFREESET *PGMMCHUNKFREESET;
181
182/**
183 * The per-page tracking structure employed by the GMM.
184 *
185 * On 32-bit hosts some trickery is necessary to compress all
186 * the information into 32 bits. When the fSharedFree member is set,
187 * the 30th bit decides whether it's a free page or not.
188 *
189 * Because of the different layout on 32-bit and 64-bit hosts, macros
190 * are used to get and set some of the data.
191 */
192typedef union GMMPAGE
193{
194#if HC_ARCH_BITS == 64
195 /** Unsigned integer view. */
196 uint64_t u;
197
198 /** The common view. */
199 struct GMMPAGECOMMON
200 {
201 uint32_t uStuff1 : 32;
202 uint32_t uStuff2 : 30;
203 /** The page state. */
204 uint32_t u2State : 2;
205 } Common;
206
207 /** The view of a private page. */
208 struct GMMPAGEPRIVATE
209 {
210 /** The guest page frame number. (Max addressable: 2 ^ 44 - 16) */
211 uint32_t pfn;
212 /** The GVM handle. (64K VMs) */
213 uint32_t hGVM : 16;
214 /** Reserved. */
215 uint32_t u16Reserved : 14;
216 /** The page state. */
217 uint32_t u2State : 2;
218 } Private;
219
220 /** The view of a shared page. */
221 struct GMMPAGESHARED
222 {
223 /** The host page frame number. (Max addressable: 2 ^ 44 - 16) */
224 uint32_t pfn;
225 /** The reference count (64K VMs). */
226 uint32_t cRefs : 16;
227 /** Used for debug checksumming. */
228 uint32_t u14Checksum : 14;
229 /** The page state. */
230 uint32_t u2State : 2;
231 } Shared;
232
233 /** The view of a free page. */
234 struct GMMPAGEFREE
235 {
236 /** The index of the next page in the free list. UINT16_MAX is NIL. */
237 uint16_t iNext;
238 /** Reserved. Checksum or something? */
239 uint16_t u16Reserved0;
240 /** Reserved. Checksum or something? */
241 uint32_t u30Reserved1 : 30;
242 /** The page state. */
243 uint32_t u2State : 2;
244 } Free;
245
246#else /* 32-bit */
247 /** Unsigned integer view. */
248 uint32_t u;
249
250 /** The common view. */
251 struct GMMPAGECOMMON
252 {
253 uint32_t uStuff : 30;
254 /** The page state. */
255 uint32_t u2State : 2;
256 } Common;
257
258 /** The view of a private page. */
259 struct GMMPAGEPRIVATE
260 {
261 /** The guest page frame number. (Max addressable: 2 ^ 36) */
262 uint32_t pfn : 24;
263 /** The GVM handle. (127 VMs) */
264 uint32_t hGVM : 7;
265 /** The top page state bit, MBZ. */
266 uint32_t fZero : 1;
267 } Private;
268
269 /** The view of a shared page. */
270 struct GMMPAGESHARED
271 {
272 /** The reference count. */
273 uint32_t cRefs : 30;
274 /** The page state. */
275 uint32_t u2State : 2;
276 } Shared;
277
278 /** The view of a free page. */
279 struct GMMPAGEFREE
280 {
281 /** The index of the next page in the free list. UINT16_MAX is NIL. */
282 uint32_t iNext : 16;
283 /** Reserved. Checksum or something? */
284 uint32_t u14Reserved : 14;
285 /** The page state. */
286 uint32_t u2State : 2;
287 } Free;
288#endif
289} GMMPAGE;
290AssertCompileSize(GMMPAGE, sizeof(RTHCUINTPTR));
291/** Pointer to a GMMPAGE. */
292typedef GMMPAGE *PGMMPAGE;
293
294
295/** @name The Page States.
296 * @{ */
297/** A private page. */
298#define GMM_PAGE_STATE_PRIVATE 0
299/** A private page - alternative value used on the 32-bit implementation.
300 * This will never be used on 64-bit hosts. */
301#define GMM_PAGE_STATE_PRIVATE_32 1
302/** A shared page. */
303#define GMM_PAGE_STATE_SHARED 2
304/** A free page. */
305#define GMM_PAGE_STATE_FREE 3
306/** @} */
307
308
309/** @def GMM_PAGE_IS_PRIVATE
310 *
311 * @returns true if private, false if not.
312 * @param pPage The GMM page.
313 */
314#if HC_ARCH_BITS == 64
315# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_PRIVATE )
316#else
317# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Private.fZero == 0 )
318#endif
319
320/** @def GMM_PAGE_IS_SHARED
321 *
322 * @returns true if shared, false if not.
323 * @param pPage The GMM page.
324 */
325#define GMM_PAGE_IS_SHARED(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_SHARED )
326
327/** @def GMM_PAGE_IS_FREE
328 *
329 * @returns true if free, false if not.
330 * @param pPage The GMM page.
331 */
332#define GMM_PAGE_IS_FREE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_FREE )
333
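/*
 * Usage sketch (illustrative only, not part of the original file): classifying
 * a page entry with the state macros above.  The helper name is hypothetical.
 */
#if 0
static bool gmmR0ExamplePageIsOwnedBy(PGMMPAGE pPage, uint16_t hGVM)
{
    /* Only private pages are owned by a single VM; shared and free pages are not. */
    return GMM_PAGE_IS_PRIVATE(pPage)
        && pPage->Private.hGVM == hGVM;
}
#endif
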
334/** @def GMM_PAGE_PFN_LAST
335 * The last valid guest pfn range.
336 * @remark Some of the values outside the range have special meaning,
337 * see GMM_PAGE_PFN_UNSHAREABLE.
338 */
339#if HC_ARCH_BITS == 64
340# define GMM_PAGE_PFN_LAST UINT32_C(0xfffffff0)
341#else
342# define GMM_PAGE_PFN_LAST UINT32_C(0x00fffff0)
343#endif
344AssertCompile(GMM_PAGE_PFN_LAST == (GMM_GCPHYS_LAST >> PAGE_SHIFT));
345
346/** @def GMM_PAGE_PFN_UNSHAREABLE
347 * Indicates that this page isn't used for normal guest memory and thus isn't shareable.
348 */
349#if HC_ARCH_BITS == 64
350# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0xfffffff1)
351#else
352# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0x00fffff1)
353#endif
354AssertCompile(GMM_PAGE_PFN_UNSHAREABLE == (GMM_GCPHYS_UNSHAREABLE >> PAGE_SHIFT));
355
356
357/**
358 * A GMM allocation chunk ring-3 mapping record.
359 *
360 * This should really be associated with a session and not a VM, but
361 * it's simpler to associate it with a VM and clean up when the VM object
362 * is destroyed.
363 */
364typedef struct GMMCHUNKMAP
365{
366 /** The mapping object. */
367 RTR0MEMOBJ hMapObj;
368 /** The VM owning the mapping. */
369 PGVM pGVM;
370} GMMCHUNKMAP;
371/** Pointer to a GMM allocation chunk mapping. */
372typedef struct GMMCHUNKMAP *PGMMCHUNKMAP;
373
374
375/**
376 * A GMM allocation chunk.
377 */
378typedef struct GMMCHUNK
379{
380 /** The AVL node core.
381 * The Key is the chunk ID. (Giant mtx.) */
382 AVLU32NODECORE Core;
383 /** The memory object.
384 * Either from RTR0MemObjAllocPhysNC or RTR0MemObjLockUser depending on
385 * what the host can dish up. (Chunk mtx protects mapping accesses
386 * and related frees.) */
387 RTR0MEMOBJ hMemObj;
388 /** Pointer to the next chunk in the free list. (Giant mtx.) */
389 PGMMCHUNK pFreeNext;
390 /** Pointer to the previous chunk in the free list. (Giant mtx.) */
391 PGMMCHUNK pFreePrev;
392 /** Pointer to the free set this chunk belongs to. NULL for
393 * chunks with no free pages. (Giant mtx.) */
394 PGMMCHUNKFREESET pSet;
395 /** List node in the chunk list (GMM::ChunkList). (Giant mtx.) */
396 RTLISTNODE ListNode;
397 /** Pointer to an array of mappings. (Chunk mtx.) */
398 PGMMCHUNKMAP paMappingsX;
399 /** The number of mappings. (Chunk mtx.) */
400 uint16_t cMappingsX;
401 /** The mapping lock this chunk is using. UINT8_MAX if nobody is
402 * mapping or freeing anything. (Giant mtx.) */
403 uint8_t volatile iChunkMtx;
404 /** Flags field reserved for future use (like eliminating enmType).
405 * (Giant mtx.) */
406 uint8_t fFlags;
407 /** The head of the list of free pages. UINT16_MAX is the NIL value.
408 * (Giant mtx.) */
409 uint16_t iFreeHead;
410 /** The number of free pages. (Giant mtx.) */
411 uint16_t cFree;
412 /** The GVM handle of the VM that first allocated pages from this chunk; this
413 * is used as a preference when there are several chunks to choose from.
414 * When in bound memory mode this isn't a preference any longer. (Giant
415 * mtx.) */
416 uint16_t hGVM;
417 /** The ID of the NUMA node the memory mostly resides on. (Reserved for
418 * future use.) (Giant mtx.) */
419 uint16_t idNumaNode;
420 /** The number of private pages. (Giant mtx.) */
421 uint16_t cPrivate;
422 /** The number of shared pages. (Giant mtx.) */
423 uint16_t cShared;
424 /** The pages. (Giant mtx.) */
425 GMMPAGE aPages[GMM_CHUNK_SIZE >> PAGE_SHIFT];
426} GMMCHUNK;
427
428/** Indicates that the NUMA properties of the memory are unknown. */
429#define GMM_CHUNK_NUMA_ID_UNKNOWN UINT16_C(0xfffe)
430
431/** @name GMM_CHUNK_FLAGS_XXX - chunk flags.
432 * @{ */
433/** Indicates that the chunk is a large page (2MB). */
434#define GMM_CHUNK_FLAGS_LARGE_PAGE UINT16_C(0x0001)
435/** @} */
436
437
438/**
439 * An allocation chunk TLB entry.
440 */
441typedef struct GMMCHUNKTLBE
442{
443 /** The chunk id. */
444 uint32_t idChunk;
445 /** Pointer to the chunk. */
446 PGMMCHUNK pChunk;
447} GMMCHUNKTLBE;
448/** Pointer to an allocation chunk TLB entry. */
449typedef GMMCHUNKTLBE *PGMMCHUNKTLBE;
450
451
452/** The number of entries in the allocation chunk TLB. */
453#define GMM_CHUNKTLB_ENTRIES 32
454/** Gets the TLB entry index for the given Chunk ID. */
455#define GMM_CHUNKTLB_IDX(idChunk) ( (idChunk) & (GMM_CHUNKTLB_ENTRIES - 1) )
456
457/**
458 * An allocation chunk TLB.
459 */
460typedef struct GMMCHUNKTLB
461{
462 /** The TLB entries. */
463 GMMCHUNKTLBE aEntries[GMM_CHUNKTLB_ENTRIES];
464} GMMCHUNKTLB;
465/** Pointer to an allocation chunk TLB. */
466typedef GMMCHUNKTLB *PGMMCHUNKTLB;
467
468
469/**
470 * The GMM instance data.
471 */
472typedef struct GMM
473{
474 /** Magic / eye catcher. GMM_MAGIC */
475 uint32_t u32Magic;
476 /** The number of threads waiting on the mutex. */
477 uint32_t cMtxContenders;
478 /** The fast mutex protecting the GMM.
479 * More fine grained locking can be implemented later if necessary. */
480 RTSEMFASTMUTEX hMtx;
481#ifdef VBOX_STRICT
482 /** The current mutex owner. */
483 RTNATIVETHREAD hMtxOwner;
484#endif
485 /** The chunk tree. */
486 PAVLU32NODECORE pChunks;
487 /** The chunk TLB. */
488 GMMCHUNKTLB ChunkTLB;
489 /** The private free set. */
490 GMMCHUNKFREESET PrivateX;
491 /** The shared free set. */
492 GMMCHUNKFREESET Shared;
493
494 /** Shared module tree (global).
495 * @todo separate trees for distinctly different guest OSes. */
496 PAVLLU32NODECORE pGlobalSharedModuleTree;
497 /** Sharable modules (count of nodes in pGlobalSharedModuleTree). */
498 uint32_t cShareableModules;
499
500 /** The chunk list. For simplifying the cleanup process. */
501 RTLISTANCHOR ChunkList;
502
503 /** The maximum number of pages we're allowed to allocate.
504 * @gcfgm 64-bit GMM/MaxPages Direct.
505 * @gcfgm 32-bit GMM/PctPages Relative to the number of host pages. */
506 uint64_t cMaxPages;
507 /** The number of pages that have been reserved.
508 * The deal is that cReservedPages - cOverCommittedPages <= cMaxPages. */
509 uint64_t cReservedPages;
510 /** The number of pages that we have over-committed in reservations. */
511 uint64_t cOverCommittedPages;
512 /** The number of actually allocated (committed if you like) pages. */
513 uint64_t cAllocatedPages;
514 /** The number of pages that are shared. A subset of cAllocatedPages. */
515 uint64_t cSharedPages;
516 /** The number of pages that are actually shared between VMs. */
517 uint64_t cDuplicatePages;
518 /** The number of shared pages that have been left behind by
519 * VMs not doing proper cleanups. */
520 uint64_t cLeftBehindSharedPages;
521 /** The number of allocation chunks.
522 * (The number of pages we've allocated from the host can be derived from this.) */
523 uint32_t cChunks;
524 /** The number of current ballooned pages. */
525 uint64_t cBalloonedPages;
526
527 /** The legacy allocation mode indicator.
528 * This is determined at initialization time. */
529 bool fLegacyAllocationMode;
530 /** The bound memory mode indicator.
531 * When set, the memory will be bound to a specific VM and never
532 * shared. This is always set if fLegacyAllocationMode is set.
533 * (Also determined at initialization time.) */
534 bool fBoundMemoryMode;
535 /** The number of registered VMs. */
536 uint16_t cRegisteredVMs;
537
538 /** The number of freed chunks ever. This is used as a list generation to
539 * avoid restarting the cleanup scanning when the list wasn't modified. */
540 uint32_t cFreedChunks;
541 /** The previously allocated chunk ID.
542 * Used as a hint to avoid scanning the whole bitmap. */
543 uint32_t idChunkPrev;
544 /** Chunk ID allocation bitmap.
545 * Bits of allocated IDs are set, free ones are clear.
546 * The NIL id (0) is marked allocated. */
547 uint32_t bmChunkId[(GMM_CHUNKID_LAST + 1 + 31) / 32];
548
549 /** The index of the next mutex to use. */
550 uint32_t iNextChunkMtx;
551 /** Chunk locks for reducing lock contention without having to allocate
552 * one lock per chunk. */
553 struct
554 {
555 /** The mutex */
556 RTSEMFASTMUTEX hMtx;
557 /** The number of threads currently using this mutex. */
558 uint32_t volatile cUsers;
559 } aChunkMtx[64];
560} GMM;
561/** Pointer to the GMM instance. */
562typedef GMM *PGMM;
563
564/** The value of GMM::u32Magic (Katsuhiro Otomo). */
565#define GMM_MAGIC UINT32_C(0x19540414)
566
567
568/**
569 * GMM chunk mutex state.
570 *
571 * This is returned by gmmR0ChunkMutexAcquire and is used by the other
572 * gmmR0ChunkMutex* methods.
573 */
574typedef struct GMMR0CHUNKMTXSTATE
575{
576 PGMM pGMM;
577 /** The index of the chunk mutex. */
578 uint8_t iChunkMtx;
579 /** The relevant flags (GMMR0CHUNK_MTX_XXX). */
580 uint8_t fFlags;
581} GMMR0CHUNKMTXSTATE;
582/** Pointer to a chunk mutex state. */
583typedef GMMR0CHUNKMTXSTATE *PGMMR0CHUNKMTXSTATE;
584
585/** @name GMMR0CHUNK_MTX_XXX
586 * @{ */
587#define GMMR0CHUNK_MTX_INVALID UINT32_C(0)
588#define GMMR0CHUNK_MTX_KEEP_GIANT UINT32_C(1)
589#define GMMR0CHUNK_MTX_RETAKE_GIANT UINT32_C(2)
590#define GMMR0CHUNK_MTX_DROP_GIANT UINT32_C(3)
591#define GMMR0CHUNK_MTX_END UINT32_C(4)
592/** @} */
593
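/*
 * Usage sketch (illustrative only, not part of the original file): the typical
 * pattern for the GMMR0CHUNK_MTX_XXX flags, mirroring how the chunk mutex is
 * used by the cleanup code further down in this file.
 */
#if 0
static void gmmR0ExampleChunkMtxPattern(PGMM pGMM, PGMMCHUNK pChunk)
{
    GMMR0CHUNKMTXSTATE MtxState;
    gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
    /* ... access the chunk mapping members here ... */
    gmmR0ChunkMutexRelease(&MtxState, pChunk);
}
#endif
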
594
595/** The maximum number of shared modules per VM. */
596#define GMM_MAX_SHARED_PER_VM_MODULES 2048
597/** The maximum number of shared modules GMM is allowed to track. */
598#define GMM_MAX_SHARED_GLOBAL_MODULES 16834
599
600
601/**
602 * Argument packet for gmmR0SharedModuleCleanup.
603 */
604typedef struct GMMR0SHMODPERVMDTORARGS
605{
606 PGVM pGVM;
607 PGMM pGMM;
608} GMMR0SHMODPERVMDTORARGS;
609
610/**
611 * Argument packet for gmmR0CheckSharedModule.
612 */
613typedef struct GMMCHECKSHAREDMODULEINFO
614{
615 PGVM pGVM;
616 VMCPUID idCpu;
617} GMMCHECKSHAREDMODULEINFO;
618
619/**
620 * Argument packet for gmmR0FindDupPageInChunk by GMMR0FindDuplicatePage.
621 */
622typedef struct GMMFINDDUPPAGEINFO
623{
624 PGVM pGVM;
625 PGMM pGMM;
626 uint8_t *pSourcePage;
627 bool fFoundDuplicate;
628} GMMFINDDUPPAGEINFO;
629
630
631/*******************************************************************************
632* Global Variables *
633*******************************************************************************/
634/** Pointer to the GMM instance data. */
635static PGMM g_pGMM = NULL;
636
637/** Macro for obtaining and validating the g_pGMM pointer.
638 *
639 * On failure it will return from the invoking function with the specified
640 * return value.
641 *
642 * @param pGMM The name of the pGMM variable.
643 * @param rc The return value on failure. Use VERR_GMM_INSTANCE for VBox
644 * status codes.
645 */
646#define GMM_GET_VALID_INSTANCE(pGMM, rc) \
647 do { \
648 (pGMM) = g_pGMM; \
649 AssertPtrReturn((pGMM), (rc)); \
650 AssertMsgReturn((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic), (rc)); \
651 } while (0)
652
653/** Macro for obtaining and validating the g_pGMM pointer, void function
654 * variant.
655 *
656 * On failure it will return from the invoking function.
657 *
658 * @param pGMM The name of the pGMM variable.
659 */
660#define GMM_GET_VALID_INSTANCE_VOID(pGMM) \
661 do { \
662 (pGMM) = g_pGMM; \
663 AssertPtrReturnVoid((pGMM)); \
664 AssertMsgReturnVoid((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic)); \
665 } while (0)
666
667
668/** @def GMM_CHECK_SANITY_UPON_ENTERING
669 * Checks the sanity of the GMM instance data before making changes.
670 *
671 * This macro is a stub by default and must be enabled manually in the code.
672 *
673 * @returns true if sane, false if not.
674 * @param pGMM The name of the pGMM variable.
675 */
676#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
677# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
678#else
679# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (true)
680#endif
681
682/** @def GMM_CHECK_SANITY_UPON_LEAVING
683 * Checks the sanity of the GMM instance data after making changes.
684 *
685 * This macro is a stub by default and must be enabled manually in the code.
686 *
687 * @returns true if sane, false if not.
688 * @param pGMM The name of the pGMM variable.
689 */
690#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
691# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
692#else
693# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (true)
694#endif
695
696/** @def GMM_CHECK_SANITY_IN_LOOPS
697 * Checks the sanity of the GMM instance in the allocation loops.
698 *
699 * This macro is a stub by default and must be enabled manually in the code.
700 *
701 * @returns true if sane, false if not.
702 * @param pGMM The name of the pGMM variable.
703 */
704#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
705# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
706#else
707# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (true)
708#endif
709
710
711/*******************************************************************************
712* Internal Functions *
713*******************************************************************************/
714static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM);
715static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
716DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk);
717DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet);
718DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
719#ifdef GMMR0_WITH_SANITY_CHECK
720static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo);
721#endif
722static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem);
723DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
724DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
725static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
726#ifdef VBOX_WITH_PAGE_SHARING
727static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM);
728# ifdef VBOX_STRICT
729static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage);
730# endif
731#endif
732
733
734
735/**
736 * Initializes the GMM component.
737 *
738 * This is called when the VMMR0.r0 module is loaded and protected by the
739 * loader semaphore.
740 *
741 * @returns VBox status code.
742 */
743GMMR0DECL(int) GMMR0Init(void)
744{
745 LogFlow(("GMMInit:\n"));
746
747 /*
748 * Allocate the instance data and the locks.
749 */
750 PGMM pGMM = (PGMM)RTMemAllocZ(sizeof(*pGMM));
751 if (!pGMM)
752 return VERR_NO_MEMORY;
753
754 pGMM->u32Magic = GMM_MAGIC;
755 for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
756 pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
757 RTListInit(&pGMM->ChunkList);
758 ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
759
760 int rc = RTSemFastMutexCreate(&pGMM->hMtx);
761 if (RT_SUCCESS(rc))
762 {
763 unsigned iMtx;
764 for (iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
765 {
766 rc = RTSemFastMutexCreate(&pGMM->aChunkMtx[iMtx].hMtx);
767 if (RT_FAILURE(rc))
768 break;
769 }
770 if (RT_SUCCESS(rc))
771 {
772 /*
773 * Check and see if RTR0MemObjAllocPhysNC works.
774 */
775#if 0 /* later, see @bugref{3170}. */
776 RTR0MEMOBJ MemObj;
777 rc = RTR0MemObjAllocPhysNC(&MemObj, _64K, NIL_RTHCPHYS);
778 if (RT_SUCCESS(rc))
779 {
780 rc = RTR0MemObjFree(MemObj, true);
781 AssertRC(rc);
782 }
783 else if (rc == VERR_NOT_SUPPORTED)
784 pGMM->fLegacyAllocationMode = pGMM->fBoundMemoryMode = true;
785 else
786 SUPR0Printf("GMMR0Init: RTR0MemObjAllocPhysNC(,64K,Any) -> %d!\n", rc);
787#else
788# if defined(RT_OS_WINDOWS) || (defined(RT_OS_SOLARIS) && ARCH_BITS == 64) || defined(RT_OS_LINUX) || defined(RT_OS_FREEBSD)
789 pGMM->fLegacyAllocationMode = false;
790# if ARCH_BITS == 32
791 /* Don't reuse possibly partial chunks because of the virtual
792 address space limitation. */
793 pGMM->fBoundMemoryMode = true;
794# else
795 pGMM->fBoundMemoryMode = false;
796# endif
797# else
798 pGMM->fLegacyAllocationMode = true;
799 pGMM->fBoundMemoryMode = true;
800# endif
801#endif
802
803 /*
804 * Query system page count and guess a reasonable cMaxPages value.
805 */
806 pGMM->cMaxPages = UINT32_MAX; /** @todo IPRT function for query ram size and such. */
807
808 g_pGMM = pGMM;
809 LogFlow(("GMMInit: pGMM=%p fLegacyAllocationMode=%RTbool fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fLegacyAllocationMode, pGMM->fBoundMemoryMode));
810 return VINF_SUCCESS;
811 }
812
813 /*
814 * Bail out.
815 */
816 while (iMtx-- > 0)
817 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
818 RTSemFastMutexDestroy(pGMM->hMtx);
819 }
820
821 pGMM->u32Magic = 0;
822 RTMemFree(pGMM);
823 SUPR0Printf("GMMR0Init: failed! rc=%d\n", rc);
824 return rc;
825}
826
827
828/**
829 * Terminates the GMM component.
830 */
831GMMR0DECL(void) GMMR0Term(void)
832{
833 LogFlow(("GMMTerm:\n"));
834
835 /*
836 * Take care / be paranoid...
837 */
838 PGMM pGMM = g_pGMM;
839 if (!VALID_PTR(pGMM))
840 return;
841 if (pGMM->u32Magic != GMM_MAGIC)
842 {
843 SUPR0Printf("GMMR0Term: u32Magic=%#x\n", pGMM->u32Magic);
844 return;
845 }
846
847 /*
848 * Undo what init did and free all the resources we've acquired.
849 */
850 /* Destroy the fundamentals. */
851 g_pGMM = NULL;
852 pGMM->u32Magic = ~GMM_MAGIC;
853 RTSemFastMutexDestroy(pGMM->hMtx);
854 pGMM->hMtx = NIL_RTSEMFASTMUTEX;
855
856 /* Free any chunks still hanging around. */
857 RTAvlU32Destroy(&pGMM->pChunks, gmmR0TermDestroyChunk, pGMM);
858
859 /* Destroy the chunk locks. */
860 for (unsigned iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
861 {
862 Assert(pGMM->aChunkMtx[iMtx].cUsers == 0);
863 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
864 pGMM->aChunkMtx[iMtx].hMtx = NIL_RTSEMFASTMUTEX;
865 }
866
867 /* Finally the instance data itself. */
868 RTMemFree(pGMM);
869 LogFlow(("GMMTerm: done\n"));
870}
871
872
873/**
874 * RTAvlU32Destroy callback.
875 *
876 * @returns 0
877 * @param pNode The node to destroy.
878 * @param pvGMM The GMM handle.
879 */
880static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM)
881{
882 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
883
884 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
885 SUPR0Printf("GMMR0Term: %p/%#x: cFree=%d cPrivate=%d cShared=%d cMappings=%d\n", pChunk,
886 pChunk->Core.Key, pChunk->cFree, pChunk->cPrivate, pChunk->cShared, pChunk->cMappingsX);
887
888 int rc = RTR0MemObjFree(pChunk->hMemObj, true /* fFreeMappings */);
889 if (RT_FAILURE(rc))
890 {
891 SUPR0Printf("GMMR0Term: %p/%#x: RTRMemObjFree(%p,true) -> %d (cMappings=%d)\n", pChunk,
892 pChunk->Core.Key, pChunk->hMemObj, rc, pChunk->cMappingsX);
893 AssertRC(rc);
894 }
895 pChunk->hMemObj = NIL_RTR0MEMOBJ;
896
897 RTMemFree(pChunk->paMappingsX);
898 pChunk->paMappingsX = NULL;
899
900 RTMemFree(pChunk);
901 NOREF(pvGMM);
902 return 0;
903}
904
905
906/**
907 * Initializes the per-VM data for the GMM.
908 *
909 * This is called from within the GVMM lock (from GVMMR0CreateVM)
910 * and should only initialize the data members so GMMR0CleanupVM
911 * can deal with them. We reserve no memory or anything here;
912 * that's done later in GMMR0InitVM.
913 *
914 * @param pGVM Pointer to the Global VM structure.
915 */
916GMMR0DECL(void) GMMR0InitPerVMData(PGVM pGVM)
917{
918 AssertCompile(RT_SIZEOFMEMB(GVM,gmm.s) <= RT_SIZEOFMEMB(GVM,gmm.padding));
919
920 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
921 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
922 pGVM->gmm.s.Stats.fMayAllocate = false;
923}
924
925
926/**
927 * Acquires the GMM giant lock.
928 *
929 * @returns Assert status code from RTSemFastMutexRequest.
930 * @param pGMM Pointer to the GMM instance.
931 */
932static int gmmR0MutexAcquire(PGMM pGMM)
933{
934 ASMAtomicIncU32(&pGMM->cMtxContenders);
935 int rc = RTSemFastMutexRequest(pGMM->hMtx);
936 ASMAtomicDecU32(&pGMM->cMtxContenders);
937 AssertRC(rc);
938#ifdef VBOX_STRICT
939 pGMM->hMtxOwner = RTThreadNativeSelf();
940#endif
941 return rc;
942}
943
944
945/**
946 * Releases the GMM giant lock.
947 *
948 * @returns Assert status code from RTSemFastMutexRelease.
949 * @param pGMM Pointer to the GMM instance.
950 */
951static int gmmR0MutexRelease(PGMM pGMM)
952{
953#ifdef VBOX_STRICT
954 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
955#endif
956 int rc = RTSemFastMutexRelease(pGMM->hMtx);
957 AssertRC(rc);
958 return rc;
959}
960
961
962/**
963 * Yields the GMM giant lock if there is contention and a certain minimum time
964 * has elapsed since we took it.
965 *
966 * @returns @c true if the mutex was yielded, @c false if not.
967 * @param pGMM Pointer to the GMM instance.
968 * @param puLockNanoTS Where the lock acquisition time stamp is kept
969 * (in/out).
970 */
971static bool gmmR0MutexYield(PGMM pGMM, uint64_t *puLockNanoTS)
972{
973 /*
974 * If nobody is contending the mutex, don't bother checking the time.
975 */
976 if (ASMAtomicReadU32(&pGMM->cMtxContenders) == 0)
977 return false;
978
979 /*
980 * Don't yield if we haven't executed for at least 2 milliseconds.
981 */
982 uint64_t uNanoNow = RTTimeSystemNanoTS();
983 if (uNanoNow - *puLockNanoTS < UINT32_C(2000000))
984 return false;
985
986 /*
987 * Yield the mutex.
988 */
989#ifdef VBOX_STRICT
990 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
991#endif
992 ASMAtomicIncU32(&pGMM->cMtxContenders);
993 int rc1 = RTSemFastMutexRelease(pGMM->hMtx); AssertRC(rc1);
994
995 RTThreadYield();
996
997 int rc2 = RTSemFastMutexRequest(pGMM->hMtx); AssertRC(rc2);
998 *puLockNanoTS = RTTimeSystemNanoTS();
999 ASMAtomicDecU32(&pGMM->cMtxContenders);
1000#ifdef VBOX_STRICT
1001 pGMM->hMtxOwner = RTThreadNativeSelf();
1002#endif
1003
1004 return true;
1005}
1006
1007
1008/**
1009 * Acquires a chunk lock.
1010 *
1011 * The caller must own the giant lock.
1012 *
1013 * @returns Assert status code from RTSemFastMutexRequest.
1014 * @param pMtxState The chunk mutex state info. (Avoids
1015 * passing the same flags and stuff around
1016 * for subsequent release and drop-giant
1017 * calls.)
1018 * @param pGMM Pointer to the GMM instance.
1019 * @param pChunk Pointer to the chunk.
1020 * @param fFlags Flags regarding the giant lock, GMMR0CHUNK_MTX_XXX.
1021 */
1022static int gmmR0ChunkMutexAcquire(PGMMR0CHUNKMTXSTATE pMtxState, PGMM pGMM, PGMMCHUNK pChunk, uint32_t fFlags)
1023{
1024 Assert(fFlags > GMMR0CHUNK_MTX_INVALID && fFlags < GMMR0CHUNK_MTX_END);
1025 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1026
1027 pMtxState->pGMM = pGMM;
1028 pMtxState->fFlags = (uint8_t)fFlags;
1029
1030 /*
1031 * Get the lock index and reference the lock.
1032 */
1033 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1034 uint32_t iChunkMtx = pChunk->iChunkMtx;
1035 if (iChunkMtx == UINT8_MAX)
1036 {
1037 iChunkMtx = pGMM->iNextChunkMtx++;
1038 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1039
1040 /* Try get an unused one... */
1041 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1042 {
1043 iChunkMtx = pGMM->iNextChunkMtx++;
1044 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1045 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1046 {
1047 iChunkMtx = pGMM->iNextChunkMtx++;
1048 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1049 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1050 {
1051 iChunkMtx = pGMM->iNextChunkMtx++;
1052 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1053 }
1054 }
1055 }
1056
1057 pChunk->iChunkMtx = iChunkMtx;
1058 }
1059 AssertCompile(RT_ELEMENTS(pGMM->aChunkMtx) < UINT8_MAX);
1060 pMtxState->iChunkMtx = (uint8_t)iChunkMtx;
1061 ASMAtomicIncU32(&pGMM->aChunkMtx[iChunkMtx].cUsers);
1062
1063 /*
1064 * Drop the giant?
1065 */
1066 if (fFlags != GMMR0CHUNK_MTX_KEEP_GIANT)
1067 {
1068 /** @todo GMM life cycle cleanup (we may race someone
1069 * destroying and cleaning up GMM)? */
1070 gmmR0MutexRelease(pGMM);
1071 }
1072
1073 /*
1074 * Take the chunk mutex.
1075 */
1076 int rc = RTSemFastMutexRequest(pGMM->aChunkMtx[iChunkMtx].hMtx);
1077 AssertRC(rc);
1078 return rc;
1079}
1080
1081
1082/**
1083 * Releases the chunk mutex acquired by gmmR0ChunkMutexAcquire, retaking
1084 * the giant GMM lock if that was requested.
1085 *
1086 * @returns Assert status code from RTSemFastMutexRelease.
1087 * @param pMtxState The chunk mutex state.
1088 * @param pChunk Pointer to the chunk if it's still
1089 * alive, NULL if it isn't. This is used to deassociate
1090 * the chunk from the mutex on the way out so a new
1091 * one can be selected next time, thus avoiding contended mutexes.
1092 */
1093static int gmmR0ChunkMutexRelease(PGMMR0CHUNKMTXSTATE pMtxState, PGMMCHUNK pChunk)
1094{
1095 PGMM pGMM = pMtxState->pGMM;
1096
1097 /*
1098 * Release the chunk mutex and reacquire the giant if requested.
1099 */
1100 int rc = RTSemFastMutexRelease(pGMM->aChunkMtx[pMtxState->iChunkMtx].hMtx);
1101 AssertRC(rc);
1102 if (pMtxState->fFlags == GMMR0CHUNK_MTX_RETAKE_GIANT)
1103 rc = gmmR0MutexAcquire(pGMM);
1104 else
1105 Assert((pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT) == (pGMM->hMtxOwner == RTThreadNativeSelf()));
1106
1107 /*
1108 * Drop the chunk mutex user reference and deassociate it from the chunk
1109 * when possible.
1110 */
1111 if ( ASMAtomicDecU32(&pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers) == 0
1112 && pChunk
1113 && RT_SUCCESS(rc) )
1114 {
1115 if (pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT)
1116 pChunk->iChunkMtx = UINT8_MAX;
1117 else
1118 {
1119 rc = gmmR0MutexAcquire(pGMM);
1120 if (RT_SUCCESS(rc))
1121 {
1122 if (pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers == 0)
1123 pChunk->iChunkMtx = UINT8_MAX;
1124 rc = gmmR0MutexRelease(pGMM);
1125 }
1126 }
1127 }
1128
1129 pMtxState->pGMM = NULL;
1130 return rc;
1131}
1132
1133
1134/**
1135 * Drops the giant GMM lock we kept in gmmR0ChunkMutexAcquire while keeping the
1136 * chunk locked.
1137 *
1138 * This only works if gmmR0ChunkMutexAcquire was called with
1139 * GMMR0CHUNK_MTX_KEEP_GIANT. gmmR0ChunkMutexRelease will retake the giant
1140 * mutex, i.e. behave as if GMMR0CHUNK_MTX_RETAKE_GIANT was used.
1141 *
1142 * @returns VBox status code (assuming success is ok).
1143 * @param pMtxState Pointer to the chunk mutex state.
1144 */
1145static int gmmR0ChunkMutexDropGiant(PGMMR0CHUNKMTXSTATE pMtxState)
1146{
1147 AssertReturn(pMtxState->fFlags == GMMR0CHUNK_MTX_KEEP_GIANT, VERR_GMM_MTX_FLAGS);
1148 Assert(pMtxState->pGMM->hMtxOwner == RTThreadNativeSelf());
1149 pMtxState->fFlags = GMMR0CHUNK_MTX_RETAKE_GIANT;
1150 /** @todo GMM life cycle cleanup (we may race someone
1151 * destroying and cleaning up GMM)? */
1152 return gmmR0MutexRelease(pMtxState->pGMM);
1153}
1154
1155
1156/**
1157 * For experimenting with NUMA affinity and such.
1158 *
1159 * @returns The current NUMA Node ID.
1160 */
1161static uint16_t gmmR0GetCurrentNumaNodeId(void)
1162{
1163#if 1
1164 return GMM_CHUNK_NUMA_ID_UNKNOWN;
1165#else
1166 return RTMpCpuId() / 16;
1167#endif
1168}
1169
1170
1171
1172/**
1173 * Cleans up when a VM is terminating.
1174 *
1175 * @param pGVM Pointer to the Global VM structure.
1176 */
1177GMMR0DECL(void) GMMR0CleanupVM(PGVM pGVM)
1178{
1179 LogFlow(("GMMR0CleanupVM: pGVM=%p:{.pVM=%p, .hSelf=%#x}\n", pGVM, pGVM->pVM, pGVM->hSelf));
1180
1181 PGMM pGMM;
1182 GMM_GET_VALID_INSTANCE_VOID(pGMM);
1183
1184#ifdef VBOX_WITH_PAGE_SHARING
1185 /*
1186 * Clean up all registered shared modules first.
1187 */
1188 gmmR0SharedModuleCleanup(pGMM, pGVM);
1189#endif
1190
1191 gmmR0MutexAcquire(pGMM);
1192 uint64_t uLockNanoTS = RTTimeSystemNanoTS();
1193 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
1194
1195 /*
1196 * The policy is 'INVALID' until the initial reservation
1197 * request has been serviced.
1198 */
1199 if ( pGVM->gmm.s.Stats.enmPolicy > GMMOCPOLICY_INVALID
1200 && pGVM->gmm.s.Stats.enmPolicy < GMMOCPOLICY_END)
1201 {
1202 /*
1203 * If it's the last VM around, we can skip walking all the chunks looking
1204 * for the pages owned by this VM and instead flush the whole shebang.
1205 *
1206 * This takes care of the eventuality that a VM has left shared page
1207 * references behind (shouldn't happen of course, but you never know).
1208 */
1209 Assert(pGMM->cRegisteredVMs);
1210 pGMM->cRegisteredVMs--;
1211
1212 /*
1213 * Walk the entire pool looking for pages that belong to this VM
1214 * and leftover mappings. (This'll only catch private pages,
1215 * shared pages will be 'left behind'.)
1216 */
1217 /** @todo r=bird: This scanning+freeing could be optimized in bound mode! */
1218 uint64_t cPrivatePages = pGVM->gmm.s.Stats.cPrivatePages; /* save */
1219
1220 unsigned iCountDown = 64;
1221 bool fRedoFromStart;
1222 PGMMCHUNK pChunk;
1223 do
1224 {
1225 fRedoFromStart = false;
1226 RTListForEachReverse(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
1227 {
1228 uint32_t const cFreeChunksOld = pGMM->cFreedChunks;
1229 if ( ( !pGMM->fBoundMemoryMode
1230 || pChunk->hGVM == pGVM->hSelf)
1231 && gmmR0CleanupVMScanChunk(pGMM, pGVM, pChunk))
1232 {
1233 /* We left the giant mutex, so reset the yield counters. */
1234 uLockNanoTS = RTTimeSystemNanoTS();
1235 iCountDown = 64;
1236 }
1237 else
1238 {
1239 /* Didn't leave it, so do normal yielding. */
1240 if (!iCountDown)
1241 gmmR0MutexYield(pGMM, &uLockNanoTS);
1242 else
1243 iCountDown--;
1244 }
1245 if (pGMM->cFreedChunks != cFreeChunksOld)
1246 {
1247 fRedoFromStart = true;
1248 break;
1249 }
1250 }
1251 } while (fRedoFromStart);
1252
1253 if (pGVM->gmm.s.Stats.cPrivatePages)
1254 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x has %#x private pages that cannot be found!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cPrivatePages);
1255
1256 pGMM->cAllocatedPages -= cPrivatePages;
1257
1258 /*
1259 * Free empty chunks.
1260 */
1261 PGMMCHUNKFREESET pPrivateSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
1262 do
1263 {
1264 fRedoFromStart = false;
1265 iCountDown = 10240;
1266 pChunk = pPrivateSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
1267 while (pChunk)
1268 {
1269 PGMMCHUNK pNext = pChunk->pFreeNext;
1270 Assert(pChunk->cFree == GMM_CHUNK_NUM_PAGES);
1271 if ( !pGMM->fBoundMemoryMode
1272 || pChunk->hGVM == pGVM->hSelf)
1273 {
1274 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1275 if (gmmR0FreeChunk(pGMM, pGVM, pChunk, true /*fRelaxedSem*/))
1276 {
1277 /* We've left the giant mutex, restart? (+1 for our unlink) */
1278 fRedoFromStart = pPrivateSet->idGeneration != idGenerationOld + 1;
1279 if (fRedoFromStart)
1280 break;
1281 uLockNanoTS = RTTimeSystemNanoTS();
1282 iCountDown = 10240;
1283 }
1284 }
1285
1286 /* Advance and maybe yield the lock. */
1287 pChunk = pNext;
1288 if (--iCountDown == 0)
1289 {
1290 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1291 fRedoFromStart = gmmR0MutexYield(pGMM, &uLockNanoTS)
1292 && pPrivateSet->idGeneration != idGenerationOld;
1293 if (fRedoFromStart)
1294 break;
1295 iCountDown = 10240;
1296 }
1297 }
1298 } while (fRedoFromStart);
1299
1300 /*
1301 * Account for shared pages that weren't freed.
1302 */
1303 if (pGVM->gmm.s.Stats.cSharedPages)
1304 {
1305 Assert(pGMM->cSharedPages >= pGVM->gmm.s.Stats.cSharedPages);
1306 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x left %#x shared pages behind!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cSharedPages);
1307 pGMM->cLeftBehindSharedPages += pGVM->gmm.s.Stats.cSharedPages;
1308 }
1309
1310 /*
1311 * Clean up balloon statistics in case the VM process crashed.
1312 */
1313 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
1314 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
1315
1316 /*
1317 * Update the over-commitment management statistics.
1318 */
1319 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1320 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1321 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1322 switch (pGVM->gmm.s.Stats.enmPolicy)
1323 {
1324 case GMMOCPOLICY_NO_OC:
1325 break;
1326 default:
1327 /** @todo Update GMM->cOverCommittedPages */
1328 break;
1329 }
1330 }
1331
1332 /* zap the GVM data. */
1333 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
1334 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
1335 pGVM->gmm.s.Stats.fMayAllocate = false;
1336
1337 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1338 gmmR0MutexRelease(pGMM);
1339
1340 LogFlow(("GMMR0CleanupVM: returns\n"));
1341}
1342
1343
1344/**
1345 * Scan one chunk for private pages belonging to the specified VM.
1346 *
1347 * @note This function may drop the giant mutex!
1348 *
1349 * @returns @c true if we've temporarily dropped the giant mutex, @c false if
1350 * we didn't.
1351 * @param pGMM Pointer to the GMM instance.
1352 * @param pGVM The global VM handle.
1353 * @param pChunk The chunk to scan.
1354 */
1355static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1356{
1357 Assert(!pGMM->fBoundMemoryMode || pChunk->hGVM == pGVM->hSelf);
1358
1359 /*
1360 * Look for pages belonging to the VM.
1361 * (Perform some internal checks while we're scanning.)
1362 */
1363#ifndef VBOX_STRICT
1364 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
1365#endif
1366 {
1367 unsigned cPrivate = 0;
1368 unsigned cShared = 0;
1369 unsigned cFree = 0;
1370
1371 gmmR0UnlinkChunk(pChunk); /* avoiding cFreePages updates. */
1372
1373 uint16_t hGVM = pGVM->hSelf;
1374 unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
1375 while (iPage-- > 0)
1376 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
1377 {
1378 if (pChunk->aPages[iPage].Private.hGVM == hGVM)
1379 {
1380 /*
1381 * Free the page.
1382 *
1383 * The reason for not using gmmR0FreePrivatePage here is that we
1384 * must *not* cause the chunk to be freed from under us - we're in
1385 * an AVL tree walk here.
1386 */
1387 pChunk->aPages[iPage].u = 0;
1388 pChunk->aPages[iPage].Free.iNext = pChunk->iFreeHead;
1389 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1390 pChunk->iFreeHead = iPage;
1391 pChunk->cPrivate--;
1392 pChunk->cFree++;
1393 pGVM->gmm.s.Stats.cPrivatePages--;
1394 cFree++;
1395 }
1396 else
1397 cPrivate++;
1398 }
1399 else if (GMM_PAGE_IS_FREE(&pChunk->aPages[iPage]))
1400 cFree++;
1401 else
1402 cShared++;
1403
1404 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1405
1406 /*
1407 * Did it add up?
1408 */
1409 if (RT_UNLIKELY( pChunk->cFree != cFree
1410 || pChunk->cPrivate != cPrivate
1411 || pChunk->cShared != cShared))
1412 {
1413 SUPR0Printf("gmmR0CleanupVMScanChunk: Chunk %p/%#x has bogus stats - free=%d/%d private=%d/%d shared=%d/%d\n",
1414 pChunk, pChunk->Core.Key, pChunk->cFree, cFree, pChunk->cPrivate, cPrivate, pChunk->cShared, cShared);
1415 pChunk->cFree = cFree;
1416 pChunk->cPrivate = cPrivate;
1417 pChunk->cShared = cShared;
1418 }
1419 }
1420
1421 /*
1422 * If not in bound memory mode, we should reset the hGVM field
1423 * if it has our handle in it.
1424 */
1425 if (pChunk->hGVM == pGVM->hSelf)
1426 {
1427 if (!g_pGMM->fBoundMemoryMode)
1428 pChunk->hGVM = NIL_GVM_HANDLE;
1429 else if (pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1430 {
1431 SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: cFree=%#x - it should be 0 in bound mode!\n",
1432 pChunk, pChunk->Core.Key, pChunk->cFree);
1433 AssertMsgFailed(("%p/%#x: cFree=%#x - it should be 0 in bound mode!\n", pChunk, pChunk->Core.Key, pChunk->cFree));
1434
1435 gmmR0UnlinkChunk(pChunk);
1436 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1437 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1438 }
1439 }
1440
1441 /*
1442 * Look for a mapping belonging to the terminating VM.
1443 */
1444 GMMR0CHUNKMTXSTATE MtxState;
1445 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
1446 unsigned cMappings = pChunk->cMappingsX;
1447 for (unsigned i = 0; i < cMappings; i++)
1448 if (pChunk->paMappingsX[i].pGVM == pGVM)
1449 {
1450 gmmR0ChunkMutexDropGiant(&MtxState);
1451
1452 RTR0MEMOBJ hMemObj = pChunk->paMappingsX[i].hMapObj;
1453
1454 cMappings--;
1455 if (i < cMappings)
1456 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
1457 pChunk->paMappingsX[cMappings].pGVM = NULL;
1458 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
1459 Assert(pChunk->cMappingsX - 1U == cMappings);
1460 pChunk->cMappingsX = cMappings;
1461
1462 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings (NA) */);
1463 if (RT_FAILURE(rc))
1464 {
1465 SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: mapping #%x: RTRMemObjFree(%p,false) -> %d \n",
1466 pChunk, pChunk->Core.Key, i, hMemObj, rc);
1467 AssertRC(rc);
1468 }
1469
1470 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1471 return true;
1472 }
1473
1474 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1475 return false;
1476}
1477
1478
1479/**
1480 * The initial resource reservations.
1481 *
1482 * This will make memory reservations according to policy and priority. If there aren't
1483 * sufficient resources available to sustain the VM this function will fail and all
1484 * future allocation requests will fail as well.
1485 *
1486 * These are just the initial reservations made very very early during the VM creation
1487 * process and will be adjusted later in the GMMR0UpdateReservation call after the
1488 * ring-3 init has completed.
1489 *
1490 * @returns VBox status code.
1491 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1492 * @retval VERR_GMM_
1493 *
1494 * @param pVM Pointer to the VM.
1495 * @param idCpu The VCPU id.
1496 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1497 * This does not include MMIO2 and similar.
1498 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1499 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1500 * hyper heap, MMIO2 and similar.
1501 * @param enmPolicy The OC policy to use on this VM.
1502 * @param enmPriority The priority in an out-of-memory situation.
1503 *
1504 * @thread The creator thread / EMT.
1505 */
1506GMMR0DECL(int) GMMR0InitialReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages,
1507 GMMOCPOLICY enmPolicy, GMMPRIORITY enmPriority)
1508{
1509 LogFlow(("GMMR0InitialReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x enmPolicy=%d enmPriority=%d\n",
1510 pVM, cBasePages, cShadowPages, cFixedPages, enmPolicy, enmPriority));
1511
1512 /*
1513 * Validate, get basics and take the semaphore.
1514 */
1515 PGMM pGMM;
1516 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1517 PGVM pGVM;
1518 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1519 if (RT_FAILURE(rc))
1520 return rc;
1521
1522 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1523 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1524 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1525 AssertReturn(enmPolicy > GMMOCPOLICY_INVALID && enmPolicy < GMMOCPOLICY_END, VERR_INVALID_PARAMETER);
1526 AssertReturn(enmPriority > GMMPRIORITY_INVALID && enmPriority < GMMPRIORITY_END, VERR_INVALID_PARAMETER);
1527
1528 gmmR0MutexAcquire(pGMM);
1529 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1530 {
1531 if ( !pGVM->gmm.s.Stats.Reserved.cBasePages
1532 && !pGVM->gmm.s.Stats.Reserved.cFixedPages
1533 && !pGVM->gmm.s.Stats.Reserved.cShadowPages)
1534 {
1535 /*
1536 * Check if we can accommodate this.
1537 */
1538 /* ... later ... */
1539 if (RT_SUCCESS(rc))
1540 {
1541 /*
1542 * Update the records.
1543 */
1544 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1545 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1546 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1547 pGVM->gmm.s.Stats.enmPolicy = enmPolicy;
1548 pGVM->gmm.s.Stats.enmPriority = enmPriority;
1549 pGVM->gmm.s.Stats.fMayAllocate = true;
1550
1551 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1552 pGMM->cRegisteredVMs++;
1553 }
1554 }
1555 else
1556 rc = VERR_WRONG_ORDER;
1557 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1558 }
1559 else
1560 rc = VERR_GMM_IS_NOT_SANE;
1561 gmmR0MutexRelease(pGMM);
1562 LogFlow(("GMMR0InitialReservation: returns %Rrc\n", rc));
1563 return rc;
1564}
1565
1566
1567/**
1568 * VMMR0 request wrapper for GMMR0InitialReservation.
1569 *
1570 * @returns see GMMR0InitialReservation.
1571 * @param pVM Pointer to the VM.
1572 * @param idCpu The VCPU id.
1573 * @param pReq Pointer to the request packet.
1574 */
1575GMMR0DECL(int) GMMR0InitialReservationReq(PVM pVM, VMCPUID idCpu, PGMMINITIALRESERVATIONREQ pReq)
1576{
1577 /*
1578 * Validate input and pass it on.
1579 */
1580 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1581 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1582 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1583
1584 return GMMR0InitialReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages, pReq->enmPolicy, pReq->enmPriority);
1585}
1586
1587
1588/**
1589 * This updates the memory reservation with the additional MMIO2 and ROM pages.
1590 *
1591 * @returns VBox status code.
1592 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1593 *
1594 * @param pVM Pointer to the VM.
1595 * @param idCpu The VCPU id.
1596 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1597 * This does not include MMIO2 and similar.
1598 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1599 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1600 * hyper heap, MMIO2 and similar.
1601 *
1602 * @thread EMT.
1603 */
1604GMMR0DECL(int) GMMR0UpdateReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages)
1605{
1606 LogFlow(("GMMR0UpdateReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x\n",
1607 pVM, cBasePages, cShadowPages, cFixedPages));
1608
1609 /*
1610 * Validate, get basics and take the semaphore.
1611 */
1612 PGMM pGMM;
1613 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1614 PGVM pGVM;
1615 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1616 if (RT_FAILURE(rc))
1617 return rc;
1618
1619 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1620 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1621 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1622
1623 gmmR0MutexAcquire(pGMM);
1624 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1625 {
1626 if ( pGVM->gmm.s.Stats.Reserved.cBasePages
1627 && pGVM->gmm.s.Stats.Reserved.cFixedPages
1628 && pGVM->gmm.s.Stats.Reserved.cShadowPages)
1629 {
1630 /*
1631 * Check if we can accommodate this.
1632 */
1633 /* ... later ... */
1634 if (RT_SUCCESS(rc))
1635 {
1636 /*
1637 * Update the records.
1638 */
1639 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1640 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1641 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1642 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1643
1644 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1645 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1646 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1647 }
1648 }
1649 else
1650 rc = VERR_WRONG_ORDER;
1651 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1652 }
1653 else
1654 rc = VERR_GMM_IS_NOT_SANE;
1655 gmmR0MutexRelease(pGMM);
1656 LogFlow(("GMMR0UpdateReservation: returns %Rrc\n", rc));
1657 return rc;
1658}
1659
1660
1661/**
1662 * VMMR0 request wrapper for GMMR0UpdateReservation.
1663 *
1664 * @returns see GMMR0UpdateReservation.
1665 * @param pVM Pointer to the VM.
1666 * @param idCpu The VCPU id.
1667 * @param pReq Pointer to the request packet.
1668 */
1669GMMR0DECL(int) GMMR0UpdateReservationReq(PVM pVM, VMCPUID idCpu, PGMMUPDATERESERVATIONREQ pReq)
1670{
1671 /*
1672 * Validate input and pass it on.
1673 */
1674 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1675 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1676 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1677
1678 return GMMR0UpdateReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages);
1679}
1680
1681#ifdef GMMR0_WITH_SANITY_CHECK
1682
1683/**
1684 * Performs sanity checks on a free set.
1685 *
1686 * @returns Error count.
1687 *
1688 * @param pGMM Pointer to the GMM instance.
1689 * @param pSet Pointer to the set.
1690 * @param pszSetName The set name.
1691 * @param pszFunction The function from which it was called.
1692 * @param uLineNo The line number.
1693 */
1694static uint32_t gmmR0SanityCheckSet(PGMM pGMM, PGMMCHUNKFREESET pSet, const char *pszSetName,
1695 const char *pszFunction, unsigned uLineNo)
1696{
1697 uint32_t cErrors = 0;
1698
1699 /*
1700 * Count the free pages in all the chunks and match it against pSet->cFreePages.
1701 */
1702 uint32_t cPages = 0;
1703 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1704 {
1705 for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1706 {
1707 /** @todo check that the chunk is hashed into the right set. */
1708 cPages += pCur->cFree;
1709 }
1710 }
1711 if (RT_UNLIKELY(cPages != pSet->cFreePages))
1712 {
1713 SUPR0Printf("GMM insanity: found %#x pages in the %s set, expected %#x. (%s, line %u)\n",
1714 cPages, pszSetName, pSet->cFreePages, pszFunction, uLineNo);
1715 cErrors++;
1716 }
1717
1718 return cErrors;
1719}
1720
1721
1722/**
1723 * Performs some sanity checks on the GMM while owning lock.
1724 *
1725 * @returns Error count.
1726 *
1727 * @param pGMM Pointer to the GMM instance.
1728 * @param pszFunction The function from which it is called.
1729 * @param uLineNo The line number.
1730 */
1731static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo)
1732{
1733 uint32_t cErrors = 0;
1734
1735 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->PrivateX, "private", pszFunction, uLineNo);
1736 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Shared, "shared", pszFunction, uLineNo);
1737 /** @todo add more sanity checks. */
1738
1739 return cErrors;
1740}
1741
1742#endif /* GMMR0_WITH_SANITY_CHECK */
1743
1744/**
1745 * Looks up a chunk in the tree and fills in the TLB entry for it.
1746 *
1747 * This is not expected to fail and will bitch if it does.
1748 *
1749 * @returns Pointer to the allocation chunk, NULL if not found.
1750 * @param pGMM Pointer to the GMM instance.
1751 * @param idChunk The ID of the chunk to find.
1752 * @param pTlbe Pointer to the TLB entry.
1753 */
1754static PGMMCHUNK gmmR0GetChunkSlow(PGMM pGMM, uint32_t idChunk, PGMMCHUNKTLBE pTlbe)
1755{
1756 PGMMCHUNK pChunk = (PGMMCHUNK)RTAvlU32Get(&pGMM->pChunks, idChunk);
1757 AssertMsgReturn(pChunk, ("Chunk %#x not found!\n", idChunk), NULL);
1758 pTlbe->idChunk = idChunk;
1759 pTlbe->pChunk = pChunk;
1760 return pChunk;
1761}
1762
1763
1764/**
1765 * Finds an allocation chunk.
1766 *
1767 * This is not expected to fail and will bitch if it does.
1768 *
1769 * @returns Pointer to the allocation chunk, NULL if not found.
1770 * @param pGMM Pointer to the GMM instance.
1771 * @param idChunk The ID of the chunk to find.
1772 */
1773DECLINLINE(PGMMCHUNK) gmmR0GetChunk(PGMM pGMM, uint32_t idChunk)
1774{
1775 /*
1776 * Do a TLB lookup, branch if not in the TLB.
1777 */
1778 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
1779 if ( pTlbe->idChunk != idChunk
1780 || !pTlbe->pChunk)
1781 return gmmR0GetChunkSlow(pGMM, idChunk, pTlbe);
1782 return pTlbe->pChunk;
1783}
1784
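/*
 * Note (descriptive comment only): the chunk TLB consulted above is a small cache in
 * front of the AVL tree.  On a hit the chunk pointer comes straight from the TLB entry;
 * on a miss or a stale entry gmmR0GetChunkSlow falls back to RTAvlU32Get and refreshes
 * the entry, so repeated lookups of the same chunk stay cheap.
 */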
1785
1786/**
1787 * Finds a page.
1788 *
1789 * This is not expected to fail and will bitch if it does.
1790 *
1791 * @returns Pointer to the page, NULL if not found.
1792 * @param pGMM Pointer to the GMM instance.
1793 * @param idPage The ID of the page to find.
1794 */
1795DECLINLINE(PGMMPAGE) gmmR0GetPage(PGMM pGMM, uint32_t idPage)
1796{
1797 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1798 if (RT_LIKELY(pChunk))
1799 return &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
1800 return NULL;
1801}
1802
1803
1804/**
1805 * Gets the host physical address for a page given by its ID.
1806 *
1807 * @returns The host physical address or NIL_RTHCPHYS.
1808 * @param pGMM Pointer to the GMM instance.
1809 * @param idPage The ID of the page to find.
1810 */
1811DECLINLINE(RTHCPHYS) gmmR0GetPageHCPhys(PGMM pGMM, uint32_t idPage)
1812{
1813 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1814 if (RT_LIKELY(pChunk))
1815 return RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, idPage & GMM_PAGEID_IDX_MASK);
1816 return NIL_RTHCPHYS;
1817}
1818
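/*
 * Illustrative sketch (comment only, not part of the build): how a page ID splits into
 * the chunk ID used for the tree/TLB lookup and the index into pChunk->aPages, as done
 * by gmmR0GetPage and gmmR0GetPageHCPhys above.  The concrete numbers assume 2MB chunks
 * of 4KB pages, i.e. 512 pages per chunk and a 9-bit shift, and are illustration only.
 *
 * @code
 *      uint32_t const idPage  = 0x12345;
 *      uint32_t const idChunk = idPage >> GMM_CHUNKID_SHIFT;   // 0x12345 >> 9    = 0x91
 *      uint32_t const iPage   = idPage & GMM_PAGEID_IDX_MASK;  // 0x12345 & 0x1ff = 0x145
 * @endcode
 */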
1819
1820/**
1821 * Selects the appropriate free list given the number of free pages.
1822 *
1823 * @returns Free list index.
1824 * @param cFree The number of free pages in the chunk.
1825 */
1826DECLINLINE(unsigned) gmmR0SelectFreeSetList(unsigned cFree)
1827{
1828 unsigned iList = cFree >> GMM_CHUNK_FREE_SET_SHIFT;
1829 AssertMsg(iList < RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists) / RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists[0]),
1830 ("%d (%u)\n", iList, cFree));
1831 return iList;
1832}
1833
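/*
 * Worked example (comment only) for the list selection above: a chunk lands in bucket
 * cFree >> GMM_CHUNK_FREE_SET_SHIFT, so chunks with a similar number of free pages share
 * a list and completely unused chunks end up in the last list (see
 * GMM_CHUNK_FREE_SET_UNUSED_LIST used further down).  The shift value of 4 is purely
 * illustrative.
 *
 * @code
 *      // assuming GMM_CHUNK_FREE_SET_SHIFT == 4:
 *      //   cFree =   7  ->  iList = 0
 *      //   cFree =  37  ->  iList = 2
 *      //   cFree = 511  ->  iList = 31
 * @endcode
 */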
1834
1835/**
1836 * Unlinks the chunk from the free list it's currently on (if any).
1837 *
1838 * @param pChunk The allocation chunk.
1839 */
1840DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk)
1841{
1842 PGMMCHUNKFREESET pSet = pChunk->pSet;
1843 if (RT_LIKELY(pSet))
1844 {
1845 pSet->cFreePages -= pChunk->cFree;
1846 pSet->idGeneration++;
1847
1848 PGMMCHUNK pPrev = pChunk->pFreePrev;
1849 PGMMCHUNK pNext = pChunk->pFreeNext;
1850 if (pPrev)
1851 pPrev->pFreeNext = pNext;
1852 else
1853 pSet->apLists[gmmR0SelectFreeSetList(pChunk->cFree)] = pNext;
1854 if (pNext)
1855 pNext->pFreePrev = pPrev;
1856
1857 pChunk->pSet = NULL;
1858 pChunk->pFreeNext = NULL;
1859 pChunk->pFreePrev = NULL;
1860 }
1861 else
1862 {
1863 Assert(!pChunk->pFreeNext);
1864 Assert(!pChunk->pFreePrev);
1865 Assert(!pChunk->cFree);
1866 }
1867}
1868
1869
1870/**
1871 * Links the chunk onto the appropriate free list in the specified free set.
1872 *
1873 * If no free entries, it's not linked into any list.
1874 *
1875 * @param pChunk The allocation chunk.
1876 * @param pSet The free set.
1877 */
1878DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet)
1879{
1880 Assert(!pChunk->pSet);
1881 Assert(!pChunk->pFreeNext);
1882 Assert(!pChunk->pFreePrev);
1883
1884 if (pChunk->cFree > 0)
1885 {
1886 pChunk->pSet = pSet;
1887 pChunk->pFreePrev = NULL;
1888 unsigned const iList = gmmR0SelectFreeSetList(pChunk->cFree);
1889 pChunk->pFreeNext = pSet->apLists[iList];
1890 if (pChunk->pFreeNext)
1891 pChunk->pFreeNext->pFreePrev = pChunk;
1892 pSet->apLists[iList] = pChunk;
1893
1894 pSet->cFreePages += pChunk->cFree;
1895 pSet->idGeneration++;
1896 }
1897}
1898
1899
1900/**
1901 * Selects the appropriate free set for the chunk and links it onto a free list there:
1902 * the per-VM private set in bound mode, otherwise the global shared or private set
1903 * depending on whether the chunk holds shared pages.  If the chunk has no free
1904 * entries, it's not linked into any list.
1905 * @param pChunk The allocation chunk.
1906 */
1907DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1908{
1909 PGMMCHUNKFREESET pSet;
1910 if (pGMM->fBoundMemoryMode)
1911 pSet = &pGVM->gmm.s.Private;
1912 else if (pChunk->cShared)
1913 pSet = &pGMM->Shared;
1914 else
1915 pSet = &pGMM->PrivateX;
1916 gmmR0LinkChunk(pChunk, pSet);
1917}
1918
1919
1920/**
1921 * Frees a Chunk ID.
1922 *
1923 * @param pGMM Pointer to the GMM instance.
1924 * @param idChunk The Chunk ID to free.
1925 */
1926static void gmmR0FreeChunkId(PGMM pGMM, uint32_t idChunk)
1927{
1928 AssertReturnVoid(idChunk != NIL_GMM_CHUNKID);
1929 AssertMsg(ASMBitTest(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk));
1930 ASMAtomicBitClear(&pGMM->bmChunkId[0], idChunk);
1931}
1932
1933
1934/**
1935 * Allocates a new Chunk ID.
1936 *
1937 * @returns The Chunk ID.
1938 * @param pGMM Pointer to the GMM instance.
1939 */
1940static uint32_t gmmR0AllocateChunkId(PGMM pGMM)
1941{
1942 AssertCompile(!((GMM_CHUNKID_LAST + 1) & 31)); /* must be a multiple of 32 */
1943 AssertCompile(NIL_GMM_CHUNKID == 0);
1944
1945 /*
1946 * Try the next sequential one.
1947 */
1948 int32_t idChunk = ++pGMM->idChunkPrev;
1949#if 0 /** @todo enable this code */
1950 if ( idChunk <= GMM_CHUNKID_LAST
1951 && idChunk > NIL_GMM_CHUNKID
1952 && !ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk))
1953 return idChunk;
1954#endif
1955
1956 /*
1957 * Scan sequentially from the last one.
1958 */
1959 if ( (uint32_t)idChunk < GMM_CHUNKID_LAST
1960 && idChunk > NIL_GMM_CHUNKID)
1961 {
1962 idChunk = ASMBitNextClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1, idChunk);
1963 if (idChunk > NIL_GMM_CHUNKID)
1964 {
1965 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1966 return pGMM->idChunkPrev = idChunk;
1967 }
1968 }
1969
1970 /*
1971 * Ok, scan from the start.
1972 * We're not racing anyone, so there is no need to expect failures or have restart loops.
1973 */
1974 idChunk = ASMBitFirstClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1);
1975 AssertMsgReturn(idChunk > NIL_GMM_CHUNKID, ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1976 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1977
1978 return pGMM->idChunkPrev = idChunk;
1979}
1980
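/*
 * Minimal standalone model (comment only, not part of the build) of the bitmap based ID
 * allocation above: find the first clear bit, set it and hand out its index.  This is a
 * simplified, non-atomic stand-in for what ASMBitFirstClear / ASMBitNextClear and
 * ASMAtomicBitTestAndSet accomplish here; the names and sizes below are hypothetical.
 *
 * @code
 *      #include <stdint.h>
 *
 *      #define MY_ID_LAST 1023                          // hypothetical; (MY_ID_LAST + 1) is a multiple of 32
 *      static uint32_t g_bmIds[(MY_ID_LAST + 1) / 32];  // one bit per ID, clear = free, ID 0 = NIL
 *
 *      static int32_t myAllocId(void)
 *      {
 *          for (uint32_t i = 1; i <= MY_ID_LAST; i++)
 *              if (!(g_bmIds[i / 32] & (UINT32_C(1) << (i % 32))))
 *              {
 *                  g_bmIds[i / 32] |= UINT32_C(1) << (i % 32);
 *                  return (int32_t)i;
 *              }
 *          return -1;                                   // out of IDs
 *      }
 * @endcode
 */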
1981
1982/**
1983 * Allocates one private page.
1984 *
1985 * Worker for gmmR0AllocatePages.
1986 *
1987 * @param pChunk The chunk to allocate it from.
1988 * @param hGVM The GVM handle of the VM requesting memory.
1989 * @param pPageDesc The page descriptor.
1990 */
1991static void gmmR0AllocatePage(PGMMCHUNK pChunk, uint32_t hGVM, PGMMPAGEDESC pPageDesc)
1992{
1993 /* update the chunk stats. */
1994 if (pChunk->hGVM == NIL_GVM_HANDLE)
1995 pChunk->hGVM = hGVM;
1996 Assert(pChunk->cFree);
1997 pChunk->cFree--;
1998 pChunk->cPrivate++;
1999
2000 /* unlink the first free page. */
2001 const uint32_t iPage = pChunk->iFreeHead;
2002 AssertReleaseMsg(iPage < RT_ELEMENTS(pChunk->aPages), ("%d\n", iPage));
2003 PGMMPAGE pPage = &pChunk->aPages[iPage];
2004 Assert(GMM_PAGE_IS_FREE(pPage));
2005 pChunk->iFreeHead = pPage->Free.iNext;
2006 Log3(("A pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x iNext=%#x\n",
2007 pPage, iPage, (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage,
2008 pPage->Common.u2State, pChunk->iFreeHead, pPage->Free.iNext));
2009
2010 /* make the page private. */
2011 pPage->u = 0;
2012 AssertCompile(GMM_PAGE_STATE_PRIVATE == 0);
2013 pPage->Private.hGVM = hGVM;
2014 AssertCompile(NIL_RTHCPHYS >= GMM_GCPHYS_LAST);
2015 AssertCompile(GMM_GCPHYS_UNSHAREABLE >= GMM_GCPHYS_LAST);
2016 if (pPageDesc->HCPhysGCPhys <= GMM_GCPHYS_LAST)
2017 pPage->Private.pfn = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
2018 else
2019 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
2020
2021 /* update the page descriptor. */
2022 pPageDesc->HCPhysGCPhys = RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);
2023 Assert(pPageDesc->HCPhysGCPhys != NIL_RTHCPHYS);
2024 pPageDesc->idPage = (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage;
2025 pPageDesc->idSharedPage = NIL_GMM_PAGEID;
2026}
2027
2028
2029/**
2030 * Picks the free pages from a chunk.
2031 *
2032 * @returns The new page descriptor table index.
2033 * @param pChunk The chunk to pick pages from.
2034 * @param hGVM The global VM handle of the VM requesting memory.
2035 * @param iPage The current page descriptor table index.
2036 * @param cPages The total number of pages to allocate.
2037 * @param paPages The page descriptor table (input + output).
2038 * See GMMPAGEDESC for details on what is expected on input.
2039 */
2040static uint32_t gmmR0AllocatePagesFromChunk(PGMMCHUNK pChunk, uint16_t const hGVM, uint32_t iPage, uint32_t cPages,
2041 PGMMPAGEDESC paPages)
2042{
2043 PGMMCHUNKFREESET pSet = pChunk->pSet; Assert(pSet);
2044 gmmR0UnlinkChunk(pChunk);
2045
2046 for (; pChunk->cFree && iPage < cPages; iPage++)
2047 gmmR0AllocatePage(pChunk, hGVM, &paPages[iPage]);
2048
2049 gmmR0LinkChunk(pChunk, pSet);
2050 return iPage;
2051}
2052
2053
2054/**
2055 * Registers a new chunk of memory.
2056 *
2057 * This is called by both gmmR0AllocateOneChunk and GMMR0SeedChunk.
2058 *
2059 * @returns VBox status code. On success, the giant GMM lock will be held, the
2060 * caller must release it (ugly).
2061 * @param pGMM Pointer to the GMM instance.
2062 * @param pSet Pointer to the set.
2063 * @param MemObj The memory object for the chunk.
2064 * @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2065 * affinity.
2066 * @param fChunkFlags The chunk flags, GMM_CHUNK_FLAGS_XXX.
2067 * @param ppChunk Chunk address (out). Optional.
2068 *
2069 * @remarks The caller must not own the giant GMM mutex.
2070 * The giant GMM mutex will be acquired and returned acquired in
2071 * the success path. On failure, no locks will be held.
2072 */
2073static int gmmR0RegisterChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, RTR0MEMOBJ MemObj, uint16_t hGVM, uint16_t fChunkFlags,
2074 PGMMCHUNK *ppChunk)
2075{
2076 Assert(pGMM->hMtxOwner != RTThreadNativeSelf());
2077 Assert(hGVM != NIL_GVM_HANDLE || pGMM->fBoundMemoryMode);
2078 Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE);
2079
2080 int rc;
2081 PGMMCHUNK pChunk = (PGMMCHUNK)RTMemAllocZ(sizeof(*pChunk));
2082 if (pChunk)
2083 {
2084 /*
2085 * Initialize it.
2086 */
2087 pChunk->hMemObj = MemObj;
2088 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
2089 pChunk->hGVM = hGVM;
2090 /*pChunk->iFreeHead = 0;*/
2091 pChunk->idNumaNode = gmmR0GetCurrentNumaNodeId();
2092 pChunk->iChunkMtx = UINT8_MAX;
2093 pChunk->fFlags = fChunkFlags;
2094 for (unsigned iPage = 0; iPage < RT_ELEMENTS(pChunk->aPages) - 1; iPage++)
2095 {
2096 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
2097 pChunk->aPages[iPage].Free.iNext = iPage + 1;
2098 }
2099 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.u2State = GMM_PAGE_STATE_FREE;
2100 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.iNext = UINT16_MAX;
2101
2102 /*
2103 * Allocate a Chunk ID and insert it into the tree.
2104 * This has to be done behind the mutex of course.
2105 */
2106 rc = gmmR0MutexAcquire(pGMM);
2107 if (RT_SUCCESS(rc))
2108 {
2109 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2110 {
2111 pChunk->Core.Key = gmmR0AllocateChunkId(pGMM);
2112 if ( pChunk->Core.Key != NIL_GMM_CHUNKID
2113 && pChunk->Core.Key <= GMM_CHUNKID_LAST
2114 && RTAvlU32Insert(&pGMM->pChunks, &pChunk->Core))
2115 {
2116 pGMM->cChunks++;
2117 RTListAppend(&pGMM->ChunkList, &pChunk->ListNode);
2118 gmmR0LinkChunk(pChunk, pSet);
2119 LogFlow(("gmmR0RegisterChunk: pChunk=%p id=%#x cChunks=%d\n", pChunk, pChunk->Core.Key, pGMM->cChunks));
2120
2121 if (ppChunk)
2122 *ppChunk = pChunk;
2123 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2124 return VINF_SUCCESS;
2125 }
2126
2127 /* bail out */
2128 rc = VERR_GMM_CHUNK_INSERT;
2129 }
2130 else
2131 rc = VERR_GMM_IS_NOT_SANE;
2132 gmmR0MutexRelease(pGMM);
2133 }
2134
2135 RTMemFree(pChunk);
2136 }
2137 else
2138 rc = VERR_NO_MEMORY;
2139 return rc;
2140}
2141
2142
2143/**
2144 * Allocates a new chunk, immediately picks the requested pages from it, and
2145 * adds what's remaining to the specified free set.
2146 *
2147 * @note This will leave the giant mutex while allocating the new chunk!
2148 *
2149 * @returns VBox status code.
2150 * @param pGMM Pointer to the GMM instance data.
2151 * @param pGVM Pointer to the kernel-only VM instance data.
2152 * @param pSet Pointer to the free set.
2153 * @param cPages The number of pages requested.
2154 * @param paPages The page descriptor table (input + output).
2155 * @param piPage The pointer to the page descriptor table index
2156 * variable. This will be updated.
2157 */
2158static int gmmR0AllocateChunkNew(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet, uint32_t cPages,
2159 PGMMPAGEDESC paPages, uint32_t *piPage)
2160{
2161 gmmR0MutexRelease(pGMM);
2162
2163 RTR0MEMOBJ hMemObj;
2164 int rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2165 if (RT_SUCCESS(rc))
2166 {
2167/** @todo Duplicate gmmR0RegisterChunk here so we can avoid chaining up the
2168 * free pages first and then unchaining them right afterwards. Instead
2169 * do as much work as possible without holding the giant lock. */
2170 PGMMCHUNK pChunk;
2171 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, 0 /*fChunkFlags*/, &pChunk);
2172 if (RT_SUCCESS(rc))
2173 {
2174 *piPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, *piPage, cPages, paPages);
2175 return VINF_SUCCESS;
2176 }
2177
2178 /* bail out */
2179 RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
2180 }
2181
2182 int rc2 = gmmR0MutexAcquire(pGMM);
2183 AssertRCReturn(rc2, RT_FAILURE(rc) ? rc : rc2);
2184 return rc;
2185
2186}
2187
2188
2189/**
2190 * As a last resort we'll pick any page we can get.
2191 *
2192 * @returns The new page descriptor table index.
2193 * @param pSet The set to pick from.
2194 * @param pGVM Pointer to the global VM structure.
2195 * @param iPage The current page descriptor table index.
2196 * @param cPages The total number of pages to allocate.
2197 * @param paPages The page descriptor table (input + output).
2198 */
2199static uint32_t gmmR0AllocatePagesIndiscriminately(PGMMCHUNKFREESET pSet, PGVM pGVM,
2200 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2201{
2202 unsigned iList = RT_ELEMENTS(pSet->apLists);
2203 while (iList-- > 0)
2204 {
2205 PGMMCHUNK pChunk = pSet->apLists[iList];
2206 while (pChunk)
2207 {
2208 PGMMCHUNK pNext = pChunk->pFreeNext;
2209
2210 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2211 if (iPage >= cPages)
2212 return iPage;
2213
2214 pChunk = pNext;
2215 }
2216 }
2217 return iPage;
2218}
2219
2220
2221/**
2222 * Pick pages from empty chunks on the same NUMA node.
2223 *
2224 * @returns The new page descriptor table index.
2225 * @param pSet The set to pick from.
2226 * @param pGVM Pointer to the global VM structure.
2227 * @param iPage The current page descriptor table index.
2228 * @param cPages The total number of pages to allocate.
2229 * @param paPages The page descriptor table (input + output).
2230 */
2231static uint32_t gmmR0AllocatePagesFromEmptyChunksOnSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2232 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2233{
2234 PGMMCHUNK pChunk = pSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
2235 if (pChunk)
2236 {
2237 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2238 while (pChunk)
2239 {
2240 PGMMCHUNK pNext = pChunk->pFreeNext;
2241
2242 if (pChunk->idNumaNode == idNumaNode)
2243 {
2244 pChunk->hGVM = pGVM->hSelf;
2245 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2246 if (iPage >= cPages)
2247 {
2248 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2249 return iPage;
2250 }
2251 }
2252
2253 pChunk = pNext;
2254 }
2255 }
2256 return iPage;
2257}
2258
2259
2260/**
2261 * Pick pages from non-empty chunks on the same NUMA node.
2262 *
2263 * @returns The new page descriptor table index.
2264 * @param pSet The set to pick from.
2265 * @param pGVM Pointer to the global VM structure.
2266 * @param iPage The current page descriptor table index.
2267 * @param cPages The total number of pages to allocate.
2268 * @param paPages The page descriptor table (input + output).
2269 */
2270static uint32_t gmmR0AllocatePagesFromSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2271 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2272{
2273 /** @todo start by picking from chunks with about the right size first? */
2274 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2275 unsigned iList = GMM_CHUNK_FREE_SET_UNUSED_LIST;
2276 while (iList-- > 0)
2277 {
2278 PGMMCHUNK pChunk = pSet->apLists[iList];
2279 while (pChunk)
2280 {
2281 PGMMCHUNK pNext = pChunk->pFreeNext;
2282
2283 if (pChunk->idNumaNode == idNumaNode)
2284 {
2285 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2286 if (iPage >= cPages)
2287 {
2288 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2289 return iPage;
2290 }
2291 }
2292
2293 pChunk = pNext;
2294 }
2295 }
2296 return iPage;
2297}
2298
2299
2300/**
2301 * Pick pages that are in chunks already associated with the VM.
2302 *
2303 * @returns The new page descriptor table index.
2304 * @param pGMM Pointer to the GMM instance data.
2305 * @param pGVM Pointer to the global VM structure.
2306 * @param pSet The set to pick from.
2307 * @param iPage The current page descriptor table index.
2308 * @param cPages The total number of pages to allocate.
2309 * @param paPages The page descriptor table (input + output).
2310 */
2311static uint32_t gmmR0AllocatePagesAssociatedWithVM(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet,
2312 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2313{
2314 uint16_t const hGVM = pGVM->hSelf;
2315
2316 /* Hint. */
2317 if (pGVM->gmm.s.idLastChunkHint != NIL_GMM_CHUNKID)
2318 {
2319 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pGVM->gmm.s.idLastChunkHint);
2320 if (pChunk && pChunk->cFree)
2321 {
2322 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2323 if (iPage >= cPages)
2324 return iPage;
2325 }
2326 }
2327
2328 /* Scan. */
2329 for (unsigned iList = 0; iList < RT_ELEMENTS(pSet->apLists); iList++)
2330 {
2331 PGMMCHUNK pChunk = pSet->apLists[iList];
2332 while (pChunk)
2333 {
2334 PGMMCHUNK pNext = pChunk->pFreeNext;
2335
2336 if (pChunk->hGVM == hGVM)
2337 {
2338 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2339 if (iPage >= cPages)
2340 {
2341 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2342 return iPage;
2343 }
2344 }
2345
2346 pChunk = pNext;
2347 }
2348 }
2349 return iPage;
2350}
2351
2352
2353
2354/**
2355 * Pick pages in bound memory mode.
2356 *
2357 * @returns The new page descriptor table index.
2358 * @param pGVM Pointer to the global VM structure.
2359 * @param iPage The current page descriptor table index.
2360 * @param cPages The total number of pages to allocate.
2361 * @param paPages The page descriptor table (input + output).
2362 */
2363static uint32_t gmmR0AllocatePagesInBoundMode(PGVM pGVM, uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2364{
2365 for (unsigned iList = 0; iList < RT_ELEMENTS(pGVM->gmm.s.Private.apLists); iList++)
2366 {
2367 PGMMCHUNK pChunk = pGVM->gmm.s.Private.apLists[iList];
2368 while (pChunk)
2369 {
2370 Assert(pChunk->hGVM == pGVM->hSelf);
2371 PGMMCHUNK pNext = pChunk->pFreeNext;
2372 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2373 if (iPage >= cPages)
2374 return iPage;
2375 pChunk = pNext;
2376 }
2377 }
2378 return iPage;
2379}
2380
2381
2382/**
2383 * Checks if we should start picking pages from chunks of other VMs.
2384 *
2385 * @returns @c true if we should, @c false if we should first try to allocate more
2386 * chunks.
2387 */
2388static bool gmmR0ShouldAllocatePagesInOtherChunks(PGVM pGVM)
2389{
2390 /*
2391 * Don't allocate a new chunk if we're getting close to the reservation limit; prefer picking pages from chunks other VMs have already allocated.
2392 */
2393 uint64_t cPgReserved = pGVM->gmm.s.Stats.Reserved.cBasePages
2394 + pGVM->gmm.s.Stats.Reserved.cFixedPages
2395 - pGVM->gmm.s.Stats.cBalloonedPages
2396 /** @todo what about shared pages? */;
2397 uint64_t cPgAllocated = pGVM->gmm.s.Stats.Allocated.cBasePages
2398 + pGVM->gmm.s.Stats.Allocated.cFixedPages;
2399 uint64_t cPgDelta = cPgReserved - cPgAllocated;
2400 if (cPgDelta < GMM_CHUNK_NUM_PAGES * 4)
2401 return true;
2402 /** @todo make the threshold configurable, also test the code to see if
2403 * this ever kicks in (we might be reserving too much or something). */
2404
2405 /*
2406 * Check how close we're to the max memory limit and how many fragments
2407 * there are?...
2408 */
2409 /** @todo. */
2410
2411 return false;
2412}
2413
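/*
 * Worked example (comment only) for the threshold above, assuming 2MB chunks of 4KB
 * pages (GMM_CHUNK_NUM_PAGES == 512); the figures are purely illustrative.
 *
 * @code
 *      // headroom  = (Reserved.cBasePages + Reserved.cFixedPages - cBalloonedPages)
 *      //           - (Allocated.cBasePages + Allocated.cFixedPages)
 *      // threshold = GMM_CHUNK_NUM_PAGES * 4 = 512 * 4 = 2048 pages (about 8MB)
 *      //
 *      // headroom <  2048  ->  true:  pick from chunks other VMs already allocated
 *      // headroom >= 2048  ->  false: keep allocating fresh chunks for locality
 * @endcode
 */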
2414
2415/**
2416 * Common worker for GMMR0AllocateHandyPages and GMMR0AllocatePages.
2417 *
2418 * @returns VBox status code:
2419 * @retval VINF_SUCCESS on success.
2420 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk or
2421 * gmmR0AllocateMoreChunks is necessary.
2422 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2423 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2424 * that is we're trying to allocate more than we've reserved.
2425 *
2426 * @param pGMM Pointer to the GMM instance data.
2427 * @param pGVM Pointer to the VM.
2428 * @param cPages The number of pages to allocate.
2429 * @param paPages Pointer to the page descriptors.
2430 * See GMMPAGEDESC for details on what is expected on input.
2431 * @param enmAccount The account to charge.
2432 *
2433 * @remarks The caller must own the giant GMM lock.
2434 */
2435static int gmmR0AllocatePagesNew(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2436{
2437 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
2438
2439 /*
2440 * Check allocation limits.
2441 */
2442 if (RT_UNLIKELY(pGMM->cAllocatedPages + cPages > pGMM->cMaxPages))
2443 return VERR_GMM_HIT_GLOBAL_LIMIT;
2444
2445 switch (enmAccount)
2446 {
2447 case GMMACCOUNT_BASE:
2448 if (RT_UNLIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
2449 > pGVM->gmm.s.Stats.Reserved.cBasePages))
2450 {
2451 Log(("gmmR0AllocatePages:Base: Reserved=%#llx Allocated+Ballooned+Requested=%#llx+%#llx+%#x!\n",
2452 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages,
2453 pGVM->gmm.s.Stats.cBalloonedPages, cPages));
2454 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2455 }
2456 break;
2457 case GMMACCOUNT_SHADOW:
2458 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages + cPages > pGVM->gmm.s.Stats.Reserved.cShadowPages))
2459 {
2460 Log(("gmmR0AllocatePages:Shadow: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2461 pGVM->gmm.s.Stats.Reserved.cShadowPages, pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
2462 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2463 }
2464 break;
2465 case GMMACCOUNT_FIXED:
2466 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages + cPages > pGVM->gmm.s.Stats.Reserved.cFixedPages))
2467 {
2468 Log(("gmmR0AllocatePages:Fixed: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2469 pGVM->gmm.s.Stats.Reserved.cFixedPages, pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
2470 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2471 }
2472 break;
2473 default:
2474 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2475 }
2476
2477 /*
2478 * If we're in legacy memory mode, it's easy to figure out up-front whether we
2479 * have a sufficient number of pages.
2480 */
2481 if ( pGMM->fLegacyAllocationMode
2482 && pGVM->gmm.s.Private.cFreePages < cPages)
2483 {
2484 Assert(pGMM->fBoundMemoryMode);
2485 return VERR_GMM_SEED_ME;
2486 }
2487
2488 /*
2489 * Update the accounts before we proceed because we might be leaving the
2490 * protection of the global mutex and thus run the risk of permitting
2491 * too much memory to be allocated.
2492 */
2493 switch (enmAccount)
2494 {
2495 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages += cPages; break;
2496 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages += cPages; break;
2497 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages += cPages; break;
2498 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2499 }
2500 pGVM->gmm.s.Stats.cPrivatePages += cPages;
2501 pGMM->cAllocatedPages += cPages;
2502
2503 /*
2504 * Part two of it's-easy-in-legacy-memory-mode.
2505 */
2506 uint32_t iPage = 0;
2507 if (pGMM->fLegacyAllocationMode)
2508 {
2509 iPage = gmmR0AllocatePagesInBoundMode(pGVM, iPage, cPages, paPages);
2510 AssertReleaseReturn(iPage == cPages, VERR_GMM_ALLOC_PAGES_IPE);
2511 return VINF_SUCCESS;
2512 }
2513
2514 /*
2515 * Bound mode is also relatively straightforward.
2516 */
2517 int rc = VINF_SUCCESS;
2518 if (pGMM->fBoundMemoryMode)
2519 {
2520 iPage = gmmR0AllocatePagesInBoundMode(pGVM, iPage, cPages, paPages);
2521 if (iPage < cPages)
2522 do
2523 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGVM->gmm.s.Private, cPages, paPages, &iPage);
2524 while (iPage < cPages && RT_SUCCESS(rc));
2525 }
2526 /*
2527 * Shared mode is trickier as we should try to achieve the same locality as
2528 * in bound mode, but smartly make use of non-full chunks allocated by
2529 * other VMs if we're low on memory.
2530 */
2531 else
2532 {
2533 /* Pick the most optimal pages first. */
2534 iPage = gmmR0AllocatePagesAssociatedWithVM(pGMM, pGVM, &pGMM->PrivateX, iPage, cPages, paPages);
2535 if (iPage < cPages)
2536 {
2537 /* Maybe we should try getting pages from chunks "belonging" to
2538 other VMs before allocating more chunks? */
2539 if (gmmR0ShouldAllocatePagesInOtherChunks(pGVM))
2540 iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2541
2542 /* Allocate memory from empty chunks. */
2543 if (iPage < cPages)
2544 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2545
2546 /* Grab empty shared chunks. */
2547 if (iPage < cPages)
2548 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2549
2550 /*
2551 * Ok, try allocate new chunks.
2552 */
2553 if (iPage < cPages)
2554 {
2555 do
2556 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGMM->PrivateX, cPages, paPages, &iPage);
2557 while (iPage < cPages && RT_SUCCESS(rc));
2558
2559 /* If the host is out of memory, take whatever we can get. */
2560 if ( (rc == VERR_NO_MEMORY || rc == VERR_NO_PHYS_MEMORY)
2561 && pGMM->PrivateX.cFreePages + pGMM->Shared.cFreePages >= cPages - iPage)
2562 {
2563 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2564 if (iPage < cPages)
2565 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2566 AssertRelease(iPage == cPages);
2567 rc = VINF_SUCCESS;
2568 }
2569 }
2570 }
2571 }
2572
2573 /*
2574 * Clean up on failure. Since this is bound to be a low-memory condition
2575 * we will give back any empty chunks that might be hanging around.
2576 */
2577 if (RT_FAILURE(rc))
2578 {
2579 /* Update the statistics. */
2580 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
2581 pGMM->cAllocatedPages -= cPages - iPage;
2582 switch (enmAccount)
2583 {
2584 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages; break;
2585 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= cPages; break;
2586 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= cPages; break;
2587 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2588 }
2589
2590 /* Release the pages. */
2591 while (iPage-- > 0)
2592 {
2593 uint32_t idPage = paPages[iPage].idPage;
2594 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2595 if (RT_LIKELY(pPage))
2596 {
2597 Assert(GMM_PAGE_IS_PRIVATE(pPage));
2598 Assert(pPage->Private.hGVM == pGVM->hSelf);
2599 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
2600 }
2601 else
2602 AssertMsgFailed(("idPage=%#x\n", idPage));
2603
2604 paPages[iPage].idPage = NIL_GMM_PAGEID;
2605 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2606 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2607 }
2608
2609 /* Free empty chunks. */
2610 /** @todo */
2611
2612 /* return the fail status on failure */
2613 return rc;
2614 }
2615 return VINF_SUCCESS;
2616}
2617
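/*
 * Summary (descriptive comment only) of the fallback order implemented above for the
 * non-legacy, non-bound case:
 *
 *      1. Chunks already associated with this VM, trying the last-chunk hint first.
 *      2. If the reservation headroom is small (gmmR0ShouldAllocatePagesInOtherChunks),
 *         chunks with free pages on the same NUMA node, regardless of owner.
 *      3. Completely empty chunks on the same NUMA node, first from the private set,
 *         then from the shared set.
 *      4. Newly allocated chunks (gmmR0AllocateChunkNew, which drops the giant mutex
 *         while talking to the host).
 *      5. As a last resort, when the host reports out-of-memory but enough free pages
 *         are still tracked, any chunk in either set (indiscriminate allocation).
 */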
2618
2619/**
2620 * Updates the previous allocations and allocates more pages.
2621 *
2622 * The handy pages are always taken from the 'base' memory account.
2623 * The allocated pages are not cleared and will contain random garbage.
2624 *
2625 * @returns VBox status code:
2626 * @retval VINF_SUCCESS on success.
2627 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2628 * @retval VERR_GMM_PAGE_NOT_FOUND if one of the pages to update wasn't found.
2629 * @retval VERR_GMM_PAGE_NOT_PRIVATE if one of the pages to update wasn't a
2630 * private page.
2631 * @retval VERR_GMM_PAGE_NOT_SHARED if one of the pages to update wasn't a
2632 * shared page.
2633 * @retval VERR_GMM_NOT_PAGE_OWNER if one of the pages to be updated wasn't
2634 * owned by the VM.
2635 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2636 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2637 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2638 * that is we're trying to allocate more than we've reserved.
2639 *
2640 * @param pVM Pointer to the VM.
2641 * @param idCpu The VCPU id.
2642 * @param cPagesToUpdate The number of pages to update (starting from the head).
2643 * @param cPagesToAlloc The number of pages to allocate (starting from the head).
2644 * @param paPages The array of page descriptors.
2645 * See GMMPAGEDESC for details on what is expected on input.
2646 * @thread EMT.
2647 */
2648GMMR0DECL(int) GMMR0AllocateHandyPages(PVM pVM, VMCPUID idCpu, uint32_t cPagesToUpdate, uint32_t cPagesToAlloc, PGMMPAGEDESC paPages)
2649{
2650 LogFlow(("GMMR0AllocateHandyPages: pVM=%p cPagesToUpdate=%#x cPagesToAlloc=%#x paPages=%p\n",
2651 pVM, cPagesToUpdate, cPagesToAlloc, paPages));
2652
2653 /*
2654 * Validate, get basics and take the semaphore.
2655 * (This is a relatively busy path, so make predictions where possible.)
2656 */
2657 PGMM pGMM;
2658 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2659 PGVM pGVM;
2660 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2661 if (RT_FAILURE(rc))
2662 return rc;
2663
2664 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2665 AssertMsgReturn( (cPagesToUpdate && cPagesToUpdate < 1024)
2666 || (cPagesToAlloc && cPagesToAlloc < 1024),
2667 ("cPagesToUpdate=%#x cPagesToAlloc=%#x\n", cPagesToUpdate, cPagesToAlloc),
2668 VERR_INVALID_PARAMETER);
2669
2670 unsigned iPage = 0;
2671 for (; iPage < cPagesToUpdate; iPage++)
2672 {
2673 AssertMsgReturn( ( paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2674 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK))
2675 || paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2676 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE,
2677 ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys),
2678 VERR_INVALID_PARAMETER);
2679 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2680 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2681 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2682 AssertMsgReturn( paPages[iPage].idSharedPage <= GMM_PAGEID_LAST
2683 /*|| paPages[iPage].idSharedPage == NIL_GMM_PAGEID*/,
2684 ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2685 }
2686
2687 for (; iPage < cPagesToAlloc; iPage++)
2688 {
2689 AssertMsgReturn(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS, ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys), VERR_INVALID_PARAMETER);
2690 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2691 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2692 }
2693
2694 gmmR0MutexAcquire(pGMM);
2695 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2696 {
2697 /* No allocations before the initial reservation has been made! */
2698 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
2699 && pGVM->gmm.s.Stats.Reserved.cFixedPages
2700 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
2701 {
2702 /*
2703 * Perform the updates.
2704 * Stop on the first error.
2705 */
2706 for (iPage = 0; iPage < cPagesToUpdate; iPage++)
2707 {
2708 if (paPages[iPage].idPage != NIL_GMM_PAGEID)
2709 {
2710 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idPage);
2711 if (RT_LIKELY(pPage))
2712 {
2713 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2714 {
2715 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
2716 {
2717 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2718 if (RT_LIKELY(paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST))
2719 pPage->Private.pfn = paPages[iPage].HCPhysGCPhys >> PAGE_SHIFT;
2720 else if (paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE)
2721 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
2722 /* else: NIL_RTHCPHYS nothing */
2723
2724 paPages[iPage].idPage = NIL_GMM_PAGEID;
2725 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2726 }
2727 else
2728 {
2729 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not owner! hGVM=%#x hSelf=%#x\n",
2730 iPage, paPages[iPage].idPage, pPage->Private.hGVM, pGVM->hSelf));
2731 rc = VERR_GMM_NOT_PAGE_OWNER;
2732 break;
2733 }
2734 }
2735 else
2736 {
2737 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not private! %.*Rhxs (type %d)\n", iPage, paPages[iPage].idPage, sizeof(*pPage), pPage, pPage->Common.u2State));
2738 rc = VERR_GMM_PAGE_NOT_PRIVATE;
2739 break;
2740 }
2741 }
2742 else
2743 {
2744 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (private)\n", iPage, paPages[iPage].idPage));
2745 rc = VERR_GMM_PAGE_NOT_FOUND;
2746 break;
2747 }
2748 }
2749
2750 if (paPages[iPage].idSharedPage != NIL_GMM_PAGEID)
2751 {
2752 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idSharedPage);
2753 if (RT_LIKELY(pPage))
2754 {
2755 if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
2756 {
2757 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2758 Assert(pPage->Shared.cRefs);
2759 Assert(pGVM->gmm.s.Stats.cSharedPages);
2760 Assert(pGVM->gmm.s.Stats.Allocated.cBasePages);
2761
2762 Log(("GMMR0AllocateHandyPages: free shared page %x cRefs=%d\n", paPages[iPage].idSharedPage, pPage->Shared.cRefs));
2763 pGVM->gmm.s.Stats.cSharedPages--;
2764 pGVM->gmm.s.Stats.Allocated.cBasePages--;
2765 if (!--pPage->Shared.cRefs)
2766 gmmR0FreeSharedPage(pGMM, pGVM, paPages[iPage].idSharedPage, pPage);
2767 else
2768 {
2769 Assert(pGMM->cDuplicatePages);
2770 pGMM->cDuplicatePages--;
2771 }
2772
2773 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2774 }
2775 else
2776 {
2777 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not shared!\n", iPage, paPages[iPage].idSharedPage));
2778 rc = VERR_GMM_PAGE_NOT_SHARED;
2779 break;
2780 }
2781 }
2782 else
2783 {
2784 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (shared)\n", iPage, paPages[iPage].idSharedPage));
2785 rc = VERR_GMM_PAGE_NOT_FOUND;
2786 break;
2787 }
2788 }
2789 } /* for each page to update */
2790
2791 if (RT_SUCCESS(rc) && cPagesToAlloc > 0)
2792 {
2793#if defined(VBOX_STRICT) && 0 /** @todo re-test this later. Appeared to be a PGM init bug. */
2794 for (iPage = 0; iPage < cPagesToAlloc; iPage++)
2795 {
2796 Assert(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS);
2797 Assert(paPages[iPage].idPage == NIL_GMM_PAGEID);
2798 Assert(paPages[iPage].idSharedPage == NIL_GMM_PAGEID);
2799 }
2800#endif
2801
2802 /*
2803 * Join paths with GMMR0AllocatePages for the allocation.
2804 * Note! gmmR0AllocatePagesNew may leave the protection of the mutex!
2805 */
2806 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPagesToAlloc, paPages, GMMACCOUNT_BASE);
2807 }
2808 }
2809 else
2810 rc = VERR_WRONG_ORDER;
2811 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2812 }
2813 else
2814 rc = VERR_GMM_IS_NOT_SANE;
2815 gmmR0MutexRelease(pGMM);
2816 LogFlow(("GMMR0AllocateHandyPages: returns %Rrc\n", rc));
2817 return rc;
2818}
2819
2820
2821/**
2822 * Allocate one or more pages.
2823 *
2824 * This is typically used for ROMs and MMIO2 (VRAM) during VM creation.
2825 * The allocated pages are not cleared and will contain random garbage.
2826 *
2827 * @returns VBox status code:
2828 * @retval VINF_SUCCESS on success.
2829 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2830 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2831 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2832 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2833 * that is we're trying to allocate more than we've reserved.
2834 *
2835 * @param pVM Pointer to the VM.
2836 * @param idCpu The VCPU id.
2837 * @param cPages The number of pages to allocate.
2838 * @param paPages Pointer to the page descriptors.
2839 * See GMMPAGEDESC for details on what is expected on input.
2840 * @param enmAccount The account to charge.
2841 *
2842 * @thread EMT.
2843 */
2844GMMR0DECL(int) GMMR0AllocatePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2845{
2846 LogFlow(("GMMR0AllocatePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
2847
2848 /*
2849 * Validate, get basics and take the semaphore.
2850 */
2851 PGMM pGMM;
2852 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2853 PGVM pGVM;
2854 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2855 if (RT_FAILURE(rc))
2856 return rc;
2857
2858 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2859 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
2860 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
2861
2862 for (unsigned iPage = 0; iPage < cPages; iPage++)
2863 {
2864 AssertMsgReturn( paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2865 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE
2866 || ( enmAccount == GMMACCOUNT_BASE
2867 && paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2868 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK)),
2869 ("#%#x: %RHp enmAccount=%d\n", iPage, paPages[iPage].HCPhysGCPhys, enmAccount),
2870 VERR_INVALID_PARAMETER);
2871 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2872 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2873 }
2874
2875 gmmR0MutexAcquire(pGMM);
2876 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2877 {
2878
2879 /* No allocations before the initial reservation has been made! */
2880 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
2881 && pGVM->gmm.s.Stats.Reserved.cFixedPages
2882 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
2883 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPages, paPages, enmAccount);
2884 else
2885 rc = VERR_WRONG_ORDER;
2886 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2887 }
2888 else
2889 rc = VERR_GMM_IS_NOT_SANE;
2890 gmmR0MutexRelease(pGMM);
2891 LogFlow(("GMMR0AllocatePages: returns %Rrc\n", rc));
2892 return rc;
2893}
2894
2895
2896/**
2897 * VMMR0 request wrapper for GMMR0AllocatePages.
2898 *
2899 * @returns see GMMR0AllocatePages.
2900 * @param pVM Pointer to the VM.
2901 * @param idCpu The VCPU id.
2902 * @param pReq Pointer to the request packet.
2903 */
2904GMMR0DECL(int) GMMR0AllocatePagesReq(PVM pVM, VMCPUID idCpu, PGMMALLOCATEPAGESREQ pReq)
2905{
2906 /*
2907 * Validate input and pass it on.
2908 */
2909 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2910 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2911 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0]),
2912 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0])),
2913 VERR_INVALID_PARAMETER);
2914 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages]),
2915 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages])),
2916 VERR_INVALID_PARAMETER);
2917
2918 return GMMR0AllocatePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
2919}
2920
2921
2922/**
2923 * Allocate a large page to represent guest RAM.
2924 *
2925 * The allocated pages are not cleared and will contain random garbage.
2926 *
2927 * @returns VBox status code:
2928 * @retval VINF_SUCCESS on success.
2929 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2930 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2931 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2932 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2933 * that is we're trying to allocate more than we've reserved.
2934 * @returns see GMMR0AllocatePages.
2935 * @param pVM Pointer to the VM.
2936 * @param idCpu The VCPU id.
2937 * @param cbPage Large page size.
2938 */
2939GMMR0DECL(int) GMMR0AllocateLargePage(PVM pVM, VMCPUID idCpu, uint32_t cbPage, uint32_t *pIdPage, RTHCPHYS *pHCPhys)
2940{
2941 LogFlow(("GMMR0AllocateLargePage: pVM=%p cbPage=%x\n", pVM, cbPage));
2942
2943 AssertReturn(cbPage == GMM_CHUNK_SIZE, VERR_INVALID_PARAMETER);
2944 AssertPtrReturn(pIdPage, VERR_INVALID_PARAMETER);
2945 AssertPtrReturn(pHCPhys, VERR_INVALID_PARAMETER);
2946
2947 /*
2948 * Validate, get basics and take the semaphore.
2949 */
2950 PGMM pGMM;
2951 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2952 PGVM pGVM;
2953 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2954 if (RT_FAILURE(rc))
2955 return rc;
2956
2957 /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
2958 if (pGMM->fLegacyAllocationMode)
2959 return VERR_NOT_SUPPORTED;
2960
2961 *pHCPhys = NIL_RTHCPHYS;
2962 *pIdPage = NIL_GMM_PAGEID;
2963
2964 gmmR0MutexAcquire(pGMM);
2965 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2966 {
2967 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
2968 if (RT_UNLIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
2969 > pGVM->gmm.s.Stats.Reserved.cBasePages))
2970 {
2971 Log(("GMMR0AllocateLargePage: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
2972 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
2973 gmmR0MutexRelease(pGMM);
2974 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2975 }
2976
2977 /*
2978 * Allocate a new large page chunk.
2979 *
2980 * Note! We leave the giant GMM lock temporarily as the allocation might
2981 * take a long time. gmmR0RegisterChunk will retake it (ugly).
2982 */
2983 AssertCompile(GMM_CHUNK_SIZE == _2M);
2984 gmmR0MutexRelease(pGMM);
2985
2986 RTR0MEMOBJ hMemObj;
2987 rc = RTR0MemObjAllocPhysEx(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS, GMM_CHUNK_SIZE);
2988 if (RT_SUCCESS(rc))
2989 {
2990 PGMMCHUNKFREESET pSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
2991 PGMMCHUNK pChunk;
2992 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, GMM_CHUNK_FLAGS_LARGE_PAGE, &pChunk);
2993 if (RT_SUCCESS(rc))
2994 {
2995 /*
2996 * Allocate all the pages in the chunk.
2997 */
2998 /* Unlink the new chunk from the free list. */
2999 gmmR0UnlinkChunk(pChunk);
3000
3001 /** @todo rewrite this to skip the looping. */
3002 /* Allocate all pages. */
3003 GMMPAGEDESC PageDesc;
3004 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3005
3006 /* Return the first page as we'll use the whole chunk as one big page. */
3007 *pIdPage = PageDesc.idPage;
3008 *pHCPhys = PageDesc.HCPhysGCPhys;
3009
3010 for (unsigned i = 1; i < cPages; i++)
3011 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3012
3013 /* Update accounting. */
3014 pGVM->gmm.s.Stats.Allocated.cBasePages += cPages;
3015 pGVM->gmm.s.Stats.cPrivatePages += cPages;
3016 pGMM->cAllocatedPages += cPages;
3017
3018 gmmR0LinkChunk(pChunk, pSet);
3019 gmmR0MutexRelease(pGMM);
3020 }
3021 else
3022 RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
3023 }
3024 }
3025 else
3026 {
3027 gmmR0MutexRelease(pGMM);
3028 rc = VERR_GMM_IS_NOT_SANE;
3029 }
3030
3031 LogFlow(("GMMR0AllocateLargePage: returns %Rrc\n", rc));
3032 return rc;
3033}
3034
3035
3036/**
3037 * Free a large page.
3038 *
3039 * @returns VBox status code:
3040 * @param pVM Pointer to the VM.
3041 * @param idCpu The VCPU id.
3042 * @param idPage The large page id.
3043 */
3044GMMR0DECL(int) GMMR0FreeLargePage(PVM pVM, VMCPUID idCpu, uint32_t idPage)
3045{
3046 LogFlow(("GMMR0FreeLargePage: pVM=%p idPage=%x\n", pVM, idPage));
3047
3048 /*
3049 * Validate, get basics and take the semaphore.
3050 */
3051 PGMM pGMM;
3052 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3053 PGVM pGVM;
3054 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3055 if (RT_FAILURE(rc))
3056 return rc;
3057
3058 /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
3059 if (pGMM->fLegacyAllocationMode)
3060 return VERR_NOT_SUPPORTED;
3061
3062 gmmR0MutexAcquire(pGMM);
3063 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3064 {
3065 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3066
3067 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3068 {
3069 Log(("GMMR0FreeLargePage: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3070 gmmR0MutexRelease(pGMM);
3071 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3072 }
3073
3074 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3075 if (RT_LIKELY( pPage
3076 && GMM_PAGE_IS_PRIVATE(pPage)))
3077 {
3078 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3079 Assert(pChunk);
3080 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3081 Assert(pChunk->cPrivate > 0);
3082
3083 /* Release the memory immediately. */
3084 gmmR0FreeChunk(pGMM, NULL, pChunk, false /*fRelaxedSem*/); /** @todo this can be relaxed too! */
3085
3086 /* Update accounting. */
3087 pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages;
3088 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
3089 pGMM->cAllocatedPages -= cPages;
3090 }
3091 else
3092 rc = VERR_GMM_PAGE_NOT_FOUND;
3093 }
3094 else
3095 rc = VERR_GMM_IS_NOT_SANE;
3096
3097 gmmR0MutexRelease(pGMM);
3098 LogFlow(("GMMR0FreeLargePage: returns %Rrc\n", rc));
3099 return rc;
3100}
3101
3102
3103/**
3104 * VMMR0 request wrapper for GMMR0FreeLargePage.
3105 *
3106 * @returns see GMMR0FreeLargePage.
3107 * @param pVM Pointer to the VM.
3108 * @param idCpu The VCPU id.
3109 * @param pReq Pointer to the request packet.
3110 */
3111GMMR0DECL(int) GMMR0FreeLargePageReq(PVM pVM, VMCPUID idCpu, PGMMFREELARGEPAGEREQ pReq)
3112{
3113 /*
3114 * Validate input and pass it on.
3115 */
3116 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3117 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3118 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMFREELARGEPAGEREQ),
3119 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMFREELARGEPAGEREQ)),
3120 VERR_INVALID_PARAMETER);
3121
3122 return GMMR0FreeLargePage(pVM, idCpu, pReq->idPage);
3123}
3124
3125
3126/**
3127 * Frees a chunk, giving it back to the host OS.
3128 *
3129 * @param pGMM Pointer to the GMM instance.
3130 * @param pGVM This is set when called from GMMR0CleanupVM so we can
3131 * unmap and free the chunk in one go.
3132 * @param pChunk The chunk to free.
3133 * @param fRelaxedSem Whether we can release the semaphore while doing the
3134 * freeing (@c true) or not.
3135 */
3136static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3137{
3138 Assert(pChunk->Core.Key != NIL_GMM_CHUNKID);
3139
3140 GMMR0CHUNKMTXSTATE MtxState;
3141 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
3142
3143 /*
3144 * Cleanup hack! Unmap the chunk from the caller's address space.
3145 * This shouldn't happen, so screw lock contention...
3146 */
3147 if ( pChunk->cMappingsX
3148 && !pGMM->fLegacyAllocationMode
3149 && pGVM)
3150 gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3151
3152 /*
3153 * If there are current mappings of the chunk, then request the
3154 * VMs to unmap them. Reposition the chunk in the free list so
3155 * it won't be a likely candidate for allocations.
3156 */
3157 if (pChunk->cMappingsX)
3158 {
3159 /** @todo R0 -> VM request */
3160 /* The chunk can be mapped by more than one VM if fBoundMemoryMode is false! */
3161 Log(("gmmR0FreeChunk: chunk still has %d mappings; don't free!\n", pChunk->cMappingsX));
3162 gmmR0ChunkMutexRelease(&MtxState, pChunk);
3163 return false;
3164 }
3165
3166
3167 /*
3168 * Save and trash the handle.
3169 */
3170 RTR0MEMOBJ const hMemObj = pChunk->hMemObj;
3171 pChunk->hMemObj = NIL_RTR0MEMOBJ;
3172
3173 /*
3174 * Unlink it from everywhere.
3175 */
3176 gmmR0UnlinkChunk(pChunk);
3177
3178 RTListNodeRemove(&pChunk->ListNode);
3179
3180 PAVLU32NODECORE pCore = RTAvlU32Remove(&pGMM->pChunks, pChunk->Core.Key);
3181 Assert(pCore == &pChunk->Core); NOREF(pCore);
3182
3183 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(pChunk->Core.Key)];
3184 if (pTlbe->pChunk == pChunk)
3185 {
3186 pTlbe->idChunk = NIL_GMM_CHUNKID;
3187 pTlbe->pChunk = NULL;
3188 }
3189
3190 Assert(pGMM->cChunks > 0);
3191 pGMM->cChunks--;
3192
3193 /*
3194 * Free the Chunk ID before dropping the locks and freeing the rest.
3195 */
3196 gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
3197 pChunk->Core.Key = NIL_GMM_CHUNKID;
3198
3199 pGMM->cFreedChunks++;
3200
3201 gmmR0ChunkMutexRelease(&MtxState, NULL);
3202 if (fRelaxedSem)
3203 gmmR0MutexRelease(pGMM);
3204
3205 RTMemFree(pChunk->paMappingsX);
3206 pChunk->paMappingsX = NULL;
3207
3208 RTMemFree(pChunk);
3209
3210 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
3211 AssertLogRelRC(rc);
3212
3213 if (fRelaxedSem)
3214 gmmR0MutexAcquire(pGMM);
3215 return fRelaxedSem;
3216}
3217
3218
3219/**
3220 * Free page worker.
3221 *
3222 * The caller does all the statistic decrementing; we do all the incrementing.
3223 *
3224 * @param pGMM Pointer to the GMM instance data.
3225 * @param pGVM Pointer to the GVM instance.
3226 * @param pChunk Pointer to the chunk this page belongs to.
3227 * @param idPage The Page ID.
3228 * @param pPage Pointer to the page.
3229 */
3230static void gmmR0FreePageWorker(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint32_t idPage, PGMMPAGE pPage)
3231{
3232 Log3(("F pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x\n",
3233 pPage, pPage - &pChunk->aPages[0], idPage, pPage->Common.u2State, pChunk->iFreeHead)); NOREF(idPage);
3234
3235 /*
3236 * Put the page on the free list.
3237 */
3238 pPage->u = 0;
3239 pPage->Free.u2State = GMM_PAGE_STATE_FREE;
3240 Assert(pChunk->iFreeHead < RT_ELEMENTS(pChunk->aPages) || pChunk->iFreeHead == UINT16_MAX);
3241 pPage->Free.iNext = pChunk->iFreeHead;
3242 pChunk->iFreeHead = pPage - &pChunk->aPages[0];
3243
3244 /*
3245 * Update statistics (the cShared/cPrivate stats are up to date already),
3246 * and relink the chunk if necessary.
3247 */
3248 unsigned const cFree = pChunk->cFree;
3249 if ( !cFree
3250 || gmmR0SelectFreeSetList(cFree) != gmmR0SelectFreeSetList(cFree + 1))
3251 {
3252 gmmR0UnlinkChunk(pChunk);
3253 pChunk->cFree++;
3254 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
3255 }
3256 else
3257 {
3258 pChunk->cFree = cFree + 1;
3259 pChunk->pSet->cFreePages++;
3260 }
3261
3262 /*
3263 * If the chunk becomes empty, consider giving memory back to the host OS.
3264 *
3265 * The current strategy is to try give it back if there are other chunks
3266 * The current strategy is to try to give it back if there are other chunks
3267 * category. Note that since there are probably mappings of the chunk,
3268 * it won't be freed up instantly, which probably screws up this logic
3269 * a bit...
3270 */
3271 /** @todo Do this on the way out. */
3272 if (RT_UNLIKELY( pChunk->cFree == GMM_CHUNK_NUM_PAGES
3273 && pChunk->pFreeNext
3274 && pChunk->pFreePrev /** @todo this is probably misfiring, see reset... */
3275 && !pGMM->fLegacyAllocationMode))
3276 gmmR0FreeChunk(pGMM, NULL, pChunk, false);
3277
3278}
3279
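/*
 * Worked example (comment only) for the relink condition above: the chunk is unlinked
 * and relinked only when it wasn't linked at all or when the higher free count lands it
 * in a different free list bucket; otherwise just the counters are bumped.  The shift
 * value of 4 is purely illustrative.
 *
 * @code
 *      // assuming GMM_CHUNK_FREE_SET_SHIFT == 4:
 *      //   cFree 14 -> 15 :  14 >> 4 == 0 == 15 >> 4       -> bump pChunk->cFree and pSet->cFreePages
 *      //   cFree 15 -> 16 :  15 >> 4 == 0 != 16 >> 4 == 1  -> unlink, then relink in the next bucket
 *      //   cFree  0 ->  1 :  chunk was on no list          -> gmmR0SelectSetAndLinkChunk links it
 * @endcode
 */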
3280
3281/**
3282 * Frees a shared page, the page is known to exist and be valid and such.
3283 *
3284 * @param pGMM Pointer to the GMM instance.
3285 * @param pGVM Pointer to the GVM instance.
3286 * @param idPage The page id.
3287 * @param pPage The page structure.
3288 */
3289DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3290{
3291 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3292 Assert(pChunk);
3293 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3294 Assert(pChunk->cShared > 0);
3295 Assert(pGMM->cSharedPages > 0);
3296 Assert(pGMM->cAllocatedPages > 0);
3297 Assert(!pPage->Shared.cRefs);
3298
3299 pChunk->cShared--;
3300 pGMM->cAllocatedPages--;
3301 pGMM->cSharedPages--;
3302 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3303}
3304
3305
3306/**
3307 * Frees a private page, the page is known to exist and be valid and such.
3308 *
3309 * @param pGMM Pointer to the GMM instance.
3310 * @param pGVM Pointer to the GVM instance.
3311 * @param idPage The page id.
3312 * @param pPage The page structure.
3313 */
3314DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3315{
3316 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3317 Assert(pChunk);
3318 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3319 Assert(pChunk->cPrivate > 0);
3320 Assert(pGMM->cAllocatedPages > 0);
3321
3322 pChunk->cPrivate--;
3323 pGMM->cAllocatedPages--;
3324 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3325}
3326
3327
3328/**
3329 * Common worker for GMMR0FreePages and GMMR0BalloonedPages.
3330 *
3331 * @returns VBox status code:
3332 * @retval xxx
3333 *
3334 * @param pGMM Pointer to the GMM instance data.
3335 * @param pGVM Pointer to the VM.
3336 * @param cPages The number of pages to free.
3337 * @param paPages Pointer to the page descriptors.
3338 * @param enmAccount The account this relates to.
3339 */
3340static int gmmR0FreePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3341{
3342 /*
3343 * Check that the request isn't impossible wrt to the account status.
3344 */
3345 switch (enmAccount)
3346 {
3347 case GMMACCOUNT_BASE:
3348 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3349 {
3350 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3351 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3352 }
3353 break;
3354 case GMMACCOUNT_SHADOW:
3355 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages < cPages))
3356 {
3357 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
3358 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3359 }
3360 break;
3361 case GMMACCOUNT_FIXED:
3362 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages < cPages))
3363 {
3364 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
3365 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3366 }
3367 break;
3368 default:
3369 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3370 }
3371
3372 /*
3373 * Walk the descriptors and free the pages.
3374 *
3375 * Statistics (except the account) are being updated as we go along,
3376 * unlike the alloc code. Also, stop on the first error.
3377 */
3378 int rc = VINF_SUCCESS;
3379 uint32_t iPage;
3380 for (iPage = 0; iPage < cPages; iPage++)
3381 {
3382 uint32_t idPage = paPages[iPage].idPage;
3383 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3384 if (RT_LIKELY(pPage))
3385 {
3386 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
3387 {
3388 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
3389 {
3390 Assert(pGVM->gmm.s.Stats.cPrivatePages);
3391 pGVM->gmm.s.Stats.cPrivatePages--;
3392 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
3393 }
3394 else
3395 {
3396                     Log(("gmmR0FreePages: #%#x/%#x: not owner! hGVM=%#x hSelf=%#x\n", iPage, idPage,
3397 pPage->Private.hGVM, pGVM->hSelf));
3398 rc = VERR_GMM_NOT_PAGE_OWNER;
3399 break;
3400 }
3401 }
3402 else if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3403 {
3404 Assert(pGVM->gmm.s.Stats.cSharedPages);
3405 Assert(pPage->Shared.cRefs);
3406#if defined(VBOX_WITH_PAGE_SHARING) && defined(VBOX_STRICT) && HC_ARCH_BITS == 64
3407 if (pPage->Shared.u14Checksum)
3408 {
3409 uint32_t uChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
3410 uChecksum &= UINT32_C(0x00003fff);
3411 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum,
3412 ("%#x vs %#x - idPage=%#x\n", uChecksum, pPage->Shared.u14Checksum, idPage));
3413 }
3414#endif
3415 pGVM->gmm.s.Stats.cSharedPages--;
3416 if (!--pPage->Shared.cRefs)
3417 gmmR0FreeSharedPage(pGMM, pGVM, idPage, pPage);
3418 else
3419 {
3420 Assert(pGMM->cDuplicatePages);
3421 pGMM->cDuplicatePages--;
3422 }
3423 }
3424 else
3425 {
3426                 Log(("gmmR0FreePages: #%#x/%#x: already free!\n", iPage, idPage));
3427 rc = VERR_GMM_PAGE_ALREADY_FREE;
3428 break;
3429 }
3430 }
3431 else
3432 {
3433             Log(("gmmR0FreePages: #%#x/%#x: not found!\n", iPage, idPage));
3434 rc = VERR_GMM_PAGE_NOT_FOUND;
3435 break;
3436 }
3437 paPages[iPage].idPage = NIL_GMM_PAGEID;
3438 }
3439
3440 /*
3441 * Update the account.
3442 */
3443 switch (enmAccount)
3444 {
3445 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= iPage; break;
3446 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= iPage; break;
3447 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= iPage; break;
3448 default:
3449 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3450 }
3451
3452 /*
3453 * Any threshold stuff to be done here?
3454 */
3455
3456 return rc;
3457}
3458
3459
3460/**
3461 * Free one or more pages.
3462 *
3463 * This is typically used at reset time or power off.
3464 *
3465 * @returns VBox status code:
3466 * @retval xxx
3467 *
3468 * @param pVM Pointer to the VM.
3469 * @param idCpu The VCPU id.
3470  * @param   cPages      The number of pages to free.
3471 * @param paPages Pointer to the page descriptors containing the Page IDs for each page.
3472 * @param enmAccount The account this relates to.
3473 * @thread EMT.
3474 */
3475GMMR0DECL(int) GMMR0FreePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3476{
3477 LogFlow(("GMMR0FreePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
3478
3479 /*
3480 * Validate input and get the basics.
3481 */
3482 PGMM pGMM;
3483 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3484 PGVM pGVM;
3485 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3486 if (RT_FAILURE(rc))
3487 return rc;
3488
3489 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3490 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3491 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3492
3493 for (unsigned iPage = 0; iPage < cPages; iPage++)
3494 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
3495 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
3496 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3497
3498 /*
3499 * Take the semaphore and call the worker function.
3500 */
3501 gmmR0MutexAcquire(pGMM);
3502 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3503 {
3504 rc = gmmR0FreePages(pGMM, pGVM, cPages, paPages, enmAccount);
3505 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3506 }
3507 else
3508 rc = VERR_GMM_IS_NOT_SANE;
3509 gmmR0MutexRelease(pGMM);
3510 LogFlow(("GMMR0FreePages: returns %Rrc\n", rc));
3511 return rc;
3512}
3513
3514
3515/**
3516 * VMMR0 request wrapper for GMMR0FreePages.
3517 *
3518 * @returns see GMMR0FreePages.
3519 * @param pVM Pointer to the VM.
3520 * @param idCpu The VCPU id.
3521 * @param pReq Pointer to the request packet.
3522 */
3523GMMR0DECL(int) GMMR0FreePagesReq(PVM pVM, VMCPUID idCpu, PGMMFREEPAGESREQ pReq)
3524{
3525 /*
3526 * Validate input and pass it on.
3527 */
3528 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3529 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3530 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0]),
3531 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0])),
3532 VERR_INVALID_PARAMETER);
3533 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages]),
3534 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages])),
3535 VERR_INVALID_PARAMETER);
3536
3537 return GMMR0FreePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3538}
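/* Illustrative ring-3 usage sketch -- not part of the original file. It shows how a
   caller might fill a GMMFREEPAGESREQ before handing it to GMMR0FreePagesReq; the
   variable names (cPagesToFree, paidPages) are hypothetical, only the fields checked
   above (Hdr.cbReq, cPages, enmAccount, aPages[].idPage) are taken from this code.
   @code
        uint32_t cbReq = RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[cPagesToFree]);
        PGMMFREEPAGESREQ pReq = (PGMMFREEPAGESREQ)RTMemAllocZ(cbReq);
        pReq->Hdr.cbReq  = cbReq;                    /* must equal RT_UOFFSETOF(..., aPages[cPages]), see the asserts above */
        pReq->enmAccount = GMMACCOUNT_BASE;
        pReq->cPages     = cPagesToFree;
        for (uint32_t i = 0; i < cPagesToFree; i++)
            pReq->aPages[i].idPage = paidPages[i];   /* page IDs previously handed out by the allocation path */
        /* The request is then dispatched to ring-0 (request header setup omitted)
           and ends up in GMMR0FreePagesReq above. */
   @endcode */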
3539
3540
3541/**
3542 * Report back on a memory ballooning request.
3543 *
3544 * The request may or may not have been initiated by the GMM. If it was initiated
3545 * by the GMM it is important that this function is called even if no pages were
3546 * ballooned.
3547 *
3548 * @returns VBox status code:
3549 * @retval VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH
3550 * @retval VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH
3551 * @retval VERR_GMM_OVERCOMMITTED_TRY_AGAIN_IN_A_BIT - reset condition
3552 * indicating that we won't necessarily have sufficient RAM to boot
3553 * the VM again and that it should pause until this changes (we'll try
3554 *          the VM again and that it should pause until this changes (we'll try to
3555 * but to hope the VM won't use the memory that was returned to it.)
3556 *
3557 * @param pVM Pointer to the VM.
3558 * @param idCpu The VCPU id.
3559 * @param enmAction Inflate/deflate/reset.
3560  * @param   cBalloonedPages The number of pages that were ballooned.
3561 *
3562 * @thread EMT.
3563 */
3564GMMR0DECL(int) GMMR0BalloonedPages(PVM pVM, VMCPUID idCpu, GMMBALLOONACTION enmAction, uint32_t cBalloonedPages)
3565{
3566 LogFlow(("GMMR0BalloonedPages: pVM=%p enmAction=%d cBalloonedPages=%#x\n",
3567 pVM, enmAction, cBalloonedPages));
3568
3569 AssertMsgReturn(cBalloonedPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cBalloonedPages), VERR_INVALID_PARAMETER);
3570
3571 /*
3572 * Validate input and get the basics.
3573 */
3574 PGMM pGMM;
3575 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3576 PGVM pGVM;
3577 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3578 if (RT_FAILURE(rc))
3579 return rc;
3580
3581 /*
3582 * Take the semaphore and do some more validations.
3583 */
3584 gmmR0MutexAcquire(pGMM);
3585 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3586 {
3587 switch (enmAction)
3588 {
3589 case GMMBALLOONACTION_INFLATE:
3590 {
3591 if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cBalloonedPages
3592 <= pGVM->gmm.s.Stats.Reserved.cBasePages))
3593 {
3594 /*
3595 * Record the ballooned memory.
3596 */
3597 pGMM->cBalloonedPages += cBalloonedPages;
3598 if (pGVM->gmm.s.Stats.cReqBalloonedPages)
3599 {
3600                     /* Codepath never taken. Might be interesting in the future to request ballooned memory from guests in low memory conditions. */
3601 AssertFailed();
3602
3603 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3604 pGVM->gmm.s.Stats.cReqActuallyBalloonedPages += cBalloonedPages;
3605 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx Req=%#llx Actual=%#llx (pending)\n",
3606 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages,
3607 pGVM->gmm.s.Stats.cReqBalloonedPages, pGVM->gmm.s.Stats.cReqActuallyBalloonedPages));
3608 }
3609 else
3610 {
3611 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3612 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3613 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3614 }
3615 }
3616 else
3617 {
3618 Log(("GMMR0BalloonedPages: cBasePages=%#llx Total=%#llx cBalloonedPages=%#llx Reserved=%#llx\n",
3619 pGVM->gmm.s.Stats.Allocated.cBasePages, pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages,
3620 pGVM->gmm.s.Stats.Reserved.cBasePages));
3621 rc = VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3622 }
3623 break;
3624 }
3625
3626 case GMMBALLOONACTION_DEFLATE:
3627 {
3628 /* Deflate. */
3629 if (pGVM->gmm.s.Stats.cBalloonedPages >= cBalloonedPages)
3630 {
3631 /*
3632 * Record the ballooned memory.
3633 */
3634 Assert(pGMM->cBalloonedPages >= cBalloonedPages);
3635 pGMM->cBalloonedPages -= cBalloonedPages;
3636 pGVM->gmm.s.Stats.cBalloonedPages -= cBalloonedPages;
3637 if (pGVM->gmm.s.Stats.cReqDeflatePages)
3638 {
3639                         AssertFailed(); /* This path is for later. */
3640 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx Req=%#llx\n",
3641 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages, pGVM->gmm.s.Stats.cReqDeflatePages));
3642
3643 /*
3644 * Anything we need to do here now when the request has been completed?
3645 */
3646 pGVM->gmm.s.Stats.cReqDeflatePages = 0;
3647 }
3648 else
3649 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3650 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3651 }
3652 else
3653 {
3654 Log(("GMMR0BalloonedPages: Total=%#llx cBalloonedPages=%#llx\n", pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages));
3655 rc = VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH;
3656 }
3657 break;
3658 }
3659
3660 case GMMBALLOONACTION_RESET:
3661 {
3662 /* Reset to an empty balloon. */
3663 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
3664
3665 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
3666 pGVM->gmm.s.Stats.cBalloonedPages = 0;
3667 break;
3668 }
3669
3670 default:
3671 rc = VERR_INVALID_PARAMETER;
3672 break;
3673 }
3674 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3675 }
3676 else
3677 rc = VERR_GMM_IS_NOT_SANE;
3678
3679 gmmR0MutexRelease(pGMM);
3680 LogFlow(("GMMR0BalloonedPages: returns %Rrc\n", rc));
3681 return rc;
3682}
3683
3684
3685/**
3686 * VMMR0 request wrapper for GMMR0BalloonedPages.
3687 *
3688 * @returns see GMMR0BalloonedPages.
3689 * @param pVM Pointer to the VM.
3690 * @param idCpu The VCPU id.
3691 * @param pReq Pointer to the request packet.
3692 */
3693GMMR0DECL(int) GMMR0BalloonedPagesReq(PVM pVM, VMCPUID idCpu, PGMMBALLOONEDPAGESREQ pReq)
3694{
3695 /*
3696 * Validate input and pass it on.
3697 */
3698 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3699 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3700 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMBALLOONEDPAGESREQ),
3701                     ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMBALLOONEDPAGESREQ)),
3702 VERR_INVALID_PARAMETER);
3703
3704 return GMMR0BalloonedPages(pVM, idCpu, pReq->enmAction, pReq->cBalloonedPages);
3705}
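/* Illustrative usage sketch -- not part of the original file. A balloon driver
   report for an inflate of cPages pages; cPages is hypothetical, the fields and
   the GMMBALLOONACTION_INFLATE action are the ones handled above.
   @code
        GMMBALLOONEDPAGESREQ Req;
        Req.Hdr.cbReq       = sizeof(Req);              /* must match exactly, see the assertion above */
        Req.enmAction       = GMMBALLOONACTION_INFLATE;
        Req.cBalloonedPages = cPages;                   /* pages the guest balloon just pinned */
        /* Dispatched to ring-0 this becomes GMMR0BalloonedPages(pVM, idCpu, GMMBALLOONACTION_INFLATE, cPages);
           a GMMBALLOONACTION_DEFLATE report works the same way with the count being released. */
   @endcode */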
3706
3707/**
3708 * Return memory statistics for the hypervisor
3709 *
3710 * @returns VBox status code:
3711 * @param pVM Pointer to the VM.
3712 * @param pReq Pointer to the request packet.
3713 */
3714GMMR0DECL(int) GMMR0QueryHypervisorMemoryStatsReq(PVM pVM, PGMMMEMSTATSREQ pReq)
3715{
3716 /*
3717 * Validate input and pass it on.
3718 */
3719 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3720 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3721 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3722                     ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3723 VERR_INVALID_PARAMETER);
3724
3725 /*
3726 * Validate input and get the basics.
3727 */
3728 PGMM pGMM;
3729 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3730 pReq->cAllocPages = pGMM->cAllocatedPages;
3731     pReq->cFreePages = (pGMM->cChunks << (GMM_CHUNK_SHIFT - PAGE_SHIFT)) - pGMM->cAllocatedPages;
3732 pReq->cBalloonedPages = pGMM->cBalloonedPages;
3733 pReq->cMaxPages = pGMM->cMaxPages;
3734 pReq->cSharedPages = pGMM->cDuplicatePages;
3735 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3736
3737 return VINF_SUCCESS;
3738}
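/* Worked example for the cFreePages calculation above -- not part of the original
   file and assuming the usual 2 MB chunks with 4 KB pages: GMM_CHUNK_SHIFT - PAGE_SHIFT
   would be 21 - 12 = 9, so every chunk accounts for 512 pages and
   cFreePages = (cChunks << 9) - cAllocatedPages, i.e. all chunk-backed pages that are
   not currently handed out to any VM. */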
3739
3740/**
3741 * Return memory statistics for the VM
3742 *
3743 * @returns VBox status code:
3744 * @param pVM Pointer to the VM.
3745  * @param   idCpu       The VCPU id.
3746 * @param pReq Pointer to the request packet.
3747 */
3748GMMR0DECL(int) GMMR0QueryMemoryStatsReq(PVM pVM, VMCPUID idCpu, PGMMMEMSTATSREQ pReq)
3749{
3750 /*
3751 * Validate input and pass it on.
3752 */
3753 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3754 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3755 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3756                     ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3757 VERR_INVALID_PARAMETER);
3758
3759 /*
3760 * Validate input and get the basics.
3761 */
3762 PGMM pGMM;
3763 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3764 PGVM pGVM;
3765 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3766 if (RT_FAILURE(rc))
3767 return rc;
3768
3769 /*
3770 * Take the semaphore and do some more validations.
3771 */
3772 gmmR0MutexAcquire(pGMM);
3773 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3774 {
3775 pReq->cAllocPages = pGVM->gmm.s.Stats.Allocated.cBasePages;
3776 pReq->cBalloonedPages = pGVM->gmm.s.Stats.cBalloonedPages;
3777 pReq->cMaxPages = pGVM->gmm.s.Stats.Reserved.cBasePages;
3778 pReq->cFreePages = pReq->cMaxPages - pReq->cAllocPages;
3779 }
3780 else
3781 rc = VERR_GMM_IS_NOT_SANE;
3782
3783 gmmR0MutexRelease(pGMM);
3784     LogFlow(("GMMR0QueryMemoryStatsReq: returns %Rrc\n", rc));
3785 return rc;
3786}
3787
3788
3789/**
3790 * Worker for gmmR0UnmapChunk and gmmR0FreeChunk.
3791 *
3792 * Don't call this in legacy allocation mode!
3793 *
3794 * @returns VBox status code.
3795 * @param pGMM Pointer to the GMM instance data.
3796 * @param pGVM Pointer to the Global VM structure.
3797 * @param pChunk Pointer to the chunk to be unmapped.
3798 */
3799static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
3800{
3801 Assert(!pGMM->fLegacyAllocationMode);
3802
3803 /*
3804 * Find the mapping and try unmapping it.
3805 */
3806 uint32_t cMappings = pChunk->cMappingsX;
3807 for (uint32_t i = 0; i < cMappings; i++)
3808 {
3809 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
3810 if (pChunk->paMappingsX[i].pGVM == pGVM)
3811 {
3812 /* unmap */
3813 int rc = RTR0MemObjFree(pChunk->paMappingsX[i].hMapObj, false /* fFreeMappings (NA) */);
3814 if (RT_SUCCESS(rc))
3815 {
3816 /* update the record. */
3817 cMappings--;
3818 if (i < cMappings)
3819 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
3820 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
3821 pChunk->paMappingsX[cMappings].pGVM = NULL;
3822 Assert(pChunk->cMappingsX - 1U == cMappings);
3823 pChunk->cMappingsX = cMappings;
3824 }
3825
3826 return rc;
3827 }
3828 }
3829
3830 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3831 return VERR_GMM_CHUNK_NOT_MAPPED;
3832}
3833
3834
3835/**
3836 * Unmaps a chunk previously mapped into the address space of the current process.
3837 *
3838 * @returns VBox status code.
3839 * @param pGMM Pointer to the GMM instance data.
3840 * @param pGVM Pointer to the Global VM structure.
3841 * @param pChunk Pointer to the chunk to be unmapped.
3842 */
3843static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3844{
3845 if (!pGMM->fLegacyAllocationMode)
3846 {
3847 /*
3848 * Lock the chunk and if possible leave the giant GMM lock.
3849 */
3850 GMMR0CHUNKMTXSTATE MtxState;
3851 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
3852 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
3853 if (RT_SUCCESS(rc))
3854 {
3855 rc = gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3856 gmmR0ChunkMutexRelease(&MtxState, pChunk);
3857 }
3858 return rc;
3859 }
3860
3861 if (pChunk->hGVM == pGVM->hSelf)
3862 return VINF_SUCCESS;
3863
3864 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x (legacy)\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3865 return VERR_GMM_CHUNK_NOT_MAPPED;
3866}
3867
3868
3869/**
3870 * Worker for gmmR0MapChunk.
3871 *
3872 * @returns VBox status code.
3873 * @param pGMM Pointer to the GMM instance data.
3874 * @param pGVM Pointer to the Global VM structure.
3875 * @param pChunk Pointer to the chunk to be mapped.
3876 * @param ppvR3 Where to store the ring-3 address of the mapping.
3877 *                          In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
3878 * contain the address of the existing mapping.
3879 */
3880static int gmmR0MapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
3881{
3882 /*
3883 * If we're in legacy mode this is simple.
3884 */
3885 if (pGMM->fLegacyAllocationMode)
3886 {
3887 if (pChunk->hGVM != pGVM->hSelf)
3888 {
3889 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
3890 return VERR_GMM_CHUNK_NOT_FOUND;
3891 }
3892
3893 *ppvR3 = RTR0MemObjAddressR3(pChunk->hMemObj);
3894 return VINF_SUCCESS;
3895 }
3896
3897 /*
3898 * Check to see if the chunk is already mapped.
3899 */
3900 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
3901 {
3902 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
3903 if (pChunk->paMappingsX[i].pGVM == pGVM)
3904 {
3905 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
3906 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
3907#ifdef VBOX_WITH_PAGE_SHARING
3908 /* The ring-3 chunk cache can be out of sync; don't fail. */
3909 return VINF_SUCCESS;
3910#else
3911 return VERR_GMM_CHUNK_ALREADY_MAPPED;
3912#endif
3913 }
3914 }
3915
3916 /*
3917 * Do the mapping.
3918 */
3919 RTR0MEMOBJ hMapObj;
3920 int rc = RTR0MemObjMapUser(&hMapObj, pChunk->hMemObj, (RTR3PTR)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
3921 if (RT_SUCCESS(rc))
3922 {
3923 /* reallocate the array? assumes few users per chunk (usually one). */
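        /* Growth pattern: the array is sized 1, 2, 3, 4 entries for the first few
           mappings and is then grown in steps of four (8, 12, 16, ...) whenever the
           mapping count reaches a multiple of four. */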
3924 unsigned iMapping = pChunk->cMappingsX;
3925 if ( iMapping <= 3
3926 || (iMapping & 3) == 0)
3927 {
3928 unsigned cNewSize = iMapping <= 3
3929 ? iMapping + 1
3930 : iMapping + 4;
3931 Assert(cNewSize < 4 || RT_ALIGN_32(cNewSize, 4) == cNewSize);
3932 if (RT_UNLIKELY(cNewSize > UINT16_MAX))
3933 {
3934 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
3935 return VERR_GMM_TOO_MANY_CHUNK_MAPPINGS;
3936 }
3937
3938 void *pvMappings = RTMemRealloc(pChunk->paMappingsX, cNewSize * sizeof(pChunk->paMappingsX[0]));
3939 if (RT_UNLIKELY(!pvMappings))
3940 {
3941 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
3942 return VERR_NO_MEMORY;
3943 }
3944 pChunk->paMappingsX = (PGMMCHUNKMAP)pvMappings;
3945 }
3946
3947 /* insert new entry */
3948 pChunk->paMappingsX[iMapping].hMapObj = hMapObj;
3949 pChunk->paMappingsX[iMapping].pGVM = pGVM;
3950 Assert(pChunk->cMappingsX == iMapping);
3951 pChunk->cMappingsX = iMapping + 1;
3952
3953 *ppvR3 = RTR0MemObjAddressR3(hMapObj);
3954 }
3955
3956 return rc;
3957}
3958
3959
3960/**
3961 * Maps a chunk into the user address space of the current process.
3962 *
3963 * @returns VBox status code.
3964 * @param pGMM Pointer to the GMM instance data.
3965 * @param pGVM Pointer to the Global VM structure.
3966 * @param pChunk Pointer to the chunk to be mapped.
3967 * @param fRelaxedSem Whether we can release the semaphore while doing the
3968 * mapping (@c true) or not.
3969 * @param ppvR3 Where to store the ring-3 address of the mapping.
3970 *                          In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
3971 * contain the address of the existing mapping.
3972 */
3973static int gmmR0MapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem, PRTR3PTR ppvR3)
3974{
3975 /*
3976 * Take the chunk lock and leave the giant GMM lock when possible, then
3977 * call the worker function.
3978 */
3979 GMMR0CHUNKMTXSTATE MtxState;
3980 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
3981 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
3982 if (RT_SUCCESS(rc))
3983 {
3984 rc = gmmR0MapChunkLocked(pGMM, pGVM, pChunk, ppvR3);
3985 gmmR0ChunkMutexRelease(&MtxState, pChunk);
3986 }
3987
3988 return rc;
3989}
3990
3991
3992
3993#if defined(VBOX_WITH_PAGE_SHARING) || (defined(VBOX_STRICT) && HC_ARCH_BITS == 64)
3994/**
3995 * Check if a chunk is mapped into the specified VM
3996 *
3997 * @returns mapped yes/no
3998 * @param pGMM Pointer to the GMM instance.
3999 * @param pGVM Pointer to the Global VM structure.
4000 * @param pChunk Pointer to the chunk to be mapped.
4001 * @param ppvR3 Where to store the ring-3 address of the mapping.
4002 */
4003static bool gmmR0IsChunkMapped(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4004{
4005 GMMR0CHUNKMTXSTATE MtxState;
4006 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
4007 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4008 {
4009 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4010 if (pChunk->paMappingsX[i].pGVM == pGVM)
4011 {
4012 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4013 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4014 return true;
4015 }
4016 }
4017 *ppvR3 = NULL;
4018 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4019 return false;
4020}
4021#endif /* VBOX_WITH_PAGE_SHARING || (VBOX_STRICT && 64-BIT) */
4022
4023
4024/**
4025 * Map a chunk and/or unmap another chunk.
4026 *
4027  * The mapping and unmapping apply to the current process.
4028 *
4029  * This API does two things because it saves a kernel call per mapping
4030 * when the ring-3 mapping cache is full.
4031 *
4032 * @returns VBox status code.
4033 * @param pVM The VM.
4034 * @param idChunkMap The chunk to map. NIL_GMM_CHUNKID if nothing to map.
4035 * @param idChunkUnmap The chunk to unmap. NIL_GMM_CHUNKID if nothing to unmap.
4036 * @param ppvR3 Where to store the address of the mapped chunk. NULL is ok if nothing to map.
4037 * @thread EMT
4038 */
4039GMMR0DECL(int) GMMR0MapUnmapChunk(PVM pVM, uint32_t idChunkMap, uint32_t idChunkUnmap, PRTR3PTR ppvR3)
4040{
4041 LogFlow(("GMMR0MapUnmapChunk: pVM=%p idChunkMap=%#x idChunkUnmap=%#x ppvR3=%p\n",
4042 pVM, idChunkMap, idChunkUnmap, ppvR3));
4043
4044 /*
4045 * Validate input and get the basics.
4046 */
4047 PGMM pGMM;
4048 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4049 PGVM pGVM;
4050 int rc = GVMMR0ByVM(pVM, &pGVM);
4051 if (RT_FAILURE(rc))
4052 return rc;
4053
4054 AssertCompile(NIL_GMM_CHUNKID == 0);
4055 AssertMsgReturn(idChunkMap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkMap), VERR_INVALID_PARAMETER);
4056 AssertMsgReturn(idChunkUnmap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkUnmap), VERR_INVALID_PARAMETER);
4057
4058 if ( idChunkMap == NIL_GMM_CHUNKID
4059 && idChunkUnmap == NIL_GMM_CHUNKID)
4060 return VERR_INVALID_PARAMETER;
4061
4062 if (idChunkMap != NIL_GMM_CHUNKID)
4063 {
4064 AssertPtrReturn(ppvR3, VERR_INVALID_POINTER);
4065 *ppvR3 = NIL_RTR3PTR;
4066 }
4067
4068 /*
4069 * Take the semaphore and do the work.
4070 *
4071 * The unmapping is done last since it's easier to undo a mapping than
4072      * to undo an unmapping. The ring-3 mapping cache cannot be so big
4073      * that it pushes the user virtual address space to within a chunk of
4074      * its limits, so no problem here.
4075 */
4076 gmmR0MutexAcquire(pGMM);
4077 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4078 {
4079 PGMMCHUNK pMap = NULL;
4080         if (idChunkMap != NIL_GMM_CHUNKID)
4081 {
4082 pMap = gmmR0GetChunk(pGMM, idChunkMap);
4083 if (RT_LIKELY(pMap))
4084 rc = gmmR0MapChunk(pGMM, pGVM, pMap, true /*fRelaxedSem*/, ppvR3);
4085 else
4086 {
4087 Log(("GMMR0MapUnmapChunk: idChunkMap=%#x\n", idChunkMap));
4088 rc = VERR_GMM_CHUNK_NOT_FOUND;
4089 }
4090 }
4091/** @todo split this operation, the bail out might (theoretically) not be
4092 * entirely safe. */
4093
4094 if ( idChunkUnmap != NIL_GMM_CHUNKID
4095 && RT_SUCCESS(rc))
4096 {
4097 PGMMCHUNK pUnmap = gmmR0GetChunk(pGMM, idChunkUnmap);
4098 if (RT_LIKELY(pUnmap))
4099 rc = gmmR0UnmapChunk(pGMM, pGVM, pUnmap, true /*fRelaxedSem*/);
4100 else
4101 {
4102 Log(("GMMR0MapUnmapChunk: idChunkUnmap=%#x\n", idChunkUnmap));
4103 rc = VERR_GMM_CHUNK_NOT_FOUND;
4104 }
4105
4106 if (RT_FAILURE(rc) && pMap)
4107 gmmR0UnmapChunk(pGMM, pGVM, pMap, false /*fRelaxedSem*/);
4108 }
4109
4110 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4111 }
4112 else
4113 rc = VERR_GMM_IS_NOT_SANE;
4114 gmmR0MutexRelease(pGMM);
4115
4116 LogFlow(("GMMR0MapUnmapChunk: returns %Rrc\n", rc));
4117 return rc;
4118}
4119
4120
4121/**
4122 * VMMR0 request wrapper for GMMR0MapUnmapChunk.
4123 *
4124 * @returns see GMMR0MapUnmapChunk.
4125 * @param pVM Pointer to the VM.
4126 * @param pReq Pointer to the request packet.
4127 */
4128GMMR0DECL(int) GMMR0MapUnmapChunkReq(PVM pVM, PGMMMAPUNMAPCHUNKREQ pReq)
4129{
4130 /*
4131 * Validate input and pass it on.
4132 */
4133 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4134 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4135 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4136
4137 return GMMR0MapUnmapChunk(pVM, pReq->idChunkMap, pReq->idChunkUnmap, &pReq->pvR3);
4138}
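/* Illustrative usage sketch -- not part of the original file. A combined map+unmap
   request of the kind issued when the ring-3 mapping cache evicts one chunk to make
   room for another; idChunkNeeded and idChunkEvicted are hypothetical names, the
   fields are the ones validated and forwarded above.
   @code
        GMMMAPUNMAPCHUNKREQ Req;
        Req.Hdr.cbReq    = sizeof(Req);
        Req.idChunkMap   = idChunkNeeded;   /* chunk to map, or NIL_GMM_CHUNKID for unmap-only */
        Req.idChunkUnmap = idChunkEvicted;  /* chunk to drop, or NIL_GMM_CHUNKID for map-only */
        Req.pvR3         = NIL_RTR3PTR;     /* receives the ring-3 address of the newly mapped chunk */
        /* On success Req.pvR3 points at the GMM_CHUNK_SIZE sized mapping of idChunkNeeded. */
   @endcode */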
4139
4140
4141/**
4142 * Legacy mode API for supplying pages.
4143 *
4144  * The specified user address points to an allocation chunk sized block that
4145  * will be locked down and used by the GMM when the VM asks for pages.
4146 *
4147 * @returns VBox status code.
4148 * @param pVM Pointer to the VM.
4149 * @param idCpu The VCPU id.
4150 * @param pvR3 Pointer to the chunk size memory block to lock down.
4151 */
4152GMMR0DECL(int) GMMR0SeedChunk(PVM pVM, VMCPUID idCpu, RTR3PTR pvR3)
4153{
4154 /*
4155 * Validate input and get the basics.
4156 */
4157 PGMM pGMM;
4158 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4159 PGVM pGVM;
4160 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4161 if (RT_FAILURE(rc))
4162 return rc;
4163
4164 AssertPtrReturn(pvR3, VERR_INVALID_POINTER);
4165 AssertReturn(!(PAGE_OFFSET_MASK & pvR3), VERR_INVALID_POINTER);
4166
4167 if (!pGMM->fLegacyAllocationMode)
4168 {
4169 Log(("GMMR0SeedChunk: not in legacy allocation mode!\n"));
4170 return VERR_NOT_SUPPORTED;
4171 }
4172
4173 /*
4174 * Lock the memory and add it as new chunk with our hGVM.
4175 * (The GMM locking is done inside gmmR0RegisterChunk.)
4176 */
4177 RTR0MEMOBJ MemObj;
4178 rc = RTR0MemObjLockUser(&MemObj, pvR3, GMM_CHUNK_SIZE, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4179 if (RT_SUCCESS(rc))
4180 {
4181 rc = gmmR0RegisterChunk(pGMM, &pGVM->gmm.s.Private, MemObj, pGVM->hSelf, 0 /*fChunkFlags*/, NULL);
4182 if (RT_SUCCESS(rc))
4183 gmmR0MutexRelease(pGMM);
4184 else
4185 RTR0MemObjFree(MemObj, false /* fFreeMappings */);
4186 }
4187
4188 LogFlow(("GMMR0SeedChunk: rc=%d (pvR3=%p)\n", rc, pvR3));
4189 return rc;
4190}
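/* Conceptual usage sketch -- not part of the original file. In legacy allocation
   mode ring-3 pre-allocates a page-aligned, chunk-sized block and seeds it before
   requesting pages; pvSeed is hypothetical and the real call travels through the
   VMMR0 request interface rather than being a direct function call.
   @code
        void *pvSeed = RTMemPageAllocZ(GMM_CHUNK_SIZE);   /* page aligned, exactly one chunk */
        if (pvSeed)
            rc = GMMR0SeedChunk(pVM, idCpu, (RTR3PTR)(uintptr_t)pvSeed);
        /* Ring-0 locks the block down and registers it as a chunk bound to this VM. */
   @endcode */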
4191
4192#ifdef VBOX_WITH_PAGE_SHARING
4193
4194# ifdef VBOX_STRICT
4195/**
4196 * For checksumming shared pages in strict builds.
4197 *
4198 * The purpose is to make sure that a page doesn't change.
4199 *
4200 * @returns Checksum, 0 on failure.
4201 * @param   pGMM        The GMM instance data.
4202 * @param idPage The page ID.
4203 */
4204static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage)
4205{
4206 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
4207 AssertMsgReturn(pChunk, ("idPage=%#x\n", idPage), 0);
4208
4209 uint8_t *pbChunk;
4210 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4211 return 0;
4212 uint8_t const *pbPage = pbChunk + ((idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4213
4214 return RTCrc32(pbPage, PAGE_SIZE);
4215}
4216# endif /* VBOX_STRICT */
4217
4218
4219/**
4220 * Calculates the module hash value.
4221 *
4222 * @returns Hash value.
4223 * @param pszModuleName The module name.
4224 * @param pszVersion The module version string.
4225 */
4226static uint32_t gmmR0ShModCalcHash(const char *pszModuleName, const char *pszVersion)
4227{
4228 return RTStrHash1ExN(3, pszModuleName, RTSTR_MAX, "::", (size_t)2, pszVersion, RTSTR_MAX);
4229}
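/* Example -- not part of the original file: for a module "kernel32.dll" with version
   "5.1.2600.5512" the three parts above are hashed as the single string
   "kernel32.dll::5.1.2600.5512", so name and version together select the hash bucket. */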
4230
4231
4232/**
4233 * Finds a global module.
4234 *
4235 * @returns Pointer to the global module on success, NULL if not found.
4236 * @param pGMM The GMM instance data.
4237 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4238 * @param cbModule The module size.
4239 * @param enmGuestOS The guest OS type.
4240 * @param pszModuleName The module name.
4241 * @param pszVersion The module version.
4242 */
4243static PGMMSHAREDMODULE gmmR0ShModFindGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4244 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4245 struct VMMDEVSHAREDREGIONDESC const *paRegions)
4246{
4247 for (PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTAvllU32Get(&pGMM->pGlobalSharedModuleTree, uHash);
4248 pGblMod;
4249 pGblMod = (PGMMSHAREDMODULE)pGblMod->Core.pList)
4250 {
4251 if (pGblMod->cbModule != cbModule)
4252 continue;
4253 if (pGblMod->enmGuestOS != enmGuestOS)
4254 continue;
4255 if (pGblMod->cRegions != cRegions)
4256 continue;
4257 if (strcmp(pGblMod->szName, pszModuleName))
4258 continue;
4259 if (strcmp(pGblMod->szVersion, pszVersion))
4260 continue;
4261
4262 uint32_t i;
4263 for (i = 0; i < cRegions; i++)
4264 {
4265 uint32_t off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4266 if (pGblMod->aRegions[i].off != off)
4267 break;
4268
4269 uint32_t cb = RT_ALIGN_32(paRegions[i].cbRegion + off, PAGE_SIZE);
4270 if (pGblMod->aRegions[i].cb != cb)
4271 break;
4272 }
4273
4274 if (i == cRegions)
4275 return pGblMod;
4276 }
4277
4278 return NULL;
4279}
4280
4281
4282/**
4283 * Creates a new global module.
4284 *
4285 * @returns VBox status code.
4286 * @param pGMM The GMM instance data.
4287 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4288 * @param cbModule The module size.
4289 * @param enmGuestOS The guest OS type.
4290 * @param cRegions The number of regions.
4291 * @param pszModuleName The module name.
4292 * @param pszVersion The module version.
4293 * @param paRegions The region descriptions.
4294 * @param ppGblMod Where to return the new module on success.
4295 */
4296static int gmmR0ShModNewGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4297 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4298 struct VMMDEVSHAREDREGIONDESC const *paRegions, PGMMSHAREDMODULE *ppGblMod)
4299{
4300 Log(("gmmR0ShModNewGlobal: %s %s size %#x os %u rgn %u\n", pszModuleName, pszVersion, cbModule, cRegions));
4301 if (pGMM->cShareableModules >= GMM_MAX_SHARED_GLOBAL_MODULES)
4302 {
4303 Log(("gmmR0ShModNewGlobal: Too many modules\n"));
4304 return VERR_GMM_TOO_MANY_GLOBAL_MODULES;
4305 }
4306
4307 PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTMemAllocZ(RT_OFFSETOF(GMMSHAREDMODULE, aRegions[cRegions]));
4308 if (!pGblMod)
4309 {
4310 Log(("gmmR0ShModNewGlobal: No memory\n"));
4311 return VERR_NO_MEMORY;
4312 }
4313
4314 pGblMod->Core.Key = uHash;
4315 pGblMod->cbModule = cbModule;
4316 pGblMod->cRegions = cRegions;
4317 pGblMod->cUsers = 1;
4318 pGblMod->enmGuestOS = enmGuestOS;
4319 strcpy(pGblMod->szName, pszModuleName);
4320 strcpy(pGblMod->szVersion, pszVersion);
4321
4322 for (uint32_t i = 0; i < cRegions; i++)
4323 {
4324 Log(("gmmR0ShModNewGlobal: rgn[%u]=%RGvLB%#x\n", i, paRegions[i].GCRegionAddr, paRegions[i].cbRegion));
4325 pGblMod->aRegions[i].off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4326 pGblMod->aRegions[i].cb = paRegions[i].cbRegion + pGblMod->aRegions[i].off;
4327 pGblMod->aRegions[i].cb = RT_ALIGN_32(pGblMod->aRegions[i].cb, PAGE_SIZE);
4328 pGblMod->aRegions[i].paidPages = NULL; /* allocated when needed. */
4329 }
4330
4331 bool fInsert = RTAvllU32Insert(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4332 Assert(fInsert); NOREF(fInsert);
4333 pGMM->cShareableModules++;
4334
4335 *ppGblMod = pGblMod;
4336 return VINF_SUCCESS;
4337}
4338
4339
4340/**
4341 * Deletes a global module which is no longer referenced by anyone.
4342 *
4343 * @param pGMM The GMM instance data.
4344 * @param pGblMod The module to delete.
4345 */
4346static void gmmR0ShModDeleteGlobal(PGMM pGMM, PGMMSHAREDMODULE pGblMod)
4347{
4348 Assert(pGblMod->cUsers == 0);
4349 Assert(pGMM->cShareableModules > 0 && pGMM->cShareableModules <= GMM_MAX_SHARED_GLOBAL_MODULES);
4350
4351 void *pvTest = RTAvllU32RemoveNode(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4352 Assert(pvTest == pGblMod); NOREF(pvTest);
4353 pGMM->cShareableModules--;
4354
4355 uint32_t i = pGblMod->cRegions;
4356 while (i-- > 0)
4357 {
4358 if (pGblMod->aRegions[i].paidPages)
4359 {
4360             /* We don't do anything to the pages as they are handled by the
4361 copy-on-write mechanism in PGM. */
4362 RTMemFree(pGblMod->aRegions[i].paidPages);
4363 pGblMod->aRegions[i].paidPages = NULL;
4364 }
4365 }
4366 RTMemFree(pGblMod);
4367}
4368
4369
4370static int gmmR0ShModNewPerVM(PGVM pGVM, RTGCPTR GCBaseAddr, uint32_t cRegions, const VMMDEVSHAREDREGIONDESC *paRegions,
4371 PGMMSHAREDMODULEPERVM *ppRecVM)
4372{
4373 if (pGVM->gmm.s.Stats.cShareableModules >= GMM_MAX_SHARED_PER_VM_MODULES)
4374 return VERR_GMM_TOO_MANY_PER_VM_MODULES;
4375
4376 PGMMSHAREDMODULEPERVM pRecVM;
4377 pRecVM = (PGMMSHAREDMODULEPERVM)RTMemAllocZ(RT_OFFSETOF(GMMSHAREDMODULEPERVM, aRegionsGCPtrs[cRegions]));
4378 if (!pRecVM)
4379 return VERR_NO_MEMORY;
4380
4381 pRecVM->Core.Key = GCBaseAddr;
4382 for (uint32_t i = 0; i < cRegions; i++)
4383 pRecVM->aRegionsGCPtrs[i] = paRegions[i].GCRegionAddr;
4384
4385 bool fInsert = RTAvlGCPtrInsert(&pGVM->gmm.s.pSharedModuleTree, &pRecVM->Core);
4386 Assert(fInsert); NOREF(fInsert);
4387 pGVM->gmm.s.Stats.cShareableModules++;
4388
4389 *ppRecVM = pRecVM;
4390 return VINF_SUCCESS;
4391}
4392
4393
4394static void gmmR0ShModDeletePerVM(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULEPERVM pRecVM, bool fRemove)
4395{
4396 /*
4397 * Free the per-VM module.
4398 */
4399 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
4400 pRecVM->pGlobalModule = NULL;
4401
4402 if (fRemove)
4403 {
4404 void *pvTest = RTAvlGCPtrRemove(&pGVM->gmm.s.pSharedModuleTree, pRecVM->Core.Key);
4405 Assert(pvTest == &pRecVM->Core);
4406 }
4407
4408 RTMemFree(pRecVM);
4409
4410 /*
4411 * Release the global module.
4412 * (In the registration bailout case, it might not be.)
4413 */
4414 if (pGblMod)
4415 {
4416 Assert(pGblMod->cUsers > 0);
4417 pGblMod->cUsers--;
4418 if (pGblMod->cUsers == 0)
4419 gmmR0ShModDeleteGlobal(pGMM, pGblMod);
4420 }
4421}
4422
4423#endif /* VBOX_WITH_PAGE_SHARING */
4424
4425/**
4426 * Registers a new shared module for the VM.
4427 *
4428 * @returns VBox status code.
4429 * @param pVM Pointer to the VM.
4430 * @param idCpu The VCPU id.
4431 * @param enmGuestOS The guest OS type.
4432 * @param pszModuleName The module name.
4433 * @param pszVersion The module version.
4434 * @param GCPtrModBase The module base address.
4435 * @param cbModule The module size.
4436  * @param   cRegions        The number of shared region descriptors.
4437 * @param paRegions Pointer to an array of shared region(s).
4438 */
4439GMMR0DECL(int) GMMR0RegisterSharedModule(PVM pVM, VMCPUID idCpu, VBOXOSFAMILY enmGuestOS, char *pszModuleName,
4440 char *pszVersion, RTGCPTR GCPtrModBase, uint32_t cbModule,
4441 uint32_t cRegions, struct VMMDEVSHAREDREGIONDESC const *paRegions)
4442{
4443#ifdef VBOX_WITH_PAGE_SHARING
4444 /*
4445 * Validate input and get the basics.
4446 *
4447      * Note! Turns out the module size does not necessarily match the size of the
4448 * regions. (iTunes on XP)
4449 */
4450 PGMM pGMM;
4451 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4452 PGVM pGVM;
4453 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4454 if (RT_FAILURE(rc))
4455 return rc;
4456
4457 if (RT_UNLIKELY(cRegions > VMMDEVSHAREDREGIONDESC_MAX))
4458 return VERR_GMM_TOO_MANY_REGIONS;
4459
4460 if (RT_UNLIKELY(cbModule == 0 || cbModule > _1G))
4461 return VERR_GMM_BAD_SHARED_MODULE_SIZE;
4462
4463 uint32_t cbTotal = 0;
4464 for (uint32_t i = 0; i < cRegions; i++)
4465 {
4466 if (RT_UNLIKELY(paRegions[i].cbRegion == 0 || paRegions[i].cbRegion > _1G))
4467 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4468
4469 cbTotal += paRegions[i].cbRegion;
4470 if (RT_UNLIKELY(cbTotal > _1G))
4471 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4472 }
4473
4474 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
4475 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
4476 return VERR_GMM_MODULE_NAME_TOO_LONG;
4477
4478 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
4479 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
4480 return VERR_GMM_MODULE_NAME_TOO_LONG;
4481
4482 uint32_t const uHash = gmmR0ShModCalcHash(pszModuleName, pszVersion);
4483 Log(("GMMR0RegisterSharedModule %s %s base %RGv size %x hash %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule, uHash));
4484
4485 /*
4486 * Take the semaphore and do some more validations.
4487 */
4488 gmmR0MutexAcquire(pGMM);
4489 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4490 {
4491 /*
4492 * Check if this module is already locally registered and register
4493 * it if it isn't. The base address is a unique module identifier
4494 * locally.
4495 */
4496 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
4497 bool fNewModule = pRecVM == NULL;
4498 if (fNewModule)
4499 {
4500 rc = gmmR0ShModNewPerVM(pGVM, GCPtrModBase, cRegions, paRegions, &pRecVM);
4501 if (RT_SUCCESS(rc))
4502 {
4503 /*
4504 * Find a matching global module, register a new one if needed.
4505 */
4506 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4507 pszModuleName, pszVersion, paRegions);
4508 if (!pGblMod)
4509 {
4510 Assert(fNewModule);
4511 rc = gmmR0ShModNewGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4512 pszModuleName, pszVersion, paRegions, &pGblMod);
4513 if (RT_SUCCESS(rc))
4514 {
4515                             pRecVM->pGlobalModule = pGblMod; /* (One reference returned by gmmR0ShModNewGlobal.) */
4516 Log(("GMMR0RegisterSharedModule: new module %s %s\n", pszModuleName, pszVersion));
4517 }
4518 else
4519 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
4520 }
4521 else
4522 {
4523 Assert(pGblMod->cUsers > 0 && pGblMod->cUsers < UINT32_MAX / 2);
4524 pGblMod->cUsers++;
4525 pRecVM->pGlobalModule = pGblMod;
4526
4527 Log(("GMMR0RegisterSharedModule: new per vm module %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4528 }
4529 }
4530 }
4531 else
4532 {
4533 /*
4534 * Attempt to re-register an existing module.
4535 */
4536 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4537 pszModuleName, pszVersion, paRegions);
4538 if (pRecVM->pGlobalModule == pGblMod)
4539 {
4540 Log(("GMMR0RegisterSharedModule: already registered %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4541 rc = VINF_GMM_SHARED_MODULE_ALREADY_REGISTERED;
4542 }
4543 else
4544 {
4545 /** @todo may have to unregister+register when this happens in case it's caused
4546 * by VBoxService crashing and being restarted... */
4547 Log(("GMMR0RegisterSharedModule: Address clash!\n"
4548 " incoming at %RGvLB%#x %s %s rgns %u\n"
4549 " existing at %RGvLB%#x %s %s rgns %u\n",
4550 GCPtrModBase, cbModule, pszModuleName, pszVersion, cRegions,
4551 pRecVM->Core.Key, pRecVM->pGlobalModule->cbModule, pRecVM->pGlobalModule->szName,
4552 pRecVM->pGlobalModule->szVersion, pRecVM->pGlobalModule->cRegions));
4553 rc = VERR_GMM_SHARED_MODULE_ADDRESS_CLASH;
4554 }
4555 }
4556 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4557 }
4558 else
4559 rc = VERR_GMM_IS_NOT_SANE;
4560
4561 gmmR0MutexRelease(pGMM);
4562 return rc;
4563#else
4564
4565 NOREF(pVM); NOREF(idCpu); NOREF(enmGuestOS); NOREF(pszModuleName); NOREF(pszVersion);
4566 NOREF(GCPtrModBase); NOREF(cbModule); NOREF(cRegions); NOREF(paRegions);
4567 return VERR_NOT_IMPLEMENTED;
4568#endif
4569}
4570
4571
4572/**
4573 * VMMR0 request wrapper for GMMR0RegisterSharedModule.
4574 *
4575 * @returns see GMMR0RegisterSharedModule.
4576 * @param pVM Pointer to the VM.
4577 * @param idCpu The VCPU id.
4578 * @param pReq Pointer to the request packet.
4579 */
4580GMMR0DECL(int) GMMR0RegisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMREGISTERSHAREDMODULEREQ pReq)
4581{
4582 /*
4583 * Validate input and pass it on.
4584 */
4585 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4586 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4587 AssertMsgReturn(pReq->Hdr.cbReq >= sizeof(*pReq) && pReq->Hdr.cbReq == RT_UOFFSETOF(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions]), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4588
4589 /* Pass back return code in the request packet to preserve informational codes. (VMMR3CallR0 chokes on them) */
4590 pReq->rc = GMMR0RegisterSharedModule(pVM, idCpu, pReq->enmGuestOS, pReq->szName, pReq->szVersion,
4591 pReq->GCBaseAddr, pReq->cbModule, pReq->cRegions, pReq->aRegions);
4592 return VINF_SUCCESS;
4593}
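/* Illustrative usage sketch -- not part of the original file. Minimal setup of a
   GMMREGISTERSHAREDMODULEREQ for a module with a single shareable region; the names,
   addresses and the guest OS family value are hypothetical, the fields are the ones
   validated and forwarded above (including the rc member used for the result).
   @code
        uint32_t cbReq = RT_UOFFSETOF(GMMREGISTERSHAREDMODULEREQ, aRegions[1]);
        PGMMREGISTERSHAREDMODULEREQ pReq = (PGMMREGISTERSHAREDMODULEREQ)RTMemAllocZ(cbReq);
        pReq->Hdr.cbReq                = cbReq;
        pReq->enmGuestOS               = enmGuestOSFamily;  /* some VBOXOSFAMILY value reported by the guest */
        pReq->GCBaseAddr               = GCPtrModBase;      /* module base address in the guest */
        pReq->cbModule                 = cbModule;
        pReq->cRegions                 = 1;
        pReq->aRegions[0].GCRegionAddr = GCPtrText;         /* e.g. the read-only text section */
        pReq->aRegions[0].cbRegion     = cbText;
        strcpy(pReq->szName,    "ntdll.dll");               /* NUL terminated, length-checked above */
        strcpy(pReq->szVersion, "6.1.7600.16385");
        /* Dispatched to ring-0 this lands in GMMR0RegisterSharedModuleReq; the status,
           including informational ones, comes back in pReq->rc. */
   @endcode */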
4594
4595
4596/**
4597 * Unregisters a shared module for the VM
4598 *
4599 * @returns VBox status code.
4600 * @param pVM Pointer to the VM.
4601 * @param idCpu The VCPU id.
4602 * @param pszModuleName The module name.
4603 * @param pszVersion The module version.
4604 * @param GCPtrModBase The module base address.
4605 * @param cbModule The module size.
4606 */
4607GMMR0DECL(int) GMMR0UnregisterSharedModule(PVM pVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion,
4608 RTGCPTR GCPtrModBase, uint32_t cbModule)
4609{
4610#ifdef VBOX_WITH_PAGE_SHARING
4611 /*
4612 * Validate input and get the basics.
4613 */
4614 PGMM pGMM;
4615 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4616 PGVM pGVM;
4617 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4618 if (RT_FAILURE(rc))
4619 return rc;
4620
4621 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
4622 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
4623 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
4624 return VERR_GMM_MODULE_NAME_TOO_LONG;
4625 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
4626 return VERR_GMM_MODULE_NAME_TOO_LONG;
4627
4628 Log(("GMMR0UnregisterSharedModule %s %s base=%RGv size %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule));
4629
4630 /*
4631 * Take the semaphore and do some more validations.
4632 */
4633 gmmR0MutexAcquire(pGMM);
4634 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4635 {
4636 /*
4637 * Locate and remove the specified module.
4638 */
4639 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
4640 if (pRecVM)
4641 {
4642 /** @todo Do we need to do more validations here, like that the
4643 * name + version + cbModule matches? */
4644 Assert(pRecVM->pGlobalModule);
4645 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
4646 }
4647 else
4648 rc = VERR_GMM_SHARED_MODULE_NOT_FOUND;
4649
4650 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4651 }
4652 else
4653 rc = VERR_GMM_IS_NOT_SANE;
4654
4655 gmmR0MutexRelease(pGMM);
4656 return rc;
4657#else
4658
4659 NOREF(pVM); NOREF(idCpu); NOREF(pszModuleName); NOREF(pszVersion); NOREF(GCPtrModBase); NOREF(cbModule);
4660 return VERR_NOT_IMPLEMENTED;
4661#endif
4662}
4663
4664
4665/**
4666 * VMMR0 request wrapper for GMMR0UnregisterSharedModule.
4667 *
4668 * @returns see GMMR0UnregisterSharedModule.
4669 * @param pVM Pointer to the VM.
4670 * @param idCpu The VCPU id.
4671 * @param pReq Pointer to the request packet.
4672 */
4673GMMR0DECL(int) GMMR0UnregisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMUNREGISTERSHAREDMODULEREQ pReq)
4674{
4675 /*
4676 * Validate input and pass it on.
4677 */
4678 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4679 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4680 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4681
4682 return GMMR0UnregisterSharedModule(pVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule);
4683}
4684
4685#ifdef VBOX_WITH_PAGE_SHARING
4686
4687/**
4688 * Increase the use count of a shared page, the page is known to exist and be valid and such.
4689 *
4690 * @param pGMM Pointer to the GMM instance.
4691 * @param pGVM Pointer to the GVM instance.
4692 * @param pPage The page structure.
4693 */
4694DECLINLINE(void) gmmR0UseSharedPage(PGMM pGMM, PGVM pGVM, PGMMPAGE pPage)
4695{
4696 Assert(pGMM->cSharedPages > 0);
4697 Assert(pGMM->cAllocatedPages > 0);
4698
4699 pGMM->cDuplicatePages++;
4700
4701 pPage->Shared.cRefs++;
4702 pGVM->gmm.s.Stats.cSharedPages++;
4703 pGVM->gmm.s.Stats.Allocated.cBasePages++;
4704}
4705
4706
4707/**
4708 * Converts a private page to a shared page, the page is known to exist and be valid and such.
4709 *
4710 * @param pGMM Pointer to the GMM instance.
4711 * @param pGVM Pointer to the GVM instance.
4712 * @param HCPhys Host physical address
4713 * @param idPage The Page ID
4714 * @param pPage The page structure.
4715 */
4716DECLINLINE(void) gmmR0ConvertToSharedPage(PGMM pGMM, PGVM pGVM, RTHCPHYS HCPhys, uint32_t idPage, PGMMPAGE pPage,
4717 PGMMSHAREDPAGEDESC pPageDesc)
4718{
4719 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
4720 Assert(pChunk);
4721 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
4722 Assert(GMM_PAGE_IS_PRIVATE(pPage));
4723
4724 pChunk->cPrivate--;
4725 pChunk->cShared++;
4726
4727 pGMM->cSharedPages++;
4728
4729 pGVM->gmm.s.Stats.cSharedPages++;
4730 pGVM->gmm.s.Stats.cPrivatePages--;
4731
4732 /* Modify the page structure. */
4733 pPage->Shared.pfn = (uint32_t)(uint64_t)(HCPhys >> PAGE_SHIFT);
4734 pPage->Shared.cRefs = 1;
4735#ifdef VBOX_STRICT
4736 pPageDesc->u32StrictChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
4737 pPage->Shared.u14Checksum = pPageDesc->u32StrictChecksum;
4738#else
4739 pPage->Shared.u14Checksum = 0;
4740#endif
4741 pPage->Shared.u2State = GMM_PAGE_STATE_SHARED;
4742}
4743
4744
4745static int gmmR0SharedModuleCheckPageFirstTime(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULE pModule,
4746 unsigned idxRegion, unsigned idxPage,
4747 PGMMSHAREDPAGEDESC pPageDesc, PGMMSHAREDREGIONDESC pGlobalRegion)
4748{
4749 /* Easy case: just change the internal page type. */
4750 PGMMPAGE pPage = gmmR0GetPage(pGMM, pPageDesc->idPage);
4751 AssertMsgReturn(pPage, ("idPage=%#x (GCPhys=%RGp HCPhys=%RHp idxRegion=%#x idxPage=%#x) #1\n",
4752 pPageDesc->idPage, pPageDesc->GCPhys, pPageDesc->HCPhys, idxRegion, idxPage),
4753 VERR_PGM_PHYS_INVALID_PAGE_ID);
4754
4755     AssertMsg(pPageDesc->GCPhys == (pPage->Private.pfn << 12), ("desc %RGp gmm %RGp\n", pPageDesc->GCPhys, (pPage->Private.pfn << 12)));
4756
4757 gmmR0ConvertToSharedPage(pGMM, pGVM, pPageDesc->HCPhys, pPageDesc->idPage, pPage, pPageDesc);
4758
4759 /* Keep track of these references. */
4760 pGlobalRegion->paidPages[idxPage] = pPageDesc->idPage;
4761
4762 return VINF_SUCCESS;
4763}
4764
4765/**
4766  * Checks the specified shared module range for changes
4767 *
4768 * Performs the following tasks:
4769 * - If a shared page is new, then it changes the GMM page type to shared and
4770 * returns it in the pPageDesc descriptor.
4771 * - If a shared page already exists, then it checks if the VM page is
4772 * identical and if so frees the VM page and returns the shared page in
4773 * pPageDesc descriptor.
4774 *
4775 * @remarks ASSUMES the caller has acquired the GMM semaphore!!
4776 *
4777 * @returns VBox status code.
4778 * @param pGMM Pointer to the GMM instance data.
4779 * @param pGVM Pointer to the GVM instance data.
4780 * @param pModule Module description
4781 * @param idxRegion Region index
4782 * @param idxPage Page index
4783  * @param   pPageDesc       Page descriptor
4784 */
4785GMMR0DECL(int) GMMR0SharedModuleCheckPage(PGVM pGVM, PGMMSHAREDMODULE pModule, uint32_t idxRegion, uint32_t idxPage,
4786 PGMMSHAREDPAGEDESC pPageDesc)
4787{
4788 int rc;
4789 PGMM pGMM;
4790 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4791 pPageDesc->u32StrictChecksum = 0;
4792
4793 AssertMsgReturn(idxRegion < pModule->cRegions,
4794 ("idxRegion=%#x cRegions=%#x %s %s\n", idxRegion, pModule->cRegions, pModule->szName, pModule->szVersion),
4795 VERR_INVALID_PARAMETER);
4796
4797 uint32_t const cPages = pModule->aRegions[idxRegion].cb >> PAGE_SHIFT;
4798 AssertMsgReturn(idxPage < cPages,
4799                     ("idxPage=%#x cPages=%#x %s %s\n", idxPage, cPages, pModule->szName, pModule->szVersion),
4800 VERR_INVALID_PARAMETER);
4801
4802 LogFlow(("GMMR0SharedModuleCheckRange %s base %RGv region %d idxPage %d\n", pModule->szName, pModule->Core.Key, idxRegion, idxPage));
4803
4804 /*
4805 * First time; create a page descriptor array.
4806 */
4807 PGMMSHAREDREGIONDESC pGlobalRegion = &pModule->aRegions[idxRegion];
4808 if (!pGlobalRegion->paidPages)
4809 {
4810 Log(("Allocate page descriptor array for %d pages\n", cPages));
4811 pGlobalRegion->paidPages = (uint32_t *)RTMemAlloc(cPages * sizeof(pGlobalRegion->paidPages[0]));
4812 AssertReturn(pGlobalRegion->paidPages, VERR_NO_MEMORY);
4813
4814 /* Invalidate all descriptors. */
4815 uint32_t i = cPages;
4816 while (i-- > 0)
4817 pGlobalRegion->paidPages[i] = NIL_GMM_PAGEID;
4818 }
4819
4820 /*
4821 * We've seen this shared page for the first time?
4822 */
4823 if (pGlobalRegion->paidPages[idxPage] == NIL_GMM_PAGEID)
4824 {
4825 Log(("New shared page guest %RGp host %RHp\n", pPageDesc->GCPhys, pPageDesc->HCPhys));
4826 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
4827 }
4828
4829 /*
4830 * We've seen it before...
4831 */
4832 Log(("Replace existing page guest %RGp host %RHp id %#x -> id %#x\n",
4833 pPageDesc->GCPhys, pPageDesc->HCPhys, pPageDesc->idPage, pGlobalRegion->paidPages[idxPage]));
4834 Assert(pPageDesc->idPage != pGlobalRegion->paidPages[idxPage]);
4835
4836 /*
4837 * Get the shared page source.
4838 */
4839 PGMMPAGE pPage = gmmR0GetPage(pGMM, pGlobalRegion->paidPages[idxPage]);
4840 AssertMsgReturn(pPage, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #2\n", pPageDesc->idPage, idxRegion, idxPage),
4841 VERR_PGM_PHYS_INVALID_PAGE_ID);
4842
4843 if (pPage->Common.u2State != GMM_PAGE_STATE_SHARED)
4844 {
4845 /*
4846 * Page was freed at some point; invalidate this entry.
4847 */
4848 /** @todo this isn't really bullet proof. */
4849 Log(("Old shared page was freed -> create a new one\n"));
4850 pGlobalRegion->paidPages[idxPage] = NIL_GMM_PAGEID;
4851 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
4852 }
4853
4854     Log(("Replace existing page: host %RHp -> %RHp\n", pPageDesc->HCPhys, ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT));
4855
4856 /*
4857 * Calculate the virtual address of the local page.
4858 */
4859 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pPageDesc->idPage >> GMM_CHUNKID_SHIFT);
4860 AssertMsgReturn(pChunk, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #4\n", pPageDesc->idPage, idxRegion, idxPage),
4861 VERR_PGM_PHYS_INVALID_PAGE_ID);
4862
4863 uint8_t *pbChunk;
4864 AssertMsgReturn(gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk),
4865 ("idPage=%#x (idxRegion=%#x idxPage=%#x) #3\n", pPageDesc->idPage, idxRegion, idxPage),
4866 VERR_PGM_PHYS_INVALID_PAGE_ID);
4867 uint8_t *pbLocalPage = pbChunk + ((pPageDesc->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4868
4869 /*
4870 * Calculate the virtual address of the shared page.
4871 */
4872 pChunk = gmmR0GetChunk(pGMM, pGlobalRegion->paidPages[idxPage] >> GMM_CHUNKID_SHIFT);
4873 Assert(pChunk); /* can't fail as gmmR0GetPage succeeded. */
4874
4875 /*
4876 * Get the virtual address of the physical page; map the chunk into the VM
4877 * process if not already done.
4878 */
4879 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4880 {
4881 Log(("Map chunk into process!\n"));
4882 rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
4883 AssertRCReturn(rc, rc);
4884 }
4885 uint8_t *pbSharedPage = pbChunk + ((pGlobalRegion->paidPages[idxPage] & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4886
4887#ifdef VBOX_STRICT
4888 pPageDesc->u32StrictChecksum = RTCrc32(pbSharedPage, PAGE_SIZE);
4889 uint32_t uChecksum = pPageDesc->u32StrictChecksum & UINT32_C(0x00003fff);
4890 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum || !pPage->Shared.u14Checksum,
4891               ("%#x vs %#x - idPage=%#x - %s %s\n", uChecksum, pPage->Shared.u14Checksum,
4892 pGlobalRegion->paidPages[idxPage], pModule->szName, pModule->szVersion));
4893#endif
4894
4895 /** @todo write ASMMemComparePage. */
4896 if (memcmp(pbSharedPage, pbLocalPage, PAGE_SIZE))
4897 {
4898 Log(("Unexpected differences found between local and shared page; skip\n"));
4899 /* Signal to the caller that this one hasn't changed. */
4900 pPageDesc->idPage = NIL_GMM_PAGEID;
4901 return VINF_SUCCESS;
4902 }
4903
4904 /*
4905 * Free the old local page.
4906 */
4907 GMMFREEPAGEDESC PageDesc;
4908 PageDesc.idPage = pPageDesc->idPage;
4909 rc = gmmR0FreePages(pGMM, pGVM, 1, &PageDesc, GMMACCOUNT_BASE);
4910 AssertRCReturn(rc, rc);
4911
4912 gmmR0UseSharedPage(pGMM, pGVM, pPage);
4913
4914 /*
4915 * Pass along the new physical address & page id.
4916 */
4917 pPageDesc->HCPhys = ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT;
4918 pPageDesc->idPage = pGlobalRegion->paidPages[idxPage];
4919
4920 return VINF_SUCCESS;
4921}
4922
4923
4924/**
4925 * RTAvlGCPtrDestroy callback.
4926 *
4927 * @returns 0 or VERR_GMM_INSTANCE.
4928 * @param pNode The node to destroy.
4929 * @param pvArgs Pointer to an argument packet.
4930 */
4931static DECLCALLBACK(int) gmmR0CleanupSharedModule(PAVLGCPTRNODECORE pNode, void *pvArgs)
4932{
4933 gmmR0ShModDeletePerVM(((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGMM,
4934 ((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGVM,
4935 (PGMMSHAREDMODULEPERVM)pNode,
4936 false /*fRemove*/);
4937 return VINF_SUCCESS;
4938}
4939
4940
4941/**
4942 * Used by GMMR0CleanupVM to clean up shared modules.
4943 *
4944 * This is called without taking the GMM lock so that it can be yielded as
4945 * needed here.
4946 *
4947 * @param pGMM The GMM handle.
4948 * @param pGVM The global VM handle.
4949 */
4950static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM)
4951{
4952 gmmR0MutexAcquire(pGMM);
4953 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
4954
4955 GMMR0SHMODPERVMDTORARGS Args;
4956 Args.pGVM = pGVM;
4957 Args.pGMM = pGMM;
4958 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
4959
4960 AssertMsg(pGVM->gmm.s.Stats.cShareableModules == 0, ("%d\n", pGVM->gmm.s.Stats.cShareableModules));
4961 pGVM->gmm.s.Stats.cShareableModules = 0;
4962
4963 gmmR0MutexRelease(pGMM);
4964}
4965
4966#endif /* VBOX_WITH_PAGE_SHARING */
4967
4968/**
4969 * Removes all shared modules for the specified VM
4970 *
4971 * @returns VBox status code.
4972 * @param pVM Pointer to the VM.
4973 * @param idCpu The VCPU id.
4974 */
4975GMMR0DECL(int) GMMR0ResetSharedModules(PVM pVM, VMCPUID idCpu)
4976{
4977#ifdef VBOX_WITH_PAGE_SHARING
4978 /*
4979 * Validate input and get the basics.
4980 */
4981 PGMM pGMM;
4982 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4983 PGVM pGVM;
4984 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4985 if (RT_FAILURE(rc))
4986 return rc;
4987
4988 /*
4989 * Take the semaphore and do some more validations.
4990 */
4991 gmmR0MutexAcquire(pGMM);
4992 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4993 {
4994 Log(("GMMR0ResetSharedModules\n"));
4995 GMMR0SHMODPERVMDTORARGS Args;
4996 Args.pGVM = pGVM;
4997 Args.pGMM = pGMM;
4998 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
4999 pGVM->gmm.s.Stats.cShareableModules = 0;
5000
5001 rc = VINF_SUCCESS;
5002 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5003 }
5004 else
5005 rc = VERR_GMM_IS_NOT_SANE;
5006
5007 gmmR0MutexRelease(pGMM);
5008 return rc;
5009#else
5010 NOREF(pVM); NOREF(idCpu);
5011 return VERR_NOT_IMPLEMENTED;
5012#endif
5013}
5014
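/*
 * Illustrative sketch, not part of the original file: because the lookup in
 * GMMR0ResetSharedModules goes through GVMMR0ByVMAndEMT, the call is expected
 * to be made on the EMT that owns idCpu. The wrapper name below is made up
 * for illustration only.
 *
 * @code
 *     static int exampleResetSharedModulesOnCurrentEmt(PVM pVM, PVMCPU pVCpu)
 *     {
 *         // pVCpu must belong to the calling EMT for GVMMR0ByVMAndEMT to succeed.
 *         return GMMR0ResetSharedModules(pVM, pVCpu->idCpu);
 *     }
 * @endcode
 */
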
5015#ifdef VBOX_WITH_PAGE_SHARING
5016
5017/**
5018 * Tree enumeration callback for checking a shared module.
5019 */
5020static DECLCALLBACK(int) gmmR0CheckSharedModule(PAVLGCPTRNODECORE pNode, void *pvUser)
5021{
5022 GMMCHECKSHAREDMODULEINFO *pArgs = (GMMCHECKSHAREDMODULEINFO*)pvUser;
5023 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)pNode;
5024 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
5025
5026 Log(("gmmR0CheckSharedModule: check %s %s base=%RGv size=%x\n",
5027 pGblMod->szName, pGblMod->szVersion, pGblMod->Core.Key, pGblMod->cbModule));
5028
5029 int rc = PGMR0SharedModuleCheck(pArgs->pGVM->pVM, pArgs->pGVM, pArgs->idCpu, pGblMod, pRecVM->aRegionsGCPtrs);
5030 if (RT_FAILURE(rc))
5031 return rc;
5032 return VINF_SUCCESS;
5033}
5034
5035#endif /* VBOX_WITH_PAGE_SHARING */
5036#ifdef DEBUG_sandervl
5037
5038/**
5039 * Setup for a GMMR0CheckSharedModules call (so that log flushes can jump back to ring 3).
5040 *
5041 * @returns VBox status code.
5042 * @param pVM Pointer to the VM.
5043 */
5044GMMR0DECL(int) GMMR0CheckSharedModulesStart(PVM pVM)
5045{
5046 /*
5047 * Validate input and get the basics.
5048 */
5049 PGMM pGMM;
5050 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5051 NOREF(pVM);
5052 
5053 /*
5054 * Take the semaphore and do some more validations.
5055 */
5056 gmmR0MutexAcquire(pGMM);
5057 int rc = VINF_SUCCESS;
5058 if (!GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5059 rc = VERR_GMM_IS_NOT_SANE;
5060
5061 return rc;
5062}
5063
5064/**
5065 * Clean up after a GMMR0CheckSharedModules call (so that log flushes can jump back to ring 3).
5066 *
5067 * @returns VBox status code.
5068 * @param pVM Pointer to the VM.
5069 */
5070GMMR0DECL(int) GMMR0CheckSharedModulesEnd(PVM pVM)
5071{
5072 /*
5073 * Validate input and get the basics.
5074 */
5075 PGMM pGMM;
5076 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5077 NOREF(pVM);
5078 gmmR0MutexRelease(pGMM);
5079 return VINF_SUCCESS;
5080}
5081
5082#endif /* DEBUG_sandervl */
5083
5084/**
5085 * Check all shared modules for the specified VM.
5086 *
5087 * @returns VBox status code.
5088 * @param pVM Pointer to the VM.
5089 * @param pVCpu Pointer to the VMCPU.
5090 */
5091GMMR0DECL(int) GMMR0CheckSharedModules(PVM pVM, PVMCPU pVCpu)
5092{
5093#ifdef VBOX_WITH_PAGE_SHARING
5094 /*
5095 * Validate input and get the basics.
5096 */
5097 PGMM pGMM;
5098 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5099 PGVM pGVM;
5100 int rc = GVMMR0ByVMAndEMT(pVM, pVCpu->idCpu, &pGVM);
5101 if (RT_FAILURE(rc))
5102 return rc;
5103
5104# ifndef DEBUG_sandervl
5105 /*
5106 * Take the semaphore and do some more validations.
5107 */
5108 gmmR0MutexAcquire(pGMM);
5109# endif
5110 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5111 {
5112 /*
5113 * Walk the tree, checking each module.
5114 */
5115 Log(("GMMR0CheckSharedModules\n"));
5116
5117 GMMCHECKSHAREDMODULEINFO Args;
5118 Args.pGVM = pGVM;
5119 Args.idCpu = pVCpu->idCpu;
5120 rc = RTAvlGCPtrDoWithAll(&pGVM->gmm.s.pSharedModuleTree, true /* fFromLeft */, gmmR0CheckSharedModule, &Args);
5121
5122 Log(("GMMR0CheckSharedModules done!\n"));
5123 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5124 }
5125 else
5126 rc = VERR_GMM_IS_NOT_SANE;
5127
5128# ifndef DEBUG_sandervl
5129 gmmR0MutexRelease(pGMM);
5130# endif
5131 return rc;
5132#else
5133 NOREF(pVM); NOREF(pVCpu);
5134 return VERR_NOT_IMPLEMENTED;
5135#endif
5136}
5137
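/*
 * Illustrative sketch, not part of the original file: in a DEBUG_sandervl
 * build the GMM mutex is taken by GMMR0CheckSharedModulesStart and released
 * by GMMR0CheckSharedModulesEnd, while GMMR0CheckSharedModules itself skips
 * its own locking, so that assertions and log flushes can jump back to ring-3
 * in between. The wrapper name and the simplified error handling below are
 * assumptions for illustration only.
 *
 * @code
 *     static int exampleCheckSharedModules(PVM pVM, PVMCPU pVCpu)
 *     {
 *         int rc = GMMR0CheckSharedModulesStart(pVM);   // takes the GMM mutex
 *         if (RT_SUCCESS(rc))
 *             rc = GMMR0CheckSharedModules(pVM, pVCpu); // no locking in DEBUG_sandervl builds
 *         GMMR0CheckSharedModulesEnd(pVM);              // releases the GMM mutex
 *         return rc;
 *     }
 * @endcode
 */
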
5138#if defined(VBOX_STRICT) && HC_ARCH_BITS == 64
5139
5140/**
5141 * RTAvlU32DoWithAll callback.
5142 *
5143 * @returns 0 to continue enumerating, true to stop once a duplicate is found.
5144 * @param pNode The node to search.
5145 * @param pvUser Pointer to the input argument packet.
5146 */
5147static DECLCALLBACK(int) gmmR0FindDupPageInChunk(PAVLU32NODECORE pNode, void *pvUser)
5148{
5149 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
5150 GMMFINDDUPPAGEINFO *pArgs = (GMMFINDDUPPAGEINFO *)pvUser;
5151 PGVM pGVM = pArgs->pGVM;
5152 PGMM pGMM = pArgs->pGMM;
5153 uint8_t *pbChunk;
5154
5155 /* Only take chunks not mapped into this VM process; not entirely correct. */
5156 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5157 {
5158 int rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5159 if (RT_SUCCESS(rc))
5160 {
5161 /*
5162 * Look for duplicate pages
5163 */
5164 unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
5165 while (iPage-- > 0)
5166 {
5167 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
5168 {
5169 uint8_t *pbDestPage = pbChunk + (iPage << PAGE_SHIFT);
5170
5171 if (!memcmp(pArgs->pSourcePage, pbDestPage, PAGE_SIZE))
5172 {
5173 pArgs->fFoundDuplicate = true;
5174 break;
5175 }
5176 }
5177 }
5178 gmmR0UnmapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/);
5179 }
5180 }
5181 return pArgs->fFoundDuplicate; /* (stops search if true) */
5182}
5183
5184
5185/**
5186 * Finds a duplicate of the specified page in other active VMs.
5187 *
5188 * @returns VBox status code.
5189 * @param pVM Pointer to the VM.
5190 * @param pReq Pointer to the request packet.
5191 */
5192GMMR0DECL(int) GMMR0FindDuplicatePageReq(PVM pVM, PGMMFINDDUPLICATEPAGEREQ pReq)
5193{
5194 /*
5195 * Validate input and pass it on.
5196 */
5197 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
5198 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5199 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5200
5201 PGMM pGMM;
5202 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5203
5204 PGVM pGVM;
5205 int rc = GVMMR0ByVM(pVM, &pGVM);
5206 if (RT_FAILURE(rc))
5207 return rc;
5208
5209 /*
5210 * Take the semaphore and do some more validations.
5211 */
5212 rc = gmmR0MutexAcquire(pGMM);
5213 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5214 {
5215 uint8_t *pbChunk;
5216 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pReq->idPage >> GMM_CHUNKID_SHIFT);
5217 if (pChunk)
5218 {
5219 if (gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5220 {
5221 uint8_t *pbSourcePage = pbChunk + ((pReq->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5222 PGMMPAGE pPage = gmmR0GetPage(pGMM, pReq->idPage);
5223 if (pPage)
5224 {
5225 GMMFINDDUPPAGEINFO Args;
5226 Args.pGVM = pGVM;
5227 Args.pGMM = pGMM;
5228 Args.pSourcePage = pbSourcePage;
5229 Args.fFoundDuplicate = false;
5230 RTAvlU32DoWithAll(&pGMM->pChunks, true /* fFromLeft */, gmmR0FindDupPageInChunk, &Args);
5231
5232 pReq->fDuplicate = Args.fFoundDuplicate;
5233 }
5234 else
5235 {
5236 AssertFailed();
5237 rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
5238 }
5239 }
5240 else
5241 AssertFailed();
5242 }
5243 else
5244 AssertFailed();
5245 }
5246 else
5247 rc = VERR_GMM_IS_NOT_SANE;
5248
5249 gmmR0MutexRelease(pGMM);
5250 return rc;
5251}
5252
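/*
 * Illustrative sketch, not part of the original file: how a strict 64-bit
 * build might ask whether a private page it owns has a duplicate in another
 * VM. Only the request fields referenced above (Hdr.cbReq, idPage,
 * fDuplicate) are filled in; the wrapper name is made up for illustration.
 *
 * @code
 *     static bool examplePageHasDuplicate(PVM pVM, uint32_t idPage)
 *     {
 *         GMMFINDDUPLICATEPAGEREQ Req;
 *         RT_ZERO(Req);
 *         Req.Hdr.cbReq  = sizeof(Req);
 *         Req.idPage     = idPage;
 *         Req.fDuplicate = false;
 *         int rc = GMMR0FindDuplicatePageReq(pVM, &Req);
 *         return RT_SUCCESS(rc) && Req.fDuplicate;
 *     }
 * @endcode
 */
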
5253#endif /* VBOX_STRICT && HC_ARCH_BITS == 64 */
5254
5255
5256/**
5257 * Retrieves the GMM statistics visible to the caller.
5258 *
5259 * @returns VBox status code.
5260 *
5261 * @param pStats Where to put the statistics.
5262 * @param pSession The current session.
5263 * @param pVM Pointer to the VM to obtain statistics for. Optional.
5264 */
5265GMMR0DECL(int) GMMR0QueryStatistics(PGMMSTATS pStats, PSUPDRVSESSION pSession, PVM pVM)
5266{
5267 LogFlow(("GVMMR0QueryStatistics: pStats=%p pSession=%p pVM=%p\n", pStats, pSession, pVM));
5268
5269 /*
5270 * Validate input.
5271 */
5272 AssertPtrReturn(pSession, VERR_INVALID_POINTER);
5273 AssertPtrReturn(pStats, VERR_INVALID_POINTER);
5274 pStats->cMaxPages = 0; /* (crash before taking the mutex...) */
5275
5276 PGMM pGMM;
5277 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5278
5279 /*
5280 * Resolve the VM handle, if not NULL, and lock the GMM.
5281 */
5282 int rc;
5283 PGVM pGVM;
5284 if (pVM)
5285 {
5286 rc = GVMMR0ByVM(pVM, &pGVM);
5287 if (RT_FAILURE(rc))
5288 return rc;
5289 }
5290 else
5291 pGVM = NULL;
5292
5293 rc = gmmR0MutexAcquire(pGMM);
5294 if (RT_FAILURE(rc))
5295 return rc;
5296
5297 /*
5298 * Copy out the GMM statistics.
5299 */
5300 pStats->cMaxPages = pGMM->cMaxPages;
5301 pStats->cReservedPages = pGMM->cReservedPages;
5302 pStats->cOverCommittedPages = pGMM->cOverCommittedPages;
5303 pStats->cAllocatedPages = pGMM->cAllocatedPages;
5304 pStats->cSharedPages = pGMM->cSharedPages;
5305 pStats->cDuplicatePages = pGMM->cDuplicatePages;
5306 pStats->cLeftBehindSharedPages = pGMM->cLeftBehindSharedPages;
5307 pStats->cBalloonedPages = pGMM->cBalloonedPages;
5308 pStats->cChunks = pGMM->cChunks;
5309 pStats->cFreedChunks = pGMM->cFreedChunks;
5310 pStats->cShareableModules = pGMM->cShareableModules;
5311 RT_ZERO(pStats->au64Reserved);
5312
5313 /*
5314 * Copy out the VM statistics.
5315 */
5316 if (pGVM)
5317 pStats->VMStats = pGVM->gmm.s.Stats;
5318 else
5319 RT_ZERO(pStats->VMStats);
5320
5321 gmmR0MutexRelease(pGMM);
5322 return rc;
5323}
5324
5325
5326/**
5327 * VMMR0 request wrapper for GMMR0QueryStatistics.
5328 *
5329 * @returns see GMMR0QueryStatistics.
5330 * @param pVM Pointer to the VM. Optional.
5331 * @param pReq Pointer to the request packet.
5332 */
5333GMMR0DECL(int) GMMR0QueryStatisticsReq(PVM pVM, PGMMQUERYSTATISTICSSREQ pReq)
5334{
5335 /*
5336 * Validate input and pass it on.
5337 */
5338 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5339 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5340
5341 return GMMR0QueryStatistics(&pReq->Stats, pReq->pSession, pVM);
5342}
5343
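/*
 * Illustrative sketch, not part of the original file: filling in a
 * GMMQUERYSTATISTICSSREQ and reading back a few of the global counters
 * copied out by GMMR0QueryStatistics. Only the request fields referenced
 * above (Hdr.cbReq, pSession, Stats) are touched; the wrapper name is made
 * up for illustration and %RU64 assumes 64-bit counters.
 *
 * @code
 *     static int exampleLogGmmStats(PVM pVM, PSUPDRVSESSION pSession)
 *     {
 *         GMMQUERYSTATISTICSSREQ Req;
 *         RT_ZERO(Req);
 *         Req.Hdr.cbReq = sizeof(Req);
 *         Req.pSession  = pSession;
 *         int rc = GMMR0QueryStatisticsReq(pVM, &Req);
 *         if (RT_SUCCESS(rc))
 *             LogFlow(("GMM: cAllocatedPages=%RU64 cMaxPages=%RU64 cSharedPages=%RU64\n",
 *                      Req.Stats.cAllocatedPages, Req.Stats.cMaxPages, Req.Stats.cSharedPages));
 *         return rc;
 *     }
 * @endcode
 */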
5344
5345/**
5346 * Resets the specified GMM statistics.
5347 *
5348 * @returns VBox status code.
5349 *
5350 * @param pStats Which statistics to reset; non-zero fields indicate the
5351 * statistics to be reset.
5352 * @param pSession The current session.
5353 * @param pVM The VM to reset statistics for. Optional.
5354 */
5355GMMR0DECL(int) GMMR0ResetStatistics(PCGMMSTATS pStats, PSUPDRVSESSION pSession, PVM pVM)
5356{
5357 NOREF(pStats); NOREF(pSession); NOREF(pVM); /* Nothing to reset at the moment. */
5358 return VINF_SUCCESS;
5359}
5360
5361
5362/**
5363 * VMMR0 request wrapper for GMMR0ResetStatistics.
5364 *
5365 * @returns see GMMR0ResetStatistics.
5366 * @param pVM Pointer to the VM. Optional.
5367 * @param pReq Pointer to the request packet.
5368 */
5369GMMR0DECL(int) GMMR0ResetStatisticsReq(PVM pVM, PGMMRESETSTATISTICSSREQ pReq)
5370{
5371 /*
5372 * Validate input and pass it on.
5373 */
5374 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5375 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5376
5377 return GMMR0ResetStatistics(&pReq->Stats, pReq->pSession, pVM);
5378}
5379