VirtualBox

source: vbox/trunk/src/VBox/VMM/VMMR0/GMMR0.cpp@ 55755

Last change on this file since 55755 was 51940, checked in by vboxsync, 11 years ago

GMMR0: Switched from fast mutex to critical section for the giant GMMR0 lock to avoid running into unnecessary trouble with the windows driver verifier. Required making the critical section code compile and link in the ring-0 environment.

  • Property svn:eol-style set to native
  • Property svn:keywords set to Id Revision
File size: 189.9 KB
Line 
1/* $Id: GMMR0.cpp 51940 2014-07-08 17:45:51Z vboxsync $ */
2/** @file
3 * GMM - Global Memory Manager.
4 */
5
6/*
7 * Copyright (C) 2007-2013 Oracle Corporation
8 *
9 * This file is part of VirtualBox Open Source Edition (OSE), as
10 * available from http://www.virtualbox.org. This file is free software;
11 * you can redistribute it and/or modify it under the terms of the GNU
12 * General Public License (GPL) as published by the Free Software
13 * Foundation, in version 2 as it comes in the "COPYING" file of the
14 * VirtualBox OSE distribution. VirtualBox OSE is distributed in the
15 * hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
16 */
17
18
19/** @page pg_gmm GMM - The Global Memory Manager
20 *
21 * As the name indicates, this component is responsible for global memory
22 * management. Currently only guest RAM is allocated from the GMM, but this
23 * may change to include shadow page tables and other bits later.
24 *
25 * Guest RAM is managed as individual pages, but allocated from the host OS
26 * in chunks for reasons of portability / efficiency. To minimize the memory
27 * footprint, all tracking structures must be as small as possible without
28 * unnecessary performance penalties.
29 *
30 * The allocation chunks have a fixed size, defined at compile time
31 * by the #GMM_CHUNK_SIZE \#define.
32 *
33 * Each chunk is given a unique ID. Each page also has a unique ID. The
34 * relationship between the two IDs is:
35 * @code
36 * GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
37 * idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
38 * @endcode
39 * Where iPage is the index of the page within the chunk. This ID scheme
40 * permits efficient chunk and page lookup, but it relies on the chunk size
41 * being set at compile time. The chunks are organized in an AVL tree with their
42 * IDs being the keys.
43 *
44 * The physical address of each page in an allocation chunk is maintained by
45 * the #RTR0MEMOBJ and obtained using #RTR0MemObjGetPagePhysAddr. There is no
46 * need to duplicate this information (it would cost 8 bytes per page if we did).
47 *
48 * So what do we need to track per page? Most importantly we need to know
49 * which state the page is in:
50 * - Private - Allocated for (eventually) backing one particular VM page.
51 * - Shared - Readonly page that is used by one or more VMs and treated
52 * as COW by PGM.
53 * - Free - Not used by anyone.
54 *
55 * For the page replacement operations (sharing, defragmenting and freeing)
56 * to be somewhat efficient, private pages need to be associated with a
57 * particular page in a particular VM.
58 *
59 * Tracking the usage of shared pages is impractical and expensive, so we'll
60 * settle for a reference counting system instead.
61 *
62 * Free pages will be chained on LIFOs.
63 *
64 * On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit
65 * systems a 32-bit bitfield will have to suffice because of address space
66 * limitations. The #GMMPAGE structure shows the details.
67 *
68 *
69 * @section sec_gmm_alloc_strat Page Allocation Strategy
70 *
71 * The strategy for allocating pages has to take fragmentation and shared
72 * pages into account, or we may end up with 2000 chunks with only
73 * a few pages in each. Shared pages cannot easily be reallocated because
74 * of the inaccurate usage accounting (see above). Private pages can be
75 * reallocated by a defragmentation thread in the same manner that sharing
76 * is done.
77 *
78 * The first approach is to manage the free pages in two sets depending on
79 * whether they are mainly for the allocation of shared or private pages.
80 * In the initial implementation there will be almost no possibility for
81 * mixing shared and private pages in the same chunk (only if we're really
82 * stressed on memory), but when we implement forking of VMs and have to
83 * deal with lots of COW pages it'll start getting kind of interesting.
84 *
85 * The sets are lists of chunks with approximately the same number of
86 * free pages. Say the chunk size is 1MB, meaning 256 pages, and a set
87 * consists of 16 lists. So, the first list will contain the chunks with
88 * 1-7 free pages, the second covers 8-15, and so on. The chunks will be
89 * moved between the lists as pages are freed up or allocated.
90 *
91 *
92 * @section sec_gmm_costs Costs
93 *
94 * The per page cost in kernel space is 32-bit plus whatever RTR0MEMOBJ
95 * entails. In addition there is the chunk cost of approximately
96 * (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page.
97 *
98 * On Windows the per page #RTR0MEMOBJ cost is 32-bit on 32-bit windows
99 * and 64-bit on 64-bit windows (a PFN_NUMBER in the MDL). So, 64-bit per page.
100 * The cost on Linux is identical, but here it's because of sizeof(struct page *).
101 *
102 *
103 * @section sec_gmm_legacy Legacy Mode for Non-Tier-1 Platforms
104 *
105 * In legacy mode the page source is locked user pages and not
106 * #RTR0MemObjAllocPhysNC, this means that a page can only be allocated
107 * by the VM that locked it. We will make no attempt at implementing
108 * page sharing on these systems, just do enough to make it all work.
109 *
110 *
111 * @subsection sub_gmm_locking Serializing
112 *
113 * One simple fast mutex will be employed in the initial implementation, not
114 * two as mentioned in @ref subsec_pgmPhys_Serializing.
115 *
116 * @see @ref subsec_pgmPhys_Serializing
117 *
118 *
119 * @section sec_gmm_overcommit Memory Over-Commitment Management
120 *
121 * The GVM will have to do the system wide memory over-commitment
122 * management. My current ideas are:
123 * - Per VM oc policy that indicates how much to initially commit
124 * to it and what to do in an out-of-memory situation.
125 * - Prevent overtaxing the host.
126 *
127 * There are some challenges here, the main ones are configurability and
128 * security. Should we for instance permit anyone to request 100% memory
129 * commitment? Who should be allowed to do runtime adjustments of the
130 * config? And how to prevent these settings from being lost when the last
131 * VM process exits? The solution is probably to have an optional root
132 * daemon that will keep VMMR0.r0 in memory and enable the security measures.
133 *
134 *
135 *
136 * @section sec_gmm_numa NUMA
137 *
138 * NUMA considerations will be designed and implemented a bit later.
139 *
140 * The preliminary guess is that we will have to try to allocate memory as
141 * close as possible to the CPUs the VM is executed on (EMT and additional CPU
142 * threads), which means it's mostly about allocation and sharing policies.
143 * Both the scheduler and allocator interface will have to supply some NUMA info
144 * and we'll need to have a way to calc access costs.
145 *
146 */
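/*
 * Illustrative sketch (not part of GMMR0.cpp): the page ID scheme from the
 * overview above, written out as encode/decode helpers. The chunk size
 * (2 MiB), the page size (4 KiB) and every SKETCH_/sketch name are
 * assumptions made for this example; the real values come from
 * GMM_CHUNK_SIZE, PAGE_SIZE and GMM_CHUNK_SHIFT in the VirtualBox headers.
 */
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values, chosen only for illustration. */
#define SKETCH_PAGE_SHIFT       12u                     /* 4 KiB pages (assumed) */
#define SKETCH_CHUNK_SIZE       (2u * 1024u * 1024u)    /* 2 MiB chunks (assumed) */
#define SKETCH_PAGES_PER_CHUNK  (SKETCH_CHUNK_SIZE >> SKETCH_PAGE_SHIFT)  /* 512 */
#define SKETCH_CHUNK_SHIFT      9u                      /* log2(512) */

/* idPage = (idChunk << CHUNK_SHIFT) | iPage */
uint32_t sketchMakePageId(uint32_t idChunk, uint32_t iPage)
{
    assert(iPage < SKETCH_PAGES_PER_CHUNK);
    return (idChunk << SKETCH_CHUNK_SHIFT) | iPage;
}

/* The chunk ID is the upper part of the page ID. */
uint32_t sketchPageIdToChunkId(uint32_t idPage)
{
    return idPage >> SKETCH_CHUNK_SHIFT;
}

/* The page index within the chunk is the lower CHUNK_SHIFT bits. */
uint32_t sketchPageIdToPageIndex(uint32_t idPage)
{
    return idPage & (SKETCH_PAGES_PER_CHUNK - 1u);
}
```
/*
 * Because the shift is a compile-time constant, both lookups reduce to a
 * shift and a mask, which is why the scheme depends on a fixed chunk size.
 */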
147
148
149/*******************************************************************************
150* Header Files *
151*******************************************************************************/
152#define LOG_GROUP LOG_GROUP_GMM
153#include <VBox/rawpci.h>
154#include <VBox/vmm/vm.h>
155#include <VBox/vmm/gmm.h>
156#include "GMMR0Internal.h"
157#include <VBox/vmm/gvm.h>
158#include <VBox/vmm/pgm.h>
159#include <VBox/log.h>
160#include <VBox/param.h>
161#include <VBox/err.h>
162#include <iprt/asm.h>
163#include <iprt/avl.h>
164#ifdef VBOX_STRICT
165# include <iprt/crc.h>
166#endif
167#include <iprt/critsect.h>
168#include <iprt/list.h>
169#include <iprt/mem.h>
170#include <iprt/memobj.h>
171#include <iprt/mp.h>
172#include <iprt/semaphore.h>
173#include <iprt/string.h>
174#include <iprt/time.h>
175
176
177/*******************************************************************************
178* Defined Constants And Macros *
179*******************************************************************************/
180/** @def VBOX_USE_CRIT_SECT_FOR_GIANT
181 * Use a critical section instead of a fast mutex for the giant GMM lock.
182 *
183 * @remarks This is primarily a way of avoiding the deadlock checks in the
184 * windows driver verifier. */
185#if defined(RT_OS_WINDOWS) || defined(DOXYGEN_RUNNING)
186# define VBOX_USE_CRIT_SECT_FOR_GIANT
187#endif
188
189
190/*******************************************************************************
191* Structures and Typedefs *
192*******************************************************************************/
193/** Pointer to set of free chunks. */
194typedef struct GMMCHUNKFREESET *PGMMCHUNKFREESET;
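/*
 * Illustrative sketch (not part of GMMR0.cpp): how a chunk could be assigned
 * to one of the free-set lists by its number of free pages. It takes the
 * "1-7, 8-15" ranges quoted in the overview at face value, i.e. a bucket
 * width of 8 pages, and 256 pages per chunk. Both numbers and all
 * SKETCH_/sketch names are assumptions; the real list count and bucket
 * width are defined elsewhere in the GMM sources.
 */
```c
#include <assert.h>

#define SKETCH_FS_PAGES_PER_CHUNK  256u   /* 1 MiB chunk of 4 KiB pages (assumed) */
#define SKETCH_FS_SHIFT            3u     /* bucket width of 8 pages (assumed) */
#define SKETCH_FS_LISTS            (SKETCH_FS_PAGES_PER_CHUNK >> SKETCH_FS_SHIFT)

/* Returns the index of the list that a chunk with cFree free pages goes on.
 * Chunks without free pages are not on any list, so cFree must be >= 1. */
unsigned sketchFreeListIndex(unsigned cFree)
{
    assert(cFree >= 1u && cFree <= SKETCH_FS_PAGES_PER_CHUNK);
    unsigned iList = cFree >> SKETCH_FS_SHIFT;
    if (iList >= SKETCH_FS_LISTS)       /* a completely free chunk */
        iList = SKETCH_FS_LISTS - 1u;   /* shares the last list */
    return iList;
}
```
/*
 * A chunk is relinked whenever an allocation or free moves its cFree value
 * across a bucket boundary, e.g. from 8 to 7 free pages.
 */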
195
196/**
197 * The per-page tracking structure employed by the GMM.
198 *
199 * On 32-bit hosts some trickery is necessary to compress all
200 * the information into 32-bits. When the fSharedFree member is set,
201 * the 30th bit decides whether it's a free page or not.
202 *
203 * Because of the different layout on 32-bit and 64-bit hosts, macros
204 * are used to get and set some of the data.
205 */
206typedef union GMMPAGE
207{
208#if HC_ARCH_BITS == 64
209 /** Unsigned integer view. */
210 uint64_t u;
211
212 /** The common view. */
213 struct GMMPAGECOMMON
214 {
215 uint32_t uStuff1 : 32;
216 uint32_t uStuff2 : 30;
217 /** The page state. */
218 uint32_t u2State : 2;
219 } Common;
220
221 /** The view of a private page. */
222 struct GMMPAGEPRIVATE
223 {
224 /** The guest page frame number. (Max addressable: 2 ^ 44 - 16) */
225 uint32_t pfn;
226 /** The GVM handle. (64K VMs) */
227 uint32_t hGVM : 16;
228 /** Reserved. */
229 uint32_t u16Reserved : 14;
230 /** The page state. */
231 uint32_t u2State : 2;
232 } Private;
233
234 /** The view of a shared page. */
235 struct GMMPAGESHARED
236 {
237 /** The host page frame number. (Max addressable: 2 ^ 44 - 16) */
238 uint32_t pfn;
239 /** The reference count (64K VMs). */
240 uint32_t cRefs : 16;
241 /** Used for debug checksumming. */
242 uint32_t u14Checksum : 14;
243 /** The page state. */
244 uint32_t u2State : 2;
245 } Shared;
246
247 /** The view of a free page. */
248 struct GMMPAGEFREE
249 {
250 /** The index of the next page in the free list. UINT16_MAX is NIL. */
251 uint16_t iNext;
252 /** Reserved. Checksum or something? */
253 uint16_t u16Reserved0;
254 /** Reserved. Checksum or something? */
255 uint32_t u30Reserved1 : 30;
256 /** The page state. */
257 uint32_t u2State : 2;
258 } Free;
259
260#else /* 32-bit */
261 /** Unsigned integer view. */
262 uint32_t u;
263
264 /** The common view. */
265 struct GMMPAGECOMMON
266 {
267 uint32_t uStuff : 30;
268 /** The page state. */
269 uint32_t u2State : 2;
270 } Common;
271
272 /** The view of a private page. */
273 struct GMMPAGEPRIVATE
274 {
275 /** The guest page frame number. (Max addressable: 2 ^ 36) */
276 uint32_t pfn : 24;
277 /** The GVM handle. (127 VMs) */
278 uint32_t hGVM : 7;
279 /** The top page state bit, MBZ. */
280 uint32_t fZero : 1;
281 } Private;
282
283 /** The view of a shared page. */
284 struct GMMPAGESHARED
285 {
286 /** The reference count. */
287 uint32_t cRefs : 30;
288 /** The page state. */
289 uint32_t u2State : 2;
290 } Shared;
291
292 /** The view of a free page. */
293 struct GMMPAGEFREE
294 {
295 /** The index of the next page in the free list. UINT16_MAX is NIL. */
296 uint32_t iNext : 16;
297 /** Reserved. Checksum or something? */
298 uint32_t u14Reserved : 14;
299 /** The page state. */
300 uint32_t u2State : 2;
301 } Free;
302#endif
303} GMMPAGE;
304AssertCompileSize(GMMPAGE, sizeof(RTHCUINTPTR));
305/** Pointer to a GMMPAGE. */
306typedef GMMPAGE *PGMMPAGE;
307
308
309/** @name The Page States.
310 * @{ */
311/** A private page. */
312#define GMM_PAGE_STATE_PRIVATE 0
313/** A private page - alternative value used on the 32-bit implementation.
314 * This will never be used on 64-bit hosts. */
315#define GMM_PAGE_STATE_PRIVATE_32 1
316/** A shared page. */
317#define GMM_PAGE_STATE_SHARED 2
318/** A free page. */
319#define GMM_PAGE_STATE_FREE 3
320/** @} */
321
322
323/** @def GMM_PAGE_IS_PRIVATE
324 *
325 * @returns true if private, false if not.
326 * @param pPage The GMM page.
327 */
328#if HC_ARCH_BITS == 64
329# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_PRIVATE )
330#else
331# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Private.fZero == 0 )
332#endif
333
334/** @def GMM_PAGE_IS_SHARED
335 *
336 * @returns true if shared, false if not.
337 * @param pPage The GMM page.
338 */
339#define GMM_PAGE_IS_SHARED(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_SHARED )
340
341/** @def GMM_PAGE_IS_FREE
342 *
343 * @returns true if free, false if not.
344 * @param pPage The GMM page.
345 */
346#define GMM_PAGE_IS_FREE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_FREE )
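/*
 * Illustrative sketch (not part of GMMR0.cpp): the 64-bit GMMPAGE layout
 * expressed with explicit shifts instead of bitfields. It assumes the
 * bitfields are packed low-to-high as on common little-endian x86
 * compilers, which puts u2State in bits 62-63, hGVM / cRefs in bits 32-47
 * and pfn in bits 0-31. Bitfield layout is implementation-defined, so this
 * is an assumption, and all SKETCH_/sketch names are hypothetical.
 */
```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_STATE_PRIVATE  0u
#define SKETCH_STATE_SHARED   2u
#define SKETCH_STATE_FREE     3u
#define SKETCH_STATE_SHIFT    62u   /* top two bits hold the page state */

/* Combines a payload (lower 62 bits) with a 2-bit page state. */
uint64_t sketchMakePage(uint64_t uPayload, unsigned uState)
{
    assert(uState <= 3u);
    assert((uPayload >> SKETCH_STATE_SHIFT) == 0);
    return uPayload | ((uint64_t)uState << SKETCH_STATE_SHIFT);
}

/* Private page: guest pfn plus the handle of the owning VM. */
uint64_t sketchMakePrivatePage(uint32_t pfn, uint16_t hGVM)
{
    return sketchMakePage((uint64_t)pfn | ((uint64_t)hGVM << 32), SKETCH_STATE_PRIVATE);
}

/* Shared page: host pfn plus a reference count. */
uint64_t sketchMakeSharedPage(uint32_t pfn, uint16_t cRefs)
{
    return sketchMakePage((uint64_t)pfn | ((uint64_t)cRefs << 32), SKETCH_STATE_SHARED);
}

/* Free page: index of the next free page in the chunk's LIFO. */
uint64_t sketchMakeFreePage(uint16_t iNext)
{
    return sketchMakePage(iNext, SKETCH_STATE_FREE);
}

/* Equivalent of reading Common.u2State. */
unsigned sketchPageState(uint64_t u)
{
    return (unsigned)(u >> SKETCH_STATE_SHIFT);
}
```
/*
 * Keeping the state in the same two bits of every view is what lets the
 * GMM_PAGE_IS_* predicates test Common.u2State without knowing the view.
 */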
347
348/** @def GMM_PAGE_PFN_LAST
349 * The last valid guest pfn range.
350 * @remark Some of the values outside the range have special meaning,
351 * see GMM_PAGE_PFN_UNSHAREABLE.
352 */
353#if HC_ARCH_BITS == 64
354# define GMM_PAGE_PFN_LAST UINT32_C(0xfffffff0)
355#else
356# define GMM_PAGE_PFN_LAST UINT32_C(0x00fffff0)
357#endif
358AssertCompile(GMM_PAGE_PFN_LAST == (GMM_GCPHYS_LAST >> PAGE_SHIFT));
359
360/** @def GMM_PAGE_PFN_UNSHAREABLE
361 * Indicates that this page isn't used for normal guest memory and thus isn't shareable.
362 */
363#if HC_ARCH_BITS == 64
364# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0xfffffff1)
365#else
366# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0x00fffff1)
367#endif
368AssertCompile(GMM_PAGE_PFN_UNSHAREABLE == (GMM_GCPHYS_UNSHAREABLE >> PAGE_SHIFT));
369
370
371/**
372 * A GMM allocation chunk ring-3 mapping record.
373 *
374 * This should really be associated with a session and not a VM, but
375 * it's simpler to associate it with a VM and clean up when the VM object
376 * is destroyed.
377 */
378typedef struct GMMCHUNKMAP
379{
380 /** The mapping object. */
381 RTR0MEMOBJ hMapObj;
382 /** The VM owning the mapping. */
383 PGVM pGVM;
384} GMMCHUNKMAP;
385/** Pointer to a GMM allocation chunk mapping. */
386typedef struct GMMCHUNKMAP *PGMMCHUNKMAP;
387
388
389/**
390 * A GMM allocation chunk.
391 */
392typedef struct GMMCHUNK
393{
394 /** The AVL node core.
395 * The Key is the chunk ID. (Giant mtx.) */
396 AVLU32NODECORE Core;
397 /** The memory object.
398 * Either from RTR0MemObjAllocPhysNC or RTR0MemObjLockUser depending on
399 * what the host can dish up. (Chunk mtx protects mapping accesses
400 * and related frees.) */
401 RTR0MEMOBJ hMemObj;
402 /** Pointer to the next chunk in the free list. (Giant mtx.) */
403 PGMMCHUNK pFreeNext;
404 /** Pointer to the previous chunk in the free list. (Giant mtx.) */
405 PGMMCHUNK pFreePrev;
406 /** Pointer to the free set this chunk belongs to. NULL for
407 * chunks with no free pages. (Giant mtx.) */
408 PGMMCHUNKFREESET pSet;
409 /** List node in the chunk list (GMM::ChunkList). (Giant mtx.) */
410 RTLISTNODE ListNode;
411 /** Pointer to an array of mappings. (Chunk mtx.) */
412 PGMMCHUNKMAP paMappingsX;
413 /** The number of mappings. (Chunk mtx.) */
414 uint16_t cMappingsX;
415 /** The mapping lock this chunk is using. UINT8_MAX if nobody is
416 * mapping or freeing anything. (Giant mtx.) */
417 uint8_t volatile iChunkMtx;
418 /** Flags field reserved for future use (like eliminating enmType).
419 * (Giant mtx.) */
420 uint8_t fFlags;
421 /** The head of the list of free pages. UINT16_MAX is the NIL value.
422 * (Giant mtx.) */
423 uint16_t iFreeHead;
424 /** The number of free pages. (Giant mtx.) */
425 uint16_t cFree;
426 /** The GVM handle of the VM that first allocated pages from this chunk, this
427 * is used as a preference when there are several chunks to choose from.
428 * When in bound memory mode this isn't a preference any longer. (Giant
429 * mtx.) */
430 uint16_t hGVM;
431 /** The ID of the NUMA node the memory mostly resides on. (Reserved for
432 * future use.) (Giant mtx.) */
433 uint16_t idNumaNode;
434 /** The number of private pages. (Giant mtx.) */
435 uint16_t cPrivate;
436 /** The number of shared pages. (Giant mtx.) */
437 uint16_t cShared;
438 /** The pages. (Giant mtx.) */
439 GMMPAGE aPages[GMM_CHUNK_SIZE >> PAGE_SHIFT];
440} GMMCHUNK;
441
442/** Indicates that the NUMA properties of the memory are unknown. */
443#define GMM_CHUNK_NUMA_ID_UNKNOWN UINT16_C(0xfffe)
444
445/** @name GMM_CHUNK_FLAGS_XXX - chunk flags.
446 * @{ */
447/** Indicates that the chunk is a large page (2MB). */
448#define GMM_CHUNK_FLAGS_LARGE_PAGE UINT16_C(0x0001)
449/** @} */
450
451
452/**
453 * An allocation chunk TLB entry.
454 */
455typedef struct GMMCHUNKTLBE
456{
457 /** The chunk id. */
458 uint32_t idChunk;
459 /** Pointer to the chunk. */
460 PGMMCHUNK pChunk;
461} GMMCHUNKTLBE;
462/** Pointer to an allocation chunk TLB entry. */
463typedef GMMCHUNKTLBE *PGMMCHUNKTLBE;
464
465
466/** The number of entries in the allocation chunk TLB. */
467#define GMM_CHUNKTLB_ENTRIES 32
468/** Gets the TLB entry index for the given Chunk ID. */
469#define GMM_CHUNKTLB_IDX(idChunk) ( (idChunk) & (GMM_CHUNKTLB_ENTRIES - 1) )
470
471/**
472 * An allocation chunk TLB.
473 */
474typedef struct GMMCHUNKTLB
475{
476 /** The TLB entries. */
477 GMMCHUNKTLBE aEntries[GMM_CHUNKTLB_ENTRIES];
478} GMMCHUNKTLB;
479/** Pointer to an allocation chunk TLB. */
480typedef GMMCHUNKTLB *PGMMCHUNKTLB;
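/*
 * Illustrative sketch (not part of GMMR0.cpp): a direct-mapped chunk TLB in
 * front of a slower lookup, following the shape of GMMCHUNKTLB and
 * GMM_CHUNKTLB_IDX above. The slow path here is a linear array search that
 * merely stands in for the AVL tree lookup; the hit/miss counters and all
 * SKETCH/sketch names are hypothetical additions for the example.
 */
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TLB_ENTRIES 32u  /* must be a power of two for the mask below */

typedef struct SKETCHCHUNK { uint32_t idChunk; } SKETCHCHUNK;
typedef struct SKETCHTLBE  { uint32_t idChunk; SKETCHCHUNK *pChunk; } SKETCHTLBE;
typedef struct SKETCHTLB
{
    SKETCHTLBE aEntries[SKETCH_TLB_ENTRIES];
    unsigned   cHits;
    unsigned   cMisses;
} SKETCHTLB;

/* Stand-in for the tree lookup: scan an array of chunks. */
static SKETCHCHUNK *sketchSlowLookup(SKETCHCHUNK *paChunks, size_t cChunks, uint32_t idChunk)
{
    for (size_t i = 0; i < cChunks; i++)
        if (paChunks[i].idChunk == idChunk)
            return &paChunks[i];
    return NULL;
}

/* Looks up a chunk, consulting the TLB slot selected by the low ID bits first. */
SKETCHCHUNK *sketchGetChunk(SKETCHTLB *pTlb, SKETCHCHUNK *paChunks, size_t cChunks, uint32_t idChunk)
{
    assert(pTlb != NULL);
    SKETCHTLBE *pTlbe = &pTlb->aEntries[idChunk & (SKETCH_TLB_ENTRIES - 1u)];
    if (pTlbe->pChunk != NULL && pTlbe->idChunk == idChunk)
    {
        pTlb->cHits++;
        return pTlbe->pChunk;
    }

    pTlb->cMisses++;
    SKETCHCHUNK *pChunk = sketchSlowLookup(paChunks, cChunks, idChunk);
    if (pChunk != NULL)
    {
        /* Replace whatever occupied the slot; IDs that differ by a
           multiple of 32 compete for the same entry. */
        pTlbe->idChunk = idChunk;
        pTlbe->pChunk  = pChunk;
    }
    return pChunk;
}
```
/*
 * An entry must be invalidated when its chunk is freed, otherwise the TLB
 * would hand out a dangling pointer; the sketch leaves that step out.
 */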
481
482
483/**
484 * The GMM instance data.
485 */
486typedef struct GMM
487{
488 /** Magic / eye catcher. GMM_MAGIC */
489 uint32_t u32Magic;
490 /** The number of threads waiting on the mutex. */
491 uint32_t cMtxContenders;
492#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
493 /** The critical section protecting the GMM.
494 * More fine grained locking can be implemented later if necessary. */
495 RTCRITSECT GiantCritSect;
496#else
497 /** The fast mutex protecting the GMM.
498 * More fine grained locking can be implemented later if necessary. */
499 RTSEMFASTMUTEX hMtx;
500#endif
501#ifdef VBOX_STRICT
502 /** The current mutex owner. */
503 RTNATIVETHREAD hMtxOwner;
504#endif
505 /** The chunk tree. */
506 PAVLU32NODECORE pChunks;
507 /** The chunk TLB. */
508 GMMCHUNKTLB ChunkTLB;
509 /** The private free set. */
510 GMMCHUNKFREESET PrivateX;
511 /** The shared free set. */
512 GMMCHUNKFREESET Shared;
513
514 /** Shared module tree (global).
515 * @todo separate trees for distinctly different guest OSes. */
516 PAVLLU32NODECORE pGlobalSharedModuleTree;
517 /** Sharable modules (count of nodes in pGlobalSharedModuleTree). */
518 uint32_t cShareableModules;
519
520 /** The chunk list. For simplifying the cleanup process. */
521 RTLISTANCHOR ChunkList;
522
523 /** The maximum number of pages we're allowed to allocate.
524 * @gcfgm 64-bit GMM/MaxPages Direct.
525 * @gcfgm 32-bit GMM/PctPages Relative to the number of host pages. */
526 uint64_t cMaxPages;
527 /** The number of pages that have been reserved.
528 * The deal is that cReservedPages - cOverCommittedPages <= cMaxPages. */
529 uint64_t cReservedPages;
530 /** The number of pages that we have over-committed in reservations. */
531 uint64_t cOverCommittedPages;
532 /** The number of actually allocated (committed if you like) pages. */
533 uint64_t cAllocatedPages;
534 /** The number of pages that are shared. A subset of cAllocatedPages. */
535 uint64_t cSharedPages;
536 /** The number of pages that are actually shared between VMs. */
537 uint64_t cDuplicatePages;
538 /** The number of shared pages that have been left behind by
539 * VMs not doing proper cleanups. */
540 uint64_t cLeftBehindSharedPages;
541 /** The number of allocation chunks.
542 * (The number of pages we've allocated from the host can be derived from this.) */
543 uint32_t cChunks;
544 /** The number of currently ballooned pages. */
545 uint64_t cBalloonedPages;
546
547 /** The legacy allocation mode indicator.
548 * This is determined at initialization time. */
549 bool fLegacyAllocationMode;
550 /** The bound memory mode indicator.
551 * When set, the memory will be bound to a specific VM and never
552 * shared. This is always set if fLegacyAllocationMode is set.
553 * (Also determined at initialization time.) */
554 bool fBoundMemoryMode;
555 /** The number of registered VMs. */
556 uint16_t cRegisteredVMs;
557
558 /** The number of freed chunks ever. This is used as a list generation to
559 * avoid restarting the cleanup scanning when the list wasn't modified. */
560 uint32_t cFreedChunks;
561 /** The previous allocated Chunk ID.
562 * Used as a hint to avoid scanning the whole bitmap. */
563 uint32_t idChunkPrev;
564 /** Chunk ID allocation bitmap.
565 * Bits of allocated IDs are set, free ones are clear.
566 * The NIL id (0) is marked allocated. */
567 uint32_t bmChunkId[(GMM_CHUNKID_LAST + 1 + 31) / 32];
568
569 /** The index of the next mutex to use. */
570 uint32_t iNextChunkMtx;
571 /** Chunk locks for reducing lock contention without having to allocate
572 * one lock per chunk. */
573 struct
574 {
575 /** The mutex */
576 RTSEMFASTMUTEX hMtx;
577 /** The number of threads currently using this mutex. */
578 uint32_t volatile cUsers;
579 } aChunkMtx[64];
580} GMM;
581/** Pointer to the GMM instance. */
582typedef GMM *PGMM;
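/*
 * Illustrative sketch (not part of GMMR0.cpp): chunk ID allocation from a
 * bitmap with a "previous ID" hint, as described for GMM::bmChunkId and
 * GMM::idChunkPrev above. Set bits are allocated IDs, the NIL ID (0) is
 * permanently marked allocated. The probing loop is a simplified stand-in
 * and not necessarily how the real allocator scans; the ID range and all
 * SKETCH/sketch names are assumptions.
 */
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CHUNKID_LAST 1023u   /* hypothetical upper bound */
#define SKETCH_NIL_CHUNKID  0u

typedef struct SKETCHIDALLOC
{
    uint32_t idPrev;                                    /* hint: last ID handed out */
    uint32_t bm[(SKETCH_CHUNKID_LAST + 1 + 31) / 32];   /* one bit per ID */
} SKETCHIDALLOC;

void sketchIdAllocInit(SKETCHIDALLOC *pAlloc)
{
    memset(pAlloc, 0, sizeof(*pAlloc));
    pAlloc->bm[0] |= UINT32_C(1);   /* the NIL ID is never handed out */
}

/* Returns a free ID and marks it allocated, or SKETCH_NIL_CHUNKID if none is left. */
uint32_t sketchAllocChunkId(SKETCHIDALLOC *pAlloc)
{
    /* Start right after the previous ID and wrap around at most once. */
    for (uint32_t i = 0; i <= SKETCH_CHUNKID_LAST; i++)
    {
        uint32_t id   = (pAlloc->idPrev + 1u + i) % (SKETCH_CHUNKID_LAST + 1u);
        uint32_t mask = UINT32_C(1) << (id & 31u);
        if (!(pAlloc->bm[id >> 5] & mask))
        {
            pAlloc->bm[id >> 5] |= mask;
            pAlloc->idPrev = id;
            return id;
        }
    }
    return SKETCH_NIL_CHUNKID;
}

void sketchFreeChunkId(SKETCHIDALLOC *pAlloc, uint32_t id)
{
    uint32_t mask = UINT32_C(1) << (id & 31u);
    assert(id != SKETCH_NIL_CHUNKID && id <= SKETCH_CHUNKID_LAST);
    assert(pAlloc->bm[id >> 5] & mask);     /* must currently be allocated */
    pAlloc->bm[id >> 5] &= ~mask;
}
```
/*
 * Starting at the hint means a freshly freed low ID is not reused
 * immediately, and the common case needs no scan at all.
 */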
583
584/** The value of GMM::u32Magic (Katsuhiro Otomo). */
585#define GMM_MAGIC UINT32_C(0x19540414)
586
587
588/**
589 * GMM chunk mutex state.
590 *
591 * This is returned by gmmR0ChunkMutexAcquire and is used by the other
592 * gmmR0ChunkMutex* methods.
593 */
594typedef struct GMMR0CHUNKMTXSTATE
595{
596 PGMM pGMM;
597 /** The index of the chunk mutex. */
598 uint8_t iChunkMtx;
599 /** The relevant flags (GMMR0CHUNK_MTX_XXX). */
600 uint8_t fFlags;
601} GMMR0CHUNKMTXSTATE;
602/** Pointer to a chunk mutex state. */
603typedef GMMR0CHUNKMTXSTATE *PGMMR0CHUNKMTXSTATE;
604
605/** @name GMMR0CHUNK_MTX_XXX
606 * @{ */
607#define GMMR0CHUNK_MTX_INVALID UINT32_C(0)
608#define GMMR0CHUNK_MTX_KEEP_GIANT UINT32_C(1)
609#define GMMR0CHUNK_MTX_RETAKE_GIANT UINT32_C(2)
610#define GMMR0CHUNK_MTX_DROP_GIANT UINT32_C(3)
611#define GMMR0CHUNK_MTX_END UINT32_C(4)
612/** @} */
613
614
615/** The maximum number of shared modules per-vm. */
616#define GMM_MAX_SHARED_PER_VM_MODULES 2048
617/** The maximum number of shared modules GMM is allowed to track. */
618#define GMM_MAX_SHARED_GLOBAL_MODULES 16834
619
620
621/**
622 * Argument packet for gmmR0SharedModuleCleanup.
623 */
624typedef struct GMMR0SHMODPERVMDTORARGS
625{
626 PGVM pGVM;
627 PGMM pGMM;
628} GMMR0SHMODPERVMDTORARGS;
629
630/**
631 * Argument packet for gmmR0CheckSharedModule.
632 */
633typedef struct GMMCHECKSHAREDMODULEINFO
634{
635 PGVM pGVM;
636 VMCPUID idCpu;
637} GMMCHECKSHAREDMODULEINFO;
638
639/**
640 * Argument packet for gmmR0FindDupPageInChunk by GMMR0FindDuplicatePage.
641 */
642typedef struct GMMFINDDUPPAGEINFO
643{
644 PGVM pGVM;
645 PGMM pGMM;
646 uint8_t *pSourcePage;
647 bool fFoundDuplicate;
648} GMMFINDDUPPAGEINFO;
649
650
651/*******************************************************************************
652* Global Variables *
653*******************************************************************************/
654/** Pointer to the GMM instance data. */
655static PGMM g_pGMM = NULL;
656
657/** Macro for obtaining and validating the g_pGMM pointer.
658 *
659 * On failure it will return from the invoking function with the specified
660 * return value.
661 *
662 * @param pGMM The name of the pGMM variable.
663 * @param rc The return value on failure. Use VERR_GMM_INSTANCE for VBox
664 * status codes.
665 */
666#define GMM_GET_VALID_INSTANCE(pGMM, rc) \
667 do { \
668 (pGMM) = g_pGMM; \
669 AssertPtrReturn((pGMM), (rc)); \
670 AssertMsgReturn((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic), (rc)); \
671 } while (0)
672
673/** Macro for obtaining and validating the g_pGMM pointer, void function
674 * variant.
675 *
676 * On failure it will return from the invoking function.
677 *
678 * @param pGMM The name of the pGMM variable.
679 */
680#define GMM_GET_VALID_INSTANCE_VOID(pGMM) \
681 do { \
682 (pGMM) = g_pGMM; \
683 AssertPtrReturnVoid((pGMM)); \
684 AssertMsgReturnVoid((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic)); \
685 } while (0)
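/*
 * Illustrative sketch (not part of GMMR0.cpp): the "fetch global instance,
 * validate pointer and magic, else return from the caller" pattern used by
 * GMM_GET_VALID_INSTANCE, reduced to portable C. The IPRT assertion macros
 * are replaced by a plain if/return, and the error value and all
 * SKETCH/sketch names are hypothetical.
 */
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAGIC         UINT32_C(0x19540414)
#define SKETCH_ERR_INSTANCE  (-1)   /* hypothetical status code */

typedef struct SKETCHINST
{
    uint32_t u32Magic;  /* eye catcher, SKETCH_MAGIC while the instance is alive */
    uint32_t cCalls;
} SKETCHINST;

static SKETCHINST *g_pSketchInst = NULL;

/* Returns (rc) from the invoking function if the instance is missing or corrupt. */
#define SKETCH_GET_VALID_INSTANCE(pInst, rc) \
    do { \
        (pInst) = g_pSketchInst; \
        if ((pInst) == NULL || (pInst)->u32Magic != SKETCH_MAGIC) \
            return (rc); \
    } while (0)

void sketchSetInstance(SKETCHINST *pInst)
{
    g_pSketchInst = pInst;
}

int sketchDoWork(void)
{
    SKETCHINST *pInst;
    SKETCH_GET_VALID_INSTANCE(pInst, SKETCH_ERR_INSTANCE);
    assert(pInst != NULL);  /* guaranteed by the macro */
    pInst->cCalls++;
    return 0;
}
```
/*
 * The do { } while (0) wrapper makes the macro behave like one statement,
 * so it stays safe inside an unbraced if/else.
 */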
686
687
688/** @def GMM_CHECK_SANITY_UPON_ENTERING
689 * Checks the sanity of the GMM instance data before making changes.
690 *
691 * This macro is a stub by default and must be enabled manually in the code.
692 *
693 * @returns true if sane, false if not.
694 * @param pGMM The name of the pGMM variable.
695 */
696#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
697# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
698#else
699# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (true)
700#endif
701
702/** @def GMM_CHECK_SANITY_UPON_LEAVING
703 * Checks the sanity of the GMM instance data after making changes.
704 *
705 * This macro is a stub by default and must be enabled manually in the code.
706 *
707 * @returns true if sane, false if not.
708 * @param pGMM The name of the pGMM variable.
709 */
710#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
711# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
712#else
713# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (true)
714#endif
715
716/** @def GMM_CHECK_SANITY_IN_LOOPS
717 * Checks the sanity of the GMM instance in the allocation loops.
718 *
719 * This macro is a stub by default and must be enabled manually in the code.
720 *
721 * @returns true if sane, false if not.
722 * @param pGMM The name of the pGMM variable.
723 */
724#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
725# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
726#else
727# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (true)
728#endif
729
730
731/*******************************************************************************
732* Internal Functions *
733*******************************************************************************/
734static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM);
735static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
736DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk);
737DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet);
738DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
739#ifdef GMMR0_WITH_SANITY_CHECK
740static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo);
741#endif
742static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem);
743DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
744DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
745static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
746#ifdef VBOX_WITH_PAGE_SHARING
747static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM);
748# ifdef VBOX_STRICT
749static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage);
750# endif
751#endif
752
753
754
755/**
756 * Initializes the GMM component.
757 *
758 * This is called when the VMMR0.r0 module is loaded and protected by the
759 * loader semaphore.
760 *
761 * @returns VBox status code.
762 */
763GMMR0DECL(int) GMMR0Init(void)
764{
765 LogFlow(("GMMInit:\n"));
766
767 /*
768 * Allocate the instance data and the locks.
769 */
770 PGMM pGMM = (PGMM)RTMemAllocZ(sizeof(*pGMM));
771 if (!pGMM)
772 return VERR_NO_MEMORY;
773
774 pGMM->u32Magic = GMM_MAGIC;
775 for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
776 pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
777 RTListInit(&pGMM->ChunkList);
778 ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
779
780#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
781 int rc = RTCritSectInit(&pGMM->GiantCritSect);
782#else
783 int rc = RTSemFastMutexCreate(&pGMM->hMtx);
784#endif
785 if (RT_SUCCESS(rc))
786 {
787 unsigned iMtx;
788 for (iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
789 {
790 rc = RTSemFastMutexCreate(&pGMM->aChunkMtx[iMtx].hMtx);
791 if (RT_FAILURE(rc))
792 break;
793 }
794 if (RT_SUCCESS(rc))
795 {
796 /*
797 * Check and see if RTR0MemObjAllocPhysNC works.
798 */
799#if 0 /* later, see @bugref{3170}. */
800 RTR0MEMOBJ MemObj;
801 rc = RTR0MemObjAllocPhysNC(&MemObj, _64K, NIL_RTHCPHYS);
802 if (RT_SUCCESS(rc))
803 {
804 rc = RTR0MemObjFree(MemObj, true);
805 AssertRC(rc);
806 }
807 else if (rc == VERR_NOT_SUPPORTED)
808 pGMM->fLegacyAllocationMode = pGMM->fBoundMemoryMode = true;
809 else
810 SUPR0Printf("GMMR0Init: RTR0MemObjAllocPhysNC(,64K,Any) -> %d!\n", rc);
811#else
812# if defined(RT_OS_WINDOWS) || (defined(RT_OS_SOLARIS) && ARCH_BITS == 64) || defined(RT_OS_LINUX) || defined(RT_OS_FREEBSD)
813 pGMM->fLegacyAllocationMode = false;
814# if ARCH_BITS == 32
815 /* Don't reuse possibly partial chunks because of the virtual
816 address space limitation. */
817 pGMM->fBoundMemoryMode = true;
818# else
819 pGMM->fBoundMemoryMode = false;
820# endif
821# else
822 pGMM->fLegacyAllocationMode = true;
823 pGMM->fBoundMemoryMode = true;
824# endif
825#endif
826
827 /*
828 * Query system page count and guess a reasonable cMaxPages value.
829 */
830 pGMM->cMaxPages = UINT32_MAX; /** @todo IPRT function for query ram size and such. */
831
832 g_pGMM = pGMM;
833 LogFlow(("GMMInit: pGMM=%p fLegacyAllocationMode=%RTbool fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fLegacyAllocationMode, pGMM->fBoundMemoryMode));
834 return VINF_SUCCESS;
835 }
836
837 /*
838 * Bail out.
839 */
840 while (iMtx-- > 0)
841 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
842#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
843 RTCritSectDelete(&pGMM->GiantCritSect);
844#else
845 RTSemFastMutexDestroy(pGMM->hMtx);
846#endif
847 }
848
849 pGMM->u32Magic = 0;
850 RTMemFree(pGMM);
851 SUPR0Printf("GMMR0Init: failed! rc=%d\n", rc);
852 return rc;
853}
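/*
 * Illustrative sketch (not part of GMMR0.cpp): the create-then-roll-back
 * pattern GMMR0Init uses for the chunk mutex array. If creating lock number
 * i fails, exactly the i locks created before it are destroyed again via
 * the "while (i-- > 0)" loop. The lock type, the failure injection and all
 * SKETCH/sketch names are hypothetical stand-ins for the IPRT semaphore API.
 */
```c
#include <assert.h>

#define SKETCH_NUM_LOCKS 4u

typedef struct SKETCHLOCK { int fCreated; } SKETCHLOCK;

static unsigned g_cSketchLiveLocks = 0;     /* number of locks currently alive */

static int sketchLockCreate(SKETCHLOCK *pLock, int fFail)
{
    if (fFail)
        return -1;                          /* simulated creation failure */
    pLock->fCreated = 1;
    g_cSketchLiveLocks++;
    return 0;
}

static void sketchLockDestroy(SKETCHLOCK *pLock)
{
    assert(pLock->fCreated);
    pLock->fCreated = 0;
    g_cSketchLiveLocks--;
}

unsigned sketchLiveLocks(void)
{
    return g_cSketchLiveLocks;
}

/* Creates all locks; creation of lock number iFailAt is made to fail. */
int sketchInit(SKETCHLOCK *paLocks, unsigned iFailAt)
{
    unsigned i;
    int      rc = 0;
    for (i = 0; i < SKETCH_NUM_LOCKS; i++)
    {
        rc = sketchLockCreate(&paLocks[i], i == iFailAt);
        if (rc != 0)
            break;
    }
    if (rc == 0)
        return 0;

    /* Bail out: i is the index that failed, so undo entries i-1 down to 0. */
    while (i-- > 0)
        sketchLockDestroy(&paLocks[i]);
    return rc;
}
```
/*
 * Leaving the loop index at the failing position is what lets the cleanup
 * loop skip the entry that was never created.
 */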
854
855
856/**
857 * Terminates the GMM component.
858 */
859GMMR0DECL(void) GMMR0Term(void)
860{
861 LogFlow(("GMMTerm:\n"));
862
863 /*
864 * Take care / be paranoid...
865 */
866 PGMM pGMM = g_pGMM;
867 if (!VALID_PTR(pGMM))
868 return;
869 if (pGMM->u32Magic != GMM_MAGIC)
870 {
871 SUPR0Printf("GMMR0Term: u32Magic=%#x\n", pGMM->u32Magic);
872 return;
873 }
874
875 /*
876 * Undo what init did and free all the resources we've acquired.
877 */
878 /* Destroy the fundamentals. */
879 g_pGMM = NULL;
880 pGMM->u32Magic = ~GMM_MAGIC;
881#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
882 RTCritSectDelete(&pGMM->GiantCritSect);
883#else
884 RTSemFastMutexDestroy(pGMM->hMtx);
885 pGMM->hMtx = NIL_RTSEMFASTMUTEX;
886#endif
887
888 /* Free any chunks still hanging around. */
889 RTAvlU32Destroy(&pGMM->pChunks, gmmR0TermDestroyChunk, pGMM);
890
891 /* Destroy the chunk locks. */
892 for (unsigned iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
893 {
894 Assert(pGMM->aChunkMtx[iMtx].cUsers == 0);
895 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
896 pGMM->aChunkMtx[iMtx].hMtx = NIL_RTSEMFASTMUTEX;
897 }
898
899 /* Finally the instance data itself. */
900 RTMemFree(pGMM);
901 LogFlow(("GMMTerm: done\n"));
902}
903
904
905/**
906 * RTAvlU32Destroy callback.
907 *
908 * @returns 0
909 * @param pNode The node to destroy.
910 * @param pvGMM The GMM handle.
911 */
912static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM)
913{
914 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
915
916 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
917 SUPR0Printf("GMMR0Term: %p/%#x: cFree=%d cPrivate=%d cShared=%d cMappings=%d\n", pChunk,
918 pChunk->Core.Key, pChunk->cFree, pChunk->cPrivate, pChunk->cShared, pChunk->cMappingsX);
919
920 int rc = RTR0MemObjFree(pChunk->hMemObj, true /* fFreeMappings */);
921 if (RT_FAILURE(rc))
922 {
923 SUPR0Printf("GMMR0Term: %p/%#x: RTR0MemObjFree(%p,true) -> %d (cMappings=%d)\n", pChunk,
924 pChunk->Core.Key, pChunk->hMemObj, rc, pChunk->cMappingsX);
925 AssertRC(rc);
926 }
927 pChunk->hMemObj = NIL_RTR0MEMOBJ;
928
929 RTMemFree(pChunk->paMappingsX);
930 pChunk->paMappingsX = NULL;
931
932 RTMemFree(pChunk);
933 NOREF(pvGMM);
934 return 0;
935}
936
937
938/**
939 * Initializes the per-VM data for the GMM.
940 *
941 * This is called from within the GVMM lock (from GVMMR0CreateVM)
942 * and should only initialize the data members so GMMR0CleanupVM
943 * can deal with them. We reserve no memory or anything here,
944 * that's done later in GMMR0InitVM.
945 *
946 * @param pGVM Pointer to the Global VM structure.
947 */
948GMMR0DECL(void) GMMR0InitPerVMData(PGVM pGVM)
949{
950 AssertCompile(RT_SIZEOFMEMB(GVM,gmm.s) <= RT_SIZEOFMEMB(GVM,gmm.padding));
951
952 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
953 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
954 pGVM->gmm.s.Stats.fMayAllocate = false;
955}
956
957
958/**
959 * Acquires the GMM giant lock.
960 *
961 * @returns Assert status code from RTSemFastMutexRequest.
962 * @param pGMM Pointer to the GMM instance.
963 */
964static int gmmR0MutexAcquire(PGMM pGMM)
965{
966 ASMAtomicIncU32(&pGMM->cMtxContenders);
967#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
968 int rc = RTCritSectEnter(&pGMM->GiantCritSect);
969#else
970 int rc = RTSemFastMutexRequest(pGMM->hMtx);
971#endif
972 ASMAtomicDecU32(&pGMM->cMtxContenders);
973 AssertRC(rc);
974#ifdef VBOX_STRICT
975 pGMM->hMtxOwner = RTThreadNativeSelf();
976#endif
977 return rc;
978}
979
980
981/**
982 * Releases the GMM giant lock.
983 *
984 * @returns Assert status code from RTSemFastMutexRelease or RTCritSectLeave.
985 * @param pGMM Pointer to the GMM instance.
986 */
987static int gmmR0MutexRelease(PGMM pGMM)
988{
989#ifdef VBOX_STRICT
990 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
991#endif
992#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
993 int rc = RTCritSectLeave(&pGMM->GiantCritSect);
994#else
995 int rc = RTSemFastMutexRelease(pGMM->hMtx);
996 AssertRC(rc);
997#endif
998 return rc;
999}
1000
1001
1002/**
1003 * Yields the GMM giant lock if there is contention and a certain minimum time
1004 * has elapsed since we took it.
1005 *
1006 * @returns @c true if the mutex was yielded, @c false if not.
1007 * @param pGMM Pointer to the GMM instance.
1008 * @param puLockNanoTS Where the lock acquisition time stamp is kept
1009 * (in/out).
1010 */
1011static bool gmmR0MutexYield(PGMM pGMM, uint64_t *puLockNanoTS)
1012{
1013 /*
1014 * If nobody is contending for the mutex, don't bother checking the time.
1015 */
1016 if (ASMAtomicReadU32(&pGMM->cMtxContenders) == 0)
1017 return false;
1018
1019 /*
1020 * Don't yield if we haven't executed for at least 2 milliseconds.
1021 */
1022 uint64_t uNanoNow = RTTimeSystemNanoTS();
1023 if (uNanoNow - *puLockNanoTS < UINT32_C(2000000))
1024 return false;
1025
1026 /*
1027 * Yield the mutex.
1028 */
1029#ifdef VBOX_STRICT
1030 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
1031#endif
1032 ASMAtomicIncU32(&pGMM->cMtxContenders);
1033#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1034 int rc1 = RTCritSectLeave(&pGMM->GiantCritSect); AssertRC(rc1);
1035#else
1036 int rc1 = RTSemFastMutexRelease(pGMM->hMtx); AssertRC(rc1);
1037#endif
1038
1039 RTThreadYield();
1040
1041#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1042 int rc2 = RTCritSectEnter(&pGMM->GiantCritSect); AssertRC(rc2);
1043#else
1044 int rc2 = RTSemFastMutexRequest(pGMM->hMtx); AssertRC(rc2);
1045#endif
1046 *puLockNanoTS = RTTimeSystemNanoTS();
1047 ASMAtomicDecU32(&pGMM->cMtxContenders);
1048#ifdef VBOX_STRICT
1049 pGMM->hMtxOwner = RTThreadNativeSelf();
1050#endif
1051
1052 return true;
1053}
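/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of the yield
 * decision made by gmmR0MutexYield above, reduced to a pure function so it can
 * be exercised without any locking. GiantLockSketch, kMinHoldNs and
 * shouldYield are hypothetical names; only the contender counter and the 2 ms
 * threshold are taken from the listing.
 */

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the contention counter kept in the GMM instance.
struct GiantLockSketch
{
    std::atomic<uint32_t> cContenders{0};
};

// Minimum time the holder keeps the lock before yielding (2 ms, as in the listing).
static const uint64_t kMinHoldNs = 2000000;

// Returns true when the current holder should drop and retake the lock:
// somebody must be waiting, and the holder must have run for at least kMinHoldNs.
static bool shouldYield(const GiantLockSketch &lock, uint64_t uNowNs, uint64_t uLockTakenNs)
{
    if (lock.cContenders.load() == 0)
        return false;                       // nobody waiting, skip the time check
    return uNowNs - uLockTakenNs >= kMinHoldNs;
}
```

/*
 * A caller would release the lock, yield the CPU, reacquire the lock and then
 * refresh its acquisition time stamp whenever shouldYield() returns true.
 */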
1054
1055
1056/**
1057 * Acquires a chunk lock.
1058 *
1059 * The caller must own the giant lock.
1060 *
1061 * @returns Assert status code from RTSemFastMutexRequest.
1062 * @param pMtxState The chunk mutex state info. (Avoids
1063 * passing the same flags and stuff around
1064 * for subsequent release and drop-giant
1065 * calls.)
1066 * @param pGMM Pointer to the GMM instance.
1067 * @param pChunk Pointer to the chunk.
1068 * @param fFlags Flags regarding the giant lock, GMMR0CHUNK_MTX_XXX.
1069 */
1070static int gmmR0ChunkMutexAcquire(PGMMR0CHUNKMTXSTATE pMtxState, PGMM pGMM, PGMMCHUNK pChunk, uint32_t fFlags)
1071{
1072 Assert(fFlags > GMMR0CHUNK_MTX_INVALID && fFlags < GMMR0CHUNK_MTX_END);
1073 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1074
1075 pMtxState->pGMM = pGMM;
1076 pMtxState->fFlags = (uint8_t)fFlags;
1077
1078 /*
1079 * Get the lock index and reference the lock.
1080 */
1081 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1082 uint32_t iChunkMtx = pChunk->iChunkMtx;
1083 if (iChunkMtx == UINT8_MAX)
1084 {
1085 iChunkMtx = pGMM->iNextChunkMtx++;
1086 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1087
1088 /* Try to get an unused one... */
1089 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1090 {
1091 iChunkMtx = pGMM->iNextChunkMtx++;
1092 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1093 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1094 {
1095 iChunkMtx = pGMM->iNextChunkMtx++;
1096 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1097 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1098 {
1099 iChunkMtx = pGMM->iNextChunkMtx++;
1100 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1101 }
1102 }
1103 }
1104
1105 pChunk->iChunkMtx = iChunkMtx;
1106 }
1107 AssertCompile(RT_ELEMENTS(pGMM->aChunkMtx) < UINT8_MAX);
1108 pMtxState->iChunkMtx = (uint8_t)iChunkMtx;
1109 ASMAtomicIncU32(&pGMM->aChunkMtx[iChunkMtx].cUsers);
1110
1111 /*
1112 * Drop the giant?
1113 */
1114 if (fFlags != GMMR0CHUNK_MTX_KEEP_GIANT)
1115 {
1116 /** @todo GMM life cycle cleanup (we may race someone
1117 * destroying and cleaning up GMM)? */
1118 gmmR0MutexRelease(pGMM);
1119 }
1120
1121 /*
1122 * Take the chunk mutex.
1123 */
1124 int rc = RTSemFastMutexRequest(pGMM->aChunkMtx[iChunkMtx].hMtx);
1125 AssertRC(rc);
1126 return rc;
1127}
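/*
 * Illustration only, not part of GMMR0.cpp: a sketch of the slot selection
 * used by gmmR0ChunkMutexAcquire above, written as a loop instead of nested
 * ifs. It takes the next round-robin slot and retries up to three more times
 * while the slot is in use, accepting the fourth candidate unconditionally.
 * ChunkMtxPoolSketch, pickSlot and the pool size of 8 are assumptions; the
 * real size of aChunkMtx is not shown in this part of the file.
 */

```cpp
#include <cassert>
#include <cstdint>

static const uint32_t kNumSlots = 8;    // assumed pool size

// Hypothetical stand-in for the aChunkMtx array and the iNextChunkMtx cursor.
struct ChunkMtxPoolSketch
{
    uint32_t acUsers[kNumSlots] = {};   // users per mutex slot
    uint32_t iNext = 0;                 // round-robin cursor
};

// Picks a slot: at most four candidates are examined, and the last one is
// taken even if it is busy, so the search cost stays bounded.
static uint32_t pickSlot(ChunkMtxPoolSketch &pool)
{
    uint32_t iSlot = pool.iNext++ % kNumSlots;
    for (unsigned cTries = 0; cTries < 3 && pool.acUsers[iSlot] != 0; cTries++)
        iSlot = pool.iNext++ % kNumSlots;
    return iSlot;
}
```

/*
 * Sharing a busy slot is harmless for correctness, it only means two chunks
 * contend on the same mutex until one of them drops its reference.
 */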
1128
1129
1130/**
1131 * Releases the chunk mutex acquired by gmmR0ChunkMutexAcquire.
1132 *
1133 * @returns Assert status code from RTSemFastMutexRelease or the giant lock calls.
1134 * @param pMtxState Pointer to the chunk mutex state.
1135 * @param pChunk Pointer to the chunk if it's still
1136 * alive, NULL if it isn't. This is used to disassociate
1137 * the chunk from the mutex on the way out so a new one
1138 * can be selected next time, thus avoiding contended
1139 * mutexes.
1140 */
1141static int gmmR0ChunkMutexRelease(PGMMR0CHUNKMTXSTATE pMtxState, PGMMCHUNK pChunk)
1142{
1143 PGMM pGMM = pMtxState->pGMM;
1144
1145 /*
1146 * Release the chunk mutex and reacquire the giant if requested.
1147 */
1148 int rc = RTSemFastMutexRelease(pGMM->aChunkMtx[pMtxState->iChunkMtx].hMtx);
1149 AssertRC(rc);
1150 if (pMtxState->fFlags == GMMR0CHUNK_MTX_RETAKE_GIANT)
1151 rc = gmmR0MutexAcquire(pGMM);
1152 else
1153 Assert((pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT) == (pGMM->hMtxOwner == RTThreadNativeSelf()));
1154
1155 /*
1156 * Drop the chunk mutex user reference and deassociate it from the chunk
1157 * when possible.
1158 */
1159 if ( ASMAtomicDecU32(&pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers) == 0
1160 && pChunk
1161 && RT_SUCCESS(rc) )
1162 {
1163 if (pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT)
1164 pChunk->iChunkMtx = UINT8_MAX;
1165 else
1166 {
1167 rc = gmmR0MutexAcquire(pGMM);
1168 if (RT_SUCCESS(rc))
1169 {
1170 if (pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers == 0)
1171 pChunk->iChunkMtx = UINT8_MAX;
1172 rc = gmmR0MutexRelease(pGMM);
1173 }
1174 }
1175 }
1176
1177 pMtxState->pGMM = NULL;
1178 return rc;
1179}
1180
1181
1182/**
1183 * Drops the giant GMM lock we kept in gmmR0ChunkMutexAcquire while keeping the
1184 * chunk locked.
1185 *
1186 * This only works if gmmR0ChunkMutexAcquire was called with
1187 * GMMR0CHUNK_MTX_KEEP_GIANT. gmmR0ChunkMutexRelease will retake the giant
1188 * mutex, i.e. behave as if GMMR0CHUNK_MTX_RETAKE_GIANT was used.
1189 *
1190 * @returns VBox status code (assuming success is ok).
1191 * @param pMtxState Pointer to the chunk mutex state.
1192 */
1193static int gmmR0ChunkMutexDropGiant(PGMMR0CHUNKMTXSTATE pMtxState)
1194{
1195 AssertReturn(pMtxState->fFlags == GMMR0CHUNK_MTX_KEEP_GIANT, VERR_GMM_MTX_FLAGS);
1196 Assert(pMtxState->pGMM->hMtxOwner == RTThreadNativeSelf());
1197 pMtxState->fFlags = GMMR0CHUNK_MTX_RETAKE_GIANT;
1198 /** @todo GMM life cycle cleanup (we may race someone
1199 * destroying and cleaning up GMM)? */
1200 return gmmR0MutexRelease(pMtxState->pGMM);
1201}
1202
1203
1204/**
1205 * For experimenting with NUMA affinity and such.
1206 *
1207 * @returns The current NUMA Node ID.
1208 */
1209static uint16_t gmmR0GetCurrentNumaNodeId(void)
1210{
1211#if 1
1212 return GMM_CHUNK_NUMA_ID_UNKNOWN;
1213#else
1214 return RTMpCpuId() / 16;
1215#endif
1216}
1217
1218
1219
1220/**
1221 * Cleans up when a VM is terminating.
1222 *
1223 * @param pGVM Pointer to the Global VM structure.
1224 */
1225GMMR0DECL(void) GMMR0CleanupVM(PGVM pGVM)
1226{
1227 LogFlow(("GMMR0CleanupVM: pGVM=%p:{.pVM=%p, .hSelf=%#x}\n", pGVM, pGVM->pVM, pGVM->hSelf));
1228
1229 PGMM pGMM;
1230 GMM_GET_VALID_INSTANCE_VOID(pGMM);
1231
1232#ifdef VBOX_WITH_PAGE_SHARING
1233 /*
1234 * Clean up all registered shared modules first.
1235 */
1236 gmmR0SharedModuleCleanup(pGMM, pGVM);
1237#endif
1238
1239 gmmR0MutexAcquire(pGMM);
1240 uint64_t uLockNanoTS = RTTimeSystemNanoTS();
1241 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
1242
1243 /*
1244 * The policy is 'INVALID' until the initial reservation
1245 * request has been serviced.
1246 */
1247 if ( pGVM->gmm.s.Stats.enmPolicy > GMMOCPOLICY_INVALID
1248 && pGVM->gmm.s.Stats.enmPolicy < GMMOCPOLICY_END)
1249 {
1250 /*
1251 * If it's the last VM around, we can skip walking all the chunks looking
1252 * for the pages owned by this VM and instead flush the whole shebang.
1253 *
1254 * This takes care of the eventuality that a VM has left shared page
1255 * references behind (shouldn't happen of course, but you never know).
1256 */
1257 Assert(pGMM->cRegisteredVMs);
1258 pGMM->cRegisteredVMs--;
1259
1260 /*
1261 * Walk the entire pool looking for pages that belong to this VM
1262 * and leftover mappings. (This'll only catch private pages,
1263 * shared pages will be 'left behind'.)
1264 */
1265 /** @todo r=bird: This scanning+freeing could be optimized in bound mode! */
1266 uint64_t cPrivatePages = pGVM->gmm.s.Stats.cPrivatePages; /* save */
1267
1268 unsigned iCountDown = 64;
1269 bool fRedoFromStart;
1270 PGMMCHUNK pChunk;
1271 do
1272 {
1273 fRedoFromStart = false;
1274 RTListForEachReverse(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
1275 {
1276 uint32_t const cFreeChunksOld = pGMM->cFreedChunks;
1277 if ( ( !pGMM->fBoundMemoryMode
1278 || pChunk->hGVM == pGVM->hSelf)
1279 && gmmR0CleanupVMScanChunk(pGMM, pGVM, pChunk))
1280 {
1281 /* We left the giant mutex, so reset the yield counters. */
1282 uLockNanoTS = RTTimeSystemNanoTS();
1283 iCountDown = 64;
1284 }
1285 else
1286 {
1287 /* Didn't leave it, so do normal yielding. */
1288 if (!iCountDown)
1289 gmmR0MutexYield(pGMM, &uLockNanoTS);
1290 else
1291 iCountDown--;
1292 }
1293 if (pGMM->cFreedChunks != cFreeChunksOld)
1294 {
1295 fRedoFromStart = true;
1296 break;
1297 }
1298 }
1299 } while (fRedoFromStart);
1300
1301 if (pGVM->gmm.s.Stats.cPrivatePages)
1302 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x has %#x private pages that cannot be found!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cPrivatePages);
1303
1304 pGMM->cAllocatedPages -= cPrivatePages;
1305
1306 /*
1307 * Free empty chunks.
1308 */
1309 PGMMCHUNKFREESET pPrivateSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
1310 do
1311 {
1312 fRedoFromStart = false;
1313 iCountDown = 10240;
1314 pChunk = pPrivateSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
1315 while (pChunk)
1316 {
1317 PGMMCHUNK pNext = pChunk->pFreeNext;
1318 Assert(pChunk->cFree == GMM_CHUNK_NUM_PAGES);
1319 if ( !pGMM->fBoundMemoryMode
1320 || pChunk->hGVM == pGVM->hSelf)
1321 {
1322 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1323 if (gmmR0FreeChunk(pGMM, pGVM, pChunk, true /*fRelaxedSem*/))
1324 {
1325 /* We've left the giant mutex, restart? (+1 for our unlink) */
1326 fRedoFromStart = pPrivateSet->idGeneration != idGenerationOld + 1;
1327 if (fRedoFromStart)
1328 break;
1329 uLockNanoTS = RTTimeSystemNanoTS();
1330 iCountDown = 10240;
1331 }
1332 }
1333
1334 /* Advance and maybe yield the lock. */
1335 pChunk = pNext;
1336 if (--iCountDown == 0)
1337 {
1338 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1339 fRedoFromStart = gmmR0MutexYield(pGMM, &uLockNanoTS)
1340 && pPrivateSet->idGeneration != idGenerationOld;
1341 if (fRedoFromStart)
1342 break;
1343 iCountDown = 10240;
1344 }
1345 }
1346 } while (fRedoFromStart);
1347
1348 /*
1349 * Account for shared pages that weren't freed.
1350 */
1351 if (pGVM->gmm.s.Stats.cSharedPages)
1352 {
1353 Assert(pGMM->cSharedPages >= pGVM->gmm.s.Stats.cSharedPages);
1354 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x left %#x shared pages behind!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cSharedPages);
1355 pGMM->cLeftBehindSharedPages += pGVM->gmm.s.Stats.cSharedPages;
1356 }
1357
1358 /*
1359 * Clean up balloon statistics in case the VM process crashed.
1360 */
1361 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
1362 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
1363
1364 /*
1365 * Update the over-commitment management statistics.
1366 */
1367 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1368 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1369 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1370 switch (pGVM->gmm.s.Stats.enmPolicy)
1371 {
1372 case GMMOCPOLICY_NO_OC:
1373 break;
1374 default:
1375 /** @todo Update GMM->cOverCommittedPages */
1376 break;
1377 }
1378 }
1379
1380 /* zap the GVM data. */
1381 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
1382 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
1383 pGVM->gmm.s.Stats.fMayAllocate = false;
1384
1385 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1386 gmmR0MutexRelease(pGMM);
1387
1388 LogFlow(("GMMR0CleanupVM: returns\n"));
1389}
1390
1391
1392/**
1393 * Scan one chunk for private pages belonging to the specified VM.
1394 *
1395 * @note This function may drop the giant mutex!
1396 *
1397 * @returns @c true if we've temporarily dropped the giant mutex, @c false if
1398 * we didn't.
1399 * @param pGMM Pointer to the GMM instance.
1400 * @param pGVM The global VM handle.
1401 * @param pChunk The chunk to scan.
1402 */
1403static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1404{
1405 Assert(!pGMM->fBoundMemoryMode || pChunk->hGVM == pGVM->hSelf);
1406
1407 /*
1408 * Look for pages belonging to the VM.
1409 * (Perform some internal checks while we're scanning.)
1410 */
1411#ifndef VBOX_STRICT
1412 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
1413#endif
1414 {
1415 unsigned cPrivate = 0;
1416 unsigned cShared = 0;
1417 unsigned cFree = 0;
1418
1419 gmmR0UnlinkChunk(pChunk); /* avoiding cFreePages updates. */
1420
1421 uint16_t hGVM = pGVM->hSelf;
1422 unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
1423 while (iPage-- > 0)
1424 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
1425 {
1426 if (pChunk->aPages[iPage].Private.hGVM == hGVM)
1427 {
1428 /*
1429 * Free the page.
1430 *
1431 * The reason for not using gmmR0FreePrivatePage here is that we
1432 * must *not* cause the chunk to be freed from under us - we're in
1433 * the middle of a chunk list walk here.
1434 */
1435 pChunk->aPages[iPage].u = 0;
1436 pChunk->aPages[iPage].Free.iNext = pChunk->iFreeHead;
1437 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1438 pChunk->iFreeHead = iPage;
1439 pChunk->cPrivate--;
1440 pChunk->cFree++;
1441 pGVM->gmm.s.Stats.cPrivatePages--;
1442 cFree++;
1443 }
1444 else
1445 cPrivate++;
1446 }
1447 else if (GMM_PAGE_IS_FREE(&pChunk->aPages[iPage]))
1448 cFree++;
1449 else
1450 cShared++;
1451
1452 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1453
1454 /*
1455 * Did it add up?
1456 */
1457 if (RT_UNLIKELY( pChunk->cFree != cFree
1458 || pChunk->cPrivate != cPrivate
1459 || pChunk->cShared != cShared))
1460 {
1461 SUPR0Printf("gmmR0CleanupVMScanChunk: Chunk %p/%#x has bogus stats - free=%d/%d private=%d/%d shared=%d/%d\n",
1462 pChunk, pChunk->Core.Key, pChunk->cFree, cFree, pChunk->cPrivate, cPrivate, pChunk->cShared, cShared);
1463 pChunk->cFree = cFree;
1464 pChunk->cPrivate = cPrivate;
1465 pChunk->cShared = cShared;
1466 }
1467 }
1468
1469 /*
1470 * If not in bound memory mode, we should reset the hGVM field
1471 * if it has our handle in it.
1472 */
1473 if (pChunk->hGVM == pGVM->hSelf)
1474 {
1475 if (!g_pGMM->fBoundMemoryMode)
1476 pChunk->hGVM = NIL_GVM_HANDLE;
1477 else if (pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1478 {
1479 SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: cFree=%#x - it should be %#x in bound mode!\n",
1480 pChunk, pChunk->Core.Key, pChunk->cFree, GMM_CHUNK_NUM_PAGES);
1481 AssertMsgFailed(("%p/%#x: cFree=%#x - it should be %#x in bound mode!\n", pChunk, pChunk->Core.Key, pChunk->cFree, GMM_CHUNK_NUM_PAGES));
1482
1483 gmmR0UnlinkChunk(pChunk);
1484 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1485 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1486 }
1487 }
1488
1489 /*
1490 * Look for a mapping belonging to the terminating VM.
1491 */
1492 GMMR0CHUNKMTXSTATE MtxState;
1493 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
1494 unsigned cMappings = pChunk->cMappingsX;
1495 for (unsigned i = 0; i < cMappings; i++)
1496 if (pChunk->paMappingsX[i].pGVM == pGVM)
1497 {
1498 gmmR0ChunkMutexDropGiant(&MtxState);
1499
1500 RTR0MEMOBJ hMemObj = pChunk->paMappingsX[i].hMapObj;
1501
1502 cMappings--;
1503 if (i < cMappings)
1504 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
1505 pChunk->paMappingsX[cMappings].pGVM = NULL;
1506 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
1507 Assert(pChunk->cMappingsX - 1U == cMappings);
1508 pChunk->cMappingsX = cMappings;
1509
1510 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings (NA) */);
1511 if (RT_FAILURE(rc))
1512 {
1513 SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: mapping #%x: RTR0MemObjFree(%p,false) -> %d\n",
1514 pChunk, pChunk->Core.Key, i, hMemObj, rc);
1515 AssertRC(rc);
1516 }
1517
1518 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1519 return true;
1520 }
1521
1522 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1523 return false;
1524}
1525
1526
1527/**
1528 * The initial resource reservations.
1529 *
1530 * This will make memory reservations according to policy and priority. If there aren't
1531 * sufficient resources available to sustain the VM this function will fail and all
1532 * future allocation requests will fail as well.
1533 *
1534 * These are just the initial reservations made very early during the VM creation
1535 * process and will be adjusted later in the GMMR0UpdateReservation call after the
1536 * ring-3 init has completed.
1537 *
1538 * @returns VBox status code.
1539 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1540 * @retval VERR_GMM_
1541 *
1542 * @param pVM Pointer to the VM.
1543 * @param idCpu The VCPU id.
1544 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1545 * This does not include MMIO2 and similar.
1546 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1547 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1548 * hyper heap, MMIO2 and similar.
1549 * @param enmPolicy The OC policy to use on this VM.
1550 * @param enmPriority The priority in an out-of-memory situation.
1551 *
1552 * @thread The creator thread / EMT.
1553 */
1554GMMR0DECL(int) GMMR0InitialReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages,
1555 GMMOCPOLICY enmPolicy, GMMPRIORITY enmPriority)
1556{
1557 LogFlow(("GMMR0InitialReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x enmPolicy=%d enmPriority=%d\n",
1558 pVM, cBasePages, cShadowPages, cFixedPages, enmPolicy, enmPriority));
1559
1560 /*
1561 * Validate, get basics and take the semaphore.
1562 */
1563 PGMM pGMM;
1564 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1565 PGVM pGVM;
1566 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1567 if (RT_FAILURE(rc))
1568 return rc;
1569
1570 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1571 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1572 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1573 AssertReturn(enmPolicy > GMMOCPOLICY_INVALID && enmPolicy < GMMOCPOLICY_END, VERR_INVALID_PARAMETER);
1574 AssertReturn(enmPriority > GMMPRIORITY_INVALID && enmPriority < GMMPRIORITY_END, VERR_INVALID_PARAMETER);
1575
1576 gmmR0MutexAcquire(pGMM);
1577 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1578 {
1579 if ( !pGVM->gmm.s.Stats.Reserved.cBasePages
1580 && !pGVM->gmm.s.Stats.Reserved.cFixedPages
1581 && !pGVM->gmm.s.Stats.Reserved.cShadowPages)
1582 {
1583 /*
1584 * Check if we can accommodate this.
1585 */
1586 /* ... later ... */
1587 if (RT_SUCCESS(rc))
1588 {
1589 /*
1590 * Update the records.
1591 */
1592 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1593 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1594 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1595 pGVM->gmm.s.Stats.enmPolicy = enmPolicy;
1596 pGVM->gmm.s.Stats.enmPriority = enmPriority;
1597 pGVM->gmm.s.Stats.fMayAllocate = true;
1598
1599 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1600 pGMM->cRegisteredVMs++;
1601 }
1602 }
1603 else
1604 rc = VERR_WRONG_ORDER;
1605 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1606 }
1607 else
1608 rc = VERR_GMM_IS_NOT_SANE;
1609 gmmR0MutexRelease(pGMM);
1610 LogFlow(("GMMR0InitialReservation: returns %Rrc\n", rc));
1611 return rc;
1612}
1613
1614
1615/**
1616 * VMMR0 request wrapper for GMMR0InitialReservation.
1617 *
1618 * @returns see GMMR0InitialReservation.
1619 * @param pVM Pointer to the VM.
1620 * @param idCpu The VCPU id.
1621 * @param pReq Pointer to the request packet.
1622 */
1623GMMR0DECL(int) GMMR0InitialReservationReq(PVM pVM, VMCPUID idCpu, PGMMINITIALRESERVATIONREQ pReq)
1624{
1625 /*
1626 * Validate input and pass it on.
1627 */
1628 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1629 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1630 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1631
1632 return GMMR0InitialReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages, pReq->enmPolicy, pReq->enmPriority);
1633}
1634
1635
1636/**
1637 * This updates the memory reservation with the additional MMIO2 and ROM pages.
1638 *
1639 * @returns VBox status code.
1640 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1641 *
1642 * @param pVM Pointer to the VM.
1643 * @param idCpu The VCPU id.
1644 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1645 * This does not include MMIO2 and similar.
1646 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1647 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1648 * hyper heap, MMIO2 and similar.
1649 *
1650 * @thread EMT.
1651 */
1652GMMR0DECL(int) GMMR0UpdateReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages)
1653{
1654 LogFlow(("GMMR0UpdateReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x\n",
1655 pVM, cBasePages, cShadowPages, cFixedPages));
1656
1657 /*
1658 * Validate, get basics and take the semaphore.
1659 */
1660 PGMM pGMM;
1661 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1662 PGVM pGVM;
1663 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1664 if (RT_FAILURE(rc))
1665 return rc;
1666
1667 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1668 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1669 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1670
1671 gmmR0MutexAcquire(pGMM);
1672 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1673 {
1674 if ( pGVM->gmm.s.Stats.Reserved.cBasePages
1675 && pGVM->gmm.s.Stats.Reserved.cFixedPages
1676 && pGVM->gmm.s.Stats.Reserved.cShadowPages)
1677 {
1678 /*
1679 * Check if we can accommodate this.
1680 */
1681 /* ... later ... */
1682 if (RT_SUCCESS(rc))
1683 {
1684 /*
1685 * Update the records.
1686 */
1687 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1688 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1689 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1690 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1691
1692 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1693 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1694 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1695 }
1696 }
1697 else
1698 rc = VERR_WRONG_ORDER;
1699 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1700 }
1701 else
1702 rc = VERR_GMM_IS_NOT_SANE;
1703 gmmR0MutexRelease(pGMM);
1704 LogFlow(("GMMR0UpdateReservation: returns %Rrc\n", rc));
1705 return rc;
1706}
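/*
 * Illustration only, not part of GMMR0.cpp: a sketch of the bookkeeping done
 * by GMMR0InitialReservation and GMMR0UpdateReservation above. The initial
 * call is only accepted while all three per-VM counters are zero, and the
 * update call only once all three are non-zero; both keep the global total in
 * step. All type and function names here are hypothetical, and the
 * "check if we can accommodate this" step is left out because the listing
 * leaves it out as well.
 */

```cpp
#include <cassert>
#include <cstdint>

struct ReservationSketch            // per-VM reservation counters
{
    uint64_t cBasePages = 0;
    uint32_t cShadowPages = 0;
    uint32_t cFixedPages = 0;
};

struct GlobalSketch                 // global totals
{
    uint64_t cReservedPages = 0;
    uint32_t cRegisteredVMs = 0;
};

enum StatusSketch { kOk, kWrongOrder };

static uint64_t totalPages(const ReservationSketch &r)
{
    return r.cBasePages + r.cFixedPages + r.cShadowPages;
}

// First reservation for a VM: rejected if anything has been reserved already.
static StatusSketch initialReservation(GlobalSketch &g, ReservationSketch &vm,
                                       uint64_t cBase, uint32_t cShadow, uint32_t cFixed)
{
    if (vm.cBasePages || vm.cFixedPages || vm.cShadowPages)
        return kWrongOrder;
    vm.cBasePages = cBase;
    vm.cShadowPages = cShadow;
    vm.cFixedPages = cFixed;
    g.cReservedPages += totalPages(vm);
    g.cRegisteredVMs++;
    return kOk;
}

// Later adjustment: rejected unless an initial reservation is in place.
static StatusSketch updateReservation(GlobalSketch &g, ReservationSketch &vm,
                                      uint64_t cBase, uint32_t cShadow, uint32_t cFixed)
{
    if (!vm.cBasePages || !vm.cFixedPages || !vm.cShadowPages)
        return kWrongOrder;
    g.cReservedPages -= totalPages(vm);     // retire the old reservation
    vm.cBasePages = cBase;
    vm.cShadowPages = cShadow;
    vm.cFixedPages = cFixed;
    g.cReservedPages += totalPages(vm);     // account for the new one
    return kOk;
}
```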
1707
1708
1709/**
1710 * VMMR0 request wrapper for GMMR0UpdateReservation.
1711 *
1712 * @returns see GMMR0UpdateReservation.
1713 * @param pVM Pointer to the VM.
1714 * @param idCpu The VCPU id.
1715 * @param pReq Pointer to the request packet.
1716 */
1717GMMR0DECL(int) GMMR0UpdateReservationReq(PVM pVM, VMCPUID idCpu, PGMMUPDATERESERVATIONREQ pReq)
1718{
1719 /*
1720 * Validate input and pass it on.
1721 */
1722 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1723 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1724 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1725
1726 return GMMR0UpdateReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages);
1727}
1728
1729#ifdef GMMR0_WITH_SANITY_CHECK
1730
1731/**
1732 * Performs sanity checks on a free set.
1733 *
1734 * @returns Error count.
1735 *
1736 * @param pGMM Pointer to the GMM instance.
1737 * @param pSet Pointer to the set.
1738 * @param pszSetName The set name.
1739 * @param pszFunction The function from which it was called.
1740 * @param uLineNo The line number.
1741 */
1742static uint32_t gmmR0SanityCheckSet(PGMM pGMM, PGMMCHUNKFREESET pSet, const char *pszSetName,
1743 const char *pszFunction, unsigned uLineNo)
1744{
1745 uint32_t cErrors = 0;
1746
1747 /*
1748 * Count the free pages in all the chunks and match it against pSet->cFreePages.
1749 */
1750 uint32_t cPages = 0;
1751 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1752 {
1753 for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1754 {
1755 /** @todo check that the chunk is hashed into the right set. */
1756 cPages += pCur->cFree;
1757 }
1758 }
1759 if (RT_UNLIKELY(cPages != pSet->cFreePages))
1760 {
1761 SUPR0Printf("GMM insanity: found %#x pages in the %s set, expected %#x. (%s, line %u)\n",
1762 cPages, pszSetName, pSet->cFreePages, pszFunction, uLineNo);
1763 cErrors++;
1764 }
1765
1766 return cErrors;
1767}
1768
1769
1770/**
1771 * Performs some sanity checks on the GMM while owning the giant lock.
1772 *
1773 * @returns Error count.
1774 *
1775 * @param pGMM Pointer to the GMM instance.
1776 * @param pszFunction The function from which it is called.
1777 * @param uLineNo The line number.
1778 */
1779static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo)
1780{
1781 uint32_t cErrors = 0;
1782
1783 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->PrivateX, "private", pszFunction, uLineNo);
1784 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Shared, "shared", pszFunction, uLineNo);
1785 /** @todo add more sanity checks. */
1786
1787 return cErrors;
1788}
1789
1790#endif /* GMMR0_WITH_SANITY_CHECK */
1791
1792/**
1793 * Looks up a chunk in the tree and fills in the TLB entry for it.
1794 *
1795 * This is not expected to fail and will bitch if it does.
1796 *
1797 * @returns Pointer to the allocation chunk, NULL if not found.
1798 * @param pGMM Pointer to the GMM instance.
1799 * @param idChunk The ID of the chunk to find.
1800 * @param pTlbe Pointer to the TLB entry.
1801 */
1802static PGMMCHUNK gmmR0GetChunkSlow(PGMM pGMM, uint32_t idChunk, PGMMCHUNKTLBE pTlbe)
1803{
1804 PGMMCHUNK pChunk = (PGMMCHUNK)RTAvlU32Get(&pGMM->pChunks, idChunk);
1805 AssertMsgReturn(pChunk, ("Chunk %#x not found!\n", idChunk), NULL);
1806 pTlbe->idChunk = idChunk;
1807 pTlbe->pChunk = pChunk;
1808 return pChunk;
1809}
1810
1811
1812/**
1813 * Finds an allocation chunk.
1814 *
1815 * This is not expected to fail and will bitch if it does.
1816 *
1817 * @returns Pointer to the allocation chunk, NULL if not found.
1818 * @param pGMM Pointer to the GMM instance.
1819 * @param idChunk The ID of the chunk to find.
1820 */
1821DECLINLINE(PGMMCHUNK) gmmR0GetChunk(PGMM pGMM, uint32_t idChunk)
1822{
1823 /*
1824 * Do a TLB lookup, branch if not in the TLB.
1825 */
1826 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
1827 if ( pTlbe->idChunk != idChunk
1828 || !pTlbe->pChunk)
1829 return gmmR0GetChunkSlow(pGMM, idChunk, pTlbe);
1830 return pTlbe->pChunk;
1831}
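/*
 * Illustration only, not part of GMMR0.cpp: a sketch of the two-level lookup
 * performed by gmmR0GetChunk and gmmR0GetChunkSlow above, a small
 * direct-mapped cache in front of an ordered tree. std::map stands in for the
 * AVL tree, and the names, the modulo index function and the table size of 32
 * are assumptions; the definition of GMM_CHUNKTLB_IDX is not shown in this
 * part of the file.
 */

```cpp
#include <cassert>
#include <cstdint>
#include <map>

struct ChunkSketch { uint32_t idChunk; };

struct TlbEntrySketch
{
    uint32_t     idChunk = 0;
    ChunkSketch *pChunk  = nullptr;
};

static const unsigned kTlbEntries = 32;     // assumed table size

struct ChunkLookupSketch
{
    std::map<uint32_t, ChunkSketch *> tree; // stand-in for the AVL tree
    TlbEntrySketch aTlb[kTlbEntries];
    unsigned cSlowLookups = 0;              // counts tree lookups, for illustration

    ChunkSketch *get(uint32_t idChunk)
    {
        // Fast path: the slot already caches this chunk ID.
        TlbEntrySketch &entry = aTlb[idChunk % kTlbEntries];
        if (entry.idChunk == idChunk && entry.pChunk)
            return entry.pChunk;

        // Slow path: consult the tree and refill the slot on success.
        cSlowLookups++;
        auto it = tree.find(idChunk);
        if (it == tree.end())
            return nullptr;
        entry.idChunk = idChunk;
        entry.pChunk  = it->second;
        return entry.pChunk;
    }
};
```

/*
 * Two chunk IDs that map to the same slot simply evict each other, so the
 * cache never needs explicit invalidation on a collision.
 */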
1832
1833
1834/**
1835 * Finds a page.
1836 *
1837 * This is not expected to fail and will bitch if it does.
1838 *
1839 * @returns Pointer to the page, NULL if not found.
1840 * @param pGMM Pointer to the GMM instance.
1841 * @param idPage The ID of the page to find.
1842 */
1843DECLINLINE(PGMMPAGE) gmmR0GetPage(PGMM pGMM, uint32_t idPage)
1844{
1845 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1846 if (RT_LIKELY(pChunk))
1847 return &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
1848 return NULL;
1849}
1850
1851
1852/**
1853 * Gets the host physical address for a page given by its ID.
1854 *
1855 * @returns The host physical address or NIL_RTHCPHYS.
1856 * @param pGMM Pointer to the GMM instance.
1857 * @param idPage The ID of the page to find.
1858 */
1859DECLINLINE(RTHCPHYS) gmmR0GetPageHCPhys(PGMM pGMM, uint32_t idPage)
1860{
1861 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1862 if (RT_LIKELY(pChunk))
1863 return RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, idPage & GMM_PAGEID_IDX_MASK);
1864 return NIL_RTHCPHYS;
1865}
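/*
 * Illustration only, not part of GMMR0.cpp: a sketch of how gmmR0GetPage and
 * gmmR0GetPageHCPhys above split a page ID into a chunk ID (upper bits) and a
 * page index within the chunk (lower bits). The shift of 9 is an assumption
 * corresponding to 2 MiB chunks of 4 KiB pages; the real values come from
 * GMM_CHUNKID_SHIFT and GMM_PAGEID_IDX_MASK, which are defined elsewhere.
 * The helper names are hypothetical.
 */

```cpp
#include <cassert>
#include <cstdint>

static const unsigned kChunkIdShift = 9;    // assumed: log2(2 MiB / 4 KiB)
static const uint32_t kPageIdxMask  = (UINT32_C(1) << kChunkIdShift) - 1;

// Upper bits of the page ID select the chunk.
static uint32_t chunkIdFromPageId(uint32_t idPage)
{
    return idPage >> kChunkIdShift;
}

// Lower bits of the page ID select the page inside the chunk.
static uint32_t pageIndexFromPageId(uint32_t idPage)
{
    return idPage & kPageIdxMask;
}

// Inverse operation: combine a chunk ID and a page index into a page ID.
static uint32_t makePageId(uint32_t idChunk, uint32_t iPage)
{
    return (idChunk << kChunkIdShift) | (iPage & kPageIdxMask);
}
```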
1866
1867
1868/**
1869 * Selects the appropriate free list given the number of free pages.
1870 *
1871 * @returns Free list index.
1872 * @param cFree The number of free pages in the chunk.
1873 */
1874DECLINLINE(unsigned) gmmR0SelectFreeSetList(unsigned cFree)
1875{
1876 unsigned iList = cFree >> GMM_CHUNK_FREE_SET_SHIFT;
1877 AssertMsg(iList < RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists) / RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists[0]),
1878 ("%d (%u)\n", iList, cFree));
1879 return iList;
1880}
1881
1882
1883/**
1884 * Unlinks the chunk from the free list it's currently on (if any).
1885 *
1886 * @param pChunk The allocation chunk.
1887 */
1888DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk)
1889{
1890 PGMMCHUNKFREESET pSet = pChunk->pSet;
1891 if (RT_LIKELY(pSet))
1892 {
1893 pSet->cFreePages -= pChunk->cFree;
1894 pSet->idGeneration++;
1895
1896 PGMMCHUNK pPrev = pChunk->pFreePrev;
1897 PGMMCHUNK pNext = pChunk->pFreeNext;
1898 if (pPrev)
1899 pPrev->pFreeNext = pNext;
1900 else
1901 pSet->apLists[gmmR0SelectFreeSetList(pChunk->cFree)] = pNext;
1902 if (pNext)
1903 pNext->pFreePrev = pPrev;
1904
1905 pChunk->pSet = NULL;
1906 pChunk->pFreeNext = NULL;
1907 pChunk->pFreePrev = NULL;
1908 }
1909 else
1910 {
1911 Assert(!pChunk->pFreeNext);
1912 Assert(!pChunk->pFreePrev);
1913 Assert(!pChunk->cFree);
1914 }
1915}
1916
1917
1918/**
1919 * Links the chunk onto the appropriate free list in the specified free set.
1920 *
1921 * If no free entries, it's not linked into any list.
1922 *
1923 * @param pChunk The allocation chunk.
1924 * @param pSet The free set.
1925 */
1926DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet)
1927{
1928 Assert(!pChunk->pSet);
1929 Assert(!pChunk->pFreeNext);
1930 Assert(!pChunk->pFreePrev);
1931
1932 if (pChunk->cFree > 0)
1933 {
1934 pChunk->pSet = pSet;
1935 pChunk->pFreePrev = NULL;
1936 unsigned const iList = gmmR0SelectFreeSetList(pChunk->cFree);
1937 pChunk->pFreeNext = pSet->apLists[iList];
1938 if (pChunk->pFreeNext)
1939 pChunk->pFreeNext->pFreePrev = pChunk;
1940 pSet->apLists[iList] = pChunk;
1941
1942 pSet->cFreePages += pChunk->cFree;
1943 pSet->idGeneration++;
1944 }
1945}
1946
1947
1948/**
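1949 * Selects the free set for the chunk and links it onto the appropriate free list.
1950 *
1951 * @param pGMM Pointer to the GMM instance.
1952 * @param pGVM Pointer to the kernel-only VM instance data.
1953 * @param pChunk The allocation chunk. Not linked into any list if it has no free pages.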
1954 */
1955DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1956{
1957 PGMMCHUNKFREESET pSet;
1958 if (pGMM->fBoundMemoryMode)
1959 pSet = &pGVM->gmm.s.Private;
1960 else if (pChunk->cShared)
1961 pSet = &pGMM->Shared;
1962 else
1963 pSet = &pGMM->PrivateX;
1964 gmmR0LinkChunk(pChunk, pSet);
1965}
1966
1967
1968/**
1969 * Frees a Chunk ID.
1970 *
1971 * @param pGMM Pointer to the GMM instance.
1972 * @param idChunk The Chunk ID to free.
1973 */
1974static void gmmR0FreeChunkId(PGMM pGMM, uint32_t idChunk)
1975{
1976 AssertReturnVoid(idChunk != NIL_GMM_CHUNKID);
1977 AssertMsg(ASMBitTest(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk));
1978 ASMAtomicBitClear(&pGMM->bmChunkId[0], idChunk);
1979}
1980
1981
1982/**
1983 * Allocates a new Chunk ID.
1984 *
1985 * @returns The Chunk ID.
1986 * @param pGMM Pointer to the GMM instance.
1987 */
1988static uint32_t gmmR0AllocateChunkId(PGMM pGMM)
1989{
1990 AssertCompile(!((GMM_CHUNKID_LAST + 1) & 31)); /* must be a multiple of 32 */
1991 AssertCompile(NIL_GMM_CHUNKID == 0);
1992
1993 /*
1994 * Try the next sequential one.
1995 */
1996 int32_t idChunk = ++pGMM->idChunkPrev;
1997#if 0 /** @todo enable this code */
1998 if ( idChunk <= GMM_CHUNKID_LAST
1999 && idChunk > NIL_GMM_CHUNKID
2000 && !ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk))
2001 return idChunk;
2002#endif
2003
2004 /*
2005 * Scan sequentially from the last one.
2006 */
2007 if ( (uint32_t)idChunk < GMM_CHUNKID_LAST
2008 && idChunk > NIL_GMM_CHUNKID)
2009 {
2010 idChunk = ASMBitNextClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1, idChunk - 1);
2011 if (idChunk > NIL_GMM_CHUNKID)
2012 {
2013 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2014 return pGMM->idChunkPrev = idChunk;
2015 }
2016 }
2017
2018 /*
2019 * Ok, scan from the start.
2020 * We're not racing anyone, so there is no need to expect failures or have restart loops.
2021 */
2022 idChunk = ASMBitFirstClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1);
2023 AssertMsgReturn(idChunk > NIL_GMM_CHUNKID, ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2024 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2025
2026 return pGMM->idChunkPrev = idChunk;
2027}
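
The allocation strategy above (scan forward from the previously returned ID, then fall back to a scan from the start) can be sketched with a plain bitmap. This is a simplified illustration under stated assumptions: `IdAllocator`, `kNilId` and `kLastId` are hypothetical names, the ID range is tiny, and the NIL ID is assumed to be permanently marked as taken. It uses `std::vector<bool>` instead of the IPRT bit operations.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t kNilId  = 0;   // mirrors the NIL_GMM_CHUNKID == 0 compile-time check
constexpr uint32_t kLastId = 63;  // hypothetical small stand-in for GMM_CHUNKID_LAST

struct IdAllocator
{
    std::vector<bool> bm = std::vector<bool>(kLastId + 1, false);
    uint32_t idPrev = kNilId;

    // Assumption: the NIL ID is reserved up front so it is never handed out.
    IdAllocator() { bm[kNilId] = true; }

    uint32_t allocate()
    {
        // First scan upwards from the previously returned ID...
        for (uint32_t id = idPrev + 1; id <= kLastId; id++)
            if (!bm[id]) { bm[id] = true; return idPrev = id; }
        // ...then fall back to scanning from the start.
        for (uint32_t id = kNilId + 1; id <= idPrev && id <= kLastId; id++)
            if (!bm[id]) { bm[id] = true; return idPrev = id; }
        return kNilId; // every ID is in use
    }

    void release(uint32_t id)
    {
        assert(id != kNilId && bm[id]);
        bm[id] = false;
    }
};
```

Scanning from the previous ID first means a freshly released ID is not reused immediately; it is picked up again only once the forward scan has run off the end of the bitmap.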
2028
2029
2030/**
2031 * Allocates one private page.
2032 *
2033 * Worker for gmmR0AllocatePages.
2034 *
2035 * @param pChunk The chunk to allocate it from.
2036 * @param hGVM The GVM handle of the VM requesting memory.
2037 * @param pPageDesc The page descriptor.
2038 */
2039static void gmmR0AllocatePage(PGMMCHUNK pChunk, uint32_t hGVM, PGMMPAGEDESC pPageDesc)
2040{
2041 /* update the chunk stats. */
2042 if (pChunk->hGVM == NIL_GVM_HANDLE)
2043 pChunk->hGVM = hGVM;
2044 Assert(pChunk->cFree);
2045 pChunk->cFree--;
2046 pChunk->cPrivate++;
2047
2048 /* unlink the first free page. */
2049 const uint32_t iPage = pChunk->iFreeHead;
2050 AssertReleaseMsg(iPage < RT_ELEMENTS(pChunk->aPages), ("%d\n", iPage));
2051 PGMMPAGE pPage = &pChunk->aPages[iPage];
2052 Assert(GMM_PAGE_IS_FREE(pPage));
2053 pChunk->iFreeHead = pPage->Free.iNext;
2054 Log3(("A pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x iNext=%#x\n",
2055 pPage, iPage, (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage,
2056 pPage->Common.u2State, pChunk->iFreeHead, pPage->Free.iNext));
2057
2058 /* make the page private. */
2059 pPage->u = 0;
2060 AssertCompile(GMM_PAGE_STATE_PRIVATE == 0);
2061 pPage->Private.hGVM = hGVM;
2062 AssertCompile(NIL_RTHCPHYS >= GMM_GCPHYS_LAST);
2063 AssertCompile(GMM_GCPHYS_UNSHAREABLE >= GMM_GCPHYS_LAST);
2064 if (pPageDesc->HCPhysGCPhys <= GMM_GCPHYS_LAST)
2065 pPage->Private.pfn = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
2066 else
2067 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
2068
2069 /* update the page descriptor. */
2070 pPageDesc->HCPhysGCPhys = RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);
2071 Assert(pPageDesc->HCPhysGCPhys != NIL_RTHCPHYS);
2072 pPageDesc->idPage = (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage;
2073 pPageDesc->idSharedPage = NIL_GMM_PAGEID;
2074}
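
The two mechanisms used here, the per-chunk free chain and the page ID built from the chunk ID and the page index, can be sketched on their own. This is a minimal model, not the GMM structures: `MiniChunk`, `initChunk` and `allocPage` are hypothetical names, and the 512-page chunk with a shift of 9 is an assumed stand-in for GMM_CHUNK_NUM_PAGES and GMM_CHUNKID_SHIFT.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kPagesPerChunk = 512;                 // assumed: 2 MB chunk / 4 KB pages
constexpr uint32_t kChunkIdShift  = 9;                   // log2(kPagesPerChunk), stand-in for GMM_CHUNKID_SHIFT
constexpr uint32_t kPageIdxMask   = kPagesPerChunk - 1;  // stand-in for GMM_PAGEID_IDX_MASK
constexpr uint16_t kEndOfChain    = UINT16_MAX;

struct MiniChunk
{
    uint32_t idChunk;
    uint16_t iFreeHead;               // index of the first free page
    uint16_t cFree;                   // number of free pages left
    uint16_t aiNext[kPagesPerChunk];  // per-page "next free" index
};

// Chains every page onto the free list, like the loop in gmmR0RegisterChunk.
void initChunk(MiniChunk &chunk, uint32_t idChunk)
{
    chunk.idChunk   = idChunk;
    chunk.iFreeHead = 0;
    chunk.cFree     = kPagesPerChunk;
    for (uint32_t i = 0; i < kPagesPerChunk - 1; i++)
        chunk.aiNext[i] = (uint16_t)(i + 1);
    chunk.aiNext[kPagesPerChunk - 1] = kEndOfChain;
}

// Pops the first free page and returns its global page ID.
uint32_t allocPage(MiniChunk &chunk)
{
    assert(chunk.cFree > 0);
    uint32_t const iPage = chunk.iFreeHead;
    chunk.iFreeHead = chunk.aiNext[iPage];
    chunk.cFree--;
    return (chunk.idChunk << kChunkIdShift) | iPage;
}
```

Shifting the page ID right by `kChunkIdShift` recovers the chunk ID and masking with `kPageIdxMask` recovers the page index, which is the decomposition the lookup helpers in this file rely on.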
2075
2076
2077/**
2078 * Picks the free pages from a chunk.
2079 *
2080 * @returns The new page descriptor table index.
2082 * @param pChunk The chunk.
2083 * @param hGVM The global VM handle.
2084 * @param iPage The current page descriptor table index.
2085 * @param cPages The total number of pages to allocate.
2086 * @param paPages The page descriptor table (input + output).
2087 */
2088static uint32_t gmmR0AllocatePagesFromChunk(PGMMCHUNK pChunk, uint16_t const hGVM, uint32_t iPage, uint32_t cPages,
2089 PGMMPAGEDESC paPages)
2090{
2091 PGMMCHUNKFREESET pSet = pChunk->pSet; Assert(pSet);
2092 gmmR0UnlinkChunk(pChunk);
2093
2094 for (; pChunk->cFree && iPage < cPages; iPage++)
2095 gmmR0AllocatePage(pChunk, hGVM, &paPages[iPage]);
2096
2097 gmmR0LinkChunk(pChunk, pSet);
2098 return iPage;
2099}
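
The unlink, allocate, relink sequence above matters because the list a chunk sits on is derived from its current free count: the chunk has to come off its list before the count changes. A minimal sketch of that pattern follows. `Chunk`, `FreeSet`, `linkChunk`, `unlinkChunk` and `takePages` are hypothetical names, the constants are assumed values, and an explicit `fLinked` flag stands in for the `pSet` back pointer of the real structure.

```cpp
#include <cassert>

constexpr unsigned kShift    = 4;   // assumed stand-in for GMM_CHUNK_FREE_SET_SHIFT
constexpr unsigned kNumLists = 33;  // assumed: (512 pages >> 4) + 1

struct Chunk
{
    Chunk   *pPrev   = nullptr;
    Chunk   *pNext   = nullptr;
    bool     fLinked = false;
    unsigned cFree   = 0;
};

struct FreeSet
{
    Chunk   *apLists[kNumLists] = {};
    unsigned cFreePages = 0;
};

// Pushes the chunk onto the head of the list matching its free count.
void linkChunk(FreeSet &set, Chunk &chunk)
{
    if (chunk.cFree == 0)
        return;                      // full chunks are not kept on any list
    Chunk *&pHead = set.apLists[chunk.cFree >> kShift];
    chunk.pPrev = nullptr;
    chunk.pNext = pHead;
    if (pHead)
        pHead->pPrev = &chunk;
    pHead = &chunk;
    chunk.fLinked = true;
    set.cFreePages += chunk.cFree;
}

// Removes the chunk from its list; relies on cFree still being unchanged.
void unlinkChunk(FreeSet &set, Chunk &chunk)
{
    if (!chunk.fLinked)
        return;
    if (chunk.pPrev)
        chunk.pPrev->pNext = chunk.pNext;
    else
        set.apLists[chunk.cFree >> kShift] = chunk.pNext;
    if (chunk.pNext)
        chunk.pNext->pPrev = chunk.pPrev;
    chunk.pPrev = chunk.pNext = nullptr;
    chunk.fLinked = false;
    set.cFreePages -= chunk.cFree;
}

// Takes up to cWanted pages: unlink first, change cFree, then relink.
unsigned takePages(FreeSet &set, Chunk &chunk, unsigned cWanted)
{
    unlinkChunk(set, chunk);
    unsigned const cTaken = cWanted < chunk.cFree ? cWanted : chunk.cFree;
    chunk.cFree -= cTaken;
    linkChunk(set, chunk);
    return cTaken;
}
```

If `cFree` were decremented while the chunk was still linked, the head-pointer update in `unlinkChunk` would index the wrong list.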
2100
2101
2102/**
2103 * Registers a new chunk of memory.
2104 *
2105 * This is called by both gmmR0AllocateOneChunk and GMMR0SeedChunk.
2106 *
2107 * @returns VBox status code. On success, the giant GMM lock will be held, the
2108 * caller must release it (ugly).
2109 * @param pGMM Pointer to the GMM instance.
2110 * @param pSet Pointer to the set.
2111 * @param MemObj The memory object for the chunk.
2112 * @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2113 * affinity.
2114 * @param fChunkFlags The chunk flags, GMM_CHUNK_FLAGS_XXX.
2115 * @param ppChunk Chunk address (out). Optional.
2116 *
2117 * @remarks The caller must not own the giant GMM mutex.
2118 * The giant GMM mutex will be acquired and returned acquired in
2119 * the success path. On failure, no locks will be held.
2120 */
2121static int gmmR0RegisterChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, RTR0MEMOBJ MemObj, uint16_t hGVM, uint16_t fChunkFlags,
2122 PGMMCHUNK *ppChunk)
2123{
2124 Assert(pGMM->hMtxOwner != RTThreadNativeSelf());
2125 Assert(hGVM != NIL_GVM_HANDLE || pGMM->fBoundMemoryMode);
2126 Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE);
2127
2128 int rc;
2129 PGMMCHUNK pChunk = (PGMMCHUNK)RTMemAllocZ(sizeof(*pChunk));
2130 if (pChunk)
2131 {
2132 /*
2133 * Initialize it.
2134 */
2135 pChunk->hMemObj = MemObj;
2136 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
2137 pChunk->hGVM = hGVM;
2138 /*pChunk->iFreeHead = 0;*/
2139 pChunk->idNumaNode = gmmR0GetCurrentNumaNodeId();
2140 pChunk->iChunkMtx = UINT8_MAX;
2141 pChunk->fFlags = fChunkFlags;
2142 for (unsigned iPage = 0; iPage < RT_ELEMENTS(pChunk->aPages) - 1; iPage++)
2143 {
2144 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
2145 pChunk->aPages[iPage].Free.iNext = iPage + 1;
2146 }
2147 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.u2State = GMM_PAGE_STATE_FREE;
2148 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.iNext = UINT16_MAX;
2149
2150 /*
2151 * Allocate a Chunk ID and insert it into the tree.
2152 * This has to be done behind the mutex of course.
2153 */
2154 rc = gmmR0MutexAcquire(pGMM);
2155 if (RT_SUCCESS(rc))
2156 {
2157 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2158 {
2159 pChunk->Core.Key = gmmR0AllocateChunkId(pGMM);
2160 if ( pChunk->Core.Key != NIL_GMM_CHUNKID
2161 && pChunk->Core.Key <= GMM_CHUNKID_LAST
2162 && RTAvlU32Insert(&pGMM->pChunks, &pChunk->Core))
2163 {
2164 pGMM->cChunks++;
2165 RTListAppend(&pGMM->ChunkList, &pChunk->ListNode);
2166 gmmR0LinkChunk(pChunk, pSet);
2167 LogFlow(("gmmR0RegisterChunk: pChunk=%p id=%#x cChunks=%d\n", pChunk, pChunk->Core.Key, pGMM->cChunks));
2168
2169 if (ppChunk)
2170 *ppChunk = pChunk;
2171 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2172 return VINF_SUCCESS;
2173 }
2174
2175 /* bail out */
2176 rc = VERR_GMM_CHUNK_INSERT;
2177 }
2178 else
2179 rc = VERR_GMM_IS_NOT_SANE;
2180 gmmR0MutexRelease(pGMM);
2181 }
2182
2183 RTMemFree(pChunk);
2184 }
2185 else
2186 rc = VERR_NO_MEMORY;
2187 return rc;
2188}
2189
2190
2191/**
2192 * Allocates a new chunk, immediately picks the requested pages from it, and adds
2193 * what's remaining to the specified free set.
2194 *
2195 * @note This will leave the giant mutex while allocating the new chunk!
2196 *
2197 * @returns VBox status code.
2198 * @param pGMM Pointer to the GMM instance data.
2199 * @param pGVM Pointer to the kernel-only VM instance data.
2200 * @param pSet Pointer to the free set.
2201 * @param cPages The number of pages requested.
2202 * @param paPages The page descriptor table (input + output).
2203 * @param piPage The pointer to the page descriptor table index
2204 * variable. This will be updated.
2205 */
2206static int gmmR0AllocateChunkNew(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet, uint32_t cPages,
2207 PGMMPAGEDESC paPages, uint32_t *piPage)
2208{
2209 gmmR0MutexRelease(pGMM);
2210
2211 RTR0MEMOBJ hMemObj;
2212 int rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2213 if (RT_SUCCESS(rc))
2214 {
2215/** @todo Duplicate gmmR0RegisterChunk here so we can avoid chaining up the
2216 * free pages first and then unchaining them right afterwards. Instead
2217 * do as much work as possible without holding the giant lock. */
2218 PGMMCHUNK pChunk;
2219 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, 0 /*fChunkFlags*/, &pChunk);
2220 if (RT_SUCCESS(rc))
2221 {
2222 *piPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, *piPage, cPages, paPages);
2223 return VINF_SUCCESS;
2224 }
2225
2226 /* bail out */
2227 RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
2228 }
2229
2230 int rc2 = gmmR0MutexAcquire(pGMM);
2231 AssertRCReturn(rc2, RT_FAILURE(rc) ? rc : rc2);
2232 return rc;
2233
2234}
2235
2236
2237/**
2238 * As a last resort we'll pick any page we can get.
2239 *
2240 * @returns The new page descriptor table index.
2241 * @param pSet The set to pick from.
2242 * @param pGVM Pointer to the global VM structure.
2243 * @param iPage The current page descriptor table index.
2244 * @param cPages The total number of pages to allocate.
2245 * @param paPages The page descriptor table (input + output).
2246 */
2247static uint32_t gmmR0AllocatePagesIndiscriminately(PGMMCHUNKFREESET pSet, PGVM pGVM,
2248 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2249{
2250 unsigned iList = RT_ELEMENTS(pSet->apLists);
2251 while (iList-- > 0)
2252 {
2253 PGMMCHUNK pChunk = pSet->apLists[iList];
2254 while (pChunk)
2255 {
2256 PGMMCHUNK pNext = pChunk->pFreeNext;
2257
2258 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2259 if (iPage >= cPages)
2260 return iPage;
2261
2262 pChunk = pNext;
2263 }
2264 }
2265 return iPage;
2266}
2267
2268
2269/**
2270 * Pick pages from empty chunks on the same NUMA node.
2271 *
2272 * @returns The new page descriptor table index.
2273 * @param pSet The set to pick from.
2274 * @param pGVM Pointer to the global VM structure.
2275 * @param iPage The current page descriptor table index.
2276 * @param cPages The total number of pages to allocate.
2277 * @param paPages The page descriptor table (input + output).
2278 */
2279static uint32_t gmmR0AllocatePagesFromEmptyChunksOnSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2280 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2281{
2282 PGMMCHUNK pChunk = pSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
2283 if (pChunk)
2284 {
2285 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2286 while (pChunk)
2287 {
2288 PGMMCHUNK pNext = pChunk->pFreeNext;
2289
2290 if (pChunk->idNumaNode == idNumaNode)
2291 {
2292 pChunk->hGVM = pGVM->hSelf;
2293 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2294 if (iPage >= cPages)
2295 {
2296 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2297 return iPage;
2298 }
2299 }
2300
2301 pChunk = pNext;
2302 }
2303 }
2304 return iPage;
2305}
2306
2307
2308/**
2309 * Pick pages from non-empty chunks on the same NUMA node.
2310 *
2311 * @returns The new page descriptor table index.
2312 * @param pSet The set to pick from.
2313 * @param pGVM Pointer to the global VM structure.
2314 * @param iPage The current page descriptor table index.
2315 * @param cPages The total number of pages to allocate.
2316 * @param paPages The page descriptor table (input + output).
2317 */
2318static uint32_t gmmR0AllocatePagesFromSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2319 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2320{
2321 /** @todo start by picking from chunks with about the right size first? */
2322 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2323 unsigned iList = GMM_CHUNK_FREE_SET_UNUSED_LIST;
2324 while (iList-- > 0)
2325 {
2326 PGMMCHUNK pChunk = pSet->apLists[iList];
2327 while (pChunk)
2328 {
2329 PGMMCHUNK pNext = pChunk->pFreeNext;
2330
2331 if (pChunk->idNumaNode == idNumaNode)
2332 {
2333 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2334 if (iPage >= cPages)
2335 {
2336 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2337 return iPage;
2338 }
2339 }
2340
2341 pChunk = pNext;
2342 }
2343 }
2344 return iPage;
2345}
2346
2347
2348/**
2349 * Pick pages that are in chunks already associated with the VM.
2350 *
2351 * @returns The new page descriptor table index.
2352 * @param pGMM Pointer to the GMM instance data.
2353 * @param pGVM Pointer to the global VM structure.
2354 * @param pSet The set to pick from.
2355 * @param iPage The current page descriptor table index.
2356 * @param cPages The total number of pages to allocate.
2357 * @param paPages The page descriptor table (input + output).
2358 */
2359static uint32_t gmmR0AllocatePagesAssociatedWithVM(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet,
2360 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2361{
2362 uint16_t const hGVM = pGVM->hSelf;
2363
2364 /* Hint. */
2365 if (pGVM->gmm.s.idLastChunkHint != NIL_GMM_CHUNKID)
2366 {
2367 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pGVM->gmm.s.idLastChunkHint);
2368 if (pChunk && pChunk->cFree)
2369 {
2370 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2371 if (iPage >= cPages)
2372 return iPage;
2373 }
2374 }
2375
2376 /* Scan. */
2377 for (unsigned iList = 0; iList < RT_ELEMENTS(pSet->apLists); iList++)
2378 {
2379 PGMMCHUNK pChunk = pSet->apLists[iList];
2380 while (pChunk)
2381 {
2382 PGMMCHUNK pNext = pChunk->pFreeNext;
2383
2384 if (pChunk->hGVM == hGVM)
2385 {
2386 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2387 if (iPage >= cPages)
2388 {
2389 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2390 return iPage;
2391 }
2392 }
2393
2394 pChunk = pNext;
2395 }
2396 }
2397 return iPage;
2398}
2399
2400
2401
2402/**
2403 * Pick pages in bound memory mode.
2404 *
2405 * @returns The new page descriptor table index.
2406 * @param pGVM Pointer to the global VM structure.
2407 * @param iPage The current page descriptor table index.
2408 * @param cPages The total number of pages to allocate.
2409 * @param paPages The page descriptor table (input + output).
2410 */
2411static uint32_t gmmR0AllocatePagesInBoundMode(PGVM pGVM, uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2412{
2413 for (unsigned iList = 0; iList < RT_ELEMENTS(pGVM->gmm.s.Private.apLists); iList++)
2414 {
2415 PGMMCHUNK pChunk = pGVM->gmm.s.Private.apLists[iList];
2416 while (pChunk)
2417 {
2418 Assert(pChunk->hGVM == pGVM->hSelf);
2419 PGMMCHUNK pNext = pChunk->pFreeNext;
2420 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2421 if (iPage >= cPages)
2422 return iPage;
2423 pChunk = pNext;
2424 }
2425 }
2426 return iPage;
2427}
2428
2429
2430/**
2431 * Checks if we should start picking pages from chunks of other VMs because
2432 * we're getting close to the system memory or reserved limit.
2433 *
2434 * @returns @c true if we should, @c false if we should first try to allocate more
2435 * chunks.
2436 */
2437static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(PGVM pGVM)
2438{
2439 /*
2440 * Don't allocate a new chunk if we're within four chunks' worth of pages of the reservation.
2441 */
2442 uint64_t cPgReserved = pGVM->gmm.s.Stats.Reserved.cBasePages
2443 + pGVM->gmm.s.Stats.Reserved.cFixedPages
2444 - pGVM->gmm.s.Stats.cBalloonedPages
2445 /** @todo what about shared pages? */;
2446 uint64_t cPgAllocated = pGVM->gmm.s.Stats.Allocated.cBasePages
2447 + pGVM->gmm.s.Stats.Allocated.cFixedPages;
2448 uint64_t cPgDelta = cPgReserved - cPgAllocated;
2449 if (cPgDelta < GMM_CHUNK_NUM_PAGES * 4)
2450 return true;
2451 /** @todo make the threshold configurable, also test the code to see if
2452 * this ever kicks in (we might be reserving too much or smth). */
2453
2454 /*
2455 * Check how close we're to the max memory limit and how many fragments
2456 * there are?...
2457 */
2458 /** @todo. */
2459
2460 return false;
2461}
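
The headroom test above can be sketched as a small pure function. Note that this is a variant, not the code above: it clamps the difference at zero, on the assumption that an account which is already over its reservation should count as "near the limit", whereas a plain `uint64_t` subtraction would wrap around to a very large value in that case. `nearReservationLimit` and `kChunkNumPages` are hypothetical names and the chunk size is an assumed value.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kChunkNumPages = 512;  // assumed stand-in for GMM_CHUNK_NUM_PAGES

// Returns true when fewer than four chunks' worth of pages remain between
// what has been allocated and what has been reserved.
bool nearReservationLimit(uint64_t cPgReserved, uint64_t cPgAllocated)
{
    // Clamp instead of letting the unsigned subtraction wrap around.
    uint64_t const cPgDelta = cPgReserved > cPgAllocated ? cPgReserved - cPgAllocated : 0;
    return cPgDelta < kChunkNumPages * 4;
}
```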
2462
2463
2464/**
2465 * Checks if we should start picking pages from chunks of other VMs because
2466 * there are a lot of free pages around.
2467 *
2468 * @returns @c true if we should, @c false if we should first try to allocate more
2469 * chunks.
2470 */
2471static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(PGMM pGMM)
2472{
2473 /*
2474 * Setting the limit at 16 chunks (32 MB) at the moment.
2475 */
2476 if (pGMM->PrivateX.cFreePages >= GMM_CHUNK_NUM_PAGES * 16)
2477 return true;
2478 return false;
2479}
2480
2481
2482/**
2483 * Common worker for GMMR0AllocateHandyPages and GMMR0AllocatePages.
2484 *
2485 * @returns VBox status code:
2486 * @retval VINF_SUCCESS on success.
2487 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk or
2488 * gmmR0AllocateMoreChunks is necessary.
2489 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2490 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2491 * that is we're trying to allocate more than we've reserved.
2492 *
2493 * @param pGMM Pointer to the GMM instance data.
2494 * @param pGVM Pointer to the VM.
2495 * @param cPages The number of pages to allocate.
2496 * @param paPages Pointer to the page descriptors.
2497 * See GMMPAGEDESC for details on what is expected on input.
2498 * @param enmAccount The account to charge.
2499 *
2500 * @remarks The caller must own the giant GMM lock.
2501 */
2502static int gmmR0AllocatePagesNew(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2503{
2504 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
2505
2506 /*
2507 * Check allocation limits.
2508 */
2509 if (RT_UNLIKELY(pGMM->cAllocatedPages + cPages > pGMM->cMaxPages))
2510 return VERR_GMM_HIT_GLOBAL_LIMIT;
2511
2512 switch (enmAccount)
2513 {
2514 case GMMACCOUNT_BASE:
2515 if (RT_UNLIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
2516 > pGVM->gmm.s.Stats.Reserved.cBasePages))
2517 {
2518 Log(("gmmR0AllocatePages:Base: Reserved=%#llx Allocated+Ballooned+Requested=%#llx+%#llx+%#x!\n",
2519 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages,
2520 pGVM->gmm.s.Stats.cBalloonedPages, cPages));
2521 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2522 }
2523 break;
2524 case GMMACCOUNT_SHADOW:
2525 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages + cPages > pGVM->gmm.s.Stats.Reserved.cShadowPages))
2526 {
2527 Log(("gmmR0AllocatePages:Shadow: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2528 pGVM->gmm.s.Stats.Reserved.cShadowPages, pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
2529 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2530 }
2531 break;
2532 case GMMACCOUNT_FIXED:
2533 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages + cPages > pGVM->gmm.s.Stats.Reserved.cFixedPages))
2534 {
2535 Log(("gmmR0AllocatePages:Fixed: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2536 pGVM->gmm.s.Stats.Reserved.cFixedPages, pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
2537 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2538 }
2539 break;
2540 default:
2541 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2542 }
2543
2544 /*
2545 * If we're in legacy memory mode, it's easy to figure if we have
2546 * sufficient number of pages up-front.
2547 */
2548 if ( pGMM->fLegacyAllocationMode
2549 && pGVM->gmm.s.Private.cFreePages < cPages)
2550 {
2551 Assert(pGMM->fBoundMemoryMode);
2552 return VERR_GMM_SEED_ME;
2553 }
2554
2555 /*
2556 * Update the accounts before we proceed because we might be leaving the
2557 * protection of the global mutex and thus run the risk of permitting
2558 * too much memory to be allocated.
2559 */
2560 switch (enmAccount)
2561 {
2562 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages += cPages; break;
2563 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages += cPages; break;
2564 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages += cPages; break;
2565 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2566 }
2567 pGVM->gmm.s.Stats.cPrivatePages += cPages;
2568 pGMM->cAllocatedPages += cPages;
2569
2570 /*
2571 * Part two of it's-easy-in-legacy-memory-mode.
2572 */
2573 uint32_t iPage = 0;
2574 if (pGMM->fLegacyAllocationMode)
2575 {
2576 iPage = gmmR0AllocatePagesInBoundMode(pGVM, iPage, cPages, paPages);
2577 AssertReleaseReturn(iPage == cPages, VERR_GMM_ALLOC_PAGES_IPE);
2578 return VINF_SUCCESS;
2579 }
2580
2581 /*
2582 * Bound mode is also relatively straightforward.
2583 */
2584 int rc = VINF_SUCCESS;
2585 if (pGMM->fBoundMemoryMode)
2586 {
2587 iPage = gmmR0AllocatePagesInBoundMode(pGVM, iPage, cPages, paPages);
2588 if (iPage < cPages)
2589 do
2590 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGVM->gmm.s.Private, cPages, paPages, &iPage);
2591 while (iPage < cPages && RT_SUCCESS(rc));
2592 }
2593 /*
2594 * Shared mode is trickier as we should try to achieve the same locality as
2595 * in bound mode, but smartly make use of non-full chunks allocated by
2596 * other VMs if we're low on memory.
2597 */
2598 else
2599 {
2600 /* Pick the most optimal pages first. */
2601 iPage = gmmR0AllocatePagesAssociatedWithVM(pGMM, pGVM, &pGMM->PrivateX, iPage, cPages, paPages);
2602 if (iPage < cPages)
2603 {
2604 /* Maybe we should try getting pages from chunks "belonging" to
2605 other VMs before allocating more chunks? */
2606 bool fTriedOnSameAlready = false;
2607 if (gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(pGVM))
2608 {
2609 iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2610 fTriedOnSameAlready = true;
2611 }
2612
2613 /* Allocate memory from empty chunks. */
2614 if (iPage < cPages)
2615 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2616
2617 /* Grab empty shared chunks. */
2618 if (iPage < cPages)
2619 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2620
2621 /* If there are a lot of free pages spread around, try not to waste
2622 system memory on more chunks. (Should trigger defragmentation.) */
2623 if ( !fTriedOnSameAlready
2624 && gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(pGMM))
2625 {
2626 iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2627 if (iPage < cPages)
2628 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2629 }
2630
2631 /*
2632 * Ok, try to allocate new chunks.
2633 */
2634 if (iPage < cPages)
2635 {
2636 do
2637 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGMM->PrivateX, cPages, paPages, &iPage);
2638 while (iPage < cPages && RT_SUCCESS(rc));
2639
2640 /* If the host is out of memory, take whatever we can get. */
2641 if ( (rc == VERR_NO_MEMORY || rc == VERR_NO_PHYS_MEMORY)
2642 && pGMM->PrivateX.cFreePages + pGMM->Shared.cFreePages >= cPages - iPage)
2643 {
2644 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2645 if (iPage < cPages)
2646 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2647 AssertRelease(iPage == cPages);
2648 rc = VINF_SUCCESS;
2649 }
2650 }
2651 }
2652 }
2653
2654 /*
2655 * Clean up on failure. Since this is bound to be a low-memory condition
2656 * we will give back any empty chunks that might be hanging around.
2657 */
2658 if (RT_FAILURE(rc))
2659 {
2660 /* Update the statistics. */
2661 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
2662 pGMM->cAllocatedPages -= cPages - iPage;
2663 switch (enmAccount)
2664 {
2665 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages; break;
2666 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= cPages; break;
2667 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= cPages; break;
2668 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2669 }
2670
2671 /* Release the pages. */
2672 while (iPage-- > 0)
2673 {
2674 uint32_t idPage = paPages[iPage].idPage;
2675 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2676 if (RT_LIKELY(pPage))
2677 {
2678 Assert(GMM_PAGE_IS_PRIVATE(pPage));
2679 Assert(pPage->Private.hGVM == pGVM->hSelf);
2680 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
2681 }
2682 else
2683 AssertMsgFailed(("idPage=%#x\n", idPage));
2684
2685 paPages[iPage].idPage = NIL_GMM_PAGEID;
2686 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2687 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2688 }
2689
2690 /* Free empty chunks. */
2691 /** @todo */
2692
2693 /* return the fail status on failure */
2694 return rc;
2695 }
2696 return VINF_SUCCESS;
2697}
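
The accounting pattern in this function (check the limit, charge the account before doing work that may drop the lock, and undo the charge on failure) can be sketched as follows. This is a simplified model with a single account: `Account`, `AllocStatus` and `allocateCharged` are hypothetical names, and the `backend` callback stands in for the whole chunk and page selection logic.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

enum class AllocStatus { Success, HitAccountLimit, OutOfMemory };

struct Account
{
    uint64_t cReserved  = 0;
    uint64_t cAllocated = 0;
};

// backend(cPages) returns how many pages it actually delivered (<= cPages).
AllocStatus allocateCharged(Account &account, uint32_t cPages,
                            const std::function<uint32_t(uint32_t)> &backend)
{
    if (account.cAllocated + cPages > account.cReserved)
        return AllocStatus::HitAccountLimit;

    // Charge up front, so a concurrent request that runs while the lock is
    // temporarily released cannot be granted the same headroom.
    account.cAllocated += cPages;

    uint32_t const cDelivered = backend(cPages);
    if (cDelivered < cPages)
    {
        // Undo the whole charge. A full implementation would also hand the
        // cDelivered pages back, as the release loop above does.
        account.cAllocated -= cPages;
        return AllocStatus::OutOfMemory;
    }
    return AllocStatus::Success;
}
```

Charging first and rolling back on failure keeps the limit check and the charge inside one locked region, at the cost of a cleanup path when the allocation falls short.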
2698
2699
2700/**
2701 * Updates the previous allocations and allocates more pages.
2702 *
2703 * The handy pages are always taken from the 'base' memory account.
2704 * The allocated pages are not cleared and will contain random garbage.
2705 *
2706 * @returns VBox status code:
2707 * @retval VINF_SUCCESS on success.
2708 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2709 * @retval VERR_GMM_PAGE_NOT_FOUND if one of the pages to update wasn't found.
2710 * @retval VERR_GMM_PAGE_NOT_PRIVATE if one of the pages to update wasn't a
2711 * private page.
2712 * @retval VERR_GMM_PAGE_NOT_SHARED if one of the pages to update wasn't a
2713 * shared page.
2714 * @retval VERR_GMM_NOT_PAGE_OWNER if one of the pages to be updated wasn't
2715 * owned by the VM.
2716 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2717 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2718 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2719 * that is we're trying to allocate more than we've reserved.
2720 *
2721 * @param pVM Pointer to the VM.
2722 * @param idCpu The VCPU id.
2723 * @param cPagesToUpdate The number of pages to update (starting from the head).
2724 * @param cPagesToAlloc The number of pages to allocate (starting from the head).
2725 * @param paPages The array of page descriptors.
2726 * See GMMPAGEDESC for details on what is expected on input.
2727 * @thread EMT.
2728 */
2729GMMR0DECL(int) GMMR0AllocateHandyPages(PVM pVM, VMCPUID idCpu, uint32_t cPagesToUpdate, uint32_t cPagesToAlloc, PGMMPAGEDESC paPages)
2730{
2731 LogFlow(("GMMR0AllocateHandyPages: pVM=%p cPagesToUpdate=%#x cPagesToAlloc=%#x paPages=%p\n",
2732 pVM, cPagesToUpdate, cPagesToAlloc, paPages));
2733
2734 /*
2735 * Validate, get basics and take the semaphore.
2736 * (This is a relatively busy path, so make predictions where possible.)
2737 */
2738 PGMM pGMM;
2739 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2740 PGVM pGVM;
2741 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2742 if (RT_FAILURE(rc))
2743 return rc;
2744
2745 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2746 AssertMsgReturn( (cPagesToUpdate && cPagesToUpdate < 1024)
2747 || (cPagesToAlloc && cPagesToAlloc < 1024),
2748 ("cPagesToUpdate=%#x cPagesToAlloc=%#x\n", cPagesToUpdate, cPagesToAlloc),
2749 VERR_INVALID_PARAMETER);
2750
2751 unsigned iPage = 0;
2752 for (; iPage < cPagesToUpdate; iPage++)
2753 {
2754 AssertMsgReturn( ( paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2755 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK))
2756 || paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2757 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE,
2758 ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys),
2759 VERR_INVALID_PARAMETER);
2760 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2761 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2762 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2763 AssertMsgReturn( paPages[iPage].idSharedPage <= GMM_PAGEID_LAST
2764 /*|| paPages[iPage].idSharedPage == NIL_GMM_PAGEID*/,
2765 ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2766 }
2767
2768 for (; iPage < cPagesToAlloc; iPage++)
2769 {
2770 AssertMsgReturn(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS, ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys), VERR_INVALID_PARAMETER);
2771 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2772 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2773 }
2774
2775 gmmR0MutexAcquire(pGMM);
2776 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2777 {
2778 /* No allocations before the initial reservation has been made! */
2779 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
2780 && pGVM->gmm.s.Stats.Reserved.cFixedPages
2781 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
2782 {
2783 /*
2784 * Perform the updates.
2785 * Stop on the first error.
2786 */
2787 for (iPage = 0; iPage < cPagesToUpdate; iPage++)
2788 {
2789 if (paPages[iPage].idPage != NIL_GMM_PAGEID)
2790 {
2791 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idPage);
2792 if (RT_LIKELY(pPage))
2793 {
2794 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2795 {
2796 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
2797 {
2798 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2799 if (RT_LIKELY(paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST))
2800 pPage->Private.pfn = paPages[iPage].HCPhysGCPhys >> PAGE_SHIFT;
2801 else if (paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE)
2802 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
2803 /* else: NIL_RTHCPHYS nothing */
2804
2805 paPages[iPage].idPage = NIL_GMM_PAGEID;
2806 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2807 }
2808 else
2809 {
2810 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not owner! hGVM=%#x hSelf=%#x\n",
2811 iPage, paPages[iPage].idPage, pPage->Private.hGVM, pGVM->hSelf));
2812 rc = VERR_GMM_NOT_PAGE_OWNER;
2813 break;
2814 }
2815 }
2816 else
2817 {
2818 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not private! %.*Rhxs (type %d)\n", iPage, paPages[iPage].idPage, sizeof(*pPage), pPage, pPage->Common.u2State));
2819 rc = VERR_GMM_PAGE_NOT_PRIVATE;
2820 break;
2821 }
2822 }
2823 else
2824 {
2825 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (private)\n", iPage, paPages[iPage].idPage));
2826 rc = VERR_GMM_PAGE_NOT_FOUND;
2827 break;
2828 }
2829 }
2830
2831 if (paPages[iPage].idSharedPage != NIL_GMM_PAGEID)
2832 {
2833 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idSharedPage);
2834 if (RT_LIKELY(pPage))
2835 {
2836 if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
2837 {
2838 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2839 Assert(pPage->Shared.cRefs);
2840 Assert(pGVM->gmm.s.Stats.cSharedPages);
2841 Assert(pGVM->gmm.s.Stats.Allocated.cBasePages);
2842
2843 Log(("GMMR0AllocateHandyPages: free shared page %x cRefs=%d\n", paPages[iPage].idSharedPage, pPage->Shared.cRefs));
2844 pGVM->gmm.s.Stats.cSharedPages--;
2845 pGVM->gmm.s.Stats.Allocated.cBasePages--;
2846 if (!--pPage->Shared.cRefs)
2847 gmmR0FreeSharedPage(pGMM, pGVM, paPages[iPage].idSharedPage, pPage);
2848 else
2849 {
2850 Assert(pGMM->cDuplicatePages);
2851 pGMM->cDuplicatePages--;
2852 }
2853
2854 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2855 }
2856 else
2857 {
2858 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not shared!\n", iPage, paPages[iPage].idSharedPage));
2859 rc = VERR_GMM_PAGE_NOT_SHARED;
2860 break;
2861 }
2862 }
2863 else
2864 {
2865 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (shared)\n", iPage, paPages[iPage].idSharedPage));
2866 rc = VERR_GMM_PAGE_NOT_FOUND;
2867 break;
2868 }
2869 }
2870 } /* for each page to update */
2871
2872 if (RT_SUCCESS(rc) && cPagesToAlloc > 0)
2873 {
2874#if defined(VBOX_STRICT) && 0 /** @todo re-test this later. Appeared to be a PGM init bug. */
2875 for (iPage = 0; iPage < cPagesToAlloc; iPage++)
2876 {
2877 Assert(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS);
2878 Assert(paPages[iPage].idPage == NIL_GMM_PAGEID);
2879 Assert(paPages[iPage].idSharedPage == NIL_GMM_PAGEID);
2880 }
2881#endif
2882
2883 /*
2884 * Join paths with GMMR0AllocatePages for the allocation.
2885 * Note! gmmR0AllocateMoreChunks may leave the protection of the mutex!
2886 */
2887 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPagesToAlloc, paPages, GMMACCOUNT_BASE);
2888 }
2889 }
2890 else
2891 rc = VERR_WRONG_ORDER;
2892 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2893 }
2894 else
2895 rc = VERR_GMM_IS_NOT_SANE;
2896 gmmR0MutexRelease(pGMM);
2897 LogFlow(("GMMR0AllocateHandyPages: returns %Rrc\n", rc));
2898 return rc;
2899}
2900
2901
2902/**
2903 * Allocate one or more pages.
2904 *
2905 * This is typically used for ROMs and MMIO2 (VRAM) during VM creation.
2906 * The allocated pages are not cleared and will contain random garbage.
2907 *
2908 * @returns VBox status code:
2909 * @retval VINF_SUCCESS on success.
2910 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2911 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2912 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2913 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2914 * that is we're trying to allocate more than we've reserved.
2915 *
2916 * @param pVM Pointer to the VM.
2917 * @param idCpu The VCPU id.
2918 * @param cPages The number of pages to allocate.
2919 * @param paPages Pointer to the page descriptors.
2920 * See GMMPAGEDESC for details on what is expected on input.
2921 * @param enmAccount The account to charge.
2922 *
2923 * @thread EMT.
2924 */
2925GMMR0DECL(int) GMMR0AllocatePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2926{
2927 LogFlow(("GMMR0AllocatePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
2928
2929 /*
2930 * Validate, get basics and take the semaphore.
2931 */
2932 PGMM pGMM;
2933 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2934 PGVM pGVM;
2935 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2936 if (RT_FAILURE(rc))
2937 return rc;
2938
2939 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2940 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
2941 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
2942
2943 for (unsigned iPage = 0; iPage < cPages; iPage++)
2944 {
2945 AssertMsgReturn( paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2946 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE
2947 || ( enmAccount == GMMACCOUNT_BASE
2948 && paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2949 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK)),
2950 ("#%#x: %RHp enmAccount=%d\n", iPage, paPages[iPage].HCPhysGCPhys, enmAccount),
2951 VERR_INVALID_PARAMETER);
2952 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2953 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2954 }
2955
2956 gmmR0MutexAcquire(pGMM);
2957 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2958 {
2959
2960 /* No allocations before the initial reservation has been made! */
2961 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
2962 && pGVM->gmm.s.Stats.Reserved.cFixedPages
2963 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
2964 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPages, paPages, enmAccount);
2965 else
2966 rc = VERR_WRONG_ORDER;
2967 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2968 }
2969 else
2970 rc = VERR_GMM_IS_NOT_SANE;
2971 gmmR0MutexRelease(pGMM);
2972 LogFlow(("GMMR0AllocatePages: returns %Rrc\n", rc));
2973 return rc;
2974}
2975
2976
2977/**
2978 * VMMR0 request wrapper for GMMR0AllocatePages.
2979 *
2980 * @returns see GMMR0AllocatePages.
2981 * @param pVM Pointer to the VM.
2982 * @param idCpu The VCPU id.
2983 * @param pReq Pointer to the request packet.
2984 */
2985GMMR0DECL(int) GMMR0AllocatePagesReq(PVM pVM, VMCPUID idCpu, PGMMALLOCATEPAGESREQ pReq)
2986{
2987 /*
2988 * Validate input and pass it on.
2989 */
2990 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2991 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2992 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0]),
2993 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0])),
2994 VERR_INVALID_PARAMETER);
2995 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages]),
2996 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages])),
2997 VERR_INVALID_PARAMETER);
2998
2999 return GMMR0AllocatePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3000}
3001
3002
3003/**
3004 * Allocate a large page to represent guest RAM.
3005 *
3006 * The allocated pages are not cleared and will contain random garbage.
3007 *
3008 * @returns VBox status code:
3009 * @retval VINF_SUCCESS on success.
3010 * @retval VERR_NOT_OWNER if the caller is not an EMT.
3011 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
3012 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
3013 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
3014 * that is we're trying to allocate more than we've reserved.
3015 * @param pVM Pointer to the VM.
3016 * @param idCpu The VCPU id.
3017 * @param cbPage Large page size; must equal GMM_CHUNK_SIZE.
3018 * @param pIdPage,pHCPhys Where to return the page ID and host physical address of the chunk's first page.
3019 */
3020GMMR0DECL(int) GMMR0AllocateLargePage(PVM pVM, VMCPUID idCpu, uint32_t cbPage, uint32_t *pIdPage, RTHCPHYS *pHCPhys)
3021{
3022 LogFlow(("GMMR0AllocateLargePage: pVM=%p cbPage=%x\n", pVM, cbPage));
3023
3024 AssertReturn(cbPage == GMM_CHUNK_SIZE, VERR_INVALID_PARAMETER);
3025 AssertPtrReturn(pIdPage, VERR_INVALID_PARAMETER);
3026 AssertPtrReturn(pHCPhys, VERR_INVALID_PARAMETER);
3027
3028 /*
3029 * Validate, get basics and take the semaphore.
3030 */
3031 PGMM pGMM;
3032 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3033 PGVM pGVM;
3034 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3035 if (RT_FAILURE(rc))
3036 return rc;
3037
3038 /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
3039 if (pGMM->fLegacyAllocationMode)
3040 return VERR_NOT_SUPPORTED;
3041
3042 *pHCPhys = NIL_RTHCPHYS;
3043 *pIdPage = NIL_GMM_PAGEID;
3044
3045 gmmR0MutexAcquire(pGMM);
3046 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3047 {
3048 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3049 if (RT_UNLIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
3050 > pGVM->gmm.s.Stats.Reserved.cBasePages))
3051 {
3052 Log(("GMMR0AllocateLargePage: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
3053 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3054 gmmR0MutexRelease(pGMM);
3055 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
3056 }
3057
3058 /*
3059 * Allocate a new large page chunk.
3060 *
3061 * Note! We leave the giant GMM lock temporarily as the allocation might
3062 * take a long time. gmmR0RegisterChunk will retake it (ugly).
3063 */
3064 AssertCompile(GMM_CHUNK_SIZE == _2M);
3065 gmmR0MutexRelease(pGMM);
3066
3067 RTR0MEMOBJ hMemObj;
3068 rc = RTR0MemObjAllocPhysEx(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS, GMM_CHUNK_SIZE);
3069 if (RT_SUCCESS(rc))
3070 {
3071 PGMMCHUNKFREESET pSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
3072 PGMMCHUNK pChunk;
3073 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, GMM_CHUNK_FLAGS_LARGE_PAGE, &pChunk);
3074 if (RT_SUCCESS(rc))
3075 {
3076 /*
3077 * Allocate all the pages in the chunk.
3078 */
3079 /* Unlink the new chunk from the free list. */
3080 gmmR0UnlinkChunk(pChunk);
3081
3082 /** @todo rewrite this to skip the looping. */
3083 /* Allocate all pages. */
3084 GMMPAGEDESC PageDesc;
3085 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3086
3087 /* Return the first page as we'll use the whole chunk as one big page. */
3088 *pIdPage = PageDesc.idPage;
3089 *pHCPhys = PageDesc.HCPhysGCPhys;
3090
3091 for (unsigned i = 1; i < cPages; i++)
3092 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3093
3094 /* Update accounting. */
3095 pGVM->gmm.s.Stats.Allocated.cBasePages += cPages;
3096 pGVM->gmm.s.Stats.cPrivatePages += cPages;
3097 pGMM->cAllocatedPages += cPages;
3098
3099 gmmR0LinkChunk(pChunk, pSet);
3100 gmmR0MutexRelease(pGMM);
3101 }
3102 else
3103 RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
3104 }
3105 }
3106 else
3107 {
3108 gmmR0MutexRelease(pGMM);
3109 rc = VERR_GMM_IS_NOT_SANE;
3110 }
3111
3112 LogFlow(("GMMR0AllocateLargePage: returns %Rrc\n", rc));
3113 return rc;
3114}
3115
3116
3117/**
3118 * Free a large page.
3119 *
3120 * @returns VBox status code:
3121 * @param pVM Pointer to the VM.
3122 * @param idCpu The VCPU id.
3123 * @param idPage The large page id.
3124 */
3125GMMR0DECL(int) GMMR0FreeLargePage(PVM pVM, VMCPUID idCpu, uint32_t idPage)
3126{
3127 LogFlow(("GMMR0FreeLargePage: pVM=%p idPage=%x\n", pVM, idPage));
3128
3129 /*
3130 * Validate, get basics and take the semaphore.
3131 */
3132 PGMM pGMM;
3133 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3134 PGVM pGVM;
3135 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3136 if (RT_FAILURE(rc))
3137 return rc;
3138
3139 /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
3140 if (pGMM->fLegacyAllocationMode)
3141 return VERR_NOT_SUPPORTED;
3142
3143 gmmR0MutexAcquire(pGMM);
3144 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3145 {
3146 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3147
3148 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3149 {
3150 Log(("GMMR0FreeLargePage: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3151 gmmR0MutexRelease(pGMM);
3152 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3153 }
3154
3155 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3156 if (RT_LIKELY( pPage
3157 && GMM_PAGE_IS_PRIVATE(pPage)))
3158 {
3159 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3160 Assert(pChunk);
3161 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3162 Assert(pChunk->cPrivate > 0);
3163
3164 /* Release the memory immediately. */
3165 gmmR0FreeChunk(pGMM, NULL, pChunk, false /*fRelaxedSem*/); /** @todo this can be relaxed too! */
3166
3167 /* Update accounting. */
3168 pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages;
3169 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
3170 pGMM->cAllocatedPages -= cPages;
3171 }
3172 else
3173 rc = VERR_GMM_PAGE_NOT_FOUND;
3174 }
3175 else
3176 rc = VERR_GMM_IS_NOT_SANE;
3177
3178 gmmR0MutexRelease(pGMM);
3179 LogFlow(("GMMR0FreeLargePage: returns %Rrc\n", rc));
3180 return rc;
3181}
3182
3183
3184/**
3185 * VMMR0 request wrapper for GMMR0FreeLargePage.
3186 *
3187 * @returns see GMMR0FreeLargePage.
3188 * @param pVM Pointer to the VM.
3189 * @param idCpu The VCPU id.
3190 * @param pReq Pointer to the request packet.
3191 */
3192GMMR0DECL(int) GMMR0FreeLargePageReq(PVM pVM, VMCPUID idCpu, PGMMFREELARGEPAGEREQ pReq)
3193{
3194 /*
3195 * Validate input and pass it on.
3196 */
3197 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3198 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3199 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMFREELARGEPAGEREQ),
3200 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMFREELARGEPAGEREQ)),
3201 VERR_INVALID_PARAMETER);
3202
3203 return GMMR0FreeLargePage(pVM, idCpu, pReq->idPage);
3204}
3205
3206
3207/**
3208 * Frees a chunk, giving it back to the host OS.
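3209 * @returns @c true if the giant GMM lock was released and retaken while freeing, @c false if it was held throughout.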
3210 * @param pGMM Pointer to the GMM instance.
3211 * @param pGVM This is set when called from GMMR0CleanupVM so we can
3212 * unmap and free the chunk in one go.
3213 * @param pChunk The chunk to free.
3214 * @param fRelaxedSem Whether we can release the semaphore while doing the
3215 * freeing (@c true) or not.
3216 */
3217static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3218{
3219 Assert(pChunk->Core.Key != NIL_GMM_CHUNKID);
3220
3221 GMMR0CHUNKMTXSTATE MtxState;
3222 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
3223
3224 /*
3225 * Cleanup hack! Unmap the chunk from the caller's address space.
3226 * This shouldn't happen, so screw lock contention...
3227 */
3228 if ( pChunk->cMappingsX
3229 && !pGMM->fLegacyAllocationMode
3230 && pGVM)
3231 gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3232
3233 /*
3234 * If there are current mappings of the chunk, then request the
3235 * VMs to unmap them. Reposition the chunk in the free list so
3236 * it won't be a likely candidate for allocations.
3237 */
3238 if (pChunk->cMappingsX)
3239 {
3240 /** @todo R0 -> VM request */
3241 /* The chunk can be mapped by more than one VM if fBoundMemoryMode is false! */
3242 Log(("gmmR0FreeChunk: chunk still has %d mappings; don't free!\n", pChunk->cMappingsX));
3243 gmmR0ChunkMutexRelease(&MtxState, pChunk);
3244 return false;
3245 }
3246
3247
3248 /*
3249 * Save and trash the handle.
3250 */
3251 RTR0MEMOBJ const hMemObj = pChunk->hMemObj;
3252 pChunk->hMemObj = NIL_RTR0MEMOBJ;
3253
3254 /*
3255 * Unlink it from everywhere.
3256 */
3257 gmmR0UnlinkChunk(pChunk);
3258
3259 RTListNodeRemove(&pChunk->ListNode);
3260
3261 PAVLU32NODECORE pCore = RTAvlU32Remove(&pGMM->pChunks, pChunk->Core.Key);
3262 Assert(pCore == &pChunk->Core); NOREF(pCore);
3263
3264 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(pChunk->Core.Key)];
3265 if (pTlbe->pChunk == pChunk)
3266 {
3267 pTlbe->idChunk = NIL_GMM_CHUNKID;
3268 pTlbe->pChunk = NULL;
3269 }
3270
3271 Assert(pGMM->cChunks > 0);
3272 pGMM->cChunks--;
3273
3274 /*
3275 * Free the Chunk ID before dropping the locks and freeing the rest.
3276 */
3277 gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
3278 pChunk->Core.Key = NIL_GMM_CHUNKID;
3279
3280 pGMM->cFreedChunks++;
3281
3282 gmmR0ChunkMutexRelease(&MtxState, NULL);
3283 if (fRelaxedSem)
3284 gmmR0MutexRelease(pGMM);
3285
3286 RTMemFree(pChunk->paMappingsX);
3287 pChunk->paMappingsX = NULL;
3288
3289 RTMemFree(pChunk);
3290
3291 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
3292 AssertLogRelRC(rc);
3293
3294 if (fRelaxedSem)
3295 gmmR0MutexAcquire(pGMM);
3296 return fRelaxedSem;
3297}
3298
3299
3300/**
3301 * Free page worker.
3302 *
3303 * The caller does all the statistic decrementing, we do all the incrementing.
3304 *
3305 * @param pGMM Pointer to the GMM instance data.
3306 * @param pGVM Pointer to the GVM instance.
3307 * @param pChunk Pointer to the chunk this page belongs to.
3308 * @param idPage The Page ID.
3309 * @param pPage Pointer to the page.
3310 */
3311static void gmmR0FreePageWorker(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint32_t idPage, PGMMPAGE pPage)
3312{
3313 Log3(("F pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x\n",
3314 pPage, pPage - &pChunk->aPages[0], idPage, pPage->Common.u2State, pChunk->iFreeHead)); NOREF(idPage);
3315
3316 /*
3317 * Put the page on the free list.
3318 */
3319 pPage->u = 0;
3320 pPage->Free.u2State = GMM_PAGE_STATE_FREE;
3321 Assert(pChunk->iFreeHead < RT_ELEMENTS(pChunk->aPages) || pChunk->iFreeHead == UINT16_MAX);
3322 pPage->Free.iNext = pChunk->iFreeHead;
3323 pChunk->iFreeHead = pPage - &pChunk->aPages[0];
3324
3325 /*
3326 * Update statistics (the cShared/cPrivate stats are up to date already),
3327 * and relink the chunk if necessary.
3328 */
3329 unsigned const cFree = pChunk->cFree;
3330 if ( !cFree
3331 || gmmR0SelectFreeSetList(cFree) != gmmR0SelectFreeSetList(cFree + 1))
3332 {
3333 gmmR0UnlinkChunk(pChunk);
3334 pChunk->cFree++;
3335 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
3336 }
3337 else
3338 {
3339 pChunk->cFree = cFree + 1;
3340 pChunk->pSet->cFreePages++;
3341 }
3342
3343 /*
3344 * If the chunk becomes empty, consider giving memory back to the host OS.
3345 *
3346 * The current strategy is to try to give it back if there are other chunks
3347 * in this free list, meaning if there are at least 240 free pages in this
3348 * category. Note that since there are probably mappings of the chunk,
3349 * it won't be freed up instantly, which probably screws up this logic
3350 * a bit...
3351 */
3352 /** @todo Do this on the way out. */
3353 if (RT_UNLIKELY( pChunk->cFree == GMM_CHUNK_NUM_PAGES
3354 && pChunk->pFreeNext
3355 && pChunk->pFreePrev /** @todo this is probably misfiring, see reset... */
3356 && !pGMM->fLegacyAllocationMode))
3357 gmmR0FreeChunk(pGMM, NULL, pChunk, false);
3358
3359}
3360
3361
3362/**
3363 * Frees a shared page, the page is known to exist and be valid and such.
3364 *
3365 * @param pGMM Pointer to the GMM instance.
3366 * @param pGVM Pointer to the GVM instance.
3367 * @param idPage The page id.
3368 * @param pPage The page structure.
3369 */
3370DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3371{
3372 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3373 Assert(pChunk);
3374 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3375 Assert(pChunk->cShared > 0);
3376 Assert(pGMM->cSharedPages > 0);
3377 Assert(pGMM->cAllocatedPages > 0);
3378 Assert(!pPage->Shared.cRefs);
3379
3380 pChunk->cShared--;
3381 pGMM->cAllocatedPages--;
3382 pGMM->cSharedPages--;
3383 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3384}
3385
3386
3387/**
3388 * Frees a private page, the page is known to exist and be valid and such.
3389 *
3390 * @param pGMM Pointer to the GMM instance.
3391 * @param pGVM Pointer to the GVM instance.
3392 * @param idPage The page id.
3393 * @param pPage The page structure.
3394 */
3395DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3396{
3397 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3398 Assert(pChunk);
3399 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3400 Assert(pChunk->cPrivate > 0);
3401 Assert(pGMM->cAllocatedPages > 0);
3402
3403 pChunk->cPrivate--;
3404 pGMM->cAllocatedPages--;
3405 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3406}
3407
3408
3409/**
3410 * Common worker for GMMR0FreePages and GMMR0BalloonedPages.
3411 *
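3412 * @returns VBox status code: VINF_SUCCESS or the first failure encountered,
3413 * normally VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH, VERR_GMM_NOT_PAGE_OWNER,
3414 * VERR_GMM_PAGE_ALREADY_FREE or VERR_GMM_PAGE_NOT_FOUND.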
3415 * @param pGMM Pointer to the GMM instance data.
3416 * @param pGVM Pointer to the GVM instance.
3417 * @param cPages The number of pages to free.
3418 * @param paPages Pointer to the page descriptors.
3419 * @param enmAccount The account this relates to.
3420 */
3421static int gmmR0FreePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3422{
3423 /*
3424 * Check that the request isn't impossible with respect to the account status.
3425 */
3426 switch (enmAccount)
3427 {
3428 case GMMACCOUNT_BASE:
3429 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3430 {
3431 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3432 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3433 }
3434 break;
3435 case GMMACCOUNT_SHADOW:
3436 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages < cPages))
3437 {
3438 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
3439 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3440 }
3441 break;
3442 case GMMACCOUNT_FIXED:
3443 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages < cPages))
3444 {
3445 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
3446 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3447 }
3448 break;
3449 default:
3450 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3451 }
3452
3453 /*
3454 * Walk the descriptors and free the pages.
3455 *
3456 * Statistics (except the account) are being updated as we go along,
3457 * unlike the alloc code. Also, stop on the first error.
3458 */
3459 int rc = VINF_SUCCESS;
3460 uint32_t iPage;
3461 for (iPage = 0; iPage < cPages; iPage++)
3462 {
3463 uint32_t idPage = paPages[iPage].idPage;
3464 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3465 if (RT_LIKELY(pPage))
3466 {
3467 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
3468 {
3469 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
3470 {
3471 Assert(pGVM->gmm.s.Stats.cPrivatePages);
3472 pGVM->gmm.s.Stats.cPrivatePages--;
3473 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
3474 }
3475 else
3476 {
3477 Log(("gmmR0FreePages: #%#x/%#x: not owner! hGVM=%#x hSelf=%#x\n", iPage, idPage,
3478 pPage->Private.hGVM, pGVM->hSelf));
3479 rc = VERR_GMM_NOT_PAGE_OWNER;
3480 break;
3481 }
3482 }
3483 else if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3484 {
3485 Assert(pGVM->gmm.s.Stats.cSharedPages);
3486 Assert(pPage->Shared.cRefs);
3487#if defined(VBOX_WITH_PAGE_SHARING) && defined(VBOX_STRICT) && HC_ARCH_BITS == 64
3488 if (pPage->Shared.u14Checksum)
3489 {
3490 uint32_t uChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
3491 uChecksum &= UINT32_C(0x00003fff);
3492 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum,
3493 ("%#x vs %#x - idPage=%#x\n", uChecksum, pPage->Shared.u14Checksum, idPage));
3494 }
3495#endif
3496 pGVM->gmm.s.Stats.cSharedPages--;
3497 if (!--pPage->Shared.cRefs)
3498 gmmR0FreeSharedPage(pGMM, pGVM, idPage, pPage);
3499 else
3500 {
3501 Assert(pGMM->cDuplicatePages);
3502 pGMM->cDuplicatePages--;
3503 }
3504 }
3505 else
3506 {
3507 Log(("gmmR0FreePages: #%#x/%#x: already free!\n", iPage, idPage));
3508 rc = VERR_GMM_PAGE_ALREADY_FREE;
3509 break;
3510 }
3511 }
3512 else
3513 {
3514 Log(("gmmR0FreePages: #%#x/%#x: not found!\n", iPage, idPage));
3515 rc = VERR_GMM_PAGE_NOT_FOUND;
3516 break;
3517 }
3518 paPages[iPage].idPage = NIL_GMM_PAGEID;
3519 }
3520
3521 /*
3522 * Update the account.
3523 */
3524 switch (enmAccount)
3525 {
3526 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= iPage; break;
3527 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= iPage; break;
3528 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= iPage; break;
3529 default:
3530 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3531 }
3532
3533 /*
3534 * Any threshold stuff to be done here?
3535 */
3536
3537 return rc;
3538}
3539
3540
3541/**
3542 * Free one or more pages.
3543 *
3544 * This is typically used at reset time or power off.
3545 *
3546 * @returns VBox status code:
3547 * @retval VINF_SUCCESS on success.
3548 *
3549 * @param pVM Pointer to the VM.
3550 * @param idCpu The VCPU id.
3551 * @param cPages The number of pages to free.
3552 * @param paPages Pointer to the page descriptors containing the Page IDs for each page.
3553 * @param enmAccount The account this relates to.
3554 * @thread EMT.
3555 */
3556GMMR0DECL(int) GMMR0FreePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3557{
3558 LogFlow(("GMMR0FreePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
3559
3560 /*
3561 * Validate input and get the basics.
3562 */
3563 PGMM pGMM;
3564 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3565 PGVM pGVM;
3566 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3567 if (RT_FAILURE(rc))
3568 return rc;
3569
3570 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3571 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3572 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3573
3574 for (unsigned iPage = 0; iPage < cPages; iPage++)
3575 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
3576 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
3577 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3578
3579 /*
3580 * Take the semaphore and call the worker function.
3581 */
3582 gmmR0MutexAcquire(pGMM);
3583 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3584 {
3585 rc = gmmR0FreePages(pGMM, pGVM, cPages, paPages, enmAccount);
3586 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3587 }
3588 else
3589 rc = VERR_GMM_IS_NOT_SANE;
3590 gmmR0MutexRelease(pGMM);
3591 LogFlow(("GMMR0FreePages: returns %Rrc\n", rc));
3592 return rc;
3593}
3594
3595
3596/**
3597 * VMMR0 request wrapper for GMMR0FreePages.
3598 *
3599 * @returns see GMMR0FreePages.
3600 * @param pVM Pointer to the VM.
3601 * @param idCpu The VCPU id.
3602 * @param pReq Pointer to the request packet.
3603 */
3604GMMR0DECL(int) GMMR0FreePagesReq(PVM pVM, VMCPUID idCpu, PGMMFREEPAGESREQ pReq)
3605{
3606 /*
3607 * Validate input and pass it on.
3608 */
3609 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3610 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3611 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0]),
3612 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0])),
3613 VERR_INVALID_PARAMETER);
3614 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages]),
3615 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages])),
3616 VERR_INVALID_PARAMETER);
3617
3618 return GMMR0FreePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3619}
3620
3621
3622/**
3623 * Report back on a memory ballooning request.
3624 *
3625 * The request may or may not have been initiated by the GMM. If it was initiated
3626 * by the GMM it is important that this function is called even if no pages were
3627 * ballooned.
3628 *
3629 * @returns VBox status code:
3630 * @retval VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH
3631 * @retval VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH
3632 * @retval VERR_GMM_OVERCOMMITTED_TRY_AGAIN_IN_A_BIT - reset condition
3633 * indicating that we won't necessarily have sufficient RAM to boot
3634 * the VM again and that it should pause until this changes (we'll try to
3635 * balloon some other VM). (For standard deflate we have little choice
3636 * but to hope the VM won't use the memory that was returned to it.)
3637 *
3638 * @param pVM Pointer to the VM.
3639 * @param idCpu The VCPU id.
3640 * @param enmAction Inflate/deflate/reset.
3641 * @param cBalloonedPages The number of pages that were ballooned.
3642 *
3643 * @thread EMT.
3644 */
3645GMMR0DECL(int) GMMR0BalloonedPages(PVM pVM, VMCPUID idCpu, GMMBALLOONACTION enmAction, uint32_t cBalloonedPages)
3646{
3647 LogFlow(("GMMR0BalloonedPages: pVM=%p enmAction=%d cBalloonedPages=%#x\n",
3648 pVM, enmAction, cBalloonedPages));
3649
3650 AssertMsgReturn(cBalloonedPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cBalloonedPages), VERR_INVALID_PARAMETER);
3651
3652 /*
3653 * Validate input and get the basics.
3654 */
3655 PGMM pGMM;
3656 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3657 PGVM pGVM;
3658 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3659 if (RT_FAILURE(rc))
3660 return rc;
3661
3662 /*
3663 * Take the semaphore and do some more validations.
3664 */
3665 gmmR0MutexAcquire(pGMM);
3666 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3667 {
3668 switch (enmAction)
3669 {
3670 case GMMBALLOONACTION_INFLATE:
3671 {
3672 if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cBalloonedPages
3673 <= pGVM->gmm.s.Stats.Reserved.cBasePages))
3674 {
3675 /*
3676 * Record the ballooned memory.
3677 */
3678 pGMM->cBalloonedPages += cBalloonedPages;
3679 if (pGVM->gmm.s.Stats.cReqBalloonedPages)
3680 {
3681 /* Codepath never taken. Might be interesting in the future to request ballooned memory from guests in low memory conditions.. */
3682 AssertFailed();
3683
3684 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3685 pGVM->gmm.s.Stats.cReqActuallyBalloonedPages += cBalloonedPages;
3686 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx Req=%#llx Actual=%#llx (pending)\n",
3687 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages,
3688 pGVM->gmm.s.Stats.cReqBalloonedPages, pGVM->gmm.s.Stats.cReqActuallyBalloonedPages));
3689 }
3690 else
3691 {
3692 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3693 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3694 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3695 }
3696 }
3697 else
3698 {
3699 Log(("GMMR0BalloonedPages: cBasePages=%#llx Total=%#llx cBalloonedPages=%#x Reserved=%#llx\n",
3700 pGVM->gmm.s.Stats.Allocated.cBasePages, pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages,
3701 pGVM->gmm.s.Stats.Reserved.cBasePages));
3702 rc = VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3703 }
3704 break;
3705 }
3706
3707 case GMMBALLOONACTION_DEFLATE:
3708 {
3709 /* Deflate. */
3710 if (pGVM->gmm.s.Stats.cBalloonedPages >= cBalloonedPages)
3711 {
3712 /*
3713 * Record the ballooned memory.
3714 */
3715 Assert(pGMM->cBalloonedPages >= cBalloonedPages);
3716 pGMM->cBalloonedPages -= cBalloonedPages;
3717 pGVM->gmm.s.Stats.cBalloonedPages -= cBalloonedPages;
3718 if (pGVM->gmm.s.Stats.cReqDeflatePages)
3719 {
3720 AssertFailed(); /* This path is for later. */
3721 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx Req=%#llx\n",
3722 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages, pGVM->gmm.s.Stats.cReqDeflatePages));
3723
3724 /*
3725 * Anything we need to do here now when the request has been completed?
3726 */
3727 pGVM->gmm.s.Stats.cReqDeflatePages = 0;
3728 }
3729 else
3730 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3731 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3732 }
3733 else
3734 {
3735 Log(("GMMR0BalloonedPages: Total=%#llx cBalloonedPages=%#x\n", pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages));
3736 rc = VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH;
3737 }
3738 break;
3739 }
3740
3741 case GMMBALLOONACTION_RESET:
3742 {
3743 /* Reset to an empty balloon. */
3744 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
3745
3746 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
3747 pGVM->gmm.s.Stats.cBalloonedPages = 0;
3748 break;
3749 }
3750
3751 default:
3752 rc = VERR_INVALID_PARAMETER;
3753 break;
3754 }
3755 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3756 }
3757 else
3758 rc = VERR_GMM_IS_NOT_SANE;
3759
3760 gmmR0MutexRelease(pGMM);
3761 LogFlow(("GMMR0BalloonedPages: returns %Rrc\n", rc));
3762 return rc;
3763}
3764
3765
3766/**
3767 * VMMR0 request wrapper for GMMR0BalloonedPages.
3768 *
3769 * @returns see GMMR0BalloonedPages.
3770 * @param pVM Pointer to the VM.
3771 * @param idCpu The VCPU id.
3772 * @param pReq Pointer to the request packet.
3773 */
3774GMMR0DECL(int) GMMR0BalloonedPagesReq(PVM pVM, VMCPUID idCpu, PGMMBALLOONEDPAGESREQ pReq)
3775{
3776 /*
3777 * Validate input and pass it on.
3778 */
3779 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3780 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3781 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMBALLOONEDPAGESREQ),
3782 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMBALLOONEDPAGESREQ)),
3783 VERR_INVALID_PARAMETER);
3784
3785 return GMMR0BalloonedPages(pVM, idCpu, pReq->enmAction, pReq->cBalloonedPages);
3786}
3787
3788/**
3789 * Return memory statistics for the hypervisor
3790 *
3791 * @returns VBox status code:
3792 * @param pVM Pointer to the VM.
3793 * @param pReq Pointer to the request packet.
3794 */
3795GMMR0DECL(int) GMMR0QueryHypervisorMemoryStatsReq(PVM pVM, PGMMMEMSTATSREQ pReq)
3796{
3797 /*
3798 * Validate input and pass it on.
3799 */
3800 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3801 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3802 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3803 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3804 VERR_INVALID_PARAMETER);
3805
3806 /*
3807 * Validate input and get the basics.
3808 */
3809 PGMM pGMM;
3810 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3811 pReq->cAllocPages = pGMM->cAllocatedPages;
3812 pReq->cFreePages = (pGMM->cChunks << (GMM_CHUNK_SHIFT- PAGE_SHIFT)) - pGMM->cAllocatedPages;
3813 pReq->cBalloonedPages = pGMM->cBalloonedPages;
3814 pReq->cMaxPages = pGMM->cMaxPages;
3815 pReq->cSharedPages = pGMM->cDuplicatePages;
3816 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3817
3818 return VINF_SUCCESS;
3819}
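/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of the free-page
 * arithmetic used by GMMR0QueryHypervisorMemoryStatsReq above. The shift
 * values are assumptions (2 MB chunks, 4 KB pages); the real ones come from
 * the GMM_CHUNK_SHIFT and PAGE_SHIFT definitions. sketchFreePages is a
 * hypothetical helper name.
 */

```cpp
#include <cassert>
#include <cstdint>

static const unsigned kSketchChunkShift = 21; /* assumed: log2 of a 2 MB chunk */
static const unsigned kSketchPageShift  = 12; /* assumed: log2 of a 4 KB page */

/* Pages backed by all chunks minus the pages currently handed out. */
static uint64_t sketchFreePages(uint64_t cChunks, uint64_t cAllocatedPages)
{
    uint64_t cTotalPages = cChunks << (kSketchChunkShift - kSketchPageShift);
    assert(cAllocatedPages <= cTotalPages);
    return cTotalPages - cAllocatedPages;
}
```

/* Under these assumptions each chunk contributes 512 pages, so 3 chunks with
 * 1000 pages allocated leave 536 pages free. */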
3820
3821/**
3822 * Return memory statistics for the VM
3823 *
3824 * @returns VBox status code:
3825 * @param pVM Pointer to the VM.
3826 * @param idCpu Cpu id.
3827 * @param pReq Pointer to the request packet.
3828 */
3829GMMR0DECL(int) GMMR0QueryMemoryStatsReq(PVM pVM, VMCPUID idCpu, PGMMMEMSTATSREQ pReq)
3830{
3831 /*
3832 * Validate input and pass it on.
3833 */
3834 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3835 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3836 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3837 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3838 VERR_INVALID_PARAMETER);
3839
3840 /*
3841 * Validate input and get the basics.
3842 */
3843 PGMM pGMM;
3844 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3845 PGVM pGVM;
3846 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3847 if (RT_FAILURE(rc))
3848 return rc;
3849
3850 /*
3851 * Take the semaphore and do some more validations.
3852 */
3853 gmmR0MutexAcquire(pGMM);
3854 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3855 {
3856 pReq->cAllocPages = pGVM->gmm.s.Stats.Allocated.cBasePages;
3857 pReq->cBalloonedPages = pGVM->gmm.s.Stats.cBalloonedPages;
3858 pReq->cMaxPages = pGVM->gmm.s.Stats.Reserved.cBasePages;
3859 pReq->cFreePages = pReq->cMaxPages - pReq->cAllocPages;
3860 }
3861 else
3862 rc = VERR_GMM_IS_NOT_SANE;
3863
3864 gmmR0MutexRelease(pGMM);
3865 LogFlow(("GMMR0QueryMemoryStatsReq: returns %Rrc\n", rc));
3866 return rc;
3867}
3868
3869
3870/**
3871 * Worker for gmmR0UnmapChunk and gmmR0FreeChunk.
3872 *
3873 * Don't call this in legacy allocation mode!
3874 *
3875 * @returns VBox status code.
3876 * @param pGMM Pointer to the GMM instance data.
3877 * @param pGVM Pointer to the Global VM structure.
3878 * @param pChunk Pointer to the chunk to be unmapped.
3879 */
3880static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
3881{
3882 Assert(!pGMM->fLegacyAllocationMode);
3883
3884 /*
3885 * Find the mapping and try unmapping it.
3886 */
3887 uint32_t cMappings = pChunk->cMappingsX;
3888 for (uint32_t i = 0; i < cMappings; i++)
3889 {
3890 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
3891 if (pChunk->paMappingsX[i].pGVM == pGVM)
3892 {
3893 /* unmap */
3894 int rc = RTR0MemObjFree(pChunk->paMappingsX[i].hMapObj, false /* fFreeMappings (NA) */);
3895 if (RT_SUCCESS(rc))
3896 {
3897 /* update the record. */
3898 cMappings--;
3899 if (i < cMappings)
3900 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
3901 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
3902 pChunk->paMappingsX[cMappings].pGVM = NULL;
3903 Assert(pChunk->cMappingsX - 1U == cMappings);
3904 pChunk->cMappingsX = cMappings;
3905 }
3906
3907 return rc;
3908 }
3909 }
3910
3911 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3912 return VERR_GMM_CHUNK_NOT_MAPPED;
3913}
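/*
 * Illustration only, not part of GMMR0.cpp: gmmR0UnmapChunkLocked above
 * removes a mapping record by overwriting it with the last record and
 * shrinking the count, which keeps the array dense without shifting entries.
 * The sketch below shows the same idea on a std::vector; SketchMapping and
 * sketchRemoveMapping are hypothetical names, and the integer fields stand
 * in for the real pGVM pointer and memory object handle.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SketchMapping
{
    int idOwner; /* stand-in for the owning VM */
    int hMapObj; /* stand-in for the mapping handle */
};

/* Removes the record owned by idOwner; returns false if there is none. */
static bool sketchRemoveMapping(std::vector<SketchMapping> &rMappings, int idOwner)
{
    for (size_t i = 0; i < rMappings.size(); i++)
        if (rMappings[i].idOwner == idOwner)
        {
            size_t const iLast = rMappings.size() - 1;
            if (i < iLast)
                rMappings[i] = rMappings[iLast]; /* fill the hole with the last record */
            rMappings.pop_back();
            return true;
        }
    return false;
}
```

/* The order of the remaining records is not preserved, which is acceptable
 * here because lookups scan the whole array anyway. */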
3914
3915
3916/**
3917 * Unmaps a chunk previously mapped into the address space of the current process.
3918 *
3919 * @returns VBox status code.
3920 * @param pGMM Pointer to the GMM instance data.
3921 * @param pGVM Pointer to the Global VM structure.
3922 * @param pChunk Pointer to the chunk to be unmapped.
3923 */
3924static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3925{
3926 if (!pGMM->fLegacyAllocationMode)
3927 {
3928 /*
3929 * Lock the chunk and if possible leave the giant GMM lock.
3930 */
3931 GMMR0CHUNKMTXSTATE MtxState;
3932 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
3933 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
3934 if (RT_SUCCESS(rc))
3935 {
3936 rc = gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3937 gmmR0ChunkMutexRelease(&MtxState, pChunk);
3938 }
3939 return rc;
3940 }
3941
3942 if (pChunk->hGVM == pGVM->hSelf)
3943 return VINF_SUCCESS;
3944
3945 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x (legacy)\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3946 return VERR_GMM_CHUNK_NOT_MAPPED;
3947}
3948
3949
3950/**
3951 * Worker for gmmR0MapChunk.
3952 *
3953 * @returns VBox status code.
3954 * @param pGMM Pointer to the GMM instance data.
3955 * @param pGVM Pointer to the Global VM structure.
3956 * @param pChunk Pointer to the chunk to be mapped.
3957 * @param ppvR3 Where to store the ring-3 address of the mapping.
3958 * In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
3959 * contain the address of the existing mapping.
3960 */
3961static int gmmR0MapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
3962{
3963 /*
3964 * If we're in legacy mode this is simple.
3965 */
3966 if (pGMM->fLegacyAllocationMode)
3967 {
3968 if (pChunk->hGVM != pGVM->hSelf)
3969 {
3970 Log(("gmmR0MapChunk: chunk %#x is not owned by pGVM=%p/%#x (legacy)\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3971 return VERR_GMM_CHUNK_NOT_FOUND;
3972 }
3973
3974 *ppvR3 = RTR0MemObjAddressR3(pChunk->hMemObj);
3975 return VINF_SUCCESS;
3976 }
3977
3978 /*
3979 * Check to see if the chunk is already mapped.
3980 */
3981 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
3982 {
3983 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
3984 if (pChunk->paMappingsX[i].pGVM == pGVM)
3985 {
3986 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
3987 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
3988#ifdef VBOX_WITH_PAGE_SHARING
3989 /* The ring-3 chunk cache can be out of sync; don't fail. */
3990 return VINF_SUCCESS;
3991#else
3992 return VERR_GMM_CHUNK_ALREADY_MAPPED;
3993#endif
3994 }
3995 }
3996
3997 /*
3998 * Do the mapping.
3999 */
4000 RTR0MEMOBJ hMapObj;
4001 int rc = RTR0MemObjMapUser(&hMapObj, pChunk->hMemObj, (RTR3PTR)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4002 if (RT_SUCCESS(rc))
4003 {
4004 /* reallocate the array? assumes few users per chunk (usually one). */
4005 unsigned iMapping = pChunk->cMappingsX;
4006 if ( iMapping <= 3
4007 || (iMapping & 3) == 0)
4008 {
4009 unsigned cNewSize = iMapping <= 3
4010 ? iMapping + 1
4011 : iMapping + 4;
4012 Assert(cNewSize < 4 || RT_ALIGN_32(cNewSize, 4) == cNewSize);
4013 if (RT_UNLIKELY(cNewSize > UINT16_MAX))
4014 {
4015 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4016 return VERR_GMM_TOO_MANY_CHUNK_MAPPINGS;
4017 }
4018
4019 void *pvMappings = RTMemRealloc(pChunk->paMappingsX, cNewSize * sizeof(pChunk->paMappingsX[0]));
4020 if (RT_UNLIKELY(!pvMappings))
4021 {
4022 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4023 return VERR_NO_MEMORY;
4024 }
4025 pChunk->paMappingsX = (PGMMCHUNKMAP)pvMappings;
4026 }
4027
4028 /* insert new entry */
4029 pChunk->paMappingsX[iMapping].hMapObj = hMapObj;
4030 pChunk->paMappingsX[iMapping].pGVM = pGVM;
4031 Assert(pChunk->cMappingsX == iMapping);
4032 pChunk->cMappingsX = iMapping + 1;
4033
4034 *ppvR3 = RTR0MemObjAddressR3(hMapObj);
4035 }
4036
4037 return rc;
4038}
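/*
 * Illustration only, not part of GMMR0.cpp: the reallocation condition in
 * gmmR0MapChunkLocked above grows the mapping array one entry at a time up
 * to four entries and in blocks of four after that. The hypothetical helper
 * below mirrors that condition: given the current number of mappings, it
 * returns the new array size to reallocate to, or 0 when the existing
 * allocation still has room.
 */

```cpp
#include <cassert>

static unsigned sketchNewArraySize(unsigned cMappings)
{
    if (cMappings <= 3)
        return cMappings + 1;   /* small arrays: grow by exactly one entry */
    if ((cMappings & 3) == 0)
        return cMappings + 4;   /* block is full: add another block of four */
    return 0;                   /* room left in the current block */
}
```

/* For example, the 5th mapping (cMappings == 4) triggers a growth to 8
 * entries, and the 6th through 8th mappings need no reallocation. */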
4039
4040
4041/**
4042 * Maps a chunk into the user address space of the current process.
4043 *
4044 * @returns VBox status code.
4045 * @param pGMM Pointer to the GMM instance data.
4046 * @param pGVM Pointer to the Global VM structure.
4047 * @param pChunk Pointer to the chunk to be mapped.
4048 * @param fRelaxedSem Whether we can release the semaphore while doing the
4049 * mapping (@c true) or not.
4050 * @param ppvR3 Where to store the ring-3 address of the mapping.
4051 * In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
4052 * contain the address of the existing mapping.
4053 */
4054static int gmmR0MapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem, PRTR3PTR ppvR3)
4055{
4056 /*
4057 * Take the chunk lock and leave the giant GMM lock when possible, then
4058 * call the worker function.
4059 */
4060 GMMR0CHUNKMTXSTATE MtxState;
4061 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
4062 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
4063 if (RT_SUCCESS(rc))
4064 {
4065 rc = gmmR0MapChunkLocked(pGMM, pGVM, pChunk, ppvR3);
4066 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4067 }
4068
4069 return rc;
4070}
4071
4072
4073
4074#if defined(VBOX_WITH_PAGE_SHARING) || (defined(VBOX_STRICT) && HC_ARCH_BITS == 64)
4075/**
4076 * Check if a chunk is mapped into the specified VM
4077 *
4078 * @returns true if the chunk is mapped into the VM, false if not.
4079 * @param pGMM Pointer to the GMM instance.
4080 * @param pGVM Pointer to the Global VM structure.
4081 * @param pChunk Pointer to the chunk to check.
4082 * @param ppvR3 Where to store the ring-3 address of the mapping (NULL if not mapped).
4083 */
4084static bool gmmR0IsChunkMapped(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4085{
4086 GMMR0CHUNKMTXSTATE MtxState;
4087 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
4088 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4089 {
4090 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4091 if (pChunk->paMappingsX[i].pGVM == pGVM)
4092 {
4093 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4094 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4095 return true;
4096 }
4097 }
4098 *ppvR3 = NULL;
4099 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4100 return false;
4101}
4102#endif /* VBOX_WITH_PAGE_SHARING || (VBOX_STRICT && 64-BIT) */
4103
4104
4105/**
4106 * Map a chunk and/or unmap another chunk.
4107 *
4108 * The mapping and unmapping applies to the current process.
4109 *
4110 * This API does two things because it saves a kernel call per mapping when
4111 * the ring-3 mapping cache is full.
4112 *
4113 * @returns VBox status code.
4114 * @param pVM The VM.
4115 * @param idChunkMap The chunk to map. NIL_GMM_CHUNKID if nothing to map.
4116 * @param idChunkUnmap The chunk to unmap. NIL_GMM_CHUNKID if nothing to unmap.
4117 * @param ppvR3 Where to store the address of the mapped chunk. NULL is ok if nothing to map.
4118 * @thread EMT
4119 */
4120GMMR0DECL(int) GMMR0MapUnmapChunk(PVM pVM, uint32_t idChunkMap, uint32_t idChunkUnmap, PRTR3PTR ppvR3)
4121{
4122 LogFlow(("GMMR0MapUnmapChunk: pVM=%p idChunkMap=%#x idChunkUnmap=%#x ppvR3=%p\n",
4123 pVM, idChunkMap, idChunkUnmap, ppvR3));
4124
4125 /*
4126 * Validate input and get the basics.
4127 */
4128 PGMM pGMM;
4129 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4130 PGVM pGVM;
4131 int rc = GVMMR0ByVM(pVM, &pGVM);
4132 if (RT_FAILURE(rc))
4133 return rc;
4134
4135 AssertCompile(NIL_GMM_CHUNKID == 0);
4136 AssertMsgReturn(idChunkMap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkMap), VERR_INVALID_PARAMETER);
4137 AssertMsgReturn(idChunkUnmap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkUnmap), VERR_INVALID_PARAMETER);
4138
4139 if ( idChunkMap == NIL_GMM_CHUNKID
4140 && idChunkUnmap == NIL_GMM_CHUNKID)
4141 return VERR_INVALID_PARAMETER;
4142
4143 if (idChunkMap != NIL_GMM_CHUNKID)
4144 {
4145 AssertPtrReturn(ppvR3, VERR_INVALID_POINTER);
4146 *ppvR3 = NIL_RTR3PTR;
4147 }
4148
4149 /*
4150 * Take the semaphore and do the work.
4151 *
4152 * The unmapping is done last since it's easier to undo a mapping than
4153 * to undo an unmapping. The ring-3 mapping cache cannot be so big
4154 * that it pushes the user virtual address space to within a chunk of
4155 * its limits, so there is no problem here.
4156 */
4157 gmmR0MutexAcquire(pGMM);
4158 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4159 {
4160 PGMMCHUNK pMap = NULL;
4161 if (idChunkMap != NIL_GMM_CHUNKID)
4162 {
4163 pMap = gmmR0GetChunk(pGMM, idChunkMap);
4164 if (RT_LIKELY(pMap))
4165 rc = gmmR0MapChunk(pGMM, pGVM, pMap, true /*fRelaxedSem*/, ppvR3);
4166 else
4167 {
4168 Log(("GMMR0MapUnmapChunk: idChunkMap=%#x\n", idChunkMap));
4169 rc = VERR_GMM_CHUNK_NOT_FOUND;
4170 }
4171 }
4172/** @todo split this operation, the bail out might (theoretically) not be
4173 * entirely safe. */
4174
4175 if ( idChunkUnmap != NIL_GMM_CHUNKID
4176 && RT_SUCCESS(rc))
4177 {
4178 PGMMCHUNK pUnmap = gmmR0GetChunk(pGMM, idChunkUnmap);
4179 if (RT_LIKELY(pUnmap))
4180 rc = gmmR0UnmapChunk(pGMM, pGVM, pUnmap, true /*fRelaxedSem*/);
4181 else
4182 {
4183 Log(("GMMR0MapUnmapChunk: idChunkUnmap=%#x\n", idChunkUnmap));
4184 rc = VERR_GMM_CHUNK_NOT_FOUND;
4185 }
4186
4187 if (RT_FAILURE(rc) && pMap)
4188 gmmR0UnmapChunk(pGMM, pGVM, pMap, false /*fRelaxedSem*/);
4189 }
4190
4191 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4192 }
4193 else
4194 rc = VERR_GMM_IS_NOT_SANE;
4195 gmmR0MutexRelease(pGMM);
4196
4197 LogFlow(("GMMR0MapUnmapChunk: returns %Rrc\n", rc));
4198 return rc;
4199}
4200
4201
4202/**
4203 * VMMR0 request wrapper for GMMR0MapUnmapChunk.
4204 *
4205 * @returns see GMMR0MapUnmapChunk.
4206 * @param pVM Pointer to the VM.
4207 * @param pReq Pointer to the request packet.
4208 */
4209GMMR0DECL(int) GMMR0MapUnmapChunkReq(PVM pVM, PGMMMAPUNMAPCHUNKREQ pReq)
4210{
4211 /*
4212 * Validate input and pass it on.
4213 */
4214 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4215 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4216 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4217
4218 return GMMR0MapUnmapChunk(pVM, pReq->idChunkMap, pReq->idChunkUnmap, &pReq->pvR3);
4219}
4220
4221
4222/**
4223 * Legacy mode API for supplying pages.
4224 *
4225 * The specified user address points to an allocation-chunk-sized block that
4226 * will be locked down and used by the GMM when the VM asks for pages.
4227 *
4228 * @returns VBox status code.
4229 * @param pVM Pointer to the VM.
4230 * @param idCpu The VCPU id.
4231 * @param pvR3 Pointer to the chunk size memory block to lock down.
4232 */
4233GMMR0DECL(int) GMMR0SeedChunk(PVM pVM, VMCPUID idCpu, RTR3PTR pvR3)
4234{
4235 /*
4236 * Validate input and get the basics.
4237 */
4238 PGMM pGMM;
4239 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4240 PGVM pGVM;
4241 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4242 if (RT_FAILURE(rc))
4243 return rc;
4244
4245 AssertPtrReturn(pvR3, VERR_INVALID_POINTER);
4246 AssertReturn(!(PAGE_OFFSET_MASK & pvR3), VERR_INVALID_POINTER);
4247
4248 if (!pGMM->fLegacyAllocationMode)
4249 {
4250 Log(("GMMR0SeedChunk: not in legacy allocation mode!\n"));
4251 return VERR_NOT_SUPPORTED;
4252 }
4253
4254 /*
4255 * Lock the memory and add it as new chunk with our hGVM.
4256 * (The GMM locking is done inside gmmR0RegisterChunk.)
4257 */
4258 RTR0MEMOBJ MemObj;
4259 rc = RTR0MemObjLockUser(&MemObj, pvR3, GMM_CHUNK_SIZE, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4260 if (RT_SUCCESS(rc))
4261 {
4262 rc = gmmR0RegisterChunk(pGMM, &pGVM->gmm.s.Private, MemObj, pGVM->hSelf, 0 /*fChunkFlags*/, NULL);
4263 if (RT_SUCCESS(rc))
4264 gmmR0MutexRelease(pGMM);
4265 else
4266 RTR0MemObjFree(MemObj, false /* fFreeMappings */);
4267 }
4268
4269 LogFlow(("GMMR0SeedChunk: rc=%d (pvR3=%p)\n", rc, pvR3));
4270 return rc;
4271}
4272
4273#ifdef VBOX_WITH_PAGE_SHARING
4274
4275# ifdef VBOX_STRICT
4276/**
4277 * For checksumming shared pages in strict builds.
4278 *
4279 * The purpose is making sure that a page doesn't change.
4280 *
4281 * @returns Checksum, 0 on failure.
4282 * @param pGMM The GMM instance data.
4283 * @param idPage The page ID.
4284 */
4285static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage)
4286{
4287 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
4288 AssertMsgReturn(pChunk, ("idPage=%#x\n", idPage), 0);
4289
4290 uint8_t *pbChunk;
4291 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4292 return 0;
4293 uint8_t const *pbPage = pbChunk + ((idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4294
4295 return RTCrc32(pbPage, PAGE_SIZE);
4296}
4297# endif /* VBOX_STRICT */
4298
4299
4300/**
4301 * Calculates the module hash value.
4302 *
4303 * @returns Hash value.
4304 * @param pszModuleName The module name.
4305 * @param pszVersion The module version string.
4306 */
4307static uint32_t gmmR0ShModCalcHash(const char *pszModuleName, const char *pszVersion)
4308{
4309 return RTStrHash1ExN(3, pszModuleName, RTSTR_MAX, "::", (size_t)2, pszVersion, RTSTR_MAX);
4310}
4311
4312
4313/**
4314 * Finds a global module.
4315 *
4316 * @returns Pointer to the global module on success, NULL if not found.
4317 * @param pGMM The GMM instance data.
4318 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4319 * @param cbModule The module size.
4320 * @param enmGuestOS The guest OS type.
4321 * @param pszModuleName The module name.
4322 * @param pszVersion The module version.
4323 */
4324static PGMMSHAREDMODULE gmmR0ShModFindGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4325 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4326 struct VMMDEVSHAREDREGIONDESC const *paRegions)
4327{
4328 for (PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTAvllU32Get(&pGMM->pGlobalSharedModuleTree, uHash);
4329 pGblMod;
4330 pGblMod = (PGMMSHAREDMODULE)pGblMod->Core.pList)
4331 {
4332 if (pGblMod->cbModule != cbModule)
4333 continue;
4334 if (pGblMod->enmGuestOS != enmGuestOS)
4335 continue;
4336 if (pGblMod->cRegions != cRegions)
4337 continue;
4338 if (strcmp(pGblMod->szName, pszModuleName))
4339 continue;
4340 if (strcmp(pGblMod->szVersion, pszVersion))
4341 continue;
4342
4343 uint32_t i;
4344 for (i = 0; i < cRegions; i++)
4345 {
4346 uint32_t off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4347 if (pGblMod->aRegions[i].off != off)
4348 break;
4349
4350 uint32_t cb = RT_ALIGN_32(paRegions[i].cbRegion + off, PAGE_SIZE);
4351 if (pGblMod->aRegions[i].cb != cb)
4352 break;
4353 }
4354
4355 if (i == cRegions)
4356 return pGblMod;
4357 }
4358
4359 return NULL;
4360}
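/*
 * Illustration only, not part of GMMR0.cpp: gmmR0ShModFindGlobal above and
 * gmmR0ShModNewGlobal below both reduce a guest region to a page offset plus
 * a page-aligned size before comparing or storing it. A minimal sketch of
 * that normalization, assuming 4 KB pages; SketchRegion and
 * sketchNormalizeRegion are hypothetical names.
 */

```cpp
#include <cassert>
#include <cstdint>

static const uint32_t kSketchPageSize = 4096; /* assumed page size */

struct SketchRegion
{
    uint32_t off; /* offset of the region start within its first page */
    uint32_t cb;  /* size from the start of that page, rounded up to whole pages */
};

static SketchRegion sketchNormalizeRegion(uint64_t GCRegionAddr, uint32_t cbRegion)
{
    SketchRegion Region;
    Region.off = (uint32_t)(GCRegionAddr & (kSketchPageSize - 1));
    Region.cb  = (cbRegion + Region.off + kSketchPageSize - 1) & ~(kSketchPageSize - 1);
    return Region;
}
```

/* A 0x1000 byte region starting at 0x401234 therefore has offset 0x234 and
 * spans two pages (0x2000 bytes), because it crosses a page boundary. */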
4361
4362
4363/**
4364 * Creates a new global module.
4365 *
4366 * @returns VBox status code.
4367 * @param pGMM The GMM instance data.
4368 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4369 * @param cbModule The module size.
4370 * @param enmGuestOS The guest OS type.
4371 * @param cRegions The number of regions.
4372 * @param pszModuleName The module name.
4373 * @param pszVersion The module version.
4374 * @param paRegions The region descriptions.
4375 * @param ppGblMod Where to return the new module on success.
4376 */
4377static int gmmR0ShModNewGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4378 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4379 struct VMMDEVSHAREDREGIONDESC const *paRegions, PGMMSHAREDMODULE *ppGblMod)
4380{
4381 Log(("gmmR0ShModNewGlobal: %s %s size %#x os %u rgn %u\n", pszModuleName, pszVersion, cbModule, enmGuestOS, cRegions));
4382 if (pGMM->cShareableModules >= GMM_MAX_SHARED_GLOBAL_MODULES)
4383 {
4384 Log(("gmmR0ShModNewGlobal: Too many modules\n"));
4385 return VERR_GMM_TOO_MANY_GLOBAL_MODULES;
4386 }
4387
4388 PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTMemAllocZ(RT_OFFSETOF(GMMSHAREDMODULE, aRegions[cRegions]));
4389 if (!pGblMod)
4390 {
4391 Log(("gmmR0ShModNewGlobal: No memory\n"));
4392 return VERR_NO_MEMORY;
4393 }
4394
4395 pGblMod->Core.Key = uHash;
4396 pGblMod->cbModule = cbModule;
4397 pGblMod->cRegions = cRegions;
4398 pGblMod->cUsers = 1;
4399 pGblMod->enmGuestOS = enmGuestOS;
4400 strcpy(pGblMod->szName, pszModuleName);
4401 strcpy(pGblMod->szVersion, pszVersion);
4402
4403 for (uint32_t i = 0; i < cRegions; i++)
4404 {
4405 Log(("gmmR0ShModNewGlobal: rgn[%u]=%RGvLB%#x\n", i, paRegions[i].GCRegionAddr, paRegions[i].cbRegion));
4406 pGblMod->aRegions[i].off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4407 pGblMod->aRegions[i].cb = paRegions[i].cbRegion + pGblMod->aRegions[i].off;
4408 pGblMod->aRegions[i].cb = RT_ALIGN_32(pGblMod->aRegions[i].cb, PAGE_SIZE);
4409 pGblMod->aRegions[i].paidPages = NULL; /* allocated when needed. */
4410 }
4411
4412 bool fInsert = RTAvllU32Insert(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4413 Assert(fInsert); NOREF(fInsert);
4414 pGMM->cShareableModules++;
4415
4416 *ppGblMod = pGblMod;
4417 return VINF_SUCCESS;
4418}
4419
4420
4421/**
4422 * Deletes a global module which is no longer referenced by anyone.
4423 *
4424 * @param pGMM The GMM instance data.
4425 * @param pGblMod The module to delete.
4426 */
4427static void gmmR0ShModDeleteGlobal(PGMM pGMM, PGMMSHAREDMODULE pGblMod)
4428{
4429 Assert(pGblMod->cUsers == 0);
4430 Assert(pGMM->cShareableModules > 0 && pGMM->cShareableModules <= GMM_MAX_SHARED_GLOBAL_MODULES);
4431
4432 void *pvTest = RTAvllU32RemoveNode(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4433 Assert(pvTest == pGblMod); NOREF(pvTest);
4434 pGMM->cShareableModules--;
4435
4436 uint32_t i = pGblMod->cRegions;
4437 while (i-- > 0)
4438 {
4439 if (pGblMod->aRegions[i].paidPages)
4440 {
4441 /* We don't do anything to the pages as they are handled by the
4442 copy-on-write mechanism in PGM. */
4443 RTMemFree(pGblMod->aRegions[i].paidPages);
4444 pGblMod->aRegions[i].paidPages = NULL;
4445 }
4446 }
4447 RTMemFree(pGblMod);
4448}
4449
4450
4451static int gmmR0ShModNewPerVM(PGVM pGVM, RTGCPTR GCBaseAddr, uint32_t cRegions, const VMMDEVSHAREDREGIONDESC *paRegions,
4452 PGMMSHAREDMODULEPERVM *ppRecVM)
4453{
4454 if (pGVM->gmm.s.Stats.cShareableModules >= GMM_MAX_SHARED_PER_VM_MODULES)
4455 return VERR_GMM_TOO_MANY_PER_VM_MODULES;
4456
4457 PGMMSHAREDMODULEPERVM pRecVM;
4458 pRecVM = (PGMMSHAREDMODULEPERVM)RTMemAllocZ(RT_OFFSETOF(GMMSHAREDMODULEPERVM, aRegionsGCPtrs[cRegions]));
4459 if (!pRecVM)
4460 return VERR_NO_MEMORY;
4461
4462 pRecVM->Core.Key = GCBaseAddr;
4463 for (uint32_t i = 0; i < cRegions; i++)
4464 pRecVM->aRegionsGCPtrs[i] = paRegions[i].GCRegionAddr;
4465
4466 bool fInsert = RTAvlGCPtrInsert(&pGVM->gmm.s.pSharedModuleTree, &pRecVM->Core);
4467 Assert(fInsert); NOREF(fInsert);
4468 pGVM->gmm.s.Stats.cShareableModules++;
4469
4470 *ppRecVM = pRecVM;
4471 return VINF_SUCCESS;
4472}
4473
4474
4475static void gmmR0ShModDeletePerVM(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULEPERVM pRecVM, bool fRemove)
4476{
4477 /*
4478 * Free the per-VM module.
4479 */
4480 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
4481 pRecVM->pGlobalModule = NULL;
4482
4483 if (fRemove)
4484 {
4485 void *pvTest = RTAvlGCPtrRemove(&pGVM->gmm.s.pSharedModuleTree, pRecVM->Core.Key);
4486 Assert(pvTest == &pRecVM->Core); NOREF(pvTest);
4487 }
4488
4489 RTMemFree(pRecVM);
4490
4491 /*
4492 * Release the global module.
4493 * (In the registration bailout case, there might not be one.)
4494 */
4495 if (pGblMod)
4496 {
4497 Assert(pGblMod->cUsers > 0);
4498 pGblMod->cUsers--;
4499 if (pGblMod->cUsers == 0)
4500 gmmR0ShModDeleteGlobal(pGMM, pGblMod);
4501 }
4502}
4503
4504#endif /* VBOX_WITH_PAGE_SHARING */
4505
4506/**
4507 * Registers a new shared module for the VM.
4508 *
4509 * @returns VBox status code.
4510 * @param pVM Pointer to the VM.
4511 * @param idCpu The VCPU id.
4512 * @param enmGuestOS The guest OS type.
4513 * @param pszModuleName The module name.
4514 * @param pszVersion The module version.
4515 * @param GCPtrModBase The module base address.
4516 * @param cbModule The module size.
4517 * @param cRegions The number of shared region descriptors.
4518 * @param paRegions Pointer to an array of shared region(s).
4519 */
4520GMMR0DECL(int) GMMR0RegisterSharedModule(PVM pVM, VMCPUID idCpu, VBOXOSFAMILY enmGuestOS, char *pszModuleName,
4521 char *pszVersion, RTGCPTR GCPtrModBase, uint32_t cbModule,
4522 uint32_t cRegions, struct VMMDEVSHAREDREGIONDESC const *paRegions)
4523{
4524#ifdef VBOX_WITH_PAGE_SHARING
4525 /*
4526 * Validate input and get the basics.
4527 *
4528 * Note! Turns out the module size does not necessarily match the size of the
4529 * regions. (iTunes on XP)
4530 */
4531 PGMM pGMM;
4532 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4533 PGVM pGVM;
4534 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4535 if (RT_FAILURE(rc))
4536 return rc;
4537
4538 if (RT_UNLIKELY(cRegions > VMMDEVSHAREDREGIONDESC_MAX))
4539 return VERR_GMM_TOO_MANY_REGIONS;
4540
4541 if (RT_UNLIKELY(cbModule == 0 || cbModule > _1G))
4542 return VERR_GMM_BAD_SHARED_MODULE_SIZE;
4543
4544 uint32_t cbTotal = 0;
4545 for (uint32_t i = 0; i < cRegions; i++)
4546 {
4547 if (RT_UNLIKELY(paRegions[i].cbRegion == 0 || paRegions[i].cbRegion > _1G))
4548 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4549
4550 cbTotal += paRegions[i].cbRegion;
4551 if (RT_UNLIKELY(cbTotal > _1G))
4552 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4553 }
4554
4555 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
4556 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
4557 return VERR_GMM_MODULE_NAME_TOO_LONG;
4558
4559 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
4560 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
4561 return VERR_GMM_MODULE_NAME_TOO_LONG;
4562
4563 uint32_t const uHash = gmmR0ShModCalcHash(pszModuleName, pszVersion);
4564 Log(("GMMR0RegisterSharedModule %s %s base %RGv size %x hash %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule, uHash));
4565
4566 /*
4567 * Take the semaphore and do some more validations.
4568 */
4569 gmmR0MutexAcquire(pGMM);
4570 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4571 {
4572 /*
4573 * Check if this module is already locally registered and register
4574 * it if it isn't. The base address is a unique module identifier
4575 * locally.
4576 */
4577 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
4578 bool fNewModule = pRecVM == NULL;
4579 if (fNewModule)
4580 {
4581 rc = gmmR0ShModNewPerVM(pGVM, GCPtrModBase, cRegions, paRegions, &pRecVM);
4582 if (RT_SUCCESS(rc))
4583 {
4584 /*
4585 * Find a matching global module, register a new one if needed.
4586 */
4587 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4588 pszModuleName, pszVersion, paRegions);
4589 if (!pGblMod)
4590 {
4591 Assert(fNewModule);
4592 rc = gmmR0ShModNewGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4593 pszModuleName, pszVersion, paRegions, &pGblMod);
4594 if (RT_SUCCESS(rc))
4595 {
4596 pRecVM->pGlobalModule = pGblMod; /* (One reference returned by gmmR0ShModNewGlobal.) */
4597 Log(("GMMR0RegisterSharedModule: new module %s %s\n", pszModuleName, pszVersion));
4598 }
4599 else
4600 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
4601 }
4602 else
4603 {
4604 Assert(pGblMod->cUsers > 0 && pGblMod->cUsers < UINT32_MAX / 2);
4605 pGblMod->cUsers++;
4606 pRecVM->pGlobalModule = pGblMod;
4607
4608 Log(("GMMR0RegisterSharedModule: new per vm module %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4609 }
4610 }
4611 }
4612 else
4613 {
4614 /*
4615 * Attempt to re-register an existing module.
4616 */
4617 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4618 pszModuleName, pszVersion, paRegions);
4619 if (pRecVM->pGlobalModule == pGblMod)
4620 {
4621 Log(("GMMR0RegisterSharedModule: already registered %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4622 rc = VINF_GMM_SHARED_MODULE_ALREADY_REGISTERED;
4623 }
4624 else
4625 {
4626 /** @todo may have to unregister+register when this happens in case it's caused
4627 * by VBoxService crashing and being restarted... */
4628 Log(("GMMR0RegisterSharedModule: Address clash!\n"
4629 " incoming at %RGvLB%#x %s %s rgns %u\n"
4630 " existing at %RGvLB%#x %s %s rgns %u\n",
4631 GCPtrModBase, cbModule, pszModuleName, pszVersion, cRegions,
4632 pRecVM->Core.Key, pRecVM->pGlobalModule->cbModule, pRecVM->pGlobalModule->szName,
4633 pRecVM->pGlobalModule->szVersion, pRecVM->pGlobalModule->cRegions));
4634 rc = VERR_GMM_SHARED_MODULE_ADDRESS_CLASH;
4635 }
4636 }
4637 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4638 }
4639 else
4640 rc = VERR_GMM_IS_NOT_SANE;
4641
4642 gmmR0MutexRelease(pGMM);
4643 return rc;
4644#else
4645
4646 NOREF(pVM); NOREF(idCpu); NOREF(enmGuestOS); NOREF(pszModuleName); NOREF(pszVersion);
4647 NOREF(GCPtrModBase); NOREF(cbModule); NOREF(cRegions); NOREF(paRegions);
4648 return VERR_NOT_IMPLEMENTED;
4649#endif
4650}
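/*
 * Illustration only, not part of GMMR0.cpp: the region size validation in
 * GMMR0RegisterSharedModule above rejects empty or oversized regions and
 * caps the running total, checking after every addition so the 32-bit sum
 * cannot wrap. A minimal sketch of that loop; the 1 GB limit mirrors _1G,
 * and sketchAreRegionSizesValid is a hypothetical name.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

static const uint32_t kSketchMaxBytes = UINT32_C(0x40000000); /* 1 GB */

static bool sketchAreRegionSizesValid(const uint32_t *pacbRegions, size_t cRegions)
{
    uint32_t cbTotal = 0;
    for (size_t i = 0; i < cRegions; i++)
    {
        if (pacbRegions[i] == 0 || pacbRegions[i] > kSketchMaxBytes)
            return false;
        /* Both operands are at most 1 GB here, so the sum is at most 2 GB. */
        cbTotal += pacbRegions[i];
        if (cbTotal > kSketchMaxBytes)
            return false;
    }
    return true;
}
```

/* Checking the total inside the loop, rather than once at the end, is what
 * keeps the accumulator from overflowing on hostile input. */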
4651
4652
4653/**
4654 * VMMR0 request wrapper for GMMR0RegisterSharedModule.
4655 *
4656 * @returns see GMMR0RegisterSharedModule.
4657 * @param pVM Pointer to the VM.
4658 * @param idCpu The VCPU id.
4659 * @param pReq Pointer to the request packet.
4660 */
4661GMMR0DECL(int) GMMR0RegisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMREGISTERSHAREDMODULEREQ pReq)
4662{
4663 /*
4664 * Validate input and pass it on.
4665 */
4666 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4667 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4668 AssertMsgReturn(pReq->Hdr.cbReq >= sizeof(*pReq) && pReq->Hdr.cbReq == RT_UOFFSETOF(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions]), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4669
4670 /* Pass back return code in the request packet to preserve informational codes. (VMMR3CallR0 chokes on them) */
4671 pReq->rc = GMMR0RegisterSharedModule(pVM, idCpu, pReq->enmGuestOS, pReq->szName, pReq->szVersion,
4672 pReq->GCBaseAddr, pReq->cbModule, pReq->cRegions, pReq->aRegions);
4673 return VINF_SUCCESS;
4674}
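/*
 * Illustration only, not part of GMMR0.cpp: GMMR0RegisterSharedModuleReq
 * above validates a variable-length request by comparing the size in the
 * header against the offset of the element just past the last region. A
 * minimal sketch of that check on a simplified, hypothetical request layout
 * (SketchReq is not the real GMMREGISTERSHAREDMODULEREQ).
 */

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct SketchReq
{
    uint32_t cbReq;       /* total request size claimed by the caller */
    uint32_t cRegions;    /* number of entries in aRegions */
    uint32_t aRegions[1]; /* variable-length tail, declared with one entry */
};

static bool sketchIsReqSizeValid(const SketchReq *pReq)
{
    size_t const cbExpected = offsetof(SketchReq, aRegions)
                            + (size_t)pReq->cRegions * sizeof(pReq->aRegions[0]);
    /* The first test guarantees the fixed part was supplied in full. */
    return pReq->cbReq >= sizeof(SketchReq)
        && pReq->cbReq == cbExpected;
}
```

/* The sketch only inspects the header fields; it never touches the tail, so
 * a mismatched size is rejected before any region is read. */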
4675
4676
4677/**
4678 * Unregisters a shared module for the VM
4679 *
4680 * @returns VBox status code.
4681 * @param pVM Pointer to the VM.
4682 * @param idCpu The VCPU id.
4683 * @param pszModuleName The module name.
4684 * @param pszVersion The module version.
4685 * @param GCPtrModBase The module base address.
4686 * @param cbModule The module size.
4687 */
4688GMMR0DECL(int) GMMR0UnregisterSharedModule(PVM pVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion,
4689 RTGCPTR GCPtrModBase, uint32_t cbModule)
4690{
4691#ifdef VBOX_WITH_PAGE_SHARING
4692 /*
4693 * Validate input and get the basics.
4694 */
4695 PGMM pGMM;
4696 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4697 PGVM pGVM;
4698 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4699 if (RT_FAILURE(rc))
4700 return rc;
4701
4702 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
4703 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
4704 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
4705 return VERR_GMM_MODULE_NAME_TOO_LONG;
4706 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
4707 return VERR_GMM_MODULE_NAME_TOO_LONG;
4708
4709 Log(("GMMR0UnregisterSharedModule %s %s base=%RGv size %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule));
4710
4711 /*
4712 * Take the semaphore and do some more validations.
4713 */
4714 gmmR0MutexAcquire(pGMM);
4715 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4716 {
4717 /*
4718 * Locate and remove the specified module.
4719 */
4720 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
4721 if (pRecVM)
4722 {
4723 /** @todo Do we need to do more validations here, like that the
4724 * name + version + cbModule matches? */
4725 Assert(pRecVM->pGlobalModule);
4726 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
4727 }
4728 else
4729 rc = VERR_GMM_SHARED_MODULE_NOT_FOUND;
4730
4731 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4732 }
4733 else
4734 rc = VERR_GMM_IS_NOT_SANE;
4735
4736 gmmR0MutexRelease(pGMM);
4737 return rc;
4738#else
4739
4740 NOREF(pVM); NOREF(idCpu); NOREF(pszModuleName); NOREF(pszVersion); NOREF(GCPtrModBase); NOREF(cbModule);
4741 return VERR_NOT_IMPLEMENTED;
4742#endif
4743}
4744
4745
4746/**
4747 * VMMR0 request wrapper for GMMR0UnregisterSharedModule.
4748 *
4749 * @returns see GMMR0UnregisterSharedModule.
4750 * @param pVM Pointer to the VM.
4751 * @param idCpu The VCPU id.
4752 * @param pReq Pointer to the request packet.
4753 */
4754GMMR0DECL(int) GMMR0UnregisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMUNREGISTERSHAREDMODULEREQ pReq)
4755{
4756 /*
4757 * Validate input and pass it on.
4758 */
4759 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4760 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4761 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4762
4763 return GMMR0UnregisterSharedModule(pVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule);
4764}
4765
4766#ifdef VBOX_WITH_PAGE_SHARING
4767
4768/**
4769 * Increases the use count of a shared page; the page is known to exist and be valid.
4770 *
4771 * @param pGMM Pointer to the GMM instance.
4772 * @param pGVM Pointer to the GVM instance.
4773 * @param pPage The page structure.
4774 */
4775DECLINLINE(void) gmmR0UseSharedPage(PGMM pGMM, PGVM pGVM, PGMMPAGE pPage)
4776{
4777 Assert(pGMM->cSharedPages > 0);
4778 Assert(pGMM->cAllocatedPages > 0);
4779
4780 pGMM->cDuplicatePages++;
4781
4782 pPage->Shared.cRefs++;
4783 pGVM->gmm.s.Stats.cSharedPages++;
4784 pGVM->gmm.s.Stats.Allocated.cBasePages++;
4785}
4786
4787
4788/**
4789 * Converts a private page to a shared page; the page is known to exist and be valid.
4790 *
4791 * @param pGMM Pointer to the GMM instance.
4792 * @param pGVM Pointer to the GVM instance.
4793 * @param HCPhys Host physical address
4794 * @param idPage The Page ID
4795 * @param pPage The page structure.
4796 */
4797DECLINLINE(void) gmmR0ConvertToSharedPage(PGMM pGMM, PGVM pGVM, RTHCPHYS HCPhys, uint32_t idPage, PGMMPAGE pPage,
4798 PGMMSHAREDPAGEDESC pPageDesc)
4799{
4800 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
4801 Assert(pChunk);
4802 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
4803 Assert(GMM_PAGE_IS_PRIVATE(pPage));
4804
4805 pChunk->cPrivate--;
4806 pChunk->cShared++;
4807
4808 pGMM->cSharedPages++;
4809
4810 pGVM->gmm.s.Stats.cSharedPages++;
4811 pGVM->gmm.s.Stats.cPrivatePages--;
4812
4813 /* Modify the page structure. */
4814 pPage->Shared.pfn = (uint32_t)(uint64_t)(HCPhys >> PAGE_SHIFT);
4815 pPage->Shared.cRefs = 1;
4816#ifdef VBOX_STRICT
4817 pPageDesc->u32StrictChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
4818 pPage->Shared.u14Checksum = pPageDesc->u32StrictChecksum;
4819#else
4820 pPage->Shared.u14Checksum = 0;
4821#endif
4822 pPage->Shared.u2State = GMM_PAGE_STATE_SHARED;
4823}
4824
4825
4826static int gmmR0SharedModuleCheckPageFirstTime(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULE pModule,
4827 unsigned idxRegion, unsigned idxPage,
4828 PGMMSHAREDPAGEDESC pPageDesc, PGMMSHAREDREGIONDESC pGlobalRegion)
4829{
4830 NOREF(pModule);
4831
4832 /* Easy case: just change the internal page type. */
4833 PGMMPAGE pPage = gmmR0GetPage(pGMM, pPageDesc->idPage);
4834 AssertMsgReturn(pPage, ("idPage=%#x (GCPhys=%RGp HCPhys=%RHp idxRegion=%#x idxPage=%#x) #1\n",
4835 pPageDesc->idPage, pPageDesc->GCPhys, pPageDesc->HCPhys, idxRegion, idxPage),
4836 VERR_PGM_PHYS_INVALID_PAGE_ID);
4837
4838 AssertMsg(pPageDesc->GCPhys == (pPage->Private.pfn << 12), ("desc %RGp gmm %RGp\n", pPageDesc->GCPhys, (pPage->Private.pfn << 12)));
4839
4840 gmmR0ConvertToSharedPage(pGMM, pGVM, pPageDesc->HCPhys, pPageDesc->idPage, pPage, pPageDesc);
4841
4842 /* Keep track of these references. */
4843 pGlobalRegion->paidPages[idxPage] = pPageDesc->idPage;
4844
4845 return VINF_SUCCESS;
4846}
4847
4848/**
4849 * Checks the specified shared module page for changes.
4850 *
4851 * Performs the following tasks:
4852 * - If a shared page is new, then it changes the GMM page type to shared and
4853 * returns it in the pPageDesc descriptor.
4854 * - If a shared page already exists, then it checks if the VM page is
4855 * identical and if so frees the VM page and returns the shared page in
4856 * pPageDesc descriptor.
4857 *
4858 * @remarks ASSUMES the caller has acquired the GMM semaphore!!
4859 *
4860 * @returns VBox status code.
4861 * @param pGMM Pointer to the GMM instance data.
4862 * @param pGVM Pointer to the GVM instance data.
4863 * @param pModule Module description
4864 * @param idxRegion Region index
4865 * @param idxPage Page index
4866 * @param pPageDesc Page descriptor
4867 */
4868GMMR0DECL(int) GMMR0SharedModuleCheckPage(PGVM pGVM, PGMMSHAREDMODULE pModule, uint32_t idxRegion, uint32_t idxPage,
4869 PGMMSHAREDPAGEDESC pPageDesc)
4870{
4871 int rc;
4872 PGMM pGMM;
4873 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4874 pPageDesc->u32StrictChecksum = 0;
4875
4876 AssertMsgReturn(idxRegion < pModule->cRegions,
4877 ("idxRegion=%#x cRegions=%#x %s %s\n", idxRegion, pModule->cRegions, pModule->szName, pModule->szVersion),
4878 VERR_INVALID_PARAMETER);
4879
4880 uint32_t const cPages = pModule->aRegions[idxRegion].cb >> PAGE_SHIFT;
4881 AssertMsgReturn(idxPage < cPages,
4882 ("idxPage=%#x cPages=%#x %s %s\n", idxPage, cPages, pModule->szName, pModule->szVersion),
4883 VERR_INVALID_PARAMETER);
4884
4885 LogFlow(("GMMR0SharedModuleCheckPage %s base %RGv region %d idxPage %d\n", pModule->szName, pModule->Core.Key, idxRegion, idxPage));
4886
4887 /*
4888 * First time; create a page descriptor array.
4889 */
4890 PGMMSHAREDREGIONDESC pGlobalRegion = &pModule->aRegions[idxRegion];
4891 if (!pGlobalRegion->paidPages)
4892 {
4893 Log(("Allocate page descriptor array for %d pages\n", cPages));
4894 pGlobalRegion->paidPages = (uint32_t *)RTMemAlloc(cPages * sizeof(pGlobalRegion->paidPages[0]));
4895 AssertReturn(pGlobalRegion->paidPages, VERR_NO_MEMORY);
4896
4897 /* Invalidate all descriptors. */
4898 uint32_t i = cPages;
4899 while (i-- > 0)
4900 pGlobalRegion->paidPages[i] = NIL_GMM_PAGEID;
4901 }
4902
4903 /*
4904 * We've seen this shared page for the first time?
4905 */
4906 if (pGlobalRegion->paidPages[idxPage] == NIL_GMM_PAGEID)
4907 {
4908 Log(("New shared page guest %RGp host %RHp\n", pPageDesc->GCPhys, pPageDesc->HCPhys));
4909 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
4910 }
4911
4912 /*
4913 * We've seen it before...
4914 */
4915 Log(("Replace existing page guest %RGp host %RHp id %#x -> id %#x\n",
4916 pPageDesc->GCPhys, pPageDesc->HCPhys, pPageDesc->idPage, pGlobalRegion->paidPages[idxPage]));
4917 Assert(pPageDesc->idPage != pGlobalRegion->paidPages[idxPage]);
4918
4919 /*
4920 * Get the shared page source.
4921 */
4922 PGMMPAGE pPage = gmmR0GetPage(pGMM, pGlobalRegion->paidPages[idxPage]);
4923 AssertMsgReturn(pPage, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #2\n", pPageDesc->idPage, idxRegion, idxPage),
4924 VERR_PGM_PHYS_INVALID_PAGE_ID);
4925
4926 if (pPage->Common.u2State != GMM_PAGE_STATE_SHARED)
4927 {
4928 /*
4929 * Page was freed at some point; invalidate this entry.
4930 */
4931 /** @todo this isn't really bullet proof. */
4932 Log(("Old shared page was freed -> create a new one\n"));
4933 pGlobalRegion->paidPages[idxPage] = NIL_GMM_PAGEID;
4934 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
4935 }
4936
4937 Log(("Replace existing page guest host %RHp -> %RHp\n", pPageDesc->HCPhys, ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT));
4938
4939 /*
4940 * Calculate the virtual address of the local page.
4941 */
4942 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pPageDesc->idPage >> GMM_CHUNKID_SHIFT);
4943 AssertMsgReturn(pChunk, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #4\n", pPageDesc->idPage, idxRegion, idxPage),
4944 VERR_PGM_PHYS_INVALID_PAGE_ID);
4945
4946 uint8_t *pbChunk;
4947 AssertMsgReturn(gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk),
4948 ("idPage=%#x (idxRegion=%#x idxPage=%#x) #3\n", pPageDesc->idPage, idxRegion, idxPage),
4949 VERR_PGM_PHYS_INVALID_PAGE_ID);
4950 uint8_t *pbLocalPage = pbChunk + ((pPageDesc->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4951
4952 /*
4953 * Calculate the virtual address of the shared page.
4954 */
4955 pChunk = gmmR0GetChunk(pGMM, pGlobalRegion->paidPages[idxPage] >> GMM_CHUNKID_SHIFT);
4956 Assert(pChunk); /* can't fail as gmmR0GetPage succeeded. */
4957
4958 /*
4959 * Get the virtual address of the physical page; map the chunk into the VM
4960 * process if not already done.
4961 */
4962 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4963 {
4964 Log(("Map chunk into process!\n"));
4965 rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
4966 AssertRCReturn(rc, rc);
4967 }
4968 uint8_t *pbSharedPage = pbChunk + ((pGlobalRegion->paidPages[idxPage] & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4969
4970#ifdef VBOX_STRICT
4971 pPageDesc->u32StrictChecksum = RTCrc32(pbSharedPage, PAGE_SIZE);
4972 uint32_t uChecksum = pPageDesc->u32StrictChecksum & UINT32_C(0x00003fff);
4973 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum || !pPage->Shared.u14Checksum,
4974 ("%#x vs %#x - idPage=%#x - %s %s\n", uChecksum, pPage->Shared.u14Checksum,
4975 pGlobalRegion->paidPages[idxPage], pModule->szName, pModule->szVersion));
4976#endif
4977
4978 /** @todo write ASMMemComparePage. */
4979 if (memcmp(pbSharedPage, pbLocalPage, PAGE_SIZE))
4980 {
4981 Log(("Unexpected differences found between local and shared page; skip\n"));
4982 /* Signal to the caller that this one hasn't changed. */
4983 pPageDesc->idPage = NIL_GMM_PAGEID;
4984 return VINF_SUCCESS;
4985 }
4986
4987 /*
4988 * Free the old local page.
4989 */
4990 GMMFREEPAGEDESC PageDesc;
4991 PageDesc.idPage = pPageDesc->idPage;
4992 rc = gmmR0FreePages(pGMM, pGVM, 1, &PageDesc, GMMACCOUNT_BASE);
4993 AssertRCReturn(rc, rc);
4994
4995 gmmR0UseSharedPage(pGMM, pGVM, pPage);
4996
4997 /*
4998 * Pass along the new physical address & page id.
4999 */
5000 pPageDesc->HCPhys = ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT;
5001 pPageDesc->idPage = pGlobalRegion->paidPages[idxPage];
5002
5003 return VINF_SUCCESS;
5004}
5005
5006
5007/**
5008 * RTAvlGCPtrDestroy callback.
5009 *
5010 * @returns VINF_SUCCESS.
5011 * @param pNode The node to destroy.
5012 * @param pvArgs Pointer to an argument packet.
5013 */
5014static DECLCALLBACK(int) gmmR0CleanupSharedModule(PAVLGCPTRNODECORE pNode, void *pvArgs)
5015{
5016 gmmR0ShModDeletePerVM(((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGMM,
5017 ((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGVM,
5018 (PGMMSHAREDMODULEPERVM)pNode,
5019 false /*fRemove*/);
5020 return VINF_SUCCESS;
5021}
5022
5023
5024/**
5025 * Used by GMMR0CleanupVM to clean up shared modules.
5026 *
5027 * This is called without taking the GMM lock so that it can be yielded as
5028 * needed here.
5029 *
5030 * @param pGMM The GMM handle.
5031 * @param pGVM The global VM handle.
5032 */
5033static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM)
5034{
5035 gmmR0MutexAcquire(pGMM);
5036 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
5037
5038 GMMR0SHMODPERVMDTORARGS Args;
5039 Args.pGVM = pGVM;
5040 Args.pGMM = pGMM;
5041 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5042
5043 AssertMsg(pGVM->gmm.s.Stats.cShareableModules == 0, ("%d\n", pGVM->gmm.s.Stats.cShareableModules));
5044 pGVM->gmm.s.Stats.cShareableModules = 0;
5045
5046 gmmR0MutexRelease(pGMM);
5047}
5048
5049#endif /* VBOX_WITH_PAGE_SHARING */
5050
5051/**
5052 * Removes all shared modules for the specified VM
5053 *
5054 * @returns VBox status code.
5055 * @param pVM Pointer to the VM.
5056 * @param idCpu The VCPU id.
5057 */
5058GMMR0DECL(int) GMMR0ResetSharedModules(PVM pVM, VMCPUID idCpu)
5059{
5060#ifdef VBOX_WITH_PAGE_SHARING
5061 /*
5062 * Validate input and get the basics.
5063 */
5064 PGMM pGMM;
5065 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5066 PGVM pGVM;
5067 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
5068 if (RT_FAILURE(rc))
5069 return rc;
5070
5071 /*
5072 * Take the semaphore and do some more validations.
5073 */
5074 gmmR0MutexAcquire(pGMM);
5075 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5076 {
5077 Log(("GMMR0ResetSharedModules\n"));
5078 GMMR0SHMODPERVMDTORARGS Args;
5079 Args.pGVM = pGVM;
5080 Args.pGMM = pGMM;
5081 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5082 pGVM->gmm.s.Stats.cShareableModules = 0;
5083
5084 rc = VINF_SUCCESS;
5085 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5086 }
5087 else
5088 rc = VERR_GMM_IS_NOT_SANE;
5089
5090 gmmR0MutexRelease(pGMM);
5091 return rc;
5092#else
5093 NOREF(pVM); NOREF(idCpu);
5094 return VERR_NOT_IMPLEMENTED;
5095#endif
5096}
5097
5098#ifdef VBOX_WITH_PAGE_SHARING
5099
5100/**
5101 * Tree enumeration callback for checking a shared module.
5102 */
5103static DECLCALLBACK(int) gmmR0CheckSharedModule(PAVLGCPTRNODECORE pNode, void *pvUser)
5104{
5105 GMMCHECKSHAREDMODULEINFO *pArgs = (GMMCHECKSHAREDMODULEINFO*)pvUser;
5106 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)pNode;
5107 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
5108
5109 Log(("gmmR0CheckSharedModule: check %s %s base=%RGv size=%x\n",
5110 pGblMod->szName, pGblMod->szVersion, pGblMod->Core.Key, pGblMod->cbModule));
5111
5112 int rc = PGMR0SharedModuleCheck(pArgs->pGVM->pVM, pArgs->pGVM, pArgs->idCpu, pGblMod, pRecVM->aRegionsGCPtrs);
5113 if (RT_FAILURE(rc))
5114 return rc;
5115 return VINF_SUCCESS;
5116}
5117
5118#endif /* VBOX_WITH_PAGE_SHARING */
5119#ifdef DEBUG_sandervl
5120
5121/**
5122 * Setup for a GMMR0CheckSharedModules call (to allow log flush jumps back to ring 3)
5123 *
5124 * @returns VBox status code.
5125 * @param pVM Pointer to the VM.
5126 */
5127GMMR0DECL(int) GMMR0CheckSharedModulesStart(PVM pVM)
5128{
5129 /*
5130 * Validate input and get the basics.
5131 */
5132 PGMM pGMM;
5133 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5134
5135 /*
5136 * Take the semaphore and do some more validations.
5137 */
5138 gmmR0MutexAcquire(pGMM);
5139 int const rc = GMM_CHECK_SANITY_UPON_ENTERING(pGMM) ? VINF_SUCCESS : VERR_GMM_IS_NOT_SANE;
5140
5141 /* The lock stays held on return; GMMR0CheckSharedModulesEnd releases it. */
5142 NOREF(pVM);
5143
5144 return rc;
5145}
5146
5147/**
5148 * Clean up after a GMMR0CheckSharedModules call (to allow log flush jumps back to ring 3)
5149 *
5150 * @returns VBox status code.
5151 * @param pVM Pointer to the VM.
5152 */
5153GMMR0DECL(int) GMMR0CheckSharedModulesEnd(PVM pVM)
5154{
5155 /*
5156 * Validate input and get the basics.
5157 */
5158 PGMM pGMM;
5159 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5160
5161 gmmR0MutexRelease(pGMM);
5162 return VINF_SUCCESS;
5163}
5164
5165#endif /* DEBUG_sandervl */
5166
5167/**
5168 * Check all shared modules for the specified VM.
5169 *
5170 * @returns VBox status code.
5171 * @param pVM Pointer to the VM.
5172 * @param pVCpu Pointer to the VMCPU.
5173 */
5174GMMR0DECL(int) GMMR0CheckSharedModules(PVM pVM, PVMCPU pVCpu)
5175{
5176#ifdef VBOX_WITH_PAGE_SHARING
5177 /*
5178 * Validate input and get the basics.
5179 */
5180 PGMM pGMM;
5181 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5182 PGVM pGVM;
5183 int rc = GVMMR0ByVMAndEMT(pVM, pVCpu->idCpu, &pGVM);
5184 if (RT_FAILURE(rc))
5185 return rc;
5186
5187# ifndef DEBUG_sandervl
5188 /*
5189 * Take the semaphore and do some more validations.
5190 */
5191 gmmR0MutexAcquire(pGMM);
5192# endif
5193 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5194 {
5195 /*
5196 * Walk the tree, checking each module.
5197 */
5198 Log(("GMMR0CheckSharedModules\n"));
5199
5200 GMMCHECKSHAREDMODULEINFO Args;
5201 Args.pGVM = pGVM;
5202 Args.idCpu = pVCpu->idCpu;
5203 rc = RTAvlGCPtrDoWithAll(&pGVM->gmm.s.pSharedModuleTree, true /* fFromLeft */, gmmR0CheckSharedModule, &Args);
5204
5205 Log(("GMMR0CheckSharedModules done (rc=%Rrc)!\n", rc));
5206 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5207 }
5208 else
5209 rc = VERR_GMM_IS_NOT_SANE;
5210
5211# ifndef DEBUG_sandervl
5212 gmmR0MutexRelease(pGMM);
5213# endif
5214 return rc;
5215#else
5216 NOREF(pVM); NOREF(pVCpu);
5217 return VERR_NOT_IMPLEMENTED;
5218#endif
5219}
5220
5221#if defined(VBOX_STRICT) && HC_ARCH_BITS == 64
5222
5223/**
5224 * RTAvlU32DoWithAll callback.
5225 *
5226 * @returns 0 to continue the enumeration, non-zero to stop it (duplicate found).
5227 * @param pNode The node to search.
5228 * @param pvUser Pointer to the input argument packet.
5229 */
5230static DECLCALLBACK(int) gmmR0FindDupPageInChunk(PAVLU32NODECORE pNode, void *pvUser)
5231{
5232 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
5233 GMMFINDDUPPAGEINFO *pArgs = (GMMFINDDUPPAGEINFO *)pvUser;
5234 PGVM pGVM = pArgs->pGVM;
5235 PGMM pGMM = pArgs->pGMM;
5236 uint8_t *pbChunk;
5237
5238 /* Only take chunks not mapped into this VM process; not entirely correct. */
5239 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5240 {
5241 int rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5242 if (RT_SUCCESS(rc))
5243 {
5244 /*
5245 * Look for duplicate pages
5246 */
5247 unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
5248 while (iPage-- > 0)
5249 {
5250 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
5251 {
5252 uint8_t *pbDestPage = pbChunk + (iPage << PAGE_SHIFT);
5253
5254 if (!memcmp(pArgs->pSourcePage, pbDestPage, PAGE_SIZE))
5255 {
5256 pArgs->fFoundDuplicate = true;
5257 break;
5258 }
5259 }
5260 }
5261 gmmR0UnmapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/);
5262 }
5263 }
5264 return pArgs->fFoundDuplicate; /* (stops search if true) */
5265}
5266
5267
5268/**
5269 * Find a duplicate of the specified page in other active VMs
5270 *
5271 * @returns VBox status code.
5272 * @param pVM Pointer to the VM.
5273 * @param pReq Pointer to the request packet.
5274 */
5275GMMR0DECL(int) GMMR0FindDuplicatePageReq(PVM pVM, PGMMFINDDUPLICATEPAGEREQ pReq)
5276{
5277 /*
5278 * Validate input and pass it on.
5279 */
5280 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
5281 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5282 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5283
5284 PGMM pGMM;
5285 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5286
5287 PGVM pGVM;
5288 int rc = GVMMR0ByVM(pVM, &pGVM);
5289 if (RT_FAILURE(rc))
5290 return rc;
5291
5292 /*
5293 * Take the semaphore and do some more validations.
5294 */
5295 rc = gmmR0MutexAcquire(pGMM);
5296 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5297 {
5298 uint8_t *pbChunk;
5299 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pReq->idPage >> GMM_CHUNKID_SHIFT);
5300 if (pChunk)
5301 {
5302 if (gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5303 {
5304 uint8_t *pbSourcePage = pbChunk + ((pReq->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5305 PGMMPAGE pPage = gmmR0GetPage(pGMM, pReq->idPage);
5306 if (pPage)
5307 {
5308 GMMFINDDUPPAGEINFO Args;
5309 Args.pGVM = pGVM;
5310 Args.pGMM = pGMM;
5311 Args.pSourcePage = pbSourcePage;
5312 Args.fFoundDuplicate = false;
5313 RTAvlU32DoWithAll(&pGMM->pChunks, true /* fFromLeft */, gmmR0FindDupPageInChunk, &Args);
5314
5315 pReq->fDuplicate = Args.fFoundDuplicate;
5316 }
5317 else
5318 {
5319 AssertFailed();
5320 rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
5321 }
5322 }
5323 else
5324 AssertFailed();
5325 }
5326 else
5327 AssertFailed();
5328 }
5329 else
5330 rc = VERR_GMM_IS_NOT_SANE;
5331
5332 gmmR0MutexRelease(pGMM);
5333 return rc;
5334}
5335
5336#endif /* VBOX_STRICT && HC_ARCH_BITS == 64 */
5337
5338
5339/**
5340 * Retrieves the GMM statistics visible to the caller.
5341 *
5342 * @returns VBox status code.
5343 *
5344 * @param pStats Where to put the statistics.
5345 * @param pSession The current session.
5346 * @param pVM Pointer to the VM to obtain statistics for. Optional.
5347 */
5348GMMR0DECL(int) GMMR0QueryStatistics(PGMMSTATS pStats, PSUPDRVSESSION pSession, PVM pVM)
5349{
5350 LogFlow(("GMMR0QueryStatistics: pStats=%p pSession=%p pVM=%p\n", pStats, pSession, pVM));
5351
5352 /*
5353 * Validate input.
5354 */
5355 AssertPtrReturn(pSession, VERR_INVALID_POINTER);
5356 AssertPtrReturn(pStats, VERR_INVALID_POINTER);
5357 pStats->cMaxPages = 0; /* (crash before taking the mutex...) */
5358
5359 PGMM pGMM;
5360 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5361
5362 /*
5363 * Resolve the VM handle, if not NULL, and lock the GMM.
5364 */
5365 int rc;
5366 PGVM pGVM;
5367 if (pVM)
5368 {
5369 rc = GVMMR0ByVM(pVM, &pGVM);
5370 if (RT_FAILURE(rc))
5371 return rc;
5372 }
5373 else
5374 pGVM = NULL;
5375
5376 rc = gmmR0MutexAcquire(pGMM);
5377 if (RT_FAILURE(rc))
5378 return rc;
5379
5380 /*
5381 * Copy out the GMM statistics.
5382 */
5383 pStats->cMaxPages = pGMM->cMaxPages;
5384 pStats->cReservedPages = pGMM->cReservedPages;
5385 pStats->cOverCommittedPages = pGMM->cOverCommittedPages;
5386 pStats->cAllocatedPages = pGMM->cAllocatedPages;
5387 pStats->cSharedPages = pGMM->cSharedPages;
5388 pStats->cDuplicatePages = pGMM->cDuplicatePages;
5389 pStats->cLeftBehindSharedPages = pGMM->cLeftBehindSharedPages;
5390 pStats->cBalloonedPages = pGMM->cBalloonedPages;
5391 pStats->cChunks = pGMM->cChunks;
5392 pStats->cFreedChunks = pGMM->cFreedChunks;
5393 pStats->cShareableModules = pGMM->cShareableModules;
5394 RT_ZERO(pStats->au64Reserved);
5395
5396 /*
5397 * Copy out the VM statistics.
5398 */
5399 if (pGVM)
5400 pStats->VMStats = pGVM->gmm.s.Stats;
5401 else
5402 RT_ZERO(pStats->VMStats);
5403
5404 gmmR0MutexRelease(pGMM);
5405 return rc;
5406}
5407
5408
5409/**
5410 * VMMR0 request wrapper for GMMR0QueryStatistics.
5411 *
5412 * @returns see GMMR0QueryStatistics.
5413 * @param pVM Pointer to the VM. Optional.
5414 * @param pReq Pointer to the request packet.
5415 */
5416GMMR0DECL(int) GMMR0QueryStatisticsReq(PVM pVM, PGMMQUERYSTATISTICSSREQ pReq)
5417{
5418 /*
5419 * Validate input and pass it on.
5420 */
5421 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5422 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5423
5424 return GMMR0QueryStatistics(&pReq->Stats, pReq->pSession, pVM);
5425}
5426
5427
5428/**
5429 * Resets the specified GMM statistics.
5430 *
5431 * @returns VBox status code.
5432 *
5433 * @param pStats Which statistics to reset; non-zero fields
5434 * indicate which ones to reset.
5435 * @param pSession The current session.
5436 * @param pVM The VM to reset statistics for. Optional.
5437 */
5438GMMR0DECL(int) GMMR0ResetStatistics(PCGMMSTATS pStats, PSUPDRVSESSION pSession, PVM pVM)
5439{
5440 NOREF(pStats); NOREF(pSession); NOREF(pVM);
5441 /* There is currently nothing we can reset. */
5442 return VINF_SUCCESS;
5443}
5444
5445
5446/**
5447 * VMMR0 request wrapper for GMMR0ResetStatistics.
5448 *
5449 * @returns see GMMR0ResetStatistics.
5450 * @param pVM Pointer to the VM. Optional.
5451 * @param pReq Pointer to the request packet.
5452 */
5453GMMR0DECL(int) GMMR0ResetStatisticsReq(PVM pVM, PGMMRESETSTATISTICSSREQ pReq)
5454{
5455 /*
5456 * Validate input and pass it on.
5457 */
5458 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5459 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5460
5461 return GMMR0ResetStatistics(&pReq->Stats, pReq->pSession, pVM);
5462}
5463