VirtualBox source: vbox/trunk/src/VBox/VMM/VMMR0/GMMR0.cpp @ 28974

Last change r28974 by vboxsync: Dump GMM stats on PGMR3PhysAllocateHandyPages failure.

1/* $Id: GMMR0.cpp 28974 2010-05-03 13:09:44Z vboxsync $ */
2/** @file
3 * GMM - Global Memory Manager.
4 */
5
6/*
7 * Copyright (C) 2007 Oracle Corporation
8 *
9 * This file is part of VirtualBox Open Source Edition (OSE), as
10 * available from http://www.virtualbox.org. This file is free software;
11 * you can redistribute it and/or modify it under the terms of the GNU
12 * General Public License (GPL) as published by the Free Software
13 * Foundation, in version 2 as it comes in the "COPYING" file of the
14 * VirtualBox OSE distribution. VirtualBox OSE is distributed in the
15 * hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
16 */
17
18
19/** @page pg_gmm GMM - The Global Memory Manager
20 *
21 * As the name indicates, this component is responsible for global memory
22 * management. Currently only guest RAM is allocated from the GMM, but this
23 * may change to include shadow page tables and other bits later.
24 *
25 * Guest RAM is managed as individual pages, but allocated from the host OS
26 * in chunks for reasons of portability / efficiency. To minimize the memory
27 * footprint all tracking structures must be as small as possible without
28 * unnecessary performance penalties.
29 *
30 * The allocation chunks have a fixed size, defined at compile time by the
31 * #GMM_CHUNK_SIZE \#define.
32 *
33 * Each chunk is given a unique ID. Each page also has a unique ID. The
34 * relationship between the two IDs is:
35 * @code
36 * GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
37 * idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
38 * @endcode
39 * Where iPage is the index of the page within the chunk. This ID scheme
40 * permits efficient chunk and page lookup, but it relies on the chunk size
41 * to be set at compile time. The chunks are organized in an AVL tree with their
42 * IDs being the keys.
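 *
 * Going the other way, the chunk ID and the page index are recovered from a
 * page ID exactly the way gmmR0GetPage() does it further down in this file:
 * @code
 *      idChunk = idPage >> GMM_CHUNKID_SHIFT;
 *      iPage   = idPage & GMM_PAGEID_IDX_MASK;
 * @endcode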
43 *
44 * The physical address of each page in an allocation chunk is maintained by
45 * the #RTR0MEMOBJ and obtained using #RTR0MemObjGetPagePhysAddr. There is no
46 * need to duplicate this information (it would cost 8 bytes per page if we did).
47 *
48 * So what do we need to track per page? Most importantly we need to know
49 * which state the page is in:
50 * - Private - Allocated for (eventually) backing one particular VM page.
51 * - Shared - Readonly page that is used by one or more VMs and treated
52 * as COW by PGM.
53 * - Free - Not used by anyone.
54 *
55 * For the page replacement operations (sharing, defragmenting and freeing)
56 * to be somewhat efficient, private pages need to be associated with a
57 * particular page in a particular VM.
58 *
59 * Tracking the usage of shared pages is impractical and expensive, so we'll
60 * settle for a reference counting system instead.
61 *
62 * Free pages will be chained on LIFOs.
63 *
64 * On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit
65 * systems a 32-bit bitfield will have to suffice because of address space
66 * limitations. The #GMMPAGE structure shows the details.
67 *
68 *
69 * @section sec_gmm_alloc_strat Page Allocation Strategy
70 *
71 * The strategy for allocating pages has to take fragmentation and shared
72 * pages into account, or we may end up with 2000 chunks with only
73 * a few pages in each. Shared pages cannot easily be reallocated because
74 * of the inaccurate usage accounting (see above). Private pages can be
75 * reallocated by a defragmentation thread in the same manner that sharing
76 * is done.
77 *
78 * The first approach is to manage the free pages in two sets depending on
79 * whether they are mainly for the allocation of shared or private pages.
80 * In the initial implementation there will be almost no possibility for
81 * mixing shared and private pages in the same chunk (only if we're really
82 * stressed on memory), but when we implement forking of VMs and have to
83 * deal with lots of COW pages it'll start getting kind of interesting.
84 *
85 * The sets are lists of chunks with approximately the same number of
86 * free pages. Say the chunk size is 1MB, meaning 256 pages, and a set
87 * consists of 16 lists. So, the first list will contain the chunks with
88 * 1-16 free pages, the second covers 17-32, and so on. The chunks will be
89 * moved between the lists as pages are freed up or allocated.
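 *
 * As a small sketch of this bucketing (see gmmR0LinkChunk() further down for
 * the real thing, including the pFreePrev bookkeeping), the list index a
 * chunk lands on is derived directly from its free page count:
 * @code
 *      iList = (pChunk->cFree - 1) >> GMM_CHUNK_FREE_SET_SHIFT;
 *      pChunk->pFreeNext = pSet->apLists[iList];
 *      pSet->apLists[iList] = pChunk;
 * @endcode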
90 *
91 *
92 * @section sec_gmm_costs Costs
93 *
94 * The per page cost in kernel space is 32-bit plus whatever RTR0MEMOBJ
95 * entails. In addition there is the chunk cost of approximately
96 * (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page.
97 *
98 * On Windows the per page #RTR0MEMOBJ cost is 32-bit on 32-bit Windows
99 * and 64-bit on 64-bit Windows (a PFN_NUMBER in the MDL). So, 64-bit per page.
100 * The cost on Linux is identical, but here it's because of sizeof(struct page *).
101 *
102 *
103 * @section sec_gmm_legacy Legacy Mode for Non-Tier-1 Platforms
104 *
105 * In legacy mode the page source is locked user pages and not
106 * #RTR0MemObjAllocPhysNC, which means that a page can only be allocated
107 * by the VM that locked it. We will make no attempt at implementing
108 * page sharing on these systems, just do enough to make it all work.
109 *
110 *
111 * @subsection sub_gmm_locking Serializing
112 *
113 * One simple fast mutex will be employed in the initial implementation, not
114 * two as mentioned in @ref subsec_pgmPhys_Serializing.
115 *
116 * @see @ref subsec_pgmPhys_Serializing
117 *
118 *
119 * @section sec_gmm_overcommit Memory Over-Commitment Management
120 *
121 * The GVM will have to do the system wide memory over-commitment
122 * management. My current ideas are:
123 * - Per VM OC policy that indicates how much to initially commit
124 * to it and what to do in an out-of-memory situation.
125 * - Prevent overtaxing the host.
126 *
127 * There are some challenges here, the main ones are configurability and
128 * security. Should we for instance permit anyone to request 100% memory
129 * commitment? Who should be allowed to do runtime adjustments of the
130 * config? And how do we prevent these settings from being lost when the last
131 * VM process exits? The solution is probably to have an optional root
132 * daemon that will keep VMMR0.r0 in memory and enable the security measures.
133 *
134 *
135 *
136 * @section sec_gmm_numa NUMA
137 *
138 * NUMA considerations will be designed and implemented a bit later.
139 *
140 * The preliminary guess is that we will have to try to allocate memory as
141 * close as possible to the CPUs the VM is executed on (EMT and additional CPU
142 * threads), which means it's mostly about allocation and sharing policies.
143 * Both the scheduler and the allocator interface will have to supply some NUMA
144 * info and we'll need to have a way to calculate access costs.
145 *
146 */
147
148
149/*******************************************************************************
150* Header Files *
151*******************************************************************************/
152#define LOG_GROUP LOG_GROUP_GMM
153#include <VBox/gmm.h>
154#include "GMMR0Internal.h"
155#include <VBox/gvm.h>
156#include <VBox/log.h>
157#include <VBox/param.h>
158#include <VBox/err.h>
159#include <iprt/avl.h>
160#include <iprt/mem.h>
161#include <iprt/memobj.h>
162#include <iprt/semaphore.h>
163#include <iprt/string.h>
164
165
166/*******************************************************************************
167* Structures and Typedefs *
168*******************************************************************************/
169/** Pointer to set of free chunks. */
170typedef struct GMMCHUNKFREESET *PGMMCHUNKFREESET;
171
172/** Pointer to a GMM allocation chunk. */
173typedef struct GMMCHUNK *PGMMCHUNK;
174
175/**
176 * The per-page tracking structure employed by the GMM.
177 *
178 * On 32-bit hosts some trickery is necessary to compress all
179 * the information into 32-bits. When the fSharedFree member is set,
180 * the 30th bit decides whether it's a free page or not.
181 *
182 * Because of the different layout on 32-bit and 64-bit hosts, macros
183 * are used to get and set some of the data.
184 */
185typedef union GMMPAGE
186{
187#if HC_ARCH_BITS == 64
188 /** Unsigned integer view. */
189 uint64_t u;
190
191 /** The common view. */
192 struct GMMPAGECOMMON
193 {
194 uint32_t uStuff1 : 32;
195 uint32_t uStuff2 : 30;
196 /** The page state. */
197 uint32_t u2State : 2;
198 } Common;
199
200 /** The view of a private page. */
201 struct GMMPAGEPRIVATE
202 {
203 /** The guest page frame number. (Max addressable: 2 ^ 44 - 16) */
204 uint32_t pfn;
205 /** The GVM handle. (64K VMs) */
206 uint32_t hGVM : 16;
207 /** Reserved. */
208 uint32_t u16Reserved : 14;
209 /** The page state. */
210 uint32_t u2State : 2;
211 } Private;
212
213 /** The view of a shared page. */
214 struct GMMPAGESHARED
215 {
216 /** The reference count. */
217 uint32_t cRefs;
218 /** Reserved. Checksum or something? Two hGVMs for forking? */
219 uint32_t u30Reserved : 30;
220 /** The page state. */
221 uint32_t u2State : 2;
222 } Shared;
223
224 /** The view of a free page. */
225 struct GMMPAGEFREE
226 {
227 /** The index of the next page in the free list. UINT16_MAX is NIL. */
228 uint16_t iNext;
229 /** Reserved. Checksum or something? */
230 uint16_t u16Reserved0;
231 /** Reserved. Checksum or something? */
232 uint32_t u30Reserved1 : 30;
233 /** The page state. */
234 uint32_t u2State : 2;
235 } Free;
236
237#else /* 32-bit */
238 /** Unsigned integer view. */
239 uint32_t u;
240
241 /** The common view. */
242 struct GMMPAGECOMMON
243 {
244 uint32_t uStuff : 30;
245 /** The page state. */
246 uint32_t u2State : 2;
247 } Common;
248
249 /** The view of a private page. */
250 struct GMMPAGEPRIVATE
251 {
252 /** The guest page frame number. (Max addressable: 2 ^ 36) */
253 uint32_t pfn : 24;
254 /** The GVM handle. (127 VMs) */
255 uint32_t hGVM : 7;
256 /** The top page state bit, MBZ. */
257 uint32_t fZero : 1;
258 } Private;
259
260 /** The view of a shared page. */
261 struct GMMPAGESHARED
262 {
263 /** The reference count. */
264 uint32_t cRefs : 30;
265 /** The page state. */
266 uint32_t u2State : 2;
267 } Shared;
268
269 /** The view of a free page. */
270 struct GMMPAGEFREE
271 {
272 /** The index of the next page in the free list. UINT16_MAX is NIL. */
273 uint32_t iNext : 16;
274 /** Reserved. Checksum or something? */
275 uint32_t u14Reserved : 14;
276 /** The page state. */
277 uint32_t u2State : 2;
278 } Free;
279#endif
280} GMMPAGE;
281AssertCompileSize(GMMPAGE, sizeof(RTHCUINTPTR));
282/** Pointer to a GMMPAGE. */
283typedef GMMPAGE *PGMMPAGE;
284
285
286/** @name The Page States.
287 * @{ */
288/** A private page. */
289#define GMM_PAGE_STATE_PRIVATE 0
290/** A private page - alternative value used on the 32-bit implementation.
291 * This will never be used on 64-bit hosts. */
292#define GMM_PAGE_STATE_PRIVATE_32 1
293/** A shared page. */
294#define GMM_PAGE_STATE_SHARED 2
295/** A free page. */
296#define GMM_PAGE_STATE_FREE 3
297/** @} */
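/* For illustration only: gmmR0AllocatePage() further down turns a free page
   into a private one simply by rewriting the GMMPAGE union (note that
   GMM_PAGE_STATE_PRIVATE is 0, so zeroing the word already sets the state):
        pPage->u = 0;
        pPage->Private.hGVM = hGVM;
        pPage->Private.pfn  = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
 */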
298
299
300/** @def GMM_PAGE_IS_PRIVATE
301 *
302 * @returns true if private, false if not.
303 * @param pPage The GMM page.
304 */
305#if HC_ARCH_BITS == 64
306# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_PRIVATE )
307#else
308# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Private.fZero == 0 )
309#endif
310
311/** @def GMM_PAGE_IS_SHARED
312 *
313 * @returns true if shared, false if not.
314 * @param pPage The GMM page.
315 */
316#define GMM_PAGE_IS_SHARED(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_SHARED )
317
318/** @def GMM_PAGE_IS_FREE
319 *
320 * @returns true if free, false if not.
321 * @param pPage The GMM page.
322 */
323#define GMM_PAGE_IS_FREE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_FREE )
324
325/** @def GMM_PAGE_PFN_LAST
326 * The last valid guest pfn range.
327 * @remark Some of the values outside the range have special meanings,
328 * see GMM_PAGE_PFN_UNSHAREABLE.
329 */
330#if HC_ARCH_BITS == 64
331# define GMM_PAGE_PFN_LAST UINT32_C(0xfffffff0)
332#else
333# define GMM_PAGE_PFN_LAST UINT32_C(0x00fffff0)
334#endif
335AssertCompile(GMM_PAGE_PFN_LAST == (GMM_GCPHYS_LAST >> PAGE_SHIFT));
336
337/** @def GMM_PAGE_PFN_UNSHAREABLE
338 * Indicates that this page isn't used for normal guest memory and thus isn't shareable.
339 */
340#if HC_ARCH_BITS == 64
341# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0xfffffff1)
342#else
343# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0x00fffff1)
344#endif
345AssertCompile(GMM_PAGE_PFN_UNSHAREABLE == (GMM_GCPHYS_UNSHAREABLE >> PAGE_SHIFT));
346
347
348/**
349 * A GMM allocation chunk ring-3 mapping record.
350 *
351 * This should really be associated with a session and not a VM, but
352 * it's simpler to associate it with a VM and clean up when the VM object
353 * is destroyed.
354 */
355typedef struct GMMCHUNKMAP
356{
357 /** The mapping object. */
358 RTR0MEMOBJ MapObj;
359 /** The VM owning the mapping. */
360 PGVM pGVM;
361} GMMCHUNKMAP;
362/** Pointer to a GMM allocation chunk mapping. */
363typedef struct GMMCHUNKMAP *PGMMCHUNKMAP;
364
365typedef enum GMMCHUNKTYPE
366{
367 GMMCHUNKTYPE_INVALID = 0,
368 GMMCHUNKTYPE_NON_CONTINUOUS = 1, /* 4 kb pages */
369 GMMCHUNKTYPE_CONTINUOUS = 2, /* one 2 MB continuous physical range. */
370 GMMCHUNKTYPE_32BIT_HACK = 0x7fffffff
371} GMMCHUNKTYPE;
372
373
374/**
375 * A GMM allocation chunk.
376 */
377typedef struct GMMCHUNK
378{
379 /** The AVL node core.
380 * The Key is the chunk ID. */
381 AVLU32NODECORE Core;
382 /** The memory object.
383 * Either from RTR0MemObjAllocPhysNC or RTR0MemObjLockUser depending on
384 * what the host can dish up. */
385 RTR0MEMOBJ MemObj;
386 /** Pointer to the next chunk in the free list. */
387 PGMMCHUNK pFreeNext;
388 /** Pointer to the previous chunk in the free list. */
389 PGMMCHUNK pFreePrev;
390 /** Pointer to the free set this chunk belongs to. NULL for
391 * chunks with no free pages. */
392 PGMMCHUNKFREESET pSet;
393 /** Pointer to an array of mappings. */
394 PGMMCHUNKMAP paMappings;
395 /** The number of mappings. */
396 uint16_t cMappings;
397 /** The head of the list of free pages. UINT16_MAX is the NIL value. */
398 uint16_t iFreeHead;
399 /** The number of free pages. */
400 uint16_t cFree;
401 /** The GVM handle of the VM that first allocated pages from this chunk, this
402 * is used as a preference when there are several chunks to choose from.
403 * When in bound memory mode this isn't a preference any longer. */
404 uint16_t hGVM;
405 /** The number of private pages. */
406 uint16_t cPrivate;
407 /** The number of shared pages. */
408 uint16_t cShared;
409 /** Chunk type */
410 GMMCHUNKTYPE enmType;
411 /** The pages. */
412 GMMPAGE aPages[GMM_CHUNK_SIZE >> PAGE_SHIFT];
413} GMMCHUNK;
414
415
416/**
417 * An allocation chunk TLB entry.
418 */
419typedef struct GMMCHUNKTLBE
420{
421 /** The chunk id. */
422 uint32_t idChunk;
423 /** Pointer to the chunk. */
424 PGMMCHUNK pChunk;
425} GMMCHUNKTLBE;
426/** Pointer to an allocation chunk TLB entry. */
427typedef GMMCHUNKTLBE *PGMMCHUNKTLBE;
428
429
430/** The number of entries in the allocation chunk TLB. */
431#define GMM_CHUNKTLB_ENTRIES 32
432/** Gets the TLB entry index for the given Chunk ID. */
433#define GMM_CHUNKTLB_IDX(idChunk) ( (idChunk) & (GMM_CHUNKTLB_ENTRIES - 1) )
434
435/**
436 * An allocation chunk TLB.
437 */
438typedef struct GMMCHUNKTLB
439{
440 /** The TLB entries. */
441 GMMCHUNKTLBE aEntries[GMM_CHUNKTLB_ENTRIES];
442} GMMCHUNKTLB;
443/** Pointer to an allocation chunk TLB. */
444typedef GMMCHUNKTLB *PGMMCHUNKTLB;
445
446
447/** The GMMCHUNK::cFree shift count. */
448#define GMM_CHUNK_FREE_SET_SHIFT 4
449/** The GMMCHUNK::cFree mask for use when considering relinking a chunk. */
450#define GMM_CHUNK_FREE_SET_MASK 15
451/** The number of lists in a set. */
452#define GMM_CHUNK_FREE_SET_LISTS (GMM_CHUNK_NUM_PAGES >> GMM_CHUNK_FREE_SET_SHIFT)
453
454/**
455 * A set of free chunks.
456 */
457typedef struct GMMCHUNKFREESET
458{
459 /** The number of free pages in the set. */
460 uint64_t cFreePages;
461 /** Chunks ordered by increasing number of free pages. */
462 PGMMCHUNK apLists[GMM_CHUNK_FREE_SET_LISTS];
463} GMMCHUNKFREESET;
464
465
466/**
467 * The GMM instance data.
468 */
469typedef struct GMM
470{
471 /** Magic / eye catcher. GMM_MAGIC */
472 uint32_t u32Magic;
473 /** The fast mutex protecting the GMM.
474 * More fine grained locking can be implemented later if necessary. */
475 RTSEMFASTMUTEX Mtx;
476 /** The chunk tree. */
477 PAVLU32NODECORE pChunks;
478 /** The chunk TLB. */
479 GMMCHUNKTLB ChunkTLB;
480 /** The private free set. */
481 GMMCHUNKFREESET Private;
482 /** The shared free set. */
483 GMMCHUNKFREESET Shared;
484
485 /** Shared module tree (global). */
486 PAVLGCPTRNODECORE pSharedModuleTree;
487
488 /** The maximum number of pages we're allowed to allocate.
489 * @gcfgm 64-bit GMM/MaxPages Direct.
490 * @gcfgm 32-bit GMM/PctPages Relative to the number of host pages. */
491 uint64_t cMaxPages;
492 /** The number of pages that have been reserved.
493 * The deal is that cReservedPages - cOverCommittedPages <= cMaxPages. */
494 uint64_t cReservedPages;
495 /** The number of pages that we have over-committed in reservations. */
496 uint64_t cOverCommittedPages;
497 /** The number of actually allocated (committed if you like) pages. */
498 uint64_t cAllocatedPages;
499 /** The number of pages that are shared. A subset of cAllocatedPages. */
500 uint64_t cSharedPages;
501 /** The number of pages that are shared that have been left behind by
502 * VMs not doing proper cleanups. */
503 uint64_t cLeftBehindSharedPages;
504 /** The number of allocation chunks.
505 * (The number of pages we've allocated from the host can be derived from this.) */
506 uint32_t cChunks;
507 /** The number of current ballooned pages. */
508 uint64_t cBalloonedPages;
509
510 /** The legacy allocation mode indicator.
511 * This is determined at initialization time. */
512 bool fLegacyAllocationMode;
513 /** The bound memory mode indicator.
514 * When set, the memory will be bound to a specific VM and never
515 * shared. This is always set if fLegacyAllocationMode is set.
516 * (Also determined at initialization time.) */
517 bool fBoundMemoryMode;
518 /** The number of registered VMs. */
519 uint16_t cRegisteredVMs;
520
521 /** The previous allocated Chunk ID.
522 * Used as a hint to avoid scanning the whole bitmap. */
523 uint32_t idChunkPrev;
524 /** Chunk ID allocation bitmap.
525 * Bits of allocated IDs are set, free ones are clear.
526 * The NIL id (0) is marked allocated. */
527 uint32_t bmChunkId[(GMM_CHUNKID_LAST + 1 + 31) / 32];
528} GMM;
529/** Pointer to the GMM instance. */
530typedef GMM *PGMM;
531
532/** The value of GMM::u32Magic (Katsuhiro Otomo). */
533#define GMM_MAGIC 0x19540414
534
535
536/*******************************************************************************
537* Global Variables *
538*******************************************************************************/
539/** Pointer to the GMM instance data. */
540static PGMM g_pGMM = NULL;
541
542/** Macro for obtaining and validating the g_pGMM pointer.
543 * On failure it will return from the invoking function with the specified return value.
544 *
545 * @param pGMM The name of the pGMM variable.
546 * @param rc The return value on failure. Use VERR_INTERNAL_ERROR for
547 * VBox status codes.
548 */
549#define GMM_GET_VALID_INSTANCE(pGMM, rc) \
550 do { \
551 (pGMM) = g_pGMM; \
552 AssertPtrReturn((pGMM), (rc)); \
553 AssertMsgReturn((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic), (rc)); \
554 } while (0)
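/* Typical usage, as seen in GMMR0InitialReservation() and friends below:
        PGMM pGMM;
        GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
 */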
555
556/** Macro for obtaining and validating the g_pGMM pointer, void function variant.
557 * On failure it will return from the invoking function.
558 *
559 * @param pGMM The name of the pGMM variable.
560 */
561#define GMM_GET_VALID_INSTANCE_VOID(pGMM) \
562 do { \
563 (pGMM) = g_pGMM; \
564 AssertPtrReturnVoid((pGMM)); \
565 AssertMsgReturnVoid((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic)); \
566 } while (0)
567
568
569/** @def GMM_CHECK_SANITY_UPON_ENTERING
570 * Checks the sanity of the GMM instance data before making changes.
571 *
572 * This macro is a stub by default and must be enabled manually in the code.
573 *
574 * @returns true if sane, false if not.
575 * @param pGMM The name of the pGMM variable.
576 */
577#if defined(VBOX_STRICT) && 0
578# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
579#else
580# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (true)
581#endif
582
583/** @def GMM_CHECK_SANITY_UPON_LEAVING
584 * Checks the sanity of the GMM instance data after making changes.
585 *
586 * This macro is a stub by default and must be enabled manually in the code.
587 *
588 * @returns true if sane, false if not.
589 * @param pGMM The name of the pGMM variable.
590 */
591#if defined(VBOX_STRICT) && 0
592# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
593#else
594# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (true)
595#endif
596
597/** @def GMM_CHECK_SANITY_IN_LOOPS
598 * Checks the sanity of the GMM instance in the allocation loops.
599 *
600 * This macro is a stub by default and must be enabled manually in the code.
601 *
602 * @returns true if sane, false if not.
603 * @param pGMM The name of the pGMM variable.
604 */
605#if defined(VBOX_STRICT) && 0
606# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
607#else
608# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (true)
609#endif
610
611
612/*******************************************************************************
613* Internal Functions *
614*******************************************************************************/
615static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM);
616static DECLCALLBACK(int) gmmR0CleanupVMScanChunk(PAVLU32NODECORE pNode, void *pvGMM);
617/*static*/ DECLCALLBACK(int) gmmR0CleanupVMDestroyChunk(PAVLU32NODECORE pNode, void *pvGVM);
618DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet);
619DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk);
620static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo);
621static void gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
622static void gmmR0FreeSharedPage(PGMM pGMM, uint32_t idPage, PGMMPAGE pPage);
623static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
624
625
626
627/**
628 * Initializes the GMM component.
629 *
630 * This is called when the VMMR0.r0 module is loaded and protected by the
631 * loader semaphore.
632 *
633 * @returns VBox status code.
634 */
635GMMR0DECL(int) GMMR0Init(void)
636{
637 LogFlow(("GMMInit:\n"));
638
639 /*
640 * Allocate the instance data and the lock(s).
641 */
642 PGMM pGMM = (PGMM)RTMemAllocZ(sizeof(*pGMM));
643 if (!pGMM)
644 return VERR_NO_MEMORY;
645 pGMM->u32Magic = GMM_MAGIC;
646 for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
647 pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
648 ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
649
650 int rc = RTSemFastMutexCreate(&pGMM->Mtx);
651 if (RT_SUCCESS(rc))
652 {
653 /*
654 * Check and see if RTR0MemObjAllocPhysNC works.
655 */
656#if 0 /* later, see #3170. */
657 RTR0MEMOBJ MemObj;
658 rc = RTR0MemObjAllocPhysNC(&MemObj, _64K, NIL_RTHCPHYS);
659 if (RT_SUCCESS(rc))
660 {
661 rc = RTR0MemObjFree(MemObj, true);
662 AssertRC(rc);
663 }
664 else if (rc == VERR_NOT_SUPPORTED)
665 pGMM->fLegacyAllocationMode = pGMM->fBoundMemoryMode = true;
666 else
667 SUPR0Printf("GMMR0Init: RTR0MemObjAllocPhysNC(,64K,Any) -> %d!\n", rc);
668#else
669# if defined(RT_OS_WINDOWS) || defined(RT_OS_SOLARIS) || defined(RT_OS_LINUX) || defined(RT_OS_FREEBSD)
670 pGMM->fLegacyAllocationMode = false;
671# if ARCH_BITS == 32
672 /* Don't reuse possibly partial chunks because of the virtual address space limitation. */
673 pGMM->fBoundMemoryMode = true;
674# else
675 pGMM->fBoundMemoryMode = false;
676# endif
677# else
678 pGMM->fLegacyAllocationMode = true;
679 pGMM->fBoundMemoryMode = true;
680# endif
681#endif
682
683 /*
684 * Query system page count and guess a reasonable cMaxPages value.
685 */
686 pGMM->cMaxPages = UINT32_MAX; /** @todo IPRT function for query ram size and such. */
687
688 g_pGMM = pGMM;
689 LogFlow(("GMMInit: pGMM=%p fLegacyAllocationMode=%RTbool fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fLegacyAllocationMode, pGMM->fBoundMemoryMode));
690 return VINF_SUCCESS;
691 }
692
693 RTMemFree(pGMM);
694 SUPR0Printf("GMMR0Init: failed! rc=%d\n", rc);
695 return rc;
696}
697
698
699/**
700 * Terminates the GMM component.
701 */
702GMMR0DECL(void) GMMR0Term(void)
703{
704 LogFlow(("GMMTerm:\n"));
705
706 /*
707 * Take care / be paranoid...
708 */
709 PGMM pGMM = g_pGMM;
710 if (!VALID_PTR(pGMM))
711 return;
712 if (pGMM->u32Magic != GMM_MAGIC)
713 {
714 SUPR0Printf("GMMR0Term: u32Magic=%#x\n", pGMM->u32Magic);
715 return;
716 }
717
718 /*
719 * Undo what init did and free all the resources we've acquired.
720 */
721 /* Destroy the fundamentals. */
722 g_pGMM = NULL;
723 pGMM->u32Magic++;
724 RTSemFastMutexDestroy(pGMM->Mtx);
725 pGMM->Mtx = NIL_RTSEMFASTMUTEX;
726
727 /* free any chunks still hanging around. */
728 RTAvlU32Destroy(&pGMM->pChunks, gmmR0TermDestroyChunk, pGMM);
729
730 /* finally the instance data itself. */
731 RTMemFree(pGMM);
732 LogFlow(("GMMTerm: done\n"));
733}
734
735
736/**
737 * RTAvlU32Destroy callback.
738 *
739 * @returns 0
740 * @param pNode The node to destroy.
741 * @param pvGMM The GMM handle.
742 */
743static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM)
744{
745 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
746
747 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
748 SUPR0Printf("GMMR0Term: %p/%#x: cFree=%d cPrivate=%d cShared=%d cMappings=%d\n", pChunk,
749 pChunk->Core.Key, pChunk->cFree, pChunk->cPrivate, pChunk->cShared, pChunk->cMappings);
750
751 int rc = RTR0MemObjFree(pChunk->MemObj, true /* fFreeMappings */);
752 if (RT_FAILURE(rc))
753 {
754 SUPR0Printf("GMMR0Term: %p/%#x: RTRMemObjFree(%p,true) -> %d (cMappings=%d)\n", pChunk,
755 pChunk->Core.Key, pChunk->MemObj, rc, pChunk->cMappings);
756 AssertRC(rc);
757 }
758 pChunk->MemObj = NIL_RTR0MEMOBJ;
759
760 RTMemFree(pChunk->paMappings);
761 pChunk->paMappings = NULL;
762
763 RTMemFree(pChunk);
764 NOREF(pvGMM);
765 return 0;
766}
767
768
769/**
770 * Initializes the per-VM data for the GMM.
771 *
772 * This is called from within the GVMM lock (from GVMMR0CreateVM)
773 * and should only initialize the data members so GMMR0CleanupVM
774 * can deal with them. We reserve no memory or anything here,
775 * that's done later in GMMR0InitVM.
776 *
777 * @param pGVM Pointer to the Global VM structure.
778 */
779GMMR0DECL(void) GMMR0InitPerVMData(PGVM pGVM)
780{
781 AssertCompile(RT_SIZEOFMEMB(GVM,gmm.s) <= RT_SIZEOFMEMB(GVM,gmm.padding));
782
783 pGVM->gmm.s.enmPolicy = GMMOCPOLICY_INVALID;
784 pGVM->gmm.s.enmPriority = GMMPRIORITY_INVALID;
785 pGVM->gmm.s.fMayAllocate = false;
786}
787
788
789/**
790 * Cleans up when a VM is terminating.
791 *
792 * @param pGVM Pointer to the Global VM structure.
793 */
794GMMR0DECL(void) GMMR0CleanupVM(PGVM pGVM)
795{
796 LogFlow(("GMMR0CleanupVM: pGVM=%p:{.pVM=%p, .hSelf=%#x}\n", pGVM, pGVM->pVM, pGVM->hSelf));
797
798 PGMM pGMM;
799 GMM_GET_VALID_INSTANCE_VOID(pGMM);
800
801 int rc = RTSemFastMutexRequest(pGMM->Mtx);
802 AssertRC(rc);
803 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
804
805 /*
806 * The policy is 'INVALID' until the initial reservation
807 * request has been serviced.
808 */
809 if ( pGVM->gmm.s.enmPolicy > GMMOCPOLICY_INVALID
810 && pGVM->gmm.s.enmPolicy < GMMOCPOLICY_END)
811 {
812 /*
813 * If it's the last VM around, we can skip walking all the chunks looking
814 * for the pages owned by this VM and instead flush the whole shebang.
815 *
816 * This takes care of the eventuality that a VM has left shared page
817 * references behind (shouldn't happen of course, but you never know).
818 */
819 Assert(pGMM->cRegisteredVMs);
820 pGMM->cRegisteredVMs--;
821#if 0 /* disabled so it won't hide bugs. */
822 if (!pGMM->cRegisteredVMs)
823 {
824 RTAvlU32Destroy(&pGMM->pChunks, gmmR0CleanupVMDestroyChunk, pGMM);
825
826 for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
827 {
828 pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
829 pGMM->ChunkTLB.aEntries[i].pChunk = NULL;
830 }
831
832 memset(&pGMM->Private, 0, sizeof(pGMM->Private));
833 memset(&pGMM->Shared, 0, sizeof(pGMM->Shared));
834
835 memset(&pGMM->bmChunkId[0], 0, sizeof(pGMM->bmChunkId));
836 ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
837
838 pGMM->cReservedPages = 0;
839 pGMM->cOverCommittedPages = 0;
840 pGMM->cAllocatedPages = 0;
841 pGMM->cSharedPages = 0;
842 pGMM->cLeftBehindSharedPages = 0;
843 pGMM->cChunks = 0;
844 pGMM->cBalloonedPages = 0;
845 }
846 else
847#endif
848 {
849 /*
850 * Walk the entire pool looking for pages that belong to this VM
851 * and left over mappings. (This'll only catch private pages, shared
852 * pages will be 'left behind'.)
853 */
854 uint64_t cPrivatePages = pGVM->gmm.s.cPrivatePages; /* save */
855 RTAvlU32DoWithAll(&pGMM->pChunks, true /* fFromLeft */, gmmR0CleanupVMScanChunk, pGVM);
856 if (pGVM->gmm.s.cPrivatePages)
857 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x has %#x private pages that cannot be found!\n", pGVM->hSelf, pGVM->gmm.s.cPrivatePages);
858 pGMM->cAllocatedPages -= cPrivatePages;
859
860 /* free empty chunks. */
861 if (cPrivatePages)
862 {
863 PGMMCHUNK pCur = pGMM->Private.apLists[RT_ELEMENTS(pGMM->Private.apLists) - 1];
864 while (pCur)
865 {
866 PGMMCHUNK pNext = pCur->pFreeNext;
867 if ( pCur->cFree == GMM_CHUNK_NUM_PAGES
868 && ( !pGMM->fBoundMemoryMode
869 || pCur->hGVM == pGVM->hSelf))
870 gmmR0FreeChunk(pGMM, pGVM, pCur);
871 pCur = pNext;
872 }
873 }
874
875 /* account for shared pages that weren't freed. */
876 if (pGVM->gmm.s.cSharedPages)
877 {
878 Assert(pGMM->cSharedPages >= pGVM->gmm.s.cSharedPages);
879 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x left %#x shared pages behind!\n", pGVM->hSelf, pGVM->gmm.s.cSharedPages);
880 pGMM->cLeftBehindSharedPages += pGVM->gmm.s.cSharedPages;
881 }
882
883 /*
884 * Update the over-commitment management statistics.
885 */
886 pGMM->cReservedPages -= pGVM->gmm.s.Reserved.cBasePages
887 + pGVM->gmm.s.Reserved.cFixedPages
888 + pGVM->gmm.s.Reserved.cShadowPages;
889 switch (pGVM->gmm.s.enmPolicy)
890 {
891 case GMMOCPOLICY_NO_OC:
892 break;
893 default:
894 /** @todo Update GMM->cOverCommittedPages */
895 break;
896 }
897 }
898 }
899
900 /* zap the GVM data. */
901 pGVM->gmm.s.enmPolicy = GMMOCPOLICY_INVALID;
902 pGVM->gmm.s.enmPriority = GMMPRIORITY_INVALID;
903 pGVM->gmm.s.fMayAllocate = false;
904
905 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
906 RTSemFastMutexRelease(pGMM->Mtx);
907
908 LogFlow(("GMMR0CleanupVM: returns\n"));
909}
910
911
912/**
913 * RTAvlU32DoWithAll callback.
914 *
915 * @returns 0
916 * @param pNode The node to search.
917 * @param pvGVM Pointer to the shared VM structure.
918 */
919static DECLCALLBACK(int) gmmR0CleanupVMScanChunk(PAVLU32NODECORE pNode, void *pvGVM)
920{
921 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
922 PGVM pGVM = (PGVM)pvGVM;
923
924 /*
925 * Look for pages belonging to the VM.
926 * (Perform some internal checks while we're scanning.)
927 */
928#ifndef VBOX_STRICT
929 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
930#endif
931 {
932 unsigned cPrivate = 0;
933 unsigned cShared = 0;
934 unsigned cFree = 0;
935
936 gmmR0UnlinkChunk(pChunk); /* avoiding cFreePages updates. */
937
938 uint16_t hGVM = pGVM->hSelf;
939 unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
940 while (iPage-- > 0)
941 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
942 {
943 if (pChunk->aPages[iPage].Private.hGVM == hGVM)
944 {
945 /*
946 * Free the page.
947 *
948 * The reason for not using gmmR0FreePrivatePage here is that we
949 * must *not* cause the chunk to be freed from under us - we're in
950 * an AVL tree walk here.
951 */
952 pChunk->aPages[iPage].u = 0;
953 pChunk->aPages[iPage].Free.iNext = pChunk->iFreeHead;
954 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
955 pChunk->iFreeHead = iPage;
956 pChunk->cPrivate--;
957 pChunk->cFree++;
958 pGVM->gmm.s.cPrivatePages--;
959 cFree++;
960 }
961 else
962 cPrivate++;
963 }
964 else if (GMM_PAGE_IS_FREE(&pChunk->aPages[iPage]))
965 cFree++;
966 else
967 cShared++;
968
969 gmmR0LinkChunk(pChunk, pChunk->cShared ? &g_pGMM->Shared : &g_pGMM->Private);
970
971 /*
972 * Did it add up?
973 */
974 if (RT_UNLIKELY( pChunk->cFree != cFree
975 || pChunk->cPrivate != cPrivate
976 || pChunk->cShared != cShared))
977 {
978 SUPR0Printf("gmmR0CleanupVMScanChunk: Chunk %p/%#x has bogus stats - free=%d/%d private=%d/%d shared=%d/%d\n",
979 pChunk->cFree, cFree, pChunk->cPrivate, cPrivate, pChunk->cShared, cShared);
980 pChunk->cFree = cFree;
981 pChunk->cPrivate = cPrivate;
982 pChunk->cShared = cShared;
983 }
984 }
985
986 /*
987 * Look for the mapping belonging to the terminating VM.
988 */
989 for (unsigned i = 0; i < pChunk->cMappings; i++)
990 if (pChunk->paMappings[i].pGVM == pGVM)
991 {
992 RTR0MEMOBJ MemObj = pChunk->paMappings[i].MapObj;
993
994 pChunk->cMappings--;
995 if (i < pChunk->cMappings)
996 pChunk->paMappings[i] = pChunk->paMappings[pChunk->cMappings];
997 pChunk->paMappings[pChunk->cMappings].pGVM = NULL;
998 pChunk->paMappings[pChunk->cMappings].MapObj = NIL_RTR0MEMOBJ;
999
1000 int rc = RTR0MemObjFree(MemObj, false /* fFreeMappings (NA) */);
1001 if (RT_FAILURE(rc))
1002 {
1003 SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: mapping #%x: RTRMemObjFree(%p,false) -> %d \n",
1004 pChunk, pChunk->Core.Key, i, MemObj, rc);
1005 AssertRC(rc);
1006 }
1007 break;
1008 }
1009
1010 /*
1011 * If not in bound memory mode, we should reset the hGVM field
1012 * if it has our handle in it.
1013 */
1014 if (pChunk->hGVM == pGVM->hSelf)
1015 {
1016 if (!g_pGMM->fBoundMemoryMode)
1017 pChunk->hGVM = NIL_GVM_HANDLE;
1018 else if (pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1019 {
1020 SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: cFree=%#x - it should be 0 in bound mode!\n",
1021 pChunk, pChunk->Core.Key, pChunk->cFree);
1022 AssertMsgFailed(("%p/%#x: cFree=%#x - it should be 0 in bound mode!\n", pChunk, pChunk->Core.Key, pChunk->cFree));
1023
1024 gmmR0UnlinkChunk(pChunk);
1025 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1026 gmmR0LinkChunk(pChunk, pChunk->cShared ? &g_pGMM->Shared : &g_pGMM->Private);
1027 }
1028 }
1029
1030 return 0;
1031}
1032
1033
1034/**
1035 * RTAvlU32Destroy callback for GMMR0CleanupVM.
1036 *
1037 * @returns 0
1038 * @param pNode The node (allocation chunk) to destroy.
1039 * @param pvGVM Pointer to the shared VM structure.
1040 */
1041/*static*/ DECLCALLBACK(int) gmmR0CleanupVMDestroyChunk(PAVLU32NODECORE pNode, void *pvGVM)
1042{
1043 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
1044 PGVM pGVM = (PGVM)pvGVM;
1045
1046 for (unsigned i = 0; i < pChunk->cMappings; i++)
1047 {
1048 if (pChunk->paMappings[i].pGVM != pGVM)
1049 SUPR0Printf("gmmR0CleanupVMDestroyChunk: %p/%#x: mapping #%x: pGVM=%p exepcted %p\n", pChunk,
1050 pChunk->Core.Key, i, pChunk->paMappings[i].pGVM, pGVM);
1051 int rc = RTR0MemObjFree(pChunk->paMappings[i].MapObj, false /* fFreeMappings (NA) */);
1052 if (RT_FAILURE(rc))
1053 {
1054 SUPR0Printf("gmmR0CleanupVMDestroyChunk: %p/%#x: mapping #%x: RTRMemObjFree(%p,false) -> %d \n", pChunk,
1055 pChunk->Core.Key, i, pChunk->paMappings[i].MapObj, rc);
1056 AssertRC(rc);
1057 }
1058 }
1059
1060 int rc = RTR0MemObjFree(pChunk->MemObj, true /* fFreeMappings */);
1061 if (RT_FAILURE(rc))
1062 {
1063 SUPR0Printf("gmmR0CleanupVMDestroyChunk: %p/%#x: RTRMemObjFree(%p,true) -> %d (cMappings=%d)\n", pChunk,
1064 pChunk->Core.Key, pChunk->MemObj, rc, pChunk->cMappings);
1065 AssertRC(rc);
1066 }
1067 pChunk->MemObj = NIL_RTR0MEMOBJ;
1068
1069 RTMemFree(pChunk->paMappings);
1070 pChunk->paMappings = NULL;
1071
1072 RTMemFree(pChunk);
1073 return 0;
1074}
1075
1076
1077/**
1078 * The initial resource reservations.
1079 *
1080 * This will make memory reservations according to policy and priority. If there aren't
1081 * sufficient resources available to sustain the VM this function will fail and all
1082 * future allocation requests will fail as well.
1083 *
1084 * These are just the initial reservations made very very early during the VM creation
1085 * process and will be adjusted later in the GMMR0UpdateReservation call after the
1086 * ring-3 init has completed.
1087 *
1088 * @returns VBox status code.
1089 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1090 * @retval VERR_GMM_
1091 *
1092 * @param pVM Pointer to the shared VM structure.
1093 * @param idCpu VCPU id
1094 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1095 * This does not include MMIO2 and similar.
1096 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1097 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1098 * hyper heap, MMIO2 and similar.
1099 * @param enmPolicy The OC policy to use on this VM.
1100 * @param enmPriority The priority in an out-of-memory situation.
1101 *
1102 * @thread The creator thread / EMT.
1103 */
1104GMMR0DECL(int) GMMR0InitialReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages,
1105 GMMOCPOLICY enmPolicy, GMMPRIORITY enmPriority)
1106{
1107 LogFlow(("GMMR0InitialReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x enmPolicy=%d enmPriority=%d\n",
1108 pVM, cBasePages, cShadowPages, cFixedPages, enmPolicy, enmPriority));
1109
1110 /*
1111 * Validate, get basics and take the semaphore.
1112 */
1113 PGMM pGMM;
1114 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
1115 PGVM pGVM;
1116 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1117 if (RT_FAILURE(rc))
1118 return rc;
1119
1120 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1121 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1122 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1123 AssertReturn(enmPolicy > GMMOCPOLICY_INVALID && enmPolicy < GMMOCPOLICY_END, VERR_INVALID_PARAMETER);
1124 AssertReturn(enmPriority > GMMPRIORITY_INVALID && enmPriority < GMMPRIORITY_END, VERR_INVALID_PARAMETER);
1125
1126 rc = RTSemFastMutexRequest(pGMM->Mtx);
1127 AssertRC(rc);
1128 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1129 {
1130 if ( !pGVM->gmm.s.Reserved.cBasePages
1131 && !pGVM->gmm.s.Reserved.cFixedPages
1132 && !pGVM->gmm.s.Reserved.cShadowPages)
1133 {
1134 /*
1135 * Check if we can accommodate this.
1136 */
1137 /* ... later ... */
1138 if (RT_SUCCESS(rc))
1139 {
1140 /*
1141 * Update the records.
1142 */
1143 pGVM->gmm.s.Reserved.cBasePages = cBasePages;
1144 pGVM->gmm.s.Reserved.cFixedPages = cFixedPages;
1145 pGVM->gmm.s.Reserved.cShadowPages = cShadowPages;
1146 pGVM->gmm.s.enmPolicy = enmPolicy;
1147 pGVM->gmm.s.enmPriority = enmPriority;
1148 pGVM->gmm.s.fMayAllocate = true;
1149
1150 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1151 pGMM->cRegisteredVMs++;
1152 }
1153 }
1154 else
1155 rc = VERR_WRONG_ORDER;
1156 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1157 }
1158 else
1159 rc = VERR_INTERNAL_ERROR_5;
1160 RTSemFastMutexRelease(pGMM->Mtx);
1161 LogFlow(("GMMR0InitialReservation: returns %Rrc\n", rc));
1162 return rc;
1163}
1164
1165
1166/**
1167 * VMMR0 request wrapper for GMMR0InitialReservation.
1168 *
1169 * @returns see GMMR0InitialReservation.
1170 * @param pVM Pointer to the shared VM structure.
1171 * @param idCpu VCPU id
1172 * @param pReq The request packet.
1173 */
1174GMMR0DECL(int) GMMR0InitialReservationReq(PVM pVM, VMCPUID idCpu, PGMMINITIALRESERVATIONREQ pReq)
1175{
1176 /*
1177 * Validate input and pass it on.
1178 */
1179 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1180 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1181 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1182
1183 return GMMR0InitialReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages, pReq->enmPolicy, pReq->enmPriority);
1184}
1185
1186
1187/**
1188 * This updates the memory reservation with the additional MMIO2 and ROM pages.
1189 *
1190 * @returns VBox status code.
1191 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1192 *
1193 * @param pVM Pointer to the shared VM structure.
1194 * @param idCpu VCPU id
1195 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1196 * This does not include MMIO2 and similar.
1197 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1198 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1199 * hyper heap, MMIO2 and similar.
1200 *
1201 * @thread EMT.
1202 */
1203GMMR0DECL(int) GMMR0UpdateReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages)
1204{
1205 LogFlow(("GMMR0UpdateReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x\n",
1206 pVM, cBasePages, cShadowPages, cFixedPages));
1207
1208 /*
1209 * Validate, get basics and take the semaphore.
1210 */
1211 PGMM pGMM;
1212 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
1213 PGVM pGVM;
1214 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1215 if (RT_FAILURE(rc))
1216 return rc;
1217
1218 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1219 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1220 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1221
1222 rc = RTSemFastMutexRequest(pGMM->Mtx);
1223 AssertRC(rc);
1224 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1225 {
1226 if ( pGVM->gmm.s.Reserved.cBasePages
1227 && pGVM->gmm.s.Reserved.cFixedPages
1228 && pGVM->gmm.s.Reserved.cShadowPages)
1229 {
1230 /*
1231 * Check if we can accommodate this.
1232 */
1233 /* ... later ... */
1234 if (RT_SUCCESS(rc))
1235 {
1236 /*
1237 * Update the records.
1238 */
1239 pGMM->cReservedPages -= pGVM->gmm.s.Reserved.cBasePages
1240 + pGVM->gmm.s.Reserved.cFixedPages
1241 + pGVM->gmm.s.Reserved.cShadowPages;
1242 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1243
1244 pGVM->gmm.s.Reserved.cBasePages = cBasePages;
1245 pGVM->gmm.s.Reserved.cFixedPages = cFixedPages;
1246 pGVM->gmm.s.Reserved.cShadowPages = cShadowPages;
1247 }
1248 }
1249 else
1250 rc = VERR_WRONG_ORDER;
1251 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1252 }
1253 else
1254 rc = VERR_INTERNAL_ERROR_5;
1255 RTSemFastMutexRelease(pGMM->Mtx);
1256 LogFlow(("GMMR0UpdateReservation: returns %Rrc\n", rc));
1257 return rc;
1258}
1259
1260
1261/**
1262 * VMMR0 request wrapper for GMMR0UpdateReservation.
1263 *
1264 * @returns see GMMR0UpdateReservation.
1265 * @param pVM Pointer to the shared VM structure.
1266 * @param idCpu VCPU id
1267 * @param pReq The request packet.
1268 */
1269GMMR0DECL(int) GMMR0UpdateReservationReq(PVM pVM, VMCPUID idCpu, PGMMUPDATERESERVATIONREQ pReq)
1270{
1271 /*
1272 * Validate input and pass it on.
1273 */
1274 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1275 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1276 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1277
1278 return GMMR0UpdateReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages);
1279}
1280
1281
1282/**
1283 * Performs sanity checks on a free set.
1284 *
1285 * @returns Error count.
1286 *
1287 * @param pGMM Pointer to the GMM instance.
1288 * @param pSet Pointer to the set.
1289 * @param pszSetName The set name.
1290 * @param pszFunction The function from which it was called.
1291 * @param uLineNo The line number.
1292 */
1293static uint32_t gmmR0SanityCheckSet(PGMM pGMM, PGMMCHUNKFREESET pSet, const char *pszSetName,
1294 const char *pszFunction, unsigned uLineNo)
1295{
1296 uint32_t cErrors = 0;
1297
1298 /*
1299 * Count the free pages in all the chunks and match it against pSet->cFreePages.
1300 */
1301 uint32_t cPages = 0;
1302 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1303 {
1304 for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1305 {
1306 /** @todo check that the chunk is hashed into the right set. */
1307 cPages += pCur->cFree;
1308 }
1309 }
1310 if (RT_UNLIKELY(cPages != pSet->cFreePages))
1311 {
1312 SUPR0Printf("GMM insanity: found %#x pages in the %s set, expected %#x. (%s, line %u)\n",
1313 cPages, pszSetName, pSet->cFreePages, pszFunction, uLineNo);
1314 cErrors++;
1315 }
1316
1317 return cErrors;
1318}
1319
1320
1321/**
1322 * Performs some sanity checks on the GMM while owning the lock.
1323 *
1324 * @returns Error count.
1325 *
1326 * @param pGMM Pointer to the GMM instance.
1327 * @param pszFunction The function from which it is called.
1328 * @param uLineNo The line number.
1329 */
1330static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo)
1331{
1332 uint32_t cErrors = 0;
1333
1334 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Private, "private", pszFunction, uLineNo);
1335 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Shared, "shared", pszFunction, uLineNo);
1336 /** @todo add more sanity checks. */
1337
1338 return cErrors;
1339}
1340
1341
1342/**
1343 * Looks up a chunk in the tree and fills in the TLB entry for it.
1344 *
1345 * This is not expected to fail and will bitch if it does.
1346 *
1347 * @returns Pointer to the allocation chunk, NULL if not found.
1348 * @param pGMM Pointer to the GMM instance.
1349 * @param idChunk The ID of the chunk to find.
1350 * @param pTlbe Pointer to the TLB entry.
1351 */
1352static PGMMCHUNK gmmR0GetChunkSlow(PGMM pGMM, uint32_t idChunk, PGMMCHUNKTLBE pTlbe)
1353{
1354 PGMMCHUNK pChunk = (PGMMCHUNK)RTAvlU32Get(&pGMM->pChunks, idChunk);
1355 AssertMsgReturn(pChunk, ("Chunk %#x not found!\n", idChunk), NULL);
1356 pTlbe->idChunk = idChunk;
1357 pTlbe->pChunk = pChunk;
1358 return pChunk;
1359}
1360
1361
1362/**
1363 * Finds an allocation chunk.
1364 *
1365 * This is not expected to fail and will bitch if it does.
1366 *
1367 * @returns Pointer to the allocation chunk, NULL if not found.
1368 * @param pGMM Pointer to the GMM instance.
1369 * @param idChunk The ID of the chunk to find.
1370 */
1371DECLINLINE(PGMMCHUNK) gmmR0GetChunk(PGMM pGMM, uint32_t idChunk)
1372{
1373 /*
1374 * Do a TLB lookup, branch if not in the TLB.
1375 */
1376 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
1377 if ( pTlbe->idChunk != idChunk
1378 || !pTlbe->pChunk)
1379 return gmmR0GetChunkSlow(pGMM, idChunk, pTlbe);
1380 return pTlbe->pChunk;
1381}
1382
1383
1384/**
1385 * Finds a page.
1386 *
1387 * This is not expected to fail and will bitch if it does.
1388 *
1389 * @returns Pointer to the page, NULL if not found.
1390 * @param pGMM Pointer to the GMM instance.
1391 * @param idPage The ID of the page to find.
1392 */
1393DECLINLINE(PGMMPAGE) gmmR0GetPage(PGMM pGMM, uint32_t idPage)
1394{
1395 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1396 if (RT_LIKELY(pChunk))
1397 return &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
1398 return NULL;
1399}
1400
1401
1402/**
1403 * Unlinks the chunk from the free list it's currently on (if any).
1404 *
1405 * @param pChunk The allocation chunk.
1406 */
1407DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk)
1408{
1409 PGMMCHUNKFREESET pSet = pChunk->pSet;
1410 if (RT_LIKELY(pSet))
1411 {
1412 pSet->cFreePages -= pChunk->cFree;
1413
1414 PGMMCHUNK pPrev = pChunk->pFreePrev;
1415 PGMMCHUNK pNext = pChunk->pFreeNext;
1416 if (pPrev)
1417 pPrev->pFreeNext = pNext;
1418 else
1419 pSet->apLists[(pChunk->cFree - 1) >> GMM_CHUNK_FREE_SET_SHIFT] = pNext;
1420 if (pNext)
1421 pNext->pFreePrev = pPrev;
1422
1423 pChunk->pSet = NULL;
1424 pChunk->pFreeNext = NULL;
1425 pChunk->pFreePrev = NULL;
1426 }
1427 else
1428 {
1429 Assert(!pChunk->pFreeNext);
1430 Assert(!pChunk->pFreePrev);
1431 Assert(!pChunk->cFree);
1432 }
1433}
1434
1435
1436/**
1437 * Links the chunk onto the appropriate free list in the specified free set.
1438 *
1439 * If the chunk has no free entries, it's not linked into any list.
1440 *
1441 * @param pChunk The allocation chunk.
1442 * @param pSet The free set.
1443 */
1444DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet)
1445{
1446 Assert(!pChunk->pSet);
1447 Assert(!pChunk->pFreeNext);
1448 Assert(!pChunk->pFreePrev);
1449
1450 if (pChunk->cFree > 0)
1451 {
1452 pChunk->pSet = pSet;
1453 pChunk->pFreePrev = NULL;
1454 unsigned iList = (pChunk->cFree - 1) >> GMM_CHUNK_FREE_SET_SHIFT;
1455 pChunk->pFreeNext = pSet->apLists[iList];
1456 if (pChunk->pFreeNext)
1457 pChunk->pFreeNext->pFreePrev = pChunk;
1458 pSet->apLists[iList] = pChunk;
1459
1460 pSet->cFreePages += pChunk->cFree;
1461 }
1462}
1463
1464
1465/**
1466 * Frees a Chunk ID.
1467 *
1468 * @param pGMM Pointer to the GMM instance.
1469 * @param idChunk The Chunk ID to free.
1470 */
1471static void gmmR0FreeChunkId(PGMM pGMM, uint32_t idChunk)
1472{
1473 AssertReturnVoid(idChunk != NIL_GMM_CHUNKID);
1474 AssertMsg(ASMBitTest(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk));
1475 ASMAtomicBitClear(&pGMM->bmChunkId[0], idChunk);
1476}
1477
1478
1479/**
1480 * Allocates a new Chunk ID.
1481 *
1482 * @returns The Chunk ID.
1483 * @param pGMM Pointer to the GMM instance.
1484 */
1485static uint32_t gmmR0AllocateChunkId(PGMM pGMM)
1486{
1487 AssertCompile(!((GMM_CHUNKID_LAST + 1) & 31)); /* must be a multiple of 32 */
1488 AssertCompile(NIL_GMM_CHUNKID == 0);
1489
1490 /*
1491 * Try the next sequential one.
1492 */
1493 int32_t idChunk = ++pGMM->idChunkPrev;
1494#if 0 /* test the fallback first */
1495 if ( idChunk <= GMM_CHUNKID_LAST
1496 && idChunk > NIL_GMM_CHUNKID
1497 && !ASMAtomicBitTestAndSet(&pVMM->bmChunkId[0], idChunk))
1498 return idChunk;
1499#endif
1500
1501 /*
1502 * Scan sequentially from the last one.
1503 */
1504 if ( (uint32_t)idChunk < GMM_CHUNKID_LAST
1505 && idChunk > NIL_GMM_CHUNKID)
1506 {
1507 idChunk = ASMBitNextClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1, idChunk);
1508 if (idChunk > NIL_GMM_CHUNKID)
1509 {
1510 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1511 return pGMM->idChunkPrev = idChunk;
1512 }
1513 }
1514
1515 /*
1516 * Ok, scan from the start.
1517 * We're not racing anyone, so there is no need to expect failures or have restart loops.
1518 */
1519 idChunk = ASMBitFirstClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1);
1520 AssertMsgReturn(idChunk > NIL_GMM_CHUNKID, ("%#x\n", idChunk), NIL_GVM_HANDLE);
1521 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1522
1523 return pGMM->idChunkPrev = idChunk;
1524}
1525
1526
1527/**
1528 * Registers a new chunk of memory.
1529 *
1530 * This is called by both gmmR0AllocateOneChunk and GMMR0SeedChunk. The caller
1531 * must own the global lock.
1532 *
1533 * @returns VBox status code.
1534 * @param pGMM Pointer to the GMM instance.
1535 * @param pSet Pointer to the set.
1536 * @param MemObj The memory object for the chunk.
1537 * @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
1538 * affinity.
1539 * @param enmChunkType Chunk type (continuous or non-continuous)
1540 * @param ppChunk Chunk address (out)
1541 */
1542static int gmmR0RegisterChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, RTR0MEMOBJ MemObj, uint16_t hGVM, GMMCHUNKTYPE enmChunkType, PGMMCHUNK *ppChunk = NULL)
1543{
1544 Assert(hGVM != NIL_GVM_HANDLE || pGMM->fBoundMemoryMode);
1545
1546 int rc;
1547 PGMMCHUNK pChunk = (PGMMCHUNK)RTMemAllocZ(sizeof(*pChunk));
1548 if (pChunk)
1549 {
1550 /*
1551 * Initialize it.
1552 */
1553 pChunk->MemObj = MemObj;
1554 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1555 pChunk->hGVM = hGVM;
1556 pChunk->iFreeHead = 0;
1557 pChunk->enmType = enmChunkType;
1558 for (unsigned iPage = 0; iPage < RT_ELEMENTS(pChunk->aPages) - 1; iPage++)
1559 {
1560 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1561 pChunk->aPages[iPage].Free.iNext = iPage + 1;
1562 }
1563 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.u2State = GMM_PAGE_STATE_FREE;
1564 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.iNext = UINT16_MAX;
1565
1566 /*
1567 * Allocate a Chunk ID and insert it into the tree.
1568 * This has to be done behind the mutex of course.
1569 */
1570 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1571 {
1572 pChunk->Core.Key = gmmR0AllocateChunkId(pGMM);
1573 if ( pChunk->Core.Key != NIL_GMM_CHUNKID
1574 && pChunk->Core.Key <= GMM_CHUNKID_LAST
1575 && RTAvlU32Insert(&pGMM->pChunks, &pChunk->Core))
1576 {
1577 pGMM->cChunks++;
1578 gmmR0LinkChunk(pChunk, pSet);
1579 LogFlow(("gmmR0RegisterChunk: pChunk=%p id=%#x cChunks=%d\n", pChunk, pChunk->Core.Key, pGMM->cChunks));
1580
1581 if (ppChunk)
1582 *ppChunk = pChunk;
1583
1584 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1585 return VINF_SUCCESS;
1586 }
1587
1588 /* bail out */
1589 rc = VERR_INTERNAL_ERROR;
1590 }
1591 else
1592 rc = VERR_INTERNAL_ERROR_5;
1593
1594 RTMemFree(pChunk);
1595 }
1596 else
1597 rc = VERR_NO_MEMORY;
1598 return rc;
1599}
1600
1601
1602/**
1603 * Allocate one new chunk and add it to the specified free set.
1604 *
1605 * @returns VBox status code.
1606 * @param pGMM Pointer to the GMM instance.
1607 * @param pSet Pointer to the set.
1608 * @param hGVM The affinity of the new chunk.
1609 * @param enmChunkType Chunk type (continuous or non-continuous)
1610 * @param ppChunk Chunk address (out)
1611 *
1612 * @remarks Called without owning the mutex.
1613 */
1614static int gmmR0AllocateOneChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, uint16_t hGVM, GMMCHUNKTYPE enmChunkType, PGMMCHUNK *ppChunk = NULL)
1615{
1616 /*
1617 * Allocate the memory.
1618 */
1619 RTR0MEMOBJ MemObj;
1620 int rc;
1621
1622 AssertCompile(GMM_CHUNK_SIZE == _2M);
1623 AssertReturn(enmChunkType == GMMCHUNKTYPE_NON_CONTINUOUS || enmChunkType == GMMCHUNKTYPE_CONTINUOUS, VERR_INVALID_PARAMETER);
1624
1625 /* Leave the lock temporarily as the allocation might take long. */
1626 RTSemFastMutexRelease(pGMM->Mtx);
1627 if (enmChunkType == GMMCHUNKTYPE_NON_CONTINUOUS)
1628 rc = RTR0MemObjAllocPhysNC(&MemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
1629 else
1630 rc = RTR0MemObjAllocPhysEx(&MemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS, GMM_CHUNK_SIZE);
1631
1632 /* Grab the lock again. */
1633 int rc2 = RTSemFastMutexRequest(pGMM->Mtx);
1634 AssertRCReturn(rc2, rc2);
1635
1636 if (RT_SUCCESS(rc))
1637 {
1638 rc = gmmR0RegisterChunk(pGMM, pSet, MemObj, hGVM, enmChunkType, ppChunk);
1639 if (RT_FAILURE(rc))
1640 RTR0MemObjFree(MemObj, false /* fFreeMappings */);
1641 }
1642 /** @todo Check that RTR0MemObjAllocPhysNC always returns VERR_NO_MEMORY on
1643 * allocation failure. */
1644 return rc;
1645}
1646
1647
1648/**
1649 * Attempts to allocate more pages until the requested amount is met.
1650 *
1651 * @returns VBox status code.
1652 * @param pGMM Pointer to the GMM instance data.
1653 * @param pGVM The calling VM.
1654 * @param pSet Pointer to the free set to grow.
1655 * @param cPages The number of pages needed.
1656 *
1657 * @remarks Called owning the mutex, but will leave it temporarily while
1658 * allocating the memory!
1659 */
1660static int gmmR0AllocateMoreChunks(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet, uint32_t cPages)
1661{
1662 Assert(!pGMM->fLegacyAllocationMode);
1663
1664 if (!GMM_CHECK_SANITY_IN_LOOPS(pGMM))
1665 return VERR_INTERNAL_ERROR_4;
1666
1667 if (!pGMM->fBoundMemoryMode)
1668 {
1669 /*
1670 * Try to steal free chunks from the other set first. (Only take 100% free chunks.)
1671 */
1672 PGMMCHUNKFREESET pOtherSet = pSet == &pGMM->Private ? &pGMM->Shared : &pGMM->Private;
1673 while ( pSet->cFreePages < cPages
1674 && pOtherSet->cFreePages >= GMM_CHUNK_NUM_PAGES)
1675 {
1676 PGMMCHUNK pChunk = pOtherSet->apLists[GMM_CHUNK_FREE_SET_LISTS - 1];
1677 while (pChunk && pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1678 pChunk = pChunk->pFreeNext;
1679 if (!pChunk)
1680 break;
1681
1682 gmmR0UnlinkChunk(pChunk);
1683 gmmR0LinkChunk(pChunk, pSet);
1684 }
1685
1686 /*
1687 * If we need still more pages, allocate new chunks.
1688 * Note! We will leave the mutex while doing the allocation.
1689 */
1690 while (pSet->cFreePages < cPages)
1691 {
1692 int rc = gmmR0AllocateOneChunk(pGMM, pSet, pGVM->hSelf, GMMCHUNKTYPE_NON_CONTINUOUS);
1693 if (RT_FAILURE(rc))
1694 return rc;
1695 if (!GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1696 return VERR_INTERNAL_ERROR_5;
1697 }
1698 }
1699 else
1700 {
1701 /*
1702 * The memory is bound to the VM allocating it, so we have to count
1703 * the free pages carefully as well as make sure we brand them with
1704 * our VM handle.
1705 *
1706 * Note! We will leave the mutex while doing the allocation.
1707 */
1708 uint16_t const hGVM = pGVM->hSelf;
1709 for (;;)
1710 {
1711 /* Count and see if we've reached the goal. */
1712 uint32_t cPagesFound = 0;
1713 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1714 for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1715 if (pCur->hGVM == hGVM)
1716 {
1717 cPagesFound += pCur->cFree;
1718 if (cPagesFound >= cPages)
1719 break;
1720 }
1721 if (cPagesFound >= cPages)
1722 break;
1723
1724 /* Allocate more. */
1725 int rc = gmmR0AllocateOneChunk(pGMM, pSet, hGVM, GMMCHUNKTYPE_NON_CONTINUOUS);
1726 if (RT_FAILURE(rc))
1727 return rc;
1728 if (!GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1729 return VERR_INTERNAL_ERROR_5;
1730 }
1731 }
1732
1733 return VINF_SUCCESS;
1734}
1735
1736
1737/**
1738 * Allocates one private page.
1739 *
1740 * Worker for gmmR0AllocatePages.
1741 *
1742 * @param pGMM Pointer to the GMM instance data.
1743 * @param hGVM The GVM handle of the VM requesting memory.
1744 * @param pChunk The chunk to allocate it from.
1745 * @param pPageDesc The page descriptor.
1746 */
1747static void gmmR0AllocatePage(PGMM pGMM, uint32_t hGVM, PGMMCHUNK pChunk, PGMMPAGEDESC pPageDesc)
1748{
1749 /* update the chunk stats. */
1750 if (pChunk->hGVM == NIL_GVM_HANDLE)
1751 pChunk->hGVM = hGVM;
1752 Assert(pChunk->cFree);
1753 pChunk->cFree--;
1754 pChunk->cPrivate++;
1755
1756 /* unlink the first free page. */
1757 const uint32_t iPage = pChunk->iFreeHead;
1758 AssertReleaseMsg(iPage < RT_ELEMENTS(pChunk->aPages), ("%d\n", iPage));
1759 PGMMPAGE pPage = &pChunk->aPages[iPage];
1760 Assert(GMM_PAGE_IS_FREE(pPage));
1761 pChunk->iFreeHead = pPage->Free.iNext;
1762 Log3(("A pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x iNext=%#x\n",
1763 pPage, iPage, (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage,
1764 pPage->Common.u2State, pChunk->iFreeHead, pPage->Free.iNext));
1765
1766 /* make the page private. */
1767 pPage->u = 0;
1768 AssertCompile(GMM_PAGE_STATE_PRIVATE == 0);
1769 pPage->Private.hGVM = hGVM;
1770 AssertCompile(NIL_RTHCPHYS >= GMM_GCPHYS_LAST);
1771 AssertCompile(GMM_GCPHYS_UNSHAREABLE >= GMM_GCPHYS_LAST);
1772 if (pPageDesc->HCPhysGCPhys <= GMM_GCPHYS_LAST)
1773 pPage->Private.pfn = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
1774 else
1775 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
1776
1777 /* update the page descriptor. */
1778 pPageDesc->HCPhysGCPhys = RTR0MemObjGetPagePhysAddr(pChunk->MemObj, iPage);
1779 Assert(pPageDesc->HCPhysGCPhys != NIL_RTHCPHYS);
1780 pPageDesc->idPage = (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage;
1781 pPageDesc->idSharedPage = NIL_GMM_PAGEID;
1782}
1783
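/*
 * Illustrative sketch (not part of the original sources): the idPage stored in
 * the page descriptor above can be split back into its chunk ID and the page
 * index within the chunk, which is what the lookup helpers further down rely
 * on (assuming GMM_CHUNK_NUM_PAGES == 1 << GMM_CHUNKID_SHIFT):
 *
 * @code
 *  uint32_t  idChunk = idPage >> GMM_CHUNKID_SHIFT;            // AVL tree key of the chunk
 *  uint32_t  iPage   = idPage & (GMM_CHUNK_NUM_PAGES - 1);     // index into pChunk->aPages
 *  PGMMCHUNK pChunk  = gmmR0GetChunk(pGMM, idChunk);
 *  PGMMPAGE  pPage   = pChunk ? &pChunk->aPages[iPage] : NULL;
 * @endcode
 */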
1784
1785/**
1786 * Common worker for GMMR0AllocateHandyPages and GMMR0AllocatePages.
1787 *
1788 * @returns VBox status code:
1789 * @retval VINF_SUCCESS on success.
1790 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk or
1791 * gmmR0AllocateMoreChunks is necessary.
1792 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
1793 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
1794 * that is we're trying to allocate more than we've reserved.
1795 *
1796 * @param pGMM Pointer to the GMM instance data.
1797 * @param pGVM Pointer to the shared VM structure.
1798 * @param cPages The number of pages to allocate.
1799 * @param paPages Pointer to the page descriptors.
1800 * See GMMPAGEDESC for details on what is expected on input.
1801 * @param enmAccount The account to charge.
1802 */
1803static int gmmR0AllocatePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
1804{
1805 /*
1806 * Check allocation limits.
1807 */
1808 if (RT_UNLIKELY(pGMM->cAllocatedPages + cPages > pGMM->cMaxPages))
1809 return VERR_GMM_HIT_GLOBAL_LIMIT;
1810
1811 switch (enmAccount)
1812 {
1813 case GMMACCOUNT_BASE:
1814 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cBasePages + pGVM->gmm.s.cBalloonedPages + cPages > pGVM->gmm.s.Reserved.cBasePages))
1815 {
1816 Log(("gmmR0AllocatePages:Base: Reserved=%#llx Allocated+Ballooned+Requested=%#llx+%#llx+%#x!\n",
1817 pGVM->gmm.s.Reserved.cBasePages, pGVM->gmm.s.Allocated.cBasePages, pGVM->gmm.s.cBalloonedPages, cPages));
1818 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
1819 }
1820 break;
1821 case GMMACCOUNT_SHADOW:
1822 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cShadowPages + cPages > pGVM->gmm.s.Reserved.cShadowPages))
1823 {
1824 Log(("gmmR0AllocatePages:Shadow: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
1825 pGVM->gmm.s.Reserved.cShadowPages, pGVM->gmm.s.Allocated.cShadowPages, cPages));
1826 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
1827 }
1828 break;
1829 case GMMACCOUNT_FIXED:
1830 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cFixedPages + cPages > pGVM->gmm.s.Reserved.cFixedPages))
1831 {
1832 Log(("gmmR0AllocatePages:Fixed: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
1833 pGVM->gmm.s.Reserved.cFixedPages, pGVM->gmm.s.Allocated.cFixedPages, cPages));
1834 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
1835 }
1836 break;
1837 default:
1838 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
1839 }
1840
1841 /*
1842 * Check if we need to allocate more memory or not. In bound memory mode this
1843 * is a bit of extra work, but it's easier to do it upfront than to bail out later.
1844 */
1845 PGMMCHUNKFREESET pSet = &pGMM->Private;
1846 if (pSet->cFreePages < cPages)
1847 return VERR_GMM_SEED_ME;
1848 if (pGMM->fBoundMemoryMode)
1849 {
1850 uint16_t hGVM = pGVM->hSelf;
1851 uint32_t cPagesFound = 0;
1852 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1853 for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1854 if (pCur->hGVM == hGVM)
1855 {
1856 cPagesFound += pCur->cFree;
1857 if (cPagesFound >= cPages)
1858 break;
1859 }
1860 if (cPagesFound < cPages)
1861 return VERR_GMM_SEED_ME;
1862 }
1863
1864 /*
1865 * Pick the pages.
1866 * Make some effort to avoid having several VMs share the same private chunks.
1867 */
1868 uint16_t hGVM = pGVM->hSelf;
1869 uint32_t iPage = 0;
1870
1871 /* first round, pick from chunks with an affinity to the VM. */
1872 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists) && iPage < cPages; i++)
1873 {
1874 PGMMCHUNK pCurFree = NULL;
1875 PGMMCHUNK pCur = pSet->apLists[i];
1876 while (pCur && iPage < cPages)
1877 {
1878 PGMMCHUNK pNext = pCur->pFreeNext;
1879
1880 if ( pCur->hGVM == hGVM
1881 && pCur->cFree < GMM_CHUNK_NUM_PAGES)
1882 {
1883 gmmR0UnlinkChunk(pCur);
1884 for (; pCur->cFree && iPage < cPages; iPage++)
1885 gmmR0AllocatePage(pGMM, hGVM, pCur, &paPages[iPage]);
1886 gmmR0LinkChunk(pCur, pSet);
1887 }
1888
1889 pCur = pNext;
1890 }
1891 }
1892
1893 if (iPage < cPages)
1894 {
1895 /* second round, pick pages from the 100% empty chunks we just skipped above. */
1896 PGMMCHUNK pCurFree = NULL;
1897 PGMMCHUNK pCur = pSet->apLists[RT_ELEMENTS(pSet->apLists) - 1];
1898 while (pCur && iPage < cPages)
1899 {
1900 PGMMCHUNK pNext = pCur->pFreeNext;
1901
1902 if ( pCur->cFree == GMM_CHUNK_NUM_PAGES
1903 && ( pCur->hGVM == hGVM
1904 || !pGMM->fBoundMemoryMode))
1905 {
1906 gmmR0UnlinkChunk(pCur);
1907 for (; pCur->cFree && iPage < cPages; iPage++)
1908 gmmR0AllocatePage(pGMM, hGVM, pCur, &paPages[iPage]);
1909 gmmR0LinkChunk(pCur, pSet);
1910 }
1911
1912 pCur = pNext;
1913 }
1914 }
1915
1916 if ( iPage < cPages
1917 && !pGMM->fBoundMemoryMode)
1918 {
1919 /* third round, disregard affinity. */
1920 unsigned i = RT_ELEMENTS(pSet->apLists);
1921 while (i-- > 0 && iPage < cPages)
1922 {
1923 PGMMCHUNK pCurFree = NULL;
1924 PGMMCHUNK pCur = pSet->apLists[i];
1925 while (pCur && iPage < cPages)
1926 {
1927 PGMMCHUNK pNext = pCur->pFreeNext;
1928
1929 if ( pCur->cFree > GMM_CHUNK_NUM_PAGES / 2
1930 && cPages >= GMM_CHUNK_NUM_PAGES / 2)
1931 pCur->hGVM = hGVM; /* change chunk affinity */
1932
1933 gmmR0UnlinkChunk(pCur);
1934 for (; pCur->cFree && iPage < cPages; iPage++)
1935 gmmR0AllocatePage(pGMM, hGVM, pCur, &paPages[iPage]);
1936 gmmR0LinkChunk(pCur, pSet);
1937
1938 pCur = pNext;
1939 }
1940 }
1941 }
1942
1943 /*
1944 * Update the account.
1945 */
1946 switch (enmAccount)
1947 {
1948 case GMMACCOUNT_BASE: pGVM->gmm.s.Allocated.cBasePages += iPage; break;
1949 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Allocated.cShadowPages += iPage; break;
1950 case GMMACCOUNT_FIXED: pGVM->gmm.s.Allocated.cFixedPages += iPage; break;
1951 default:
1952 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
1953 }
1954 pGVM->gmm.s.cPrivatePages += iPage;
1955 pGMM->cAllocatedPages += iPage;
1956
1957 AssertMsgReturn(iPage == cPages, ("%u != %u\n", iPage, cPages), VERR_INTERNAL_ERROR);
1958
1959 /*
1960 * Check if we've reached some threshold and should kick one or two VMs and tell
1961 * them to inflate their balloons a bit more... later.
1962 */
1963
1964 return VINF_SUCCESS;
1965}
1966
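/*
 * Illustrative sketch (not part of the original sources): the per-VM account
 * check at the top of gmmR0AllocatePages is plain bookkeeping arithmetic.  For
 * the base account the question asked is whether the reservation still covers
 * everything the VM holds plus the new request:
 *
 * @code
 *  uint64_t cHeld = pGVM->gmm.s.Allocated.cBasePages + pGVM->gmm.s.cBalloonedPages;
 *  if (cHeld + cPages > pGVM->gmm.s.Reserved.cBasePages)
 *      return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;   // the reservation must be raised first
 * @endcode
 */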
1967
1968/**
1969 * Updates the previous allocations and allocates more pages.
1970 *
1971 * The handy pages are always taken from the 'base' memory account.
1972 * The allocated pages are not cleared and will contain random garbage.
1973 *
1974 * @returns VBox status code:
1975 * @retval VINF_SUCCESS on success.
1976 * @retval VERR_NOT_OWNER if the caller is not an EMT.
1977 * @retval VERR_GMM_PAGE_NOT_FOUND if one of the pages to update wasn't found.
1978 * @retval VERR_GMM_PAGE_NOT_PRIVATE if one of the pages to update wasn't a
1979 * private page.
1980 * @retval VERR_GMM_PAGE_NOT_SHARED if one of the pages to update wasn't a
1981 * shared page.
1982 * @retval VERR_GMM_NOT_PAGE_OWNER if one of the pages to be updated wasn't
1983 * owned by the VM.
1984 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
1985 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
1986 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
1987 * that is we're trying to allocate more than we've reserved.
1988 *
1989 * @param pVM Pointer to the shared VM structure.
1990 * @param idCpu VCPU id
1991 * @param cPagesToUpdate The number of pages to update (starting from the head).
1992 * @param cPagesToAlloc The number of pages to allocate (starting from the head).
1993 * @param paPages The array of page descriptors.
1994 * See GMMPAGEDESC for details on what is expected on input.
1995 * @thread EMT.
1996 */
1997GMMR0DECL(int) GMMR0AllocateHandyPages(PVM pVM, VMCPUID idCpu, uint32_t cPagesToUpdate, uint32_t cPagesToAlloc, PGMMPAGEDESC paPages)
1998{
1999 LogFlow(("GMMR0AllocateHandyPages: pVM=%p cPagesToUpdate=%#x cPagesToAlloc=%#x paPages=%p\n",
2000 pVM, cPagesToUpdate, cPagesToAlloc, paPages));
2001
2002 /*
2003 * Validate, get basics and take the semaphore.
2004 * (This is a relatively busy path, so make predictions where possible.)
2005 */
2006 PGMM pGMM;
2007 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2008 PGVM pGVM;
2009 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2010 if (RT_FAILURE(rc))
2011 return rc;
2012
2013 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2014 AssertMsgReturn( (cPagesToUpdate && cPagesToUpdate < 1024)
2015 || (cPagesToAlloc && cPagesToAlloc < 1024),
2016 ("cPagesToUpdate=%#x cPagesToAlloc=%#x\n", cPagesToUpdate, cPagesToAlloc),
2017 VERR_INVALID_PARAMETER);
2018
2019 unsigned iPage = 0;
2020 for (; iPage < cPagesToUpdate; iPage++)
2021 {
2022 AssertMsgReturn( ( paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2023 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK))
2024 || paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2025 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE,
2026 ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys),
2027 VERR_INVALID_PARAMETER);
2028 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2029 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2030 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2031 AssertMsgReturn( paPages[iPage].idSharedPage <= GMM_PAGEID_LAST
2032 /*|| paPages[iPage].idSharedPage == NIL_GMM_PAGEID*/,
2033 ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2034 }
2035
2036 for (; iPage < cPagesToAlloc; iPage++)
2037 {
2038 AssertMsgReturn(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS, ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys), VERR_INVALID_PARAMETER);
2039 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2040 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2041 }
2042
2043 rc = RTSemFastMutexRequest(pGMM->Mtx);
2044 AssertRC(rc);
2045 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2046 {
2047
2048 /* No allocations before the initial reservation has been made! */
2049 if (RT_LIKELY( pGVM->gmm.s.Reserved.cBasePages
2050 && pGVM->gmm.s.Reserved.cFixedPages
2051 && pGVM->gmm.s.Reserved.cShadowPages))
2052 {
2053 /*
2054 * Perform the updates.
2055 * Stop on the first error.
2056 */
2057 for (iPage = 0; iPage < cPagesToUpdate; iPage++)
2058 {
2059 if (paPages[iPage].idPage != NIL_GMM_PAGEID)
2060 {
2061 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idPage);
2062 if (RT_LIKELY(pPage))
2063 {
2064 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2065 {
2066 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
2067 {
2068 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2069 if (RT_LIKELY(paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST))
2070 pPage->Private.pfn = paPages[iPage].HCPhysGCPhys >> PAGE_SHIFT;
2071 else if (paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE)
2072 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
2073 /* else: NIL_RTHCPHYS nothing */
2074
2075 paPages[iPage].idPage = NIL_GMM_PAGEID;
2076 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2077 }
2078 else
2079 {
2080 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not owner! hGVM=%#x hSelf=%#x\n",
2081 iPage, paPages[iPage].idPage, pPage->Private.hGVM, pGVM->hSelf));
2082 rc = VERR_GMM_NOT_PAGE_OWNER;
2083 break;
2084 }
2085 }
2086 else
2087 {
2088 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not private! %.*Rhxs\n", iPage, paPages[iPage].idPage, sizeof(*pPage), pPage));
2089 rc = VERR_GMM_PAGE_NOT_PRIVATE;
2090 break;
2091 }
2092 }
2093 else
2094 {
2095 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (private)\n", iPage, paPages[iPage].idPage));
2096 rc = VERR_GMM_PAGE_NOT_FOUND;
2097 break;
2098 }
2099 }
2100
2101 if (paPages[iPage].idSharedPage != NIL_GMM_PAGEID)
2102 {
2103 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idSharedPage);
2104 if (RT_LIKELY(pPage))
2105 {
2106 if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
2107 {
2108 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2109 Assert(pPage->Shared.cRefs);
2110 Assert(pGVM->gmm.s.cSharedPages);
2111 Assert(pGVM->gmm.s.Allocated.cBasePages);
2112
2113 pGVM->gmm.s.cSharedPages--;
2114 pGVM->gmm.s.Allocated.cBasePages--;
2115 if (!--pPage->Shared.cRefs)
2116 gmmR0FreeSharedPage(pGMM, paPages[iPage].idSharedPage, pPage);
2117
2118 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2119 }
2120 else
2121 {
2122 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not shared!\n", iPage, paPages[iPage].idSharedPage));
2123 rc = VERR_GMM_PAGE_NOT_SHARED;
2124 break;
2125 }
2126 }
2127 else
2128 {
2129 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (shared)\n", iPage, paPages[iPage].idSharedPage));
2130 rc = VERR_GMM_PAGE_NOT_FOUND;
2131 break;
2132 }
2133 }
2134 }
2135
2136 /*
2137 * Join paths with GMMR0AllocatePages for the allocation.
2138 * Note! gmmR0AllocateMoreChunks may leave the protection of the mutex!
2139 */
2140 while (RT_SUCCESS(rc))
2141 {
2142 rc = gmmR0AllocatePages(pGMM, pGVM, cPagesToAlloc, paPages, GMMACCOUNT_BASE);
2143 if ( rc != VERR_GMM_SEED_ME
2144 || pGMM->fLegacyAllocationMode)
2145 break;
2146 rc = gmmR0AllocateMoreChunks(pGMM, pGVM, &pGMM->Private, cPagesToAlloc);
2147 }
2148 }
2149 else
2150 rc = VERR_WRONG_ORDER;
2151 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2152 }
2153 else
2154 rc = VERR_INTERNAL_ERROR_5;
2155 RTSemFastMutexRelease(pGMM->Mtx);
2156 LogFlow(("GMMR0AllocateHandyPages: returns %Rrc\n", rc));
2157 return rc;
2158}
2159
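/*
 * Illustrative sketch (not part of the original sources): a caller that only
 * wants fresh handy pages prepares the descriptors so that every entry is
 * completely neutral, which is exactly what the input validation above insists
 * on for the pages to be allocated.  Roughly (cHandyPages is just a stand-in
 * name, the real caller lives in PGM):
 *
 * @code
 *  for (uint32_t i = 0; i < cHandyPages; i++)
 *  {
 *      paPages[i].HCPhysGCPhys = NIL_RTHCPHYS;     // no guest physical address yet
 *      paPages[i].idPage       = NIL_GMM_PAGEID;   // no private page assigned
 *      paPages[i].idSharedPage = NIL_GMM_PAGEID;   // no shared page to release
 *  }
 *  rc = GMMR0AllocateHandyPages(pVM, idCpu, 0 /* nothing to update */, cHandyPages, paPages);
 * @endcode
 */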
2160
2161/**
2162 * Allocate one or more pages.
2163 *
2164 * This is typically used for ROMs and MMIO2 (VRAM) during VM creation.
2165 * The allocated pages are not cleared and will contain random garbage.
2166 *
2167 * @returns VBox status code:
2168 * @retval VINF_SUCCESS on success.
2169 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2170 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2171 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2172 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2173 * that is we're trying to allocate more than we've reserved.
2174 *
2175 * @param pVM Pointer to the shared VM structure.
2176 * @param idCpu VCPU id
2177 * @param cPages The number of pages to allocate.
2178 * @param paPages Pointer to the page descriptors.
2179 * See GMMPAGEDESC for details on what is expected on input.
2180 * @param enmAccount The account to charge.
2181 *
2182 * @thread EMT.
2183 */
2184GMMR0DECL(int) GMMR0AllocatePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2185{
2186 LogFlow(("GMMR0AllocatePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
2187
2188 /*
2189 * Validate, get basics and take the semaphore.
2190 */
2191 PGMM pGMM;
2192 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2193 PGVM pGVM;
2194 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2195 if (RT_FAILURE(rc))
2196 return rc;
2197
2198 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2199 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
2200 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
2201
2202 for (unsigned iPage = 0; iPage < cPages; iPage++)
2203 {
2204 AssertMsgReturn( paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2205 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE
2206 || ( enmAccount == GMMACCOUNT_BASE
2207 && paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2208 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK)),
2209 ("#%#x: %RHp enmAccount=%d\n", iPage, paPages[iPage].HCPhysGCPhys, enmAccount),
2210 VERR_INVALID_PARAMETER);
2211 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2212 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2213 }
2214
2215 rc = RTSemFastMutexRequest(pGMM->Mtx);
2216 AssertRC(rc);
2217 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2218 {
2219
2220 /* No allocations before the initial reservation has been made! */
2221 if (RT_LIKELY( pGVM->gmm.s.Reserved.cBasePages
2222 && pGVM->gmm.s.Reserved.cFixedPages
2223 && pGVM->gmm.s.Reserved.cShadowPages))
2224 {
2225 /*
2226 * gmmR0AllocatePages seed loop.
2227 * Note! gmmR0AllocateMoreChunks may leave the protection of the mutex!
2228 */
2229 while (RT_SUCCESS(rc))
2230 {
2231 rc = gmmR0AllocatePages(pGMM, pGVM, cPages, paPages, enmAccount);
2232 if ( rc != VERR_GMM_SEED_ME
2233 || pGMM->fLegacyAllocationMode)
2234 break;
2235 rc = gmmR0AllocateMoreChunks(pGMM, pGVM, &pGMM->Private, cPages);
2236 }
2237 }
2238 else
2239 rc = VERR_WRONG_ORDER;
2240 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2241 }
2242 else
2243 rc = VERR_INTERNAL_ERROR_5;
2244 RTSemFastMutexRelease(pGMM->Mtx);
2245 LogFlow(("GMMR0AllocatePages: returns %Rrc\n", rc));
2246 return rc;
2247}
2248
2249
2250/**
2251 * VMMR0 request wrapper for GMMR0AllocatePages.
2252 *
2253 * @returns see GMMR0AllocatePages.
2254 * @param pVM Pointer to the shared VM structure.
2255 * @param idCpu VCPU id
2256 * @param pReq The request packet.
2257 */
2258GMMR0DECL(int) GMMR0AllocatePagesReq(PVM pVM, VMCPUID idCpu, PGMMALLOCATEPAGESREQ pReq)
2259{
2260 /*
2261 * Validate input and pass it on.
2262 */
2263 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2264 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2265 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0]),
2266 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0])),
2267 VERR_INVALID_PARAMETER);
2268 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages]),
2269 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages])),
2270 VERR_INVALID_PARAMETER);
2271
2272 return GMMR0AllocatePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
2273}
2274
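/*
 * Illustrative sketch (not part of the original sources): the request validated
 * by GMMR0AllocatePagesReq above is a variable sized packet, so the submitter
 * sizes the whole thing for the exact number of page descriptors it appends.
 * Roughly (the usual VMMR0 request header initialisation is elided):
 *
 * @code
 *  uint32_t             cbReq = RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[cPages]);
 *  PGMMALLOCATEPAGESREQ pReq  = (PGMMALLOCATEPAGESREQ)RTMemAllocZ(cbReq);
 *  pReq->Hdr.cbReq  = cbReq;                  // must match the assertions above exactly
 *  pReq->enmAccount = GMMACCOUNT_BASE;
 *  pReq->cPages     = cPages;
 *  // fill pReq->aPages[0..cPages-1] as described by GMMPAGEDESC and submit the
 *  // packet through the VMMR0 request path.
 * @endcode
 */
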
2275/**
2276 * Allocate a large page to represent guest RAM
2277 *
2278 * The allocated pages are not cleared and will contain random garbage.
2279 *
2280 * @returns VBox status code:
2281 * @retval VINF_SUCCESS on success.
2282 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2283 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2284 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2285 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2286 * that is we're trying to allocate more than we've reserved.
2287 * @retval VERR_NOT_SUPPORTED if called in legacy allocation mode.
2288 * @param pVM Pointer to the shared VM structure.
2289 * @param idCpu VCPU id
2290 * @param cbPage Large page size (must be GMM_CHUNK_SIZE).
 * @param pIdPage Where to return the GMM page ID of the first page.
 * @param pHCPhys Where to return the host physical address of the large page.
2291 */
2292GMMR0DECL(int) GMMR0AllocateLargePage(PVM pVM, VMCPUID idCpu, uint32_t cbPage, uint32_t *pIdPage, RTHCPHYS *pHCPhys)
2293{
2294 LogFlow(("GMMR0AllocateLargePage: pVM=%p cbPage=%x\n", pVM, cbPage));
2295
2296 AssertReturn(cbPage == GMM_CHUNK_SIZE, VERR_INVALID_PARAMETER);
2297 AssertPtrReturn(pIdPage, VERR_INVALID_PARAMETER);
2298 AssertPtrReturn(pHCPhys, VERR_INVALID_PARAMETER);
2299
2300 /*
2301 * Validate, get basics and take the semaphore.
2302 */
2303 PGMM pGMM;
2304 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2305 PGVM pGVM;
2306 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2307 if (RT_FAILURE(rc))
2308 return rc;
2309
2310 /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
2311 if (pGMM->fLegacyAllocationMode)
2312 return VERR_NOT_SUPPORTED;
2313
2314 *pHCPhys = NIL_RTHCPHYS;
2315 *pIdPage = NIL_GMM_PAGEID;
2316
2317 rc = RTSemFastMutexRequest(pGMM->Mtx);
2318 AssertRCReturn(rc, rc);
2319 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2320 {
2321 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
2322 PGMMCHUNK pChunk;
2323 GMMPAGEDESC PageDesc;
2324
2325 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cBasePages + pGVM->gmm.s.cBalloonedPages + cPages > pGVM->gmm.s.Reserved.cBasePages))
2326 {
2327 Log(("GMMR0AllocateLargePage: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
2328 pGVM->gmm.s.Reserved.cBasePages, pGVM->gmm.s.Allocated.cBasePages, cPages));
2329 RTSemFastMutexRelease(pGMM->Mtx);
2330 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2331 }
2332
2333 /* Allocate a new continuous chunk. */
2334 rc = gmmR0AllocateOneChunk(pGMM, &pGMM->Private, pGVM->hSelf, GMMCHUNKTYPE_CONTINUOUS, &pChunk);
2335 if (RT_FAILURE(rc))
2336 {
2337 RTSemFastMutexRelease(pGMM->Mtx);
2338 return rc;
2339 }
2340
2341 /* Unlink the new chunk from the free list. */
2342 gmmR0UnlinkChunk(pChunk);
2343
2344 /* Allocate all pages. */
2345 gmmR0AllocatePage(pGMM, pGVM->hSelf, pChunk, &PageDesc);
2346 /* Return the first page as we'll use the whole chunk as one big page. */
2347 *pIdPage = PageDesc.idPage;
2348 *pHCPhys = PageDesc.HCPhysGCPhys;
2349
2350 for (unsigned i = 1; i < cPages; i++)
2351 gmmR0AllocatePage(pGMM, pGVM->hSelf, pChunk, &PageDesc);
2352
2353 /* Update accounting. */
2354 pGVM->gmm.s.Allocated.cBasePages += cPages;
2355 pGVM->gmm.s.cPrivatePages += cPages;
2356 pGMM->cAllocatedPages += cPages;
2357
2358 gmmR0LinkChunk(pChunk, &pGMM->Private);
2359 }
2360 else
2361 rc = VERR_INTERNAL_ERROR_5;
2362
2363 RTSemFastMutexRelease(pGMM->Mtx);
2364 LogFlow(("GMMR0AllocateLargePage: returns %Rrc\n", rc));
2365 return rc;
2366}
2367
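/*
 * Illustrative sketch (not part of the original sources): a large page as
 * handed out by GMMR0AllocateLargePage is simply one whole, physically
 * contiguous allocation chunk.  The relation between the outputs is:
 *
 * @code
 *  unsigned cPages = GMM_CHUNK_SIZE >> PAGE_SHIFT;   // 2MB / 4KB = 512 small pages
 *  // *pIdPage - GMM page ID of the first page; the remaining pages of the
 *  //            chunk back the rest of the large page.
 *  // *pHCPhys - host physical address of that first page, and since the chunk
 *  //            was allocated GMMCHUNKTYPE_CONTINUOUS, of the whole 2MB range.
 * @endcode
 */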
2368
2369/**
2370 * Free a large page
2371 *
2372 * @returns VBox status code:
2373 * @param pVM Pointer to the shared VM structure.
2374 * @param idCpu VCPU id
2375 * @param idPage Large page id
2376 */
2377GMMR0DECL(int) GMMR0FreeLargePage(PVM pVM, VMCPUID idCpu, uint32_t idPage)
2378{
2379 LogFlow(("GMMR0FreeLargePage: pVM=%p idPage=%x\n", pVM, idPage));
2380
2381 /*
2382 * Validate, get basics and take the semaphore.
2383 */
2384 PGMM pGMM;
2385 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2386 PGVM pGVM;
2387 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2388 if (RT_FAILURE(rc))
2389 return rc;
2390
2391 /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
2392 if (pGMM->fLegacyAllocationMode)
2393 return VERR_NOT_SUPPORTED;
2394
2395 rc = RTSemFastMutexRequest(pGMM->Mtx);
2396 AssertRC(rc);
2397 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2398 {
2399 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
2400
2401 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cBasePages < cPages))
2402 {
2403 Log(("GMMR0FreeLargePage: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cBasePages, cPages));
2404 RTSemFastMutexRelease(pGMM->Mtx);
2405 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
2406 }
2407
2408 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2409 if ( RT_LIKELY(pPage)
2410 && RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2411 {
2412 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
2413 Assert(pChunk);
2414 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
2415 Assert(pChunk->cPrivate > 0);
2416
2417 /* Release the memory immediately. */
2418 gmmR0FreeChunk(pGMM, NULL, pChunk);
2419
2420 /* Update accounting. */
2421 pGVM->gmm.s.Allocated.cBasePages -= cPages;
2422 pGVM->gmm.s.cPrivatePages -= cPages;
2423 pGMM->cAllocatedPages -= cPages;
2424 }
2425 else
2426 rc = VERR_GMM_PAGE_NOT_FOUND;
2427 }
2428 else
2429 rc = VERR_INTERNAL_ERROR_5;
2430
2431 RTSemFastMutexRelease(pGMM->Mtx);
2432 LogFlow(("GMMR0FreeLargePage: returns %Rrc\n", rc));
2433 return rc;
2434}
2435
2436
2437/**
2438 * VMMR0 request wrapper for GMMR0FreeLargePage.
2439 *
2440 * @returns see GMMR0FreeLargePage.
2441 * @param pVM Pointer to the shared VM structure.
2442 * @param idCpu VCPU id
2443 * @param pReq The request packet.
2444 */
2445GMMR0DECL(int) GMMR0FreeLargePageReq(PVM pVM, VMCPUID idCpu, PGMMFREELARGEPAGEREQ pReq)
2446{
2447 /*
2448 * Validate input and pass it on.
2449 */
2450 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2451 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2452 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMFREELARGEPAGEREQ),
2453 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMFREELARGEPAGEREQ)),
2454 VERR_INVALID_PARAMETER);
2455
2456 return GMMR0FreeLargePage(pVM, idCpu, pReq->idPage);
2457}
2458
2459/**
2460 * Frees a chunk, giving it back to the host OS.
2461 *
2462 * @param pGMM Pointer to the GMM instance.
2463 * @param pGVM This is set when called from GMMR0CleanupVM so we can
2464 * unmap and free the chunk in one go.
2465 * @param pChunk The chunk to free.
2466 */
2467static void gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
2468{
2469 Assert(pChunk->Core.Key != NIL_GMM_CHUNKID);
2470
2471 /*
2472 * Cleanup hack! Unmap the chunk from the caller's address space.
2473 */
2474 if ( pChunk->cMappings
2475 && pGVM)
2476 gmmR0UnmapChunk(pGMM, pGVM, pChunk);
2477
2478 /*
2479 * If there are current mappings of the chunk, then request the
2480 * VMs to unmap them. Reposition the chunk in the free list so
2481 * it won't be a likely candidate for allocations.
2482 */
2483 if (pChunk->cMappings)
2484 {
2485 /** @todo R0 -> VM request */
2486 /* The chunk can be owned by more than one VM if fBoundMemoryMode is false! */
2487 }
2488 else
2489 {
2490 /*
2491 * Try free the memory object.
2492 */
2493 int rc = RTR0MemObjFree(pChunk->MemObj, false /* fFreeMappings */);
2494 if (RT_SUCCESS(rc))
2495 {
2496 pChunk->MemObj = NIL_RTR0MEMOBJ;
2497
2498 /*
2499 * Unlink it from everywhere.
2500 */
2501 gmmR0UnlinkChunk(pChunk);
2502
2503 PAVLU32NODECORE pCore = RTAvlU32Remove(&pGMM->pChunks, pChunk->Core.Key);
2504 Assert(pCore == &pChunk->Core); NOREF(pCore);
2505
2506 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(pChunk->Core.Key)];
2507 if (pTlbe->pChunk == pChunk)
2508 {
2509 pTlbe->idChunk = NIL_GMM_CHUNKID;
2510 pTlbe->pChunk = NULL;
2511 }
2512
2513 Assert(pGMM->cChunks > 0);
2514 pGMM->cChunks--;
2515
2516 /*
2517 * Free the Chunk ID and struct.
2518 */
2519 gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
2520 pChunk->Core.Key = NIL_GMM_CHUNKID;
2521
2522 RTMemFree(pChunk->paMappings);
2523 pChunk->paMappings = NULL;
2524
2525 RTMemFree(pChunk);
2526 }
2527 else
2528 AssertRC(rc);
2529 }
2530}
2531
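/*
 * Illustrative sketch (not part of the original sources): the chunk TLB entry
 * purged by gmmR0FreeChunk above matters because chunk lookups are expected to
 * go through that TLB before falling back to the AVL tree, along these lines
 * (a simplified guess at what gmmR0GetChunk does, not a copy of it):
 *
 * @code
 *  PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
 *  if (pTlbe->idChunk == idChunk)
 *      return pTlbe->pChunk;           // hit - must never point at a freed chunk
 *  // miss - look the chunk up in pGMM->pChunks (keyed by chunk ID) and refill
 *  // the TLB entry before returning.
 * @endcode
 */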
2532
2533/**
2534 * Free page worker.
2535 *
2536 * The caller does all the statistics decrementing; we do all the incrementing.
2537 *
2538 * @param pGMM Pointer to the GMM instance data.
2539 * @param pChunk Pointer to the chunk this page belongs to.
2540 * @param idPage The Page ID.
2541 * @param pPage Pointer to the page.
2542 */
2543static void gmmR0FreePageWorker(PGMM pGMM, PGMMCHUNK pChunk, uint32_t idPage, PGMMPAGE pPage)
2544{
2545 Log3(("F pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x\n",
2546 pPage, pPage - &pChunk->aPages[0], idPage, pPage->Common.u2State, pChunk->iFreeHead)); NOREF(idPage);
2547
2548 /*
2549 * Put the page on the free list.
2550 */
2551 pPage->u = 0;
2552 pPage->Free.u2State = GMM_PAGE_STATE_FREE;
2553 Assert(pChunk->iFreeHead < RT_ELEMENTS(pChunk->aPages) || pChunk->iFreeHead == UINT16_MAX);
2554 pPage->Free.iNext = pChunk->iFreeHead;
2555 pChunk->iFreeHead = pPage - &pChunk->aPages[0];
2556
2557 /*
2558 * Update statistics (the cShared/cPrivate stats are up to date already),
2559 * and relink the chunk if necessary.
2560 */
2561 if ((pChunk->cFree & GMM_CHUNK_FREE_SET_MASK) == 0)
2562 {
2563 gmmR0UnlinkChunk(pChunk);
2564 pChunk->cFree++;
2565 gmmR0LinkChunk(pChunk, pChunk->cShared ? &pGMM->Shared : &pGMM->Private);
2566 }
2567 else
2568 {
2569 pChunk->cFree++;
2570 pChunk->pSet->cFreePages++;
2571
2572 /*
2573 * If the chunk becomes empty, consider giving memory back to the host OS.
2574 *
2575 * The current strategy is to try give it back if there are other chunks
2576 * in this free list, meaning if there are at least 240 free pages in this
2577 * category. Note that since there are probably mappings of the chunk,
2578 * it won't be freed up instantly, which probably screws up this logic
2579 * a bit...
2580 */
2581 if (RT_UNLIKELY( pChunk->cFree == GMM_CHUNK_NUM_PAGES
2582 && pChunk->pFreeNext
2583 && pChunk->pFreePrev
2584 && !pGMM->fLegacyAllocationMode))
2585 gmmR0FreeChunk(pGMM, NULL, pChunk);
2586 }
2587}
2588
2589
2590/**
2591 * Frees a shared page, the page is known to exist and be valid and such.
2592 *
2593 * @param pGMM Pointer to the GMM instance.
2594 * @param idPage The Page ID
2595 * @param pPage The page structure.
2596 */
2597DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, uint32_t idPage, PGMMPAGE pPage)
2598{
2599 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
2600 Assert(pChunk);
2601 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
2602 Assert(pChunk->cShared > 0);
2603 Assert(pGMM->cSharedPages > 0);
2604 Assert(pGMM->cAllocatedPages > 0);
2605 Assert(!pPage->Shared.cRefs);
2606
2607 pChunk->cShared--;
2608 pGMM->cAllocatedPages--;
2609 pGMM->cSharedPages--;
2610 gmmR0FreePageWorker(pGMM, pChunk, idPage, pPage);
2611}
2612
2613
2614/**
2615 * Frees a private page, the page is known to exist and be valid and such.
2616 *
2617 * @param pGMM Pointer to the GMM instance.
2618 * @param idPage The Page ID
2619 * @param pPage The page structure.
2620 */
2621DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, uint32_t idPage, PGMMPAGE pPage)
2622{
2623 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
2624 Assert(pChunk);
2625 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
2626 Assert(pChunk->cPrivate > 0);
2627 Assert(pGMM->cAllocatedPages > 0);
2628
2629 pChunk->cPrivate--;
2630 pGMM->cAllocatedPages--;
2631 gmmR0FreePageWorker(pGMM, pChunk, idPage, pPage);
2632}
2633
2634
2635/**
2636 * Common worker for GMMR0FreePages and GMMR0BalloonedPages.
2637 *
2638 * @returns VBox status code:
2639 * @retval xxx
2640 *
2641 * @param pGMM Pointer to the GMM instance data.
2642 * @param pGVM Pointer to the shared VM structure.
2643 * @param cPages The number of pages to free.
2644 * @param paPages Pointer to the page descriptors.
2645 * @param enmAccount The account this relates to.
2646 */
2647static int gmmR0FreePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
2648{
2649 /*
2650 * Check that the request isn't impossible with respect to the account status.
2651 */
2652 switch (enmAccount)
2653 {
2654 case GMMACCOUNT_BASE:
2655 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cBasePages < cPages))
2656 {
2657 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cBasePages, cPages));
2658 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
2659 }
2660 break;
2661 case GMMACCOUNT_SHADOW:
2662 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cShadowPages < cPages))
2663 {
2664 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cShadowPages, cPages));
2665 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
2666 }
2667 break;
2668 case GMMACCOUNT_FIXED:
2669 if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cFixedPages < cPages))
2670 {
2671 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cFixedPages, cPages));
2672 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
2673 }
2674 break;
2675 default:
2676 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
2677 }
2678
2679 /*
2680 * Walk the descriptors and free the pages.
2681 *
2682 * Statistics (except the account) are being updated as we go along,
2683 * unlike the alloc code. Also, stop on the first error.
2684 */
2685 int rc = VINF_SUCCESS;
2686 uint32_t iPage;
2687 for (iPage = 0; iPage < cPages; iPage++)
2688 {
2689 uint32_t idPage = paPages[iPage].idPage;
2690 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2691 if (RT_LIKELY(pPage))
2692 {
2693 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2694 {
2695 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
2696 {
2697 Assert(pGVM->gmm.s.cPrivatePages);
2698 pGVM->gmm.s.cPrivatePages--;
2699 gmmR0FreePrivatePage(pGMM, idPage, pPage);
2700 }
2701 else
2702 {
2703 Log(("gmmR0FreePages: #%#x/%#x: not owner! hGVM=%#x hSelf=%#x\n", iPage, idPage,
2704 pPage->Private.hGVM, pGVM->hSelf));
2705 rc = VERR_GMM_NOT_PAGE_OWNER;
2706 break;
2707 }
2708 }
2709 else if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
2710 {
2711 Assert(pGVM->gmm.s.cSharedPages);
2712 pGVM->gmm.s.cSharedPages--;
2713 Assert(pPage->Shared.cRefs);
2714 if (!--pPage->Shared.cRefs)
2715 gmmR0FreeSharedPage(pGMM, idPage, pPage);
2716 }
2717 else
2718 {
2719 Log(("gmmR0FreePages: #%#x/%#x: already free!\n", iPage, idPage));
2720 rc = VERR_GMM_PAGE_ALREADY_FREE;
2721 break;
2722 }
2723 }
2724 else
2725 {
2726 Log(("gmmR0FreePages: #%#x/%#x: not found!\n", iPage, idPage));
2727 rc = VERR_GMM_PAGE_NOT_FOUND;
2728 break;
2729 }
2730 paPages[iPage].idPage = NIL_GMM_PAGEID;
2731 }
2732
2733 /*
2734 * Update the account.
2735 */
2736 switch (enmAccount)
2737 {
2738 case GMMACCOUNT_BASE: pGVM->gmm.s.Allocated.cBasePages -= iPage; break;
2739 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Allocated.cShadowPages -= iPage; break;
2740 case GMMACCOUNT_FIXED: pGVM->gmm.s.Allocated.cFixedPages -= iPage; break;
2741 default:
2742 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
2743 }
2744
2745 /*
2746 * Any threshold stuff to be done here?
2747 */
2748
2749 return rc;
2750}
2751
2752
2753/**
2754 * Free one or more pages.
2755 *
2756 * This is typically used at reset time or power off.
2757 *
2758 * @returns VBox status code:
2759 * @retval xxx
2760 *
2761 * @param pVM Pointer to the shared VM structure.
2762 * @param idCpu VCPU id
2763 * @param cPages The number of pages to free.
2764 * @param paPages Pointer to the page descriptors containing the Page IDs for each page.
2765 * @param enmAccount The account this relates to.
2766 * @thread EMT.
2767 */
2768GMMR0DECL(int) GMMR0FreePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
2769{
2770 LogFlow(("GMMR0FreePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
2771
2772 /*
2773 * Validate input and get the basics.
2774 */
2775 PGMM pGMM;
2776 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2777 PGVM pGVM;
2778 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2779 if (RT_FAILURE(rc))
2780 return rc;
2781
2782 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2783 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
2784 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
2785
2786 for (unsigned iPage = 0; iPage < cPages; iPage++)
2787 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2788 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2789 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2790
2791 /*
2792 * Take the semaphore and call the worker function.
2793 */
2794 rc = RTSemFastMutexRequest(pGMM->Mtx);
2795 AssertRC(rc);
2796 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2797 {
2798 rc = gmmR0FreePages(pGMM, pGVM, cPages, paPages, enmAccount);
2799 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2800 }
2801 else
2802 rc = VERR_INTERNAL_ERROR_5;
2803 RTSemFastMutexRelease(pGMM->Mtx);
2804 LogFlow(("GMMR0FreePages: returns %Rrc\n", rc));
2805 return rc;
2806}
2807
2808
2809/**
2810 * VMMR0 request wrapper for GMMR0FreePages.
2811 *
2812 * @returns see GMMR0FreePages.
2813 * @param pVM Pointer to the shared VM structure.
2814 * @param idCpu VCPU id
2815 * @param pReq The request packet.
2816 */
2817GMMR0DECL(int) GMMR0FreePagesReq(PVM pVM, VMCPUID idCpu, PGMMFREEPAGESREQ pReq)
2818{
2819 /*
2820 * Validate input and pass it on.
2821 */
2822 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2823 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2824 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0]),
2825 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0])),
2826 VERR_INVALID_PARAMETER);
2827 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages]),
2828 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages])),
2829 VERR_INVALID_PARAMETER);
2830
2831 return GMMR0FreePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
2832}
2833
2834
2835/**
2836 * Report back on a memory ballooning request.
2837 *
2838 * The request may or may not have been initiated by the GMM. If it was initiated
2839 * by the GMM it is important that this function is called even if no pages were
2840 * ballooned.
2841 *
2842 * @returns VBox status code:
2843 * @retval VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH
2844 * @retval VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH
2845 * @retval VERR_GMM_OVERCOMMITED_TRY_AGAIN_IN_A_BIT - reset condition
2846 * indicating that we won't necessarily have sufficient RAM to boot
2847 * the VM again and that it should pause until this changes (we'll try
2848 * balloon some other VM). (For standard deflate we have little choice
2849 * but to hope the VM won't use the memory that was returned to it.)
2850 *
2851 * @param pVM Pointer to the shared VM structure.
2852 * @param idCpu VCPU id
2853 * @param enmAction Inflate/deflate/reset
2854 * @param cBalloonedPages The number of pages that were ballooned.
2855 *
2856 * @thread EMT.
2857 */
2858GMMR0DECL(int) GMMR0BalloonedPages(PVM pVM, VMCPUID idCpu, GMMBALLOONACTION enmAction, uint32_t cBalloonedPages)
2859{
2860 LogFlow(("GMMR0BalloonedPages: pVM=%p enmAction=%d cBalloonedPages=%#x\n",
2861 pVM, enmAction, cBalloonedPages));
2862
2863 AssertMsgReturn(cBalloonedPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cBalloonedPages), VERR_INVALID_PARAMETER);
2864
2865 /*
2866 * Validate input and get the basics.
2867 */
2868 PGMM pGMM;
2869 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2870 PGVM pGVM;
2871 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2872 if (RT_FAILURE(rc))
2873 return rc;
2874
2875 /*
2876 * Take the semaphore and do some more validations.
2877 */
2878 rc = RTSemFastMutexRequest(pGMM->Mtx);
2879 AssertRC(rc);
2880 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2881 {
2882 switch (enmAction)
2883 {
2884 case GMMBALLOONACTION_INFLATE:
2885 {
2886 if (pGVM->gmm.s.Allocated.cBasePages >= cBalloonedPages)
2887 {
2888 /*
2889 * Record the ballooned memory.
2890 */
2891 pGMM->cBalloonedPages += cBalloonedPages;
2892 if (pGVM->gmm.s.cReqBalloonedPages)
2893 {
2894 /* Code path never taken. It might be interesting in the future to request ballooned memory from guests in low-memory conditions. */
2895 AssertFailed();
2896
2897 pGVM->gmm.s.cBalloonedPages += cBalloonedPages;
2898 pGVM->gmm.s.cReqActuallyBalloonedPages += cBalloonedPages;
2899 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx Req=%#llx Actual=%#llx (pending)\n", cBalloonedPages,
2900 pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages, pGVM->gmm.s.cReqBalloonedPages, pGVM->gmm.s.cReqActuallyBalloonedPages));
2901 }
2902 else
2903 {
2904 pGVM->gmm.s.cBalloonedPages += cBalloonedPages;
2905 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx (user)\n",
2906 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages));
2907 }
2908 }
2909 else
2910 rc = VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
2911 break;
2912 }
2913
2914 case GMMBALLOONACTION_DEFLATE:
2915 {
2916 /* Deflate. */
2917 if (pGVM->gmm.s.cBalloonedPages >= cBalloonedPages)
2918 {
2919 /*
2920 * Record the ballooned memory.
2921 */
2922 Assert(pGMM->cBalloonedPages >= cBalloonedPages);
2923 pGMM->cBalloonedPages -= cBalloonedPages;
2924 pGVM->gmm.s.cBalloonedPages -= cBalloonedPages;
2925 if (pGVM->gmm.s.cReqDeflatePages)
2926 {
2927 AssertFailed(); /* This path is for later. */
2928 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx Req=%#llx\n",
2929 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages, pGVM->gmm.s.cReqDeflatePages));
2930
2931 /*
2932 * Anything we need to do here now when the request has been completed?
2933 */
2934 pGVM->gmm.s.cReqDeflatePages = 0;
2935 }
2936 else
2937 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx (user)\n",
2938 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages));
2939 }
2940 else
2941 rc = VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH;
2942 break;
2943 }
2944
2945 case GMMBALLOONACTION_RESET:
2946 {
2947 /* Reset to an empty balloon. */
2948 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.cBalloonedPages);
2949
2950 pGMM->cBalloonedPages -= pGVM->gmm.s.cBalloonedPages;
2951 pGVM->gmm.s.cBalloonedPages = 0;
2952 break;
2953 }
2954
2955 default:
2956 rc = VERR_INVALID_PARAMETER;
2957 break;
2958 }
2959 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2960 }
2961 else
2962 rc = VERR_INTERNAL_ERROR_5;
2963
2964 RTSemFastMutexRelease(pGMM->Mtx);
2965 LogFlow(("GMMR0BalloonedPages: returns %Rrc\n", rc));
2966 return rc;
2967}
2968
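/*
 * Illustrative sketch (not part of the original sources): ballooning moves no
 * pages around, it only shifts accounting, which is why the handler above
 * boils down to a pair of counters.  For the inflate case:
 *
 * @code
 *  // the guest can only balloon memory it actually has allocated
 *  if (pGVM->gmm.s.Allocated.cBasePages >= cBalloonedPages)
 *  {
 *      pGMM->cBalloonedPages       += cBalloonedPages;   // global view
 *      pGVM->gmm.s.cBalloonedPages += cBalloonedPages;   // per-VM view
 *  }
 *  // deflate is the mirror image; reset subtracts the per-VM total from the
 *  // global counter and zeroes it.
 * @endcode
 */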
2969
2970/**
2971 * VMMR0 request wrapper for GMMR0BalloonedPages.
2972 *
2973 * @returns see GMMR0BalloonedPages.
2974 * @param pVM Pointer to the shared VM structure.
2975 * @param idCpu VCPU id
2976 * @param pReq The request packet.
2977 */
2978GMMR0DECL(int) GMMR0BalloonedPagesReq(PVM pVM, VMCPUID idCpu, PGMMBALLOONEDPAGESREQ pReq)
2979{
2980 /*
2981 * Validate input and pass it on.
2982 */
2983 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2984 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2985 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMBALLOONEDPAGESREQ),
2986 ("%#x < %#x\n", pReq->Hdr.cbReq, sizeof(GMMBALLOONEDPAGESREQ)),
2987 VERR_INVALID_PARAMETER);
2988
2989 return GMMR0BalloonedPages(pVM, idCpu, pReq->enmAction, pReq->cBalloonedPages);
2990}
2991
2992/**
2993 * Return memory statistics for the hypervisor
2994 *
2995 * @returns VBox status code:
2996 * @param pVM Pointer to the shared VM structure.
2997 * @param pReq The request packet.
2998 */
2999GMMR0DECL(int) GMMR0QueryHypervisorMemoryStatsReq(PVM pVM, PGMMMEMSTATSREQ pReq)
3000{
3001 /*
3002 * Validate input and pass it on.
3003 */
3004 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3005 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3006 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3007 ("%#x < %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3008 VERR_INVALID_PARAMETER);
3009
3010 /*
3011 * Validate input and get the basics.
3012 */
3013 PGMM pGMM;
3014 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3015 pReq->cAllocPages = pGMM->cAllocatedPages;
3016 pReq->cFreePages = (pGMM->cChunks << (GMM_CHUNK_SHIFT - PAGE_SHIFT)) - pGMM->cAllocatedPages;
3017 pReq->cBalloonedPages = pGMM->cBalloonedPages;
3018 pReq->cMaxPages = pGMM->cMaxPages;
3019 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3020
3021 return VINF_SUCCESS;
3022}
3023
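/*
 * Illustrative sketch (not part of the original sources): the cFreePages figure
 * reported above is derived rather than tracked - every chunk contributes a
 * fixed number of pages, so the free count is the total minus what has been
 * handed out.  With the 2MB chunk size that works out as:
 *
 * @code
 *  uint64_t cTotalPages = (uint64_t)pGMM->cChunks << (GMM_CHUNK_SHIFT - PAGE_SHIFT); // cChunks * 512
 *  uint64_t cFreePages  = cTotalPages - pGMM->cAllocatedPages;
 * @endcode
 */
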
3024/**
3025 * Return memory statistics for the VM
3026 *
3027 * @returns VBox status code:
3028 * @param pVM Pointer to the shared VM structure.
3029 * @param idCpu VCPU id
3030 * @param pReq The request packet.
3031 */
3032GMMR0DECL(int) GMMR0QueryMemoryStatsReq(PVM pVM, VMCPUID idCpu, PGMMMEMSTATSREQ pReq)
3033{
3034 /*
3035 * Validate input and pass it on.
3036 */
3037 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3038 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3039 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3040 ("%#x < %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3041 VERR_INVALID_PARAMETER);
3042
3043 /*
3044 * Validate input and get the basics.
3045 */
3046 PGMM pGMM;
3047 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3048 PGVM pGVM;
3049 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3050 if (RT_FAILURE(rc))
3051 return rc;
3052
3053 /*
3054 * Take the semaphore and do some more validations.
3055 */
3056 rc = RTSemFastMutexRequest(pGMM->Mtx);
3057 AssertRC(rc);
3058 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3059 {
3060 pReq->cAllocPages = pGVM->gmm.s.Allocated.cBasePages;
3061 pReq->cBalloonedPages = pGVM->gmm.s.cBalloonedPages;
3062 pReq->cMaxPages = pGVM->gmm.s.Reserved.cBasePages;
3063 pReq->cFreePages = pReq->cMaxPages - pReq->cAllocPages;
3064 }
3065 else
3066 rc = VERR_INTERNAL_ERROR_5;
3067
3068 RTSemFastMutexRelease(pGMM->Mtx);
3069 LogFlow(("GMMR0QueryMemoryStatsReq: returns %Rrc\n", rc));
3070 return rc;
3071}
3072
3073/**
3074 * Unmaps a chunk previously mapped into the address space of the current process.
3075 *
3076 * @returns VBox status code.
3077 * @param pGMM Pointer to the GMM instance data.
3078 * @param pGVM Pointer to the Global VM structure.
3079 * @param pChunk Pointer to the chunk to be unmapped.
3080 */
3081static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
3082{
3083 if (!pGMM->fLegacyAllocationMode)
3084 {
3085 /*
3086 * Find the mapping and try unmapping it.
3087 */
3088 for (uint32_t i = 0; i < pChunk->cMappings; i++)
3089 {
3090 Assert(pChunk->paMappings[i].pGVM && pChunk->paMappings[i].MapObj != NIL_RTR0MEMOBJ);
3091 if (pChunk->paMappings[i].pGVM == pGVM)
3092 {
3093 /* unmap */
3094 int rc = RTR0MemObjFree(pChunk->paMappings[i].MapObj, false /* fFreeMappings (NA) */);
3095 if (RT_SUCCESS(rc))
3096 {
3097 /* update the record. */
3098 pChunk->cMappings--;
3099 if (i < pChunk->cMappings)
3100 pChunk->paMappings[i] = pChunk->paMappings[pChunk->cMappings];
3101 pChunk->paMappings[pChunk->cMappings].MapObj = NIL_RTR0MEMOBJ;
3102 pChunk->paMappings[pChunk->cMappings].pGVM = NULL;
3103 }
3104 return rc;
3105 }
3106 }
3107 }
3108 else if (pChunk->hGVM == pGVM->hSelf)
3109 return VINF_SUCCESS;
3110
3111 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3112 return VERR_GMM_CHUNK_NOT_MAPPED;
3113}
3114
3115
3116/**
3117 * Maps a chunk into the user address space of the current process.
3118 *
3119 * @returns VBox status code.
3120 * @param pGMM Pointer to the GMM instance data.
3121 * @param pGVM Pointer to the Global VM structure.
3122 * @param pChunk Pointer to the chunk to be mapped.
3123 * @param ppvR3 Where to store the ring-3 address of the mapping.
3124 * In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
3125 * contain the address of the existing mapping.
3126 */
3127static int gmmR0MapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
3128{
3129 /*
3130 * If we're in legacy mode this is simple.
3131 */
3132 if (pGMM->fLegacyAllocationMode)
3133 {
3134 if (pChunk->hGVM != pGVM->hSelf)
3135 {
3136 Log(("gmmR0MapChunk: chunk %#x is not owned by the calling VM (hGVM=%#x)!\n", pChunk->Core.Key, pChunk->hGVM));
3137 return VERR_GMM_CHUNK_NOT_FOUND;
3138 }
3139
3140 *ppvR3 = RTR0MemObjAddressR3(pChunk->MemObj);
3141 return VINF_SUCCESS;
3142 }
3143
3144 /*
3145 * Check to see if the chunk is already mapped.
3146 */
3147 for (uint32_t i = 0; i < pChunk->cMappings; i++)
3148 {
3149 Assert(pChunk->paMappings[i].pGVM && pChunk->paMappings[i].MapObj != NIL_RTR0MEMOBJ);
3150 if (pChunk->paMappings[i].pGVM == pGVM)
3151 {
3152 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappings[i].MapObj);
3153 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
3154 return VERR_GMM_CHUNK_ALREADY_MAPPED;
3155 }
3156 }
3157
3158 /*
3159 * Do the mapping.
3160 */
3161 RTR0MEMOBJ MapObj;
3162 int rc = RTR0MemObjMapUser(&MapObj, pChunk->MemObj, (RTR3PTR)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
3163 if (RT_SUCCESS(rc))
3164 {
3165 /* reallocate the array? */
3166 if ((pChunk->cMappings & 1 /*7*/) == 0)
3167 {
3168 void *pvMappings = RTMemRealloc(pChunk->paMappings, (pChunk->cMappings + 2 /*8*/) * sizeof(pChunk->paMappings[0]));
3169 if (RT_UNLIKELY(!pvMappings))
3170 {
3171 rc = RTR0MemObjFree(MapObj, false /* fFreeMappings (NA) */);
3172 AssertRC(rc);
3173 return VERR_NO_MEMORY;
3174 }
3175 pChunk->paMappings = (PGMMCHUNKMAP)pvMappings;
3176 }
3177
3178 /* insert new entry */
3179 pChunk->paMappings[pChunk->cMappings].MapObj = MapObj;
3180 pChunk->paMappings[pChunk->cMappings].pGVM = pGVM;
3181 pChunk->cMappings++;
3182
3183 *ppvR3 = RTR0MemObjAddressR3(MapObj);
3184 }
3185
3186 return rc;
3187}
3188
3189
3190/**
3191 * Map a chunk and/or unmap another chunk.
3192 *
3193 * The mapping and unmapping applies to the current process.
3194 *
3195 * This API does two things because it saves a kernel call per mapping
3196 * when the ring-3 mapping cache is full.
3197 *
3198 * @returns VBox status code.
3199 * @param pVM The VM.
3200 * @param idCpu VCPU id
3201 * @param idChunkMap The chunk to map. NIL_GMM_CHUNKID if nothing to map.
3202 * @param idChunkUnmap The chunk to unmap. NIL_GMM_CHUNKID if nothing to unmap.
3203 * @param ppvR3 Where to store the address of the mapped chunk. NULL is ok if nothing to map.
3204 * @thread EMT
3205 */
3206GMMR0DECL(int) GMMR0MapUnmapChunk(PVM pVM, VMCPUID idCpu, uint32_t idChunkMap, uint32_t idChunkUnmap, PRTR3PTR ppvR3)
3207{
3208 LogFlow(("GMMR0MapUnmapChunk: pVM=%p idChunkMap=%#x idChunkUnmap=%#x ppvR3=%p\n",
3209 pVM, idChunkMap, idChunkUnmap, ppvR3));
3210
3211 /*
3212 * Validate input and get the basics.
3213 */
3214 PGMM pGMM;
3215 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3216 PGVM pGVM;
3217 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3218 if (RT_FAILURE(rc))
3219 return rc;
3220
3221 AssertCompile(NIL_GMM_CHUNKID == 0);
3222 AssertMsgReturn(idChunkMap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkMap), VERR_INVALID_PARAMETER);
3223 AssertMsgReturn(idChunkUnmap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkUnmap), VERR_INVALID_PARAMETER);
3224
3225 if ( idChunkMap == NIL_GMM_CHUNKID
3226 && idChunkUnmap == NIL_GMM_CHUNKID)
3227 return VERR_INVALID_PARAMETER;
3228
3229 if (idChunkMap != NIL_GMM_CHUNKID)
3230 {
3231 AssertPtrReturn(ppvR3, VERR_INVALID_POINTER);
3232 *ppvR3 = NIL_RTR3PTR;
3233 }
3234
3235 /*
3236 * Take the semaphore and do the work.
3237 *
3238 * The unmapping is done last since it's easier to undo a mapping than
3239 * to undo an unmapping. The ring-3 mapping cache cannot be so big
3240 * that it pushes the user virtual address space to within a chunk of
3241 * its limits, so no problem here.
3242 */
3243 rc = RTSemFastMutexRequest(pGMM->Mtx);
3244 AssertRC(rc);
3245 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3246 {
3247 PGMMCHUNK pMap = NULL;
3248 if (idChunkMap != NIL_GMM_CHUNKID)
3249 {
3250 pMap = gmmR0GetChunk(pGMM, idChunkMap);
3251 if (RT_LIKELY(pMap))
3252 rc = gmmR0MapChunk(pGMM, pGVM, pMap, ppvR3);
3253 else
3254 {
3255 Log(("GMMR0MapUnmapChunk: idChunkMap=%#x\n", idChunkMap));
3256 rc = VERR_GMM_CHUNK_NOT_FOUND;
3257 }
3258 }
3259
3260 if ( idChunkUnmap != NIL_GMM_CHUNKID
3261 && RT_SUCCESS(rc))
3262 {
3263 PGMMCHUNK pUnmap = gmmR0GetChunk(pGMM, idChunkUnmap);
3264 if (RT_LIKELY(pUnmap))
3265 rc = gmmR0UnmapChunk(pGMM, pGVM, pUnmap);
3266 else
3267 {
3268 Log(("GMMR0MapUnmapChunk: idChunkUnmap=%#x\n", idChunkUnmap));
3269 rc = VERR_GMM_CHUNK_NOT_FOUND;
3270 }
3271
3272 if (RT_FAILURE(rc) && pMap)
3273 gmmR0UnmapChunk(pGMM, pGVM, pMap);
3274 }
3275
3276 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3277 }
3278 else
3279 rc = VERR_INTERNAL_ERROR_5;
3280 RTSemFastMutexRelease(pGMM->Mtx);
3281
3282 LogFlow(("GMMR0MapUnmapChunk: returns %Rrc\n", rc));
3283 return rc;
3284}
3285
3286
3287/**
3288 * VMMR0 request wrapper for GMMR0MapUnmapChunk.
3289 *
3290 * @returns see GMMR0MapUnmapChunk.
3291 * @param pVM Pointer to the shared VM structure.
3292 * @param idCpu VCPU id
3293 * @param pReq The request packet.
3294 */
3295GMMR0DECL(int) GMMR0MapUnmapChunkReq(PVM pVM, VMCPUID idCpu, PGMMMAPUNMAPCHUNKREQ pReq)
3296{
3297 /*
3298 * Validate input and pass it on.
3299 */
3300 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3301 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3302 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
3303
3304 return GMMR0MapUnmapChunk(pVM, idCpu, pReq->idChunkMap, pReq->idChunkUnmap, &pReq->pvR3);
3305}
3306
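/*
 * Illustrative sketch (not part of the original sources): because the wrapper
 * above carries both a map and an unmap ID, ring-3 can recycle a slot in its
 * chunk mapping cache with a single call.  A caller evicting idOld in favour
 * of idNew (both placeholder names) would fill the packet roughly like this,
 * the usual request header initialisation again elided:
 *
 * @code
 *  GMMMAPUNMAPCHUNKREQ Req;
 *  Req.Hdr.cbReq    = sizeof(Req);
 *  Req.idChunkMap   = idNew;           // NIL_GMM_CHUNKID if there is nothing to map
 *  Req.idChunkUnmap = idOld;           // NIL_GMM_CHUNKID if there is nothing to unmap
 *  Req.pvR3         = NIL_RTR3PTR;     // receives the ring-3 address of idNew
 *  // submit through the VMMR0 request path; on success Req.pvR3 points at idNew.
 * @endcode
 */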
3307
3308/**
3309 * Legacy mode API for supplying pages.
3310 *
3311 * The specified user address points to an allocation chunk sized block that
3312 * will be locked down and used by the GMM when it is asked for pages.
3313 *
3314 * @returns VBox status code.
3315 * @param pVM The VM.
3316 * @param idCpu VCPU id
3317 * @param pvR3 Pointer to the chunk size memory block to lock down.
3318 */
3319GMMR0DECL(int) GMMR0SeedChunk(PVM pVM, VMCPUID idCpu, RTR3PTR pvR3)
3320{
3321 /*
3322 * Validate input and get the basics.
3323 */
3324 PGMM pGMM;
3325 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3326 PGVM pGVM;
3327 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3328 if (RT_FAILURE(rc))
3329 return rc;
3330
3331 AssertPtrReturn(pvR3, VERR_INVALID_POINTER);
3332 AssertReturn(!(PAGE_OFFSET_MASK & pvR3), VERR_INVALID_POINTER);
3333
3334 if (!pGMM->fLegacyAllocationMode)
3335 {
3336 Log(("GMMR0SeedChunk: not in legacy allocation mode!\n"));
3337 return VERR_NOT_SUPPORTED;
3338 }
3339
3340 /*
3341 * Lock the memory before taking the semaphore.
3342 */
3343 RTR0MEMOBJ MemObj;
3344 rc = RTR0MemObjLockUser(&MemObj, pvR3, GMM_CHUNK_SIZE, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
3345 if (RT_SUCCESS(rc))
3346 {
3347 /* Grab the lock. */
3348 rc = RTSemFastMutexRequest(pGMM->Mtx);
3349 AssertRCReturn(rc, rc);
3350
3351 /*
3352 * Add a new chunk with our hGVM.
3353 */
3354 rc = gmmR0RegisterChunk(pGMM, &pGMM->Private, MemObj, pGVM->hSelf, GMMCHUNKTYPE_NON_CONTINUOUS);
3355 RTSemFastMutexRelease(pGMM->Mtx);
3356
3357 if (RT_FAILURE(rc))
3358 RTR0MemObjFree(MemObj, false /* fFreeMappings */);
3359 }
3360
3361 LogFlow(("GMMR0SeedChunk: rc=%d (pvR3=%p)\n", rc, pvR3));
3362 return rc;
3363}
3364
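/*
 * Illustrative sketch (not part of the original sources): in legacy allocation
 * mode ring-3 provides the backing memory, so the counterpart of GMMR0SeedChunk
 * is a page aligned, chunk sized ring-3 allocation that is then handed down
 * (the ring-3 wrapper name below is hypothetical, and the call path through
 * VMMR0 is elided):
 *
 * @code
 *  void *pvChunk = RTMemPageAlloc(GMM_CHUNK_SIZE);    // page aligned by definition
 *  if (pvChunk)
 *      rc = GMMR3SeedChunk(pVM, (RTR3PTR)pvChunk);    // hypothetical ring-3 wrapper
 * @endcode
 */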
3365
3366/**
3367 * Registers a new shared module for the VM
3368 *
3369 * @returns VBox status code.
3370 * @param pVM VM handle
3371 * @param idCpu VCPU id
3372 * @param pszModuleName Module name
3373 * @param pszVersion Module version
3374 * @param GCBaseAddr Module base address
3375 * @param cbModule Module size
3376 * @param cRegions Number of shared region descriptors
3377 * @param pRegions Shared region(s)
3378 */
3379GMMR0DECL(int) GMMR0RegisterSharedModule(PVM pVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion, RTGCPTR GCBaseAddr, uint32_t cbModule,
3380 unsigned cRegions, VMMDEVSHAREDREGIONDESC *pRegions)
3381{
3382#ifdef VBOX_WITH_PAGE_SHARING
3383 /*
3384 * Validate input and get the basics.
3385 */
3386 PGMM pGMM;
3387 GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3388 PGVM pGVM;
3389 int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3390 if (RT_FAILURE(rc))
3391 return rc;
3392
3393 /*
3394     * Take the semaphore and do some more validations.
3395 */
3396 rc = RTSemFastMutexRequest(pGMM->Mtx);
3397    AssertRCReturn(rc, rc);
3398 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3399 {
3400 /* Check if this module was already globally registered. */
3401 PGMMSHAREDMODULE pRec = (PGMMSHAREDMODULE)RTAvlGCPtrGet(&pGMM->pSharedModuleTree, GCBaseAddr);
3402 if (!pRec)
3403 {
3404            pRec = (PGMMSHAREDMODULE)RTMemAllocZ(RT_OFFSETOF(GMMSHAREDMODULE, aRegions[cRegions]));
            AssertReturnStmt(pRec, RTSemFastMutexRelease(pGMM->Mtx), VERR_NO_MEMORY); /* don't leak the fast mutex if the allocation fails */
3405            pRec->Core.Key = GCBaseAddr;
3406 pRec->cbModule = cbModule;
3407 /* Input limit already safe; no need to check again. */
3408 strcpy(pRec->szName, pszModuleName);
3409 strcpy(pRec->szVersion, pszVersion);
3410
3411 pRec->cRegions = cRegions;
3412
3413 for (unsigned i = 0; i < cRegions; i++)
3414 pRec->aRegions[i] = pRegions[i];
3415
3416 /** @todo references to pages */
3417 }
3418        else
3419        {
            /* Module already registered globally; nothing further to do here yet. */
3420        }
3421
3422 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3423 }
3424 else
3425 rc = VERR_INTERNAL_ERROR_5;
3426
3427 RTSemFastMutexRelease(pGMM->Mtx);
3428 return rc;
3429#else
3430 return VERR_NOT_IMPLEMENTED;
3431#endif
3432}
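
/*
 * Illustrative sketch of how a ring-3 caller might build the registration
 * request consumed by the wrapper below.  The request field names are taken
 * from GMMR0RegisterSharedModuleReq; the VMMR0_DO_GMM_REGISTER_SHARED_MODULE
 * operation and the VMMDEVSHAREDREGIONDESC member names are assumptions, and
 * GCPtrModuleBase plus the module/region values are made up.
 *
 * @code
 *      uint32_t cbReq = RT_UOFFSETOF(GMMREGISTERSHAREDMODULEREQ, aRegions[1]);
 *      PGMMREGISTERSHAREDMODULEREQ pReq = (PGMMREGISTERSHAREDMODULEREQ)RTMemAllocZ(cbReq);
 *      pReq->Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;
 *      pReq->Hdr.cbReq    = cbReq;
 *      strcpy(pReq->szName, "VBoxGuest.sys");     // must fit the fixed-size name buffer
 *      strcpy(pReq->szVersion, "3.2.0");          // ditto for the version buffer
 *      pReq->GCBaseAddr   = GCPtrModuleBase;
 *      pReq->cbModule     = _64K;
 *      pReq->cRegions     = 1;
 *      pReq->aRegions[0].GCRegionAddr = GCPtrModuleBase;
 *      pReq->aRegions[0].cbRegion     = PAGE_SIZE;
 *      int rc = SUPR3CallVMMR0Ex(pVM->pVMR0, idCpu, VMMR0_DO_GMM_REGISTER_SHARED_MODULE, 0, &pReq->Hdr);
 *      RTMemFree(pReq);
 * @endcode
 */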
3433
3434
3435/**
3436 * VMMR0 request wrapper for GMMR0RegisterSharedModule.
3437 *
3438 * @returns see GMMR0RegisterSharedModule.
3439 * @param pVM Pointer to the shared VM structure.
3440 * @param idCpu VCPU id
3441 * @param pReq The request packet.
3442 */
3443GMMR0DECL(int) GMMR0RegisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMREGISTERSHAREDMODULEREQ pReq)
3444{
3445 /*
3446 * Validate input and pass it on.
3447 */
3448 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3449 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3450    AssertMsgReturn(pReq->Hdr.cbReq >= sizeof(*pReq) && pReq->Hdr.cbReq == RT_UOFFSETOF(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions]), ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions])), VERR_INVALID_PARAMETER);
3451
3452 return GMMR0RegisterSharedModule(pVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule, pReq->cRegions, pReq->aRegions);
3453}
3454
3455/**
3456 * Unregisters a shared module for the VM.
3457 *
3458 * @returns VBox status code.
3459 * @param pVM VM handle
3460 * @param idCpu VCPU id
3461 * @param pszModuleName Module name
3462 * @param pszVersion Module version
3463 * @param GCBaseAddr Module base address
3464 * @param cbModule Module size
3465 */
3466GMMR0DECL(int) GMMR0UnregisterSharedModule(PVM pVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion, RTGCPTR GCBaseAddr, uint32_t cbModule)
3467{
3468#ifdef VBOX_WITH_PAGE_SHARING
3469 return VERR_NOT_IMPLEMENTED;
3470#else
3471 return VERR_NOT_IMPLEMENTED;
3472#endif
3473}
3474
3475/**
3476 * VMMR0 request wrapper for GMMR0UnregisterSharedModule.
3477 *
3478 * @returns see GMMR0UnregisterSharedModule.
3479 * @param pVM Pointer to the shared VM structure.
3480 * @param idCpu VCPU id
3481 * @param pReq The request packet.
3482 */
3483GMMR0DECL(int) GMMR0UnregisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMUNREGISTERSHAREDMODULEREQ pReq)
3484{
3485 /*
3486 * Validate input and pass it on.
3487 */
3488 AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3489 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3490 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
3491
3492 return GMMR0UnregisterSharedModule(pVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule);
3493}
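
/*
 * Illustrative sketch of the matching unregistration request.  Field names are
 * taken from GMMR0UnregisterSharedModuleReq above; the
 * VMMR0_DO_GMM_UNREGISTER_SHARED_MODULE operation is an assumption and the
 * module values are placeholders.
 *
 * @code
 *      GMMUNREGISTERSHAREDMODULEREQ Req;
 *      Req.Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;
 *      Req.Hdr.cbReq    = sizeof(Req);
 *      strcpy(Req.szName, "VBoxGuest.sys");
 *      strcpy(Req.szVersion, "3.2.0");
 *      Req.GCBaseAddr   = GCPtrModuleBase;
 *      Req.cbModule     = _64K;
 *      int rc = SUPR3CallVMMR0Ex(pVM->pVMR0, idCpu, VMMR0_DO_GMM_UNREGISTER_SHARED_MODULE, 0, &Req.Hdr);
 * @endcode
 */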
3494
3495
3496/**
3497 * Checks registered modules for shared pages.
3498 *
3499 * @returns VBox status code.
3500 * @param pVM VM handle
3501 * @param idCpu VCPU id
3502 */
3503GMMR0DECL(int) GMMR0CheckSharedModules(PVM pVM, VMCPUID idCpu)
3504{
3505#ifdef VBOX_WITH_PAGE_SHARING
3506 return VERR_NOT_IMPLEMENTED;
3507#else
3508 return VERR_NOT_IMPLEMENTED;
3509#endif
3510}