GMMR0.cpp@ 37206

Last change on this file since 37206 was 37206, checked in by vboxsync, 14 years ago
GMMR0: Simplified the cleanup, let the VMs work in parallel since. (fixed redo-from-start bug)
Property svn:eol-style set to `native` Property svn:keywords set to `Id`
File size: 167.0 KB

1	/* $Id: GMMR0.cpp 37206 2011-05-24 18:43:32Z vboxsync $ */
2	/** @file
3	* GMM - Global Memory Manager.
4	*/
5
6	/*
7	* Copyright (C) 2007-2011 Oracle Corporation
8	*
9	* This file is part of VirtualBox Open Source Edition (OSE), as
10	* available from http://www.virtualbox.org. This file is free software;
11	* you can redistribute it and/or modify it under the terms of the GNU
12	* General Public License (GPL) as published by the Free Software
13	* Foundation, in version 2 as it comes in the "COPYING" file of the
14	* VirtualBox OSE distribution. VirtualBox OSE is distributed in the
15	* hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
16	*/
17
18
19	/** @page pg_gmm GMM - The Global Memory Manager
20	*
21	* As the name indicates, this component is responsible for global memory
22	* management. Currently only guest RAM is allocated from the GMM, but this
23	* may change to include shadow page tables and other bits later.
24	*
25	* Guest RAM is managed as individual pages, but allocated from the host OS
26	* in chunks for reasons of portability / efficiency. To minimize the memory
27	* footprint all tracking structures must be as small as possible without
28	* unnecessary performance penalties.
29	*
30	* The allocation chunks have a fixed size, defined at compile time
31	* by the #GMM_CHUNK_SIZE \#define.
32	*
33	* Each chunk is given a unique ID. Each page also has a unique ID. The
34	* relationship between the two IDs is:
35	* @code
36	* GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
37	* idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
38	* @endcode
39	* Where iPage is the index of the page within the chunk. This ID scheme
40	* permits efficient chunk and page lookup, but it relies on the chunk size
41	* being set at compile time. The chunks are organized in an AVL tree with their
42	* IDs being the keys.
43	*
44	* The physical address of each page in an allocation chunk is maintained by
45	* the #RTR0MEMOBJ and obtained using #RTR0MemObjGetPagePhysAddr. There is no
46	* need to duplicate this information (it would cost 8 bytes per page if we did).
47	*
48	* So what do we need to track per page? Most importantly we need to know
49	* which state the page is in:
50	* - Private - Allocated for (eventually) backing one particular VM page.
51	* - Shared - Readonly page that is used by one or more VMs and treated
52	* as COW by PGM.
53	* - Free - Not used by anyone.
54	*
55	* For the page replacement operations (sharing, defragmenting and freeing)
56	* to be somewhat efficient, private pages need to be associated with a
57	* particular page in a particular VM.
58	*
59	* Tracking the usage of shared pages is impractical and expensive, so we'll
60	* settle for a reference counting system instead.
61	*
62	* Free pages will be chained on LIFOs.
63	*
64	* On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit
65	* systems a 32-bit bitfield will have to suffice because of address space
66	* limitations. The #GMMPAGE structure shows the details.
67	*
68	*
69	* @section sec_gmm_alloc_strat Page Allocation Strategy
70	*
71	* The strategy for allocating pages has to take fragmentation and shared
72	* pages into account, or we may end up with 2000 chunks with only
73	* a few pages in each. Shared pages cannot easily be reallocated because
74	* of the inaccurate usage accounting (see above). Private pages can be
75	* reallocated by a defragmentation thread in the same manner that sharing
76	* is done.
77	*
78	* The first approach is to manage the free pages in two sets depending on
79	* whether they are mainly for the allocation of shared or private pages.
80	* In the initial implementation there will be almost no possibility for
81	* mixing shared and private pages in the same chunk (only if we're really
82	* stressed on memory), but when we implement forking of VMs and have to
83	* deal with lots of COW pages it'll start getting kind of interesting.
84	*
85	* The sets are lists of chunks with approximately the same number of
86	* free pages. Say the chunk size is 1MB, meaning 256 pages, and a set
87	* consists of 16 lists. Each list then covers a band of 16 free-page counts:
88	* the first holds the chunks with the fewest free pages, and so on. The
89	* chunks will be moved between the lists as pages are freed up or allocated.
90	*
91	*
92	* @section sec_gmm_costs Costs
93	*
94	* The per page cost in kernel space is 32-bit plus whatever RTR0MEMOBJ
95	* entails. In addition there is the chunk cost of approximately
96	* (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page.
97	*
98	* On Windows the per page #RTR0MEMOBJ cost is 32-bit on 32-bit Windows
99	* and 64-bit on 64-bit Windows (a PFN_NUMBER in the MDL). So, 64-bit per page.
100	* The cost on Linux is identical, but here it's because of sizeof(struct page *).
101	*
102	*
103	* @section sec_gmm_legacy Legacy Mode for Non-Tier-1 Platforms
104	*
105	* In legacy mode the page source is locked user pages and not
106	* #RTR0MemObjAllocPhysNC; this means that a page can only be allocated
107	* by the VM that locked it. We will make no attempt at implementing
108	* page sharing on these systems, just do enough to make it all work.
109	*
110	*
111	* @subsection sub_gmm_locking Serializing
112	*
113	* One simple fast mutex will be employed in the initial implementation, not
114	* two as mentioned in @ref subsec_pgmPhys_Serializing.
115	*
116	* @see @ref subsec_pgmPhys_Serializing
117	*
118	*
119	* @section sec_gmm_overcommit Memory Over-Commitment Management
120	*
121	* The GVM will have to do the system wide memory over-commitment
122	* management. My current ideas are:
123	* - Per VM oc policy that indicates how much to initially commit
124	* to it and what to do in an out-of-memory situation.
125	* - Prevent overtaxing the host.
126	*
127	* There are some challenges here, the main ones are configurability and
128	* security. Should we for instance permit anyone to request 100% memory
129	* commitment? Who should be allowed to do runtime adjustments of the
130	* config? And how do we prevent these settings from being lost when the last
131	* VM process exits? The solution is probably to have an optional root
132	* daemon that will keep VMMR0.r0 in memory and enable the security measures.
133	*
134	*
135	*
136	* @section sec_gmm_numa NUMA
137	*
138	* NUMA considerations will be designed and implemented a bit later.
139	*
140	* The preliminary guess is that we will have to try to allocate memory as
141	* close as possible to the CPUs the VM is executed on (EMT and additional CPU
142	* threads), which means it's mostly about allocation and sharing policies.
143	* Both the scheduler and allocator interface will have to supply some NUMA info
144	* and we'll need to have a way to calculate access costs.
145	*
146	*/
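/*
 * Illustration only, not part of GMMR0.cpp: a minimal sketch of the page ID
 * scheme described in the page above. The names in the sketch namespace and
 * the constants (4 KiB pages, 1 MiB chunks, matching the 1MB example used in
 * the allocation strategy section) are assumptions; the real values come from
 * GMM_CHUNK_SIZE and PAGE_SIZE.
 */

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Assumed values for illustration; not taken from the GMM headers.
constexpr uint32_t kPageShift   = 12;                        // 4 KiB pages
constexpr uint32_t kChunkSize   = 1u << 20;                  // 1 MiB chunks
constexpr uint32_t kChunkShift  = 20 - kPageShift;           // log2(kChunkSize / page size) = 8
constexpr uint32_t kPageIdxMask = (1u << kChunkShift) - 1;   // low bits hold the page index

// idPage = (idChunk << shift) | iPage
constexpr uint32_t makePageId(uint32_t idChunk, uint32_t iPage)
{
    return (idChunk << kChunkShift) | iPage;
}

// Recover the chunk ID (the AVL tree key) from a page ID.
constexpr uint32_t chunkIdOf(uint32_t idPage)
{
    return idPage >> kChunkShift;
}

// Recover the index of the page within its chunk.
constexpr uint32_t pageIndexOf(uint32_t idPage)
{
    return idPage & kPageIdxMask;
}

} // namespace sketch
```

/*
 * Because the shift is a compile-time constant, both lookups reduce to a
 * shift and a mask, which is why the scheme depends on a fixed chunk size.
 */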
147
148
149	/*******************************************************************************
150	* Header Files *
151	*******************************************************************************/
152	#define LOG_GROUP LOG_GROUP_GMM
153	#include <VBox/rawpci.h>
154	#include <VBox/vmm/vm.h>
155	#include <VBox/vmm/gmm.h>
156	#include "GMMR0Internal.h"
157	#include <VBox/vmm/gvm.h>
158	#include <VBox/vmm/pgm.h>
159	#include <VBox/log.h>
160	#include <VBox/param.h>
161	#include <VBox/err.h>
162	#include <iprt/asm.h>
163	#include <iprt/avl.h>
164	#include <iprt/list.h>
165	#include <iprt/mem.h>
166	#include <iprt/memobj.h>
167	#include <iprt/semaphore.h>
168	#include <iprt/string.h>
169	#include <iprt/time.h>
170
171
172	/*******************************************************************************
173	* Structures and Typedefs *
174	*******************************************************************************/
175	/** Pointer to set of free chunks. */
176	typedef struct GMMCHUNKFREESET *PGMMCHUNKFREESET;
177
178	/** Pointer to a GMM allocation chunk. */
179	typedef struct GMMCHUNK *PGMMCHUNK;
180
181	/**
182	* The per-page tracking structure employed by the GMM.
183	*
184	* On 32-bit hosts some trickery is necessary to compress all
185	* the information into 32-bits. When the fSharedFree member is set,
186	* the 30th bit decides whether it's a free page or not.
187	*
188	* Because of the different layout on 32-bit and 64-bit hosts, macros
189	* are used to get and set some of the data.
190	*/
191	typedef union GMMPAGE
192	{
193	#if HC_ARCH_BITS == 64
194	/** Unsigned integer view. */
195	uint64_t u;
196
197	/** The common view. */
198	struct GMMPAGECOMMON
199	{
200	uint32_t uStuff1 : 32;
201	uint32_t uStuff2 : 30;
202	/** The page state. */
203	uint32_t u2State : 2;
204	} Common;
205
206	/** The view of a private page. */
207	struct GMMPAGEPRIVATE
208	{
209	/** The guest page frame number. (Max addressable: 2 ^ 44 - 16) */
210	uint32_t pfn;
211	/** The GVM handle. (64K VMs) */
212	uint32_t hGVM : 16;
213	/** Reserved. */
214	uint32_t u16Reserved : 14;
215	/** The page state. */
216	uint32_t u2State : 2;
217	} Private;
218
219	/** The view of a shared page. */
220	struct GMMPAGESHARED
221	{
222	/** The host page frame number. (Max addressable: 2 ^ 44 - 16) */
223	uint32_t pfn;
224	/** The reference count (64K VMs). */
225	uint32_t cRefs : 16;
226	/** Reserved. Checksum or something? Two hGVMs for forking? */
227	uint32_t u14Reserved : 14;
228	/** The page state. */
229	uint32_t u2State : 2;
230	} Shared;
231
232	/** The view of a free page. */
233	struct GMMPAGEFREE
234	{
235	/** The index of the next page in the free list. UINT16_MAX is NIL. */
236	uint16_t iNext;
237	/** Reserved. Checksum or something? */
238	uint16_t u16Reserved0;
239	/** Reserved. Checksum or something? */
240	uint32_t u30Reserved1 : 30;
241	/** The page state. */
242	uint32_t u2State : 2;
243	} Free;
244
245	#else /* 32-bit */
246	/** Unsigned integer view. */
247	uint32_t u;
248
249	/** The common view. */
250	struct GMMPAGECOMMON
251	{
252	uint32_t uStuff : 30;
253	/** The page state. */
254	uint32_t u2State : 2;
255	} Common;
256
257	/** The view of a private page. */
258	struct GMMPAGEPRIVATE
259	{
260	/** The guest page frame number. (Max addressable: 2 ^ 36) */
261	uint32_t pfn : 24;
262	/** The GVM handle. (127 VMs) */
263	uint32_t hGVM : 7;
264	/** The top page state bit, MBZ. */
265	uint32_t fZero : 1;
266	} Private;
267
268	/** The view of a shared page. */
269	struct GMMPAGESHARED
270	{
271	/** The reference count. */
272	uint32_t cRefs : 30;
273	/** The page state. */
274	uint32_t u2State : 2;
275	} Shared;
276
277	/** The view of a free page. */
278	struct GMMPAGEFREE
279	{
280	/** The index of the next page in the free list. UINT16_MAX is NIL. */
281	uint32_t iNext : 16;
282	/** Reserved. Checksum or something? */
283	uint32_t u14Reserved : 14;
284	/** The page state. */
285	uint32_t u2State : 2;
286	} Free;
287	#endif
288	} GMMPAGE;
289	AssertCompileSize(GMMPAGE, sizeof(RTHCUINTPTR));
290	/** Pointer to a GMMPAGE. */
291	typedef GMMPAGE *PGMMPAGE;
292
293
294	/** @name The Page States.
295	* @{ */
296	/** A private page. */
297	#define GMM_PAGE_STATE_PRIVATE 0
298	/** A private page - alternative value used on the 32-bit implementation.
299	* This will never be used on 64-bit hosts. */
300	#define GMM_PAGE_STATE_PRIVATE_32 1
301	/** A shared page. */
302	#define GMM_PAGE_STATE_SHARED 2
303	/** A free page. */
304	#define GMM_PAGE_STATE_FREE 3
305	/** @} */
306
307
308	/** @def GMM_PAGE_IS_PRIVATE
309	*
310	* @returns true if private, false if not.
311	* @param pPage The GMM page.
312	*/
313	#if HC_ARCH_BITS == 64
314	# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_PRIVATE )
315	#else
316	# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Private.fZero == 0 )
317	#endif
318
319	/** @def GMM_PAGE_IS_SHARED
320	*
321	* @returns true if shared, false if not.
322	* @param pPage The GMM page.
323	*/
324	#define GMM_PAGE_IS_SHARED(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_SHARED )
325
326	/** @def GMM_PAGE_IS_FREE
327	*
328	* @returns true if free, false if not.
329	* @param pPage The GMM page.
330	*/
331	#define GMM_PAGE_IS_FREE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_FREE )
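/*
 * Illustration only, not part of GMMR0.cpp: a sketch of how the 64-bit page
 * word keeps the state in its top two bits. It uses explicit shifts and masks
 * instead of the union bitfields above, because bitfield layout is compiler
 * specific. The bit positions (pfn in bits 0-31, handle or reference count in
 * bits 32-47, state in bits 62-63) are an assumption about a typical
 * little-endian layout, and all names in the sketch namespace are hypothetical.
 */

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Same numeric values as the GMM_PAGE_STATE_XXX defines (64-bit variants only).
enum PageState : uint64_t { kPrivate = 0, kShared = 2, kFree = 3 };

// Pack a state, a 32-bit page frame number and a 16-bit handle/refcount.
constexpr uint64_t pack(PageState enmState, uint32_t pfn, uint16_t u16)
{
    return uint64_t(pfn) | (uint64_t(u16) << 32) | (uint64_t(enmState) << 62);
}

// The state is always readable without knowing which view applies.
constexpr uint64_t stateOf(uint64_t u)  { return u >> 62; }
constexpr uint32_t pfnOf(uint64_t u)    { return uint32_t(u & 0xffffffffu); }
constexpr uint16_t fieldOf(uint64_t u)  { return uint16_t((u >> 32) & 0xffffu); }

constexpr bool isPrivate(uint64_t u) { return stateOf(u) == kPrivate; }
constexpr bool isShared(uint64_t u)  { return stateOf(u) == kShared; }
constexpr bool isFree(uint64_t u)    { return stateOf(u) == kFree; }

} // namespace sketch
```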
332
333	/** @def GMM_PAGE_PFN_LAST
334	* The last valid guest PFN.
335	* @remark Some of the values outside the range have special meaning,
336	* see GMM_PAGE_PFN_UNSHAREABLE.
337	*/
338	#if HC_ARCH_BITS == 64
339	# define GMM_PAGE_PFN_LAST UINT32_C(0xfffffff0)
340	#else
341	# define GMM_PAGE_PFN_LAST UINT32_C(0x00fffff0)
342	#endif
343	AssertCompile(GMM_PAGE_PFN_LAST == (GMM_GCPHYS_LAST >> PAGE_SHIFT));
344
345	/** @def GMM_PAGE_PFN_UNSHAREABLE
346	* Indicates that this page isn't used for normal guest memory and thus isn't shareable.
347	*/
348	#if HC_ARCH_BITS == 64
349	# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0xfffffff1)
350	#else
351	# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0x00fffff1)
352	#endif
353	AssertCompile(GMM_PAGE_PFN_UNSHAREABLE == (GMM_GCPHYS_UNSHAREABLE >> PAGE_SHIFT));
354
355
356	/**
357	* A GMM allocation chunk ring-3 mapping record.
358	*
359	* This should really be associated with a session and not a VM, but
360	* it's simpler to associate it with a VM and clean up when the VM object
361	* is destroyed.
362	*/
363	typedef struct GMMCHUNKMAP
364	{
365	/** The mapping object. */
366	RTR0MEMOBJ hMapObj;
367	/** The VM owning the mapping. */
368	PGVM pGVM;
369	} GMMCHUNKMAP;
370	/** Pointer to a GMM allocation chunk mapping. */
371	typedef struct GMMCHUNKMAP *PGMMCHUNKMAP;
372
373
374	/**
375	* A GMM allocation chunk.
376	*/
377	typedef struct GMMCHUNK
378	{
379	/** The AVL node core.
380	* The Key is the chunk ID. (Giant mtx.) */
381	AVLU32NODECORE Core;
382	/** The memory object.
383	* Either from RTR0MemObjAllocPhysNC or RTR0MemObjLockUser depending on
384	* what the host can dish up. (Chunk mtx protects mapping accesses
385	* and related frees.) */
386	RTR0MEMOBJ hMemObj;
387	/** Pointer to the next chunk in the free list. (Giant mtx.) */
388	PGMMCHUNK pFreeNext;
389	/** Pointer to the previous chunk in the free list. (Giant mtx.) */
390	PGMMCHUNK pFreePrev;
391	/** Pointer to the free set this chunk belongs to. NULL for
392	* chunks with no free pages. (Giant mtx.) */
393	PGMMCHUNKFREESET pSet;
394	/** List node in the chunk list (GMM::ChunkList). (Giant mtx.) */
395	RTLISTNODE ListNode;
396	/** Pointer to an array of mappings. (Chunk mtx.) */
397	PGMMCHUNKMAP paMappingsX;
398	/** The number of mappings. (Chunk mtx.) */
399	uint16_t cMappingsX;
400	/** The mapping lock this chunk is using. UINT8_MAX if nobody is
401	* mapping or freeing anything. (Giant mtx.) */
402	uint8_t volatile iChunkMtx;
403	/** Flags field reserved for future use (like eliminating enmType).
404	* (Giant mtx.) */
405	uint8_t fFlags;
406	/** The head of the list of free pages. UINT16_MAX is the NIL value.
407	* (Giant mtx.) */
408	uint16_t iFreeHead;
409	/** The number of free pages. (Giant mtx.) */
410	uint16_t cFree;
411	/** The GVM handle of the VM that first allocated pages from this chunk, this
412	* is used as a preference when there are several chunks to choose from.
413	* When in bound memory mode this isn't a preference any longer. (Giant
414	* mtx.) */
415	uint16_t hGVM;
416	/** The ID of the NUMA node the memory mostly resides on. (Reserved for
417	* future use.) (Giant mtx.) */
418	uint16_t idNumaNode;
419	/** The number of private pages. (Giant mtx.) */
420	uint16_t cPrivate;
421	/** The number of shared pages. (Giant mtx.) */
422	uint16_t cShared;
423	/** The pages. (Giant mtx.) */
424	GMMPAGE aPages[GMM_CHUNK_SIZE >> PAGE_SHIFT];
425	} GMMCHUNK;
426
427	/** Indicates that the NUMA properties of the memory are unknown. */
428	#define GMM_CHUNK_NUMA_ID_UNKNOWN UINT16_C(0xfffe)
429
430	/** @name GMM_CHUNK_FLAGS_XXX - chunk flags.
431	* @{ */
432	/** Indicates that the chunk is a large page (2MB). */
433	#define GMM_CHUNK_FLAGS_LARGE_PAGE UINT16_C(0x0001)
434	/** @} */
435
436
437	/**
438	* An allocation chunk TLB entry.
439	*/
440	typedef struct GMMCHUNKTLBE
441	{
442	/** The chunk id. */
443	uint32_t idChunk;
444	/** Pointer to the chunk. */
445	PGMMCHUNK pChunk;
446	} GMMCHUNKTLBE;
447	/** Pointer to an allocation chunk TLB entry. */
448	typedef GMMCHUNKTLBE *PGMMCHUNKTLBE;
449
450
451	/** The number of entries in the allocation chunk TLB. */
452	#define GMM_CHUNKTLB_ENTRIES 32
453	/** Gets the TLB entry index for the given Chunk ID. */
454	#define GMM_CHUNKTLB_IDX(idChunk) ( (idChunk) & (GMM_CHUNKTLB_ENTRIES - 1) )
455
456	/**
457	* An allocation chunk TLB.
458	*/
459	typedef struct GMMCHUNKTLB
460	{
461	/** The TLB entries. */
462	GMMCHUNKTLBE aEntries[GMM_CHUNKTLB_ENTRIES];
463	} GMMCHUNKTLB;
464	/** Pointer to an allocation chunk TLB. */
465	typedef GMMCHUNKTLB *PGMMCHUNKTLB;
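/*
 * Illustration only, not part of GMMR0.cpp: a sketch of how a small
 * direct-mapped chunk TLB can sit in front of the chunk tree. std::map stands
 * in for the AVL tree, and filling the TLB slot on a miss is an assumption
 * about the lookup path, which is not shown in this part of the file. All
 * names in the sketch namespace are hypothetical.
 */

```cpp
#include <cassert>
#include <cstdint>
#include <map>

namespace sketch {

struct Chunk { uint32_t id; };

constexpr unsigned kTlbEntries = 32;   // must be a power of two for the mask below

struct TlbEntry
{
    uint32_t idChunk = 0;              // 0 plays the role of the NIL chunk ID
    Chunk   *pChunk  = nullptr;
};

struct ChunkIndex
{
    std::map<uint32_t, Chunk *> tree;  // stand-in for the AVL tree keyed by chunk ID
    TlbEntry tlb[kTlbEntries];
    unsigned cMisses = 0;              // counts lookups that had to consult the tree

    Chunk *lookup(uint32_t idChunk)
    {
        // Same index computation as GMM_CHUNKTLB_IDX: the low bits of the ID.
        TlbEntry &entry = tlb[idChunk & (kTlbEntries - 1)];
        if (entry.pChunk && entry.idChunk == idChunk)
            return entry.pChunk;       // TLB hit

        ++cMisses;
        auto it = tree.find(idChunk);
        if (it == tree.end())
            return nullptr;

        entry.idChunk = idChunk;       // replace whatever occupied the slot
        entry.pChunk  = it->second;
        return it->second;
    }
};

} // namespace sketch
```

/*
 * Two IDs that differ by a multiple of 32 share a slot and evict each other,
 * which is the usual trade-off of a direct-mapped cache.
 */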
466
467
468	/** The GMMCHUNK::cFree shift count. */
469	#define GMM_CHUNK_FREE_SET_SHIFT 4
470
471
472	/**
473	* A set of free chunks.
474	*/
475	typedef struct GMMCHUNKFREESET
476	{
477	/** The number of free pages in the set. */
478	uint64_t cFreePages;
479	/** The generation ID for the set. This is incremented whenever
480	* something is linked or unlinked from this set. */
481	uint64_t idGeneration;
482	/** Chunks ordered by increasing number of free pages. */
483	PGMMCHUNK apLists[GMM_CHUNK_NUM_PAGES >> GMM_CHUNK_FREE_SET_SHIFT];
484	} GMMCHUNKFREESET;
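/*
 * Illustration only, not part of GMMR0.cpp: a sketch of how a chunk's free
 * page count could select one of the lists in a free set. The constants
 * assume 256 pages per chunk and a shift of 4, giving 16 lists. Subtracting
 * one before shifting is an assumption made here so that a completely free
 * chunk (256 free pages) still lands in the last list; the linking code
 * itself is not shown in this part of the file. Names are hypothetical.
 */

```cpp
#include <cassert>

namespace sketch {

constexpr unsigned kChunkNumPages = 256;                            // assumed pages per chunk
constexpr unsigned kFreeSetShift  = 4;                              // mirrors GMM_CHUNK_FREE_SET_SHIFT
constexpr unsigned kNumLists      = kChunkNumPages >> kFreeSetShift; // 16 lists

// Maps a free page count in the range 1..kChunkNumPages to a list index.
// Chunks with no free pages are not linked into any list.
constexpr unsigned listIndexFor(unsigned cFree)
{
    return (cFree - 1) >> kFreeSetShift;
}

} // namespace sketch
```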
485
486
487	/**
488	* The GMM instance data.
489	*/
490	typedef struct GMM
491	{
492	/** Magic / eye catcher. GMM_MAGIC */
493	uint32_t u32Magic;
494	/** The number of threads waiting on the mutex. */
495	uint32_t cMtxContenders;
496	/** The fast mutex protecting the GMM.
497	* More fine grained locking can be implemented later if necessary. */
498	RTSEMFASTMUTEX hMtx;
499	#ifdef VBOX_STRICT
500	/** The current mutex owner. */
501	RTNATIVETHREAD hMtxOwner;
502	#endif
503	/** The chunk tree. */
504	PAVLU32NODECORE pChunks;
505	/** The chunk TLB. */
506	GMMCHUNKTLB ChunkTLB;
507	/** The private free set. */
508	GMMCHUNKFREESET Private;
509	/** The shared free set. */
510	GMMCHUNKFREESET Shared;
511
512	/** Shared module tree (global). */
513	/** @todo separate trees for distinctly different guest OSes. */
514	PAVLGCPTRNODECORE pGlobalSharedModuleTree;
515
516	/** The chunk list. For simplifying the cleanup process. */
517	RTLISTNODE ChunkList;
518
519	/** The maximum number of pages we're allowed to allocate.
520	* @gcfgm 64-bit GMM/MaxPages Direct.
521	* @gcfgm 32-bit GMM/PctPages Relative to the number of host pages. */
522	uint64_t cMaxPages;
523	/** The number of pages that have been reserved.
524	* The deal is that cReservedPages - cOverCommittedPages <= cMaxPages. */
525	uint64_t cReservedPages;
526	/** The number of pages that we have over-committed in reservations. */
527	uint64_t cOverCommittedPages;
528	/** The number of actually allocated (committed if you like) pages. */
529	uint64_t cAllocatedPages;
530	/** The number of pages that are shared. A subset of cAllocatedPages. */
531	uint64_t cSharedPages;
532	/** The number of pages that are actually shared between VMs. */
533	uint64_t cDuplicatePages;
534	/** The number of pages that are shared that have been left behind by
535	* VMs not doing proper cleanups. */
536	uint64_t cLeftBehindSharedPages;
537	/** The number of allocation chunks.
538	* (The number of pages we've allocated from the host can be derived from this.) */
539	uint32_t cChunks;
540	/** The number of current ballooned pages. */
541	uint64_t cBalloonedPages;
542
543	/** The legacy allocation mode indicator.
544	* This is determined at initialization time. */
545	bool fLegacyAllocationMode;
546	/** The bound memory mode indicator.
547	* When set, the memory will be bound to a specific VM and never
548	* shared. This is always set if fLegacyAllocationMode is set.
549	* (Also determined at initialization time.) */
550	bool fBoundMemoryMode;
551	/** The number of registered VMs. */
552	uint16_t cRegisteredVMs;
553
554	/** The number of freed chunks ever. This is used as a list generation to
555	* avoid restarting the cleanup scanning when the list wasn't modified. */
556	uint32_t cFreedChunks;
557	/** The previously allocated Chunk ID.
558	* Used as a hint to avoid scanning the whole bitmap. */
559	uint32_t idChunkPrev;
560	/** Chunk ID allocation bitmap.
561	* Bits of allocated IDs are set, free ones are clear.
562	* The NIL id (0) is marked allocated. */
563	uint32_t bmChunkId[(GMM_CHUNKID_LAST + 1 + 31) / 32];
564
565	/** The index of the next mutex to use. */
566	uint32_t iNextChunkMtx;
567	/** Chunk locks for reducing lock contention without having to allocate
568	* one lock per chunk. */
569	struct
570	{
571	/** The mutex */
572	RTSEMFASTMUTEX hMtx;
573	/** The number of threads currently using this mutex. */
574	uint32_t volatile cUsers;
575	} aChunkMtx[64];
576	} GMM;
577	/** Pointer to the GMM instance. */
578	typedef GMM *PGMM;
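/*
 * Illustration only, not part of GMMR0.cpp: a sketch of a chunk ID allocator
 * built on a bitmap with a "previous ID" hint, in the spirit of the
 * bmChunkId and idChunkPrev members above. The class, its linear scan and its
 * wrap-around policy are assumptions; the real allocation routine is not
 * shown in this part of the file.
 */

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

namespace sketch {

class ChunkIdAllocator
{
public:
    explicit ChunkIdAllocator(uint32_t idLast)
        : m_bits((idLast + 1 + 31) / 32)   // one bit per ID in 0..idLast, all clear
        , m_idLast(idLast)
        , m_idPrev(0)
    {
        setBit(0);                         // the NIL ID (0) is marked allocated
    }

    // Returns a free ID in 1..idLast, or 0 (NIL) when every ID is taken.
    uint32_t alloc()
    {
        for (uint32_t i = 0; i < m_idLast; ++i)
        {
            // Start just after the previous allocation and wrap around.
            uint32_t id = (m_idPrev + i) % m_idLast + 1;
            if (!testBit(id))
            {
                setBit(id);
                m_idPrev = id;
                return id;
            }
        }
        return 0;
    }

    // Marks an ID as free again; the NIL ID and out-of-range IDs are ignored.
    void release(uint32_t id)
    {
        if (id != 0 && id <= m_idLast)
            m_bits[id / 32] &= ~(1u << (id % 32));
    }

private:
    bool testBit(uint32_t id) const { return ((m_bits[id / 32] >> (id % 32)) & 1u) != 0; }
    void setBit(uint32_t id)        { m_bits[id / 32] |= 1u << (id % 32); }

    std::vector<uint32_t> m_bits;
    uint32_t m_idLast;
    uint32_t m_idPrev;
};

} // namespace sketch
```

/*
 * The hint keeps the common case cheap: right after allocating ID n, the
 * next candidate checked is n + 1 rather than the start of the bitmap.
 */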
579
580	/** The value of GMM::u32Magic (Katsuhiro Otomo). */
581	#define GMM_MAGIC UINT32_C(0x19540414)
582
583
584	/**
585	* GMM chunk mutex state.
586	*
587	* This is returned by gmmR0ChunkMutexAcquire and is used by the other
588	* gmmR0ChunkMutex* methods.
589	*/
590	typedef struct GMMR0CHUNKMTXSTATE
591	{
592	PGMM pGMM;
593	/** The index of the chunk mutex. */
594	uint8_t iChunkMtx;
595	/** The relevant flags (GMMR0CHUNK_MTX_XXX). */
596	uint8_t fFlags;
597	} GMMR0CHUNKMTXSTATE;
598	/** Pointer to a chunk mutex state. */
599	typedef GMMR0CHUNKMTXSTATE *PGMMR0CHUNKMTXSTATE;
600
601	/** @name GMMR0CHUNK_MTX_XXX
602	* @{ */
603	#define GMMR0CHUNK_MTX_INVALID UINT32_C(0)
604	#define GMMR0CHUNK_MTX_KEEP_GIANT UINT32_C(1)
605	#define GMMR0CHUNK_MTX_RETAKE_GIANT UINT32_C(2)
606	#define GMMR0CHUNK_MTX_DROP_GIANT UINT32_C(3)
607	#define GMMR0CHUNK_MTX_END UINT32_C(4)
608	/** @} */
609
610
611	/*******************************************************************************
612	* Global Variables *
613	*******************************************************************************/
614	/** Pointer to the GMM instance data. */
615	static PGMM g_pGMM = NULL;
616
617	/** Macro for obtaining and validating the g_pGMM pointer.
618	* On failure it will return from the invoking function with the specified return value.
619	*
620	* @param pGMM The name of the pGMM variable.
621	* @param rc The return value on failure. Use VERR_INTERNAL_ERROR for
622	* VBox status codes.
623	*/
624	#define GMM_GET_VALID_INSTANCE(pGMM, rc) \
625	do { \
626	(pGMM) = g_pGMM; \
627	AssertPtrReturn((pGMM), (rc)); \
628	AssertMsgReturn((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic), (rc)); \
629	} while (0)
630
631	/** Macro for obtaining and validating the g_pGMM pointer, void function variant.
632	* On failure it will return from the invoking function.
633	*
634	* @param pGMM The name of the pGMM variable.
635	*/
636	#define GMM_GET_VALID_INSTANCE_VOID(pGMM) \
637	do { \
638	(pGMM) = g_pGMM; \
639	AssertPtrReturnVoid((pGMM)); \
640	AssertMsgReturnVoid((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic)); \
641	} while (0)
642
643
644	/** @def GMM_CHECK_SANITY_UPON_ENTERING
645	* Checks the sanity of the GMM instance data before making changes.
646	*
647	* This macro is a stub by default and must be enabled manually in the code.
648	*
649	* @returns true if sane, false if not.
650	* @param pGMM The name of the pGMM variable.
651	*/
652	#if defined(VBOX_STRICT) && 0
653	# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
654	#else
655	# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (true)
656	#endif
657
658	/** @def GMM_CHECK_SANITY_UPON_LEAVING
659	* Checks the sanity of the GMM instance data after making changes.
660	*
661	* This macro is a stub by default and must be enabled manually in the code.
662	*
663	* @returns true if sane, false if not.
664	* @param pGMM The name of the pGMM variable.
665	*/
666	#if defined(VBOX_STRICT) && 0
667	# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
668	#else
669	# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (true)
670	#endif
671
672	/** @def GMM_CHECK_SANITY_IN_LOOPS
673	* Checks the sanity of the GMM instance in the allocation loops.
674	*
675	* This macro is a stub by default and must be enabled manually in the code.
676	*
677	* @returns true if sane, false if not.
678	* @param pGMM The name of the pGMM variable.
679	*/
680	#if defined(VBOX_STRICT) && 0
681	# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
682	#else
683	# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (true)
684	#endif
685
686
687	/*******************************************************************************
688	* Internal Functions *
689	*******************************************************************************/
690	static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM);
691	static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
692	DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet);
693	DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk);
694	static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo);
695	static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem);
696	static void gmmR0FreeSharedPage(PGMM pGMM, uint32_t idPage, PGMMPAGE pPage);
697	static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
698	static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM);
699
700
701
702	/**
703	* Initializes the GMM component.
704	*
705	* This is called when the VMMR0.r0 module is loaded and protected by the
706	* loader semaphore.
707	*
708	* @returns VBox status code.
709	*/
710	GMMR0DECL(int) GMMR0Init(void)
711	{
712	LogFlow(("GMMInit:\n"));
713
714	/*
715	* Allocate the instance data and the locks.
716	*/
717	PGMM pGMM = (PGMM)RTMemAllocZ(sizeof(*pGMM));
718	if (!pGMM)
719	return VERR_NO_MEMORY;
720
721	pGMM->u32Magic = GMM_MAGIC;
722	for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
723	pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
724	RTListInit(&pGMM->ChunkList);
725	ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
726
727	int rc = RTSemFastMutexCreate(&pGMM->hMtx);
728	if (RT_SUCCESS(rc))
729	{
730	unsigned iMtx;
731	for (iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
732	{
733	rc = RTSemFastMutexCreate(&pGMM->aChunkMtx[iMtx].hMtx);
734	if (RT_FAILURE(rc))
735	break;
736	}
737	if (RT_SUCCESS(rc))
738	{
739	/*
740	* Check and see if RTR0MemObjAllocPhysNC works.
741	*/
742	#if 0 /* later, see #3170. */
743	RTR0MEMOBJ MemObj;
744	rc = RTR0MemObjAllocPhysNC(&MemObj, _64K, NIL_RTHCPHYS);
745	if (RT_SUCCESS(rc))
746	{
747	rc = RTR0MemObjFree(MemObj, true);
748	AssertRC(rc);
749	}
750	else if (rc == VERR_NOT_SUPPORTED)
751	pGMM->fLegacyAllocationMode = pGMM->fBoundMemoryMode = true;
752	else
753	SUPR0Printf("GMMR0Init: RTR0MemObjAllocPhysNC(,64K,Any) -> %d!\n", rc);
754	#else
755	# if defined(RT_OS_WINDOWS) || (defined(RT_OS_SOLARIS) && ARCH_BITS == 64) || defined(RT_OS_LINUX) || defined(RT_OS_FREEBSD)
756	pGMM->fLegacyAllocationMode = false;
757	# if ARCH_BITS == 32
758	/* Don't reuse possibly partial chunks because of the virtual
759	address space limitation. */
760	pGMM->fBoundMemoryMode = true;
761	# else
762	pGMM->fBoundMemoryMode = false;
763	# endif
764	# else
765	pGMM->fLegacyAllocationMode = true;
766	pGMM->fBoundMemoryMode = true;
767	# endif
768	#endif
769
770	/*
771	* Query system page count and guess a reasonable cMaxPages value.
772	*/
773	pGMM->cMaxPages = UINT32_MAX; /** @todo IPRT function for query ram size and such. */
774
775	g_pGMM = pGMM;
776	LogFlow(("GMMInit: pGMM=%p fLegacyAllocationMode=%RTbool fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fLegacyAllocationMode, pGMM->fBoundMemoryMode));
777	return VINF_SUCCESS;
778	}
779
780	/*
781	* Bail out.
782	*/
783	while (iMtx-- > 0)
784	RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
785	RTSemFastMutexDestroy(pGMM->hMtx);
786	}
787
788	pGMM->u32Magic = 0;
789	RTMemFree(pGMM);
790	SUPR0Printf("GMMR0Init: failed! rc=%d\n", rc);
791	return rc;
792	}
793
794
795	/**
796	* Terminates the GMM component.
797	*/
798	GMMR0DECL(void) GMMR0Term(void)
799	{
800	LogFlow(("GMMTerm:\n"));
801
802	/*
803	* Take care / be paranoid...
804	*/
805	PGMM pGMM = g_pGMM;
806	if (!VALID_PTR(pGMM))
807	return;
808	if (pGMM->u32Magic != GMM_MAGIC)
809	{
810	SUPR0Printf("GMMR0Term: u32Magic=%#x\n", pGMM->u32Magic);
811	return;
812	}
813
814	/*
815	* Undo what init did and free all the resources we've acquired.
816	*/
817	/* Destroy the fundamentals. */
818	g_pGMM = NULL;
819	pGMM->u32Magic = ~GMM_MAGIC;
820	RTSemFastMutexDestroy(pGMM->hMtx);
821	pGMM->hMtx = NIL_RTSEMFASTMUTEX;
822
823	/* Free any chunks still hanging around. */
824	RTAvlU32Destroy(&pGMM->pChunks, gmmR0TermDestroyChunk, pGMM);
825
826	/* Destroy the chunk locks. */
827	for (unsigned iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
828	{
829	Assert(pGMM->aChunkMtx[iMtx].cUsers == 0);
830	RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
831	pGMM->aChunkMtx[iMtx].hMtx = NIL_RTSEMFASTMUTEX;
832	}
833
834	/* Finally the instance data itself. */
835	RTMemFree(pGMM);
836	LogFlow(("GMMTerm: done\n"));
837	}
838
839
840	/**
841	* RTAvlU32Destroy callback.
842	*
843	* @returns 0
844	* @param pNode The node to destroy.
845	* @param pvGMM The GMM handle.
846	*/
847	static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM)
848	{
849	PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
850
851	if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
852	SUPR0Printf("GMMR0Term: %p/%#x: cFree=%d cPrivate=%d cShared=%d cMappings=%d\n", pChunk,
853	pChunk->Core.Key, pChunk->cFree, pChunk->cPrivate, pChunk->cShared, pChunk->cMappingsX);
854
855	int rc = RTR0MemObjFree(pChunk->hMemObj, true /* fFreeMappings */);
856	if (RT_FAILURE(rc))
857	{
858	SUPR0Printf("GMMR0Term: %p/%#x: RTR0MemObjFree(%p,true) -> %d (cMappings=%d)\n", pChunk,
859	pChunk->Core.Key, pChunk->hMemObj, rc, pChunk->cMappingsX);
860	AssertRC(rc);
861	}
862	pChunk->hMemObj = NIL_RTR0MEMOBJ;
863
864	RTMemFree(pChunk->paMappingsX);
865	pChunk->paMappingsX = NULL;
866
867	RTMemFree(pChunk);
868	NOREF(pvGMM);
869	return 0;
870	}
871
872
873	/**
874	* Initializes the per-VM data for the GMM.
875	*
876	* This is called from within the GVMM lock (from GVMMR0CreateVM)
877	* and should only initialize the data members so GMMR0CleanupVM
878	* can deal with them. We reserve no memory or anything here,
879	* that's done later in GMMR0InitVM.
880	*
881	* @param pGVM Pointer to the Global VM structure.
882	*/
883	GMMR0DECL(void) GMMR0InitPerVMData(PGVM pGVM)
884	{
885	AssertCompile(RT_SIZEOFMEMB(GVM,gmm.s) <= RT_SIZEOFMEMB(GVM,gmm.padding));
886
887	pGVM->gmm.s.enmPolicy = GMMOCPOLICY_INVALID;
888	pGVM->gmm.s.enmPriority = GMMPRIORITY_INVALID;
889	pGVM->gmm.s.fMayAllocate = false;
890	}
891
892
893	/**
894	* Acquires the GMM giant lock.
895	*
896	* @returns Assert status code from RTSemFastMutexRequest.
897	* @param pGMM Pointer to the GMM instance.
898	*/
899	static int gmmR0MutexAcquire(PGMM pGMM)
900	{
901	ASMAtomicIncU32(&pGMM->cMtxContenders);
902	int rc = RTSemFastMutexRequest(pGMM->hMtx);
903	ASMAtomicDecU32(&pGMM->cMtxContenders);
904	AssertRC(rc);
905	#ifdef VBOX_STRICT
906	pGMM->hMtxOwner = RTThreadNativeSelf();
907	#endif
908	return rc;
909	}
910
911
912	/**
913	* Releases the GMM giant lock.
914	*
915	* @returns Assert status code from RTSemFastMutexRelease.
916	* @param pGMM Pointer to the GMM instance.
917	*/
918	static int gmmR0MutexRelease(PGMM pGMM)
919	{
920	#ifdef VBOX_STRICT
921	pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
922	#endif
923	int rc = RTSemFastMutexRelease(pGMM->hMtx);
924	AssertRC(rc);
925	return rc;
926	}
927
928
929	/**
930	* Yields the GMM giant lock if there is contention and a certain minimum time
931	* has elapsed since we took it.
932	*
933	* @returns @c true if the mutex was yielded, @c false if not.
934	* @param pGMM Pointer to the GMM instance.
935	* @param puLockNanoTS Where the lock acquisition time stamp is kept
936	* (in/out).
937	*/
938	static bool gmmR0MutexYield(PGMM pGMM, uint64_t *puLockNanoTS)
939	{
940	/*
941	* If nobody is contending the mutex, don't bother checking the time.
942	*/
943	if (ASMAtomicReadU32(&pGMM->cMtxContenders) == 0)
944	return false;
945
946	/*
947	* Don't yield if we haven't executed for at least 2 milliseconds.
948	*/
949	uint64_t uNanoNow = RTTimeSystemNanoTS();
950	if (uNanoNow - *puLockNanoTS < UINT32_C(2000000))
951	return false;
952
953	/*
954	* Yield the mutex.
955	*/
956	#ifdef VBOX_STRICT
957	pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
958	#endif
959	ASMAtomicIncU32(&pGMM->cMtxContenders);
960	int rc1 = RTSemFastMutexRelease(pGMM->hMtx); AssertRC(rc1);
961
962	RTThreadYield();
963
964	int rc2 = RTSemFastMutexRequest(pGMM->hMtx); AssertRC(rc2);
965	*puLockNanoTS = RTTimeSystemNanoTS();
966	ASMAtomicDecU32(&pGMM->cMtxContenders);
967	#ifdef VBOX_STRICT
968	pGMM->hMtxOwner = RTThreadNativeSelf();
969	#endif
970
971	return true;
972	}
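The yield decision in gmmR0MutexYield boils down to two checks: is anybody waiting, and has the lock been held long enough. The following is a minimal sketch of just that decision as a pure function. The names `sketchShouldYield` and `SKETCH_MIN_HOLD_NS` are hypothetical and not part of the GMM; the actual release, RTThreadYield and re-request steps are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant mirroring the 2 ms threshold used by gmmR0MutexYield. */
#define SKETCH_MIN_HOLD_NS UINT64_C(2000000)

/*
 * Returns true when the lock holder should yield: at least one thread is
 * waiting and the lock has been held for SKETCH_MIN_HOLD_NS or longer.
 */
static bool sketchShouldYield(uint32_t cContenders, uint64_t uNowNs, uint64_t uLockNs)
{
    assert(uNowNs >= uLockNs);      /* time stamps are expected to be monotonic */
    if (cContenders == 0)
        return false;               /* nobody is waiting, skip the time check */
    return uNowNs - uLockNs >= SKETCH_MIN_HOLD_NS;
}
```

Checking the contender count first keeps the common uncontended path free of any time stamp comparison, which is presumably why the original orders the tests this way.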
973
974
975	/**
976	* Acquires a chunk lock.
977	*
978	* The caller must own the giant lock.
979	*
980	* @returns Assert status code from RTSemFastMutexRequest.
981	* @param pMtxState The chunk mutex state info. (Avoids
982	* passing the same flags and stuff around
983	* for subsequent release and drop-giant
984	* calls.)
985	* @param pGMM Pointer to the GMM instance.
986	* @param pChunk Pointer to the chunk.
987	* @param fFlags Flags regarding the giant lock, GMMR0CHUNK_MTX_XXX.
988	*/
989	static int gmmR0ChunkMutexAcquire(PGMMR0CHUNKMTXSTATE pMtxState, PGMM pGMM, PGMMCHUNK pChunk, uint32_t fFlags)
990	{
991	Assert(fFlags > GMMR0CHUNK_MTX_INVALID && fFlags < GMMR0CHUNK_MTX_END);
992	Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
993
994	pMtxState->pGMM = pGMM;
995	pMtxState->fFlags = (uint8_t)fFlags;
996
997	/*
998	* Get the lock index and reference the lock.
999	*/
1000	Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1001	uint32_t iChunkMtx = pChunk->iChunkMtx;
1002	if (iChunkMtx == UINT8_MAX)
1003	{
1004	iChunkMtx = pGMM->iNextChunkMtx++;
1005	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1006
1007	/* Try to get an unused one... */
1008	if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1009	{
1010	iChunkMtx = pGMM->iNextChunkMtx++;
1011	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1012	if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1013	{
1014	iChunkMtx = pGMM->iNextChunkMtx++;
1015	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1016	if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1017	{
1018	iChunkMtx = pGMM->iNextChunkMtx++;
1019	iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1020	}
1021	}
1022	}
1023
1024	pChunk->iChunkMtx = iChunkMtx;
1025	}
1026	AssertCompile(RT_ELEMENTS(pGMM->aChunkMtx) < UINT8_MAX);
1027	pMtxState->iChunkMtx = (uint8_t)iChunkMtx;
1028	ASMAtomicIncU32(&pGMM->aChunkMtx[iChunkMtx].cUsers);
1029
1030	/*
1031	* Drop the giant?
1032	*/
1033	if (fFlags != GMMR0CHUNK_MTX_KEEP_GIANT)
1034	{
1035	/** @todo GMM life cycle cleanup (we may race someone
1036	* destroying and cleaning up GMM)? */
1037	gmmR0MutexRelease(pGMM);
1038	}
1039
1040	/*
1041	* Take the chunk mutex.
1042	*/
1043	int rc = RTSemFastMutexRequest(pGMM->aChunkMtx[iChunkMtx].hMtx);
1044	AssertRC(rc);
1045	return rc;
1046	}
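The mutex selection at the top of gmmR0ChunkMutexAcquire is a bounded round-robin probe: it takes the next slot from a small pool and tries again, up to four slots in total, when the slot already has users. A rough sketch of that probe follows, assuming a hypothetical pool of four slots; `SKETCHPOOL` and `sketchPickMutex` are illustrative names only, and no real locking is done.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NUM_MTX 4u   /* hypothetical pool size */

typedef struct SKETCHMTX  { uint32_t cUsers; } SKETCHMTX;
typedef struct SKETCHPOOL { SKETCHMTX aMtx[SKETCH_NUM_MTX]; uint32_t iNext; } SKETCHPOOL;

/*
 * Picks a mutex slot round-robin, preferring one without users.  After four
 * busy slots it settles for the last one tried, so the search is bounded.
 */
static uint32_t sketchPickMutex(SKETCHPOOL *pPool)
{
    uint32_t iMtx = 0;
    for (unsigned cTries = 0; cTries < 4; cTries++)
    {
        iMtx = pPool->iNext++ % SKETCH_NUM_MTX;
        if (pPool->aMtx[iMtx].cUsers == 0)
            break;                  /* found an unused slot */
    }
    assert(iMtx < SKETCH_NUM_MTX);
    return iMtx;
}
```

Settling for a busy slot trades some contention for a fixed upper bound on the work done while the giant lock is held.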
1047
1048
1049	/**
1050	* Releases the chunk mutex, retaking the giant lock if requested.
1051	*
1052	* @returns Assert status code from RTSemFastMutexRelease or gmmR0MutexAcquire.
1053	* @param pMtxState Pointer to the chunk mutex state.
1054	* @param pChunk Pointer to the chunk if it's still
1055	* alive, NULL if it isn't. This is used to disassociate
1056	* the chunk from the mutex on the way out so a new one
1057	* can be selected next time, thus avoiding contended
1058	* mutexes.
1059	*/
1060	static int gmmR0ChunkMutexRelease(PGMMR0CHUNKMTXSTATE pMtxState, PGMMCHUNK pChunk)
1061	{
1062	PGMM pGMM = pMtxState->pGMM;
1063
1064	/*
1065	* Release the chunk mutex and reacquire the giant if requested.
1066	*/
1067	int rc = RTSemFastMutexRelease(pGMM->aChunkMtx[pMtxState->iChunkMtx].hMtx);
1068	AssertRC(rc);
1069	if (pMtxState->fFlags == GMMR0CHUNK_MTX_RETAKE_GIANT)
1070	rc = gmmR0MutexAcquire(pGMM);
1071	else
1072	Assert((pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT) == (pGMM->hMtxOwner == RTThreadNativeSelf()));
1073
1074	/*
1075	* Drop the chunk mutex user reference and disassociate it from the chunk
1076	* when possible.
1077	*/
1078	if ( ASMAtomicDecU32(&pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers) == 0
1079	&& pChunk
1080	&& RT_SUCCESS(rc) )
1081	{
1082	if (pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT)
1083	pChunk->iChunkMtx = UINT8_MAX;
1084	else
1085	{
1086	rc = gmmR0MutexAcquire(pGMM);
1087	if (RT_SUCCESS(rc))
1088	{
1089	if (pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers == 0)
1090	pChunk->iChunkMtx = UINT8_MAX;
1091	rc = gmmR0MutexRelease(pGMM);
1092	}
1093	}
1094	}
1095
1096	pMtxState->pGMM = NULL;
1097	return rc;
1098	}
1099
1100
1101	/**
1102	* Drops the giant GMM lock we kept in gmmR0ChunkMutexAcquire while keeping the
1103	* chunk locked.
1104	*
1105	* This only works if gmmR0ChunkMutexAcquire was called with
1106	* GMMR0CHUNK_MTX_KEEP_GIANT. gmmR0ChunkMutexRelease will retake the giant
1107	* mutex, i.e. behave as if GMMR0CHUNK_MTX_RETAKE_GIANT was used.
1108	*
1109	* @returns VBox status code (assuming success is ok).
1110	* @param pMtxState Pointer to the chunk mutex state.
1111	*/
1112	static int gmmR0ChunkMutexDropGiant(PGMMR0CHUNKMTXSTATE pMtxState)
1113	{
1114	AssertReturn(pMtxState->fFlags == GMMR0CHUNK_MTX_KEEP_GIANT, VERR_INTERNAL_ERROR_2);
1115	Assert(pMtxState->pGMM->hMtxOwner == RTThreadNativeSelf());
1116	pMtxState->fFlags = GMMR0CHUNK_MTX_RETAKE_GIANT;
1117	/** @todo GMM life cycle cleanup (we may race someone
1118	* destroying and cleaning up GMM)? */
1119	return gmmR0MutexRelease(pMtxState->pGMM);
1120	}
1121
1122
1123	/**
1124	* Cleans up when a VM is terminating.
1125	*
1126	* @param pGVM Pointer to the Global VM structure.
1127	*/
1128	GMMR0DECL(void) GMMR0CleanupVM(PGVM pGVM)
1129	{
1130	LogFlow(("GMMR0CleanupVM: pGVM=%p:{.pVM=%p, .hSelf=%#x}\n", pGVM, pGVM->pVM, pGVM->hSelf));
1131
1132	PGMM pGMM;
1133	GMM_GET_VALID_INSTANCE_VOID(pGMM);
1134
1135	#ifdef VBOX_WITH_PAGE_SHARING
1136	/*
1137	* Clean up all registered shared modules first.
1138	*/
1139	gmmR0SharedModuleCleanup(pGMM, pGVM);
1140	#endif
1141
1142	gmmR0MutexAcquire(pGMM);
1143	uint64_t uLockNanoTS = RTTimeSystemNanoTS();
1144	GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
1145
1146	/*
1147	* The policy is 'INVALID' until the initial reservation
1148	* request has been serviced.
1149	*/
1150	if ( pGVM->gmm.s.enmPolicy > GMMOCPOLICY_INVALID
1151	&& pGVM->gmm.s.enmPolicy < GMMOCPOLICY_END)
1152	{
1153	/*
1154	* If it's the last VM around, we can skip walking all the chunks looking
1155	* for the pages owned by this VM and instead flush the whole shebang.
1156	*
1157	* This takes care of the eventuality that a VM has left shared page
1158	* references behind (shouldn't happen of course, but you never know).
1159	*/
1160	Assert(pGMM->cRegisteredVMs);
1161	pGMM->cRegisteredVMs--;
1162
1163	/*
1164	* Walk the entire pool looking for pages that belong to this VM
1165	* and leftover mappings. (This'll only catch private pages,
1166	* shared pages will be 'left behind'.)
1167	*/
1168	uint64_t cPrivatePages = pGVM->gmm.s.cPrivatePages; /* save */
1169
1170	unsigned iCountDown = 64;
1171	bool fRedoFromStart;
1172	PGMMCHUNK pChunk;
1173	do
1174	{
1175	fRedoFromStart = false;
1176	RTListForEachReverse(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
1177	{
1178	uint32_t const cFreeChunksOld = pGMM->cFreedChunks;
1179	if (gmmR0CleanupVMScanChunk(pGMM, pGVM, pChunk))
1180	{
1181	/* We left the giant mutex, so reset the yield counters. */
1182	uLockNanoTS = RTTimeSystemNanoTS();
1183	iCountDown = 64;
1184	}
1185	else
1186	{
1187	/* Didn't leave it, so do normal yielding. */
1188	if (!iCountDown)
1189	gmmR0MutexYield(pGMM, &uLockNanoTS);
1190	else
1191	iCountDown--;
1192	}
1193	if (pGMM->cFreedChunks != cFreeChunksOld)
1194	{ fRedoFromStart = true; break; } /* the chunk list changed under us, rescan */
1195	}
1196	} while (fRedoFromStart);
1197
1198	if (pGVM->gmm.s.cPrivatePages)
1199	SUPR0Printf("GMMR0CleanupVM: hGVM=%#x has %#x private pages that cannot be found!\n", pGVM->hSelf, pGVM->gmm.s.cPrivatePages);
1200
1201	pGMM->cAllocatedPages -= cPrivatePages;
1202
1203	/*
1204	* Free empty chunks.
1205	*/
1206	do
1207	{
1208	fRedoFromStart = false;
1209	iCountDown = 10240;
1210	pChunk = pGMM->Private.apLists[RT_ELEMENTS(pGMM->Private.apLists) - 1];
1211	while (pChunk)
1212	{
1213	PGMMCHUNK pNext = pChunk->pFreeNext;
1214	if ( pChunk->cFree == GMM_CHUNK_NUM_PAGES
1215	&& ( !pGMM->fBoundMemoryMode
1216	|| pChunk->hGVM == pGVM->hSelf))
1217	{
1218	uint64_t const idGenerationOld = pGMM->Private.idGeneration;
1219	if (gmmR0FreeChunk(pGMM, pGVM, pChunk, true /*fRelaxedSem*/))
1220	{
1221	/* We've left the giant mutex, restart? (+1 for our unlink) */
1222	fRedoFromStart = pGMM->Private.idGeneration != idGenerationOld + 1;
1223	if (fRedoFromStart)
1224	break;
1225	uLockNanoTS = RTTimeSystemNanoTS();
1226	iCountDown = 10240;
1227	}
1228	}
1229
1230	/* Advance and maybe yield the lock. */
1231	pChunk = pNext;
1232	if (--iCountDown == 0)
1233	{
1234	uint64_t const idGenerationOld = pGMM->Private.idGeneration;
1235	fRedoFromStart = gmmR0MutexYield(pGMM, &uLockNanoTS)
1236	&& pGMM->Private.idGeneration != idGenerationOld;
1237	if (fRedoFromStart)
1238	break;
1239	iCountDown = 10240;
1240	}
1241	}
1242	} while (fRedoFromStart);
1243
1244	/*
1245	* Account for shared pages that weren't freed.
1246	*/
1247	if (pGVM->gmm.s.cSharedPages)
1248	{
1249	Assert(pGMM->cSharedPages >= pGVM->gmm.s.cSharedPages);
1250	SUPR0Printf("GMMR0CleanupVM: hGVM=%#x left %#x shared pages behind!\n", pGVM->hSelf, pGVM->gmm.s.cSharedPages);
1251	pGMM->cLeftBehindSharedPages += pGVM->gmm.s.cSharedPages;
1252	}
1253
1254	/*
1255	* Clean up balloon statistics in case the VM process crashed.
1256	*/
1257	Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.cBalloonedPages);
1258	pGMM->cBalloonedPages -= pGVM->gmm.s.cBalloonedPages;
1259
1260	/*
1261	* Update the over-commitment management statistics.
1262	*/
1263	pGMM->cReservedPages -= pGVM->gmm.s.Reserved.cBasePages
1264	+ pGVM->gmm.s.Reserved.cFixedPages
1265	+ pGVM->gmm.s.Reserved.cShadowPages;
1266	switch (pGVM->gmm.s.enmPolicy)
1267	{
1268	case GMMOCPOLICY_NO_OC:
1269	break;
1270	default:
1271	/** @todo Update GMM->cOverCommittedPages */
1272	break;
1273	}
1274	}
1275
1276	/* zap the GVM data. */
1277	pGVM->gmm.s.enmPolicy = GMMOCPOLICY_INVALID;
1278	pGVM->gmm.s.enmPriority = GMMPRIORITY_INVALID;
1279	pGVM->gmm.s.fMayAllocate = false;
1280
1281	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1282	gmmR0MutexRelease(pGMM);
1283
1284	LogFlow(("GMMR0CleanupVM: returns\n"));
1285	}
1286
1287
1288	/**
1289	* Scan one chunk for private pages belonging to the specified VM.
1290	*
1291	* @note This function may drop the giant mutex!
1292	*
1293	* @returns @c true if we've temporarily dropped the giant mutex, @c false if
1294	* we didn't.
1295	* @param pGMM Pointer to the GMM instance.
1296	* @param pGVM The global VM handle.
1297	* @param pChunk The chunk to scan.
1298	*/
1299	static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1300	{
1301	/*
1302	* Look for pages belonging to the VM.
1303	* (Perform some internal checks while we're scanning.)
1304	*/
1305	#ifndef VBOX_STRICT
1306	if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
1307	#endif
1308	{
1309	unsigned cPrivate = 0;
1310	unsigned cShared = 0;
1311	unsigned cFree = 0;
1312
1313	gmmR0UnlinkChunk(pChunk); /* avoiding cFreePages updates. */
1314
1315	uint16_t hGVM = pGVM->hSelf;
1316	unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
1317	while (iPage-- > 0)
1318	if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
1319	{
1320	if (pChunk->aPages[iPage].Private.hGVM == hGVM)
1321	{
1322	/*
1323	* Free the page.
1324	*
1325	* The reason for not using gmmR0FreePrivatePage here is that we
1326	* must not cause the chunk to be freed from under us - we're in
1327	* an AVL tree walk here.
1328	*/
1329	pChunk->aPages[iPage].u = 0;
1330	pChunk->aPages[iPage].Free.iNext = pChunk->iFreeHead;
1331	pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1332	pChunk->iFreeHead = iPage;
1333	pChunk->cPrivate--;
1334	pChunk->cFree++;
1335	pGVM->gmm.s.cPrivatePages--;
1336	cFree++;
1337	}
1338	else
1339	cPrivate++;
1340	}
1341	else if (GMM_PAGE_IS_FREE(&pChunk->aPages[iPage]))
1342	cFree++;
1343	else
1344	cShared++;
1345
1346	gmmR0LinkChunk(pChunk, pChunk->cShared ? &g_pGMM->Shared : &g_pGMM->Private);
1347
1348	/*
1349	* Did it add up?
1350	*/
1351	if (RT_UNLIKELY( pChunk->cFree != cFree
1352	|| pChunk->cPrivate != cPrivate
1353	|| pChunk->cShared != cShared))
1354	{
1355	SUPR0Printf("gmmR0CleanupVMScanChunk: Chunk %p/%#x has bogus stats - free=%d/%d private=%d/%d shared=%d/%d\n",
1356	pChunk, pChunk->Core.Key, pChunk->cFree, cFree, pChunk->cPrivate, cPrivate, pChunk->cShared, cShared);
1357	pChunk->cFree = cFree;
1358	pChunk->cPrivate = cPrivate;
1359	pChunk->cShared = cShared;
1360	}
1361	}
1362
1363	/*
1364	* If not in bound memory mode, we should reset the hGVM field
1365	* if it has our handle in it.
1366	*/
1367	if (pChunk->hGVM == pGVM->hSelf)
1368	{
1369	if (!g_pGMM->fBoundMemoryMode)
1370	pChunk->hGVM = NIL_GVM_HANDLE;
1371	else if (pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1372	{
1373	SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: cFree=%#x - it should be 0 in bound mode!\n",
1374	pChunk, pChunk->Core.Key, pChunk->cFree);
1375	AssertMsgFailed(("%p/%#x: cFree=%#x - it should be 0 in bound mode!\n", pChunk, pChunk->Core.Key, pChunk->cFree));
1376
1377	gmmR0UnlinkChunk(pChunk);
1378	pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1379	gmmR0LinkChunk(pChunk, pChunk->cShared ? &g_pGMM->Shared : &g_pGMM->Private);
1380	}
1381	}
1382
1383	/*
1384	* Look for a mapping belonging to the terminating VM.
1385	*/
1386	GMMR0CHUNKMTXSTATE MtxState;
1387	gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
1388	unsigned cMappings = pChunk->cMappingsX;
1389	for (unsigned i = 0; i < cMappings; i++)
1390	if (pChunk->paMappingsX[i].pGVM == pGVM)
1391	{
1392	gmmR0ChunkMutexDropGiant(&MtxState);
1393
1394	RTR0MEMOBJ hMemObj = pChunk->paMappingsX[i].hMapObj;
1395
1396	cMappings--;
1397	if (i < cMappings)
1398	pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
1399	pChunk->paMappingsX[cMappings].pGVM = NULL;
1400	pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
1401	Assert(pChunk->cMappingsX - 1U == cMappings);
1402	pChunk->cMappingsX = cMappings;
1403
1404	int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings (NA) */);
1405	if (RT_FAILURE(rc))
1406	{
1407	SUPR0Printf("gmmR0CleanupVMScanChunk: %p/%#x: mapping #%x: RTR0MemObjFree(%p,false) -> %d \n",
1408	pChunk, pChunk->Core.Key, i, hMemObj, rc);
1409	AssertRC(rc);
1410	}
1411
1412	gmmR0ChunkMutexRelease(&MtxState, pChunk);
1413	return true;
1414	}
1415
1416	gmmR0ChunkMutexRelease(&MtxState, pChunk);
1417	return false;
1418	}
1419
1420
1421	/**
1422	* The initial resource reservations.
1423	*
1424	* This will make memory reservations according to policy and priority. If there aren't
1425	* sufficient resources available to sustain the VM this function will fail and all
1426	* future allocation requests will fail as well.
1427	*
1428	* These are just the initial reservations made very early during the VM creation
1429	* process and will be adjusted later in the GMMR0UpdateReservation call after the
1430	* ring-3 init has completed.
1431	*
1432	* @returns VBox status code.
1433	* @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1434	* @retval VERR_GMM_
1435	*
1436	* @param pVM Pointer to the shared VM structure.
1437	* @param idCpu VCPU id
1438	* @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1439	* This does not include MMIO2 and similar.
1440	* @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1441	* @param cFixedPages The number of pages that may be allocated for fixed objects like the
1442	* hyper heap, MMIO2 and similar.
1443	* @param enmPolicy The OC policy to use on this VM.
1444	* @param enmPriority The priority in an out-of-memory situation.
1445	*
1446	* @thread The creator thread / EMT.
1447	*/
1448	GMMR0DECL(int) GMMR0InitialReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages,
1449	GMMOCPOLICY enmPolicy, GMMPRIORITY enmPriority)
1450	{
1451	LogFlow(("GMMR0InitialReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x enmPolicy=%d enmPriority=%d\n",
1452	pVM, cBasePages, cShadowPages, cFixedPages, enmPolicy, enmPriority));
1453
1454	/*
1455	* Validate, get basics and take the semaphore.
1456	*/
1457	PGMM pGMM;
1458	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
1459	PGVM pGVM;
1460	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1461	if (RT_FAILURE(rc))
1462	return rc;
1463
1464	AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1465	AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1466	AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1467	AssertReturn(enmPolicy > GMMOCPOLICY_INVALID && enmPolicy < GMMOCPOLICY_END, VERR_INVALID_PARAMETER);
1468	AssertReturn(enmPriority > GMMPRIORITY_INVALID && enmPriority < GMMPRIORITY_END, VERR_INVALID_PARAMETER);
1469
1470	gmmR0MutexAcquire(pGMM);
1471	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1472	{
1473	if ( !pGVM->gmm.s.Reserved.cBasePages
1474	&& !pGVM->gmm.s.Reserved.cFixedPages
1475	&& !pGVM->gmm.s.Reserved.cShadowPages)
1476	{
1477	/*
1478	* Check if we can accommodate this.
1479	*/
1480	/* ... later ... */
1481	if (RT_SUCCESS(rc))
1482	{
1483	/*
1484	* Update the records.
1485	*/
1486	pGVM->gmm.s.Reserved.cBasePages = cBasePages;
1487	pGVM->gmm.s.Reserved.cFixedPages = cFixedPages;
1488	pGVM->gmm.s.Reserved.cShadowPages = cShadowPages;
1489	pGVM->gmm.s.enmPolicy = enmPolicy;
1490	pGVM->gmm.s.enmPriority = enmPriority;
1491	pGVM->gmm.s.fMayAllocate = true;
1492
1493	pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1494	pGMM->cRegisteredVMs++;
1495	}
1496	}
1497	else
1498	rc = VERR_WRONG_ORDER;
1499	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1500	}
1501	else
1502	rc = VERR_INTERNAL_ERROR_5;
1503	gmmR0MutexRelease(pGMM);
1504	LogFlow(("GMMR0InitialReservation: returns %Rrc\n", rc));
1505	return rc;
1506	}
1507
1508
1509	/**
1510	* VMMR0 request wrapper for GMMR0InitialReservation.
1511	*
1512	* @returns see GMMR0InitialReservation.
1513	* @param pVM Pointer to the shared VM structure.
1514	* @param idCpu VCPU id
1515	* @param pReq The request packet.
1516	*/
1517	GMMR0DECL(int) GMMR0InitialReservationReq(PVM pVM, VMCPUID idCpu, PGMMINITIALRESERVATIONREQ pReq)
1518	{
1519	/*
1520	* Validate input and pass it on.
1521	*/
1522	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1523	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1524	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1525
1526	return GMMR0InitialReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages, pReq->enmPolicy, pReq->enmPriority);
1527	}
1528
1529
1530	/**
1531	* This updates the memory reservation with the additional MMIO2 and ROM pages.
1532	*
1533	* @returns VBox status code.
1534	* @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1535	*
1536	* @param pVM Pointer to the shared VM structure.
1537	* @param idCpu VCPU id
1538	* @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1539	* This does not include MMIO2 and similar.
1540	* @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1541	* @param cFixedPages The number of pages that may be allocated for fixed objects like the
1542	* hyper heap, MMIO2 and similar.
1543	*
1544	* @thread EMT.
1545	*/
1546	GMMR0DECL(int) GMMR0UpdateReservation(PVM pVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages, uint32_t cFixedPages)
1547	{
1548	LogFlow(("GMMR0UpdateReservation: pVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x\n",
1549	pVM, cBasePages, cShadowPages, cFixedPages));
1550
1551	/*
1552	* Validate, get basics and take the semaphore.
1553	*/
1554	PGMM pGMM;
1555	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
1556	PGVM pGVM;
1557	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
1558	if (RT_FAILURE(rc))
1559	return rc;
1560
1561	AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1562	AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1563	AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1564
1565	gmmR0MutexAcquire(pGMM);
1566	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1567	{
1568	if ( pGVM->gmm.s.Reserved.cBasePages
1569	&& pGVM->gmm.s.Reserved.cFixedPages
1570	&& pGVM->gmm.s.Reserved.cShadowPages)
1571	{
1572	/*
1573	* Check if we can accommodate this.
1574	*/
1575	/* ... later ... */
1576	if (RT_SUCCESS(rc))
1577	{
1578	/*
1579	* Update the records.
1580	*/
1581	pGMM->cReservedPages -= pGVM->gmm.s.Reserved.cBasePages
1582	+ pGVM->gmm.s.Reserved.cFixedPages
1583	+ pGVM->gmm.s.Reserved.cShadowPages;
1584	pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1585
1586	pGVM->gmm.s.Reserved.cBasePages = cBasePages;
1587	pGVM->gmm.s.Reserved.cFixedPages = cFixedPages;
1588	pGVM->gmm.s.Reserved.cShadowPages = cShadowPages;
1589	}
1590	}
1591	else
1592	rc = VERR_WRONG_ORDER;
1593	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1594	}
1595	else
1596	rc = VERR_INTERNAL_ERROR_5;
1597	gmmR0MutexRelease(pGMM);
1598	LogFlow(("GMMR0UpdateReservation: returns %Rrc\n", rc));
1599	return rc;
1600	}
1601
1602
1603	/**
1604	* VMMR0 request wrapper for GMMR0UpdateReservation.
1605	*
1606	* @returns see GMMR0UpdateReservation.
1607	* @param pVM Pointer to the shared VM structure.
1608	* @param idCpu VCPU id
1609	* @param pReq The request packet.
1610	*/
1611	GMMR0DECL(int) GMMR0UpdateReservationReq(PVM pVM, VMCPUID idCpu, PGMMUPDATERESERVATIONREQ pReq)
1612	{
1613	/*
1614	* Validate input and pass it on.
1615	*/
1616	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
1617	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1618	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1619
1620	return GMMR0UpdateReservation(pVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages);
1621	}
1622
1623
1624	/**
1625	* Performs sanity checks on a free set.
1626	*
1627	* @returns Error count.
1628	*
1629	* @param pGMM Pointer to the GMM instance.
1630	* @param pSet Pointer to the set.
1631	* @param pszSetName The set name.
1632	* @param pszFunction The function from which it was called.
1633	* @param uLineNo The line number.
1634	*/
1635	static uint32_t gmmR0SanityCheckSet(PGMM pGMM, PGMMCHUNKFREESET pSet, const char *pszSetName,
1636	const char *pszFunction, unsigned uLineNo)
1637	{
1638	uint32_t cErrors = 0;
1639
1640	/*
1641	* Count the free pages in all the chunks and match it against pSet->cFreePages.
1642	*/
1643	uint32_t cPages = 0;
1644	for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1645	{
1646	for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1647	{
1648	/** @todo check that the chunk is hashed into the right set. */
1649	cPages += pCur->cFree;
1650	}
1651	}
1652	if (RT_UNLIKELY(cPages != pSet->cFreePages))
1653	{
1654	SUPR0Printf("GMM insanity: found %#x pages in the %s set, expected %#x. (%s, line %u)\n",
1655	cPages, pszSetName, pSet->cFreePages, pszFunction, uLineNo);
1656	cErrors++;
1657	}
1658
1659	return cErrors;
1660	}
1661
1662
1663	/**
1664	* Performs some sanity checks on the GMM while owning lock.
1665	*
1666	* @returns Error count.
1667	*
1668	* @param pGMM Pointer to the GMM instance.
1669	* @param pszFunction The function from which it is called.
1670	* @param uLineNo The line number.
1671	*/
1672	static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo)
1673	{
1674	uint32_t cErrors = 0;
1675
1676	cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Private, "private", pszFunction, uLineNo);
1677	cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Shared, "shared", pszFunction, uLineNo);
1678	/** @todo add more sanity checks. */
1679
1680	return cErrors;
1681	}
1682
1683
1684	/**
1685	* Looks up a chunk in the tree and fills in the TLB entry for it.
1686	*
1687	* This is not expected to fail and will bitch if it does.
1688	*
1689	* @returns Pointer to the allocation chunk, NULL if not found.
1690	* @param pGMM Pointer to the GMM instance.
1691	* @param idChunk The ID of the chunk to find.
1692	* @param pTlbe Pointer to the TLB entry.
1693	*/
1694	static PGMMCHUNK gmmR0GetChunkSlow(PGMM pGMM, uint32_t idChunk, PGMMCHUNKTLBE pTlbe)
1695	{
1696	PGMMCHUNK pChunk = (PGMMCHUNK)RTAvlU32Get(&pGMM->pChunks, idChunk);
1697	AssertMsgReturn(pChunk, ("Chunk %#x not found!\n", idChunk), NULL);
1698	pTlbe->idChunk = idChunk;
1699	pTlbe->pChunk = pChunk;
1700	return pChunk;
1701	}
1702
1703
1704	/**
1705	* Finds an allocation chunk.
1706	*
1707	* This is not expected to fail and will bitch if it does.
1708	*
1709	* @returns Pointer to the allocation chunk, NULL if not found.
1710	* @param pGMM Pointer to the GMM instance.
1711	* @param idChunk The ID of the chunk to find.
1712	*/
1713	DECLINLINE(PGMMCHUNK) gmmR0GetChunk(PGMM pGMM, uint32_t idChunk)
1714	{
1715	/*
1716	* Do a TLB lookup, branch if not in the TLB.
1717	*/
1718	PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
1719	if ( pTlbe->idChunk != idChunk
1720	|| !pTlbe->pChunk)
1721	return gmmR0GetChunkSlow(pGMM, idChunk, pTlbe);
1722	return pTlbe->pChunk;
1723	}
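gmmR0GetChunk and gmmR0GetChunkSlow together form a direct-mapped lookup cache in front of the AVL tree. The sketch below shows the same fast-path/slow-path split under simplifying assumptions: a plain array stands in for the tree, the cache has eight entries, and every `sketch`/`SKETCH` name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TLB_ENTRIES 8u                   /* assumed power of two */
#define SKETCH_TLB_IDX(id) ((id) & (SKETCH_TLB_ENTRIES - 1u))
#define SKETCH_MAX_CHUNKS  32u

typedef struct SKETCHCHUNK { uint32_t idChunk; } SKETCHCHUNK;
typedef struct SKETCHTLBE  { uint32_t idChunk; SKETCHCHUNK *pChunk; } SKETCHTLBE;

static SKETCHCHUNK g_aSketchChunks[SKETCH_MAX_CHUNKS]; /* stand-in for the tree */
static SKETCHTLBE  g_aSketchTlb[SKETCH_TLB_ENTRIES];
static unsigned    g_cSketchSlowLookups;               /* counts slow-path hits */

/* Slow path: consult the backing store and refill the cache entry. */
static SKETCHCHUNK *sketchGetChunkSlow(uint32_t idChunk, SKETCHTLBE *pTlbe)
{
    g_cSketchSlowLookups++;
    if (idChunk == 0 || idChunk >= SKETCH_MAX_CHUNKS)
        return NULL;                            /* ID 0 is treated as NIL here */
    SKETCHCHUNK *pChunk = &g_aSketchChunks[idChunk];
    pChunk->idChunk = idChunk;
    pTlbe->idChunk  = idChunk;
    pTlbe->pChunk   = pChunk;
    return pChunk;
}

/* Fast path: one compare against the entry selected by the low ID bits. */
static SKETCHCHUNK *sketchGetChunk(uint32_t idChunk)
{
    SKETCHTLBE *pTlbe = &g_aSketchTlb[SKETCH_TLB_IDX(idChunk)];
    if (pTlbe->idChunk != idChunk || !pTlbe->pChunk)
        return sketchGetChunkSlow(idChunk, pTlbe);
    return pTlbe->pChunk;
}
```

In this sketch, IDs 5 and 13 share a cache entry, so alternating between them goes through the slow path every time. That is the usual cost of a direct-mapped design.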
1724
1725
1726	/**
1727	* Finds a page.
1728	*
1729	* This is not expected to fail and will bitch if it does.
1730	*
1731	* @returns Pointer to the page, NULL if not found.
1732	* @param pGMM Pointer to the GMM instance.
1733	* @param idPage The ID of the page to find.
1734	*/
1735	DECLINLINE(PGMMPAGE) gmmR0GetPage(PGMM pGMM, uint32_t idPage)
1736	{
1737	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1738	if (RT_LIKELY(pChunk))
1739	return &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
1740	return NULL;
1741	}
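The shift and mask in gmmR0GetPage undo the ID composition described in the file header, idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage. Here is a small worked sketch of the round trip. The 2 MB chunk and 4 KB page sizes are assumptions made for the example, and the `SKETCH_`/`sketch` names are not from the GMM.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes: 2 MB chunks of 4 KB pages, i.e. 512 pages per chunk. */
#define SKETCH_CHUNK_SHIFT   9u     /* log2(2 MB / 4 KB) = log2(512) */
#define SKETCH_PAGE_IDX_MASK ((1u << SKETCH_CHUNK_SHIFT) - 1u)

/* Composes a page ID from a chunk ID and a page index within the chunk. */
static uint32_t sketchMakePageId(uint32_t idChunk, uint32_t iPage)
{
    assert(iPage <= SKETCH_PAGE_IDX_MASK);
    return (idChunk << SKETCH_CHUNK_SHIFT) | iPage;
}

/* Extracts the chunk ID (upper bits) from a page ID. */
static uint32_t sketchPageIdToChunkId(uint32_t idPage)
{
    return idPage >> SKETCH_CHUNK_SHIFT;
}

/* Extracts the page index within the chunk (lower bits) from a page ID. */
static uint32_t sketchPageIdToIndex(uint32_t idPage)
{
    return idPage & SKETCH_PAGE_IDX_MASK;
}
```

With these assumed sizes, page 17 of chunk 3 gets the ID 3 * 512 + 17 = 1553, and both parts are recovered with one shift and one mask.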
1742
1743
1744	/**
1745	* Gets the host physical address for a page given by its ID.
1746	*
1747	* @returns The host physical address or NIL_RTHCPHYS.
1748	* @param pGMM Pointer to the GMM instance.
1749	* @param idPage The ID of the page to find.
1750	*/
1751	DECLINLINE(RTHCPHYS) gmmR0GetPageHCPhys(PGMM pGMM, uint32_t idPage)
1752	{
1753	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1754	if (RT_LIKELY(pChunk))
1755	return RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, idPage & GMM_PAGEID_IDX_MASK);
1756	return NIL_RTHCPHYS;
1757	}
1758
1759
1760	/**
1761	* Selects the appropriate free list given the number of free pages.
1762	*
1763	* @returns Free list index.
1764	* @param cFree The number of free pages.
1765	*/
1766	DECLINLINE(unsigned) gmmR0SelectFreeSetList(unsigned cFree)
1767	{
1768	return (cFree - 1) >> GMM_CHUNK_FREE_SET_SHIFT;
1769	}
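gmmR0SelectFreeSetList buckets chunks by free-page count, so each list holds chunks with a similar number of free pages. A sketch of the same arithmetic follows, assuming a shift of 4 (16 counts per list) and 512 pages per chunk. Both values and the `sketch` names are assumptions for illustration.

```c
#include <assert.h>

#define SKETCH_FREE_SET_SHIFT 4u    /* assumed: 16 free-page counts per list */

/*
 * Maps a free-page count in the range 1..N to a list index.  The "- 1" makes
 * a count of exactly 16 land in list 0 rather than list 1.
 */
static unsigned sketchSelectFreeSetList(unsigned cFree)
{
    assert(cFree > 0);              /* cFree == 0 would wrap in the subtraction */
    return (cFree - 1) >> SKETCH_FREE_SET_SHIFT;
}
```

Under these assumptions, counts 1 to 16 map to list 0, 17 to 32 to list 1, and a completely free chunk with 512 pages maps to list 31, the last one. A count of zero is never passed in by the link code, which only links chunks that have free pages.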
1770
1771
1772	/**
1773	* Unlinks the chunk from the free list it's currently on (if any).
1774	*
1775	* @param pChunk The allocation chunk.
1776	*/
1777	DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk)
1778	{
1779	PGMMCHUNKFREESET pSet = pChunk->pSet;
1780	if (RT_LIKELY(pSet))
1781	{
1782	pSet->cFreePages -= pChunk->cFree;
1783	pSet->idGeneration++;
1784
1785	PGMMCHUNK pPrev = pChunk->pFreePrev;
1786	PGMMCHUNK pNext = pChunk->pFreeNext;
1787	if (pPrev)
1788	pPrev->pFreeNext = pNext;
1789	else
1790	pSet->apLists[gmmR0SelectFreeSetList(pChunk->cFree)] = pNext;
1791	if (pNext)
1792	pNext->pFreePrev = pPrev;
1793
1794	pChunk->pSet = NULL;
1795	pChunk->pFreeNext = NULL;
1796	pChunk->pFreePrev = NULL;
1797	}
1798	else
1799	{
1800	Assert(!pChunk->pFreeNext);
1801	Assert(!pChunk->pFreePrev);
1802	Assert(!pChunk->cFree);
1803	}
1804	}
1805
1806
1807	/**
1808	* Links the chunk onto the appropriate free list in the specified free set.
1809	*
1810	* If no free entries, it's not linked into any list.
1811	*
1812	* @param pChunk The allocation chunk.
1813	* @param pSet The free set.
1814	*/
1815	DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet)
1816	{
1817	Assert(!pChunk->pSet);
1818	Assert(!pChunk->pFreeNext);
1819	Assert(!pChunk->pFreePrev);
1820
1821	if (pChunk->cFree > 0)
1822	{
1823	pChunk->pSet = pSet;
1824	pChunk->pFreePrev = NULL;
1825	unsigned const iList = gmmR0SelectFreeSetList(pChunk->cFree);
1826	pChunk->pFreeNext = pSet->apLists[iList];
1827	if (pChunk->pFreeNext)
1828	pChunk->pFreeNext->pFreePrev = pChunk;
1829	pSet->apLists[iList] = pChunk;
1830
1831	pSet->cFreePages += pChunk->cFree;
1832	pSet->idGeneration++;
1833	}
1834	}
1835
1836
1837	/**
1838	* Frees a Chunk ID.
1839	*
1840	* @param pGMM Pointer to the GMM instance.
1841	* @param idChunk The Chunk ID to free.
1842	*/
1843	static void gmmR0FreeChunkId(PGMM pGMM, uint32_t idChunk)
1844	{
1845	AssertReturnVoid(idChunk != NIL_GMM_CHUNKID);
1846	AssertMsg(ASMBitTest(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk));
1847	ASMAtomicBitClear(&pGMM->bmChunkId[0], idChunk);
1848	}
1849
1850
1851	/**
1852	* Allocates a new Chunk ID.
1853	*
1854	* @returns The Chunk ID.
1855	* @param pGMM Pointer to the GMM instance.
1856	*/
1857	static uint32_t gmmR0AllocateChunkId(PGMM pGMM)
1858	{
1859	AssertCompile(!((GMM_CHUNKID_LAST + 1) & 31)); /* must be a multiple of 32 */
1860	AssertCompile(NIL_GMM_CHUNKID == 0);
1861
1862	/*
1863	* Try the next sequential one.
1864	*/
1865	int32_t idChunk = ++pGMM->idChunkPrev;
1866	#if 0 /** @todo enable this code */
1867	if ( idChunk <= GMM_CHUNKID_LAST
1868	&& idChunk > NIL_GMM_CHUNKID
1869	&& !ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk))
1870	return idChunk;
1871	#endif
1872
1873	/*
1874	* Scan sequentially from the last one.
1875	*/
1876	if ( (uint32_t)idChunk < GMM_CHUNKID_LAST
1877	&& idChunk > NIL_GMM_CHUNKID)
1878	{
1879	idChunk = ASMBitNextClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1, idChunk);
1880	if (idChunk > NIL_GMM_CHUNKID)
1881	{
1882	AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1883	return pGMM->idChunkPrev = idChunk;
1884	}
1885	}
1886
1887	/*
1888	* Ok, scan from the start.
1889	* We're not racing anyone, so there is no need to expect failures or have restart loops.
1890	*/
1891	idChunk = ASMBitFirstClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1);
1892	AssertMsgReturn(idChunk > NIL_GMM_CHUNKID, ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1893	AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
1894
1895	return pGMM->idChunkPrev = idChunk;
1896	}
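The ID allocator above scans a bitmap upwards from the previously issued ID and falls back to a scan from the start. A simplified, portable sketch of that strategy follows. It uses plain loops instead of the ASMBit* helpers, a hypothetical ID range of 1..63, and it assumes that bit 0 (the NIL ID) is marked as taken at initialisation. All `sketch`/`SKETCH` names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ID_LAST 63u  /* hypothetical; (LAST + 1) is a multiple of 32 */
#define SKETCH_NIL_ID  0u

typedef struct SKETCHIDS
{
    uint32_t bm[(SKETCH_ID_LAST + 1) / 32];     /* one bit per ID */
    uint32_t idPrev;                            /* last ID handed out */
} SKETCHIDS;

static int sketchBitTest(const SKETCHIDS *p, uint32_t id)
{
    return (p->bm[id / 32] >> (id % 32)) & 1u;
}

static void sketchIdsInit(SKETCHIDS *p)
{
    for (unsigned i = 0; i < (SKETCH_ID_LAST + 1) / 32; i++)
        p->bm[i] = 0;
    p->bm[0] |= 1u;                             /* reserve the NIL ID */
    p->idPrev = SKETCH_NIL_ID;
}

/* Returns a free ID, or SKETCH_NIL_ID when every ID is in use. */
static uint32_t sketchAllocId(SKETCHIDS *p)
{
    /* Pass 1: scan upwards from the last ID handed out. */
    for (uint32_t id = p->idPrev + 1; id <= SKETCH_ID_LAST; id++)
        if (!sketchBitTest(p, id))
        {
            p->bm[id / 32] |= UINT32_C(1) << (id % 32);
            return p->idPrev = id;
        }
    /* Pass 2: wrap around and scan from the start. */
    for (uint32_t id = 1; id <= SKETCH_ID_LAST; id++)
        if (!sketchBitTest(p, id))
        {
            p->bm[id / 32] |= UINT32_C(1) << (id % 32);
            return p->idPrev = id;
        }
    return SKETCH_NIL_ID;
}

static void sketchFreeId(SKETCHIDS *p, uint32_t id)
{
    assert(id != SKETCH_NIL_ID && id <= SKETCH_ID_LAST && sketchBitTest(p, id));
    p->bm[id / 32] &= ~(UINT32_C(1) << (id % 32));
}
```

Scanning from the previous ID means freed IDs are not reused immediately, so a stale ID is more likely to be detected than to silently alias a new chunk.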
1897
1898
1899	/**
1900	* Registers a new chunk of memory.
1901	*
1902	* This is called by both gmmR0AllocateOneChunk and GMMR0SeedChunk.
1903	*
1904	* @returns VBox status code. On success, the giant GMM lock will be held, the
1905	* caller must release it (ugly).
1906	* @param pGMM Pointer to the GMM instance.
1907	* @param pSet Pointer to the set.
1908	* @param MemObj The memory object for the chunk.
1909	* @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
1910	* affinity.
1911	* @param fChunkFlags The chunk flags, GMM_CHUNK_FLAGS_XXX.
1912	* @param ppChunk Chunk address (out). Optional.
1913	*
1914	* @remarks The caller must not own the giant GMM mutex.
1915	* The giant GMM mutex will be acquired and returned acquired in
1916	* the success path. On failure, no locks will be held.
1917	*/
1918	static int gmmR0RegisterChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, RTR0MEMOBJ MemObj, uint16_t hGVM, uint16_t fChunkFlags,
1919	PGMMCHUNK *ppChunk)
1920	{
1921	Assert(pGMM->hMtxOwner != RTThreadNativeSelf());
1922	Assert(hGVM != NIL_GVM_HANDLE || pGMM->fBoundMemoryMode);
1923	Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE);
1924
1925	int rc;
1926	PGMMCHUNK pChunk = (PGMMCHUNK)RTMemAllocZ(sizeof(*pChunk));
1927	if (pChunk)
1928	{
1929	/*
1930	* Initialize it.
1931	*/
1932	pChunk->hMemObj = MemObj;
1933	pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1934	pChunk->hGVM = hGVM;
1935	/*pChunk->iFreeHead = 0;*/
1936	pChunk->idNumaNode = GMM_CHUNK_NUMA_ID_UNKNOWN;
1937	pChunk->iChunkMtx = UINT8_MAX;
1938	pChunk->fFlags = fChunkFlags;
1939	for (unsigned iPage = 0; iPage < RT_ELEMENTS(pChunk->aPages) - 1; iPage++)
1940	{
1941	pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1942	pChunk->aPages[iPage].Free.iNext = iPage + 1;
1943	}
1944	pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.u2State = GMM_PAGE_STATE_FREE;
1945	pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.iNext = UINT16_MAX;
1946
1947	/*
1948	* Allocate a Chunk ID and insert it into the tree.
1949	* This has to be done behind the mutex of course.
1950	*/
1951	rc = gmmR0MutexAcquire(pGMM);
1952	if (RT_SUCCESS(rc))
1953	{
1954	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1955	{
1956	pChunk->Core.Key = gmmR0AllocateChunkId(pGMM);
1957	if ( pChunk->Core.Key != NIL_GMM_CHUNKID
1958	&& pChunk->Core.Key <= GMM_CHUNKID_LAST
1959	&& RTAvlU32Insert(&pGMM->pChunks, &pChunk->Core))
1960	{
1961	pGMM->cChunks++;
1962	RTListAppend(&pGMM->ChunkList, &pChunk->ListNode);
1963	gmmR0LinkChunk(pChunk, pSet);
1964	LogFlow(("gmmR0RegisterChunk: pChunk=%p id=%#x cChunks=%d\n", pChunk, pChunk->Core.Key, pGMM->cChunks));
1965
1966	if (ppChunk)
1967	*ppChunk = pChunk;
1968	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1969	return VINF_SUCCESS;
1970	}
1971
1972	/* bail out */
1973	rc = VERR_INTERNAL_ERROR;
1974	}
1975	else
1976	rc = VERR_INTERNAL_ERROR_5;
1977	gmmR0MutexRelease(pGMM);
1978	}
1979
1980	RTMemFree(pChunk);
1981	}
1982	else
1983	rc = VERR_NO_MEMORY;
1984	return rc;
1985	}
1986
1987
1988	/**
1989	* Allocate one new chunk and add it to the specified free set.
1990	*
1991	* @returns VBox status code.
1992	* @param pGMM Pointer to the GMM instance.
1993	* @param pSet Pointer to the set.
1994	* @param hGVM The affinity of the new chunk.
1995	*
1996	* @remarks The giant mutex will be temporarily abandoned during the allocation.
1997	*/
1998	static int gmmR0AllocateOneChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, uint16_t hGVM)
1999	{
2000	/*
2001	* Allocate the memory.
2002	*
2003	* Note! We leave the giant GMM lock temporarily as the allocation might
2004	* take a long time. gmmR0RegisterChunk reacquires it (ugly).
2005	*/
2006	gmmR0MutexRelease(pGMM);
2007
2008	RTR0MEMOBJ hMemObj;
2009	int rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2010	/** @todo Check that RTR0MemObjAllocPhysNC always returns VERR_NO_MEMORY on
2011	* allocation failure. */
2012	if (RT_SUCCESS(rc))
2013	{
2014	rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, hGVM, 0 /*fChunkFlags*/, NULL);
2015	if (RT_SUCCESS(rc))
2016	return rc;
2017
2018	RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
2019	}
2020
2021	int rc2 = gmmR0MutexAcquire(pGMM);
2022	AssertRCReturn(rc2, RT_FAILURE(rc) ? rc : rc2);
2023	return rc;
2024	}
2025
2026
2027	/**
2028	* Attempts to allocate more pages until the requested amount is met.
2029	*
2030	* @returns VBox status code.
2031	* @param pGMM Pointer to the GMM instance data.
2032	* @param pGVM The calling VM.
2033	* @param pSet Pointer to the free set to grow.
2034	* @param cPages The number of pages needed.
2035	*
2036	* @remarks Called owning the mutex, but will leave it temporarily while
2037	* allocating the memory!
2038	*/
2039	static int gmmR0AllocateMoreChunks(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet, uint32_t cPages)
2040	{
2041	Assert(!pGMM->fLegacyAllocationMode);
2042
2043	if (!GMM_CHECK_SANITY_IN_LOOPS(pGMM))
2044	return VERR_INTERNAL_ERROR_4;
2045
2046	if (!pGMM->fBoundMemoryMode)
2047	{
2048	/*
2049	* Try to steal free chunks from the other set first. (Only take 100% free chunks.)
2050	*/
2051	PGMMCHUNKFREESET pOtherSet = pSet == &pGMM->Private ? &pGMM->Shared : &pGMM->Private;
2052	while ( pSet->cFreePages < cPages
2053	&& pOtherSet->cFreePages >= GMM_CHUNK_NUM_PAGES)
2054	{
2055	PGMMCHUNK pChunk = pOtherSet->apLists[RT_ELEMENTS(pOtherSet->apLists) - 1];
2056	while (pChunk && pChunk->cFree != GMM_CHUNK_NUM_PAGES)
2057	pChunk = pChunk->pFreeNext;
2058	if (!pChunk)
2059	break;
2060
2061	gmmR0UnlinkChunk(pChunk);
2062	gmmR0LinkChunk(pChunk, pSet);
2063	}
2064
2065	/*
2066	* If we still need more pages, allocate new chunks.
2067	* Note! We will leave the mutex while doing the allocation.
2068	*/
2069	while (pSet->cFreePages < cPages)
2070	{
2071	int rc = gmmR0AllocateOneChunk(pGMM, pSet, pGVM->hSelf);
2072	if (RT_FAILURE(rc))
2073	return rc;
2074	if (!GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2075	return VERR_INTERNAL_ERROR_5;
2076	}
2077	}
2078	else
2079	{
2080	/*
2081	* The memory is bound to the VM allocating it, so we have to count
2082	* the free pages carefully as well as making sure we brand them with
2083	* our VM handle.
2084	*
2085	* Note! We will leave the mutex while doing the allocation.
2086	*/
2087	uint16_t const hGVM = pGVM->hSelf;
2088	for (;;)
2089	{
2090	/* Count and see if we've reached the goal. */
2091	uint32_t cPagesFound = 0;
2092	for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
2093	for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
2094	if (pCur->hGVM == hGVM)
2095	{
2096	cPagesFound += pCur->cFree;
2097	if (cPagesFound >= cPages)
2098	break;
2099	}
2100	if (cPagesFound >= cPages)
2101	break;
2102
2103	/* Allocate more. */
2104	int rc = gmmR0AllocateOneChunk(pGMM, pSet, hGVM);
2105	if (RT_FAILURE(rc))
2106	return rc;
2107	if (!GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2108	return VERR_INTERNAL_ERROR_5;
2109	}
2110	}
2111
2112	return VINF_SUCCESS;
2113	}
2114
2115
2116	/**
2117	* Allocates one private page.
2118	*
2119	* Worker for gmmR0AllocatePages.
2120	*
2121	* @param pGMM Pointer to the GMM instance data.
2122	* @param hGVM The GVM handle of the VM requesting memory.
2123	* @param pChunk The chunk to allocate it from.
2124	* @param pPageDesc The page descriptor.
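	*
	* @remarks The page ID written to the descriptor packs the chunk ID and the
	* page index within the chunk. A worked example, assuming 2 MB chunks
	* and 4 KB pages so that GMM_CHUNKID_SHIFT would be 9 (check the
	* actual definitions before relying on these numbers):
	* @code
	* idPage = (idChunk << GMM_CHUNKID_SHIFT) | iPage;
	* // idChunk = 0x3, iPage = 0x5  ->  (0x3 << 9) | 0x5 = 0x605
	* @endcode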
2125	*/
2126	static void gmmR0AllocatePage(PGMM pGMM, uint32_t hGVM, PGMMCHUNK pChunk, PGMMPAGEDESC pPageDesc)
2127	{
2128	/* update the chunk stats. */
2129	if (pChunk->hGVM == NIL_GVM_HANDLE)
2130	pChunk->hGVM = hGVM;
2131	Assert(pChunk->cFree);
2132	pChunk->cFree--;
2133	pChunk->cPrivate++;
2134
2135	/* unlink the first free page. */
2136	const uint32_t iPage = pChunk->iFreeHead;
2137	AssertReleaseMsg(iPage < RT_ELEMENTS(pChunk->aPages), ("%d\n", iPage));
2138	PGMMPAGE pPage = &pChunk->aPages[iPage];
2139	Assert(GMM_PAGE_IS_FREE(pPage));
2140	pChunk->iFreeHead = pPage->Free.iNext;
2141	Log3(("A pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x iNext=%#x\n",
2142	pPage, iPage, (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage,
2143	pPage->Common.u2State, pChunk->iFreeHead, pPage->Free.iNext));
2144
2145	/* make the page private. */
2146	pPage->u = 0;
2147	AssertCompile(GMM_PAGE_STATE_PRIVATE == 0);
2148	pPage->Private.hGVM = hGVM;
2149	AssertCompile(NIL_RTHCPHYS >= GMM_GCPHYS_LAST);
2150	AssertCompile(GMM_GCPHYS_UNSHAREABLE >= GMM_GCPHYS_LAST);
2151	if (pPageDesc->HCPhysGCPhys <= GMM_GCPHYS_LAST)
2152	pPage->Private.pfn = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
2153	else
2154	pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
2155
2156	/* update the page descriptor. */
2157	pPageDesc->HCPhysGCPhys = RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);
2158	Assert(pPageDesc->HCPhysGCPhys != NIL_RTHCPHYS);
2159	pPageDesc->idPage = (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage;
2160	pPageDesc->idSharedPage = NIL_GMM_PAGEID;
2161	}
2162
2163
2164	/**
2165	* Common worker for GMMR0AllocateHandyPages and GMMR0AllocatePages.
2166	*
2167	* @returns VBox status code:
2168	* @retval VINF_SUCCESS on success.
2169	* @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk or
2170	* gmmR0AllocateMoreChunks is necessary.
2171	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2172	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2173	* that is we're trying to allocate more than we've reserved.
2174	*
2175	* @param pGMM Pointer to the GMM instance data.
2176	* @param pGVM Pointer to the shared VM structure.
2177	* @param cPages The number of pages to allocate.
2178	* @param paPages Pointer to the page descriptors.
2179	* See GMMPAGEDESC for details on what is expected on input.
2180	* @param enmAccount The account to charge.
2181	*/
2182	static int gmmR0AllocatePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2183	{
2184	/*
2185	* Check allocation limits.
2186	*/
2187	if (RT_UNLIKELY(pGMM->cAllocatedPages + cPages > pGMM->cMaxPages))
2188	return VERR_GMM_HIT_GLOBAL_LIMIT;
2189
2190	switch (enmAccount)
2191	{
2192	case GMMACCOUNT_BASE:
2193	if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cBasePages + pGVM->gmm.s.cBalloonedPages + cPages > pGVM->gmm.s.Reserved.cBasePages))
2194	{
2195	Log(("gmmR0AllocatePages:Base: Reserved=%#llx Allocated+Ballooned+Requested=%#llx+%#llx+%#x!\n",
2196	pGVM->gmm.s.Reserved.cBasePages, pGVM->gmm.s.Allocated.cBasePages, pGVM->gmm.s.cBalloonedPages, cPages));
2197	return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2198	}
2199	break;
2200	case GMMACCOUNT_SHADOW:
2201	if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cShadowPages + cPages > pGVM->gmm.s.Reserved.cShadowPages))
2202	{
2203	Log(("gmmR0AllocatePages:Shadow: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
2204	pGVM->gmm.s.Reserved.cShadowPages, pGVM->gmm.s.Allocated.cShadowPages, cPages));
2205	return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2206	}
2207	break;
2208	case GMMACCOUNT_FIXED:
2209	if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cFixedPages + cPages > pGVM->gmm.s.Reserved.cFixedPages))
2210	{
2211	Log(("gmmR0AllocatePages:Fixed: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
2212	pGVM->gmm.s.Reserved.cFixedPages, pGVM->gmm.s.Allocated.cFixedPages, cPages));
2213	return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2214	}
2215	break;
2216	default:
2217	AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
2218	}
2219
2220	/*
2221	* Check if we need to allocate more memory or not. In bound memory mode this
2222	* is a bit of extra work, but it's easier to do it upfront than to bail out later.
2223	*/
2224	PGMMCHUNKFREESET pSet = &pGMM->Private;
2225	if (pSet->cFreePages < cPages)
2226	return VERR_GMM_SEED_ME;
2227	if (pGMM->fBoundMemoryMode)
2228	{
2229	uint16_t hGVM = pGVM->hSelf;
2230	uint32_t cPagesFound = 0;
2231	for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
2232	for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
2233	if (pCur->hGVM == hGVM)
2234	{
2235	cPagesFound += pCur->cFree;
2236	if (cPagesFound >= cPages)
2237	break;
2238	}
2239	if (cPagesFound < cPages)
2240	return VERR_GMM_SEED_ME;
2241	}
2242
2243	/*
2244	* Pick the pages.
2245	* Make some effort to keep VMs from sharing private chunks.
2246	*/
2247	uint16_t hGVM = pGVM->hSelf;
2248	uint32_t iPage = 0;
2249
2250	/* first round, pick from chunks with an affinity to the VM. */
2251	for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists) && iPage < cPages; i++)
2252	{
2253	PGMMCHUNK pCurFree = NULL;
2254	PGMMCHUNK pCur = pSet->apLists[i];
2255	while (pCur && iPage < cPages)
2256	{
2257	PGMMCHUNK pNext = pCur->pFreeNext;
2258
2259	if ( pCur->hGVM == hGVM
2260	&& pCur->cFree < GMM_CHUNK_NUM_PAGES)
2261	{
2262	gmmR0UnlinkChunk(pCur);
2263	for (; pCur->cFree && iPage < cPages; iPage++)
2264	gmmR0AllocatePage(pGMM, hGVM, pCur, &paPages[iPage]);
2265	gmmR0LinkChunk(pCur, pSet);
2266	}
2267
2268	pCur = pNext;
2269	}
2270	}
2271
2272	if (iPage < cPages)
2273	{
2274	/* second round, pick pages from the 100% empty chunks we just skipped above. */
2275	PGMMCHUNK pCurFree = NULL;
2276	PGMMCHUNK pCur = pSet->apLists[RT_ELEMENTS(pSet->apLists) - 1];
2277	while (pCur && iPage < cPages)
2278	{
2279	PGMMCHUNK pNext = pCur->pFreeNext;
2280
2281	if ( pCur->cFree == GMM_CHUNK_NUM_PAGES
2282	&& ( pCur->hGVM == hGVM
2283	|| !pGMM->fBoundMemoryMode))
2284	{
2285	gmmR0UnlinkChunk(pCur);
2286	for (; pCur->cFree && iPage < cPages; iPage++)
2287	gmmR0AllocatePage(pGMM, hGVM, pCur, &paPages[iPage]);
2288	gmmR0LinkChunk(pCur, pSet);
2289	}
2290
2291	pCur = pNext;
2292	}
2293	}
2294
2295	if ( iPage < cPages
2296	&& !pGMM->fBoundMemoryMode)
2297	{
2298	/* third round, disregard affinity. */
2299	unsigned i = RT_ELEMENTS(pSet->apLists);
2300	while (i-- > 0 && iPage < cPages)
2301	{
2302	PGMMCHUNK pCurFree = NULL;
2303	PGMMCHUNK pCur = pSet->apLists[i];
2304	while (pCur && iPage < cPages)
2305	{
2306	PGMMCHUNK pNext = pCur->pFreeNext;
2307
2308	if ( pCur->cFree > GMM_CHUNK_NUM_PAGES / 2
2309	&& cPages >= GMM_CHUNK_NUM_PAGES / 2)
2310	pCur->hGVM = hGVM; /* change chunk affinity */
2311
2312	gmmR0UnlinkChunk(pCur);
2313	for (; pCur->cFree && iPage < cPages; iPage++)
2314	gmmR0AllocatePage(pGMM, hGVM, pCur, &paPages[iPage]);
2315	gmmR0LinkChunk(pCur, pSet);
2316
2317	pCur = pNext;
2318	}
2319	}
2320	}
2321
2322	/*
2323	* Update the account.
2324	*/
2325	switch (enmAccount)
2326	{
2327	case GMMACCOUNT_BASE: pGVM->gmm.s.Allocated.cBasePages += iPage; break;
2328	case GMMACCOUNT_SHADOW: pGVM->gmm.s.Allocated.cShadowPages += iPage; break;
2329	case GMMACCOUNT_FIXED: pGVM->gmm.s.Allocated.cFixedPages += iPage; break;
2330	default:
2331	AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
2332	}
2333	pGVM->gmm.s.cPrivatePages += iPage;
2334	pGMM->cAllocatedPages += iPage;
2335
2336	AssertMsgReturn(iPage == cPages, ("%u != %u\n", iPage, cPages), VERR_INTERNAL_ERROR);
2337
2338	/*
2339	* Check if we've reached some threshold and should kick one or two VMs and tell
2340	* them to inflate their balloons a bit more... later.
2341	*/
2342
2343	return VINF_SUCCESS;
2344	}
2345
2346
2347	/**
2348	* Updates the previous allocations and allocates more pages.
2349	*
2350	* The handy pages are always taken from the 'base' memory account.
2351	* The allocated pages are not cleared and will contain random garbage.
2352	*
2353	* @returns VBox status code:
2354	* @retval VINF_SUCCESS on success.
2355	* @retval VERR_NOT_OWNER if the caller is not an EMT.
2356	* @retval VERR_GMM_PAGE_NOT_FOUND if one of the pages to update wasn't found.
2357	* @retval VERR_GMM_PAGE_NOT_PRIVATE if one of the pages to update wasn't a
2358	* private page.
2359	* @retval VERR_GMM_PAGE_NOT_SHARED if one of the pages to update wasn't a
2360	* shared page.
2361	* @retval VERR_GMM_NOT_PAGE_OWNER if one of the pages to be updated wasn't
2362	* owned by the VM.
2363	* @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2364	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2365	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2366	* that is we're trying to allocate more than we've reserved.
2367	*
2368	* @param pVM Pointer to the shared VM structure.
2369	* @param idCpu VCPU id
2370	* @param cPagesToUpdate The number of pages to update (starting from the head).
2371	* @param cPagesToAlloc The number of pages to allocate (starting from the head).
2372	* @param paPages The array of page descriptors.
2373	* See GMMPAGEDESC for details on what is expected on input.
2374	* @thread EMT.
2375	*/
2376	GMMR0DECL(int) GMMR0AllocateHandyPages(PVM pVM, VMCPUID idCpu, uint32_t cPagesToUpdate, uint32_t cPagesToAlloc, PGMMPAGEDESC paPages)
2377	{
2378	LogFlow(("GMMR0AllocateHandyPages: pVM=%p cPagesToUpdate=%#x cPagesToAlloc=%#x paPages=%p\n",
2379	pVM, cPagesToUpdate, cPagesToAlloc, paPages));
2380
2381	/*
2382	* Validate, get basics and take the semaphore.
2383	* (This is a relatively busy path, so make predictions where possible.)
2384	*/
2385	PGMM pGMM;
2386	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2387	PGVM pGVM;
2388	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2389	if (RT_FAILURE(rc))
2390	return rc;
2391
2392	AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2393	AssertMsgReturn( (cPagesToUpdate && cPagesToUpdate < 1024)
2394	|| (cPagesToAlloc && cPagesToAlloc < 1024),
2395	("cPagesToUpdate=%#x cPagesToAlloc=%#x\n", cPagesToUpdate, cPagesToAlloc),
2396	VERR_INVALID_PARAMETER);
2397
2398	unsigned iPage = 0;
2399	for (; iPage < cPagesToUpdate; iPage++)
2400	{
2401	AssertMsgReturn( ( paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2402	&& !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK))
2403	|| paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2404	|| paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE,
2405	("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys),
2406	VERR_INVALID_PARAMETER);
2407	AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2408	/*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2409	("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2410	AssertMsgReturn( paPages[iPage].idSharedPage <= GMM_PAGEID_LAST
2411	/*|| paPages[iPage].idSharedPage == NIL_GMM_PAGEID*/,
2412	("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2413	}
2414
2415	for (; iPage < cPagesToAlloc; iPage++)
2416	{
2417	AssertMsgReturn(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS, ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys), VERR_INVALID_PARAMETER);
2418	AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2419	AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2420	}
2421
2422	gmmR0MutexAcquire(pGMM);
2423	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2424	{
2425	/* No allocations before the initial reservation has been made! */
2426	if (RT_LIKELY( pGVM->gmm.s.Reserved.cBasePages
2427	&& pGVM->gmm.s.Reserved.cFixedPages
2428	&& pGVM->gmm.s.Reserved.cShadowPages))
2429	{
2430	/*
2431	* Perform the updates.
2432	* Stop on the first error.
2433	*/
2434	for (iPage = 0; iPage < cPagesToUpdate; iPage++)
2435	{
2436	if (paPages[iPage].idPage != NIL_GMM_PAGEID)
2437	{
2438	PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idPage);
2439	if (RT_LIKELY(pPage))
2440	{
2441	if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2442	{
2443	if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
2444	{
2445	AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2446	if (RT_LIKELY(paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST))
2447	pPage->Private.pfn = paPages[iPage].HCPhysGCPhys >> PAGE_SHIFT;
2448	else if (paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE)
2449	pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
2450	/* else: NIL_RTHCPHYS nothing */
2451
2452	paPages[iPage].idPage = NIL_GMM_PAGEID;
2453	paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2454	}
2455	else
2456	{
2457	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not owner! hGVM=%#x hSelf=%#x\n",
2458	iPage, paPages[iPage].idPage, pPage->Private.hGVM, pGVM->hSelf));
2459	rc = VERR_GMM_NOT_PAGE_OWNER;
2460	break;
2461	}
2462	}
2463	else
2464	{
2465	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not private! %.*Rhxs (type %d)\n", iPage, paPages[iPage].idPage, sizeof(*pPage), pPage, pPage->Common.u2State));
2466	rc = VERR_GMM_PAGE_NOT_PRIVATE;
2467	break;
2468	}
2469	}
2470	else
2471	{
2472	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (private)\n", iPage, paPages[iPage].idPage));
2473	rc = VERR_GMM_PAGE_NOT_FOUND;
2474	break;
2475	}
2476	}
2477
2478	if (paPages[iPage].idSharedPage != NIL_GMM_PAGEID)
2479	{
2480	PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idSharedPage);
2481	if (RT_LIKELY(pPage))
2482	{
2483	if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
2484	{
2485	AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2486	Assert(pPage->Shared.cRefs);
2487	Assert(pGVM->gmm.s.cSharedPages);
2488	Assert(pGVM->gmm.s.Allocated.cBasePages);
2489
2490	Log(("GMMR0AllocateHandyPages: free shared page %x cRefs=%d\n", paPages[iPage].idSharedPage, pPage->Shared.cRefs));
2491	pGVM->gmm.s.cSharedPages--;
2492	pGVM->gmm.s.Allocated.cBasePages--;
2493	if (!--pPage->Shared.cRefs)
2494	{
2495	gmmR0FreeSharedPage(pGMM, paPages[iPage].idSharedPage, pPage);
2496	}
2497	else
2498	{
2499	Assert(pGMM->cDuplicatePages);
2500	pGMM->cDuplicatePages--;
2501	}
2502
2503	paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2504	}
2505	else
2506	{
2507	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not shared!\n", iPage, paPages[iPage].idSharedPage));
2508	rc = VERR_GMM_PAGE_NOT_SHARED;
2509	break;
2510	}
2511	}
2512	else
2513	{
2514	Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (shared)\n", iPage, paPages[iPage].idSharedPage));
2515	rc = VERR_GMM_PAGE_NOT_FOUND;
2516	break;
2517	}
2518	}
2519	}
2520
2521	/*
2522	* Join paths with GMMR0AllocatePages for the allocation.
2523	* Note! gmmR0AllocateMoreChunks may leave the protection of the mutex!
2524	*/
2525	while (RT_SUCCESS(rc))
2526	{
2527	rc = gmmR0AllocatePages(pGMM, pGVM, cPagesToAlloc, paPages, GMMACCOUNT_BASE);
2528	if ( rc != VERR_GMM_SEED_ME
2529	|| pGMM->fLegacyAllocationMode)
2530	break;
2531	rc = gmmR0AllocateMoreChunks(pGMM, pGVM, &pGMM->Private, cPagesToAlloc);
2532	}
2533	}
2534	else
2535	rc = VERR_WRONG_ORDER;
2536	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2537	}
2538	else
2539	rc = VERR_INTERNAL_ERROR_5;
2540	gmmR0MutexRelease(pGMM);
2541	LogFlow(("GMMR0AllocateHandyPages: returns %Rrc\n", rc));
2542	return rc;
2543	}
2544
2545
2546	/**
2547	* Allocate one or more pages.
2548	*
2549	* This is typically used for ROMs and MMIO2 (VRAM) during VM creation.
2550	* The allocated pages are not cleared and will contain random garbage.
2551	*
2552	* @returns VBox status code:
2553	* @retval VINF_SUCCESS on success.
2554	* @retval VERR_NOT_OWNER if the caller is not an EMT.
2555	* @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2556	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2557	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2558	* that is we're trying to allocate more than we've reserved.
2559	*
2560	* @param pVM Pointer to the shared VM structure.
2561	* @param idCpu VCPU id
2562	* @param cPages The number of pages to allocate.
2563	* @param paPages Pointer to the page descriptors.
2564	* See GMMPAGEDESC for details on what is expected on input.
2565	* @param enmAccount The account to charge.
2566	*
2567	* @thread EMT.
2568	*/
2569	GMMR0DECL(int) GMMR0AllocatePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2570	{
2571	LogFlow(("GMMR0AllocatePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
2572
2573	/*
2574	* Validate, get basics and take the semaphore.
2575	*/
2576	PGMM pGMM;
2577	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2578	PGVM pGVM;
2579	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2580	if (RT_FAILURE(rc))
2581	return rc;
2582
2583	AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2584	AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
2585	AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
2586
2587	for (unsigned iPage = 0; iPage < cPages; iPage++)
2588	{
2589	AssertMsgReturn( paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2590	|| paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE
2591	|| ( enmAccount == GMMACCOUNT_BASE
2592	&& paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2593	&& !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK)),
2594	("#%#x: %RHp enmAccount=%d\n", iPage, paPages[iPage].HCPhysGCPhys, enmAccount),
2595	VERR_INVALID_PARAMETER);
2596	AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2597	AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2598	}
2599
2600	gmmR0MutexAcquire(pGMM);
2601	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2602	{
2603
2604	/* No allocations before the initial reservation has been made! */
2605	if (RT_LIKELY( pGVM->gmm.s.Reserved.cBasePages
2606	&& pGVM->gmm.s.Reserved.cFixedPages
2607	&& pGVM->gmm.s.Reserved.cShadowPages))
2608	{
2609	/*
2610	* gmmR0AllocatePages seed loop.
2611	* Note! gmmR0AllocateMoreChunks may leave the protection of the mutex!
2612	*/
2613	while (RT_SUCCESS(rc))
2614	{
2615	rc = gmmR0AllocatePages(pGMM, pGVM, cPages, paPages, enmAccount);
2616	if ( rc != VERR_GMM_SEED_ME
2617	|| pGMM->fLegacyAllocationMode)
2618	break;
2619	rc = gmmR0AllocateMoreChunks(pGMM, pGVM, &pGMM->Private, cPages);
2620	}
2621	}
2622	else
2623	rc = VERR_WRONG_ORDER;
2624	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2625	}
2626	else
2627	rc = VERR_INTERNAL_ERROR_5;
2628	gmmR0MutexRelease(pGMM);
2629	LogFlow(("GMMR0AllocatePages: returns %Rrc\n", rc));
2630	return rc;
2631	}
2632
2633
2634	/**
2635	* VMMR0 request wrapper for GMMR0AllocatePages.
2636	*
2637	* @returns see GMMR0AllocatePages.
2638	* @param pVM Pointer to the shared VM structure.
2639	* @param idCpu VCPU id
2640	* @param pReq The request packet.
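	*
	* @remarks The packet size must match the page count exactly. A sketch of how
	* a caller might size the packet; only the fields validated here are
	* shown, and cPages is a placeholder for the caller's page count:
	* @code
	* pReq->Hdr.cbReq = RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[cPages]);
	* pReq->cPages    = cPages;   // Must agree with cbReq or the call fails.
	* @endcode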
2641	*/
2642	GMMR0DECL(int) GMMR0AllocatePagesReq(PVM pVM, VMCPUID idCpu, PGMMALLOCATEPAGESREQ pReq)
2643	{
2644	/*
2645	* Validate input and pass it on.
2646	*/
2647	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2648	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2649	AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0]),
2650	("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0])),
2651	VERR_INVALID_PARAMETER);
2652	AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages]),
2653	("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[pReq->cPages])),
2654	VERR_INVALID_PARAMETER);
2655
2656	return GMMR0AllocatePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
2657	}
2658
2659
2660	/**
2661	* Allocate a large page to represent guest RAM.
2662	*
2663	* The allocated pages are not cleared and will contain random garbage.
2664	*
2665	* @returns VBox status code:
2666	* @retval VINF_SUCCESS on success.
2667	* @retval VERR_NOT_OWNER if the caller is not an EMT.
2668	* @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2669	* @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2670	* @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2671	* that is we're trying to allocate more than we've reserved.
2672	* @returns see GMMR0AllocatePages.
2673	* @param pVM Pointer to the shared VM structure.
2674	* @param idCpu VCPU id
2675	* @param cbPage Large page size
	* @param pIdPage Where to return the page ID of the first page in the chunk (out).
	* @param pHCPhys Where to return the host physical address of the first page (out).
2676	*/
2677	GMMR0DECL(int) GMMR0AllocateLargePage(PVM pVM, VMCPUID idCpu, uint32_t cbPage, uint32_t *pIdPage, RTHCPHYS *pHCPhys)
2678	{
2679	LogFlow(("GMMR0AllocateLargePage: pVM=%p cbPage=%x\n", pVM, cbPage));
2680
2681	AssertReturn(cbPage == GMM_CHUNK_SIZE, VERR_INVALID_PARAMETER);
2682	AssertPtrReturn(pIdPage, VERR_INVALID_PARAMETER);
2683	AssertPtrReturn(pHCPhys, VERR_INVALID_PARAMETER);
2684
2685	/*
2686	* Validate, get basics and take the semaphore.
2687	*/
2688	PGMM pGMM;
2689	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2690	PGVM pGVM;
2691	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2692	if (RT_FAILURE(rc))
2693	return rc;
2694
2695	/* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
2696	if (pGMM->fLegacyAllocationMode)
2697	return VERR_NOT_SUPPORTED;
2698
2699	*pHCPhys = NIL_RTHCPHYS;
2700	*pIdPage = NIL_GMM_PAGEID;
2701
2702	gmmR0MutexAcquire(pGMM);
2703	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2704	{
2705	const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
2706	if (RT_UNLIKELY( pGVM->gmm.s.Allocated.cBasePages + pGVM->gmm.s.cBalloonedPages + cPages
2707	> pGVM->gmm.s.Reserved.cBasePages))
2708	{
2709	Log(("GMMR0AllocateLargePage: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
2710	pGVM->gmm.s.Reserved.cBasePages, pGVM->gmm.s.Allocated.cBasePages, cPages));
2711	gmmR0MutexRelease(pGMM);
2712	return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2713	}
2714
2715	/*
2716	* Allocate a new large page chunk.
2717	*
2718	* Note! We leave the giant GMM lock temporarily as the allocation might
2719	* take a long time. gmmR0RegisterChunk will retake it (ugly).
2720	*/
2721	AssertCompile(GMM_CHUNK_SIZE == _2M);
2722	gmmR0MutexRelease(pGMM);
2723
2724	RTR0MEMOBJ hMemObj;
2725	rc = RTR0MemObjAllocPhysEx(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS, GMM_CHUNK_SIZE);
2726	if (RT_SUCCESS(rc))
2727	{
2728	PGMMCHUNK pChunk;
2729	rc = gmmR0RegisterChunk(pGMM, &pGMM->Private, hMemObj, pGVM->hSelf, GMM_CHUNK_FLAGS_LARGE_PAGE, &pChunk);
2730	if (RT_SUCCESS(rc))
2731	{
2732	/*
2733	* Allocate all the pages in the chunk.
2734	*/
2735	/* Unlink the new chunk from the free list. */
2736	gmmR0UnlinkChunk(pChunk);
2737
2738	/** @todo rewrite this to skip the looping. */
2739	/* Allocate all pages. */
2740	GMMPAGEDESC PageDesc;
2741	gmmR0AllocatePage(pGMM, pGVM->hSelf, pChunk, &PageDesc);
2742
2743	/* Return the first page as we'll use the whole chunk as one big page. */
2744	*pIdPage = PageDesc.idPage;
2745	*pHCPhys = PageDesc.HCPhysGCPhys;
2746
2747	for (unsigned i = 1; i < cPages; i++)
2748	gmmR0AllocatePage(pGMM, pGVM->hSelf, pChunk, &PageDesc);
2749
2750	/* Update accounting. */
2751	pGVM->gmm.s.Allocated.cBasePages += cPages;
2752	pGVM->gmm.s.cPrivatePages += cPages;
2753	pGMM->cAllocatedPages += cPages;
2754
2755	gmmR0LinkChunk(pChunk, &pGMM->Private);
2756	gmmR0MutexRelease(pGMM);
2757	}
2758	else
2759	RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
2760	}
2761	}
2762	else
2763	{
2764	gmmR0MutexRelease(pGMM);
2765	rc = VERR_INTERNAL_ERROR_5;
2766	}
2767
2768	LogFlow(("GMMR0AllocateLargePage: returns %Rrc\n", rc));
2769	return rc;
2770	}
2771
2772
2773	/**
2774	* Free a large page.
2775	*
2776	* @returns VBox status code.
2777	* @param pVM Pointer to the shared VM structure.
2778	* @param idCpu VCPU id
2779	* @param idPage Large page id
2780	*/
2781	GMMR0DECL(int) GMMR0FreeLargePage(PVM pVM, VMCPUID idCpu, uint32_t idPage)
2782	{
2783	LogFlow(("GMMR0FreeLargePage: pVM=%p idPage=%x\n", pVM, idPage));
2784
2785	/*
2786	* Validate, get basics and take the semaphore.
2787	*/
2788	PGMM pGMM;
2789	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
2790	PGVM pGVM;
2791	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
2792	if (RT_FAILURE(rc))
2793	return rc;
2794
2795	/* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
2796	if (pGMM->fLegacyAllocationMode)
2797	return VERR_NOT_SUPPORTED;
2798
2799	gmmR0MutexAcquire(pGMM);
2800	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2801	{
2802	const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
2803
2804	if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cBasePages < cPages))
2805	{
2806	Log(("GMMR0FreeLargePage: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cBasePages, cPages));
2807	gmmR0MutexRelease(pGMM);
2808	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
2809	}
2810
2811	PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2812	if (RT_LIKELY( pPage
2813	&& GMM_PAGE_IS_PRIVATE(pPage)))
2814	{
2815	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
2816	Assert(pChunk);
2817	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
2818	Assert(pChunk->cPrivate > 0);
2819
2820	/* Release the memory immediately. */
2821	gmmR0FreeChunk(pGMM, NULL, pChunk, false /*fRelaxedSem*/); /** @todo this can be relaxed too! */
2822
2823	/* Update accounting. */
2824	pGVM->gmm.s.Allocated.cBasePages -= cPages;
2825	pGVM->gmm.s.cPrivatePages -= cPages;
2826	pGMM->cAllocatedPages -= cPages;
2827	}
2828	else
2829	rc = VERR_GMM_PAGE_NOT_FOUND;
2830	}
2831	else
2832	rc = VERR_INTERNAL_ERROR_5;
2833
2834	gmmR0MutexRelease(pGMM);
2835	LogFlow(("GMMR0FreeLargePage: returns %Rrc\n", rc));
2836	return rc;
2837	}
2838
2839
2840	/**
2841	* VMMR0 request wrapper for GMMR0FreeLargePage.
2842	*
2843	* @returns see GMMR0FreeLargePage.
2844	* @param pVM Pointer to the shared VM structure.
2845	* @param idCpu VCPU id
2846	* @param pReq The request packet.
2847	*/
2848	GMMR0DECL(int) GMMR0FreeLargePageReq(PVM pVM, VMCPUID idCpu, PGMMFREELARGEPAGEREQ pReq)
2849	{
2850	/*
2851	* Validate input and pass it on.
2852	*/
2853	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
2854	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
2855	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMFREELARGEPAGEREQ),
2856	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMFREELARGEPAGEREQ)),
2857	VERR_INVALID_PARAMETER);
2858
2859	return GMMR0FreeLargePage(pVM, idCpu, pReq->idPage);
2860	}
2861
2862
2863	/**
2864	* Frees a chunk, giving it back to the host OS.
2865	*
2866	* @param pGMM Pointer to the GMM instance.
2867	* @param pGVM This is set when called from GMMR0CleanupVM so we can
2868	* unmap and free the chunk in one go.
2869	* @param pChunk The chunk to free.
2870	* @param fRelaxedSem Whether we can release the semaphore while doing the
2871	* freeing (@c true) or not.
2872	*/
2873	static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
2874	{
2875	Assert(pChunk->Core.Key != NIL_GMM_CHUNKID);
2876
2877	GMMR0CHUNKMTXSTATE MtxState;
2878	gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
2879
2880	/*
2881	* Cleanup hack! Unmap the chunk from the caller's address space.
2882	* This shouldn't happen, so screw lock contention...
2883	*/
2884	if ( pChunk->cMappingsX
2885	&& !pGMM->fLegacyAllocationMode
2886	&& pGVM)
2887	gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
2888
2889	/*
2890	* If there are current mappings of the chunk, then request the
2891	* VMs to unmap them. Reposition the chunk in the free list so
2892	* it won't be a likely candidate for allocations.
2893	*/
2894	if (pChunk->cMappingsX)
2895	{
2896	/** @todo R0 -> VM request */
2897	/* The chunk can be mapped by more than one VM if fBoundMemoryMode is false! */
2898	Log(("gmmR0FreeChunk: chunk still has %d mappings; don't free!\n", pChunk->cMappingsX));
2899	gmmR0ChunkMutexRelease(&MtxState, pChunk);
2900	return false;
2901	}
2902
2903
2904	/*
2905	* Save and trash the handle.
2906	*/
2907	RTR0MEMOBJ const hMemObj = pChunk->hMemObj;
2908	pChunk->hMemObj = NIL_RTR0MEMOBJ;
2909
2910	/*
2911	* Unlink it from everywhere.
2912	*/
2913	gmmR0UnlinkChunk(pChunk);
2914
2915	RTListNodeRemove(&pChunk->ListNode);
2916
2917	PAVLU32NODECORE pCore = RTAvlU32Remove(&pGMM->pChunks, pChunk->Core.Key);
2918	Assert(pCore == &pChunk->Core); NOREF(pCore);
2919
2920	PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(pChunk->Core.Key)];
2921	if (pTlbe->pChunk == pChunk)
2922	{
2923	pTlbe->idChunk = NIL_GMM_CHUNKID;
2924	pTlbe->pChunk = NULL;
2925	}
2926
2927	Assert(pGMM->cChunks > 0);
2928	pGMM->cChunks--;
2929
2930	/*
2931	* Free the Chunk ID before dropping the locks and freeing the rest.
2932	*/
2933	gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
2934	pChunk->Core.Key = NIL_GMM_CHUNKID;
2935
2936	pGMM->cFreedChunks++;
2937
2938	gmmR0ChunkMutexRelease(&MtxState, NULL);
2939	if (fRelaxedSem)
2940	gmmR0MutexRelease(pGMM);
2941
2942	RTMemFree(pChunk->paMappingsX);
2943	pChunk->paMappingsX = NULL;
2944
2945	RTMemFree(pChunk);
2946
2947	int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
2948	AssertLogRelRC(rc);
2949
2950	if (fRelaxedSem)
2951	gmmR0MutexAcquire(pGMM);
2952	return fRelaxedSem;
2953	}
2954
2955
2956	/**
2957	* Free page worker.
2958	*
2959	* The caller does all the statistic decrementing, we do all the incrementing.
2960	*
2961	* @param pGMM Pointer to the GMM instance data.
2962	* @param pChunk Pointer to the chunk this page belongs to.
2963	* @param idPage The Page ID.
2964	* @param pPage Pointer to the page.
2965	*/
2966	static void gmmR0FreePageWorker(PGMM pGMM, PGMMCHUNK pChunk, uint32_t idPage, PGMMPAGE pPage)
2967	{
2968	Log3(("F pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x\n",
2969	pPage, pPage - &pChunk->aPages[0], idPage, pPage->Common.u2State, pChunk->iFreeHead)); NOREF(idPage);
2970
2971	/*
2972	* Put the page on the free list.
2973	*/
2974	pPage->u = 0;
2975	pPage->Free.u2State = GMM_PAGE_STATE_FREE;
2976	Assert(pChunk->iFreeHead < RT_ELEMENTS(pChunk->aPages) || pChunk->iFreeHead == UINT16_MAX);
2977	pPage->Free.iNext = pChunk->iFreeHead;
2978	pChunk->iFreeHead = pPage - &pChunk->aPages[0];
2979
2980	/*
2981	* Update statistics (the cShared/cPrivate stats are up to date already),
2982	* and relink the chunk if necessary.
2983	*/
2984	if (gmmR0SelectFreeSetList(pChunk->cFree) != gmmR0SelectFreeSetList(pChunk->cFree + 1))
2985	{
2986	gmmR0UnlinkChunk(pChunk);
2987	pChunk->cFree++;
2988	gmmR0LinkChunk(pChunk, pChunk->cShared ? &pGMM->Shared : &pGMM->Private);
2989	}
2990	else
2991	{
2992	pChunk->cFree++;
2993	pChunk->pSet->cFreePages++;
2994	}
2995
2996	/*
2997	* If the chunk becomes empty, consider giving memory back to the host OS.
2998	*
2999	* The current strategy is to try to give it back if there are other chunks
3000	* in this free list, meaning if there are at least 240 free pages in this
3001	* category. Note that since there are probably mappings of the chunk,
3002	* it won't be freed up instantly, which probably screws up this logic
3003	* a bit...
3004	*/
3005	/** @todo Do this on the way out. */
3006	if (RT_UNLIKELY( pChunk->cFree == GMM_CHUNK_NUM_PAGES
3007	&& pChunk->pFreeNext
3008	&& pChunk->pFreePrev /** @todo this is probably misfiring, see reset... */
3009	&& !pGMM->fLegacyAllocationMode))
3010	gmmR0FreeChunk(pGMM, NULL, pChunk, false);
3011
3012	}
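
/*
 * A minimal sketch of the index-linked free list that gmmR0FreePageWorker
 * pushes onto. SketchPage, SketchChunk, kNilIndex and both helper functions
 * are hypothetical stand-ins, not the real GMMPAGE/GMMCHUNK layouts. The pop
 * side is included only to show the LIFO order; it is an assumption about
 * how an allocator would consume the list, not code from this file.
 */

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sentinel playing the role of the UINT16_MAX "no next page" marker.
constexpr uint16_t kNilIndex = UINT16_MAX;
// Deliberately tiny; a real chunk holds far more pages.
constexpr unsigned kPagesPerChunk = 8;

struct SketchPage
{
    bool     fFree = false;
    uint16_t iNext = kNilIndex;   // index of the next free page in the same chunk
};

struct SketchChunk
{
    SketchPage aPages[kPagesPerChunk];
    uint16_t   iFreeHead = kNilIndex;  // head of the per-chunk free list
    uint16_t   cFree     = 0;          // number of pages currently on the list
};

// Push page iPage onto the front of the chunk's free list.
inline void sketchFreePage(SketchChunk &chunk, uint16_t iPage)
{
    assert(iPage < kPagesPerChunk);
    SketchPage &page = chunk.aPages[iPage];
    assert(!page.fFree);
    page.fFree      = true;
    page.iNext      = chunk.iFreeHead;
    chunk.iFreeHead = iPage;
    chunk.cFree++;
}

// Pop the most recently freed page; returns kNilIndex when the list is empty.
inline uint16_t sketchAllocPage(SketchChunk &chunk)
{
    uint16_t const iPage = chunk.iFreeHead;
    if (iPage == kNilIndex)
        return kNilIndex;
    SketchPage &page = chunk.aPages[iPage];
    chunk.iFreeHead = page.iNext;
    page.fFree      = false;
    page.iNext      = kNilIndex;
    chunk.cFree--;
    return iPage;
}
```

/* Linking by 16-bit index rather than by pointer keeps each page entry small,
 * in line with the goal of compact tracking structures stated in the file header. */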
3013
3014
3015	/**
3016	* Frees a shared page, the page is known to exist and be valid and such.
3017	*
3018	* @param pGMM Pointer to the GMM instance.
3019	* @param idPage The Page ID
3020	* @param pPage The page structure.
3021	*/
3022	DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, uint32_t idPage, PGMMPAGE pPage)
3023	{
3024	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3025	Assert(pChunk);
3026	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3027	Assert(pChunk->cShared > 0);
3028	Assert(pGMM->cSharedPages > 0);
3029	Assert(pGMM->cAllocatedPages > 0);
3030	Assert(!pPage->Shared.cRefs);
3031
3032	pChunk->cShared--;
3033	pGMM->cAllocatedPages--;
3034	pGMM->cSharedPages--;
3035	gmmR0FreePageWorker(pGMM, pChunk, idPage, pPage);
3036	}
3037
3038	#ifdef VBOX_WITH_PAGE_SHARING
3039
3040	/**
3041	* Converts a private page to a shared page, the page is known to exist and be valid and such.
3042	*
3043	* @param pGMM Pointer to the GMM instance.
3044	* @param pGVM Pointer to the GVM instance.
3045	* @param HCPhys Host physical address
3046	* @param idPage The Page ID
3047	* @param pPage The page structure.
3048	*/
3049	DECLINLINE(void) gmmR0ConvertToSharedPage(PGMM pGMM, PGVM pGVM, RTHCPHYS HCPhys, uint32_t idPage, PGMMPAGE pPage)
3050	{
3051	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3052	Assert(pChunk);
3053	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3054	Assert(GMM_PAGE_IS_PRIVATE(pPage));
3055
3056	pChunk->cPrivate--;
3057	pChunk->cShared++;
3058
3059	pGMM->cSharedPages++;
3060
3061	pGVM->gmm.s.cSharedPages++;
3062	pGVM->gmm.s.cPrivatePages--;
3063
3064	/* Modify the page structure. */
3065	pPage->Shared.pfn = (uint32_t)(uint64_t)(HCPhys >> PAGE_SHIFT);
3066	pPage->Shared.cRefs = 1;
3067	pPage->Common.u2State = GMM_PAGE_STATE_SHARED;
3068	}
3069
3070
3071	/**
3072	* Increase the use count of a shared page, the page is known to exist and be valid and such.
3073	*
3074	* @param pGMM Pointer to the GMM instance.
3075	* @param pGVM Pointer to the GVM instance.
3076	* @param pPage The page structure.
3077	*/
3078	DECLINLINE(void) gmmR0UseSharedPage(PGMM pGMM, PGVM pGVM, PGMMPAGE pPage)
3079	{
3080	Assert(pGMM->cSharedPages > 0);
3081	Assert(pGMM->cAllocatedPages > 0);
3082
3083	pGMM->cDuplicatePages++;
3084
3085	pPage->Shared.cRefs++;
3086	pGVM->gmm.s.cSharedPages++;
3087	pGVM->gmm.s.Allocated.cBasePages++;
3088	}
3089
3090	#endif /* VBOX_WITH_PAGE_SHARING */
3091
3092	/**
3093	* Frees a private page, the page is known to exist and be valid and such.
3094	*
3095	* @param pGMM Pointer to the GMM instance.
3096	* @param idPage The Page ID
3097	* @param pPage The page structure.
3098	*/
3099	DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, uint32_t idPage, PGMMPAGE pPage)
3100	{
3101	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3102	Assert(pChunk);
3103	Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3104	Assert(pChunk->cPrivate > 0);
3105	Assert(pGMM->cAllocatedPages > 0);
3106
3107	pChunk->cPrivate--;
3108	pGMM->cAllocatedPages--;
3109	gmmR0FreePageWorker(pGMM, pChunk, idPage, pPage);
3110	}
3111
3112
3113	/**
3114	* Common worker for GMMR0FreePages and GMMR0BalloonedPages.
3115	*
3116	* @returns VBox status code:
3117	* @retval xxx
3118	*
3119	* @param pGMM Pointer to the GMM instance data.
3120	* @param pGVM Pointer to the shared VM structure.
3121	* @param cPages The number of pages to free.
3122	* @param paPages Pointer to the page descriptors.
3123	* @param enmAccount The account this relates to.
3124	*/
3125	static int gmmR0FreePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3126	{
3127	/*
3128	* Check that the request isn't impossible with respect to the account status.
3129	*/
3130	switch (enmAccount)
3131	{
3132	case GMMACCOUNT_BASE:
3133	if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cBasePages < cPages))
3134	{
3135	Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cBasePages, cPages));
3136	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3137	}
3138	break;
3139	case GMMACCOUNT_SHADOW:
3140	if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cShadowPages < cPages))
3141	{
3142	Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cShadowPages, cPages));
3143	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3144	}
3145	break;
3146	case GMMACCOUNT_FIXED:
3147	if (RT_UNLIKELY(pGVM->gmm.s.Allocated.cFixedPages < cPages))
3148	{
3149	Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Allocated.cFixedPages, cPages));
3150	return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3151	}
3152	break;
3153	default:
3154	AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
3155	}
3156
3157	/*
3158	* Walk the descriptors and free the pages.
3159	*
3160	* Statistics (except the account) are being updated as we go along,
3161	* unlike the alloc code. Also, stop on the first error.
3162	*/
3163	int rc = VINF_SUCCESS;
3164	uint32_t iPage;
3165	for (iPage = 0; iPage < cPages; iPage++)
3166	{
3167	uint32_t idPage = paPages[iPage].idPage;
3168	PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3169	if (RT_LIKELY(pPage))
3170	{
3171	if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
3172	{
3173	if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
3174	{
3175	Assert(pGVM->gmm.s.cPrivatePages);
3176	pGVM->gmm.s.cPrivatePages--;
3177	gmmR0FreePrivatePage(pGMM, idPage, pPage);
3178	}
3179	else
3180	{
3181	Log(("gmmR0FreePages: #%#x/%#x: not owner! hGVM=%#x hSelf=%#x\n", iPage, idPage,
3182	pPage->Private.hGVM, pGVM->hSelf));
3183	rc = VERR_GMM_NOT_PAGE_OWNER;
3184	break;
3185	}
3186	}
3187	else if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3188	{
3189	Assert(pGVM->gmm.s.cSharedPages);
3190	pGVM->gmm.s.cSharedPages--;
3191	Assert(pPage->Shared.cRefs);
3192	if (!--pPage->Shared.cRefs)
3193	gmmR0FreeSharedPage(pGMM, idPage, pPage);
3194	else
3195	{
3196	Assert(pGMM->cDuplicatePages);
3197	pGMM->cDuplicatePages--;
3198	}
3199	}
3200	else
3201	{
3202	Log(("gmmR0FreePages: #%#x/%#x: already free!\n", iPage, idPage));
3203	rc = VERR_GMM_PAGE_ALREADY_FREE;
3204	break;
3205	}
3206	}
3207	else
3208	{
3209	Log(("gmmR0FreePages: #%#x/%#x: not found!\n", iPage, idPage));
3210	rc = VERR_GMM_PAGE_NOT_FOUND;
3211	break;
3212	}
3213	paPages[iPage].idPage = NIL_GMM_PAGEID;
3214	}
3215
3216	/*
3217	* Update the account.
3218	*/
3219	switch (enmAccount)
3220	{
3221	case GMMACCOUNT_BASE: pGVM->gmm.s.Allocated.cBasePages -= iPage; break;
3222	case GMMACCOUNT_SHADOW: pGVM->gmm.s.Allocated.cShadowPages -= iPage; break;
3223	case GMMACCOUNT_FIXED: pGVM->gmm.s.Allocated.cFixedPages -= iPage; break;
3224	default:
3225	AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_INTERNAL_ERROR);
3226	}
3227
3228	/*
3229	* Any threshold stuff to be done here?
3230	*/
3231
3232	return rc;
3233	}
3234
3235
3236	/**
3237	* Free one or more pages.
3238	*
3239	* This is typically used at reset time or power off.
3240	*
3241	* @returns VBox status code:
3242	* @retval xxx
3243	*
3244	* @param pVM Pointer to the shared VM structure.
3245	* @param idCpu VCPU id
3246	* @param cPages The number of pages to free.
3247	* @param paPages Pointer to the page descriptors containing the Page IDs for each page.
3248	* @param enmAccount The account this relates to.
3249	* @thread EMT.
3250	*/
3251	GMMR0DECL(int) GMMR0FreePages(PVM pVM, VMCPUID idCpu, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3252	{
3253	LogFlow(("GMMR0FreePages: pVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pVM, cPages, paPages, enmAccount));
3254
3255	/*
3256	* Validate input and get the basics.
3257	*/
3258	PGMM pGMM;
3259	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3260	PGVM pGVM;
3261	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3262	if (RT_FAILURE(rc))
3263	return rc;
3264
3265	AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3266	AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3267	AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3268
3269	for (unsigned iPage = 0; iPage < cPages; iPage++)
3270	AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
3271	/*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
3272	("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3273
3274	/*
3275	* Take the semaphore and call the worker function.
3276	*/
3277	gmmR0MutexAcquire(pGMM);
3278	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3279	{
3280	rc = gmmR0FreePages(pGMM, pGVM, cPages, paPages, enmAccount);
3281	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3282	}
3283	else
3284	rc = VERR_INTERNAL_ERROR_5;
3285	gmmR0MutexRelease(pGMM);
3286	LogFlow(("GMMR0FreePages: returns %Rrc\n", rc));
3287	return rc;
3288	}
3289
3290
3291	/**
3292	* VMMR0 request wrapper for GMMR0FreePages.
3293	*
3294	* @returns see GMMR0FreePages.
3295	* @param pVM Pointer to the shared VM structure.
3296	* @param idCpu VCPU id
3297	* @param pReq The request packet.
3298	*/
3299	GMMR0DECL(int) GMMR0FreePagesReq(PVM pVM, VMCPUID idCpu, PGMMFREEPAGESREQ pReq)
3300	{
3301	/*
3302	* Validate input and pass it on.
3303	*/
3304	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3305	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3306	AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0]),
3307	("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0])),
3308	VERR_INVALID_PARAMETER);
3309	AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages]),
3310	("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[pReq->cPages])),
3311	VERR_INVALID_PARAMETER);
3312
3313	return GMMR0FreePages(pVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3314	}
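
/*
 * A minimal sketch of the two-step size check that GMMR0FreePagesReq performs
 * on a variable-length request. The Sketch* structures and helper names are
 * hypothetical; only the shape (fixed header followed by a trailing array
 * sized by cPages) is taken from the code above. The size is computed as
 * offsetof + count * element size, which is assumed here to match what
 * RT_UOFFSETOF(..., aPages[n]) yields.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct SketchHdr      { uint32_t u32Magic; uint32_t cbReq; };
struct SketchPageDesc { uint32_t idPage; };

struct SketchFreePagesReq
{
    SketchHdr      Hdr;
    uint32_t       enmAccount;
    uint32_t       cPages;
    SketchPageDesc aPages[1];   // really cPages entries long
};

// Size in bytes of a request that carries cPages descriptors.
inline size_t sketchReqSize(uint32_t cPages)
{
    return offsetof(SketchFreePagesReq, aPages)
         + static_cast<size_t>(cPages) * sizeof(SketchPageDesc);
}

// First make sure the fixed part is present (so cPages may be read at all),
// then require the total size to match the advertised page count exactly.
inline bool sketchIsReqSizeValid(const SketchFreePagesReq &req)
{
    if (req.Hdr.cbReq < sketchReqSize(0))
        return false;
    return req.Hdr.cbReq == sketchReqSize(req.cPages);
}
```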
3315
3316
3317	/**
3318	* Report back on a memory ballooning request.
3319	*
3320	* The request may or may not have been initiated by the GMM. If it was initiated
3321	* by the GMM it is important that this function is called even if no pages were
3322	* ballooned.
3323	*
3324	* @returns VBox status code:
3325	* @retval VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH
3326	* @retval VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH
3327	* @retval VERR_GMM_OVERCOMMITTED_TRY_AGAIN_IN_A_BIT - reset condition
3328	* indicating that we won't necessarily have sufficient RAM to boot
3329	* the VM again and that it should pause until this changes (we'll try to
3330	* balloon some other VM). (For standard deflate we have little choice
3331	* but to hope the VM won't use the memory that was returned to it.)
3332	*
3333	* @param pVM Pointer to the shared VM structure.
3334	* @param idCpu VCPU id
3335	* @param enmAction Inflate/deflate/reset
3336	* @param cBalloonedPages The number of pages that were ballooned.
3337	*
3338	* @thread EMT.
3339	*/
3340	GMMR0DECL(int) GMMR0BalloonedPages(PVM pVM, VMCPUID idCpu, GMMBALLOONACTION enmAction, uint32_t cBalloonedPages)
3341	{
3342	LogFlow(("GMMR0BalloonedPages: pVM=%p enmAction=%d cBalloonedPages=%#x\n",
3343	pVM, enmAction, cBalloonedPages));
3344
3345	AssertMsgReturn(cBalloonedPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cBalloonedPages), VERR_INVALID_PARAMETER);
3346
3347	/*
3348	* Validate input and get the basics.
3349	*/
3350	PGMM pGMM;
3351	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3352	PGVM pGVM;
3353	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3354	if (RT_FAILURE(rc))
3355	return rc;
3356
3357	/*
3358	* Take the semaphore and do some more validations.
3359	*/
3360	gmmR0MutexAcquire(pGMM);
3361	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3362	{
3363	switch (enmAction)
3364	{
3365	case GMMBALLOONACTION_INFLATE:
3366	{
3367	if (RT_LIKELY(pGVM->gmm.s.Allocated.cBasePages + pGVM->gmm.s.cBalloonedPages + cBalloonedPages <= pGVM->gmm.s.Reserved.cBasePages))
3368	{
3369	/*
3370	* Record the ballooned memory.
3371	*/
3372	pGMM->cBalloonedPages += cBalloonedPages;
3373	if (pGVM->gmm.s.cReqBalloonedPages)
3374	{
3375	/* Codepath never taken. Might be interesting in the future to request ballooned memory from guests in low memory conditions.. */
3376	AssertFailed();
3377
3378	pGVM->gmm.s.cBalloonedPages += cBalloonedPages;
3379	pGVM->gmm.s.cReqActuallyBalloonedPages += cBalloonedPages;
3380	Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx Req=%#llx Actual=%#llx (pending)\n", cBalloonedPages,
3381	pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages, pGVM->gmm.s.cReqBalloonedPages, pGVM->gmm.s.cReqActuallyBalloonedPages));
3382	}
3383	else
3384	{
3385	pGVM->gmm.s.cBalloonedPages += cBalloonedPages;
3386	Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3387	cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages));
3388	}
3389	}
3390	else
3391	{
3392	Log(("GMMR0BalloonedPages: cBasePages=%#llx Total=%#llx cBalloonedPages=%#llx Reserved=%#llx\n",
3393	pGVM->gmm.s.Allocated.cBasePages, pGVM->gmm.s.cBalloonedPages, cBalloonedPages, pGVM->gmm.s.Reserved.cBasePages));
3394	rc = VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3395	}
3396	break;
3397	}
3398
3399	case GMMBALLOONACTION_DEFLATE:
3400	{
3401	/* Deflate. */
3402	if (pGVM->gmm.s.cBalloonedPages >= cBalloonedPages)
3403	{
3404	/*
3405	* Record the ballooned memory.
3406	*/
3407	Assert(pGMM->cBalloonedPages >= cBalloonedPages);
3408	pGMM->cBalloonedPages -= cBalloonedPages;
3409	pGVM->gmm.s.cBalloonedPages -= cBalloonedPages;
3410	if (pGVM->gmm.s.cReqDeflatePages)
3411	{
3412	AssertFailed(); /* This path is for later. */
3413	Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx Req=%#llx\n",
3414	cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages, pGVM->gmm.s.cReqDeflatePages));
3415
3416	/*
3417	* Anything we need to do here now when the request has been completed?
3418	*/
3419	pGVM->gmm.s.cReqDeflatePages = 0;
3420	}
3421	else
3422	Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3423	cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.cBalloonedPages));
3424	}
3425	else
3426	{
3427	Log(("GMMR0BalloonedPages: Total=%#llx cBalloonedPages=%#llx\n", pGVM->gmm.s.cBalloonedPages, cBalloonedPages));
3428	rc = VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH;
3429	}
3430	break;
3431	}
3432
3433	case GMMBALLOONACTION_RESET:
3434	{
3435	/* Reset to an empty balloon. */
3436	Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.cBalloonedPages);
3437
3438	pGMM->cBalloonedPages -= pGVM->gmm.s.cBalloonedPages;
3439	pGVM->gmm.s.cBalloonedPages = 0;
3440	break;
3441	}
3442
3443	default:
3444	rc = VERR_INVALID_PARAMETER;
3445	break;
3446	}
3447	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3448	}
3449	else
3450	rc = VERR_INTERNAL_ERROR_5;
3451
3452	gmmR0MutexRelease(pGMM);
3453	LogFlow(("GMMR0BalloonedPages: returns %Rrc\n", rc));
3454	return rc;
3455	}
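
/*
 * A minimal sketch of the inflate/deflate/reset accounting in
 * GMMR0BalloonedPages, with the request-tracking branches (cReqBalloonedPages,
 * cReqDeflatePages) left out. SketchVm, SketchGlobal, SketchAction and
 * sketchBalloon are hypothetical names; a bool return stands in for the
 * VBox status codes.
 */

```cpp
#include <cassert>
#include <cstdint>

enum class SketchAction { Inflate, Deflate, Reset };

struct SketchVm
{
    uint64_t cAllocated;   // stands in for Allocated.cBasePages
    uint64_t cReserved;    // stands in for Reserved.cBasePages
    uint64_t cBallooned;   // per-VM balloon size in pages
};

struct SketchGlobal
{
    uint64_t cBallooned;   // sum of all per-VM balloons
};

// Returns true on success, false where the real code would return
// VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH / VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH.
inline bool sketchBalloon(SketchGlobal &g, SketchVm &vm, SketchAction enmAction, uint32_t cPages)
{
    switch (enmAction)
    {
        case SketchAction::Inflate:
            // Allocated + already ballooned + newly ballooned must fit the reservation.
            if (vm.cAllocated + vm.cBallooned + cPages > vm.cReserved)
                return false;
            g.cBallooned  += cPages;
            vm.cBallooned += cPages;
            return true;

        case SketchAction::Deflate:
            // Cannot hand back more than the VM has ballooned.
            if (vm.cBallooned < cPages)
                return false;
            g.cBallooned  -= cPages;
            vm.cBallooned -= cPages;
            return true;

        case SketchAction::Reset:
            // Drop the whole balloon of this VM; cPages is ignored.
            g.cBallooned -= vm.cBallooned;
            vm.cBallooned = 0;
            return true;
    }
    return false;
}
```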
3456
3457
3458	/**
3459	* VMMR0 request wrapper for GMMR0BalloonedPages.
3460	*
3461	* @returns see GMMR0BalloonedPages.
3462	* @param pVM Pointer to the shared VM structure.
3463	* @param idCpu VCPU id
3464	* @param pReq The request packet.
3465	*/
3466	GMMR0DECL(int) GMMR0BalloonedPagesReq(PVM pVM, VMCPUID idCpu, PGMMBALLOONEDPAGESREQ pReq)
3467	{
3468	/*
3469	* Validate input and pass it on.
3470	*/
3471	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3472	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3473	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMBALLOONEDPAGESREQ),
3474	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMBALLOONEDPAGESREQ)),
3475	VERR_INVALID_PARAMETER);
3476
3477	return GMMR0BalloonedPages(pVM, idCpu, pReq->enmAction, pReq->cBalloonedPages);
3478	}
3479
3480	/**
3481	* Return memory statistics for the hypervisor
3482	*
3483	* @returns VBox status code:
3484	* @param pVM Pointer to the shared VM structure.
3485	* @param pReq The request packet.
3486	*/
3487	GMMR0DECL(int) GMMR0QueryHypervisorMemoryStatsReq(PVM pVM, PGMMMEMSTATSREQ pReq)
3488	{
3489	/*
3490	* Validate input and pass it on.
3491	*/
3492	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3493	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3494	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3495	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3496	VERR_INVALID_PARAMETER);
3497
3498	/*
3499	* Validate input and get the basics.
3500	*/
3501	PGMM pGMM;
3502	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3503	pReq->cAllocPages = pGMM->cAllocatedPages;
3504	pReq->cFreePages = (pGMM->cChunks << (GMM_CHUNK_SHIFT - PAGE_SHIFT)) - pGMM->cAllocatedPages;
3505	pReq->cBalloonedPages = pGMM->cBalloonedPages;
3506	pReq->cMaxPages = pGMM->cMaxPages;
3507	pReq->cSharedPages = pGMM->cDuplicatePages;
3508	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3509
3510	return VINF_SUCCESS;
3511	}
3512
3513	/**
3514	* Return memory statistics for the VM
3515	*
3516	* @returns VBox status code:
3517	* @param pVM Pointer to the shared VM structure.
3518	* @param idCpu Cpu id.
3519	* @param pReq The request packet.
3520	*/
3521	GMMR0DECL(int) GMMR0QueryMemoryStatsReq(PVM pVM, VMCPUID idCpu, PGMMMEMSTATSREQ pReq)
3522	{
3523	/*
3524	* Validate input and pass it on.
3525	*/
3526	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3527	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3528	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
3529	("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
3530	VERR_INVALID_PARAMETER);
3531
3532	/*
3533	* Validate input and get the basics.
3534	*/
3535	PGMM pGMM;
3536	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3537	PGVM pGVM;
3538	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3539	if (RT_FAILURE(rc))
3540	return rc;
3541
3542	/*
3543	* Take the semaphore and do some more validations.
3544	*/
3545	gmmR0MutexAcquire(pGMM);
3546	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3547	{
3548	pReq->cAllocPages = pGVM->gmm.s.Allocated.cBasePages;
3549	pReq->cBalloonedPages = pGVM->gmm.s.cBalloonedPages;
3550	pReq->cMaxPages = pGVM->gmm.s.Reserved.cBasePages;
3551	pReq->cFreePages = pReq->cMaxPages - pReq->cAllocPages;
3552	}
3553	else
3554	rc = VERR_INTERNAL_ERROR_5;
3555
3556	gmmR0MutexRelease(pGMM);
3557	LogFlow(("GMMR0QueryMemoryStatsReq: returns %Rrc\n", rc));
3558	return rc;
3559	}
3560
3561
3562	/**
3563	* Worker for gmmR0UnmapChunk and gmmR0FreeChunk.
3564	*
3565	* Don't call this in legacy allocation mode!
3566	*
3567	* @returns VBox status code.
3568	* @param pGMM Pointer to the GMM instance data.
3569	* @param pGVM Pointer to the Global VM structure.
3570	* @param pChunk Pointer to the chunk to be unmapped.
3571	*/
3572	static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
3573	{
3574	Assert(!pGMM->fLegacyAllocationMode);
3575
3576	/*
3577	* Find the mapping and try unmapping it.
3578	*/
3579	uint32_t cMappings = pChunk->cMappingsX;
3580	for (uint32_t i = 0; i < cMappings; i++)
3581	{
3582	Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
3583	if (pChunk->paMappingsX[i].pGVM == pGVM)
3584	{
3585	/* unmap */
3586	int rc = RTR0MemObjFree(pChunk->paMappingsX[i].hMapObj, false /* fFreeMappings (NA) */);
3587	if (RT_SUCCESS(rc))
3588	{
3589	/* update the record. */
3590	cMappings--;
3591	if (i < cMappings)
3592	pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
3593	pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
3594	pChunk->paMappingsX[cMappings].pGVM = NULL;
3595	Assert(pChunk->cMappingsX - 1U == cMappings);
3596	pChunk->cMappingsX = cMappings;
3597	}
3598
3599	return rc;
3600	}
3601	}
3602
3603	Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3604	return VERR_GMM_CHUNK_NOT_MAPPED;
3605	}
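
/*
 * A minimal sketch of the "move the last entry into the hole" removal that
 * gmmR0UnmapChunkLocked applies to the mapping array. SketchMapping and
 * sketchUnmap are hypothetical; a std::vector replaces the manually managed
 * paMappingsX/cMappingsX pair, and integer ids replace the PGVM pointer and
 * the memory object handle.
 */

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct SketchMapping
{
    int idOwner;   // stands in for the pGVM pointer
    int hMapObj;   // stands in for the ring-0 memory object handle
};

// Remove the mapping owned by idOwner. Order is not preserved: the last
// entry is moved into the freed slot, so removal is O(1) once the entry
// has been found. Returns false if idOwner has no mapping.
inline bool sketchUnmap(std::vector<SketchMapping> &mappings, int idOwner)
{
    for (std::size_t i = 0; i < mappings.size(); i++)
    {
        if (mappings[i].idOwner == idOwner)
        {
            std::size_t const iLast = mappings.size() - 1;
            if (i < iLast)
                mappings[i] = mappings[iLast];
            mappings.pop_back();
            return true;
        }
    }
    return false;
}
```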
3606
3607
3608	/**
3609	* Unmaps a chunk previously mapped into the address space of the current process.
3610	*
3611	* @returns VBox status code.
3612	* @param pGMM Pointer to the GMM instance data.
3613	* @param pGVM Pointer to the Global VM structure.
3614	* @param pChunk Pointer to the chunk to be unmapped.
3615	*/
3616	static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3617	{
3618	if (!pGMM->fLegacyAllocationMode)
3619	{
3620	/*
3621	* Lock the chunk and if possible leave the giant GMM lock.
3622	*/
3623	GMMR0CHUNKMTXSTATE MtxState;
3624	int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
3625	fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
3626	if (RT_SUCCESS(rc))
3627	{
3628	rc = gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3629	gmmR0ChunkMutexRelease(&MtxState, pChunk);
3630	}
3631	return rc;
3632	}
3633
3634	if (pChunk->hGVM == pGVM->hSelf)
3635	return VINF_SUCCESS;
3636
3637	Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x (legacy)\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3638	return VERR_GMM_CHUNK_NOT_MAPPED;
3639	}
3640
3641
3642	/**
3643	* Worker for gmmR0MapChunk.
3644	*
3645	* @returns VBox status code.
3646	* @param pGMM Pointer to the GMM instance data.
3647	* @param pGVM Pointer to the Global VM structure.
3648	* @param pChunk Pointer to the chunk to be mapped.
3649	* @param ppvR3 Where to store the ring-3 address of the mapping.
3650	* In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
3651	* contain the address of the existing mapping.
3652	*/
3653	static int gmmR0MapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
3654	{
3655	/*
3656	* If we're in legacy mode this is simple.
3657	*/
3658	if (pGMM->fLegacyAllocationMode)
3659	{
3660	if (pChunk->hGVM != pGVM->hSelf)
3661	{
3662	Log(("gmmR0MapChunk: chunk %#x is not owned by pGVM=%p/%#x (legacy)\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
3663	return VERR_GMM_CHUNK_NOT_FOUND;
3664	}
3665
3666	*ppvR3 = RTR0MemObjAddressR3(pChunk->hMemObj);
3667	return VINF_SUCCESS;
3668	}
3669
3670	/*
3671	* Check to see if the chunk is already mapped.
3672	*/
3673	for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
3674	{
3675	Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
3676	if (pChunk->paMappingsX[i].pGVM == pGVM)
3677	{
3678	*ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
3679	Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
3680	#ifdef VBOX_WITH_PAGE_SHARING
3681	/* The ring-3 chunk cache can be out of sync; don't fail. */
3682	return VINF_SUCCESS;
3683	#else
3684	return VERR_GMM_CHUNK_ALREADY_MAPPED;
3685	#endif
3686	}
3687	}
3688
3689	/*
3690	* Do the mapping.
3691	*/
3692	RTR0MEMOBJ hMapObj;
3693	int rc = RTR0MemObjMapUser(&hMapObj, pChunk->hMemObj, (RTR3PTR)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
3694	if (RT_SUCCESS(rc))
3695	{
3696	/* reallocate the array? assumes few users per chunk (usually one). */
3697	unsigned iMapping = pChunk->cMappingsX;
3698	if ( iMapping <= 3
3699	|| (iMapping & 3) == 0)
3700	{
3701	unsigned cNewSize = iMapping <= 3
3702	? iMapping + 1
3703	: iMapping + 4;
3704	Assert(cNewSize < 4 || RT_ALIGN_32(cNewSize, 4) == cNewSize);
3705	if (RT_UNLIKELY(cNewSize > UINT16_MAX))
3706	{
3707	rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
3708	return VERR_GMM_TOO_MANY_CHUNK_MAPPINGS;
3709	}
3710
3711	void *pvMappings = RTMemRealloc(pChunk->paMappingsX, cNewSize * sizeof(pChunk->paMappingsX[0]));
3712	if (RT_UNLIKELY(!pvMappings))
3713	{
3714	rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
3715	return VERR_NO_MEMORY;
3716	}
3717	pChunk->paMappingsX = (PGMMCHUNKMAP)pvMappings;
3718	}
3719
3720	/* insert new entry */
3721	pChunk->paMappingsX[iMapping].hMapObj = hMapObj;
3722	pChunk->paMappingsX[iMapping].pGVM = pGVM;
3723	Assert(pChunk->cMappingsX == iMapping);
3724	pChunk->cMappingsX = iMapping + 1;
3725
3726	*ppvR3 = RTR0MemObjAddressR3(hMapObj);
3727	}
3728
3729	return rc;
3730	}
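
/*
 * A minimal sketch of the array growth rule used by gmmR0MapChunkLocked:
 * grow by one entry for the first four mappings, then by four entries each
 * time the count reaches a multiple of four. sketchNewCapacity is a
 * hypothetical helper; it only reproduces the arithmetic, not the
 * reallocation itself.
 */

```cpp
#include <cassert>

// Returns the capacity to reallocate to before storing mapping number
// iMapping (0-based), or 0 when the current allocation is already big enough.
inline unsigned sketchNewCapacity(unsigned iMapping)
{
    if (iMapping <= 3)
        return iMapping + 1;       // 1, 2, 3, 4: one entry at a time
    if ((iMapping & 3) == 0)
        return iMapping + 4;       // 8, 12, 16, ...: four entries at a time
    return 0;                      // room is left from the previous step
}
```

/* Growing one at a time while the count is small fits the stated assumption
 * of few users per chunk (usually one), so the common case does not
 * over-allocate. */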
3731
3732
3733	/**
3734	* Maps a chunk into the user address space of the current process.
3735	*
3736	* @returns VBox status code.
3737	* @param pGMM Pointer to the GMM instance data.
3738	* @param pGVM Pointer to the Global VM structure.
3739	* @param pChunk Pointer to the chunk to be mapped.
3740	* @param fRelaxedSem Whether we can release the semaphore while doing the
3741	* mapping (@c true) or not.
3742	* @param ppvR3 Where to store the ring-3 address of the mapping.
3743	* In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
3744	* contain the address of the existing mapping.
3745	*/
3746	static int gmmR0MapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem, PRTR3PTR ppvR3)
3747	{
3748	/*
3749	* Take the chunk lock and leave the giant GMM lock when possible, then
3750	* call the worker function.
3751	*/
3752	GMMR0CHUNKMTXSTATE MtxState;
3753	int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
3754	fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
3755	if (RT_SUCCESS(rc))
3756	{
3757	rc = gmmR0MapChunkLocked(pGMM, pGVM, pChunk, ppvR3);
3758	gmmR0ChunkMutexRelease(&MtxState, pChunk);
3759	}
3760
3761	return rc;
3762	}
3763
3764
3765
3766	/**
3767	* Check if a chunk is mapped into the specified VM
3768	*
3769	* @returns mapped yes/no
3770	* @param pGMM Pointer to the GMM instance.
3771	* @param pGVM Pointer to the Global VM structure.
3772	* @param pChunk Pointer to the chunk to be mapped.
3773	* @param ppvR3 Where to store the ring-3 address of the mapping.
3774	*/
3775	static int gmmR0IsChunkMapped(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
3776	{
3777	GMMR0CHUNKMTXSTATE MtxState;
3778	gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
3779	for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
3780	{
3781	Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
3782	if (pChunk->paMappingsX[i].pGVM == pGVM)
3783	{
3784	*ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
3785	gmmR0ChunkMutexRelease(&MtxState, pChunk);
3786	return true;
3787	}
3788	}
3789	*ppvR3 = NULL;
3790	gmmR0ChunkMutexRelease(&MtxState, pChunk);
3791	return false;
3792	}
3793
3794
3795	/**
3796	* Map a chunk and/or unmap another chunk.
3797	*
3798	* The mapping and unmapping applies to the current process.
3799	*
3800	* This API does two things because it saves a kernel call per mapping when
3801	* the ring-3 mapping cache is full.
3802	*
3803	* @returns VBox status code.
3804	* @param pVM The VM.
3805	* @param idChunkMap The chunk to map. NIL_GMM_CHUNKID if nothing to map.
3806	* @param idChunkUnmap The chunk to unmap. NIL_GMM_CHUNKID if nothing to unmap.
3807	* @param ppvR3 Where to store the address of the mapped chunk. NULL is ok if nothing to map.
3808	* @thread EMT
3809	*/
3810	GMMR0DECL(int) GMMR0MapUnmapChunk(PVM pVM, uint32_t idChunkMap, uint32_t idChunkUnmap, PRTR3PTR ppvR3)
3811	{
3812	LogFlow(("GMMR0MapUnmapChunk: pVM=%p idChunkMap=%#x idChunkUnmap=%#x ppvR3=%p\n",
3813	pVM, idChunkMap, idChunkUnmap, ppvR3));
3814
3815	/*
3816	* Validate input and get the basics.
3817	*/
3818	PGMM pGMM;
3819	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3820	PGVM pGVM;
3821	int rc = GVMMR0ByVM(pVM, &pGVM);
3822	if (RT_FAILURE(rc))
3823	return rc;
3824
3825	AssertCompile(NIL_GMM_CHUNKID == 0);
3826	AssertMsgReturn(idChunkMap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkMap), VERR_INVALID_PARAMETER);
3827	AssertMsgReturn(idChunkUnmap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkUnmap), VERR_INVALID_PARAMETER);
3828
3829	if ( idChunkMap == NIL_GMM_CHUNKID
3830	&& idChunkUnmap == NIL_GMM_CHUNKID)
3831	return VERR_INVALID_PARAMETER;
3832
3833	if (idChunkMap != NIL_GMM_CHUNKID)
3834	{
3835	AssertPtrReturn(ppvR3, VERR_INVALID_POINTER);
3836	*ppvR3 = NIL_RTR3PTR;
3837	}
3838
3839	/*
3840	* Take the semaphore and do the work.
3841	*
3842	* The unmapping is done last since it's easier to undo a mapping than
3843	* undoing an unmapping. The ring-3 mapping cache cannot be so big
3844	* that it pushes the user virtual address space to within a chunk of
3845	* its limits, so, no problem here.
3846	*/
3847	gmmR0MutexAcquire(pGMM);
3848	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3849	{
3850	PGMMCHUNK pMap = NULL;
3851	if (idChunkMap != NIL_GMM_CHUNKID)
3852	{
3853	pMap = gmmR0GetChunk(pGMM, idChunkMap);
3854	if (RT_LIKELY(pMap))
3855	rc = gmmR0MapChunk(pGMM, pGVM, pMap, true /*fRelaxedSem*/, ppvR3);
3856	else
3857	{
3858	Log(("GMMR0MapUnmapChunk: idChunkMap=%#x\n", idChunkMap));
3859	rc = VERR_GMM_CHUNK_NOT_FOUND;
3860	}
3861	}
3862	/** @todo split this operation, the bail out might (theoretically) not be
3863	* entirely safe. */
3864
3865	if ( idChunkUnmap != NIL_GMM_CHUNKID
3866	&& RT_SUCCESS(rc))
3867	{
3868	PGMMCHUNK pUnmap = gmmR0GetChunk(pGMM, idChunkUnmap);
3869	if (RT_LIKELY(pUnmap))
3870	rc = gmmR0UnmapChunk(pGMM, pGVM, pUnmap, true /*fRelaxedSem*/);
3871	else
3872	{
3873	Log(("GMMR0MapUnmapChunk: idChunkUnmap=%#x\n", idChunkUnmap));
3874	rc = VERR_GMM_CHUNK_NOT_FOUND;
3875	}
3876
3877	if (RT_FAILURE(rc) && pMap)
3878	gmmR0UnmapChunk(pGMM, pGVM, pMap, false /*fRelaxedSem*/);
3879	}
3880
3881	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3882	}
3883	else
3884	rc = VERR_INTERNAL_ERROR_5;
3885	gmmR0MutexRelease(pGMM);
3886
3887	LogFlow(("GMMR0MapUnmapChunk: returns %Rrc\n", rc));
3888	return rc;
3889	}
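The ordering used by GMMR0MapUnmapChunk (map first, unmap second, undo the new mapping if the unmap fails) can be illustrated with a minimal user-space sketch. This is not the GMM code: `MappingCache`, `mapUnmap` and `kNilId` are hypothetical names, and a `std::set` stands in for the per-process chunk mappings.

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-in for the set of chunk IDs mapped into one process.
struct MappingCache
{
    std::set<unsigned> mapped;

    bool map(unsigned id)   { return mapped.insert(id).second; } // fails if already mapped
    bool unmap(unsigned id) { return mapped.erase(id) == 1; }    // fails if not mapped
};

constexpr unsigned kNilId = 0; // plays the role of NIL_GMM_CHUNKID (assumed to be 0)

// Map first, unmap second; roll the new mapping back if the unmap fails.
bool mapUnmap(MappingCache &cache, unsigned idMap, unsigned idUnmap)
{
    if (idMap == kNilId && idUnmap == kNilId)
        return false;                   // asking for nothing is treated as an error

    bool fMapped = false;
    if (idMap != kNilId)
    {
        if (!cache.map(idMap))
            return false;               // nothing to undo yet
        fMapped = true;
    }

    if (idUnmap != kNilId && !cache.unmap(idUnmap))
    {
        if (fMapped)
            cache.unmap(idMap);         // undo the mapping made above
        return false;
    }
    return true;
}
```

Doing the map first means the only undo step ever needed is an unmap, which is the cheaper of the two operations to reverse.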
3890
3891
3892	/**
3893	* VMMR0 request wrapper for GMMR0MapUnmapChunk.
3894	*
3895	* @returns see GMMR0MapUnmapChunk.
3896	* @param pVM Pointer to the shared VM structure.
3897	* @param pReq The request packet.
3898	*/
3899	GMMR0DECL(int) GMMR0MapUnmapChunkReq(PVM pVM, PGMMMAPUNMAPCHUNKREQ pReq)
3900	{
3901	/*
3902	* Validate input and pass it on.
3903	*/
3904	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
3905	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3906	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
3907
3908	return GMMR0MapUnmapChunk(pVM, pReq->idChunkMap, pReq->idChunkUnmap, &pReq->pvR3);
3909	}
3910
3911
3912	/**
3913	* Legacy mode API for supplying pages.
3914	*
3915	* The specified user address points to an allocation chunk sized block that
3916	* will be locked down and used by the GMM when the VM asks for pages.
3917	*
3918	* @returns VBox status code.
3919	* @param pVM The VM.
3920	* @param idCpu VCPU id
3921	* @param pvR3 Pointer to the chunk size memory block to lock down.
3922	*/
3923	GMMR0DECL(int) GMMR0SeedChunk(PVM pVM, VMCPUID idCpu, RTR3PTR pvR3)
3924	{
3925	/*
3926	* Validate input and get the basics.
3927	*/
3928	PGMM pGMM;
3929	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
3930	PGVM pGVM;
3931	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
3932	if (RT_FAILURE(rc))
3933	return rc;
3934
3935	AssertPtrReturn(pvR3, VERR_INVALID_POINTER);
3936	AssertReturn(!(PAGE_OFFSET_MASK & pvR3), VERR_INVALID_POINTER);
3937
3938	if (!pGMM->fLegacyAllocationMode)
3939	{
3940	Log(("GMMR0SeedChunk: not in legacy allocation mode!\n"));
3941	return VERR_NOT_SUPPORTED;
3942	}
3943
3944	/*
3945	* Lock the memory and add it as new chunk with our hGVM.
3946	* (The GMM locking is done inside gmmR0RegisterChunk.)
3947	*/
3948	RTR0MEMOBJ MemObj;
3949	rc = RTR0MemObjLockUser(&MemObj, pvR3, GMM_CHUNK_SIZE, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
3950	if (RT_SUCCESS(rc))
3951	{
3952	rc = gmmR0RegisterChunk(pGMM, &pGMM->Private, MemObj, pGVM->hSelf, 0 /*fChunkFlags*/, NULL);
3953	if (RT_SUCCESS(rc))
3954	gmmR0MutexRelease(pGMM);
3955	else
3956	RTR0MemObjFree(MemObj, false /* fFreeMappings */);
3957	}
3958
3959	LogFlow(("GMMR0SeedChunk: rc=%d (pvR3=%p)\n", rc, pvR3));
3960	return rc;
3961	}
3962
3963
3964	typedef struct
3965	{
3966	PAVLGCPTRNODECORE pNode;
3967	char *pszModuleName;
3968	char *pszVersion;
3969	VBOXOSFAMILY enmGuestOS;
3970	} GMMFINDMODULEBYNAME, *PGMMFINDMODULEBYNAME;
3971
3972	/**
3973	* Tree enumeration callback for finding identical modules by name and version
3974	*/
3975	DECLCALLBACK(int) gmmR0CheckForIdenticalModule(PAVLGCPTRNODECORE pNode, void *pvUser)
3976	{
3977	PGMMFINDMODULEBYNAME pInfo = (PGMMFINDMODULEBYNAME)pvUser;
3978	PGMMSHAREDMODULE pModule = (PGMMSHAREDMODULE)pNode;
3979
3980	if ( pInfo
3981	&& pInfo->enmGuestOS == pModule->enmGuestOS
3982	/** @todo replace with RTStrNCmp */
3983	&& !strcmp(pModule->szName, pInfo->pszModuleName)
3984	&& !strcmp(pModule->szVersion, pInfo->pszVersion))
3985	{
3986	pInfo->pNode = pNode;
3987	return 1; /* stop search */
3988	}
3989	return 0;
3990	}
3991
3992
3993	/**
3994	* Registers a new shared module for the VM
3995	*
3996	* @returns VBox status code.
3997	* @param pVM VM handle
3998	* @param idCpu VCPU id
3999	* @param enmGuestOS Guest OS type
4000	* @param pszModuleName Module name
4001	* @param pszVersion Module version
4002	* @param GCBaseAddr Module base address
4003	* @param cbModule Module size
4004	* @param cRegions Number of shared region descriptors
4005	* @param pRegions Shared region(s)
4006	*/
4007	GMMR0DECL(int) GMMR0RegisterSharedModule(PVM pVM, VMCPUID idCpu, VBOXOSFAMILY enmGuestOS, char *pszModuleName, char *pszVersion, RTGCPTR GCBaseAddr, uint32_t cbModule,
4008	unsigned cRegions, VMMDEVSHAREDREGIONDESC *pRegions)
4009	{
4010	#ifdef VBOX_WITH_PAGE_SHARING
4011	/*
4012	* Validate input and get the basics.
4013	*/
4014	PGMM pGMM;
4015	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4016	PGVM pGVM;
4017	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4018	if (RT_FAILURE(rc))
4019	return rc;
4020
4021	Log(("GMMR0RegisterSharedModule %s %s base %RGv size %x\n", pszModuleName, pszVersion, GCBaseAddr, cbModule));
4022
4023	/*
4024	* Take the semaphore and do some more validations.
4025	*/
4026	gmmR0MutexAcquire(pGMM);
4027	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4028	{
4029	bool fNewModule = false;
4030
4031	/* Check if this module is already locally registered. */
4032	PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCBaseAddr);
4033	if (!pRecVM)
4034	{
4035	pRecVM = (PGMMSHAREDMODULEPERVM)RTMemAllocZ(RT_OFFSETOF(GMMSHAREDMODULEPERVM, aRegions[cRegions]));
4036	if (!pRecVM)
4037	{
4038	AssertFailed();
4039	rc = VERR_NO_MEMORY;
4040	goto end;
4041	}
4042	pRecVM->Core.Key = GCBaseAddr;
4043	pRecVM->cRegions = cRegions;
4044
4045	/* Save the region data as they can differ between VMs (address space scrambling or simply different loading order) */
4046	for (unsigned i = 0; i < cRegions; i++)
4047	{
4048	pRecVM->aRegions[i].GCRegionAddr = pRegions[i].GCRegionAddr;
4049	pRecVM->aRegions[i].cbRegion = RT_ALIGN_T(pRegions[i].cbRegion, PAGE_SIZE, uint32_t);
4050	pRecVM->aRegions[i].u32Alignment = 0;
4051	pRecVM->aRegions[i].paHCPhysPageID = NULL; /* unused */
4052	}
4053
4054	bool ret = RTAvlGCPtrInsert(&pGVM->gmm.s.pSharedModuleTree, &pRecVM->Core);
4055	Assert(ret);
4056
4057	Log(("GMMR0RegisterSharedModule: new local module %s\n", pszModuleName));
4058	fNewModule = true;
4059	}
4060	else
4061	rc = VINF_PGM_SHARED_MODULE_ALREADY_REGISTERED;
4062
4063	/* Check if this module is already globally registered. */
4064	PGMMSHAREDMODULE pGlobalModule = (PGMMSHAREDMODULE)RTAvlGCPtrGet(&pGMM->pGlobalSharedModuleTree, GCBaseAddr);
4065	if ( !pGlobalModule
4066	&& enmGuestOS == VBOXOSFAMILY_Windows64)
4067	{
4068	/* Two identical copies of e.g. Win7 x64 will typically not have a similar virtual address space layout for dlls or kernel modules.
4069	* Try to find identical binaries based on name and version.
4070	*/
4071	GMMFINDMODULEBYNAME Info;
4072
4073	Info.pNode = NULL;
4074	Info.pszVersion = pszVersion;
4075	Info.pszModuleName = pszModuleName;
4076	Info.enmGuestOS = enmGuestOS;
4077
4078	Log(("Try to find identical module %s\n", pszModuleName));
4079	int ret = RTAvlGCPtrDoWithAll(&pGMM->pGlobalSharedModuleTree, true /* fFromLeft */, gmmR0CheckForIdenticalModule, &Info);
4080	if (ret == 1)
4081	{
4082	Assert(Info.pNode);
4083	pGlobalModule = (PGMMSHAREDMODULE)Info.pNode;
4084	Log(("Found identical module at %RGv\n", pGlobalModule->Core.Key));
4085	}
4086	}
4087
4088	if (!pGlobalModule)
4089	{
4090	Assert(fNewModule);
4091	Assert(!pRecVM->fCollision);
4092
4093	pGlobalModule = (PGMMSHAREDMODULE)RTMemAllocZ(RT_OFFSETOF(GMMSHAREDMODULE, aRegions[cRegions]));
4094	if (!pGlobalModule)
4095	{
4096	AssertFailed();
4097	rc = VERR_NO_MEMORY;
4098	goto end;
4099	}
4100
4101	pGlobalModule->Core.Key = GCBaseAddr;
4102	pGlobalModule->cbModule = cbModule;
4103	/* Input limit already safe; no need to check again. */
4104	/** @todo replace with RTStrCopy */
4105	strcpy(pGlobalModule->szName, pszModuleName);
4106	strcpy(pGlobalModule->szVersion, pszVersion);
4107
4108	pGlobalModule->enmGuestOS = enmGuestOS;
4109	pGlobalModule->cRegions = cRegions;
4110
4111	for (unsigned i = 0; i < cRegions; i++)
4112	{
4113	Log(("New region %d base=%RGv size %x\n", i, pRegions[i].GCRegionAddr, pRegions[i].cbRegion));
4114	pGlobalModule->aRegions[i].GCRegionAddr = pRegions[i].GCRegionAddr;
4115	pGlobalModule->aRegions[i].cbRegion = RT_ALIGN_T(pRegions[i].cbRegion, PAGE_SIZE, uint32_t);
4116	pGlobalModule->aRegions[i].u32Alignment = 0;
4117	pGlobalModule->aRegions[i].paHCPhysPageID = NULL; /* uninitialized. */
4118	}
4119
4120	/* Save reference. */
4121	pRecVM->pGlobalModule = pGlobalModule;
4122	pRecVM->fCollision = false;
4123	pGlobalModule->cUsers++;
4124	rc = VINF_SUCCESS;
4125
4126	bool ret = RTAvlGCPtrInsert(&pGMM->pGlobalSharedModuleTree, &pGlobalModule->Core);
4127	Assert(ret);
4128
4129	Log(("GMMR0RegisterSharedModule: new global module %s\n", pszModuleName));
4130	}
4131	else
4132	{
4133	Assert(pGlobalModule->cUsers > 0);
4134
4135	/* Make sure the name and version are identical. */
4136	/** @todo replace with RTStrNCmp */
4137	if ( !strcmp(pGlobalModule->szName, pszModuleName)
4138	&& !strcmp(pGlobalModule->szVersion, pszVersion))
4139	{
4140	/* Save reference. */
4141	pRecVM->pGlobalModule = pGlobalModule;
4142	if ( fNewModule
4143	|| pRecVM->fCollision == true) /* colliding module unregistered and new one registered since the last check */
4144	{
4145	pGlobalModule->cUsers++;
4146	Log(("GMMR0RegisterSharedModule: using existing module %s cUser=%d!\n", pszModuleName, pGlobalModule->cUsers));
4147	}
4148	pRecVM->fCollision = false;
4149	rc = VINF_SUCCESS;
4150	}
4151	else
4152	{
4153	Log(("GMMR0RegisterSharedModule: module %s collision!\n", pszModuleName));
4154	pRecVM->fCollision = true;
4155	rc = VINF_PGM_SHARED_MODULE_COLLISION;
4156	goto end;
4157	}
4158	}
4159
4160	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4161	}
4162	else
4163	rc = VERR_INTERNAL_ERROR_5;
4164
4165	end:
4166	gmmR0MutexRelease(pGMM);
4167	return rc;
4168	#else
4169	return VERR_NOT_IMPLEMENTED;
4170	#endif
4171	}
4172
4173
4174	/**
4175	* VMMR0 request wrapper for GMMR0RegisterSharedModule.
4176	*
4177	* @returns see GMMR0RegisterSharedModule.
4178	* @param pVM Pointer to the shared VM structure.
4179	* @param idCpu VCPU id
4180	* @param pReq The request packet.
4181	*/
4182	GMMR0DECL(int) GMMR0RegisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMREGISTERSHAREDMODULEREQ pReq)
4183	{
4184	/*
4185	* Validate input and pass it on.
4186	*/
4187	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4188	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4189	AssertMsgReturn(pReq->Hdr.cbReq >= sizeof(*pReq) && pReq->Hdr.cbReq == RT_UOFFSETOF(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions]), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4190
4191	/* Pass back return code in the request packet to preserve informational codes. (VMMR3CallR0 chokes on them) */
4192	pReq->rc = GMMR0RegisterSharedModule(pVM, idCpu, pReq->enmGuestOS, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule, pReq->cRegions, pReq->aRegions);
4193	return VINF_SUCCESS;
4194	}
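The size check in GMMR0RegisterSharedModuleReq validates a request whose tail is a variable-length array: the size claimed in the header must equal the offset of the element just past the last region. A minimal sketch of that idea follows. `Request`, `RegionDesc` and `isRequestSizeValid` are hypothetical names with an assumed layout, not the real GMMREGISTERSHAREDMODULEREQ, and this simplified check only requires the fixed header as the minimum size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical layout: a fixed header followed by cRegions descriptors.
struct RegionDesc
{
    uint64_t addr;
    uint32_t cb;
    uint32_t pad;
};

struct Request
{
    uint32_t   cbReq;       // total size claimed by the caller
    uint32_t   cRegions;    // number of trailing descriptors
    RegionDesc aRegions[1]; // variable-length tail
};

// The claimed size must match the header plus exactly cRegions descriptors.
bool isRequestSizeValid(uint32_t cbReq, uint32_t cRegions)
{
    const size_t cbHeader   = offsetof(Request, aRegions);
    const size_t cbExpected = cbHeader + (size_t)cRegions * sizeof(RegionDesc);
    return cbReq >= cbHeader && cbReq == cbExpected;
}
```

Computing the expected size in `size_t` keeps the multiplication from wrapping on 64-bit hosts before it is compared with the 32-bit header field.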
4195
4196	/**
4197	* Unregisters a shared module for the VM
4198	*
4199	* @returns VBox status code.
4200	* @param pVM VM handle
4201	* @param idCpu VCPU id
4202	* @param pszModuleName Module name
4203	* @param pszVersion Module version
4204	* @param GCBaseAddr Module base address
4205	* @param cbModule Module size
4206	*/
4207	GMMR0DECL(int) GMMR0UnregisterSharedModule(PVM pVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion, RTGCPTR GCBaseAddr, uint32_t cbModule)
4208	{
4209	#ifdef VBOX_WITH_PAGE_SHARING
4210	/*
4211	* Validate input and get the basics.
4212	*/
4213	PGMM pGMM;
4214	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4215	PGVM pGVM;
4216	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4217	if (RT_FAILURE(rc))
4218	return rc;
4219
4220	Log(("GMMR0UnregisterSharedModule %s %s base=%RGv size %x\n", pszModuleName, pszVersion, GCBaseAddr, cbModule));
4221
4222	/*
4223	* Take the semaphore and do some more validations.
4224	*/
4225	gmmR0MutexAcquire(pGMM);
4226	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4227	{
4228	PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCBaseAddr);
4229	if (pRecVM)
4230	{
4231	/* Remove reference to global shared module. */
4232	if (!pRecVM->fCollision)
4233	{
4234	PGMMSHAREDMODULE pRec = pRecVM->pGlobalModule;
4235	Assert(pRec);
4236
4237	if (pRec) /* paranoia */
4238	{
4239	Assert(pRec->cUsers);
4240	pRec->cUsers--;
4241	if (pRec->cUsers == 0)
4242	{
4243	/* Free the ranges, but leave the pages intact as there might still be references; they will be cleared by the COW mechanism. */
4244	for (unsigned i = 0; i < pRec->cRegions; i++)
4245	if (pRec->aRegions[i].paHCPhysPageID)
4246	RTMemFree(pRec->aRegions[i].paHCPhysPageID);
4247
4248	Assert(pRec->Core.Key == GCBaseAddr || pRec->enmGuestOS == VBOXOSFAMILY_Windows64);
4249	Assert(pRec->cRegions == pRecVM->cRegions);
4250	#ifdef VBOX_STRICT
4251	for (unsigned i = 0; i < pRecVM->cRegions; i++)
4252	{
4253	Assert(pRecVM->aRegions[i].GCRegionAddr == pRec->aRegions[i].GCRegionAddr);
4254	Assert(pRecVM->aRegions[i].cbRegion == pRec->aRegions[i].cbRegion);
4255	}
4256	#endif
4257
4258	/* Remove from the tree and free memory. */
4259	RTAvlGCPtrRemove(&pGMM->pGlobalSharedModuleTree, pRec->Core.Key);
4260	RTMemFree(pRec);
4261	}
4262	}
4263	else
4264	rc = VERR_PGM_SHARED_MODULE_REGISTRATION_INCONSISTENCY;
4265	}
4266	else
4267	Assert(!pRecVM->pGlobalModule);
4268
4269	/* Remove from the tree and free memory. */
4270	RTAvlGCPtrRemove(&pGVM->gmm.s.pSharedModuleTree, GCBaseAddr);
4271	RTMemFree(pRecVM);
4272	}
4273	else
4274	rc = VERR_PGM_SHARED_MODULE_NOT_FOUND;
4275
4276	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4277	}
4278	else
4279	rc = VERR_INTERNAL_ERROR_5;
4280
4281	gmmR0MutexRelease(pGMM);
4282	return rc;
4283	#else
4284	return VERR_NOT_IMPLEMENTED;
4285	#endif
4286	}
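The register/unregister pair above keeps one global record per module and counts its users in cUsers; the record is destroyed when the last VM drops its reference. A small sketch of that lifetime rule, assuming a hypothetical `ModuleRegistry` keyed by a string rather than the AVL tree keyed by base address that the real code uses:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical registry: module key -> number of VMs using it (cUsers).
struct ModuleRegistry
{
    std::map<std::string, unsigned> users;

    // Called when a VM registers the module; creates the record on first use.
    void reference(const std::string &key) { ++users[key]; }

    // Called when a VM unregisters the module.
    // Returns true if this was the last user and the record was destroyed.
    bool release(const std::string &key)
    {
        auto it = users.find(key);
        if (it == users.end())
            return false;           // unknown module, nothing to release
        if (--it->second == 0)
        {
            users.erase(it);        // last user gone: drop the global record
            return true;
        }
        return false;
    }
};
```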
4287
4288	/**
4289	* VMMR0 request wrapper for GMMR0UnregisterSharedModule.
4290	*
4291	* @returns see GMMR0UnregisterSharedModule.
4292	* @param pVM Pointer to the shared VM structure.
4293	* @param idCpu VCPU id
4294	* @param pReq The request packet.
4295	*/
4296	GMMR0DECL(int) GMMR0UnregisterSharedModuleReq(PVM pVM, VMCPUID idCpu, PGMMUNREGISTERSHAREDMODULEREQ pReq)
4297	{
4298	/*
4299	* Validate input and pass it on.
4300	*/
4301	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4302	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4303	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4304
4305	return GMMR0UnregisterSharedModule(pVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule);
4306	}
4307
4308	#ifdef VBOX_WITH_PAGE_SHARING
4309
4310	/**
4311	* Checks specified shared module range for changes
4312	*
4313	* Performs the following tasks:
4314	* - If a shared page is new, then it changes the GMM page type to shared and
4315	* returns it in the pPageDesc descriptor.
4316	* - If a shared page already exists, then it checks if the VM page is
4317	* identical and if so frees the VM page and returns the shared page in
4318	* pPageDesc descriptor.
4319	*
4320	* @remarks ASSUMES the caller has acquired the GMM semaphore!!
4321	*
4322	* @returns VBox status code.
4323	* @param pGMM Pointer to the GMM instance data.
4324	* @param pGVM Pointer to the GVM instance data.
4325	* @param pModule Module description
4326	* @param idxRegion Region index
4327	* @param idxPage Page index
4328	* @param pPageDesc Page descriptor
4329	*/
4330	GMMR0DECL(int) GMMR0SharedModuleCheckPage(PGVM pGVM, PGMMSHAREDMODULE pModule, unsigned idxRegion, unsigned idxPage,
4331	PGMMSHAREDPAGEDESC pPageDesc)
4332	{
4333	int rc = VINF_SUCCESS;
4334	PGMM pGMM;
4335	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4336	AssertReturn(idxRegion < pModule->cRegions, VERR_INVALID_PARAMETER);
4337	
4338	unsigned cPages = pModule->aRegions[idxRegion].cbRegion >> PAGE_SHIFT;
4339	AssertReturn(idxPage < cPages, VERR_INVALID_PARAMETER);
4340
4341	LogFlow(("GMMR0SharedModuleCheckPage %s base %RGv region %d idxPage %d\n", pModule->szName, pModule->Core.Key, idxRegion, idxPage));
4342
4343	PGMMSHAREDREGIONDESC pGlobalRegion = &pModule->aRegions[idxRegion];
4344	if (!pGlobalRegion->paHCPhysPageID)
4345	{
4346	/* First time; create a page descriptor array. */
4347	Log(("Allocate page descriptor array for %d pages\n", cPages));
4348	pGlobalRegion->paHCPhysPageID = (uint32_t *)RTMemAlloc(cPages * sizeof(*pGlobalRegion->paHCPhysPageID));
4349	if (!pGlobalRegion->paHCPhysPageID)
4350	{
4351	AssertFailed();
4352	rc = VERR_NO_MEMORY;
4353	goto end;
4354	}
4355	/* Invalidate all descriptors. */
4356	for (unsigned i = 0; i < cPages; i++)
4357	pGlobalRegion->paHCPhysPageID[i] = NIL_GMM_PAGEID;
4358	}
4359
4360	/* We've seen this shared page for the first time? */
4361	if (pGlobalRegion->paHCPhysPageID[idxPage] == NIL_GMM_PAGEID)
4362	{
4363	new_shared_page:
4364	Log(("New shared page guest %RGp host %RHp\n", pPageDesc->GCPhys, pPageDesc->HCPhys));
4365
4366	/* Easy case: just change the internal page type. */
4367	PGMMPAGE pPage = gmmR0GetPage(pGMM, pPageDesc->uHCPhysPageId);
4368	if (!pPage)
4369	{
4370	Log(("GMMR0SharedModuleCheckPage: Invalid idPage=%#x #1 (GCPhys=%RGp HCPhys=%RHp idxRegion=%#x idxPage=%#x)\n",
4371	pPageDesc->uHCPhysPageId, pPageDesc->GCPhys, pPageDesc->HCPhys, idxRegion, idxPage));
4372	AssertFailed();
4373	rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
4374	goto end;
4375	}
4376
4377	AssertMsg(pPageDesc->GCPhys == (pPage->Private.pfn << 12), ("desc %RGp gmm %RGp\n", pPageDesc->GCPhys, (pPage->Private.pfn << 12)));
4378
4379	gmmR0ConvertToSharedPage(pGMM, pGVM, pPageDesc->HCPhys, pPageDesc->uHCPhysPageId, pPage);
4380
4381	/* Keep track of these references. */
4382	pGlobalRegion->paHCPhysPageID[idxPage] = pPageDesc->uHCPhysPageId;
4383	}
4384	else
4385	{
4386	uint8_t *pbLocalPage, *pbSharedPage;
4387	uint8_t *pbChunk;
4388	PGMMCHUNK pChunk;
4389
4390	Assert(pPageDesc->uHCPhysPageId != pGlobalRegion->paHCPhysPageID[idxPage]);
4391
4392	Log(("Replace existing page guest %RGp host %RHp id %x -> id %x\n", pPageDesc->GCPhys, pPageDesc->HCPhys, pPageDesc->uHCPhysPageId, pGlobalRegion->paHCPhysPageID[idxPage]));
4393
4394	/* Get the shared page source. */
4395	PGMMPAGE pPage = gmmR0GetPage(pGMM, pGlobalRegion->paHCPhysPageID[idxPage]);
4396	if (!pPage)
4397	{
4398	Log(("GMMR0SharedModuleCheckPage: Invalid idPage=%#x #2 (idxRegion=%#x idxPage=%#x)\n",
4399	pPageDesc->uHCPhysPageId, idxRegion, idxPage));
4400	AssertFailed();
4401	rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
4402	goto end;
4403	}
4404	if (pPage->Common.u2State != GMM_PAGE_STATE_SHARED)
4405	{
4406	/* Page was freed at some point; invalidate this entry. */
4407	/** @todo this isn't really bullet proof. */
4408	Log(("Old shared page was freed -> create a new one\n"));
4409	pGlobalRegion->paHCPhysPageID[idxPage] = NIL_GMM_PAGEID;
4410	goto new_shared_page; /* ugly goto */
4411	}
4412
4413	Log(("Replace existing page guest host %RHp -> %RHp\n", pPageDesc->HCPhys, ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT));
4414
4415	/* Calculate the virtual address of the local page. */
4416	pChunk = gmmR0GetChunk(pGMM, pPageDesc->uHCPhysPageId >> GMM_CHUNKID_SHIFT);
4417	if (pChunk)
4418	{
4419	if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4420	{
4421	Log(("GMMR0SharedModuleCheckPage: Invalid idPage=%#x #3\n", pPageDesc->uHCPhysPageId));
4422	AssertFailed();
4423	rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
4424	goto end;
4425	}
4426	pbLocalPage = pbChunk + ((pPageDesc->uHCPhysPageId & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4427	}
4428	else
4429	{
4430	Log(("GMMR0SharedModuleCheckPage: Invalid idPage=%#x #4\n", pPageDesc->uHCPhysPageId));
4431	AssertFailed();
4432	rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
4433	goto end;
4434	}
4435
4436	/* Calculate the virtual address of the shared page. */
4437	pChunk = gmmR0GetChunk(pGMM, pGlobalRegion->paHCPhysPageID[idxPage] >> GMM_CHUNKID_SHIFT);
4438	Assert(pChunk); /* can't fail as gmmR0GetPage succeeded. */
4439
4440	/* Get the virtual address of the physical page; map the chunk into the VM process if not already done. */
4441	if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4442	{
4443	Log(("Map chunk into process!\n"));
4444	rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
4445	if (rc != VINF_SUCCESS)
4446	{
4447	AssertRC(rc);
4448	goto end;
4449	}
4450	}
4451	pbSharedPage = pbChunk + ((pGlobalRegion->paHCPhysPageID[idxPage] & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4452
4453	/** @todo write ASMMemComparePage. */
4454	if (memcmp(pbSharedPage, pbLocalPage, PAGE_SIZE))
4455	{
4456	Log(("Unexpected differences found between local and shared page; skip\n"));
4457	/* Signal to the caller that this one hasn't changed. */
4458	pPageDesc->uHCPhysPageId = NIL_GMM_PAGEID;
4459	goto end;
4460	}
4461
4462	/* Free the old local page. */
4463	GMMFREEPAGEDESC PageDesc;
4464
4465	PageDesc.idPage = pPageDesc->uHCPhysPageId;
4466	rc = gmmR0FreePages(pGMM, pGVM, 1, &PageDesc, GMMACCOUNT_BASE);
4467	AssertRCReturn(rc, rc);
4468
4469	gmmR0UseSharedPage(pGMM, pGVM, pPage);
4470
4471	/* Pass along the new physical address & page id. */
4472	pPageDesc->HCPhys = ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT;
4473	pPageDesc->uHCPhysPageId = pGlobalRegion->paHCPhysPageID[idxPage];
4474	}
4475	end:
4476	return rc;
4477	}
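GMMR0SharedModuleCheckPage locates a page inside a mapped chunk by splitting the page ID: the upper bits select the chunk and the lower bits give the page index, which is shifted by the page size to get a byte offset. The sketch below shows only that arithmetic. The constant values are assumptions chosen for illustration (4 KiB pages, 512 pages per chunk); the real values come from GMM_CHUNKID_SHIFT, GMM_PAGEID_IDX_MASK and PAGE_SHIFT in the headers.

```cpp
#include <cassert>
#include <cstdint>

// Assumed values, for illustration only.
constexpr uint32_t kPageShift    = 12;                       // 4 KiB pages
constexpr uint32_t kChunkIdShift = 9;                        // 512 pages per chunk
constexpr uint32_t kPageIdxMask  = (1u << kChunkIdShift) - 1;

// Upper bits of the page ID identify the chunk.
constexpr uint32_t chunkIdOf(uint32_t idPage)    { return idPage >> kChunkIdShift; }

// Lower bits give the index of the page within its chunk.
constexpr uint32_t pageIndexOf(uint32_t idPage)  { return idPage & kPageIdxMask; }

// Byte offset of the page from the start of the chunk mapping.
constexpr uint32_t byteOffsetOf(uint32_t idPage) { return pageIndexOf(idPage) << kPageShift; }
```

With these assumed constants, page ID `(3 << 9) | 5` belongs to chunk 3, is page 5 within it, and starts 0x5000 bytes into the chunk mapping.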
4478
4479
4480	/**
4481	* RTAvlGCPtrDestroy callback.
4482	*
4483	* @returns 0 or VERR_INTERNAL_ERROR.
4484	* @param pNode The node to destroy.
4485	* @param pvGVM The GVM handle.
4486	*/
4487	static DECLCALLBACK(int) gmmR0CleanupSharedModule(PAVLGCPTRNODECORE pNode, void *pvGVM)
4488	{
4489	PGVM pGVM = (PGVM)pvGVM;
4490	PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)pNode;
4491
4492	Assert(pRecVM->pGlobalModule || pRecVM->fCollision);
4493	if (pRecVM->pGlobalModule)
4494	{
4495	PGMMSHAREDMODULE pRec = pRecVM->pGlobalModule;
4496	AssertPtr(pRec);
4497	Assert(pRec->cUsers);
4498
4499	Log(("gmmR0CleanupSharedModule: %s %s cUsers=%d\n", pRec->szName, pRec->szVersion, pRec->cUsers));
4500	pRec->cUsers--;
4501	if (pRec->cUsers == 0)
4502	{
4503	for (uint32_t i = 0; i < pRec->cRegions; i++)
4504	if (pRec->aRegions[i].paHCPhysPageID)
4505	RTMemFree(pRec->aRegions[i].paHCPhysPageID);
4506
4507	/* Remove from the tree and free memory. */
4508	PGMM pGMM;
4509	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4510	RTAvlGCPtrRemove(&pGMM->pGlobalSharedModuleTree, pRec->Core.Key);
4511	RTMemFree(pRec);
4512	}
4513	}
4514	RTMemFree(pRecVM);
4515	return 0;
4516	}
4517
4518
4519	/**
4520	* Used by GMMR0CleanupVM to clean up shared modules.
4521	*
4522	* This is called without taking the GMM lock so that it can be yielded as
4523	* needed here.
4524	*
4525	* @param pGMM The GMM handle.
4526	* @param pGVM The global VM handle.
4527	*/
4528	static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM)
4529	{
4530	gmmR0MutexAcquire(pGMM);
4531	GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
4532
4533	RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, pGVM);
4534
4535	gmmR0MutexRelease(pGMM);
4536	}
4537
4538	#endif /* VBOX_WITH_PAGE_SHARING */
4539
4540	/**
4541	* Removes all shared modules for the specified VM
4542	*
4543	* @returns VBox status code.
4544	* @param pVM VM handle
4545	* @param idCpu VCPU id
4546	*/
4547	GMMR0DECL(int) GMMR0ResetSharedModules(PVM pVM, VMCPUID idCpu)
4548	{
4549	#ifdef VBOX_WITH_PAGE_SHARING
4550	/*
4551	* Validate input and get the basics.
4552	*/
4553	PGMM pGMM;
4554	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4555	PGVM pGVM;
4556	int rc = GVMMR0ByVMAndEMT(pVM, idCpu, &pGVM);
4557	if (RT_FAILURE(rc))
4558	return rc;
4559
4560	/*
4561	* Take the semaphore and do some more validations.
4562	*/
4563	gmmR0MutexAcquire(pGMM);
4564	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4565	{
4566	Log(("GMMR0ResetSharedModules\n"));
4567	RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, pGVM);
4568
4569	rc = VINF_SUCCESS;
4570	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4571	}
4572	else
4573	rc = VERR_INTERNAL_ERROR_5;
4574
4575	gmmR0MutexRelease(pGMM);
4576	return rc;
4577	#else
4578	return VERR_NOT_IMPLEMENTED;
4579	#endif
4580	}
4581
4582	#ifdef VBOX_WITH_PAGE_SHARING
4583
4584	typedef struct
4585	{
4586	PGVM pGVM;
4587	VMCPUID idCpu;
4588	int rc;
4589	} GMMCHECKSHAREDMODULEINFO, *PGMMCHECKSHAREDMODULEINFO;
4590
4591	/**
4592	* Tree enumeration callback for checking a shared module.
4593	*/
4594	DECLCALLBACK(int) gmmR0CheckSharedModule(PAVLGCPTRNODECORE pNode, void *pvUser)
4595	{
4596	PGMMCHECKSHAREDMODULEINFO pInfo = (PGMMCHECKSHAREDMODULEINFO)pvUser;
4597	PGMMSHAREDMODULEPERVM pLocalModule = (PGMMSHAREDMODULEPERVM)pNode;
4598	PGMMSHAREDMODULE pGlobalModule = pLocalModule->pGlobalModule;
4599
4600	if ( !pLocalModule->fCollision
4601	&& pGlobalModule)
4602	{
4603	Log(("gmmR0CheckSharedModule: check %s %s base=%RGv size=%x collision=%d\n", pGlobalModule->szName, pGlobalModule->szVersion, pGlobalModule->Core.Key, pGlobalModule->cbModule, pLocalModule->fCollision));
4604	pInfo->rc = PGMR0SharedModuleCheck(pInfo->pGVM->pVM, pInfo->pGVM, pInfo->idCpu, pGlobalModule, pLocalModule->cRegions, pLocalModule->aRegions);
4605	if (RT_FAILURE(pInfo->rc))
4606	return 1; /* stop enumeration. */
4607	}
4608	return 0;
4609	}
4610
4611	#endif /* VBOX_WITH_PAGE_SHARING */
4612	#ifdef DEBUG_sandervl
4613
4614	/**
4615	* Setup for a GMMR0CheckSharedModules call (to allow log flush jumps back to ring 3)
4616	*
4617	* @returns VBox status code.
4618	* @param pVM VM handle
4619	*/
4620	GMMR0DECL(int) GMMR0CheckSharedModulesStart(PVM pVM)
4621	{
4622	/*
4623	* Validate input and get the basics.
4624	*/
4625	PGMM pGMM;
4626	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4627
4628	/*
4629	* Take the semaphore and do some more validations.
4630	*/
4631	gmmR0MutexAcquire(pGMM);
4632	int rc;
4633	if (!GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4634	rc = VERR_INTERNAL_ERROR_5;
4635	else
4636	rc = VINF_SUCCESS;
4637	return rc;
4638	}
4639
4640	/**
4641	* Clean up after a GMMR0CheckSharedModules call (to allow log flush jumps back to ring 3)
4642	*
4643	* @returns VBox status code.
4644	* @param pVM VM handle
4645	*/
4646	GMMR0DECL(int) GMMR0CheckSharedModulesEnd(PVM pVM)
4647	{
4648	/*
4649	* Validate input and get the basics.
4650	*/
4651	PGMM pGMM;
4652	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4653
4654	gmmR0MutexRelease(pGMM);
4655	return VINF_SUCCESS;
4656	}
4657
4658	#endif /* DEBUG_sandervl */
4659
4660	/**
4661	* Check all shared modules for the specified VM
4662	*
4663	* @returns VBox status code.
4664	* @param pVM VM handle
4665	* @param pVCpu VMCPU handle
4666	*/
4667	GMMR0DECL(int) GMMR0CheckSharedModules(PVM pVM, PVMCPU pVCpu)
4668	{
4669	#ifdef VBOX_WITH_PAGE_SHARING
4670	/*
4671	* Validate input and get the basics.
4672	*/
4673	PGMM pGMM;
4674	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4675	PGVM pGVM;
4676	int rc = GVMMR0ByVMAndEMT(pVM, pVCpu->idCpu, &pGVM);
4677	if (RT_FAILURE(rc))
4678	return rc;
4679
4680	# ifndef DEBUG_sandervl
4681	/*
4682	* Take the semaphore and do some more validations.
4683	*/
4684	gmmR0MutexAcquire(pGMM);
4685	# endif
4686	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4687	{
4688	GMMCHECKSHAREDMODULEINFO Info;
4689
4690	Log(("GMMR0CheckSharedModules\n"));
4691	Info.pGVM = pGVM;
4692	Info.idCpu = pVCpu->idCpu;
4693	Info.rc = VINF_SUCCESS;
4694
4695	RTAvlGCPtrDoWithAll(&pGVM->gmm.s.pSharedModuleTree, true /* fFromLeft */, gmmR0CheckSharedModule, &Info);
4696
4697	rc = Info.rc;
4698
4699	Log(("GMMR0CheckSharedModules done!\n"));
4700
4701	GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4702	}
4703	else
4704	rc = VERR_INTERNAL_ERROR_5;
4705
4706	# ifndef DEBUG_sandervl
4707	gmmR0MutexRelease(pGMM);
4708	# endif
4709	return rc;
4710	#else
4711	return VERR_NOT_IMPLEMENTED;
4712	#endif
4713	}
4714
4715	#if defined(VBOX_STRICT) && HC_ARCH_BITS == 64
4716
4717	typedef struct
4718	{
4719	PGVM pGVM;
4720	PGMM pGMM;
4721	uint8_t *pSourcePage;
4722	bool fFoundDuplicate;
4723	} GMMFINDDUPPAGEINFO, *PGMMFINDDUPPAGEINFO;
4724
4725	/**
4726	* RTAvlU32DoWithAll callback.
4727	*
4728	* @returns 0
4729	* @param pNode The node to search.
4730	* @param pvInfo Pointer to the input parameters
4731	*/
4732	static DECLCALLBACK(int) gmmR0FindDupPageInChunk(PAVLU32NODECORE pNode, void *pvInfo)
4733	{
4734	PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
4735	PGMMFINDDUPPAGEINFO pInfo = (PGMMFINDDUPPAGEINFO)pvInfo;
4736	PGVM pGVM = pInfo->pGVM;
4737	PGMM pGMM = pInfo->pGMM;
4738	uint8_t *pbChunk;
4739
4740	/* Only take chunks not mapped into this VM process; not entirely correct. */
4741	if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4742	{
4743	int rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
4744	if (RT_SUCCESS(rc))
4745	{
4746	/*
4747	* Look for duplicate pages
4748	*/
4749	unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
4750	while (iPage-- > 0)
4751	{
4752	if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
4753	{
4754	uint8_t *pbDestPage = pbChunk + (iPage << PAGE_SHIFT);
4755
4756	if (!memcmp(pInfo->pSourcePage, pbDestPage, PAGE_SIZE))
4757	{
4758	pInfo->fFoundDuplicate = true;
4759	break;
4760	}
4761	}
4762	}
4763	gmmR0UnmapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/);
4764	}
4765	}
4766	return pInfo->fFoundDuplicate; /* (stops search if true) */
4767	}
4768
4769
4770	/**
4771	* Find a duplicate of the specified page in other active VMs
4772	*
4773	* @returns VBox status code.
4774	* @param pVM VM handle
4775	* @param pReq Request packet
4776	*/
4777	GMMR0DECL(int) GMMR0FindDuplicatePageReq(PVM pVM, PGMMFINDDUPLICATEPAGEREQ pReq)
4778	{
4779	/*
4780	* Validate input and pass it on.
4781	*/
4782	AssertPtrReturn(pVM, VERR_INVALID_POINTER);
4783	AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4784	AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4785
4786	PGMM pGMM;
4787	GMM_GET_VALID_INSTANCE(pGMM, VERR_INTERNAL_ERROR);
4788
4789	PGVM pGVM;
4790	int rc = GVMMR0ByVM(pVM, &pGVM);
4791	if (RT_FAILURE(rc))
4792	return rc;
4793
4794	/*
4795	* Take the semaphore and do some more validations.
4796	*/
4797	rc = gmmR0MutexAcquire(pGMM);
4798	if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4799	{
4800	uint8_t *pbChunk;
4801	PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pReq->idPage >> GMM_CHUNKID_SHIFT);
4802	if (pChunk)
4803	{
4804	if (gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4805	{
4806	uint8_t *pbSourcePage = pbChunk + ((pReq->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4807	PGMMPAGE pPage = gmmR0GetPage(pGMM, pReq->idPage);
4808	if (pPage)
4809	{
4810	GMMFINDDUPPAGEINFO Info;
4811	Info.pGVM = pGVM;
4812	Info.pGMM = pGMM;
4813	Info.pSourcePage = pbSourcePage;
4814	Info.fFoundDuplicate = false;
4815	RTAvlU32DoWithAll(&pGMM->pChunks, true /* fFromLeft */, gmmR0FindDupPageInChunk, &Info);
4816
4817	pReq->fDuplicate = Info.fFoundDuplicate;
4818	}
4819	else
4820	{
4821	AssertFailed();
4822	rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
4823	}
4824	}
4825	else
4826	AssertFailed();
4827	}
4828	else
4829	AssertFailed();
4830	}
4831	else
4832	rc = VERR_INTERNAL_ERROR_5;
4833
4834	gmmR0MutexRelease(pGMM);
4835	return rc;
4836	}
4837
4838	#endif /* VBOX_STRICT && HC_ARCH_BITS == 64 */
4839


source: vbox/trunk/src/VBox/VMM/VMMR0/GMMR0.cpp@ 37206
