VirtualBox

source: vbox/trunk/src/VBox/VMM/VMMR0/GMMR0.cpp@ 91044

Last change on this file since 91044 was 91014, checked in by vboxsync, 3 years ago

VMM: Made VBOX_WITH_RAM_IN_KERNEL non-optional, removing all the tests for it. bugref:9627

  • Property svn:eol-style set to native
  • Property svn:keywords set to Id Revision
File size: 200.6 KB
1/* $Id: GMMR0.cpp 91014 2021-08-31 01:03:39Z vboxsync $ */
2/** @file
3 * GMM - Global Memory Manager.
4 */
5
6/*
7 * Copyright (C) 2007-2020 Oracle Corporation
8 *
9 * This file is part of VirtualBox Open Source Edition (OSE), as
10 * available from http://www.virtualbox.org. This file is free software;
11 * you can redistribute it and/or modify it under the terms of the GNU
12 * General Public License (GPL) as published by the Free Software
13 * Foundation, in version 2 as it comes in the "COPYING" file of the
14 * VirtualBox OSE distribution. VirtualBox OSE is distributed in the
15 * hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
16 */
17
18
19/** @page pg_gmm GMM - The Global Memory Manager
20 *
21 * As the name indicates, this component is responsible for global memory
22 * management. Currently only guest RAM is allocated from the GMM, but this
23 * may change to include shadow page tables and other bits later.
24 *
25 * Guest RAM is managed as individual pages, but allocated from the host OS
26 * in chunks for reasons of portability / efficiency. To minimize the memory
27 * footprint all tracking structure must be as small as possible without
28 * unnecessary performance penalties.
29 *
30 * The allocation chunks have a fixed size, defined at compile time
31 * by the #GMM_CHUNK_SIZE \#define.
32 *
33 * Each chunk is given a unique ID. Each page also has a unique ID. The
34 * relationship between the two IDs is:
35 * @code
36 * GMM_CHUNK_SHIFT = log2(GMM_CHUNK_SIZE / PAGE_SIZE);
37 * idPage = (idChunk << GMM_CHUNK_SHIFT) | iPage;
38 * @endcode
39 * Where iPage is the index of the page within the chunk. This ID scheme
40 * permits efficient chunk and page lookup, but it relies on the chunk size
41 * to be set at compile time. The chunks are organized in an AVL tree with their
42 * IDs being the keys.
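 *
 * As a purely illustrative sketch (derived from the formulas above, not code
 * from this file), going from a page ID back to its chunk and page index
 * simply reverses the encoding:
 * @code
 * idChunk = idPage >> GMM_CHUNK_SHIFT;
 * iPage   = idPage & ((1 << GMM_CHUNK_SHIFT) - 1);
 * @endcode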
43 *
44 * The physical address of each page in an allocation chunk is maintained by
45 * the #RTR0MEMOBJ and obtained using #RTR0MemObjGetPagePhysAddr. There is no
46 * need to duplicate this information (it would cost 8 bytes per page if we did).
47 *
48 * So what do we need to track per page? Most importantly we need to know
49 * which state the page is in:
50 * - Private - Allocated for (eventually) backing one particular VM page.
51 * - Shared - Readonly page that is used by one or more VMs and treated
52 * as COW by PGM.
53 * - Free - Not used by anyone.
54 *
55 * For the page replacement operations (sharing, defragmenting and freeing)
56 * to be somewhat efficient, private pages need to be associated with a
57 * particular page in a particular VM.
58 *
59 * Tracking the usage of shared pages is impractical and expensive, so we'll
60 * settle for a reference counting system instead.
61 *
62 * Free pages will be chained on LIFOs.
63 *
64 * On 64-bit systems we will use a 64-bit bitfield per page, while on 32-bit
65 * systems a 32-bit bitfield will have to suffice because of address space
66 * limitations. The #GMMPAGE structure shows the details.
67 *
68 *
69 * @section sec_gmm_alloc_strat Page Allocation Strategy
70 *
71 * The strategy for allocating pages has to take fragmentation and shared
72 * pages into account, or we may end up with 2000 chunks with only
73 * a few pages in each. Shared pages cannot easily be reallocated because
74 * of the inaccurate usage accounting (see above). Private pages can be
75 * reallocated by a defragmentation thread in the same manner that sharing
76 * is done.
77 *
78 * The first approach is to manage the free pages in two sets depending on
79 * whether they are mainly for the allocation of shared or private pages.
80 * In the initial implementation there will be almost no possibility for
81 * mixing shared and private pages in the same chunk (only if we're really
82 * stressed on memory), but when we implement forking of VMs and have to
83 * deal with lots of COW pages it'll start getting kind of interesting.
84 *
85 * The sets are lists of chunks with approximately the same number of
86 * free pages. Say the chunk size is 1MB, meaning 256 pages, and a set
87 * consists of 16 lists. So, the first list will contain the chunks with
88 * 1-16 free pages, the second covers 17-32, and so on. The chunks will be
89 * moved between the lists as pages are freed up or allocated.
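 *
 * As a sketch of that bucketing (illustrative only; cFree and cPagesPerChunk
 * are just placeholder names here, and the real list selection also keeps a
 * separate list for completely unused chunks):
 * @code
 * // cFree = free pages in the chunk, cPagesPerChunk = pages per chunk.
 * iList = (cFree - 1) * 16 / cPagesPerChunk;   // yields 0..15 for cFree = 1..cPagesPerChunk
 * @endcode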
90 *
91 *
92 * @section sec_gmm_costs Costs
93 *
94 * The per-page cost in kernel space is one GMMPAGE entry (32 or 64 bits) plus
95 * whatever RTR0MEMOBJ entails. In addition there is the chunk cost of roughly
96 * (sizeof(RTR0MEMOBJ) + sizeof(CHUNK)) / 2^CHUNK_SHIFT bytes per page.
97 *
98 * On Windows the per-page #RTR0MEMOBJ cost is 32 bits on 32-bit Windows
99 * and 64 bits on 64-bit Windows (a PFN_NUMBER in the MDL). So, 64 bits per page.
100 * The cost on Linux is identical, but here it's because of sizeof(struct page *).
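 *
 * As a rough worked example (illustrative assumptions: 2MB chunks, 4KB pages,
 * a 64-bit host):
 * @code
 * // per-page overhead ~= sizeof(GMMPAGE) + per-page memobj cost + amortized chunk overhead
 * //                   ~= 8 + 8 + (memobj + chunk headers) / 512
 * //                   ~= somewhere around 16-20 bytes per 4KB guest page, i.e. well under 1%.
 * @endcode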
101 *
102 *
103 * @section sec_gmm_legacy Legacy Mode for Non-Tier-1 Platforms
104 *
105 * In legacy mode the page source is locked user pages and not
106 * #RTR0MemObjAllocPhysNC, this means that a page can only be allocated
107 * by the VM that locked it. We will make no attempt at implementing
108 * page sharing on these systems, just do enough to make it all work.
109 *
110 * @note With 6.1 really dropping 32-bit support, the legacy mode is obsolete
111 * under the assumption that there is sufficient kernel virtual address
112 * space to map all of the guest memory allocations. So, we'll be using
113 * #RTR0MemObjAllocPage on some platforms as an alternative to
114 * #RTR0MemObjAllocPhysNC.
115 *
116 *
117 * @subsection sub_gmm_locking Serializing
118 *
119 * One simple fast mutex will be employed in the initial implementation, not
120 * two as mentioned in @ref sec_pgmPhys_Serializing.
121 *
122 * @see @ref sec_pgmPhys_Serializing
123 *
124 *
125 * @section sec_gmm_overcommit Memory Over-Commitment Management
126 *
127 * The GVM will have to do the system-wide memory over-commitment
128 * management. My current ideas are:
129 * - Per VM oc policy that indicates how much to initially commit
130 * to it and what to do in a out-of-memory situation.
131 * - Prevent overtaxing the host.
132 *
133 * There are some challenges here, the main ones are configurability and
134 * security. Should we for instance permit anyone to request 100% memory
135 * commitment? Who should be allowed to do runtime adjustments of the
136 * config, and how do we prevent these settings from being lost when the last
137 * VM process exits? The solution is probably to have an optional root
138 * daemon that will keep VMMR0.r0 in memory and enable the security measures.
139 *
140 *
141 *
142 * @section sec_gmm_numa NUMA
143 *
144 * NUMA considerations will be designed and implemented a bit later.
145 *
146 * The preliminary guess is that we will have to try to allocate memory as
147 * close as possible to the CPUs the VM is executed on (EMT and additional CPU
148 * threads), which means it's mostly about allocation and sharing policies.
149 * Both the scheduler and the allocator interface will have to supply some NUMA
150 * info, and we'll need a way to calculate access costs.
151 *
152 */
153
154
155/*********************************************************************************************************************************
156* Header Files *
157*********************************************************************************************************************************/
158#define LOG_GROUP LOG_GROUP_GMM
159#include <VBox/rawpci.h>
160#include <VBox/vmm/gmm.h>
161#include "GMMR0Internal.h"
162#include <VBox/vmm/vmcc.h>
163#include <VBox/vmm/pgm.h>
164#include <VBox/log.h>
165#include <VBox/param.h>
166#include <VBox/err.h>
167#include <VBox/VMMDev.h>
168#include <iprt/asm.h>
169#include <iprt/avl.h>
170#ifdef VBOX_STRICT
171# include <iprt/crc.h>
172#endif
173#include <iprt/critsect.h>
174#include <iprt/list.h>
175#include <iprt/mem.h>
176#include <iprt/memobj.h>
177#include <iprt/mp.h>
178#include <iprt/semaphore.h>
179#include <iprt/spinlock.h>
180#include <iprt/string.h>
181#include <iprt/time.h>
182
183
184/*********************************************************************************************************************************
185* Defined Constants And Macros *
186*********************************************************************************************************************************/
187/** @def VBOX_USE_CRIT_SECT_FOR_GIANT
188 * Use a critical section instead of a fast mutex for the giant GMM lock.
189 *
190 * @remarks This is primarily a way of avoiding the deadlock checks in the
191 * Windows driver verifier. */
192#if defined(RT_OS_WINDOWS) || defined(RT_OS_DARWIN) || defined(DOXYGEN_RUNNING)
193# define VBOX_USE_CRIT_SECT_FOR_GIANT
194#endif
195
196#if defined(VBOX_WITH_LINEAR_HOST_PHYS_MEM) && !defined(RT_OS_DARWIN)
197/** Enable the legacy mode code (will be dropped soon). */
198# define GMM_WITH_LEGACY_MODE
199#endif
200
201
202/*********************************************************************************************************************************
203* Structures and Typedefs *
204*********************************************************************************************************************************/
205/** Pointer to set of free chunks. */
206typedef struct GMMCHUNKFREESET *PGMMCHUNKFREESET;
207
208/**
209 * The per-page tracking structure employed by the GMM.
210 *
211 * On 32-bit hosts some trickery is necessary to compress all
212 * the information into 32 bits. When the fSharedFree member is set,
213 * the 30th bit decides whether it's a free page or not.
214 *
215 * Because of the different layout on 32-bit and 64-bit hosts, macros
216 * are used to get and set some of the data.
217 */
218typedef union GMMPAGE
219{
220#if HC_ARCH_BITS == 64
221 /** Unsigned integer view. */
222 uint64_t u;
223
224 /** The common view. */
225 struct GMMPAGECOMMON
226 {
227 uint32_t uStuff1 : 32;
228 uint32_t uStuff2 : 30;
229 /** The page state. */
230 uint32_t u2State : 2;
231 } Common;
232
233 /** The view of a private page. */
234 struct GMMPAGEPRIVATE
235 {
236 /** The guest page frame number. (Max addressable: 2 ^ 44 - 16) */
237 uint32_t pfn;
238 /** The GVM handle. (64K VMs) */
239 uint32_t hGVM : 16;
240 /** Reserved. */
241 uint32_t u16Reserved : 14;
242 /** The page state. */
243 uint32_t u2State : 2;
244 } Private;
245
246 /** The view of a shared page. */
247 struct GMMPAGESHARED
248 {
249 /** The host page frame number. (Max addressable: 2 ^ 44 - 16) */
250 uint32_t pfn;
251 /** The reference count (64K VMs). */
252 uint32_t cRefs : 16;
253 /** Used for debug checksumming. */
254 uint32_t u14Checksum : 14;
255 /** The page state. */
256 uint32_t u2State : 2;
257 } Shared;
258
259 /** The view of a free page. */
260 struct GMMPAGEFREE
261 {
262 /** The index of the next page in the free list. UINT16_MAX is NIL. */
263 uint16_t iNext;
264 /** Reserved. Checksum or something? */
265 uint16_t u16Reserved0;
266 /** Reserved. Checksum or something? */
267 uint32_t u30Reserved1 : 30;
268 /** The page state. */
269 uint32_t u2State : 2;
270 } Free;
271
272#else /* 32-bit */
273 /** Unsigned integer view. */
274 uint32_t u;
275
276 /** The common view. */
277 struct GMMPAGECOMMON
278 {
279 uint32_t uStuff : 30;
280 /** The page state. */
281 uint32_t u2State : 2;
282 } Common;
283
284 /** The view of a private page. */
285 struct GMMPAGEPRIVATE
286 {
287 /** The guest page frame number. (Max addressable: 2 ^ 36) */
288 uint32_t pfn : 24;
289 /** The GVM handle. (127 VMs) */
290 uint32_t hGVM : 7;
291 /** The top page state bit, MBZ. */
292 uint32_t fZero : 1;
293 } Private;
294
295 /** The view of a shared page. */
296 struct GMMPAGESHARED
297 {
298 /** The reference count. */
299 uint32_t cRefs : 30;
300 /** The page state. */
301 uint32_t u2State : 2;
302 } Shared;
303
304 /** The view of a free page. */
305 struct GMMPAGEFREE
306 {
307 /** The index of the next page in the free list. UINT16_MAX is NIL. */
308 uint32_t iNext : 16;
309 /** Reserved. Checksum or something? */
310 uint32_t u14Reserved : 14;
311 /** The page state. */
312 uint32_t u2State : 2;
313 } Free;
314#endif
315} GMMPAGE;
316AssertCompileSize(GMMPAGE, sizeof(RTHCUINTPTR));
317/** Pointer to a GMMPAGE. */
318typedef GMMPAGE *PGMMPAGE;
319
320
321/** @name The Page States.
322 * @{ */
323/** A private page. */
324#define GMM_PAGE_STATE_PRIVATE 0
325/** A private page - alternative value used on the 32-bit implementation.
326 * This will never be used on 64-bit hosts. */
327#define GMM_PAGE_STATE_PRIVATE_32 1
328/** A shared page. */
329#define GMM_PAGE_STATE_SHARED 2
330/** A free page. */
331#define GMM_PAGE_STATE_FREE 3
332/** @} */
333
334
335/** @def GMM_PAGE_IS_PRIVATE
336 *
337 * @returns true if private, false if not.
338 * @param pPage The GMM page.
339 */
340#if HC_ARCH_BITS == 64
341# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_PRIVATE )
342#else
343# define GMM_PAGE_IS_PRIVATE(pPage) ( (pPage)->Private.fZero == 0 )
344#endif
345
346/** @def GMM_PAGE_IS_SHARED
347 *
348 * @returns true if shared, false if not.
349 * @param pPage The GMM page.
350 */
351#define GMM_PAGE_IS_SHARED(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_SHARED )
352
353/** @def GMM_PAGE_IS_FREE
354 *
355 * @returns true if free, false if not.
356 * @param pPage The GMM page.
357 */
358#define GMM_PAGE_IS_FREE(pPage) ( (pPage)->Common.u2State == GMM_PAGE_STATE_FREE )
359
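/* Usage sketch for the state predicates above (illustrative only, not part of
 * the GMM code):
 * @code
 *      const char *pszState = GMM_PAGE_IS_PRIVATE(pPage) ? "private"
 *                           : GMM_PAGE_IS_SHARED(pPage)  ? "shared"
 *                           : GMM_PAGE_IS_FREE(pPage)    ? "free" : "bogus";
 * @endcode
 */
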
360/** @def GMM_PAGE_PFN_LAST
361 * The last valid guest pfn.
362 * @remark Some of the values outside the valid range have special meaning,
363 * see GMM_PAGE_PFN_UNSHAREABLE.
364 */
365#if HC_ARCH_BITS == 64
366# define GMM_PAGE_PFN_LAST UINT32_C(0xfffffff0)
367#else
368# define GMM_PAGE_PFN_LAST UINT32_C(0x00fffff0)
369#endif
370AssertCompile(GMM_PAGE_PFN_LAST == (GMM_GCPHYS_LAST >> PAGE_SHIFT));
371
372/** @def GMM_PAGE_PFN_UNSHAREABLE
373 * Indicates that this page isn't used for normal guest memory and thus isn't shareable.
374 */
375#if HC_ARCH_BITS == 64
376# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0xfffffff1)
377#else
378# define GMM_PAGE_PFN_UNSHAREABLE UINT32_C(0x00fffff1)
379#endif
380AssertCompile(GMM_PAGE_PFN_UNSHAREABLE == (GMM_GCPHYS_UNSHAREABLE >> PAGE_SHIFT));
381
382
383/**
384 * A GMM allocation chunk ring-3 mapping record.
385 *
386 * This should really be associated with a session and not a VM, but
387 * it's simpler to associate it with a VM and clean up when the VM object
388 * is destroyed.
389 */
390typedef struct GMMCHUNKMAP
391{
392 /** The mapping object. */
393 RTR0MEMOBJ hMapObj;
394 /** The VM owning the mapping. */
395 PGVM pGVM;
396} GMMCHUNKMAP;
397/** Pointer to a GMM allocation chunk mapping. */
398typedef struct GMMCHUNKMAP *PGMMCHUNKMAP;
399
400
401/**
402 * A GMM allocation chunk.
403 */
404typedef struct GMMCHUNK
405{
406 /** The AVL node core.
407 * The Key is the chunk ID. (Giant mtx.) */
408 AVLU32NODECORE Core;
409 /** The memory object.
410 * Either from RTR0MemObjAllocPhysNC or RTR0MemObjLockUser depending on
411 * what the host can dish up. (Chunk mtx protects mapping accesses
412 * and related frees.) */
413 RTR0MEMOBJ hMemObj;
414#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
415 /** Pointer to the kernel mapping. */
416 uint8_t *pbMapping;
417#endif
418 /** Pointer to the next chunk in the free list. (Giant mtx.) */
419 PGMMCHUNK pFreeNext;
420 /** Pointer to the previous chunk in the free list. (Giant mtx.) */
421 PGMMCHUNK pFreePrev;
422 /** Pointer to the free set this chunk belongs to. NULL for
423 * chunks with no free pages. (Giant mtx.) */
424 PGMMCHUNKFREESET pSet;
425 /** List node in the chunk list (GMM::ChunkList). (Giant mtx.) */
426 RTLISTNODE ListNode;
427 /** Pointer to an array of mappings. (Chunk mtx.) */
428 PGMMCHUNKMAP paMappingsX;
429 /** The number of mappings. (Chunk mtx.) */
430 uint16_t cMappingsX;
431 * The mapping lock this chunk is using. UINT8_MAX if nobody is
432 * mapping or freeing anything. (Giant mtx.) */
433 uint8_t volatile iChunkMtx;
434 /** GMM_CHUNK_FLAGS_XXX. (Giant mtx.) */
435 uint8_t fFlags;
436 /** The head of the list of free pages. UINT16_MAX is the NIL value.
437 * (Giant mtx.) */
438 uint16_t iFreeHead;
439 /** The number of free pages. (Giant mtx.) */
440 uint16_t cFree;
441 * The GVM handle of the VM that first allocated pages from this chunk; this
442 * is used as a preference when there are several chunks to choose from.
443 * When in bound memory mode this isn't a preference any longer. (Giant
444 * mtx.) */
445 uint16_t hGVM;
446 /** The ID of the NUMA node the memory mostly resides on. (Reserved for
447 * future use.) (Giant mtx.) */
448 uint16_t idNumaNode;
449 /** The number of private pages. (Giant mtx.) */
450 uint16_t cPrivate;
451 /** The number of shared pages. (Giant mtx.) */
452 uint16_t cShared;
453 /** The pages. (Giant mtx.) */
454 GMMPAGE aPages[GMM_CHUNK_SIZE >> PAGE_SHIFT];
455} GMMCHUNK;
456
457/** Indicates that the NUMA properties of the memory are unknown. */
458#define GMM_CHUNK_NUMA_ID_UNKNOWN UINT16_C(0xfffe)
459
460/** @name GMM_CHUNK_FLAGS_XXX - chunk flags.
461 * @{ */
462/** Indicates that the chunk is a large page (2MB). */
463#define GMM_CHUNK_FLAGS_LARGE_PAGE UINT16_C(0x0001)
464#ifdef GMM_WITH_LEGACY_MODE
465/** Indicates that the chunk was locked rather than allocated directly. */
466# define GMM_CHUNK_FLAGS_SEEDED UINT16_C(0x0002)
467#endif
468/** @} */
469
470
471/**
472 * An allocation chunk TLB entry.
473 */
474typedef struct GMMCHUNKTLBE
475{
476 /** The chunk id. */
477 uint32_t idChunk;
478 /** Pointer to the chunk. */
479 PGMMCHUNK pChunk;
480} GMMCHUNKTLBE;
481/** Pointer to an allocation chunk TLB entry. */
482typedef GMMCHUNKTLBE *PGMMCHUNKTLBE;
483
484
485/** The number of entries in the allocation chunk TLB. */
486#define GMM_CHUNKTLB_ENTRIES 32
487/** Gets the TLB entry index for the given Chunk ID. */
488#define GMM_CHUNKTLB_IDX(idChunk) ( (idChunk) & (GMM_CHUNKTLB_ENTRIES - 1) )
489
490/**
491 * An allocation chunk TLB.
492 */
493typedef struct GMMCHUNKTLB
494{
495 /** The TLB entries. */
496 GMMCHUNKTLBE aEntries[GMM_CHUNKTLB_ENTRIES];
497} GMMCHUNKTLB;
498/** Pointer to an allocation chunk TLB. */
499typedef GMMCHUNKTLB *PGMMCHUNKTLB;
500
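/* Illustrative TLB lookup sketch (an assumption about the intended use, not
 * code from this file; the real lookup also falls back to the AVL tree that is
 * protected by hSpinLockTree):
 * @code
 *      PGMMCHUNKTLBE pTlbe  = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
 *      PGMMCHUNK     pChunk = pTlbe->idChunk == idChunk ? pTlbe->pChunk : NULL;
 *      // NULL means a TLB miss: look idChunk up in pGMM->pChunks instead.
 * @endcode
 */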
501
502/**
503 * The GMM instance data.
504 */
505typedef struct GMM
506{
507 /** Magic / eye catcher. GMM_MAGIC */
508 uint32_t u32Magic;
509 /** The number of threads waiting on the mutex. */
510 uint32_t cMtxContenders;
511#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
512 /** The critical section protecting the GMM.
513 * More fine grained locking can be implemented later if necessary. */
514 RTCRITSECT GiantCritSect;
515#else
516 /** The fast mutex protecting the GMM.
517 * More fine grained locking can be implemented later if necessary. */
518 RTSEMFASTMUTEX hMtx;
519#endif
520#ifdef VBOX_STRICT
521 /** The current mutex owner. */
522 RTNATIVETHREAD hMtxOwner;
523#endif
524 /** Spinlock protecting the AVL tree.
525 * @todo Make this a read-write spinlock as we should allow concurrent
526 * lookups. */
527 RTSPINLOCK hSpinLockTree;
528 /** The chunk tree.
529 * Protected by hSpinLockTree. */
530 PAVLU32NODECORE pChunks;
531 /** Chunk freeing generation - incremented whenever a chunk is freed. Used
532 * for validating the per-VM chunk TLB entries. Valid range is 1 to 2^62
533 * (exclusive), though higher numbers may temporarily occur while
534 * invalidating the individual TLBs during wrap-around processing. */
535 uint64_t volatile idFreeGeneration;
536 /** The chunk TLB.
537 * Protected by hSpinLockTree. */
538 GMMCHUNKTLB ChunkTLB;
539 /** The private free set. */
540 GMMCHUNKFREESET PrivateX;
541 /** The shared free set. */
542 GMMCHUNKFREESET Shared;
543
544 /** Shared module tree (global).
545 * @todo separate trees for distinctly different guest OSes. */
546 PAVLLU32NODECORE pGlobalSharedModuleTree;
547 /** Sharable modules (count of nodes in pGlobalSharedModuleTree). */
548 uint32_t cShareableModules;
549
550 /** The chunk list. For simplifying the cleanup process and avoiding tree
551 * traversal. */
552 RTLISTANCHOR ChunkList;
553
554 /** The maximum number of pages we're allowed to allocate.
555 * @gcfgm{GMM/MaxPages,64-bit, Direct.}
556 * @gcfgm{GMM/PctPages,32-bit, Relative to the number of host pages.} */
557 uint64_t cMaxPages;
558 /** The number of pages that have been reserved.
559 * The deal is that cReservedPages - cOverCommittedPages <= cMaxPages. */
560 uint64_t cReservedPages;
561 /** The number of pages that we have over-committed in reservations. */
562 uint64_t cOverCommittedPages;
563 /** The number of actually allocated (committed if you like) pages. */
564 uint64_t cAllocatedPages;
565 /** The number of pages that are shared. A subset of cAllocatedPages. */
566 uint64_t cSharedPages;
567 /** The number of pages that are actually shared between VMs. */
568 uint64_t cDuplicatePages;
569 /** The number of shared pages that have been left behind by
570 * VMs not doing proper cleanups. */
571 uint64_t cLeftBehindSharedPages;
572 /** The number of allocation chunks.
573 * (The number of pages we've allocated from the host can be derived from this.) */
574 uint32_t cChunks;
575 /** The number of current ballooned pages. */
576 uint64_t cBalloonedPages;
577
578#ifndef GMM_WITH_LEGACY_MODE
579# ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
580 /** Whether #RTR0MemObjAllocPhysNC works. */
581 bool fHasWorkingAllocPhysNC;
582# else
583 bool fPadding;
584# endif
585#else
586 /** The legacy allocation mode indicator.
587 * This is determined at initialization time. */
588 bool fLegacyAllocationMode;
589#endif
590 /** The bound memory mode indicator.
591 * When set, the memory will be bound to a specific VM and never
592 * shared. This is always set if fLegacyAllocationMode is set.
593 * (Also determined at initialization time.) */
594 bool fBoundMemoryMode;
595 /** The number of registered VMs. */
596 uint16_t cRegisteredVMs;
597
598 /** The number of freed chunks ever. This is used as a list generation to
599 * avoid restarting the cleanup scanning when the list wasn't modified. */
600 uint32_t cFreedChunks;
601 /** The previously allocated Chunk ID.
602 * Used as a hint to avoid scanning the whole bitmap. */
603 uint32_t idChunkPrev;
604 /** Chunk ID allocation bitmap.
605 * Bits of allocated IDs are set, free ones are clear.
606 * The NIL id (0) is marked allocated. */
607 uint32_t bmChunkId[(GMM_CHUNKID_LAST + 1 + 31) / 32];
608
609 /** The index of the next mutex to use. */
610 uint32_t iNextChunkMtx;
611 /** Chunk locks for reducing lock contention without having to allocate
612 * one lock per chunk. */
613 struct
614 {
615 /** The mutex */
616 RTSEMFASTMUTEX hMtx;
617 /** The number of threads currently using this mutex. */
618 uint32_t volatile cUsers;
619 } aChunkMtx[64];
620} GMM;
621/** Pointer to the GMM instance. */
622typedef GMM *PGMM;
623
624/** The value of GMM::u32Magic (Katsuhiro Otomo). */
625#define GMM_MAGIC UINT32_C(0x19540414)
626
627
628/**
629 * GMM chunk mutex state.
630 *
631 * This is returned by gmmR0ChunkMutexAcquire and is used by the other
632 * gmmR0ChunkMutex* methods.
633 */
634typedef struct GMMR0CHUNKMTXSTATE
635{
636 PGMM pGMM;
637 /** The index of the chunk mutex. */
638 uint8_t iChunkMtx;
639 /** The relevant flags (GMMR0CHUNK_MTX_XXX). */
640 uint8_t fFlags;
641} GMMR0CHUNKMTXSTATE;
642/** Pointer to a chunk mutex state. */
643typedef GMMR0CHUNKMTXSTATE *PGMMR0CHUNKMTXSTATE;
644
645/** @name GMMR0CHUNK_MTX_XXX
646 * @{ */
647#define GMMR0CHUNK_MTX_INVALID UINT32_C(0)
648#define GMMR0CHUNK_MTX_KEEP_GIANT UINT32_C(1)
649#define GMMR0CHUNK_MTX_RETAKE_GIANT UINT32_C(2)
650#define GMMR0CHUNK_MTX_DROP_GIANT UINT32_C(3)
651#define GMMR0CHUNK_MTX_END UINT32_C(4)
652/** @} */
653
654
655/** The maximum number of shared modules per VM. */
656#define GMM_MAX_SHARED_PER_VM_MODULES 2048
657/** The maximum number of shared modules GMM is allowed to track. */
658#define GMM_MAX_SHARED_GLOBAL_MODULES 16834
659
660
661/**
662 * Argument packet for gmmR0SharedModuleCleanup.
663 */
664typedef struct GMMR0SHMODPERVMDTORARGS
665{
666 PGVM pGVM;
667 PGMM pGMM;
668} GMMR0SHMODPERVMDTORARGS;
669
670/**
671 * Argument packet for gmmR0CheckSharedModule.
672 */
673typedef struct GMMCHECKSHAREDMODULEINFO
674{
675 PGVM pGVM;
676 VMCPUID idCpu;
677} GMMCHECKSHAREDMODULEINFO;
678
679
680/*********************************************************************************************************************************
681* Global Variables *
682*********************************************************************************************************************************/
683/** Pointer to the GMM instance data. */
684static PGMM g_pGMM = NULL;
685
686/** Macro for obtaining and validating the g_pGMM pointer.
687 *
688 * On failure it will return from the invoking function with the specified
689 * return value.
690 *
691 * @param pGMM The name of the pGMM variable.
692 * @param rc The return value on failure. Use VERR_GMM_INSTANCE for VBox
693 * status codes.
694 */
695#define GMM_GET_VALID_INSTANCE(pGMM, rc) \
696 do { \
697 (pGMM) = g_pGMM; \
698 AssertPtrReturn((pGMM), (rc)); \
699 AssertMsgReturn((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic), (rc)); \
700 } while (0)
701
702/** Macro for obtaining and validating the g_pGMM pointer, void function
703 * variant.
704 *
705 * On failure it will return from the invoking function.
706 *
707 * @param pGMM The name of the pGMM variable.
708 */
709#define GMM_GET_VALID_INSTANCE_VOID(pGMM) \
710 do { \
711 (pGMM) = g_pGMM; \
712 AssertPtrReturnVoid((pGMM)); \
713 AssertMsgReturnVoid((pGMM)->u32Magic == GMM_MAGIC, ("%p - %#x\n", (pGMM), (pGMM)->u32Magic)); \
714 } while (0)
715
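/* Usage sketch for the macros above (hypothetical function, for illustration
 * only; it merely mirrors the pattern used by the real entry points below):
 * @code
 *  GMMR0DECL(int) gmmR0SketchQueryAllocated(uint64_t *pcAllocatedPages)
 *  {
 *      PGMM pGMM;
 *      GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
 *      *pcAllocatedPages = pGMM->cAllocatedPages;
 *      return VINF_SUCCESS;
 *  }
 * @endcode
 */
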
716
717/** @def GMM_CHECK_SANITY_UPON_ENTERING
718 * Checks the sanity of the GMM instance data before making changes.
719 *
720 * This macro is a stub by default and must be enabled manually in the code.
721 *
722 * @returns true if sane, false if not.
723 * @param pGMM The name of the pGMM variable.
724 */
725#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
726# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
727#else
728# define GMM_CHECK_SANITY_UPON_ENTERING(pGMM) (true)
729#endif
730
731/** @def GMM_CHECK_SANITY_UPON_LEAVING
732 * Checks the sanity of the GMM instance data after making changes.
733 *
734 * This macro is a stub by default and must be enabled manually in the code.
735 *
736 * @returns true if sane, false if not.
737 * @param pGMM The name of the pGMM variable.
738 */
739#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
740# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
741#else
742# define GMM_CHECK_SANITY_UPON_LEAVING(pGMM) (true)
743#endif
744
745/** @def GMM_CHECK_SANITY_IN_LOOPS
746 * Checks the sanity of the GMM instance in the allocation loops.
747 *
748 * This macro is a stub by default and must be enabled manually in the code.
749 *
750 * @returns true if sane, false if not.
751 * @param pGMM The name of the pGMM variable.
752 */
753#if defined(VBOX_STRICT) && defined(GMMR0_WITH_SANITY_CHECK) && 0
754# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (gmmR0SanityCheck((pGMM), __PRETTY_FUNCTION__, __LINE__) == 0)
755#else
756# define GMM_CHECK_SANITY_IN_LOOPS(pGMM) (true)
757#endif
758
759
760/*********************************************************************************************************************************
761* Internal Functions *
762*********************************************************************************************************************************/
763static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM);
764static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
765DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk);
766DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet);
767DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
768#ifdef GMMR0_WITH_SANITY_CHECK
769static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo);
770#endif
771static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem);
772DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
773DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage);
774static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk);
775#ifdef VBOX_WITH_PAGE_SHARING
776static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM);
777# ifdef VBOX_STRICT
778static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage);
779# endif
780#endif
781
782
783
784/**
785 * Initializes the GMM component.
786 *
787 * This is called when the VMMR0.r0 module is loaded and protected by the
788 * loader semaphore.
789 *
790 * @returns VBox status code.
791 */
792GMMR0DECL(int) GMMR0Init(void)
793{
794 LogFlow(("GMMInit:\n"));
795
796 /*
797 * Allocate the instance data and the locks.
798 */
799 PGMM pGMM = (PGMM)RTMemAllocZ(sizeof(*pGMM));
800 if (!pGMM)
801 return VERR_NO_MEMORY;
802
803 pGMM->u32Magic = GMM_MAGIC;
804 for (unsigned i = 0; i < RT_ELEMENTS(pGMM->ChunkTLB.aEntries); i++)
805 pGMM->ChunkTLB.aEntries[i].idChunk = NIL_GMM_CHUNKID;
806 RTListInit(&pGMM->ChunkList);
807 ASMBitSet(&pGMM->bmChunkId[0], NIL_GMM_CHUNKID);
808
809#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
810 int rc = RTCritSectInit(&pGMM->GiantCritSect);
811#else
812 int rc = RTSemFastMutexCreate(&pGMM->hMtx);
813#endif
814 if (RT_SUCCESS(rc))
815 {
816 unsigned iMtx;
817 for (iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
818 {
819 rc = RTSemFastMutexCreate(&pGMM->aChunkMtx[iMtx].hMtx);
820 if (RT_FAILURE(rc))
821 break;
822 }
823 pGMM->hSpinLockTree = NIL_RTSPINLOCK;
824 if (RT_SUCCESS(rc))
825 rc = RTSpinlockCreate(&pGMM->hSpinLockTree, RTSPINLOCK_FLAGS_INTERRUPT_SAFE, "gmm-chunk-tree");
826 if (RT_SUCCESS(rc))
827 {
828#ifndef GMM_WITH_LEGACY_MODE
829 /*
830 * Figure out how we're going to allocate stuff (only applicable to
831 * hosts with linear physical memory mappings).
832 */
833 pGMM->fBoundMemoryMode = false;
834# ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
835 pGMM->fHasWorkingAllocPhysNC = false;
836
837 RTR0MEMOBJ hMemObj;
838 rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
839 if (RT_SUCCESS(rc))
840 {
841 rc = RTR0MemObjFree(hMemObj, true);
842 AssertRC(rc);
843 pGMM->fHasWorkingAllocPhysNC = true;
844 }
845 else if (rc != VERR_NOT_SUPPORTED)
846 SUPR0Printf("GMMR0Init: Warning! RTR0MemObjAllocPhysNC(, %u, NIL_RTHCPHYS) -> %d!\n", GMM_CHUNK_SIZE, rc);
847# endif
848#else /* GMM_WITH_LEGACY_MODE */
849 /*
850 * Check and see if RTR0MemObjAllocPhysNC works.
851 */
852# if 0 /* later, see @bugref{3170}. */
853 RTR0MEMOBJ MemObj;
854 rc = RTR0MemObjAllocPhysNC(&MemObj, _64K, NIL_RTHCPHYS);
855 if (RT_SUCCESS(rc))
856 {
857 rc = RTR0MemObjFree(MemObj, true);
858 AssertRC(rc);
859 }
860 else if (rc == VERR_NOT_SUPPORTED)
861 pGMM->fLegacyAllocationMode = pGMM->fBoundMemoryMode = true;
862 else
863 SUPR0Printf("GMMR0Init: RTR0MemObjAllocPhysNC(,64K,Any) -> %d!\n", rc);
864# else
865# if defined(RT_OS_WINDOWS) || (defined(RT_OS_SOLARIS) && ARCH_BITS == 64) || defined(RT_OS_LINUX) || defined(RT_OS_FREEBSD)
866 pGMM->fLegacyAllocationMode = false;
867# if ARCH_BITS == 32
868 /* Don't reuse possibly partial chunks because of the virtual
869 address space limitation. */
870 pGMM->fBoundMemoryMode = true;
871# else
872 pGMM->fBoundMemoryMode = false;
873# endif
874# else
875 pGMM->fLegacyAllocationMode = true;
876 pGMM->fBoundMemoryMode = true;
877# endif
878# endif
879#endif /* GMM_WITH_LEGACY_MODE */
880
881 /*
882 * Query system page count and guess a reasonable cMaxPages value.
883 */
884 pGMM->cMaxPages = UINT32_MAX; /** @todo IPRT function for query ram size and such. */
885
886 /*
887 * The idFreeGeneration value should be set so we actually trigger the
888 * wrap-around invalidation handling during a typical test run.
889 */
890 pGMM->idFreeGeneration = UINT64_MAX / 4 - 128;
891
892 g_pGMM = pGMM;
893#ifdef GMM_WITH_LEGACY_MODE
894 LogFlow(("GMMInit: pGMM=%p fLegacyAllocationMode=%RTbool fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fLegacyAllocationMode, pGMM->fBoundMemoryMode));
895#elif defined(VBOX_WITH_LINEAR_HOST_PHYS_MEM)
896 LogFlow(("GMMInit: pGMM=%p fBoundMemoryMode=%RTbool fHasWorkingAllocPhysNC=%RTbool\n", pGMM, pGMM->fBoundMemoryMode, pGMM->fHasWorkingAllocPhysNC));
897#else
898 LogFlow(("GMMInit: pGMM=%p fBoundMemoryMode=%RTbool\n", pGMM, pGMM->fBoundMemoryMode));
899#endif
900 return VINF_SUCCESS;
901 }
902
903 /*
904 * Bail out.
905 */
906 RTSpinlockDestroy(pGMM->hSpinLockTree);
907 while (iMtx-- > 0)
908 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
909#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
910 RTCritSectDelete(&pGMM->GiantCritSect);
911#else
912 RTSemFastMutexDestroy(pGMM->hMtx);
913#endif
914 }
915
916 pGMM->u32Magic = 0;
917 RTMemFree(pGMM);
918 SUPR0Printf("GMMR0Init: failed! rc=%d\n", rc);
919 return rc;
920}
921
922
923/**
924 * Terminates the GMM component.
925 */
926GMMR0DECL(void) GMMR0Term(void)
927{
928 LogFlow(("GMMTerm:\n"));
929
930 /*
931 * Take care / be paranoid...
932 */
933 PGMM pGMM = g_pGMM;
934 if (!RT_VALID_PTR(pGMM))
935 return;
936 if (pGMM->u32Magic != GMM_MAGIC)
937 {
938 SUPR0Printf("GMMR0Term: u32Magic=%#x\n", pGMM->u32Magic);
939 return;
940 }
941
942 /*
943 * Undo what init did and free all the resources we've acquired.
944 */
945 /* Destroy the fundamentals. */
946 g_pGMM = NULL;
947 pGMM->u32Magic = ~GMM_MAGIC;
948#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
949 RTCritSectDelete(&pGMM->GiantCritSect);
950#else
951 RTSemFastMutexDestroy(pGMM->hMtx);
952 pGMM->hMtx = NIL_RTSEMFASTMUTEX;
953#endif
954 RTSpinlockDestroy(pGMM->hSpinLockTree);
955 pGMM->hSpinLockTree = NIL_RTSPINLOCK;
956
957 /* Free any chunks still hanging around. */
958 RTAvlU32Destroy(&pGMM->pChunks, gmmR0TermDestroyChunk, pGMM);
959
960 /* Destroy the chunk locks. */
961 for (unsigned iMtx = 0; iMtx < RT_ELEMENTS(pGMM->aChunkMtx); iMtx++)
962 {
963 Assert(pGMM->aChunkMtx[iMtx].cUsers == 0);
964 RTSemFastMutexDestroy(pGMM->aChunkMtx[iMtx].hMtx);
965 pGMM->aChunkMtx[iMtx].hMtx = NIL_RTSEMFASTMUTEX;
966 }
967
968 /* Finally the instance data itself. */
969 RTMemFree(pGMM);
970 LogFlow(("GMMTerm: done\n"));
971}
972
973
974/**
975 * RTAvlU32Destroy callback.
976 *
977 * @returns 0
978 * @param pNode The node to destroy.
979 * @param pvGMM The GMM handle.
980 */
981static DECLCALLBACK(int) gmmR0TermDestroyChunk(PAVLU32NODECORE pNode, void *pvGMM)
982{
983 PGMMCHUNK pChunk = (PGMMCHUNK)pNode;
984
985 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
986 SUPR0Printf("GMMR0Term: %RKv/%#x: cFree=%d cPrivate=%d cShared=%d cMappings=%d\n", pChunk,
987 pChunk->Core.Key, pChunk->cFree, pChunk->cPrivate, pChunk->cShared, pChunk->cMappingsX);
988
989 int rc = RTR0MemObjFree(pChunk->hMemObj, true /* fFreeMappings */);
990 if (RT_FAILURE(rc))
991 {
992 SUPR0Printf("GMMR0Term: %RKv/%#x: RTRMemObjFree(%RKv,true) -> %d (cMappings=%d)\n", pChunk,
993 pChunk->Core.Key, pChunk->hMemObj, rc, pChunk->cMappingsX);
994 AssertRC(rc);
995 }
996 pChunk->hMemObj = NIL_RTR0MEMOBJ;
997
998 RTMemFree(pChunk->paMappingsX);
999 pChunk->paMappingsX = NULL;
1000
1001 RTMemFree(pChunk);
1002 NOREF(pvGMM);
1003 return 0;
1004}
1005
1006
1007/**
1008 * Initializes the per-VM data for the GMM.
1009 *
1010 * This is called from within the GVMM lock (from GVMMR0CreateVM)
1011 * and should only initialize the data members so GMMR0CleanupVM
1012 * can deal with them. We reserve no memory or anything here,
1013 * that's done later in GMMR0InitVM.
1014 *
1015 * @param pGVM Pointer to the Global VM structure.
1016 */
1017GMMR0DECL(int) GMMR0InitPerVMData(PGVM pGVM)
1018{
1019 AssertCompile(RT_SIZEOFMEMB(GVM,gmm.s) <= RT_SIZEOFMEMB(GVM,gmm.padding));
1020
1021 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
1022 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
1023 pGVM->gmm.s.Stats.fMayAllocate = false;
1024
1025 pGVM->gmm.s.hChunkTlbSpinLock = NIL_RTSPINLOCK;
1026 int rc = RTSpinlockCreate(&pGVM->gmm.s.hChunkTlbSpinLock, RTSPINLOCK_FLAGS_INTERRUPT_SAFE, "per-vm-chunk-tlb");
1027 AssertRCReturn(rc, rc);
1028
1029 return VINF_SUCCESS;
1030}
1031
1032
1033/**
1034 * Acquires the GMM giant lock.
1035 *
1036 * @returns Assert status code from RTSemFastMutexRequest.
1037 * @param pGMM Pointer to the GMM instance.
1038 */
1039static int gmmR0MutexAcquire(PGMM pGMM)
1040{
1041 ASMAtomicIncU32(&pGMM->cMtxContenders);
1042#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1043 int rc = RTCritSectEnter(&pGMM->GiantCritSect);
1044#else
1045 int rc = RTSemFastMutexRequest(pGMM->hMtx);
1046#endif
1047 ASMAtomicDecU32(&pGMM->cMtxContenders);
1048 AssertRC(rc);
1049#ifdef VBOX_STRICT
1050 pGMM->hMtxOwner = RTThreadNativeSelf();
1051#endif
1052 return rc;
1053}
1054
1055
1056/**
1057 * Releases the GMM giant lock.
1058 *
1059 * @returns Assert status code from RTSemFastMutexRelease.
1060 * @param pGMM Pointer to the GMM instance.
1061 */
1062static int gmmR0MutexRelease(PGMM pGMM)
1063{
1064#ifdef VBOX_STRICT
1065 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
1066#endif
1067#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1068 int rc = RTCritSectLeave(&pGMM->GiantCritSect);
1069#else
1070 int rc = RTSemFastMutexRelease(pGMM->hMtx);
1071 AssertRC(rc);
1072#endif
1073 return rc;
1074}
1075
1076
1077/**
1078 * Yields the GMM giant lock if there is contention and a certain minimum time
1079 * has elapsed since we took it.
1080 *
1081 * @returns @c true if the mutex was yielded, @c false if not.
1082 * @param pGMM Pointer to the GMM instance.
1083 * @param puLockNanoTS Where the lock acquisition time stamp is kept
1084 * (in/out).
1085 */
1086static bool gmmR0MutexYield(PGMM pGMM, uint64_t *puLockNanoTS)
1087{
1088 /*
1089 * If nobody is contending the mutex, don't bother checking the time.
1090 */
1091 if (ASMAtomicReadU32(&pGMM->cMtxContenders) == 0)
1092 return false;
1093
1094 /*
1095 * Don't yield if we haven't executed for at least 2 milliseconds.
1096 */
1097 uint64_t uNanoNow = RTTimeSystemNanoTS();
1098 if (uNanoNow - *puLockNanoTS < UINT32_C(2000000))
1099 return false;
1100
1101 /*
1102 * Yield the mutex.
1103 */
1104#ifdef VBOX_STRICT
1105 pGMM->hMtxOwner = NIL_RTNATIVETHREAD;
1106#endif
1107 ASMAtomicIncU32(&pGMM->cMtxContenders);
1108#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1109 int rc1 = RTCritSectLeave(&pGMM->GiantCritSect); AssertRC(rc1);
1110#else
1111 int rc1 = RTSemFastMutexRelease(pGMM->hMtx); AssertRC(rc1);
1112#endif
1113
1114 RTThreadYield();
1115
1116#ifdef VBOX_USE_CRIT_SECT_FOR_GIANT
1117 int rc2 = RTCritSectEnter(&pGMM->GiantCritSect); AssertRC(rc2);
1118#else
1119 int rc2 = RTSemFastMutexRequest(pGMM->hMtx); AssertRC(rc2);
1120#endif
1121 *puLockNanoTS = RTTimeSystemNanoTS();
1122 ASMAtomicDecU32(&pGMM->cMtxContenders);
1123#ifdef VBOX_STRICT
1124 pGMM->hMtxOwner = RTThreadNativeSelf();
1125#endif
1126
1127 return true;
1128}
1129
1130
1131/**
1132 * Acquires a chunk lock.
1133 *
1134 * The caller must own the giant lock.
1135 *
1136 * @returns Assert status code from RTSemFastMutexRequest.
1137 * @param pMtxState The chunk mutex state info. (Avoids
1138 * passing the same flags and stuff around
1139 * for subsequent release and drop-giant
1140 * calls.)
1141 * @param pGMM Pointer to the GMM instance.
1142 * @param pChunk Pointer to the chunk.
1143 * @param fFlags Flags regarding the giant lock, GMMR0CHUNK_MTX_XXX.
1144 */
1145static int gmmR0ChunkMutexAcquire(PGMMR0CHUNKMTXSTATE pMtxState, PGMM pGMM, PGMMCHUNK pChunk, uint32_t fFlags)
1146{
1147 Assert(fFlags > GMMR0CHUNK_MTX_INVALID && fFlags < GMMR0CHUNK_MTX_END);
1148 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1149
1150 pMtxState->pGMM = pGMM;
1151 pMtxState->fFlags = (uint8_t)fFlags;
1152
1153 /*
1154 * Get the lock index and reference the lock.
1155 */
1156 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
1157 uint32_t iChunkMtx = pChunk->iChunkMtx;
1158 if (iChunkMtx == UINT8_MAX)
1159 {
1160 iChunkMtx = pGMM->iNextChunkMtx++;
1161 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1162
1163 /* Try get an unused one... */
1164 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1165 {
1166 iChunkMtx = pGMM->iNextChunkMtx++;
1167 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1168 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1169 {
1170 iChunkMtx = pGMM->iNextChunkMtx++;
1171 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1172 if (pGMM->aChunkMtx[iChunkMtx].cUsers)
1173 {
1174 iChunkMtx = pGMM->iNextChunkMtx++;
1175 iChunkMtx %= RT_ELEMENTS(pGMM->aChunkMtx);
1176 }
1177 }
1178 }
1179
1180 pChunk->iChunkMtx = iChunkMtx;
1181 }
1182 AssertCompile(RT_ELEMENTS(pGMM->aChunkMtx) < UINT8_MAX);
1183 pMtxState->iChunkMtx = (uint8_t)iChunkMtx;
1184 ASMAtomicIncU32(&pGMM->aChunkMtx[iChunkMtx].cUsers);
1185
1186 /*
1187 * Drop the giant?
1188 */
1189 if (fFlags != GMMR0CHUNK_MTX_KEEP_GIANT)
1190 {
1191 /** @todo GMM life cycle cleanup (we may race someone
1192 * destroying and cleaning up GMM)? */
1193 gmmR0MutexRelease(pGMM);
1194 }
1195
1196 /*
1197 * Take the chunk mutex.
1198 */
1199 int rc = RTSemFastMutexRequest(pGMM->aChunkMtx[iChunkMtx].hMtx);
1200 AssertRC(rc);
1201 return rc;
1202}
1203
1204
1205/**
1206 * Releases a chunk mutex acquired by gmmR0ChunkMutexAcquire.
1207 *
1208 * @returns Assert status code from RTSemFastMutexRelease.
1209 * @param pMtxState Pointer to the chunk mutex state.
1210 * @param pChunk Pointer to the chunk if it's still
1211 * alive, NULL if it isn't. This is used to deassociate
1212 * the chunk from the mutex on the way out so a new one
1213 * can be selected next time, thus avoiding contended
1214 * mutexes.
1215 */
1216static int gmmR0ChunkMutexRelease(PGMMR0CHUNKMTXSTATE pMtxState, PGMMCHUNK pChunk)
1217{
1218 PGMM pGMM = pMtxState->pGMM;
1219
1220 /*
1221 * Release the chunk mutex and reacquire the giant if requested.
1222 */
1223 int rc = RTSemFastMutexRelease(pGMM->aChunkMtx[pMtxState->iChunkMtx].hMtx);
1224 AssertRC(rc);
1225 if (pMtxState->fFlags == GMMR0CHUNK_MTX_RETAKE_GIANT)
1226 rc = gmmR0MutexAcquire(pGMM);
1227 else
1228 Assert((pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT) == (pGMM->hMtxOwner == RTThreadNativeSelf()));
1229
1230 /*
1231 * Drop the chunk mutex user reference and deassociate it from the chunk
1232 * when possible.
1233 */
1234 if ( ASMAtomicDecU32(&pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers) == 0
1235 && pChunk
1236 && RT_SUCCESS(rc) )
1237 {
1238 if (pMtxState->fFlags != GMMR0CHUNK_MTX_DROP_GIANT)
1239 pChunk->iChunkMtx = UINT8_MAX;
1240 else
1241 {
1242 rc = gmmR0MutexAcquire(pGMM);
1243 if (RT_SUCCESS(rc))
1244 {
1245 if (pGMM->aChunkMtx[pMtxState->iChunkMtx].cUsers == 0)
1246 pChunk->iChunkMtx = UINT8_MAX;
1247 rc = gmmR0MutexRelease(pGMM);
1248 }
1249 }
1250 }
1251
1252 pMtxState->pGMM = NULL;
1253 return rc;
1254}
1255
1256
1257/**
1258 * Drops the giant GMM lock we kept in gmmR0ChunkMutexAcquire while keeping the
1259 * chunk locked.
1260 *
1261 * This only works if gmmR0ChunkMutexAcquire was called with
1262 * GMMR0CHUNK_MTX_KEEP_GIANT. gmmR0ChunkMutexRelease will retake the giant
1263 * mutex, i.e. behave as if GMMR0CHUNK_MTX_RETAKE_GIANT was used.
1264 *
1265 * @returns VBox status code (assuming success is ok).
1266 * @param pMtxState Pointer to the chunk mutex state.
1267 */
1268static int gmmR0ChunkMutexDropGiant(PGMMR0CHUNKMTXSTATE pMtxState)
1269{
1270 AssertReturn(pMtxState->fFlags == GMMR0CHUNK_MTX_KEEP_GIANT, VERR_GMM_MTX_FLAGS);
1271 Assert(pMtxState->pGMM->hMtxOwner == RTThreadNativeSelf());
1272 pMtxState->fFlags = GMMR0CHUNK_MTX_RETAKE_GIANT;
1273 /** @todo GMM life cycle cleanup (we may race someone
1274 * destroying and cleaning up GMM)? */
1275 return gmmR0MutexRelease(pMtxState->pGMM);
1276}
1277
1278
1279/**
1280 * For experimenting with NUMA affinity and such.
1281 *
1282 * @returns The current NUMA Node ID.
1283 */
1284static uint16_t gmmR0GetCurrentNumaNodeId(void)
1285{
1286#if 1
1287 return GMM_CHUNK_NUMA_ID_UNKNOWN;
1288#else
1289 return RTMpCpuId() / 16;
1290#endif
1291}
1292
1293
1294
1295/**
1296 * Cleans up when a VM is terminating.
1297 *
1298 * @param pGVM Pointer to the Global VM structure.
1299 */
1300GMMR0DECL(void) GMMR0CleanupVM(PGVM pGVM)
1301{
1302 LogFlow(("GMMR0CleanupVM: pGVM=%p:{.hSelf=%#x}\n", pGVM, pGVM->hSelf));
1303
1304 PGMM pGMM;
1305 GMM_GET_VALID_INSTANCE_VOID(pGMM);
1306
1307#ifdef VBOX_WITH_PAGE_SHARING
1308 /*
1309 * Clean up all registered shared modules first.
1310 */
1311 gmmR0SharedModuleCleanup(pGMM, pGVM);
1312#endif
1313
1314 gmmR0MutexAcquire(pGMM);
1315 uint64_t uLockNanoTS = RTTimeSystemNanoTS();
1316 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
1317
1318 /*
1319 * The policy is 'INVALID' until the initial reservation
1320 * request has been serviced.
1321 */
1322 if ( pGVM->gmm.s.Stats.enmPolicy > GMMOCPOLICY_INVALID
1323 && pGVM->gmm.s.Stats.enmPolicy < GMMOCPOLICY_END)
1324 {
1325 /*
1326 * If it's the last VM around, we can skip walking all the chunks looking
1327 * for the pages owned by this VM and instead flush the whole shebang.
1328 *
1329 * This takes care of the eventuality that a VM has left shared page
1330 * references behind (shouldn't happen of course, but you never know).
1331 */
1332 Assert(pGMM->cRegisteredVMs);
1333 pGMM->cRegisteredVMs--;
1334
1335 /*
1336 * Walk the entire pool looking for pages that belong to this VM
1337 * and leftover mappings. (This'll only catch private pages,
1338 * shared pages will be 'left behind'.)
1339 */
1340 /** @todo r=bird: This scanning+freeing could be optimized in bound mode! */
1341 uint64_t cPrivatePages = pGVM->gmm.s.Stats.cPrivatePages; /* save */
1342
1343 unsigned iCountDown = 64;
1344 bool fRedoFromStart;
1345 PGMMCHUNK pChunk;
1346 do
1347 {
1348 fRedoFromStart = false;
1349 RTListForEachReverse(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
1350 {
1351 uint32_t const cFreeChunksOld = pGMM->cFreedChunks;
1352 if ( ( !pGMM->fBoundMemoryMode
1353 || pChunk->hGVM == pGVM->hSelf)
1354 && gmmR0CleanupVMScanChunk(pGMM, pGVM, pChunk))
1355 {
1356 /* We left the giant mutex, so reset the yield counters. */
1357 uLockNanoTS = RTTimeSystemNanoTS();
1358 iCountDown = 64;
1359 }
1360 else
1361 {
1362 /* Didn't leave it, so do normal yielding. */
1363 if (!iCountDown)
1364 gmmR0MutexYield(pGMM, &uLockNanoTS);
1365 else
1366 iCountDown--;
1367 }
1368 if (pGMM->cFreedChunks != cFreeChunksOld)
1369 {
1370 fRedoFromStart = true;
1371 break;
1372 }
1373 }
1374 } while (fRedoFromStart);
1375
1376 if (pGVM->gmm.s.Stats.cPrivatePages)
1377 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x has %#x private pages that cannot be found!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cPrivatePages);
1378
1379 pGMM->cAllocatedPages -= cPrivatePages;
1380
1381 /*
1382 * Free empty chunks.
1383 */
1384 PGMMCHUNKFREESET pPrivateSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
1385 do
1386 {
1387 fRedoFromStart = false;
1388 iCountDown = 10240;
1389 pChunk = pPrivateSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
1390 while (pChunk)
1391 {
1392 PGMMCHUNK pNext = pChunk->pFreeNext;
1393 Assert(pChunk->cFree == GMM_CHUNK_NUM_PAGES);
1394 if ( !pGMM->fBoundMemoryMode
1395 || pChunk->hGVM == pGVM->hSelf)
1396 {
1397 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1398 if (gmmR0FreeChunk(pGMM, pGVM, pChunk, true /*fRelaxedSem*/))
1399 {
1400 /* We've left the giant mutex, restart? (+1 for our unlink) */
1401 fRedoFromStart = pPrivateSet->idGeneration != idGenerationOld + 1;
1402 if (fRedoFromStart)
1403 break;
1404 uLockNanoTS = RTTimeSystemNanoTS();
1405 iCountDown = 10240;
1406 }
1407 }
1408
1409 /* Advance and maybe yield the lock. */
1410 pChunk = pNext;
1411 if (--iCountDown == 0)
1412 {
1413 uint64_t const idGenerationOld = pPrivateSet->idGeneration;
1414 fRedoFromStart = gmmR0MutexYield(pGMM, &uLockNanoTS)
1415 && pPrivateSet->idGeneration != idGenerationOld;
1416 if (fRedoFromStart)
1417 break;
1418 iCountDown = 10240;
1419 }
1420 }
1421 } while (fRedoFromStart);
1422
1423 /*
1424 * Account for shared pages that weren't freed.
1425 */
1426 if (pGVM->gmm.s.Stats.cSharedPages)
1427 {
1428 Assert(pGMM->cSharedPages >= pGVM->gmm.s.Stats.cSharedPages);
1429 SUPR0Printf("GMMR0CleanupVM: hGVM=%#x left %#x shared pages behind!\n", pGVM->hSelf, pGVM->gmm.s.Stats.cSharedPages);
1430 pGMM->cLeftBehindSharedPages += pGVM->gmm.s.Stats.cSharedPages;
1431 }
1432
1433 /*
1434 * Clean up balloon statistics in case the VM process crashed.
1435 */
1436 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
1437 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
1438
1439 /*
1440 * Update the over-commitment management statistics.
1441 */
1442 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1443 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1444 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1445 switch (pGVM->gmm.s.Stats.enmPolicy)
1446 {
1447 case GMMOCPOLICY_NO_OC:
1448 break;
1449 default:
1450 /** @todo Update GMM->cOverCommittedPages */
1451 break;
1452 }
1453 }
1454
1455 /* zap the GVM data. */
1456 pGVM->gmm.s.Stats.enmPolicy = GMMOCPOLICY_INVALID;
1457 pGVM->gmm.s.Stats.enmPriority = GMMPRIORITY_INVALID;
1458 pGVM->gmm.s.Stats.fMayAllocate = false;
1459
1460 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1461 gmmR0MutexRelease(pGMM);
1462
1463 /*
1464 * Destroy the spinlock.
1465 */
1466 RTSPINLOCK hSpinlock = NIL_RTSPINLOCK;
1467 ASMAtomicXchgHandle(&pGVM->gmm.s.hChunkTlbSpinLock, NIL_RTSPINLOCK, &hSpinlock);
1468 RTSpinlockDestroy(hSpinlock);
1469
1470 LogFlow(("GMMR0CleanupVM: returns\n"));
1471}
1472
1473
1474/**
1475 * Scan one chunk for private pages belonging to the specified VM.
1476 *
1477 * @note This function may drop the giant mutex!
1478 *
1479 * @returns @c true if we've temporarily dropped the giant mutex, @c false if
1480 * we didn't.
1481 * @param pGMM Pointer to the GMM instance.
1482 * @param pGVM The global VM handle.
1483 * @param pChunk The chunk to scan.
1484 */
1485static bool gmmR0CleanupVMScanChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
1486{
1487 Assert(!pGMM->fBoundMemoryMode || pChunk->hGVM == pGVM->hSelf);
1488
1489 /*
1490 * Look for pages belonging to the VM.
1491 * (Perform some internal checks while we're scanning.)
1492 */
1493#ifndef VBOX_STRICT
1494 if (pChunk->cFree != (GMM_CHUNK_SIZE >> PAGE_SHIFT))
1495#endif
1496 {
1497 unsigned cPrivate = 0;
1498 unsigned cShared = 0;
1499 unsigned cFree = 0;
1500
1501 gmmR0UnlinkChunk(pChunk); /* avoiding cFreePages updates. */
1502
1503 uint16_t hGVM = pGVM->hSelf;
1504 unsigned iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
1505 while (iPage-- > 0)
1506 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
1507 {
1508 if (pChunk->aPages[iPage].Private.hGVM == hGVM)
1509 {
1510 /*
1511 * Free the page.
1512 *
1513 * The reason for not using gmmR0FreePrivatePage here is that we
1514 * must *not* cause the chunk to be freed from under us - we're in
1515 * an AVL tree walk here.
1516 */
1517 pChunk->aPages[iPage].u = 0;
1518 pChunk->aPages[iPage].Free.iNext = pChunk->iFreeHead;
1519 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
1520 pChunk->iFreeHead = iPage;
1521 pChunk->cPrivate--;
1522 pChunk->cFree++;
1523 pGVM->gmm.s.Stats.cPrivatePages--;
1524 cFree++;
1525 }
1526 else
1527 cPrivate++;
1528 }
1529 else if (GMM_PAGE_IS_FREE(&pChunk->aPages[iPage]))
1530 cFree++;
1531 else
1532 cShared++;
1533
1534 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1535
1536 /*
1537 * Did it add up?
1538 */
1539 if (RT_UNLIKELY( pChunk->cFree != cFree
1540 || pChunk->cPrivate != cPrivate
1541 || pChunk->cShared != cShared))
1542 {
1543 SUPR0Printf("gmmR0CleanupVMScanChunk: Chunk %RKv/%#x has bogus stats - free=%d/%d private=%d/%d shared=%d/%d\n",
1544 pChunk, pChunk->Core.Key, pChunk->cFree, cFree, pChunk->cPrivate, cPrivate, pChunk->cShared, cShared);
1545 pChunk->cFree = cFree;
1546 pChunk->cPrivate = cPrivate;
1547 pChunk->cShared = cShared;
1548 }
1549 }
1550
1551 /*
1552 * If not in bound memory mode, we should reset the hGVM field
1553 * if it has our handle in it.
1554 */
1555 if (pChunk->hGVM == pGVM->hSelf)
1556 {
1557 if (!g_pGMM->fBoundMemoryMode)
1558 pChunk->hGVM = NIL_GVM_HANDLE;
1559 else if (pChunk->cFree != GMM_CHUNK_NUM_PAGES)
1560 {
1561 SUPR0Printf("gmmR0CleanupVMScanChunk: %RKv/%#x: cFree=%#x - it should be 0 in bound mode!\n",
1562 pChunk, pChunk->Core.Key, pChunk->cFree);
1563 AssertMsgFailed(("%p/%#x: cFree=%#x - it should be 0 in bound mode!\n", pChunk, pChunk->Core.Key, pChunk->cFree));
1564
1565 gmmR0UnlinkChunk(pChunk);
1566 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
1567 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
1568 }
1569 }
1570
1571 /*
1572 * Look for a mapping belonging to the terminating VM.
1573 */
1574 GMMR0CHUNKMTXSTATE MtxState;
1575 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
1576 unsigned cMappings = pChunk->cMappingsX;
1577 for (unsigned i = 0; i < cMappings; i++)
1578 if (pChunk->paMappingsX[i].pGVM == pGVM)
1579 {
1580 gmmR0ChunkMutexDropGiant(&MtxState);
1581
1582 RTR0MEMOBJ hMemObj = pChunk->paMappingsX[i].hMapObj;
1583
1584 cMappings--;
1585 if (i < cMappings)
1586 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
1587 pChunk->paMappingsX[cMappings].pGVM = NULL;
1588 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
1589 Assert(pChunk->cMappingsX - 1U == cMappings);
1590 pChunk->cMappingsX = cMappings;
1591
1592 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings (NA) */);
1593 if (RT_FAILURE(rc))
1594 {
1595 SUPR0Printf("gmmR0CleanupVMScanChunk: %RKv/%#x: mapping #%x: RTRMemObjFree(%RKv,false) -> %d \n",
1596 pChunk, pChunk->Core.Key, i, hMemObj, rc);
1597 AssertRC(rc);
1598 }
1599
1600 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1601 return true;
1602 }
1603
1604 gmmR0ChunkMutexRelease(&MtxState, pChunk);
1605 return false;
1606}
1607
1608
1609/**
1610 * The initial resource reservations.
1611 *
1612 * This will make memory reservations according to policy and priority. If there aren't
1613 * sufficient resources available to sustain the VM this function will fail and all
1614 * future allocation requests will fail as well.
1615 *
1616 * These are just the initial reservations made very early during the VM creation
1617 * process and will be adjusted later in the GMMR0UpdateReservation call after the
1618 * ring-3 init has completed.
1619 *
1620 * @returns VBox status code.
1621 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1622 * @retval VERR_GMM_
1623 *
1624 * @param pGVM The global (ring-0) VM structure.
1625 * @param idCpu The VCPU id - must be zero.
1626 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1627 * This does not include MMIO2 and similar.
1628 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1629 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1630 * hyper heap, MMIO2 and similar.
1631 * @param enmPolicy The OC policy to use on this VM.
1632 * @param enmPriority The priority in an out-of-memory situation.
1633 *
1634 * @thread The creator thread / EMT(0).
1635 */
1636GMMR0DECL(int) GMMR0InitialReservation(PGVM pGVM, VMCPUID idCpu, uint64_t cBasePages, uint32_t cShadowPages,
1637 uint32_t cFixedPages, GMMOCPOLICY enmPolicy, GMMPRIORITY enmPriority)
1638{
1639 LogFlow(("GMMR0InitialReservation: pGVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x enmPolicy=%d enmPriority=%d\n",
1640 pGVM, cBasePages, cShadowPages, cFixedPages, enmPolicy, enmPriority));
1641
1642 /*
1643 * Validate, get basics and take the semaphore.
1644 */
1645 AssertReturn(idCpu == 0, VERR_INVALID_CPU_ID);
1646 PGMM pGMM;
1647 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1648 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
1649 if (RT_FAILURE(rc))
1650 return rc;
1651
1652 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1653 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1654 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1655 AssertReturn(enmPolicy > GMMOCPOLICY_INVALID && enmPolicy < GMMOCPOLICY_END, VERR_INVALID_PARAMETER);
1656 AssertReturn(enmPriority > GMMPRIORITY_INVALID && enmPriority < GMMPRIORITY_END, VERR_INVALID_PARAMETER);
1657
1658 gmmR0MutexAcquire(pGMM);
1659 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1660 {
1661 if ( !pGVM->gmm.s.Stats.Reserved.cBasePages
1662 && !pGVM->gmm.s.Stats.Reserved.cFixedPages
1663 && !pGVM->gmm.s.Stats.Reserved.cShadowPages)
1664 {
1665 /*
1666 * Check if we can accommodate this.
1667 */
1668 /* ... later ... */
1669 if (RT_SUCCESS(rc))
1670 {
1671 /*
1672 * Update the records.
1673 */
1674 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1675 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1676 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1677 pGVM->gmm.s.Stats.enmPolicy = enmPolicy;
1678 pGVM->gmm.s.Stats.enmPriority = enmPriority;
1679 pGVM->gmm.s.Stats.fMayAllocate = true;
1680
1681 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1682 pGMM->cRegisteredVMs++;
1683 }
1684 }
1685 else
1686 rc = VERR_WRONG_ORDER;
1687 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1688 }
1689 else
1690 rc = VERR_GMM_IS_NOT_SANE;
1691 gmmR0MutexRelease(pGMM);
1692 LogFlow(("GMMR0InitialReservation: returns %Rrc\n", rc));
1693 return rc;
1694}
1695
1696
1697/**
1698 * VMMR0 request wrapper for GMMR0InitialReservation.
1699 *
1700 * @returns see GMMR0InitialReservation.
1701 * @param pGVM The global (ring-0) VM structure.
1702 * @param idCpu The VCPU id.
1703 * @param pReq Pointer to the request packet.
1704 */
1705GMMR0DECL(int) GMMR0InitialReservationReq(PGVM pGVM, VMCPUID idCpu, PGMMINITIALRESERVATIONREQ pReq)
1706{
1707 /*
1708 * Validate input and pass it on.
1709 */
1710 AssertPtrReturn(pGVM, VERR_INVALID_POINTER);
1711 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1712 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1713
1714 return GMMR0InitialReservation(pGVM, idCpu, pReq->cBasePages, pReq->cShadowPages,
1715 pReq->cFixedPages, pReq->enmPolicy, pReq->enmPriority);
1716}
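
/*
 * A minimal ring-3 usage sketch (illustrative only): the request normally arrives
 * through the VMMR0 request path rather than by calling this wrapper directly, and
 * the policy/priority enum members, the page counts and the cbGuestRam variable
 * shown below are assumptions for the sake of the example.
 *
 *     GMMINITIALRESERVATIONREQ Req;
 *     Req.Hdr.cbReq    = sizeof(Req);                 (other header fields omitted)
 *     Req.cBasePages   = cbGuestRam >> PAGE_SHIFT;
 *     Req.cShadowPages = 1024;
 *     Req.cFixedPages  = 4096;
 *     Req.enmPolicy    = GMMOCPOLICY_NO_OC;
 *     Req.enmPriority  = GMMPRIORITY_NORMAL;
 *     int rc = GMMR0InitialReservationReq(pGVM, 0, &Req);    (idCpu must be 0)
 */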
1717
1718
1719/**
1720 * This updates the memory reservation with the additional MMIO2 and ROM pages.
1721 *
1722 * @returns VBox status code.
1723 * @retval VERR_GMM_MEMORY_RESERVATION_DECLINED
1724 *
1725 * @param pGVM The global (ring-0) VM structure.
1726 * @param idCpu The VCPU id.
1727 * @param cBasePages The number of pages that may be allocated for the base RAM and ROMs.
1728 * This does not include MMIO2 and similar.
1729 * @param cShadowPages The number of pages that may be allocated for shadow paging structures.
1730 * @param cFixedPages The number of pages that may be allocated for fixed objects like the
1731 * hyper heap, MMIO2 and similar.
1732 *
1733 * @thread EMT(idCpu)
1734 */
1735GMMR0DECL(int) GMMR0UpdateReservation(PGVM pGVM, VMCPUID idCpu, uint64_t cBasePages,
1736 uint32_t cShadowPages, uint32_t cFixedPages)
1737{
1738 LogFlow(("GMMR0UpdateReservation: pGVM=%p cBasePages=%#llx cShadowPages=%#x cFixedPages=%#x\n",
1739 pGVM, cBasePages, cShadowPages, cFixedPages));
1740
1741 /*
1742 * Validate, get basics and take the semaphore.
1743 */
1744 PGMM pGMM;
1745 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
1746 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
1747 if (RT_FAILURE(rc))
1748 return rc;
1749
1750 AssertReturn(cBasePages, VERR_INVALID_PARAMETER);
1751 AssertReturn(cShadowPages, VERR_INVALID_PARAMETER);
1752 AssertReturn(cFixedPages, VERR_INVALID_PARAMETER);
1753
1754 gmmR0MutexAcquire(pGMM);
1755 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
1756 {
1757 if ( pGVM->gmm.s.Stats.Reserved.cBasePages
1758 && pGVM->gmm.s.Stats.Reserved.cFixedPages
1759 && pGVM->gmm.s.Stats.Reserved.cShadowPages)
1760 {
1761 /*
1762 * Check if we can accommodate this.
1763 */
1764 /* ... later ... */
1765 if (RT_SUCCESS(rc))
1766 {
1767 /*
1768 * Update the records.
1769 */
1770 pGMM->cReservedPages -= pGVM->gmm.s.Stats.Reserved.cBasePages
1771 + pGVM->gmm.s.Stats.Reserved.cFixedPages
1772 + pGVM->gmm.s.Stats.Reserved.cShadowPages;
1773 pGMM->cReservedPages += cBasePages + cFixedPages + cShadowPages;
1774
1775 pGVM->gmm.s.Stats.Reserved.cBasePages = cBasePages;
1776 pGVM->gmm.s.Stats.Reserved.cFixedPages = cFixedPages;
1777 pGVM->gmm.s.Stats.Reserved.cShadowPages = cShadowPages;
1778 }
1779 }
1780 else
1781 rc = VERR_WRONG_ORDER;
1782 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
1783 }
1784 else
1785 rc = VERR_GMM_IS_NOT_SANE;
1786 gmmR0MutexRelease(pGMM);
1787 LogFlow(("GMMR0UpdateReservation: returns %Rrc\n", rc));
1788 return rc;
1789}
1790
1791
1792/**
1793 * VMMR0 request wrapper for GMMR0UpdateReservation.
1794 *
1795 * @returns see GMMR0UpdateReservation.
1796 * @param pGVM The global (ring-0) VM structure.
1797 * @param idCpu The VCPU id.
1798 * @param pReq Pointer to the request packet.
1799 */
1800GMMR0DECL(int) GMMR0UpdateReservationReq(PGVM pGVM, VMCPUID idCpu, PGMMUPDATERESERVATIONREQ pReq)
1801{
1802 /*
1803 * Validate input and pass it on.
1804 */
1805 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
1806 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
1807
1808 return GMMR0UpdateReservation(pGVM, idCpu, pReq->cBasePages, pReq->cShadowPages, pReq->cFixedPages);
1809}
1810
1811#ifdef GMMR0_WITH_SANITY_CHECK
1812
1813/**
1814 * Performs sanity checks on a free set.
1815 *
1816 * @returns Error count.
1817 *
1818 * @param pGMM Pointer to the GMM instance.
1819 * @param pSet Pointer to the set.
1820 * @param pszSetName The set name.
1821 * @param pszFunction The function from which it was called.
1822 * @param   uLineNo     The line number.
1823 */
1824static uint32_t gmmR0SanityCheckSet(PGMM pGMM, PGMMCHUNKFREESET pSet, const char *pszSetName,
1825 const char *pszFunction, unsigned uLineNo)
1826{
1827 uint32_t cErrors = 0;
1828
1829 /*
1830 * Count the free pages in all the chunks and match it against pSet->cFreePages.
1831 */
1832 uint32_t cPages = 0;
1833 for (unsigned i = 0; i < RT_ELEMENTS(pSet->apLists); i++)
1834 {
1835 for (PGMMCHUNK pCur = pSet->apLists[i]; pCur; pCur = pCur->pFreeNext)
1836 {
1837            /** @todo check that the chunk is hashed into the right set. */
1838 cPages += pCur->cFree;
1839 }
1840 }
1841 if (RT_UNLIKELY(cPages != pSet->cFreePages))
1842 {
1843 SUPR0Printf("GMM insanity: found %#x pages in the %s set, expected %#x. (%s, line %u)\n",
1844 cPages, pszSetName, pSet->cFreePages, pszFunction, uLineNo);
1845 cErrors++;
1846 }
1847
1848 return cErrors;
1849}
1850
1851
1852/**
1853 * Performs some sanity checks on the GMM while owning the lock.
1854 *
1855 * @returns Error count.
1856 *
1857 * @param pGMM Pointer to the GMM instance.
1858 * @param pszFunction The function from which it is called.
1859 * @param uLineNo The line number.
1860 */
1861static uint32_t gmmR0SanityCheck(PGMM pGMM, const char *pszFunction, unsigned uLineNo)
1862{
1863 uint32_t cErrors = 0;
1864
1865 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->PrivateX, "private", pszFunction, uLineNo);
1866 cErrors += gmmR0SanityCheckSet(pGMM, &pGMM->Shared, "shared", pszFunction, uLineNo);
1867 /** @todo add more sanity checks. */
1868
1869 return cErrors;
1870}
1871
1872#endif /* GMMR0_WITH_SANITY_CHECK */
1873
1874/**
1875 * Looks up a chunk in the tree and fills in the TLB entry for it.
1876 *
1877 * This is not expected to fail and will bitch if it does.
1878 *
1879 * @returns Pointer to the allocation chunk, NULL if not found.
1880 * @param pGMM Pointer to the GMM instance.
1881 * @param idChunk The ID of the chunk to find.
1882 * @param pTlbe Pointer to the TLB entry.
1883 *
1884 * @note Caller owns spinlock.
1885 */
1886static PGMMCHUNK gmmR0GetChunkSlow(PGMM pGMM, uint32_t idChunk, PGMMCHUNKTLBE pTlbe)
1887{
1888 PGMMCHUNK pChunk = (PGMMCHUNK)RTAvlU32Get(&pGMM->pChunks, idChunk);
1889 AssertMsgReturn(pChunk, ("Chunk %#x not found!\n", idChunk), NULL);
1890 pTlbe->idChunk = idChunk;
1891 pTlbe->pChunk = pChunk;
1892 return pChunk;
1893}
1894
1895
1896/**
1897 * Finds an allocation chunk, spin-locked.
1898 *
1899 * This is not expected to fail and will bitch if it does.
1900 *
1901 * @returns Pointer to the allocation chunk, NULL if not found.
1902 * @param pGMM Pointer to the GMM instance.
1903 * @param idChunk The ID of the chunk to find.
1904 */
1905DECLINLINE(PGMMCHUNK) gmmR0GetChunkLocked(PGMM pGMM, uint32_t idChunk)
1906{
1907 /*
1908 * Do a TLB lookup, branch if not in the TLB.
1909 */
1910 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(idChunk)];
1911 PGMMCHUNK pChunk = pTlbe->pChunk;
1912 if ( pChunk == NULL
1913 || pTlbe->idChunk != idChunk)
1914 pChunk = gmmR0GetChunkSlow(pGMM, idChunk, pTlbe);
1915 return pChunk;
1916}
1917
1918
1919/**
1920 * Finds an allocation chunk.
1921 *
1922 * This is not expected to fail and will bitch if it does.
1923 *
1924 * @returns Pointer to the allocation chunk, NULL if not found.
1925 * @param pGMM Pointer to the GMM instance.
1926 * @param idChunk The ID of the chunk to find.
1927 */
1928DECLINLINE(PGMMCHUNK) gmmR0GetChunk(PGMM pGMM, uint32_t idChunk)
1929{
1930 RTSpinlockAcquire(pGMM->hSpinLockTree);
1931 PGMMCHUNK pChunk = gmmR0GetChunkLocked(pGMM, idChunk);
1932 RTSpinlockRelease(pGMM->hSpinLockTree);
1933 return pChunk;
1934}
1935
1936
1937/**
1938 * Finds a page.
1939 *
1940 * This is not expected to fail and will bitch if it does.
1941 *
1942 * @returns Pointer to the page, NULL if not found.
1943 * @param pGMM Pointer to the GMM instance.
1944 * @param idPage The ID of the page to find.
1945 */
1946DECLINLINE(PGMMPAGE) gmmR0GetPage(PGMM pGMM, uint32_t idPage)
1947{
1948 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1949 if (RT_LIKELY(pChunk))
1950 return &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
1951 return NULL;
1952}
1953
1954
1955#if 0 /* unused */
1956/**
1957 * Gets the host physical address for a page given by its ID.
1958 *
1959 * @returns The host physical address or NIL_RTHCPHYS.
1960 * @param pGMM Pointer to the GMM instance.
1961 * @param idPage The ID of the page to find.
1962 */
1963DECLINLINE(RTHCPHYS) gmmR0GetPageHCPhys(PGMM pGMM, uint32_t idPage)
1964{
1965 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
1966 if (RT_LIKELY(pChunk))
1967 return RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, idPage & GMM_PAGEID_IDX_MASK);
1968 return NIL_RTHCPHYS;
1969}
1970#endif /* unused */
1971
1972
1973/**
1974 * Selects the appropriate free list given the number of free pages.
1975 *
1976 * @returns Free list index.
1977 * @param cFree The number of free pages in the chunk.
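 *
 * @note  Chunks are binned by how many free pages they contain, in buckets of
 *        1 << GMM_CHUNK_FREE_SET_SHIFT pages, which lets the allocation
 *        strategies further down quickly find nearly exhausted or completely
 *        unused chunks.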
1978 */
1979DECLINLINE(unsigned) gmmR0SelectFreeSetList(unsigned cFree)
1980{
1981 unsigned iList = cFree >> GMM_CHUNK_FREE_SET_SHIFT;
1982 AssertMsg(iList < RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists) / RT_SIZEOFMEMB(GMMCHUNKFREESET, apLists[0]),
1983 ("%d (%u)\n", iList, cFree));
1984 return iList;
1985}
1986
1987
1988/**
1989 * Unlinks the chunk from the free list it's currently on (if any).
1990 *
1991 * @param pChunk The allocation chunk.
1992 */
1993DECLINLINE(void) gmmR0UnlinkChunk(PGMMCHUNK pChunk)
1994{
1995 PGMMCHUNKFREESET pSet = pChunk->pSet;
1996 if (RT_LIKELY(pSet))
1997 {
1998 pSet->cFreePages -= pChunk->cFree;
1999 pSet->idGeneration++;
2000
2001 PGMMCHUNK pPrev = pChunk->pFreePrev;
2002 PGMMCHUNK pNext = pChunk->pFreeNext;
2003 if (pPrev)
2004 pPrev->pFreeNext = pNext;
2005 else
2006 pSet->apLists[gmmR0SelectFreeSetList(pChunk->cFree)] = pNext;
2007 if (pNext)
2008 pNext->pFreePrev = pPrev;
2009
2010 pChunk->pSet = NULL;
2011 pChunk->pFreeNext = NULL;
2012 pChunk->pFreePrev = NULL;
2013 }
2014 else
2015 {
2016 Assert(!pChunk->pFreeNext);
2017 Assert(!pChunk->pFreePrev);
2018 Assert(!pChunk->cFree);
2019 }
2020}
2021
2022
2023/**
2024 * Links the chunk onto the appropriate free list in the specified free set.
2025 *
2026 * If no free entries, it's not linked into any list.
2027 *
2028 * @param pChunk The allocation chunk.
2029 * @param pSet The free set.
2030 */
2031DECLINLINE(void) gmmR0LinkChunk(PGMMCHUNK pChunk, PGMMCHUNKFREESET pSet)
2032{
2033 Assert(!pChunk->pSet);
2034 Assert(!pChunk->pFreeNext);
2035 Assert(!pChunk->pFreePrev);
2036
2037 if (pChunk->cFree > 0)
2038 {
2039 pChunk->pSet = pSet;
2040 pChunk->pFreePrev = NULL;
2041 unsigned const iList = gmmR0SelectFreeSetList(pChunk->cFree);
2042 pChunk->pFreeNext = pSet->apLists[iList];
2043 if (pChunk->pFreeNext)
2044 pChunk->pFreeNext->pFreePrev = pChunk;
2045 pSet->apLists[iList] = pChunk;
2046
2047 pSet->cFreePages += pChunk->cFree;
2048 pSet->idGeneration++;
2049 }
2050}
2051
2052
2053/**
2054 * Selects the appropriate free set for the chunk and links it onto that set's free list.
2055 *
2056 * If no free entries, it's not linked into any list.
2057 *
2058 * @param pGMM Pointer to the GMM instance.
2059 * @param   pGVM        Pointer to the kernel-only VM instance data.
2060 * @param pChunk The allocation chunk.
2061 */
2062DECLINLINE(void) gmmR0SelectSetAndLinkChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
2063{
2064 PGMMCHUNKFREESET pSet;
2065 if (pGMM->fBoundMemoryMode)
2066 pSet = &pGVM->gmm.s.Private;
2067 else if (pChunk->cShared)
2068 pSet = &pGMM->Shared;
2069 else
2070 pSet = &pGMM->PrivateX;
2071 gmmR0LinkChunk(pChunk, pSet);
2072}
2073
2074
2075/**
2076 * Frees a Chunk ID.
2077 *
2078 * @param pGMM Pointer to the GMM instance.
2079 * @param idChunk The Chunk ID to free.
2080 */
2081static void gmmR0FreeChunkId(PGMM pGMM, uint32_t idChunk)
2082{
2083 AssertReturnVoid(idChunk != NIL_GMM_CHUNKID);
2084 AssertMsg(ASMBitTest(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk));
2085 ASMAtomicBitClear(&pGMM->bmChunkId[0], idChunk);
2086}
2087
2088
2089/**
2090 * Allocates a new Chunk ID.
2091 *
2092 * @returns The Chunk ID.
2093 * @param pGMM Pointer to the GMM instance.
2094 */
2095static uint32_t gmmR0AllocateChunkId(PGMM pGMM)
2096{
2097 AssertCompile(!((GMM_CHUNKID_LAST + 1) & 31)); /* must be a multiple of 32 */
2098 AssertCompile(NIL_GMM_CHUNKID == 0);
2099
2100 /*
2101 * Try the next sequential one.
2102 */
2103 int32_t idChunk = ++pGMM->idChunkPrev;
2104#if 0 /** @todo enable this code */
2105 if ( idChunk <= GMM_CHUNKID_LAST
2106 && idChunk > NIL_GMM_CHUNKID
2107        && !ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk))
2108 return idChunk;
2109#endif
2110
2111 /*
2112 * Scan sequentially from the last one.
2113 */
2114 if ( (uint32_t)idChunk < GMM_CHUNKID_LAST
2115 && idChunk > NIL_GMM_CHUNKID)
2116 {
2117 idChunk = ASMBitNextClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1, idChunk - 1);
2118 if (idChunk > NIL_GMM_CHUNKID)
2119 {
2120 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2121 return pGMM->idChunkPrev = idChunk;
2122 }
2123 }
2124
2125 /*
2126 * Ok, scan from the start.
2127 * We're not racing anyone, so there is no need to expect failures or have restart loops.
2128 */
2129 idChunk = ASMBitFirstClear(&pGMM->bmChunkId[0], GMM_CHUNKID_LAST + 1);
2130    AssertMsgReturn(idChunk > NIL_GMM_CHUNKID, ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2131 AssertMsgReturn(!ASMAtomicBitTestAndSet(&pGMM->bmChunkId[0], idChunk), ("%#x\n", idChunk), NIL_GMM_CHUNKID);
2132
2133 return pGMM->idChunkPrev = idChunk;
2134}
2135
2136
2137/**
2138 * Allocates one private page.
2139 *
2140 * Worker for gmmR0AllocatePages.
2141 *
2142 * @param pChunk The chunk to allocate it from.
2143 * @param hGVM The GVM handle of the VM requesting memory.
2144 * @param pPageDesc The page descriptor.
2145 */
2146static void gmmR0AllocatePage(PGMMCHUNK pChunk, uint32_t hGVM, PGMMPAGEDESC pPageDesc)
2147{
2148 /* update the chunk stats. */
2149 if (pChunk->hGVM == NIL_GVM_HANDLE)
2150 pChunk->hGVM = hGVM;
2151 Assert(pChunk->cFree);
2152 pChunk->cFree--;
2153 pChunk->cPrivate++;
2154
2155 /* unlink the first free page. */
2156 const uint32_t iPage = pChunk->iFreeHead;
2157 AssertReleaseMsg(iPage < RT_ELEMENTS(pChunk->aPages), ("%d\n", iPage));
2158 PGMMPAGE pPage = &pChunk->aPages[iPage];
2159 Assert(GMM_PAGE_IS_FREE(pPage));
2160 pChunk->iFreeHead = pPage->Free.iNext;
2161 Log3(("A pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x iNext=%#x\n",
2162 pPage, iPage, (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage,
2163 pPage->Common.u2State, pChunk->iFreeHead, pPage->Free.iNext));
2164
2165 /* make the page private. */
2166 pPage->u = 0;
2167 AssertCompile(GMM_PAGE_STATE_PRIVATE == 0);
2168 pPage->Private.hGVM = hGVM;
2169 AssertCompile(NIL_RTHCPHYS >= GMM_GCPHYS_LAST);
2170 AssertCompile(GMM_GCPHYS_UNSHAREABLE >= GMM_GCPHYS_LAST);
2171 if (pPageDesc->HCPhysGCPhys <= GMM_GCPHYS_LAST)
2172 pPage->Private.pfn = pPageDesc->HCPhysGCPhys >> PAGE_SHIFT;
2173 else
2174 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE; /* unshareable / unassigned - same thing. */
2175
2176 /* update the page descriptor. */
2177 pPageDesc->HCPhysGCPhys = RTR0MemObjGetPagePhysAddr(pChunk->hMemObj, iPage);
2178 Assert(pPageDesc->HCPhysGCPhys != NIL_RTHCPHYS);
2179 pPageDesc->idPage = (pChunk->Core.Key << GMM_CHUNKID_SHIFT) | iPage;
2180 pPageDesc->idSharedPage = NIL_GMM_PAGEID;
2181}
2182
2183
2184/**
2185 * Picks the free pages from a chunk.
2186 *
2187 * @returns The new page descriptor table index.
2188 * @param pChunk The chunk.
2189 * @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2190 * affinity.
2191 * @param iPage The current page descriptor table index.
2192 * @param cPages The total number of pages to allocate.
2193 * @param   paPages     The page descriptor table (input + output).
2194 */
2195static uint32_t gmmR0AllocatePagesFromChunk(PGMMCHUNK pChunk, uint16_t const hGVM, uint32_t iPage, uint32_t cPages,
2196 PGMMPAGEDESC paPages)
2197{
2198 PGMMCHUNKFREESET pSet = pChunk->pSet; Assert(pSet);
2199 gmmR0UnlinkChunk(pChunk);
2200
2201 for (; pChunk->cFree && iPage < cPages; iPage++)
2202 gmmR0AllocatePage(pChunk, hGVM, &paPages[iPage]);
2203
2204 gmmR0LinkChunk(pChunk, pSet);
2205 return iPage;
2206}
2207
2208
2209/**
2210 * Registers a new chunk of memory.
2211 *
2212 * This is called by both gmmR0AllocateOneChunk and GMMR0SeedChunk.
2213 *
2214 * @returns VBox status code. On success, the giant GMM lock will be held, the
2215 * caller must release it (ugly).
2216 * @param pGMM Pointer to the GMM instance.
2217 * @param pSet Pointer to the set.
2218 * @param hMemObj The memory object for the chunk.
2219 * @param hGVM The affinity of the chunk. NIL_GVM_HANDLE for no
2220 * affinity.
2221 * @param fChunkFlags The chunk flags, GMM_CHUNK_FLAGS_XXX.
2222 * @param ppChunk Chunk address (out). Optional.
2223 *
2224 * @remarks The caller must not own the giant GMM mutex.
2225 * The giant GMM mutex will be acquired and returned acquired in
2226 * the success path. On failure, no locks will be held.
2227 */
2228static int gmmR0RegisterChunk(PGMM pGMM, PGMMCHUNKFREESET pSet, RTR0MEMOBJ hMemObj, uint16_t hGVM, uint16_t fChunkFlags,
2229 PGMMCHUNK *ppChunk)
2230{
2231 Assert(pGMM->hMtxOwner != RTThreadNativeSelf());
2232 Assert(hGVM != NIL_GVM_HANDLE || pGMM->fBoundMemoryMode);
2233#ifdef GMM_WITH_LEGACY_MODE
2234 Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE || fChunkFlags == GMM_CHUNK_FLAGS_SEEDED);
2235#else
2236 Assert(fChunkFlags == 0 || fChunkFlags == GMM_CHUNK_FLAGS_LARGE_PAGE);
2237#endif
2238
2239#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2240 /*
2241 * Get a ring-0 mapping of the object.
2242 */
2243# ifdef GMM_WITH_LEGACY_MODE
2244 uint8_t *pbMapping = !(fChunkFlags & GMM_CHUNK_FLAGS_SEEDED) ? (uint8_t *)RTR0MemObjAddress(hMemObj) : NULL;
2245# else
2246 uint8_t *pbMapping = (uint8_t *)RTR0MemObjAddress(hMemObj);
2247# endif
2248 if (!pbMapping)
2249 {
2250 RTR0MEMOBJ hMapObj;
2251 int rc = RTR0MemObjMapKernel(&hMapObj, hMemObj, (void *)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE);
2252 if (RT_SUCCESS(rc))
2253 pbMapping = (uint8_t *)RTR0MemObjAddress(hMapObj);
2254 else
2255 return rc;
2256 AssertPtr(pbMapping);
2257 }
2258#endif
2259
2260 /*
2261 * Allocate a chunk.
2262 */
2263 int rc;
2264 PGMMCHUNK pChunk = (PGMMCHUNK)RTMemAllocZ(sizeof(*pChunk));
2265 if (pChunk)
2266 {
2267 /*
2268 * Initialize it.
2269 */
2270 pChunk->hMemObj = hMemObj;
2271#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2272 pChunk->pbMapping = pbMapping;
2273#endif
2274 pChunk->cFree = GMM_CHUNK_NUM_PAGES;
2275 pChunk->hGVM = hGVM;
2276 /*pChunk->iFreeHead = 0;*/
2277 pChunk->idNumaNode = gmmR0GetCurrentNumaNodeId();
2278 pChunk->iChunkMtx = UINT8_MAX;
2279 pChunk->fFlags = fChunkFlags;
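        /* Chain all pages into the chunk's free list: page 0 -> 1 -> ... -> last,
           terminated by UINT16_MAX (iFreeHead is already 0 thanks to RTMemAllocZ). */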
2280 for (unsigned iPage = 0; iPage < RT_ELEMENTS(pChunk->aPages) - 1; iPage++)
2281 {
2282 pChunk->aPages[iPage].Free.u2State = GMM_PAGE_STATE_FREE;
2283 pChunk->aPages[iPage].Free.iNext = iPage + 1;
2284 }
2285 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.u2State = GMM_PAGE_STATE_FREE;
2286 pChunk->aPages[RT_ELEMENTS(pChunk->aPages) - 1].Free.iNext = UINT16_MAX;
2287
2288 /*
2289 * Allocate a Chunk ID and insert it into the tree.
2290 * This has to be done behind the mutex of course.
2291 */
2292 rc = gmmR0MutexAcquire(pGMM);
2293 if (RT_SUCCESS(rc))
2294 {
2295 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2296 {
2297 pChunk->Core.Key = gmmR0AllocateChunkId(pGMM);
2298 if ( pChunk->Core.Key != NIL_GMM_CHUNKID
2299 && pChunk->Core.Key <= GMM_CHUNKID_LAST)
2300 {
2301 RTSpinlockAcquire(pGMM->hSpinLockTree);
2302 if (RTAvlU32Insert(&pGMM->pChunks, &pChunk->Core))
2303 {
2304 pGMM->cChunks++;
2305 RTListAppend(&pGMM->ChunkList, &pChunk->ListNode);
2306 RTSpinlockRelease(pGMM->hSpinLockTree);
2307
2308 gmmR0LinkChunk(pChunk, pSet);
2309
2310 LogFlow(("gmmR0RegisterChunk: pChunk=%p id=%#x cChunks=%d\n", pChunk, pChunk->Core.Key, pGMM->cChunks));
2311
2312 if (ppChunk)
2313 *ppChunk = pChunk;
2314 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
2315 return VINF_SUCCESS;
2316 }
2317 RTSpinlockRelease(pGMM->hSpinLockTree);
2318 }
2319
2320 /* bail out */
2321 rc = VERR_GMM_CHUNK_INSERT;
2322 }
2323 else
2324 rc = VERR_GMM_IS_NOT_SANE;
2325 gmmR0MutexRelease(pGMM);
2326 }
2327
2328 RTMemFree(pChunk);
2329 }
2330 else
2331 rc = VERR_NO_MEMORY;
2332 return rc;
2333}
2334
2335
2336/**
2337 * Allocates a new chunk, immediately picks the requested pages from it, and adds
2338 * what's remaining to the specified free set.
2339 *
2340 * @note This will leave the giant mutex while allocating the new chunk!
2341 *
2342 * @returns VBox status code.
2343 * @param pGMM Pointer to the GMM instance data.
2344 * @param   pGVM        Pointer to the kernel-only VM instance data.
2345 * @param pSet Pointer to the free set.
2346 * @param cPages The number of pages requested.
2347 * @param paPages The page descriptor table (input + output).
2348 * @param piPage The pointer to the page descriptor table index variable.
2349 * This will be updated.
2350 */
2351static int gmmR0AllocateChunkNew(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet, uint32_t cPages,
2352 PGMMPAGEDESC paPages, uint32_t *piPage)
2353{
2354 gmmR0MutexRelease(pGMM);
2355
2356 RTR0MEMOBJ hMemObj;
2357#ifndef GMM_WITH_LEGACY_MODE
2358 int rc;
2359# ifdef VBOX_WITH_LINEAR_HOST_PHYS_MEM
2360 if (pGMM->fHasWorkingAllocPhysNC)
2361 rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2362 else
2363# endif
2364 rc = RTR0MemObjAllocPage(&hMemObj, GMM_CHUNK_SIZE, false /*fExecutable*/);
2365#else
2366 int rc = RTR0MemObjAllocPhysNC(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS);
2367#endif
2368 if (RT_SUCCESS(rc))
2369 {
2370 /** @todo Duplicate gmmR0RegisterChunk here so we can avoid chaining up the
2371 * free pages first and then unchaining them right afterwards. Instead
2372 * do as much work as possible without holding the giant lock. */
2373 PGMMCHUNK pChunk;
2374 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, 0 /*fChunkFlags*/, &pChunk);
2375 if (RT_SUCCESS(rc))
2376 {
2377 *piPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, *piPage, cPages, paPages);
2378 return VINF_SUCCESS;
2379 }
2380
2381 /* bail out */
2382 RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
2383 }
2384
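    /* Reacquire the giant lock before returning, since the caller expects to hold
       it either way; if reacquisition fails, prefer reporting the original
       allocation error (if any) over the lock error. */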
2385 int rc2 = gmmR0MutexAcquire(pGMM);
2386 AssertRCReturn(rc2, RT_FAILURE(rc) ? rc : rc2);
2387 return rc;
2388
2389}
2390
2391
2392/**
2393 * As a last resort we'll pick any page we can get.
2394 *
2395 * @returns The new page descriptor table index.
2396 * @param pSet The set to pick from.
2397 * @param pGVM Pointer to the global VM structure.
2398 * @param iPage The current page descriptor table index.
2399 * @param cPages The total number of pages to allocate.
2400 * @param   paPages     The page descriptor table (input + output).
2401 */
2402static uint32_t gmmR0AllocatePagesIndiscriminately(PGMMCHUNKFREESET pSet, PGVM pGVM,
2403 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2404{
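    /* Walk the free lists from the most-free end (including completely unused
       chunks) downwards, taking pages from every chunk we come across. */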
2405 unsigned iList = RT_ELEMENTS(pSet->apLists);
2406 while (iList-- > 0)
2407 {
2408 PGMMCHUNK pChunk = pSet->apLists[iList];
2409 while (pChunk)
2410 {
2411 PGMMCHUNK pNext = pChunk->pFreeNext;
2412
2413 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2414 if (iPage >= cPages)
2415 return iPage;
2416
2417 pChunk = pNext;
2418 }
2419 }
2420 return iPage;
2421}
2422
2423
2424/**
2425 * Pick pages from empty chunks on the same NUMA node.
2426 *
2427 * @returns The new page descriptor table index.
2428 * @param pSet The set to pick from.
2429 * @param pGVM Pointer to the global VM structure.
2430 * @param iPage The current page descriptor table index.
2431 * @param cPages The total number of pages to allocate.
2432 * @param   paPages     The page descriptor table (input + output).
2433 */
2434static uint32_t gmmR0AllocatePagesFromEmptyChunksOnSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2435 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2436{
2437 PGMMCHUNK pChunk = pSet->apLists[GMM_CHUNK_FREE_SET_UNUSED_LIST];
2438 if (pChunk)
2439 {
2440 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2441 while (pChunk)
2442 {
2443 PGMMCHUNK pNext = pChunk->pFreeNext;
2444
2445 if (pChunk->idNumaNode == idNumaNode)
2446 {
2447 pChunk->hGVM = pGVM->hSelf;
2448 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2449 if (iPage >= cPages)
2450 {
2451 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2452 return iPage;
2453 }
2454 }
2455
2456 pChunk = pNext;
2457 }
2458 }
2459 return iPage;
2460}
2461
2462
2463/**
2464 * Pick pages from non-empty chunks on the same NUMA node.
2465 *
2466 * @returns The new page descriptor table index.
2467 * @param pSet The set to pick from.
2468 * @param pGVM Pointer to the global VM structure.
2469 * @param iPage The current page descriptor table index.
2470 * @param cPages The total number of pages to allocate.
2471 * @param   paPages     The page descriptor table (input + output).
2472 */
2473static uint32_t gmmR0AllocatePagesFromSameNode(PGMMCHUNKFREESET pSet, PGVM pGVM,
2474 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2475{
2476 /** @todo start by picking from chunks with about the right size first? */
2477 uint16_t const idNumaNode = gmmR0GetCurrentNumaNodeId();
2478 unsigned iList = GMM_CHUNK_FREE_SET_UNUSED_LIST;
2479 while (iList-- > 0)
2480 {
2481 PGMMCHUNK pChunk = pSet->apLists[iList];
2482 while (pChunk)
2483 {
2484 PGMMCHUNK pNext = pChunk->pFreeNext;
2485
2486 if (pChunk->idNumaNode == idNumaNode)
2487 {
2488 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2489 if (iPage >= cPages)
2490 {
2491 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2492 return iPage;
2493 }
2494 }
2495
2496 pChunk = pNext;
2497 }
2498 }
2499 return iPage;
2500}
2501
2502
2503/**
2504 * Pick pages that are in chunks already associated with the VM.
2505 *
2506 * @returns The new page descriptor table index.
2507 * @param pGMM Pointer to the GMM instance data.
2508 * @param pGVM Pointer to the global VM structure.
2509 * @param pSet The set to pick from.
2510 * @param iPage The current page descriptor table index.
2511 * @param cPages The total number of pages to allocate.
2512 * @param   paPages     The page descriptor table (input + output).
2513 */
2514static uint32_t gmmR0AllocatePagesAssociatedWithVM(PGMM pGMM, PGVM pGVM, PGMMCHUNKFREESET pSet,
2515 uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2516{
2517 uint16_t const hGVM = pGVM->hSelf;
2518
2519 /* Hint. */
2520 if (pGVM->gmm.s.idLastChunkHint != NIL_GMM_CHUNKID)
2521 {
2522 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pGVM->gmm.s.idLastChunkHint);
2523 if (pChunk && pChunk->cFree)
2524 {
2525 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2526 if (iPage >= cPages)
2527 return iPage;
2528 }
2529 }
2530
2531 /* Scan. */
2532 for (unsigned iList = 0; iList < RT_ELEMENTS(pSet->apLists); iList++)
2533 {
2534 PGMMCHUNK pChunk = pSet->apLists[iList];
2535 while (pChunk)
2536 {
2537 PGMMCHUNK pNext = pChunk->pFreeNext;
2538
2539 if (pChunk->hGVM == hGVM)
2540 {
2541 iPage = gmmR0AllocatePagesFromChunk(pChunk, hGVM, iPage, cPages, paPages);
2542 if (iPage >= cPages)
2543 {
2544 pGVM->gmm.s.idLastChunkHint = pChunk->cFree ? pChunk->Core.Key : NIL_GMM_CHUNKID;
2545 return iPage;
2546 }
2547 }
2548
2549 pChunk = pNext;
2550 }
2551 }
2552 return iPage;
2553}
2554
2555
2556
2557/**
2558 * Pick pages in bound memory mode.
2559 *
2560 * @returns The new page descriptor table index.
2561 * @param pGVM Pointer to the global VM structure.
2562 * @param iPage The current page descriptor table index.
2563 * @param cPages The total number of pages to allocate.
2564 * @param   paPages     The page descriptor table (input + output).
2565 */
2566static uint32_t gmmR0AllocatePagesInBoundMode(PGVM pGVM, uint32_t iPage, uint32_t cPages, PGMMPAGEDESC paPages)
2567{
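    /* In bound memory mode each VM has its own private free set, so only this
       VM's own chunks are ever considered here. */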
2568 for (unsigned iList = 0; iList < RT_ELEMENTS(pGVM->gmm.s.Private.apLists); iList++)
2569 {
2570 PGMMCHUNK pChunk = pGVM->gmm.s.Private.apLists[iList];
2571 while (pChunk)
2572 {
2573 Assert(pChunk->hGVM == pGVM->hSelf);
2574 PGMMCHUNK pNext = pChunk->pFreeNext;
2575 iPage = gmmR0AllocatePagesFromChunk(pChunk, pGVM->hSelf, iPage, cPages, paPages);
2576 if (iPage >= cPages)
2577 return iPage;
2578 pChunk = pNext;
2579 }
2580 }
2581 return iPage;
2582}
2583
2584
2585/**
2586 * Checks if we should start picking pages from chunks of other VMs because
2587 * we're getting close to the system memory or reserved limit.
2588 *
2589 * @returns @c true if we should, @c false if we should first try to allocate more
2590 * chunks.
2591 */
2592static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(PGVM pGVM)
2593{
2594 /*
2595     * Don't allocate a new chunk if we're almost out of reserved pages anyway.
2596 */
2597 uint64_t cPgReserved = pGVM->gmm.s.Stats.Reserved.cBasePages
2598 + pGVM->gmm.s.Stats.Reserved.cFixedPages
2599 - pGVM->gmm.s.Stats.cBalloonedPages
2600 /** @todo what about shared pages? */;
2601 uint64_t cPgAllocated = pGVM->gmm.s.Stats.Allocated.cBasePages
2602 + pGVM->gmm.s.Stats.Allocated.cFixedPages;
2603 uint64_t cPgDelta = cPgReserved - cPgAllocated;
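    /* The threshold below is four chunks worth of pages, 4 * GMM_CHUNK_NUM_PAGES,
       which is 2048 pages or 8 MB with the current 2 MB chunk and 4 KiB page size.
       With less headroom than that left of the reservation, prefer borrowing from
       other VMs' chunks over allocating brand new ones. */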
2604 if (cPgDelta < GMM_CHUNK_NUM_PAGES * 4)
2605 return true;
2606 /** @todo make the threshold configurable, also test the code to see if
2607     *  this ever kicks in (we might be reserving too much or something). */
2608
2609 /*
2610     * Check how close we are to the max memory limit and how many fragments
2611     * there are...
2612 */
2613 /** @todo */
2614
2615 return false;
2616}
2617
2618
2619/**
2620 * Checks if we should start picking pages from chunks of other VMs because
2621 * there are a lot of free pages around.
2622 *
2623 * @returns @c true if we should, @c false if we should first try to allocate more
2624 * chunks.
2625 */
2626static bool gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(PGMM pGMM)
2627{
2628 /*
2629 * Setting the limit at 16 chunks (32 MB) at the moment.
2630 */
2631 if (pGMM->PrivateX.cFreePages >= GMM_CHUNK_NUM_PAGES * 16)
2632 return true;
2633 return false;
2634}
2635
2636
2637/**
2638 * Common worker for GMMR0AllocateHandyPages and GMMR0AllocatePages.
2639 *
2640 * @returns VBox status code:
2641 * @retval VINF_SUCCESS on success.
2642 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk or
2643 * gmmR0AllocateMoreChunks is necessary.
2644 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2645 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2646 * that is we're trying to allocate more than we've reserved.
2647 *
2648 * @param pGMM Pointer to the GMM instance data.
2649 * @param pGVM Pointer to the VM.
2650 * @param cPages The number of pages to allocate.
2651 * @param paPages Pointer to the page descriptors. See GMMPAGEDESC for
2652 * details on what is expected on input.
2653 * @param enmAccount The account to charge.
2654 *
2655 * @remarks Caller owns the giant GMM lock.
2656 */
2657static int gmmR0AllocatePagesNew(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
2658{
2659 Assert(pGMM->hMtxOwner == RTThreadNativeSelf());
2660
2661 /*
2662 * Check allocation limits.
2663 */
2664 if (RT_UNLIKELY(pGMM->cAllocatedPages + cPages > pGMM->cMaxPages))
2665 return VERR_GMM_HIT_GLOBAL_LIMIT;
2666
2667 switch (enmAccount)
2668 {
2669 case GMMACCOUNT_BASE:
2670 if (RT_UNLIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
2671 > pGVM->gmm.s.Stats.Reserved.cBasePages))
2672 {
2673 Log(("gmmR0AllocatePages:Base: Reserved=%#llx Allocated+Ballooned+Requested=%#llx+%#llx+%#x!\n",
2674 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages,
2675 pGVM->gmm.s.Stats.cBalloonedPages, cPages));
2676 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2677 }
2678 break;
2679 case GMMACCOUNT_SHADOW:
2680 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages + cPages > pGVM->gmm.s.Stats.Reserved.cShadowPages))
2681 {
2682 Log(("gmmR0AllocatePages:Shadow: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2683 pGVM->gmm.s.Stats.Reserved.cShadowPages, pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
2684 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2685 }
2686 break;
2687 case GMMACCOUNT_FIXED:
2688 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages + cPages > pGVM->gmm.s.Stats.Reserved.cFixedPages))
2689 {
2690 Log(("gmmR0AllocatePages:Fixed: Reserved=%#x Allocated+Requested=%#x+%#x!\n",
2691 pGVM->gmm.s.Stats.Reserved.cFixedPages, pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
2692 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
2693 }
2694 break;
2695 default:
2696 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2697 }
2698
2699#ifdef GMM_WITH_LEGACY_MODE
2700 /*
2701     * If we're in legacy memory mode, it's easy to figure out if we have
2702     * a sufficient number of pages up-front.
2703 */
2704 if ( pGMM->fLegacyAllocationMode
2705 && pGVM->gmm.s.Private.cFreePages < cPages)
2706 {
2707 Assert(pGMM->fBoundMemoryMode);
2708 return VERR_GMM_SEED_ME;
2709 }
2710#endif
2711
2712 /*
2713 * Update the accounts before we proceed because we might be leaving the
2714 * protection of the global mutex and thus run the risk of permitting
2715 * too much memory to be allocated.
2716 */
2717 switch (enmAccount)
2718 {
2719 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages += cPages; break;
2720 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages += cPages; break;
2721 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages += cPages; break;
2722 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2723 }
2724 pGVM->gmm.s.Stats.cPrivatePages += cPages;
2725 pGMM->cAllocatedPages += cPages;
2726
2727#ifdef GMM_WITH_LEGACY_MODE
2728 /*
2729 * Part two of it's-easy-in-legacy-memory-mode.
2730 */
2731 if (pGMM->fLegacyAllocationMode)
2732 {
2733 uint32_t iPage = gmmR0AllocatePagesInBoundMode(pGVM, 0, cPages, paPages);
2734 AssertReleaseReturn(iPage == cPages, VERR_GMM_ALLOC_PAGES_IPE);
2735 return VINF_SUCCESS;
2736 }
2737#endif
2738
2739 /*
2740 * Bound mode is also relatively straightforward.
2741 */
2742 uint32_t iPage = 0;
2743 int rc = VINF_SUCCESS;
2744 if (pGMM->fBoundMemoryMode)
2745 {
2746 iPage = gmmR0AllocatePagesInBoundMode(pGVM, iPage, cPages, paPages);
2747 if (iPage < cPages)
2748 do
2749 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGVM->gmm.s.Private, cPages, paPages, &iPage);
2750 while (iPage < cPages && RT_SUCCESS(rc));
2751 }
2752 /*
2753     * Shared mode is trickier as we should try to achieve the same locality as
2754 * in bound mode, but smartly make use of non-full chunks allocated by
2755 * other VMs if we're low on memory.
2756 */
2757 else
2758 {
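        /* The fallback order below is roughly:
             1. chunks already associated with this VM (best locality),
             2. when close to the reservation limit: used chunks on the same NUMA node,
             3. empty private chunks on the same NUMA node,
             4. empty shared chunks on the same NUMA node,
             5. when plenty of pages are free overall: same-node chunks, then any private chunk,
             6. otherwise new chunks, taking whatever is left anywhere if the host runs dry. */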
2759 /* Pick the most optimal pages first. */
2760 iPage = gmmR0AllocatePagesAssociatedWithVM(pGMM, pGVM, &pGMM->PrivateX, iPage, cPages, paPages);
2761 if (iPage < cPages)
2762 {
2763 /* Maybe we should try getting pages from chunks "belonging" to
2764 other VMs before allocating more chunks? */
2765 bool fTriedOnSameAlready = false;
2766 if (gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLimits(pGVM))
2767 {
2768 iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2769 fTriedOnSameAlready = true;
2770 }
2771
2772 /* Allocate memory from empty chunks. */
2773 if (iPage < cPages)
2774 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2775
2776 /* Grab empty shared chunks. */
2777 if (iPage < cPages)
2778 iPage = gmmR0AllocatePagesFromEmptyChunksOnSameNode(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2779
2780            /* If there are a lot of free pages spread around, try not to waste
2781 system memory on more chunks. (Should trigger defragmentation.) */
2782 if ( !fTriedOnSameAlready
2783 && gmmR0ShouldAllocatePagesInOtherChunksBecauseOfLotsFree(pGMM))
2784 {
2785 iPage = gmmR0AllocatePagesFromSameNode(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2786 if (iPage < cPages)
2787 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2788 }
2789
2790 /*
2791 * Ok, try allocate new chunks.
2792 */
2793 if (iPage < cPages)
2794 {
2795 do
2796 rc = gmmR0AllocateChunkNew(pGMM, pGVM, &pGMM->PrivateX, cPages, paPages, &iPage);
2797 while (iPage < cPages && RT_SUCCESS(rc));
2798
2799 /* If the host is out of memory, take whatever we can get. */
2800 if ( (rc == VERR_NO_MEMORY || rc == VERR_NO_PHYS_MEMORY)
2801 && pGMM->PrivateX.cFreePages + pGMM->Shared.cFreePages >= cPages - iPage)
2802 {
2803 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->PrivateX, pGVM, iPage, cPages, paPages);
2804 if (iPage < cPages)
2805 iPage = gmmR0AllocatePagesIndiscriminately(&pGMM->Shared, pGVM, iPage, cPages, paPages);
2806 AssertRelease(iPage == cPages);
2807 rc = VINF_SUCCESS;
2808 }
2809 }
2810 }
2811 }
2812
2813 /*
2814 * Clean up on failure. Since this is bound to be a low-memory condition
2815 * we will give back any empty chunks that might be hanging around.
2816 */
2817 if (RT_FAILURE(rc))
2818 {
2819 /* Update the statistics. */
2820 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
2821 pGMM->cAllocatedPages -= cPages - iPage;
2822 switch (enmAccount)
2823 {
2824 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages; break;
2825 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= cPages; break;
2826 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= cPages; break;
2827 default: AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
2828 }
2829
2830 /* Release the pages. */
2831 while (iPage-- > 0)
2832 {
2833 uint32_t idPage = paPages[iPage].idPage;
2834 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
2835 if (RT_LIKELY(pPage))
2836 {
2837 Assert(GMM_PAGE_IS_PRIVATE(pPage));
2838 Assert(pPage->Private.hGVM == pGVM->hSelf);
2839 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
2840 }
2841 else
2842 AssertMsgFailed(("idPage=%#x\n", idPage));
2843
2844 paPages[iPage].idPage = NIL_GMM_PAGEID;
2845 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
2846 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2847 }
2848
2849 /* Free empty chunks. */
2850 /** @todo */
2851
2852 /* return the fail status on failure */
2853 return rc;
2854 }
2855 return VINF_SUCCESS;
2856}
2857
2858
2859/**
2860 * Updates the previous allocations and allocates more pages.
2861 *
2862 * The handy pages are always taken from the 'base' memory account.
2863 * The allocated pages are not cleared and will contain random garbage.
2864 *
2865 * @returns VBox status code:
2866 * @retval VINF_SUCCESS on success.
2867 * @retval VERR_NOT_OWNER if the caller is not an EMT.
2868 * @retval VERR_GMM_PAGE_NOT_FOUND if one of the pages to update wasn't found.
2869 * @retval VERR_GMM_PAGE_NOT_PRIVATE if one of the pages to update wasn't a
2870 * private page.
2871 * @retval VERR_GMM_PAGE_NOT_SHARED if one of the pages to update wasn't a
2872 * shared page.
2873 * @retval VERR_GMM_NOT_PAGE_OWNER if one of the pages to be updated wasn't
2874 * owned by the VM.
2875 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
2876 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
2877 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
2878 * that is we're trying to allocate more than we've reserved.
2879 *
2880 * @param pGVM The global (ring-0) VM structure.
2881 * @param idCpu The VCPU id.
2882 * @param cPagesToUpdate The number of pages to update (starting from the head).
2883 * @param cPagesToAlloc The number of pages to allocate (starting from the head).
2884 * @param paPages The array of page descriptors.
2885 * See GMMPAGEDESC for details on what is expected on input.
2886 * @thread EMT(idCpu)
2887 */
2888GMMR0DECL(int) GMMR0AllocateHandyPages(PGVM pGVM, VMCPUID idCpu, uint32_t cPagesToUpdate,
2889 uint32_t cPagesToAlloc, PGMMPAGEDESC paPages)
2890{
2891 LogFlow(("GMMR0AllocateHandyPages: pGVM=%p cPagesToUpdate=%#x cPagesToAlloc=%#x paPages=%p\n",
2892 pGVM, cPagesToUpdate, cPagesToAlloc, paPages));
2893
2894 /*
2895 * Validate, get basics and take the semaphore.
2896 * (This is a relatively busy path, so make predictions where possible.)
2897 */
2898 PGMM pGMM;
2899 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
2900 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
2901 if (RT_FAILURE(rc))
2902 return rc;
2903
2904 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
2905 AssertMsgReturn( (cPagesToUpdate && cPagesToUpdate < 1024)
2906 || (cPagesToAlloc && cPagesToAlloc < 1024),
2907 ("cPagesToUpdate=%#x cPagesToAlloc=%#x\n", cPagesToUpdate, cPagesToAlloc),
2908 VERR_INVALID_PARAMETER);
2909
2910 unsigned iPage = 0;
2911 for (; iPage < cPagesToUpdate; iPage++)
2912 {
2913 AssertMsgReturn( ( paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
2914 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK))
2915 || paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
2916 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE,
2917 ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys),
2918 VERR_INVALID_PARAMETER);
2919 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
2920 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
2921 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2922        AssertMsgReturn(    paPages[iPage].idSharedPage <= GMM_PAGEID_LAST
2923 /*|| paPages[iPage].idSharedPage == NIL_GMM_PAGEID*/,
2924 ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2925 }
2926
2927 for (; iPage < cPagesToAlloc; iPage++)
2928 {
2929 AssertMsgReturn(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS, ("#%#x: %RHp\n", iPage, paPages[iPage].HCPhysGCPhys), VERR_INVALID_PARAMETER);
2930 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
2931 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
2932 }
2933
2934 gmmR0MutexAcquire(pGMM);
2935 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
2936 {
2937 /* No allocations before the initial reservation has been made! */
2938 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
2939 && pGVM->gmm.s.Stats.Reserved.cFixedPages
2940 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
2941 {
2942 /*
2943 * Perform the updates.
2944 * Stop on the first error.
2945 */
2946 for (iPage = 0; iPage < cPagesToUpdate; iPage++)
2947 {
2948 if (paPages[iPage].idPage != NIL_GMM_PAGEID)
2949 {
2950 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idPage);
2951 if (RT_LIKELY(pPage))
2952 {
2953 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
2954 {
2955 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
2956 {
2957 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2958 if (RT_LIKELY(paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST))
2959 pPage->Private.pfn = paPages[iPage].HCPhysGCPhys >> PAGE_SHIFT;
2960 else if (paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE)
2961 pPage->Private.pfn = GMM_PAGE_PFN_UNSHAREABLE;
2962 /* else: NIL_RTHCPHYS nothing */
2963
2964 paPages[iPage].idPage = NIL_GMM_PAGEID;
2965 paPages[iPage].HCPhysGCPhys = NIL_RTHCPHYS;
2966 }
2967 else
2968 {
2969 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not owner! hGVM=%#x hSelf=%#x\n",
2970 iPage, paPages[iPage].idPage, pPage->Private.hGVM, pGVM->hSelf));
2971 rc = VERR_GMM_NOT_PAGE_OWNER;
2972 break;
2973 }
2974 }
2975 else
2976 {
2977 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not private! %.*Rhxs (type %d)\n", iPage, paPages[iPage].idPage, sizeof(*pPage), pPage, pPage->Common.u2State));
2978 rc = VERR_GMM_PAGE_NOT_PRIVATE;
2979 break;
2980 }
2981 }
2982 else
2983 {
2984 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (private)\n", iPage, paPages[iPage].idPage));
2985 rc = VERR_GMM_PAGE_NOT_FOUND;
2986 break;
2987 }
2988 }
2989
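                    /* This guest page was backed by a shared page that is now being
                       replaced; drop this VM's reference and free the shared page
                       once the last reference is gone. */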
2990 if (paPages[iPage].idSharedPage != NIL_GMM_PAGEID)
2991 {
2992 PGMMPAGE pPage = gmmR0GetPage(pGMM, paPages[iPage].idSharedPage);
2993 if (RT_LIKELY(pPage))
2994 {
2995 if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
2996 {
2997 AssertCompile(NIL_RTHCPHYS > GMM_GCPHYS_LAST && GMM_GCPHYS_UNSHAREABLE > GMM_GCPHYS_LAST);
2998 Assert(pPage->Shared.cRefs);
2999 Assert(pGVM->gmm.s.Stats.cSharedPages);
3000 Assert(pGVM->gmm.s.Stats.Allocated.cBasePages);
3001
3002 Log(("GMMR0AllocateHandyPages: free shared page %x cRefs=%d\n", paPages[iPage].idSharedPage, pPage->Shared.cRefs));
3003 pGVM->gmm.s.Stats.cSharedPages--;
3004 pGVM->gmm.s.Stats.Allocated.cBasePages--;
3005 if (!--pPage->Shared.cRefs)
3006 gmmR0FreeSharedPage(pGMM, pGVM, paPages[iPage].idSharedPage, pPage);
3007 else
3008 {
3009 Assert(pGMM->cDuplicatePages);
3010 pGMM->cDuplicatePages--;
3011 }
3012
3013 paPages[iPage].idSharedPage = NIL_GMM_PAGEID;
3014 }
3015 else
3016 {
3017 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not shared!\n", iPage, paPages[iPage].idSharedPage));
3018 rc = VERR_GMM_PAGE_NOT_SHARED;
3019 break;
3020 }
3021 }
3022 else
3023 {
3024 Log(("GMMR0AllocateHandyPages: #%#x/%#x: Not found! (shared)\n", iPage, paPages[iPage].idSharedPage));
3025 rc = VERR_GMM_PAGE_NOT_FOUND;
3026 break;
3027 }
3028 }
3029 } /* for each page to update */
3030
3031 if (RT_SUCCESS(rc) && cPagesToAlloc > 0)
3032 {
3033#if defined(VBOX_STRICT) && 0 /** @todo re-test this later. Appeared to be a PGM init bug. */
3034 for (iPage = 0; iPage < cPagesToAlloc; iPage++)
3035 {
3036 Assert(paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS);
3037 Assert(paPages[iPage].idPage == NIL_GMM_PAGEID);
3038 Assert(paPages[iPage].idSharedPage == NIL_GMM_PAGEID);
3039 }
3040#endif
3041
3042 /*
3043 * Join paths with GMMR0AllocatePages for the allocation.
3044 * Note! gmmR0AllocateMoreChunks may leave the protection of the mutex!
3045 */
3046 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPagesToAlloc, paPages, GMMACCOUNT_BASE);
3047 }
3048 }
3049 else
3050 rc = VERR_WRONG_ORDER;
3051 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3052 }
3053 else
3054 rc = VERR_GMM_IS_NOT_SANE;
3055 gmmR0MutexRelease(pGMM);
3056 LogFlow(("GMMR0AllocateHandyPages: returns %Rrc\n", rc));
3057 return rc;
3058}
3059
3060
3061/**
3062 * Allocate one or more pages.
3063 *
3064 * This is typically used for ROMs and MMIO2 (VRAM) during VM creation.
3065 * The allocated pages are not cleared and will contain random garbage.
3066 *
3067 * @returns VBox status code:
3068 * @retval VINF_SUCCESS on success.
3069 * @retval VERR_NOT_OWNER if the caller is not an EMT.
3070 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
3071 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
3072 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
3073 * that is we're trying to allocate more than we've reserved.
3074 *
3075 * @param pGVM The global (ring-0) VM structure.
3076 * @param idCpu The VCPU id.
3077 * @param cPages The number of pages to allocate.
3078 * @param paPages Pointer to the page descriptors.
3079 * See GMMPAGEDESC for details on what is expected on
3080 * input.
3081 * @param enmAccount The account to charge.
3082 *
3083 * @thread EMT.
3084 */
3085GMMR0DECL(int) GMMR0AllocatePages(PGVM pGVM, VMCPUID idCpu, uint32_t cPages, PGMMPAGEDESC paPages, GMMACCOUNT enmAccount)
3086{
3087 LogFlow(("GMMR0AllocatePages: pGVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pGVM, cPages, paPages, enmAccount));
3088
3089 /*
3090 * Validate, get basics and take the semaphore.
3091 */
3092 PGMM pGMM;
3093 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3094 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3095 if (RT_FAILURE(rc))
3096 return rc;
3097
3098 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3099 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3100 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3101
3102 for (unsigned iPage = 0; iPage < cPages; iPage++)
3103 {
3104 AssertMsgReturn( paPages[iPage].HCPhysGCPhys == NIL_RTHCPHYS
3105 || paPages[iPage].HCPhysGCPhys == GMM_GCPHYS_UNSHAREABLE
3106 || ( enmAccount == GMMACCOUNT_BASE
3107 && paPages[iPage].HCPhysGCPhys <= GMM_GCPHYS_LAST
3108 && !(paPages[iPage].HCPhysGCPhys & PAGE_OFFSET_MASK)),
3109 ("#%#x: %RHp enmAccount=%d\n", iPage, paPages[iPage].HCPhysGCPhys, enmAccount),
3110 VERR_INVALID_PARAMETER);
3111 AssertMsgReturn(paPages[iPage].idPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3112 AssertMsgReturn(paPages[iPage].idSharedPage == NIL_GMM_PAGEID, ("#%#x: %#x\n", iPage, paPages[iPage].idSharedPage), VERR_INVALID_PARAMETER);
3113 }
3114
3115 gmmR0MutexAcquire(pGMM);
3116 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3117 {
3118
3119 /* No allocations before the initial reservation has been made! */
3120 if (RT_LIKELY( pGVM->gmm.s.Stats.Reserved.cBasePages
3121 && pGVM->gmm.s.Stats.Reserved.cFixedPages
3122 && pGVM->gmm.s.Stats.Reserved.cShadowPages))
3123 rc = gmmR0AllocatePagesNew(pGMM, pGVM, cPages, paPages, enmAccount);
3124 else
3125 rc = VERR_WRONG_ORDER;
3126 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3127 }
3128 else
3129 rc = VERR_GMM_IS_NOT_SANE;
3130 gmmR0MutexRelease(pGMM);
3131 LogFlow(("GMMR0AllocatePages: returns %Rrc\n", rc));
3132 return rc;
3133}
3134
3135
3136/**
3137 * VMMR0 request wrapper for GMMR0AllocatePages.
3138 *
3139 * @returns see GMMR0AllocatePages.
3140 * @param pGVM The global (ring-0) VM structure.
3141 * @param idCpu The VCPU id.
3142 * @param pReq Pointer to the request packet.
3143 */
3144GMMR0DECL(int) GMMR0AllocatePagesReq(PGVM pGVM, VMCPUID idCpu, PGMMALLOCATEPAGESREQ pReq)
3145{
3146 /*
3147 * Validate input and pass it on.
3148 */
3149 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3150 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0]),
3151 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMALLOCATEPAGESREQ, aPages[0])),
3152 VERR_INVALID_PARAMETER);
3153 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMALLOCATEPAGESREQ, aPages[pReq->cPages]),
3154 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF_DYN(GMMALLOCATEPAGESREQ, aPages[pReq->cPages])),
3155 VERR_INVALID_PARAMETER);
3156
3157 return GMMR0AllocatePages(pGVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3158}
3159
3160
3161/**
3162 * Allocate a large page to represent guest RAM.
3163 *
3164 * The allocated pages are not cleared and will contain random garbage.
3165 *
3166 * @returns VBox status code:
3167 * @retval VINF_SUCCESS on success.
3168 * @retval VERR_NOT_OWNER if the caller is not an EMT.
3169 * @retval VERR_GMM_SEED_ME if seeding via GMMR0SeedChunk is necessary.
3170 * @retval VERR_GMM_HIT_GLOBAL_LIMIT if we've exhausted the available pages.
3171 * @retval VERR_GMM_HIT_VM_ACCOUNT_LIMIT if we've hit the VM account limit,
3172 * that is we're trying to allocate more than we've reserved.
3173 * @returns see GMMR0AllocatePages.
3174 *
3175 * @param pGVM The global (ring-0) VM structure.
3176 * @param idCpu The VCPU id.
3177 * @param cbPage Large page size.
3178 * @param pIdPage Where to return the GMM page ID of the page.
3179 * @param pHCPhys Where to return the host physical address of the page.
3180 */
3181GMMR0DECL(int) GMMR0AllocateLargePage(PGVM pGVM, VMCPUID idCpu, uint32_t cbPage, uint32_t *pIdPage, RTHCPHYS *pHCPhys)
3182{
3183 LogFlow(("GMMR0AllocateLargePage: pGVM=%p cbPage=%x\n", pGVM, cbPage));
3184
3185 AssertReturn(cbPage == GMM_CHUNK_SIZE, VERR_INVALID_PARAMETER);
3186 AssertPtrReturn(pIdPage, VERR_INVALID_PARAMETER);
3187 AssertPtrReturn(pHCPhys, VERR_INVALID_PARAMETER);
3188
3189 /*
3190 * Validate, get basics and take the semaphore.
3191 */
3192 PGMM pGMM;
3193 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3194 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3195 if (RT_FAILURE(rc))
3196 return rc;
3197
3198#ifdef GMM_WITH_LEGACY_MODE
3199 // /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
3200 // if (pGMM->fLegacyAllocationMode)
3201 // return VERR_NOT_SUPPORTED;
3202#endif
3203
3204 *pHCPhys = NIL_RTHCPHYS;
3205 *pIdPage = NIL_GMM_PAGEID;
3206
3207 gmmR0MutexAcquire(pGMM);
3208 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3209 {
3210 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3211 if (RT_UNLIKELY( pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cPages
3212 > pGVM->gmm.s.Stats.Reserved.cBasePages))
3213 {
3214 Log(("GMMR0AllocateLargePage: Reserved=%#llx Allocated+Requested=%#llx+%#x!\n",
3215 pGVM->gmm.s.Stats.Reserved.cBasePages, pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3216 gmmR0MutexRelease(pGMM);
3217 return VERR_GMM_HIT_VM_ACCOUNT_LIMIT;
3218 }
3219
3220 /*
3221 * Allocate a new large page chunk.
3222 *
3223 * Note! We leave the giant GMM lock temporarily as the allocation might
3224 * take a long time. gmmR0RegisterChunk will retake it (ugly).
3225 */
3226 AssertCompile(GMM_CHUNK_SIZE == _2M);
3227 gmmR0MutexRelease(pGMM);
3228
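        /* Ask for a physically contiguous GMM_CHUNK_SIZE (2 MB) block aligned on a
           2 MB boundary so it can be used as a single large page; NIL_RTHCPHYS puts
           no upper limit on the physical address. */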
3229 RTR0MEMOBJ hMemObj;
3230 rc = RTR0MemObjAllocPhysEx(&hMemObj, GMM_CHUNK_SIZE, NIL_RTHCPHYS, GMM_CHUNK_SIZE);
3231 if (RT_SUCCESS(rc))
3232 {
3233 PGMMCHUNKFREESET pSet = pGMM->fBoundMemoryMode ? &pGVM->gmm.s.Private : &pGMM->PrivateX;
3234 PGMMCHUNK pChunk;
3235 rc = gmmR0RegisterChunk(pGMM, pSet, hMemObj, pGVM->hSelf, GMM_CHUNK_FLAGS_LARGE_PAGE, &pChunk);
3236 if (RT_SUCCESS(rc))
3237 {
3238 /*
3239 * Allocate all the pages in the chunk.
3240 */
3241 /* Unlink the new chunk from the free list. */
3242 gmmR0UnlinkChunk(pChunk);
3243
3244 /** @todo rewrite this to skip the looping. */
3245 /* Allocate all pages. */
3246 GMMPAGEDESC PageDesc;
3247 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3248
3249 /* Return the first page as we'll use the whole chunk as one big page. */
3250 *pIdPage = PageDesc.idPage;
3251 *pHCPhys = PageDesc.HCPhysGCPhys;
3252
3253 for (unsigned i = 1; i < cPages; i++)
3254 gmmR0AllocatePage(pChunk, pGVM->hSelf, &PageDesc);
3255
3256 /* Update accounting. */
3257 pGVM->gmm.s.Stats.Allocated.cBasePages += cPages;
3258 pGVM->gmm.s.Stats.cPrivatePages += cPages;
3259 pGMM->cAllocatedPages += cPages;
3260
3261 gmmR0LinkChunk(pChunk, pSet);
3262 gmmR0MutexRelease(pGMM);
3263 LogFlow(("GMMR0AllocateLargePage: returns VINF_SUCCESS\n"));
3264 return VINF_SUCCESS;
3265 }
3266 RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
3267 }
3268 }
3269 else
3270 {
3271 gmmR0MutexRelease(pGMM);
3272 rc = VERR_GMM_IS_NOT_SANE;
3273 }
3274
3275 LogFlow(("GMMR0AllocateLargePage: returns %Rrc\n", rc));
3276 return rc;
3277}
3278
3279
3280/**
3281 * Free a large page.
3282 *
3283 * @returns VBox status code:
3284 * @param pGVM The global (ring-0) VM structure.
3285 * @param idCpu The VCPU id.
3286 * @param idPage The large page id.
3287 */
3288GMMR0DECL(int) GMMR0FreeLargePage(PGVM pGVM, VMCPUID idCpu, uint32_t idPage)
3289{
3290 LogFlow(("GMMR0FreeLargePage: pGVM=%p idPage=%x\n", pGVM, idPage));
3291
3292 /*
3293 * Validate, get basics and take the semaphore.
3294 */
3295 PGMM pGMM;
3296 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3297 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3298 if (RT_FAILURE(rc))
3299 return rc;
3300
3301#ifdef GMM_WITH_LEGACY_MODE
3302 // /* Not supported in legacy mode where we allocate the memory in ring 3 and lock it in ring 0. */
3303 // if (pGMM->fLegacyAllocationMode)
3304 // return VERR_NOT_SUPPORTED;
3305#endif
3306
3307 gmmR0MutexAcquire(pGMM);
3308 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3309 {
3310 const unsigned cPages = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
3311
3312 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3313 {
3314 Log(("GMMR0FreeLargePage: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3315 gmmR0MutexRelease(pGMM);
3316 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3317 }
3318
3319 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3320 if (RT_LIKELY( pPage
3321 && GMM_PAGE_IS_PRIVATE(pPage)))
3322 {
3323 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3324 Assert(pChunk);
3325 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3326 Assert(pChunk->cPrivate > 0);
3327
3328 /* Release the memory immediately. */
3329 gmmR0FreeChunk(pGMM, NULL, pChunk, false /*fRelaxedSem*/); /** @todo this can be relaxed too! */
3330
3331 /* Update accounting. */
3332 pGVM->gmm.s.Stats.Allocated.cBasePages -= cPages;
3333 pGVM->gmm.s.Stats.cPrivatePages -= cPages;
3334 pGMM->cAllocatedPages -= cPages;
3335 }
3336 else
3337 rc = VERR_GMM_PAGE_NOT_FOUND;
3338 }
3339 else
3340 rc = VERR_GMM_IS_NOT_SANE;
3341
3342 gmmR0MutexRelease(pGMM);
3343 LogFlow(("GMMR0FreeLargePage: returns %Rrc\n", rc));
3344 return rc;
3345}
3346
3347
3348/**
3349 * VMMR0 request wrapper for GMMR0FreeLargePage.
3350 *
3351 * @returns see GMMR0FreeLargePage.
3352 * @param pGVM The global (ring-0) VM structure.
3353 * @param idCpu The VCPU id.
3354 * @param pReq Pointer to the request packet.
3355 */
3356GMMR0DECL(int) GMMR0FreeLargePageReq(PGVM pGVM, VMCPUID idCpu, PGMMFREELARGEPAGEREQ pReq)
3357{
3358 /*
3359 * Validate input and pass it on.
3360 */
3361 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3362 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMFREELARGEPAGEREQ),
3363 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMFREELARGEPAGEREQ)),
3364 VERR_INVALID_PARAMETER);
3365
3366 return GMMR0FreeLargePage(pGVM, idCpu, pReq->idPage);
3367}
3368
3369
3370/**
3371 * @callback_method_impl{FNGVMMR0ENUMCALLBACK,
3372 * Used by gmmR0FreeChunkFlushPerVmTlbs().}
3373 */
3374static DECLCALLBACK(int) gmmR0InvalidatePerVmChunkTlbCallback(PGVM pGVM, void *pvUser)
3375{
3376 RT_NOREF(pvUser);
3377 if (pGVM->gmm.s.hChunkTlbSpinLock != NIL_RTSPINLOCK)
3378 {
3379 RTSpinlockAcquire(pGVM->gmm.s.hChunkTlbSpinLock);
3380 uintptr_t i = RT_ELEMENTS(pGVM->gmm.s.aChunkTlbEntries);
3381 while (i-- > 0)
3382 {
3383 pGVM->gmm.s.aChunkTlbEntries[i].idGeneration = UINT64_MAX;
3384 pGVM->gmm.s.aChunkTlbEntries[i].pChunk = NULL;
3385 }
3386 RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
3387 }
3388 return VINF_SUCCESS;
3389}
3390
3391
3392/**
3393 * Called by gmmR0FreeChunk when we reach the threshold for wrapping around the
3394 * free generation ID value.
3395 *
3396 * This is done at 2^62 - 1, which allows us to drop all locks: it will
3397 * take roughly 4.6 exa (4 611 686 018 427 387 903) calls to
3398 * gmmR0FreeChunk before a real wrap-around can occur. We do two
3399 * invalidation passes and reset the generation ID between them. This will
3400 * make sure there are no false positives.
3401 *
3402 * @param pGMM Pointer to the GMM instance.
3403 */
3404static void gmmR0FreeChunkFlushPerVmTlbs(PGMM pGMM)
3405{
3406 /*
3407 * First invalidation pass.
3408 */
3409 int rc = GVMMR0EnumVMs(gmmR0InvalidatePerVmChunkTlbCallback, NULL);
3410 AssertRCSuccess(rc);
3411
3412 /*
3413 * Reset the generation number.
3414 */
3415 RTSpinlockAcquire(pGMM->hSpinLockTree);
3416 ASMAtomicWriteU64(&pGMM->idFreeGeneration, 1);
3417 RTSpinlockRelease(pGMM->hSpinLockTree);
3418
3419 /*
3420 * Second invalidation pass.
3421 */
3422 rc = GVMMR0EnumVMs(gmmR0InvalidatePerVmChunkTlbCallback, NULL);
3423 AssertRCSuccess(rc);
3424}
3425
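/*
 * Illustrative sketch of the per-VM chunk TLB validity test these passes
 * protect: a cached entry is only trusted when its generation matches the
 * current free generation, e.g. the lookup in GMMR0PageIdToVirt does roughly:
 *
 * @code
 *  if (   pTlbe->pChunk != NULL
 *      && pTlbe->idGeneration == ASMAtomicUoReadU64(&pGMM->idFreeGeneration)
 *      && pTlbe->pChunk->Core.Key == idChunk)
 *      // TLB hit, otherwise fall back to the chunk tree
 * @endcode
 *
 * The first pass wipes all cached entries, the reset then restarts the
 * counter at 1, and the second pass catches any entry that was refilled with
 * a pre-reset (very large) generation value in the meantime, so no stale
 * entry can compare equal again.
 */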
3426
3427/**
3428 * Frees a chunk, giving it back to the host OS.
3429 *
3430 * @param pGMM Pointer to the GMM instance.
3431 * @param pGVM This is set when called from GMMR0CleanupVM so we can
3432 * unmap and free the chunk in one go.
3433 * @param pChunk The chunk to free.
3434 * @param fRelaxedSem Whether we can release the semaphore while doing the
3435 * freeing (@c true) or not.
3436 */
3437static bool gmmR0FreeChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
3438{
3439 Assert(pChunk->Core.Key != NIL_GMM_CHUNKID);
3440
3441 GMMR0CHUNKMTXSTATE MtxState;
3442 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
3443
3444 /*
3445 * Cleanup hack! Unmap the chunk from the caller's address space.
3446 * This shouldn't happen, so screw lock contention...
3447 */
3448 if ( pChunk->cMappingsX
3449#ifdef GMM_WITH_LEGACY_MODE
3450 && (!pGMM->fLegacyAllocationMode || (pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
3451#endif
3452 && pGVM)
3453 gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
3454
3455 /*
3456 * If there are current mappings of the chunk, then request the
3457 * VMs to unmap them. Reposition the chunk in the free list so
3458 * it won't be a likely candidate for allocations.
3459 */
3460 if (pChunk->cMappingsX)
3461 {
3462 /** @todo R0 -> VM request */
3463 /* The chunk can be mapped by more than one VM if fBoundMemoryMode is false! */
3464 Log(("gmmR0FreeChunk: chunk still has %d mappings; don't free!\n", pChunk->cMappingsX));
3465 gmmR0ChunkMutexRelease(&MtxState, pChunk);
3466 return false;
3467 }
3468
3469
3470 /*
3471 * Save and trash the handle.
3472 */
3473 RTR0MEMOBJ const hMemObj = pChunk->hMemObj;
3474 pChunk->hMemObj = NIL_RTR0MEMOBJ;
3475
3476 /*
3477 * Unlink it from everywhere.
3478 */
3479 gmmR0UnlinkChunk(pChunk);
3480
3481 RTSpinlockAcquire(pGMM->hSpinLockTree);
3482
3483 RTListNodeRemove(&pChunk->ListNode);
3484
3485 PAVLU32NODECORE pCore = RTAvlU32Remove(&pGMM->pChunks, pChunk->Core.Key);
3486 Assert(pCore == &pChunk->Core); NOREF(pCore);
3487
3488 PGMMCHUNKTLBE pTlbe = &pGMM->ChunkTLB.aEntries[GMM_CHUNKTLB_IDX(pChunk->Core.Key)];
3489 if (pTlbe->pChunk == pChunk)
3490 {
3491 pTlbe->idChunk = NIL_GMM_CHUNKID;
3492 pTlbe->pChunk = NULL;
3493 }
3494
3495 Assert(pGMM->cChunks > 0);
3496 pGMM->cChunks--;
3497
3498 uint64_t const idFreeGeneration = ASMAtomicIncU64(&pGMM->idFreeGeneration);
3499
3500 RTSpinlockRelease(pGMM->hSpinLockTree);
3501
3502 /*
3503 * Free the Chunk ID before dropping the locks and freeing the rest.
3504 */
3505 gmmR0FreeChunkId(pGMM, pChunk->Core.Key);
3506 pChunk->Core.Key = NIL_GMM_CHUNKID;
3507
3508 pGMM->cFreedChunks++;
3509
3510 gmmR0ChunkMutexRelease(&MtxState, NULL);
3511 if (fRelaxedSem)
3512 gmmR0MutexRelease(pGMM);
3513
3514 if (idFreeGeneration == UINT64_MAX / 4)
3515 gmmR0FreeChunkFlushPerVmTlbs(pGMM);
3516
3517 RTMemFree(pChunk->paMappingsX);
3518 pChunk->paMappingsX = NULL;
3519
3520 RTMemFree(pChunk);
3521
3522#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
3523 int rc = RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
3524#else
3525 int rc = RTR0MemObjFree(hMemObj, false /* fFreeMappings */);
3526#endif
3527 AssertLogRelRC(rc);
3528
3529 if (fRelaxedSem)
3530 gmmR0MutexAcquire(pGMM);
3531 return fRelaxedSem;
3532}
3533
3534
3535/**
3536 * Free page worker.
3537 *
3538 * The caller does all the statistic decrementing; we do all the incrementing.
3539 *
3540 * @param pGMM Pointer to the GMM instance data.
3541 * @param pGVM Pointer to the GVM instance.
3542 * @param pChunk Pointer to the chunk this page belongs to.
3543 * @param idPage The Page ID.
3544 * @param pPage Pointer to the page.
3545 */
3546static void gmmR0FreePageWorker(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint32_t idPage, PGMMPAGE pPage)
3547{
3548 Log3(("F pPage=%p iPage=%#x/%#x u2State=%d iFreeHead=%#x\n",
3549 pPage, pPage - &pChunk->aPages[0], idPage, pPage->Common.u2State, pChunk->iFreeHead)); NOREF(idPage);
3550
3551 /*
3552 * Put the page on the free list.
3553 */
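 /* (LIFO push sketch: the freed page records the previous free-list head in
     Free.iNext and then becomes the new head itself.) */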
3554 pPage->u = 0;
3555 pPage->Free.u2State = GMM_PAGE_STATE_FREE;
3556 Assert(pChunk->iFreeHead < RT_ELEMENTS(pChunk->aPages) || pChunk->iFreeHead == UINT16_MAX);
3557 pPage->Free.iNext = pChunk->iFreeHead;
3558 pChunk->iFreeHead = pPage - &pChunk->aPages[0];
3559
3560 /*
3561 * Update statistics (the cShared/cPrivate stats are up to date already),
3562 * and relink the chunk if necessary.
3563 */
3564 unsigned const cFree = pChunk->cFree;
3565 if ( !cFree
3566 || gmmR0SelectFreeSetList(cFree) != gmmR0SelectFreeSetList(cFree + 1))
3567 {
3568 gmmR0UnlinkChunk(pChunk);
3569 pChunk->cFree++;
3570 gmmR0SelectSetAndLinkChunk(pGMM, pGVM, pChunk);
3571 }
3572 else
3573 {
3574 pChunk->cFree = cFree + 1;
3575 pChunk->pSet->cFreePages++;
3576 }
3577
3578 /*
3579 * If the chunk becomes empty, consider giving memory back to the host OS.
3580 *
3581 * The current strategy is to try to give it back if there are other chunks
3582 * in this free list, meaning if there are at least 240 free pages in this
3583 * category. Note that since there are probably mappings of the chunk,
3584 * it won't be freed up instantly, which probably screws up this logic
3585 * a bit...
3586 */
3587 /** @todo Do this on the way out. */
3588 if (RT_LIKELY( pChunk->cFree != GMM_CHUNK_NUM_PAGES
3589 || pChunk->pFreeNext == NULL
3590 || pChunk->pFreePrev == NULL /** @todo this is probably misfiring, see reset... */))
3591 { /* likely */ }
3592#ifdef GMM_WITH_LEGACY_MODE
3593 else if (RT_LIKELY(pGMM->fLegacyAllocationMode && !(pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE)))
3594 { /* likely */ }
3595#endif
3596 else
3597 gmmR0FreeChunk(pGMM, NULL, pChunk, false);
3598
3599}
3600
3601
3602/**
3603 * Frees a shared page, the page is known to exist and be valid and such.
3604 *
3605 * @param pGMM Pointer to the GMM instance.
3606 * @param pGVM Pointer to the GVM instance.
3607 * @param idPage The page id.
3608 * @param pPage The page structure.
3609 */
3610DECLINLINE(void) gmmR0FreeSharedPage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3611{
3612 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3613 Assert(pChunk);
3614 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3615 Assert(pChunk->cShared > 0);
3616 Assert(pGMM->cSharedPages > 0);
3617 Assert(pGMM->cAllocatedPages > 0);
3618 Assert(!pPage->Shared.cRefs);
3619
3620 pChunk->cShared--;
3621 pGMM->cAllocatedPages--;
3622 pGMM->cSharedPages--;
3623 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3624}
3625
3626
3627/**
3628 * Frees a private page, the page is known to exist and be valid and such.
3629 *
3630 * @param pGMM Pointer to the GMM instance.
3631 * @param pGVM Pointer to the GVM instance.
3632 * @param idPage The page id.
3633 * @param pPage The page structure.
3634 */
3635DECLINLINE(void) gmmR0FreePrivatePage(PGMM pGMM, PGVM pGVM, uint32_t idPage, PGMMPAGE pPage)
3636{
3637 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
3638 Assert(pChunk);
3639 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
3640 Assert(pChunk->cPrivate > 0);
3641 Assert(pGMM->cAllocatedPages > 0);
3642
3643 pChunk->cPrivate--;
3644 pGMM->cAllocatedPages--;
3645 gmmR0FreePageWorker(pGMM, pGVM, pChunk, idPage, pPage);
3646}
3647
3648
3649/**
3650 * Common worker for GMMR0FreePages and GMMR0BalloonedPages.
3651 *
3652 * @returns VBox status code:
3653 * @retval xxx
3654 *
3655 * @param pGMM Pointer to the GMM instance data.
3656 * @param pGVM Pointer to the VM.
3657 * @param cPages The number of pages to free.
3658 * @param paPages Pointer to the page descriptors.
3659 * @param enmAccount The account this relates to.
3660 */
3661static int gmmR0FreePages(PGMM pGMM, PGVM pGVM, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3662{
3663 /*
3664 * Check that the request isn't impossible wrt to the account status.
3665 */
3666 switch (enmAccount)
3667 {
3668 case GMMACCOUNT_BASE:
3669 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages < cPages))
3670 {
3671 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cBasePages, cPages));
3672 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3673 }
3674 break;
3675 case GMMACCOUNT_SHADOW:
3676 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cShadowPages < cPages))
3677 {
3678 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cShadowPages, cPages));
3679 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3680 }
3681 break;
3682 case GMMACCOUNT_FIXED:
3683 if (RT_UNLIKELY(pGVM->gmm.s.Stats.Allocated.cFixedPages < cPages))
3684 {
3685 Log(("gmmR0FreePages: allocated=%#llx cPages=%#x!\n", pGVM->gmm.s.Stats.Allocated.cFixedPages, cPages));
3686 return VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3687 }
3688 break;
3689 default:
3690 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3691 }
3692
3693 /*
3694 * Walk the descriptors and free the pages.
3695 *
3696 * Statistics (except the account) are being updated as we go along,
3697 * unlike the alloc code. Also, stop on the first error.
3698 */
3699 int rc = VINF_SUCCESS;
3700 uint32_t iPage;
3701 for (iPage = 0; iPage < cPages; iPage++)
3702 {
3703 uint32_t idPage = paPages[iPage].idPage;
3704 PGMMPAGE pPage = gmmR0GetPage(pGMM, idPage);
3705 if (RT_LIKELY(pPage))
3706 {
3707 if (RT_LIKELY(GMM_PAGE_IS_PRIVATE(pPage)))
3708 {
3709 if (RT_LIKELY(pPage->Private.hGVM == pGVM->hSelf))
3710 {
3711 Assert(pGVM->gmm.s.Stats.cPrivatePages);
3712 pGVM->gmm.s.Stats.cPrivatePages--;
3713 gmmR0FreePrivatePage(pGMM, pGVM, idPage, pPage);
3714 }
3715 else
3716 {
3717 Log(("gmmR0FreePages: #%#x/%#x: not owner! hGVM=%#x hSelf=%#x\n", iPage, idPage,
3718 pPage->Private.hGVM, pGVM->hSelf));
3719 rc = VERR_GMM_NOT_PAGE_OWNER;
3720 break;
3721 }
3722 }
3723 else if (RT_LIKELY(GMM_PAGE_IS_SHARED(pPage)))
3724 {
3725 Assert(pGVM->gmm.s.Stats.cSharedPages);
3726 Assert(pPage->Shared.cRefs);
3727#if defined(VBOX_WITH_PAGE_SHARING) && defined(VBOX_STRICT) && HC_ARCH_BITS == 64
3728 if (pPage->Shared.u14Checksum)
3729 {
3730 uint32_t uChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
3731 uChecksum &= UINT32_C(0x00003fff);
3732 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum,
3733 ("%#x vs %#x - idPage=%#x\n", uChecksum, pPage->Shared.u14Checksum, idPage));
3734 }
3735#endif
3736 pGVM->gmm.s.Stats.cSharedPages--;
3737 if (!--pPage->Shared.cRefs)
3738 gmmR0FreeSharedPage(pGMM, pGVM, idPage, pPage);
3739 else
3740 {
3741 Assert(pGMM->cDuplicatePages);
3742 pGMM->cDuplicatePages--;
3743 }
3744 }
3745 else
3746 {
3747 Log(("gmmR0FreePages: #%#x/%#x: already free!\n", iPage, idPage));
3748 rc = VERR_GMM_PAGE_ALREADY_FREE;
3749 break;
3750 }
3751 }
3752 else
3753 {
3754 Log(("gmmR0FreePages: #%#x/%#x: not found!\n", iPage, idPage));
3755 rc = VERR_GMM_PAGE_NOT_FOUND;
3756 break;
3757 }
3758 paPages[iPage].idPage = NIL_GMM_PAGEID;
3759 }
3760
3761 /*
3762 * Update the account.
3763 */
3764 switch (enmAccount)
3765 {
3766 case GMMACCOUNT_BASE: pGVM->gmm.s.Stats.Allocated.cBasePages -= iPage; break;
3767 case GMMACCOUNT_SHADOW: pGVM->gmm.s.Stats.Allocated.cShadowPages -= iPage; break;
3768 case GMMACCOUNT_FIXED: pGVM->gmm.s.Stats.Allocated.cFixedPages -= iPage; break;
3769 default:
3770 AssertMsgFailedReturn(("enmAccount=%d\n", enmAccount), VERR_IPE_NOT_REACHED_DEFAULT_CASE);
3771 }
3772
3773 /*
3774 * Any threshold stuff to be done here?
3775 */
3776
3777 return rc;
3778}
3779
3780
3781/**
3782 * Free one or more pages.
3783 *
3784 * This is typically used at reset time or power off.
3785 *
3786 * @returns VBox status code:
3787 * @retval xxx
3788 *
3789 * @param pGVM The global (ring-0) VM structure.
3790 * @param idCpu The VCPU id.
3791 * @param cPages The number of pages to free.
3792 * @param paPages Pointer to the page descriptors containing the page IDs
3793 * for each page.
3794 * @param enmAccount The account this relates to.
3795 * @thread EMT.
3796 */
3797GMMR0DECL(int) GMMR0FreePages(PGVM pGVM, VMCPUID idCpu, uint32_t cPages, PGMMFREEPAGEDESC paPages, GMMACCOUNT enmAccount)
3798{
3799 LogFlow(("GMMR0FreePages: pGVM=%p cPages=%#x paPages=%p enmAccount=%d\n", pGVM, cPages, paPages, enmAccount));
3800
3801 /*
3802 * Validate input and get the basics.
3803 */
3804 PGMM pGMM;
3805 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3806 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3807 if (RT_FAILURE(rc))
3808 return rc;
3809
3810 AssertPtrReturn(paPages, VERR_INVALID_PARAMETER);
3811 AssertMsgReturn(enmAccount > GMMACCOUNT_INVALID && enmAccount < GMMACCOUNT_END, ("%d\n", enmAccount), VERR_INVALID_PARAMETER);
3812 AssertMsgReturn(cPages > 0 && cPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cPages), VERR_INVALID_PARAMETER);
3813
3814 for (unsigned iPage = 0; iPage < cPages; iPage++)
3815 AssertMsgReturn( paPages[iPage].idPage <= GMM_PAGEID_LAST
3816 /*|| paPages[iPage].idPage == NIL_GMM_PAGEID*/,
3817 ("#%#x: %#x\n", iPage, paPages[iPage].idPage), VERR_INVALID_PARAMETER);
3818
3819 /*
3820 * Take the semaphore and call the worker function.
3821 */
3822 gmmR0MutexAcquire(pGMM);
3823 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3824 {
3825 rc = gmmR0FreePages(pGMM, pGVM, cPages, paPages, enmAccount);
3826 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3827 }
3828 else
3829 rc = VERR_GMM_IS_NOT_SANE;
3830 gmmR0MutexRelease(pGMM);
3831 LogFlow(("GMMR0FreePages: returns %Rrc\n", rc));
3832 return rc;
3833}
3834
3835
3836/**
3837 * VMMR0 request wrapper for GMMR0FreePages.
3838 *
3839 * @returns see GMMR0FreePages.
3840 * @param pGVM The global (ring-0) VM structure.
3841 * @param idCpu The VCPU id.
3842 * @param pReq Pointer to the request packet.
3843 */
3844GMMR0DECL(int) GMMR0FreePagesReq(PGVM pGVM, VMCPUID idCpu, PGMMFREEPAGESREQ pReq)
3845{
3846 /*
3847 * Validate input and pass it on.
3848 */
3849 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
3850 AssertMsgReturn(pReq->Hdr.cbReq >= RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0]),
3851 ("%#x < %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF(GMMFREEPAGESREQ, aPages[0])),
3852 VERR_INVALID_PARAMETER);
3853 AssertMsgReturn(pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[pReq->cPages]),
3854 ("%#x != %#x\n", pReq->Hdr.cbReq, RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[pReq->cPages])),
3855 VERR_INVALID_PARAMETER);
3856
3857 return GMMR0FreePages(pGVM, idCpu, pReq->cPages, &pReq->aPages[0], pReq->enmAccount);
3858}
3859
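/*
 * Illustrative request sketch (not taken from the actual callers; header
 * initialization beyond cbReq, e.g. the request magic, is elided and the
 * page ID variables are hypothetical): freeing two base-account pages via
 * the wrapper above.
 *
 * @code
 *  uint32_t const   cPages = 2;
 *  uint32_t const   cbReq  = RT_UOFFSETOF_DYN(GMMFREEPAGESREQ, aPages[cPages]);
 *  PGMMFREEPAGESREQ pReq   = (PGMMFREEPAGESREQ)RTMemAllocZ(cbReq);
 *  pReq->Hdr.cbReq         = cbReq;
 *  pReq->enmAccount        = GMMACCOUNT_BASE;
 *  pReq->cPages            = cPages;
 *  pReq->aPages[0].idPage  = idFirstPage;   // IDs previously handed out by the allocation APIs
 *  pReq->aPages[1].idPage  = idSecondPage;
 *  int rc = GMMR0FreePagesReq(pGVM, idCpu, pReq);
 * @endcode
 */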
3860
3861/**
3862 * Report back on a memory ballooning request.
3863 *
3864 * The request may or may not have been initiated by the GMM. If it was initiated
3865 * by the GMM it is important that this function is called even if no pages were
3866 * ballooned.
3867 *
3868 * @returns VBox status code:
3869 * @retval VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH
3870 * @retval VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH
3871 * @retval VERR_GMM_OVERCOMMITTED_TRY_AGAIN_IN_A_BIT - reset condition
3872 * indicating that we won't necessarily have sufficient RAM to boot
3873 * the VM again and that it should pause until this changes (we'll try to
3874 * balloon some other VM). (For standard deflate we have little choice
3875 * but to hope the VM won't use the memory that was returned to it.)
3876 *
3877 * @param pGVM The global (ring-0) VM structure.
3878 * @param idCpu The VCPU id.
3879 * @param enmAction Inflate/deflate/reset.
3880 * @param cBalloonedPages The number of pages that were ballooned.
3881 *
3882 * @thread EMT(idCpu)
3883 */
3884GMMR0DECL(int) GMMR0BalloonedPages(PGVM pGVM, VMCPUID idCpu, GMMBALLOONACTION enmAction, uint32_t cBalloonedPages)
3885{
3886 LogFlow(("GMMR0BalloonedPages: pGVM=%p enmAction=%d cBalloonedPages=%#x\n",
3887 pGVM, enmAction, cBalloonedPages));
3888
3889 AssertMsgReturn(cBalloonedPages < RT_BIT(32 - PAGE_SHIFT), ("%#x\n", cBalloonedPages), VERR_INVALID_PARAMETER);
3890
3891 /*
3892 * Validate input and get the basics.
3893 */
3894 PGMM pGMM;
3895 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
3896 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
3897 if (RT_FAILURE(rc))
3898 return rc;
3899
3900 /*
3901 * Take the semaphore and do some more validations.
3902 */
3903 gmmR0MutexAcquire(pGMM);
3904 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
3905 {
3906 switch (enmAction)
3907 {
3908 case GMMBALLOONACTION_INFLATE:
3909 {
3910 if (RT_LIKELY(pGVM->gmm.s.Stats.Allocated.cBasePages + pGVM->gmm.s.Stats.cBalloonedPages + cBalloonedPages
3911 <= pGVM->gmm.s.Stats.Reserved.cBasePages))
3912 {
3913 /*
3914 * Record the ballooned memory.
3915 */
3916 pGMM->cBalloonedPages += cBalloonedPages;
3917 if (pGVM->gmm.s.Stats.cReqBalloonedPages)
3918 {
3919 /* Codepath never taken. Might be interesting in the future to request ballooned memory from guests in low-memory conditions. */
3920 AssertFailed();
3921
3922 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3923 pGVM->gmm.s.Stats.cReqActuallyBalloonedPages += cBalloonedPages;
3924 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx Req=%#llx Actual=%#llx (pending)\n",
3925 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages,
3926 pGVM->gmm.s.Stats.cReqBalloonedPages, pGVM->gmm.s.Stats.cReqActuallyBalloonedPages));
3927 }
3928 else
3929 {
3930 pGVM->gmm.s.Stats.cBalloonedPages += cBalloonedPages;
3931 Log(("GMMR0BalloonedPages: +%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3932 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3933 }
3934 }
3935 else
3936 {
3937 Log(("GMMR0BalloonedPages: cBasePages=%#llx Total=%#llx cBalloonedPages=%#llx Reserved=%#llx\n",
3938 pGVM->gmm.s.Stats.Allocated.cBasePages, pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages,
3939 pGVM->gmm.s.Stats.Reserved.cBasePages));
3940 rc = VERR_GMM_ATTEMPT_TO_FREE_TOO_MUCH;
3941 }
3942 break;
3943 }
3944
3945 case GMMBALLOONACTION_DEFLATE:
3946 {
3947 /* Deflate. */
3948 if (pGVM->gmm.s.Stats.cBalloonedPages >= cBalloonedPages)
3949 {
3950 /*
3951 * Record the ballooned memory.
3952 */
3953 Assert(pGMM->cBalloonedPages >= cBalloonedPages);
3954 pGMM->cBalloonedPages -= cBalloonedPages;
3955 pGVM->gmm.s.Stats.cBalloonedPages -= cBalloonedPages;
3956 if (pGVM->gmm.s.Stats.cReqDeflatePages)
3957 {
3958 AssertFailed(); /* This path is for later. */
3959 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx Req=%#llx\n",
3960 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages, pGVM->gmm.s.Stats.cReqDeflatePages));
3961
3962 /*
3963 * Anything we need to do here now when the request has been completed?
3964 */
3965 pGVM->gmm.s.Stats.cReqDeflatePages = 0;
3966 }
3967 else
3968 Log(("GMMR0BalloonedPages: -%#x - Global=%#llx / VM: Total=%#llx (user)\n",
3969 cBalloonedPages, pGMM->cBalloonedPages, pGVM->gmm.s.Stats.cBalloonedPages));
3970 }
3971 else
3972 {
3973 Log(("GMMR0BalloonedPages: Total=%#llx cBalloonedPages=%#llx\n", pGVM->gmm.s.Stats.cBalloonedPages, cBalloonedPages));
3974 rc = VERR_GMM_ATTEMPT_TO_DEFLATE_TOO_MUCH;
3975 }
3976 break;
3977 }
3978
3979 case GMMBALLOONACTION_RESET:
3980 {
3981 /* Reset to an empty balloon. */
3982 Assert(pGMM->cBalloonedPages >= pGVM->gmm.s.Stats.cBalloonedPages);
3983
3984 pGMM->cBalloonedPages -= pGVM->gmm.s.Stats.cBalloonedPages;
3985 pGVM->gmm.s.Stats.cBalloonedPages = 0;
3986 break;
3987 }
3988
3989 default:
3990 rc = VERR_INVALID_PARAMETER;
3991 break;
3992 }
3993 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
3994 }
3995 else
3996 rc = VERR_GMM_IS_NOT_SANE;
3997
3998 gmmR0MutexRelease(pGMM);
3999 LogFlow(("GMMR0BalloonedPages: returns %Rrc\n", rc));
4000 return rc;
4001}
4002
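/*
 * Worked inflate example for the account check above (hypothetical numbers):
 * with Reserved.cBasePages = 0x40000 (1 GiB worth of 4 KiB pages),
 * Allocated.cBasePages = 0x30000 and cBalloonedPages = 0x8000 already
 * recorded, an INFLATE request for another 0x9000 pages is rejected because
 * 0x30000 + 0x8000 + 0x9000 = 0x41000 exceeds 0x40000, while a request for
 * 0x8000 pages would still be accepted.
 */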
4003
4004/**
4005 * VMMR0 request wrapper for GMMR0BalloonedPages.
4006 *
4007 * @returns see GMMR0BalloonedPages.
4008 * @param pGVM The global (ring-0) VM structure.
4009 * @param idCpu The VCPU id.
4010 * @param pReq Pointer to the request packet.
4011 */
4012GMMR0DECL(int) GMMR0BalloonedPagesReq(PGVM pGVM, VMCPUID idCpu, PGMMBALLOONEDPAGESREQ pReq)
4013{
4014 /*
4015 * Validate input and pass it on.
4016 */
4017 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4018 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMBALLOONEDPAGESREQ),
4019 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMBALLOONEDPAGESREQ)),
4020 VERR_INVALID_PARAMETER);
4021
4022 return GMMR0BalloonedPages(pGVM, idCpu, pReq->enmAction, pReq->cBalloonedPages);
4023}
4024
4025
4026/**
4027 * Return memory statistics for the hypervisor
4028 *
4029 * @returns VBox status code.
4030 * @param pReq Pointer to the request packet.
4031 */
4032GMMR0DECL(int) GMMR0QueryHypervisorMemoryStatsReq(PGMMMEMSTATSREQ pReq)
4033{
4034 /*
4035 * Validate input and pass it on.
4036 */
4037 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4038 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
4039 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
4040 VERR_INVALID_PARAMETER);
4041
4042 /*
4043 * Validate input and get the basics.
4044 */
4045 PGMM pGMM;
4046 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4047 pReq->cAllocPages = pGMM->cAllocatedPages;
4048 pReq->cFreePages = (pGMM->cChunks << (GMM_CHUNK_SHIFT - PAGE_SHIFT)) - pGMM->cAllocatedPages;
4049 pReq->cBalloonedPages = pGMM->cBalloonedPages;
4050 pReq->cMaxPages = pGMM->cMaxPages;
4051 pReq->cSharedPages = pGMM->cDuplicatePages;
4052 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4053
4054 return VINF_SUCCESS;
4055}
4056
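/*
 * Worked example for the cFreePages computation above (hypothetical counts,
 * assuming the usual 2 MiB chunks and 4 KiB pages, i.e. 512 pages per chunk):
 * with cChunks = 100 and cAllocatedPages = 40000 the reported value is
 * (100 << 9) - 40000 = 51200 - 40000 = 11200 free pages.
 */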
4057
4058/**
4059 * Return memory statistics for the VM
4060 *
4061 * @returns VBox status code.
4062 * @param pGVM The global (ring-0) VM structure.
4063 * @param idCpu Cpu id.
4064 * @param pReq Pointer to the request packet.
4065 *
4066 * @thread EMT(idCpu)
4067 */
4068GMMR0DECL(int) GMMR0QueryMemoryStatsReq(PGVM pGVM, VMCPUID idCpu, PGMMMEMSTATSREQ pReq)
4069{
4070 /*
4071 * Validate input and pass it on.
4072 */
4073 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4074 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(GMMMEMSTATSREQ),
4075 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(GMMMEMSTATSREQ)),
4076 VERR_INVALID_PARAMETER);
4077
4078 /*
4079 * Validate input and get the basics.
4080 */
4081 PGMM pGMM;
4082 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4083 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4084 if (RT_FAILURE(rc))
4085 return rc;
4086
4087 /*
4088 * Take the semaphore and do some more validations.
4089 */
4090 gmmR0MutexAcquire(pGMM);
4091 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4092 {
4093 pReq->cAllocPages = pGVM->gmm.s.Stats.Allocated.cBasePages;
4094 pReq->cBalloonedPages = pGVM->gmm.s.Stats.cBalloonedPages;
4095 pReq->cMaxPages = pGVM->gmm.s.Stats.Reserved.cBasePages;
4096 pReq->cFreePages = pReq->cMaxPages - pReq->cAllocPages;
4097 }
4098 else
4099 rc = VERR_GMM_IS_NOT_SANE;
4100
4101 gmmR0MutexRelease(pGMM);
4102 LogFlow(("GMMR0QueryMemoryStatsReq: returns %Rrc\n", rc));
4103 return rc;
4104}
4105
4106
4107/**
4108 * Worker for gmmR0UnmapChunk and gmmR0FreeChunk.
4109 *
4110 * Don't call this in legacy allocation mode!
4111 *
4112 * @returns VBox status code.
4113 * @param pGMM Pointer to the GMM instance data.
4114 * @param pGVM Pointer to the Global VM structure.
4115 * @param pChunk Pointer to the chunk to be unmapped.
4116 */
4117static int gmmR0UnmapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk)
4118{
4119 RT_NOREF_PV(pGMM);
4120#ifdef GMM_WITH_LEGACY_MODE
4121 Assert(!pGMM->fLegacyAllocationMode || (pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE));
4122#endif
4123
4124 /*
4125 * Find the mapping and try unmapping it.
4126 */
4127 uint32_t cMappings = pChunk->cMappingsX;
4128 for (uint32_t i = 0; i < cMappings; i++)
4129 {
4130 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4131 if (pChunk->paMappingsX[i].pGVM == pGVM)
4132 {
4133 /* unmap */
4134 int rc = RTR0MemObjFree(pChunk->paMappingsX[i].hMapObj, false /* fFreeMappings (NA) */);
4135 if (RT_SUCCESS(rc))
4136 {
4137 /* update the record. */
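 /* (The removal below swaps the last array entry into the vacated slot so
     the mapping array stays dense; entry order is not preserved.) */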
4138 cMappings--;
4139 if (i < cMappings)
4140 pChunk->paMappingsX[i] = pChunk->paMappingsX[cMappings];
4141 pChunk->paMappingsX[cMappings].hMapObj = NIL_RTR0MEMOBJ;
4142 pChunk->paMappingsX[cMappings].pGVM = NULL;
4143 Assert(pChunk->cMappingsX - 1U == cMappings);
4144 pChunk->cMappingsX = cMappings;
4145 }
4146
4147 return rc;
4148 }
4149 }
4150
4151 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
4152 return VERR_GMM_CHUNK_NOT_MAPPED;
4153}
4154
4155
4156/**
4157 * Unmaps a chunk previously mapped into the address space of the current process.
4158 *
4159 * @returns VBox status code.
4160 * @param pGMM Pointer to the GMM instance data.
4161 * @param pGVM Pointer to the Global VM structure.
4162 * @param pChunk Pointer to the chunk to be unmapped.
4163 * @param fRelaxedSem Whether we can release the semaphore while doing the
4164 * mapping (@c true) or not.
4165 */
4166static int gmmR0UnmapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem)
4167{
4168#ifdef GMM_WITH_LEGACY_MODE
4169 if (!pGMM->fLegacyAllocationMode || (pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
4170 {
4171#endif
4172 /*
4173 * Lock the chunk and if possible leave the giant GMM lock.
4174 */
4175 GMMR0CHUNKMTXSTATE MtxState;
4176 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
4177 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
4178 if (RT_SUCCESS(rc))
4179 {
4180 rc = gmmR0UnmapChunkLocked(pGMM, pGVM, pChunk);
4181 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4182 }
4183 return rc;
4184#ifdef GMM_WITH_LEGACY_MODE
4185 }
4186
4187 if (pChunk->hGVM == pGVM->hSelf)
4188 return VINF_SUCCESS;
4189
4190 Log(("gmmR0UnmapChunk: Chunk %#x is not mapped into pGVM=%p/%#x (legacy)\n", pChunk->Core.Key, pGVM, pGVM->hSelf));
4191 return VERR_GMM_CHUNK_NOT_MAPPED;
4192#endif
4193}
4194
4195
4196/**
4197 * Worker for gmmR0MapChunk.
4198 *
4199 * @returns VBox status code.
4200 * @param pGMM Pointer to the GMM instance data.
4201 * @param pGVM Pointer to the Global VM structure.
4202 * @param pChunk Pointer to the chunk to be mapped.
4203 * @param ppvR3 Where to store the ring-3 address of the mapping.
4204 * In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
4205 * contain the address of the existing mapping.
4206 */
4207static int gmmR0MapChunkLocked(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4208{
4209#ifdef GMM_WITH_LEGACY_MODE
4210 /*
4211 * If we're in legacy mode this is simple.
4212 */
4213 if (pGMM->fLegacyAllocationMode && !(pChunk->fFlags & GMM_CHUNK_FLAGS_LARGE_PAGE))
4214 {
4215 if (pChunk->hGVM != pGVM->hSelf)
4216 {
4217 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
4218 return VERR_GMM_CHUNK_NOT_FOUND;
4219 }
4220
4221 *ppvR3 = RTR0MemObjAddressR3(pChunk->hMemObj);
4222 return VINF_SUCCESS;
4223 }
4224#else
4225 RT_NOREF(pGMM);
4226#endif
4227
4228 /*
4229 * Check to see if the chunk is already mapped.
4230 */
4231 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4232 {
4233 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4234 if (pChunk->paMappingsX[i].pGVM == pGVM)
4235 {
4236 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4237 Log(("gmmR0MapChunk: chunk %#x is already mapped at %p!\n", pChunk->Core.Key, *ppvR3));
4238#ifdef VBOX_WITH_PAGE_SHARING
4239 /* The ring-3 chunk cache can be out of sync; don't fail. */
4240 return VINF_SUCCESS;
4241#else
4242 return VERR_GMM_CHUNK_ALREADY_MAPPED;
4243#endif
4244 }
4245 }
4246
4247 /*
4248 * Do the mapping.
4249 */
4250 RTR0MEMOBJ hMapObj;
4251 int rc = RTR0MemObjMapUser(&hMapObj, pChunk->hMemObj, (RTR3PTR)-1, 0, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4252 if (RT_SUCCESS(rc))
4253 {
4254 /* reallocate the array? assumes few users per chunk (usually one). */
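 /* (Growth sketch: the capacity goes 1, 2, 3, 4 and then grows in steps of
     four (8, 12, ...), so the common single-mapping case never over-allocates.) */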
4255 unsigned iMapping = pChunk->cMappingsX;
4256 if ( iMapping <= 3
4257 || (iMapping & 3) == 0)
4258 {
4259 unsigned cNewSize = iMapping <= 3
4260 ? iMapping + 1
4261 : iMapping + 4;
4262 Assert(cNewSize < 4 || RT_ALIGN_32(cNewSize, 4) == cNewSize);
4263 if (RT_UNLIKELY(cNewSize > UINT16_MAX))
4264 {
4265 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4266 return VERR_GMM_TOO_MANY_CHUNK_MAPPINGS;
4267 }
4268
4269 void *pvMappings = RTMemRealloc(pChunk->paMappingsX, cNewSize * sizeof(pChunk->paMappingsX[0]));
4270 if (RT_UNLIKELY(!pvMappings))
4271 {
4272 rc = RTR0MemObjFree(hMapObj, false /* fFreeMappings (NA) */); AssertRC(rc);
4273 return VERR_NO_MEMORY;
4274 }
4275 pChunk->paMappingsX = (PGMMCHUNKMAP)pvMappings;
4276 }
4277
4278 /* insert new entry */
4279 pChunk->paMappingsX[iMapping].hMapObj = hMapObj;
4280 pChunk->paMappingsX[iMapping].pGVM = pGVM;
4281 Assert(pChunk->cMappingsX == iMapping);
4282 pChunk->cMappingsX = iMapping + 1;
4283
4284 *ppvR3 = RTR0MemObjAddressR3(hMapObj);
4285 }
4286
4287 return rc;
4288}
4289
4290
4291/**
4292 * Maps a chunk into the user address space of the current process.
4293 *
4294 * @returns VBox status code.
4295 * @param pGMM Pointer to the GMM instance data.
4296 * @param pGVM Pointer to the Global VM structure.
4297 * @param pChunk Pointer to the chunk to be mapped.
4298 * @param fRelaxedSem Whether we can release the semaphore while doing the
4299 * mapping (@c true) or not.
4300 * @param ppvR3 Where to store the ring-3 address of the mapping.
4301 * In the VERR_GMM_CHUNK_ALREADY_MAPPED case, this will
4302 * contain the address of the existing mapping.
4303 */
4304static int gmmR0MapChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, bool fRelaxedSem, PRTR3PTR ppvR3)
4305{
4306 /*
4307 * Take the chunk lock and leave the giant GMM lock when possible, then
4308 * call the worker function.
4309 */
4310 GMMR0CHUNKMTXSTATE MtxState;
4311 int rc = gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk,
4312 fRelaxedSem ? GMMR0CHUNK_MTX_RETAKE_GIANT : GMMR0CHUNK_MTX_KEEP_GIANT);
4313 if (RT_SUCCESS(rc))
4314 {
4315 rc = gmmR0MapChunkLocked(pGMM, pGVM, pChunk, ppvR3);
4316 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4317 }
4318
4319 return rc;
4320}
4321
4322
4323
4324#if defined(VBOX_WITH_PAGE_SHARING) || (defined(VBOX_STRICT) && HC_ARCH_BITS == 64)
4325/**
4326 * Check if a chunk is mapped into the specified VM
4327 *
4328 * @returns mapped yes/no
4329 * @param pGMM Pointer to the GMM instance.
4330 * @param pGVM Pointer to the Global VM structure.
4331 * @param pChunk Pointer to the chunk to be mapped.
4332 * @param ppvR3 Where to store the ring-3 address of the mapping.
4333 */
4334static bool gmmR0IsChunkMapped(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, PRTR3PTR ppvR3)
4335{
4336 GMMR0CHUNKMTXSTATE MtxState;
4337 gmmR0ChunkMutexAcquire(&MtxState, pGMM, pChunk, GMMR0CHUNK_MTX_KEEP_GIANT);
4338 for (uint32_t i = 0; i < pChunk->cMappingsX; i++)
4339 {
4340 Assert(pChunk->paMappingsX[i].pGVM && pChunk->paMappingsX[i].hMapObj != NIL_RTR0MEMOBJ);
4341 if (pChunk->paMappingsX[i].pGVM == pGVM)
4342 {
4343 *ppvR3 = RTR0MemObjAddressR3(pChunk->paMappingsX[i].hMapObj);
4344 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4345 return true;
4346 }
4347 }
4348 *ppvR3 = NULL;
4349 gmmR0ChunkMutexRelease(&MtxState, pChunk);
4350 return false;
4351}
4352#endif /* VBOX_WITH_PAGE_SHARING || (VBOX_STRICT && 64-BIT) */
4353
4354
4355/**
4356 * Map a chunk and/or unmap another chunk.
4357 *
4358 * The mapping and unmapping applies to the current process.
4359 *
4360 * This API does two things because it saves a kernel call per mapping when
4361 * the ring-3 mapping cache is full.
4362 *
4363 * @returns VBox status code.
4364 * @param pGVM The global (ring-0) VM structure.
4365 * @param idChunkMap The chunk to map. NIL_GMM_CHUNKID if nothing to map.
4366 * @param idChunkUnmap The chunk to unmap. NIL_GMM_CHUNKID if nothing to unmap.
4367 * @param ppvR3 Where to store the address of the mapped chunk. NULL is ok if nothing to map.
4368 * @thread EMT ???
4369 */
4370GMMR0DECL(int) GMMR0MapUnmapChunk(PGVM pGVM, uint32_t idChunkMap, uint32_t idChunkUnmap, PRTR3PTR ppvR3)
4371{
4372 LogFlow(("GMMR0MapUnmapChunk: pGVM=%p idChunkMap=%#x idChunkUnmap=%#x ppvR3=%p\n",
4373 pGVM, idChunkMap, idChunkUnmap, ppvR3));
4374
4375 /*
4376 * Validate input and get the basics.
4377 */
4378 PGMM pGMM;
4379 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4380 int rc = GVMMR0ValidateGVM(pGVM);
4381 if (RT_FAILURE(rc))
4382 return rc;
4383
4384 AssertCompile(NIL_GMM_CHUNKID == 0);
4385 AssertMsgReturn(idChunkMap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkMap), VERR_INVALID_PARAMETER);
4386 AssertMsgReturn(idChunkUnmap <= GMM_CHUNKID_LAST, ("%#x\n", idChunkUnmap), VERR_INVALID_PARAMETER);
4387
4388 if ( idChunkMap == NIL_GMM_CHUNKID
4389 && idChunkUnmap == NIL_GMM_CHUNKID)
4390 return VERR_INVALID_PARAMETER;
4391
4392 if (idChunkMap != NIL_GMM_CHUNKID)
4393 {
4394 AssertPtrReturn(ppvR3, VERR_INVALID_POINTER);
4395 *ppvR3 = NIL_RTR3PTR;
4396 }
4397
4398 /*
4399 * Take the semaphore and do the work.
4400 *
4401 * The unmapping is done last since it's easier to undo a mapping than to
4402 * undo an unmapping. The ring-3 mapping cache cannot be so big that it
4403 * pushes the user virtual address space to within a chunk of its limits,
4404 * so there is no problem here.
4405 */
4406 gmmR0MutexAcquire(pGMM);
4407 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4408 {
4409 PGMMCHUNK pMap = NULL;
4410 if (idChunkMap != NIL_GMM_CHUNKID)
4411 {
4412 pMap = gmmR0GetChunk(pGMM, idChunkMap);
4413 if (RT_LIKELY(pMap))
4414 rc = gmmR0MapChunk(pGMM, pGVM, pMap, true /*fRelaxedSem*/, ppvR3);
4415 else
4416 {
4417 Log(("GMMR0MapUnmapChunk: idChunkMap=%#x\n", idChunkMap));
4418 rc = VERR_GMM_CHUNK_NOT_FOUND;
4419 }
4420 }
4421/** @todo split this operation, the bail out might (theoretically) not be
4422 * entirely safe. */
4423
4424 if ( idChunkUnmap != NIL_GMM_CHUNKID
4425 && RT_SUCCESS(rc))
4426 {
4427 PGMMCHUNK pUnmap = gmmR0GetChunk(pGMM, idChunkUnmap);
4428 if (RT_LIKELY(pUnmap))
4429 rc = gmmR0UnmapChunk(pGMM, pGVM, pUnmap, true /*fRelaxedSem*/);
4430 else
4431 {
4432 Log(("GMMR0MapUnmapChunk: idChunkUnmap=%#x\n", idChunkUnmap));
4433 rc = VERR_GMM_CHUNK_NOT_FOUND;
4434 }
4435
4436 if (RT_FAILURE(rc) && pMap)
4437 gmmR0UnmapChunk(pGMM, pGVM, pMap, false /*fRelaxedSem*/);
4438 }
4439
4440 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4441 }
4442 else
4443 rc = VERR_GMM_IS_NOT_SANE;
4444 gmmR0MutexRelease(pGMM);
4445
4446 LogFlow(("GMMR0MapUnmapChunk: returns %Rrc\n", rc));
4447 return rc;
4448}
4449
4450
4451/**
4452 * VMMR0 request wrapper for GMMR0MapUnmapChunk.
4453 *
4454 * @returns see GMMR0MapUnmapChunk.
4455 * @param pGVM The global (ring-0) VM structure.
4456 * @param pReq Pointer to the request packet.
4457 */
4458GMMR0DECL(int) GMMR0MapUnmapChunkReq(PGVM pGVM, PGMMMAPUNMAPCHUNKREQ pReq)
4459{
4460 /*
4461 * Validate input and pass it on.
4462 */
4463 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
4464 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
4465
4466 return GMMR0MapUnmapChunk(pGVM, pReq->idChunkMap, pReq->idChunkUnmap, &pReq->pvR3);
4467}
4468
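/*
 * Illustrative request sketch (hypothetical chunk ID variables; header
 * initialization beyond cbReq is elided): mapping one chunk and unmapping
 * another in a single call, as the ring-3 mapping cache does when it is full.
 *
 * @code
 *  GMMMAPUNMAPCHUNKREQ Req;
 *  Req.Hdr.cbReq    = sizeof(Req);
 *  Req.idChunkMap   = idChunkToMap;    // NIL_GMM_CHUNKID if nothing to map
 *  Req.idChunkUnmap = idChunkToEvict;  // NIL_GMM_CHUNKID if nothing to unmap
 *  Req.pvR3         = NIL_RTR3PTR;
 *  int rc = GMMR0MapUnmapChunkReq(pGVM, &Req);
 *  // on success Req.pvR3 holds the ring-3 address of the newly mapped chunk
 * @endcode
 */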
4469
4470/**
4471 * Legacy mode API for supplying pages.
4472 *
4473 * The specified user address points to an allocation-chunk-sized block that
4474 * will be locked down and used by the GMM when the GM asks for pages.
4475 *
4476 * @returns VBox status code.
4477 * @param pGVM The global (ring-0) VM structure.
4478 * @param idCpu The VCPU id.
4479 * @param pvR3 Pointer to the chunk size memory block to lock down.
4480 */
4481GMMR0DECL(int) GMMR0SeedChunk(PGVM pGVM, VMCPUID idCpu, RTR3PTR pvR3)
4482{
4483#ifdef GMM_WITH_LEGACY_MODE
4484 /*
4485 * Validate input and get the basics.
4486 */
4487 PGMM pGMM;
4488 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4489 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4490 if (RT_FAILURE(rc))
4491 return rc;
4492
4493 AssertPtrReturn(pvR3, VERR_INVALID_POINTER);
4494 AssertReturn(!(PAGE_OFFSET_MASK & pvR3), VERR_INVALID_POINTER);
4495
4496 if (!pGMM->fLegacyAllocationMode)
4497 {
4498 Log(("GMMR0SeedChunk: not in legacy allocation mode!\n"));
4499 return VERR_NOT_SUPPORTED;
4500 }
4501
4502 /*
4503 * Lock the memory and add it as new chunk with our hGVM.
4504 * (The GMM locking is done inside gmmR0RegisterChunk.)
4505 */
4506 RTR0MEMOBJ hMemObj;
4507 rc = RTR0MemObjLockUser(&hMemObj, pvR3, GMM_CHUNK_SIZE, RTMEM_PROT_READ | RTMEM_PROT_WRITE, NIL_RTR0PROCESS);
4508 if (RT_SUCCESS(rc))
4509 {
4510 rc = gmmR0RegisterChunk(pGMM, &pGVM->gmm.s.Private, hMemObj, pGVM->hSelf, GMM_CHUNK_FLAGS_SEEDED, NULL);
4511 if (RT_SUCCESS(rc))
4512 gmmR0MutexRelease(pGMM);
4513 else
4514 RTR0MemObjFree(hMemObj, true /* fFreeMappings */);
4515 }
4516
4517 LogFlow(("GMMR0SeedChunk: rc=%d (pvR3=%p)\n", rc, pvR3));
4518 return rc;
4519#else
4520 RT_NOREF(pGVM, idCpu, pvR3);
4521 return VERR_NOT_SUPPORTED;
4522#endif
4523}
4524
4525
4526#ifndef VBOX_WITH_LINEAR_HOST_PHYS_MEM
4527/**
4528 * Gets the ring-0 virtual address for the given page.
4529 *
4530 * This is used by PGM when IEM and such wants to access guest RAM from ring-0.
4531 * One of the ASSUMPTIONS here is that the @a idPage is used by the VM and the
4532 * corresponding chunk will remain valid beyond the call (at least till the EMT
4533 * returns to ring-3).
4534 *
4535 * @returns VBox status code.
4536 * @param pGVM Pointer to the kernel-only VM instance data.
4537 * @param idPage The page ID.
4538 * @param ppv Where to store the address.
4539 * @thread EMT
4540 */
4541GMMR0DECL(int) GMMR0PageIdToVirt(PGVM pGVM, uint32_t idPage, void **ppv)
4542{
4543 *ppv = NULL;
4544 PGMM pGMM;
4545 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4546
4547 uint32_t const idChunk = idPage >> GMM_CHUNKID_SHIFT;
4548
4549 /*
4550 * Start with the per-VM TLB.
4551 */
4552 RTSpinlockAcquire(pGVM->gmm.s.hChunkTlbSpinLock);
4553
4554 PGMMPERVMCHUNKTLBE pTlbe = &pGVM->gmm.s.aChunkTlbEntries[GMMPERVM_CHUNKTLB_IDX(idChunk)];
4555 PGMMCHUNK pChunk = pTlbe->pChunk;
4556 if ( pChunk != NULL
4557 && pTlbe->idGeneration == ASMAtomicUoReadU64(&pGMM->idFreeGeneration)
4558 && pChunk->Core.Key == idChunk)
4559 pGVM->R0Stats.gmm.cChunkTlbHits++; /* hopefully this is a likely outcome */
4560 else
4561 {
4562 pGVM->R0Stats.gmm.cChunkTlbMisses++;
4563
4564 /*
4565 * Look it up in the chunk tree.
4566 */
4567 RTSpinlockAcquire(pGMM->hSpinLockTree);
4568 pChunk = gmmR0GetChunkLocked(pGMM, idChunk);
4569 if (RT_LIKELY(pChunk))
4570 {
4571 pTlbe->idGeneration = pGMM->idFreeGeneration;
4572 RTSpinlockRelease(pGMM->hSpinLockTree);
4573 pTlbe->pChunk = pChunk;
4574 }
4575 else
4576 {
4577 RTSpinlockRelease(pGMM->hSpinLockTree);
4578 RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
4579 AssertMsgFailed(("idPage=%#x\n", idPage));
4580 return VERR_GMM_PAGE_NOT_FOUND;
4581 }
4582 }
4583
4584 RTSpinlockRelease(pGVM->gmm.s.hChunkTlbSpinLock);
4585
4586 /*
4587 * Got a chunk, now validate the page ownership and calculate its address.
4588 */
4589 const GMMPAGE * const pPage = &pChunk->aPages[idPage & GMM_PAGEID_IDX_MASK];
4590 if (RT_LIKELY( ( GMM_PAGE_IS_PRIVATE(pPage)
4591 && pPage->Private.hGVM == pGVM->hSelf)
4592 || GMM_PAGE_IS_SHARED(pPage)))
4593 {
4594 AssertPtr(pChunk->pbMapping);
4595 *ppv = &pChunk->pbMapping[(idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT];
4596 return VINF_SUCCESS;
4597 }
4598 AssertMsgFailed(("idPage=%#x is-private=%RTbool Private.hGVM=%u pGVM->hGVM=%u\n",
4599 idPage, GMM_PAGE_IS_PRIVATE(pPage), pPage->Private.hGVM, pGVM->hSelf));
4600 return VERR_GMM_NOT_PAGE_OWNER;
4601}
4602#endif /* !VBOX_WITH_LINEAR_HOST_PHYS_MEM */
4603
4604#ifdef VBOX_WITH_PAGE_SHARING
4605
4606# ifdef VBOX_STRICT
4607/**
4608 * For checksumming shared pages in strict builds.
4609 *
4610 * The purpose is making sure that a page doesn't change.
4611 *
4612 * @returns Checksum, 0 on failure.
4613 * @param pGMM The GMM instance data.
4614 * @param pGVM Pointer to the kernel-only VM instance data.
4615 * @param idPage The page ID.
4616 */
4617static uint32_t gmmR0StrictPageChecksum(PGMM pGMM, PGVM pGVM, uint32_t idPage)
4618{
4619 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
4620 AssertMsgReturn(pChunk, ("idPage=%#x\n", idPage), 0);
4621
4622 uint8_t *pbChunk;
4623 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
4624 return 0;
4625 uint8_t const *pbPage = pbChunk + ((idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
4626
4627 return RTCrc32(pbPage, PAGE_SIZE);
4628}
4629# endif /* VBOX_STRICT */
4630
4631
4632/**
4633 * Calculates the module hash value.
4634 *
4635 * @returns Hash value.
4636 * @param pszModuleName The module name.
4637 * @param pszVersion The module version string.
4638 */
4639static uint32_t gmmR0ShModCalcHash(const char *pszModuleName, const char *pszVersion)
4640{
4641 return RTStrHash1ExN(3, pszModuleName, RTSTR_MAX, "::", (size_t)2, pszVersion, RTSTR_MAX);
4642}
4643
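/*
 * Example (hypothetical module): for pszModuleName = "ntdll.dll" and
 * pszVersion = "5.1.2600.5512" the three parts hashed above are the name,
 * the literal "::" separator and the version, i.e. effectively the hash of
 * the combined string "ntdll.dll::5.1.2600.5512".
 */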
4644
4645/**
4646 * Finds a global module.
4647 *
4648 * @returns Pointer to the global module on success, NULL if not found.
4649 * @param pGMM The GMM instance data.
4650 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4651 * @param cbModule The module size.
4652 * @param enmGuestOS The guest OS type.
4653 * @param cRegions The number of regions.
4654 * @param pszModuleName The module name.
4655 * @param pszVersion The module version.
4656 * @param paRegions The region descriptions.
4657 */
4658static PGMMSHAREDMODULE gmmR0ShModFindGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4659 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4660 struct VMMDEVSHAREDREGIONDESC const *paRegions)
4661{
4662 for (PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTAvllU32Get(&pGMM->pGlobalSharedModuleTree, uHash);
4663 pGblMod;
4664 pGblMod = (PGMMSHAREDMODULE)pGblMod->Core.pList)
4665 {
4666 if (pGblMod->cbModule != cbModule)
4667 continue;
4668 if (pGblMod->enmGuestOS != enmGuestOS)
4669 continue;
4670 if (pGblMod->cRegions != cRegions)
4671 continue;
4672 if (strcmp(pGblMod->szName, pszModuleName))
4673 continue;
4674 if (strcmp(pGblMod->szVersion, pszVersion))
4675 continue;
4676
4677 uint32_t i;
4678 for (i = 0; i < cRegions; i++)
4679 {
4680 uint32_t off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4681 if (pGblMod->aRegions[i].off != off)
4682 break;
4683
4684 uint32_t cb = RT_ALIGN_32(paRegions[i].cbRegion + off, PAGE_SIZE);
4685 if (pGblMod->aRegions[i].cb != cb)
4686 break;
4687 }
4688
4689 if (i == cRegions)
4690 return pGblMod;
4691 }
4692
4693 return NULL;
4694}
4695
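/*
 * Worked example of the region normalization used above and when creating a
 * global module (hypothetical numbers, 4 KiB pages): GCRegionAddr = 0x00401234
 * and cbRegion = 0x5100 give off = 0x234 and cb = RT_ALIGN_32(0x5100 + 0x234,
 * PAGE_SIZE) = 0x6000, so two modules only match when both the sub-page
 * offset and the page-rounded size of every region are identical.
 */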
4696
4697/**
4698 * Creates a new global module.
4699 *
4700 * @returns VBox status code.
4701 * @param pGMM The GMM instance data.
4702 * @param uHash The hash as calculated by gmmR0ShModCalcHash.
4703 * @param cbModule The module size.
4704 * @param enmGuestOS The guest OS type.
4705 * @param cRegions The number of regions.
4706 * @param pszModuleName The module name.
4707 * @param pszVersion The module version.
4708 * @param paRegions The region descriptions.
4709 * @param ppGblMod Where to return the new module on success.
4710 */
4711static int gmmR0ShModNewGlobal(PGMM pGMM, uint32_t uHash, uint32_t cbModule, VBOXOSFAMILY enmGuestOS,
4712 uint32_t cRegions, const char *pszModuleName, const char *pszVersion,
4713 struct VMMDEVSHAREDREGIONDESC const *paRegions, PGMMSHAREDMODULE *ppGblMod)
4714{
4715 Log(("gmmR0ShModNewGlobal: %s %s size %#x os %u rgn %u\n", pszModuleName, pszVersion, cbModule, enmGuestOS, cRegions));
4716 if (pGMM->cShareableModules >= GMM_MAX_SHARED_GLOBAL_MODULES)
4717 {
4718 Log(("gmmR0ShModNewGlobal: Too many modules\n"));
4719 return VERR_GMM_TOO_MANY_GLOBAL_MODULES;
4720 }
4721
4722 PGMMSHAREDMODULE pGblMod = (PGMMSHAREDMODULE)RTMemAllocZ(RT_UOFFSETOF_DYN(GMMSHAREDMODULE, aRegions[cRegions]));
4723 if (!pGblMod)
4724 {
4725 Log(("gmmR0ShModNewGlobal: No memory\n"));
4726 return VERR_NO_MEMORY;
4727 }
4728
4729 pGblMod->Core.Key = uHash;
4730 pGblMod->cbModule = cbModule;
4731 pGblMod->cRegions = cRegions;
4732 pGblMod->cUsers = 1;
4733 pGblMod->enmGuestOS = enmGuestOS;
4734 strcpy(pGblMod->szName, pszModuleName);
4735 strcpy(pGblMod->szVersion, pszVersion);
4736
4737 for (uint32_t i = 0; i < cRegions; i++)
4738 {
4739 Log(("gmmR0ShModNewGlobal: rgn[%u]=%RGvLB%#x\n", i, paRegions[i].GCRegionAddr, paRegions[i].cbRegion));
4740 pGblMod->aRegions[i].off = paRegions[i].GCRegionAddr & PAGE_OFFSET_MASK;
4741 pGblMod->aRegions[i].cb = paRegions[i].cbRegion + pGblMod->aRegions[i].off;
4742 pGblMod->aRegions[i].cb = RT_ALIGN_32(pGblMod->aRegions[i].cb, PAGE_SIZE);
4743 pGblMod->aRegions[i].paidPages = NULL; /* allocated when needed. */
4744 }
4745
4746 bool fInsert = RTAvllU32Insert(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4747 Assert(fInsert); NOREF(fInsert);
4748 pGMM->cShareableModules++;
4749
4750 *ppGblMod = pGblMod;
4751 return VINF_SUCCESS;
4752}
4753
4754
4755/**
4756 * Deletes a global module which is no longer referenced by anyone.
4757 *
4758 * @param pGMM The GMM instance data.
4759 * @param pGblMod The module to delete.
4760 */
4761static void gmmR0ShModDeleteGlobal(PGMM pGMM, PGMMSHAREDMODULE pGblMod)
4762{
4763 Assert(pGblMod->cUsers == 0);
4764 Assert(pGMM->cShareableModules > 0 && pGMM->cShareableModules <= GMM_MAX_SHARED_GLOBAL_MODULES);
4765
4766 void *pvTest = RTAvllU32RemoveNode(&pGMM->pGlobalSharedModuleTree, &pGblMod->Core);
4767 Assert(pvTest == pGblMod); NOREF(pvTest);
4768 pGMM->cShareableModules--;
4769
4770 uint32_t i = pGblMod->cRegions;
4771 while (i-- > 0)
4772 {
4773 if (pGblMod->aRegions[i].paidPages)
4774 {
4775 /* We don't do anything to the pages as they are handled by the
4776 copy-on-write mechanism in PGM. */
4777 RTMemFree(pGblMod->aRegions[i].paidPages);
4778 pGblMod->aRegions[i].paidPages = NULL;
4779 }
4780 }
4781 RTMemFree(pGblMod);
4782}
4783
4784
4785static int gmmR0ShModNewPerVM(PGVM pGVM, RTGCPTR GCBaseAddr, uint32_t cRegions, const VMMDEVSHAREDREGIONDESC *paRegions,
4786 PGMMSHAREDMODULEPERVM *ppRecVM)
4787{
4788 if (pGVM->gmm.s.Stats.cShareableModules >= GMM_MAX_SHARED_PER_VM_MODULES)
4789 return VERR_GMM_TOO_MANY_PER_VM_MODULES;
4790
4791 PGMMSHAREDMODULEPERVM pRecVM;
4792 pRecVM = (PGMMSHAREDMODULEPERVM)RTMemAllocZ(RT_UOFFSETOF_DYN(GMMSHAREDMODULEPERVM, aRegionsGCPtrs[cRegions]));
4793 if (!pRecVM)
4794 return VERR_NO_MEMORY;
4795
4796 pRecVM->Core.Key = GCBaseAddr;
4797 for (uint32_t i = 0; i < cRegions; i++)
4798 pRecVM->aRegionsGCPtrs[i] = paRegions[i].GCRegionAddr;
4799
4800 bool fInsert = RTAvlGCPtrInsert(&pGVM->gmm.s.pSharedModuleTree, &pRecVM->Core);
4801 Assert(fInsert); NOREF(fInsert);
4802 pGVM->gmm.s.Stats.cShareableModules++;
4803
4804 *ppRecVM = pRecVM;
4805 return VINF_SUCCESS;
4806}
4807
4808
4809static void gmmR0ShModDeletePerVM(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULEPERVM pRecVM, bool fRemove)
4810{
4811 /*
4812 * Free the per-VM module.
4813 */
4814 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
4815 pRecVM->pGlobalModule = NULL;
4816
4817 if (fRemove)
4818 {
4819 void *pvTest = RTAvlGCPtrRemove(&pGVM->gmm.s.pSharedModuleTree, pRecVM->Core.Key);
4820 Assert(pvTest == &pRecVM->Core); NOREF(pvTest);
4821 }
4822
4823 RTMemFree(pRecVM);
4824
4825 /*
4826 * Release the global module.
4827 * (In the registration bailout case, it might not be.)
4828 */
4829 if (pGblMod)
4830 {
4831 Assert(pGblMod->cUsers > 0);
4832 pGblMod->cUsers--;
4833 if (pGblMod->cUsers == 0)
4834 gmmR0ShModDeleteGlobal(pGMM, pGblMod);
4835 }
4836}
4837
4838#endif /* VBOX_WITH_PAGE_SHARING */
4839
4840/**
4841 * Registers a new shared module for the VM.
4842 *
4843 * @returns VBox status code.
4844 * @param pGVM The global (ring-0) VM structure.
4845 * @param idCpu The VCPU id.
4846 * @param enmGuestOS The guest OS type.
4847 * @param pszModuleName The module name.
4848 * @param pszVersion The module version.
4849 * @param GCPtrModBase The module base address.
4850 * @param cbModule The module size.
4851 * @param cRegions The number of shared region descriptors.
4852 * @param paRegions Pointer to an array of shared region(s).
4853 * @thread EMT(idCpu)
4854 */
4855GMMR0DECL(int) GMMR0RegisterSharedModule(PGVM pGVM, VMCPUID idCpu, VBOXOSFAMILY enmGuestOS, char *pszModuleName,
4856 char *pszVersion, RTGCPTR GCPtrModBase, uint32_t cbModule,
4857 uint32_t cRegions, struct VMMDEVSHAREDREGIONDESC const *paRegions)
4858{
4859#ifdef VBOX_WITH_PAGE_SHARING
4860 /*
4861 * Validate input and get the basics.
4862 *
4863 * Note! Turns out the module size does not necessarily match the size of the
4864 * regions. (iTunes on XP)
4865 */
4866 PGMM pGMM;
4867 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
4868 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
4869 if (RT_FAILURE(rc))
4870 return rc;
4871
4872 if (RT_UNLIKELY(cRegions > VMMDEVSHAREDREGIONDESC_MAX))
4873 return VERR_GMM_TOO_MANY_REGIONS;
4874
4875 if (RT_UNLIKELY(cbModule == 0 || cbModule > _1G))
4876 return VERR_GMM_BAD_SHARED_MODULE_SIZE;
4877
4878 uint32_t cbTotal = 0;
4879 for (uint32_t i = 0; i < cRegions; i++)
4880 {
4881 if (RT_UNLIKELY(paRegions[i].cbRegion == 0 || paRegions[i].cbRegion > _1G))
4882 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4883
4884 cbTotal += paRegions[i].cbRegion;
4885 if (RT_UNLIKELY(cbTotal > _1G))
4886 return VERR_GMM_SHARED_MODULE_BAD_REGIONS_SIZE;
4887 }
4888
4889 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
4890 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
4891 return VERR_GMM_MODULE_NAME_TOO_LONG;
4892
4893 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
4894 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
4895 return VERR_GMM_MODULE_NAME_TOO_LONG;
4896
4897 uint32_t const uHash = gmmR0ShModCalcHash(pszModuleName, pszVersion);
4898 Log(("GMMR0RegisterSharedModule %s %s base %RGv size %x hash %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule, uHash));
4899
4900 /*
4901 * Take the semaphore and do some more validations.
4902 */
4903 gmmR0MutexAcquire(pGMM);
4904 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
4905 {
4906 /*
4907 * Check if this module is already locally registered and register
4908 * it if it isn't. The base address is a unique module identifier
4909 * locally.
4910 */
4911 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
4912 bool fNewModule = pRecVM == NULL;
4913 if (fNewModule)
4914 {
4915 rc = gmmR0ShModNewPerVM(pGVM, GCPtrModBase, cRegions, paRegions, &pRecVM);
4916 if (RT_SUCCESS(rc))
4917 {
4918 /*
4919 * Find a matching global module, register a new one if needed.
4920 */
4921 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4922 pszModuleName, pszVersion, paRegions);
4923 if (!pGblMod)
4924 {
4925 Assert(fNewModule);
4926 rc = gmmR0ShModNewGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4927 pszModuleName, pszVersion, paRegions, &pGblMod);
4928 if (RT_SUCCESS(rc))
4929 {
4930 pRecVM->pGlobalModule = pGblMod; /* (One reference returned by gmmR0ShModNewGlobal.) */
4931 Log(("GMMR0RegisterSharedModule: new module %s %s\n", pszModuleName, pszVersion));
4932 }
4933 else
4934 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
4935 }
4936 else
4937 {
4938 Assert(pGblMod->cUsers > 0 && pGblMod->cUsers < UINT32_MAX / 2);
4939 pGblMod->cUsers++;
4940 pRecVM->pGlobalModule = pGblMod;
4941
4942 Log(("GMMR0RegisterSharedModule: new per vm module %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4943 }
4944 }
4945 }
4946 else
4947 {
4948 /*
4949 * Attempt to re-register an existing module.
4950 */
4951 PGMMSHAREDMODULE pGblMod = gmmR0ShModFindGlobal(pGMM, uHash, cbModule, enmGuestOS, cRegions,
4952 pszModuleName, pszVersion, paRegions);
4953 if (pRecVM->pGlobalModule == pGblMod)
4954 {
4955 Log(("GMMR0RegisterSharedModule: already registered %s %s, gbl users %d\n", pszModuleName, pszVersion, pGblMod->cUsers));
4956 rc = VINF_GMM_SHARED_MODULE_ALREADY_REGISTERED;
4957 }
4958 else
4959 {
4960 /** @todo may have to unregister+register when this happens in case it's caused
4961 * by VBoxService crashing and being restarted... */
4962 Log(("GMMR0RegisterSharedModule: Address clash!\n"
4963 " incoming at %RGvLB%#x %s %s rgns %u\n"
4964 " existing at %RGvLB%#x %s %s rgns %u\n",
4965 GCPtrModBase, cbModule, pszModuleName, pszVersion, cRegions,
4966 pRecVM->Core.Key, pRecVM->pGlobalModule->cbModule, pRecVM->pGlobalModule->szName,
4967 pRecVM->pGlobalModule->szVersion, pRecVM->pGlobalModule->cRegions));
4968 rc = VERR_GMM_SHARED_MODULE_ADDRESS_CLASH;
4969 }
4970 }
4971 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
4972 }
4973 else
4974 rc = VERR_GMM_IS_NOT_SANE;
4975
4976 gmmR0MutexRelease(pGMM);
4977 return rc;
4978#else
4979
4980 NOREF(pGVM); NOREF(idCpu); NOREF(enmGuestOS); NOREF(pszModuleName); NOREF(pszVersion);
4981 NOREF(GCPtrModBase); NOREF(cbModule); NOREF(cRegions); NOREF(paRegions);
4982 return VERR_NOT_IMPLEMENTED;
4983#endif
4984}
4985
4986
4987/**
4988 * VMMR0 request wrapper for GMMR0RegisterSharedModule.
4989 *
4990 * @returns see GMMR0RegisterSharedModule.
4991 * @param pGVM The global (ring-0) VM structure.
4992 * @param idCpu The VCPU id.
4993 * @param pReq Pointer to the request packet.
4994 */
4995GMMR0DECL(int) GMMR0RegisterSharedModuleReq(PGVM pGVM, VMCPUID idCpu, PGMMREGISTERSHAREDMODULEREQ pReq)
4996{
4997 /*
4998 * Validate input and pass it on.
4999 */
5000 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5001 AssertMsgReturn( pReq->Hdr.cbReq >= sizeof(*pReq)
5002 && pReq->Hdr.cbReq == RT_UOFFSETOF_DYN(GMMREGISTERSHAREDMODULEREQ, aRegions[pReq->cRegions]),
5003 ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5004
5005 /* Pass back return code in the request packet to preserve informational codes. (VMMR3CallR0 chokes on them) */
5006 pReq->rc = GMMR0RegisterSharedModule(pGVM, idCpu, pReq->enmGuestOS, pReq->szName, pReq->szVersion,
5007 pReq->GCBaseAddr, pReq->cbModule, pReq->cRegions, pReq->aRegions);
5008 return VINF_SUCCESS;
5009}
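
/*
 * Illustrative ring-3 call sequence (a minimal sketch, not lifted from the actual
 * guest-library / PGM code; the VMMR3CallR0 signature, the
 * VMMR0_DO_GMM_REGISTER_SHARED_MODULE operation name and the SUPVMMR0REQHDR_MAGIC
 * initialisation are assumptions made for illustration):
 *
 * @code
 *  uint32_t const cbReq = RT_UOFFSETOF_DYN(GMMREGISTERSHAREDMODULEREQ, aRegions[cRegions]);
 *  PGMMREGISTERSHAREDMODULEREQ pReq = (PGMMREGISTERSHAREDMODULEREQ)RTMemAllocZ(cbReq);
 *  pReq->Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;
 *  pReq->Hdr.cbReq    = cbReq;
 *  pReq->enmGuestOS   = enmGuestOS;
 *  pReq->GCBaseAddr   = GCPtrModBase;
 *  pReq->cbModule     = cbModule;
 *  pReq->cRegions     = cRegions;
 *  RTStrCopy(pReq->szName, sizeof(pReq->szName), pszModuleName);
 *  RTStrCopy(pReq->szVersion, sizeof(pReq->szVersion), pszVersion);
 *  for (uint32_t i = 0; i < cRegions; i++)
 *      pReq->aRegions[i] = paRegions[i];
 *
 *  int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_REGISTER_SHARED_MODULE, 0, &pReq->Hdr);
 *  if (RT_SUCCESS(rc))
 *      rc = pReq->rc;  // informational codes such as VINF_GMM_SHARED_MODULE_ALREADY_REGISTERED land here
 *  RTMemFree(pReq);
 * @endcode
 */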
5010
5011
5012/**
5013 * Unregisters a shared module for the VM
5014 *
5015 * @returns VBox status code.
5016 * @param pGVM The global (ring-0) VM structure.
5017 * @param idCpu The VCPU id.
5018 * @param pszModuleName The module name.
5019 * @param pszVersion The module version.
5020 * @param GCPtrModBase The module base address.
5021 * @param cbModule The module size.
5022 */
5023GMMR0DECL(int) GMMR0UnregisterSharedModule(PGVM pGVM, VMCPUID idCpu, char *pszModuleName, char *pszVersion,
5024 RTGCPTR GCPtrModBase, uint32_t cbModule)
5025{
5026#ifdef VBOX_WITH_PAGE_SHARING
5027 /*
5028 * Validate input and get the basics.
5029 */
5030 PGMM pGMM;
5031 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5032 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5033 if (RT_FAILURE(rc))
5034 return rc;
5035
5036 AssertPtrReturn(pszModuleName, VERR_INVALID_POINTER);
5037 AssertPtrReturn(pszVersion, VERR_INVALID_POINTER);
5038 if (RT_UNLIKELY(!memchr(pszModuleName, '\0', GMM_SHARED_MODULE_MAX_NAME_STRING)))
5039 return VERR_GMM_MODULE_NAME_TOO_LONG;
5040 if (RT_UNLIKELY(!memchr(pszVersion, '\0', GMM_SHARED_MODULE_MAX_VERSION_STRING)))
5041 return VERR_GMM_MODULE_NAME_TOO_LONG;
5042
5043 Log(("GMMR0UnregisterSharedModule %s %s base=%RGv size %x\n", pszModuleName, pszVersion, GCPtrModBase, cbModule));
5044
5045 /*
5046 * Take the semaphore and do some more validations.
5047 */
5048 gmmR0MutexAcquire(pGMM);
5049 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5050 {
5051 /*
5052 * Locate and remove the specified module.
5053 */
5054 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)RTAvlGCPtrGet(&pGVM->gmm.s.pSharedModuleTree, GCPtrModBase);
5055 if (pRecVM)
5056 {
5057 /** @todo Do we need to do more validations here, like that the
5058 * name + version + cbModule matches? */
5059 NOREF(cbModule);
5060 Assert(pRecVM->pGlobalModule);
5061 gmmR0ShModDeletePerVM(pGMM, pGVM, pRecVM, true /*fRemove*/);
5062 }
5063 else
5064 rc = VERR_GMM_SHARED_MODULE_NOT_FOUND;
5065
5066 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5067 }
5068 else
5069 rc = VERR_GMM_IS_NOT_SANE;
5070
5071 gmmR0MutexRelease(pGMM);
5072 return rc;
5073#else
5074
5075 NOREF(pGVM); NOREF(idCpu); NOREF(pszModuleName); NOREF(pszVersion); NOREF(GCPtrModBase); NOREF(cbModule);
5076 return VERR_NOT_IMPLEMENTED;
5077#endif
5078}
5079
5080
5081/**
5082 * VMMR0 request wrapper for GMMR0UnregisterSharedModule.
5083 *
5084 * @returns see GMMR0UnregisterSharedModule.
5085 * @param pGVM The global (ring-0) VM structure.
5086 * @param idCpu The VCPU id.
5087 * @param pReq Pointer to the request packet.
5088 */
5089GMMR0DECL(int) GMMR0UnregisterSharedModuleReq(PGVM pGVM, VMCPUID idCpu, PGMMUNREGISTERSHAREDMODULEREQ pReq)
5090{
5091 /*
5092 * Validate input and pass it on.
5093 */
5094 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5095 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5096
5097 return GMMR0UnregisterSharedModule(pGVM, idCpu, pReq->szName, pReq->szVersion, pReq->GCBaseAddr, pReq->cbModule);
5098}
5099
5100#ifdef VBOX_WITH_PAGE_SHARING
5101
5102/**
5103 * Increases the use count of a shared page which is known to exist and be valid.
5104 *
5105 * @param pGMM Pointer to the GMM instance.
5106 * @param pGVM Pointer to the GVM instance.
5107 * @param pPage The page structure.
5108 */
5109DECLINLINE(void) gmmR0UseSharedPage(PGMM pGMM, PGVM pGVM, PGMMPAGE pPage)
5110{
5111 Assert(pGMM->cSharedPages > 0);
5112 Assert(pGMM->cAllocatedPages > 0);
5113
5114 pGMM->cDuplicatePages++;
5115
5116 pPage->Shared.cRefs++;
5117 pGVM->gmm.s.Stats.cSharedPages++;
5118 pGVM->gmm.s.Stats.Allocated.cBasePages++;
5119}
5120
5121
5122/**
5123 * Converts a private page to a shared page; the page is known to exist and be valid.
5124 *
5125 * @param pGMM Pointer to the GMM instance.
5126 * @param pGVM Pointer to the GVM instance.
5127 * @param HCPhys Host physical address
5128 * @param idPage The Page ID
5129 * @param pPage The page structure.
5130 * @param pPageDesc Shared page descriptor
5131 */
5132DECLINLINE(void) gmmR0ConvertToSharedPage(PGMM pGMM, PGVM pGVM, RTHCPHYS HCPhys, uint32_t idPage, PGMMPAGE pPage,
5133 PGMMSHAREDPAGEDESC pPageDesc)
5134{
5135 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, idPage >> GMM_CHUNKID_SHIFT);
5136 Assert(pChunk);
5137 Assert(pChunk->cFree < GMM_CHUNK_NUM_PAGES);
5138 Assert(GMM_PAGE_IS_PRIVATE(pPage));
5139
5140 pChunk->cPrivate--;
5141 pChunk->cShared++;
5142
5143 pGMM->cSharedPages++;
5144
5145 pGVM->gmm.s.Stats.cSharedPages++;
5146 pGVM->gmm.s.Stats.cPrivatePages--;
5147
5148 /* Modify the page structure. */
5149 pPage->Shared.pfn = (uint32_t)(uint64_t)(HCPhys >> PAGE_SHIFT);
5150 pPage->Shared.cRefs = 1;
5151#ifdef VBOX_STRICT
5152 pPageDesc->u32StrictChecksum = gmmR0StrictPageChecksum(pGMM, pGVM, idPage);
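/* Only the low 14 bits of the checksum fit in the bitfield below; GMMR0SharedModuleCheckPage masks with 0x3fff before comparing. */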
5153 pPage->Shared.u14Checksum = pPageDesc->u32StrictChecksum;
5154#else
5155 NOREF(pPageDesc);
5156 pPage->Shared.u14Checksum = 0;
5157#endif
5158 pPage->Shared.u2State = GMM_PAGE_STATE_SHARED;
5159}
5160
5161
5162static int gmmR0SharedModuleCheckPageFirstTime(PGMM pGMM, PGVM pGVM, PGMMSHAREDMODULE pModule,
5163 unsigned idxRegion, unsigned idxPage,
5164 PGMMSHAREDPAGEDESC pPageDesc, PGMMSHAREDREGIONDESC pGlobalRegion)
5165{
5166 NOREF(pModule);
5167
5168 /* Easy case: just change the internal page type. */
5169 PGMMPAGE pPage = gmmR0GetPage(pGMM, pPageDesc->idPage);
5170 AssertMsgReturn(pPage, ("idPage=%#x (GCPhys=%RGp HCPhys=%RHp idxRegion=%#x idxPage=%#x) #1\n",
5171 pPageDesc->idPage, pPageDesc->GCPhys, pPageDesc->HCPhys, idxRegion, idxPage),
5172 VERR_PGM_PHYS_INVALID_PAGE_ID);
5173 NOREF(idxRegion);
5174
5175 AssertMsg(pPageDesc->GCPhys == (pPage->Private.pfn << 12), ("desc %RGp gmm %RGp\n", pPageDesc->GCPhys, (pPage->Private.pfn << 12)));
5176
5177 gmmR0ConvertToSharedPage(pGMM, pGVM, pPageDesc->HCPhys, pPageDesc->idPage, pPage, pPageDesc);
5178
5179 /* Keep track of these references. */
5180 pGlobalRegion->paidPages[idxPage] = pPageDesc->idPage;
5181
5182 return VINF_SUCCESS;
5183}
5184
5185/**
5186 * Checks the specified shared module range for changes.
5187 *
5188 * Performs the following tasks:
5189 * - If a shared page is new, then it changes the GMM page type to shared and
5190 * returns it in the pPageDesc descriptor.
5191 * - If a shared page already exists, then it checks if the VM page is
5192 * identical and if so frees the VM page and returns the shared page in the
5193 * pPageDesc descriptor (see the illustrative caller sketch after the function body).
5194 *
5195 * @remarks ASSUMES the caller has acquired the GMM semaphore!!
5196 *
5197 * @returns VBox status code.
5198 * @param pGVM Pointer to the GVM instance data.
5199 * @param pModule Module description
5200 * @param idxRegion Region index
5201 * @param idxPage Page index
5202 * @param pPageDesc Page descriptor
5203 */
5204GMMR0DECL(int) GMMR0SharedModuleCheckPage(PGVM pGVM, PGMMSHAREDMODULE pModule, uint32_t idxRegion, uint32_t idxPage,
5205 PGMMSHAREDPAGEDESC pPageDesc)
5206{
5207 int rc;
5208 PGMM pGMM;
5209 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5210 pPageDesc->u32StrictChecksum = 0;
5211
5212 AssertMsgReturn(idxRegion < pModule->cRegions,
5213 ("idxRegion=%#x cRegions=%#x %s %s\n", idxRegion, pModule->cRegions, pModule->szName, pModule->szVersion),
5214 VERR_INVALID_PARAMETER);
5215
5216 uint32_t const cPages = pModule->aRegions[idxRegion].cb >> PAGE_SHIFT;
5217 AssertMsgReturn(idxPage < cPages,
5218 ("idxRegion=%#x cRegions=%#x %s %s\n", idxRegion, pModule->cRegions, pModule->szName, pModule->szVersion),
5219 VERR_INVALID_PARAMETER);
5220
5221 LogFlow(("GMMR0SharedModuleCheckPage %s base %RGv region %d idxPage %d\n", pModule->szName, pModule->Core.Key, idxRegion, idxPage));
5222
5223 /*
5224 * First time; create a page descriptor array.
5225 */
5226 PGMMSHAREDREGIONDESC pGlobalRegion = &pModule->aRegions[idxRegion];
5227 if (!pGlobalRegion->paidPages)
5228 {
5229 Log(("Allocate page descriptor array for %d pages\n", cPages));
5230 pGlobalRegion->paidPages = (uint32_t *)RTMemAlloc(cPages * sizeof(pGlobalRegion->paidPages[0]));
5231 AssertReturn(pGlobalRegion->paidPages, VERR_NO_MEMORY);
5232
5233 /* Invalidate all descriptors. */
5234 uint32_t i = cPages;
5235 while (i-- > 0)
5236 pGlobalRegion->paidPages[i] = NIL_GMM_PAGEID;
5237 }
5238
5239 /*
5240 * We've seen this shared page for the first time?
5241 */
5242 if (pGlobalRegion->paidPages[idxPage] == NIL_GMM_PAGEID)
5243 {
5244 Log(("New shared page guest %RGp host %RHp\n", pPageDesc->GCPhys, pPageDesc->HCPhys));
5245 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
5246 }
5247
5248 /*
5249 * We've seen it before...
5250 */
5251 Log(("Replace existing page guest %RGp host %RHp id %#x -> id %#x\n",
5252 pPageDesc->GCPhys, pPageDesc->HCPhys, pPageDesc->idPage, pGlobalRegion->paidPages[idxPage]));
5253 Assert(pPageDesc->idPage != pGlobalRegion->paidPages[idxPage]);
5254
5255 /*
5256 * Get the shared page source.
5257 */
5258 PGMMPAGE pPage = gmmR0GetPage(pGMM, pGlobalRegion->paidPages[idxPage]);
5259 AssertMsgReturn(pPage, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #2\n", pPageDesc->idPage, idxRegion, idxPage),
5260 VERR_PGM_PHYS_INVALID_PAGE_ID);
5261
5262 if (pPage->Common.u2State != GMM_PAGE_STATE_SHARED)
5263 {
5264 /*
5265 * Page was freed at some point; invalidate this entry.
5266 */
5267 /** @todo this isn't really bullet proof. */
5268 Log(("Old shared page was freed -> create a new one\n"));
5269 pGlobalRegion->paidPages[idxPage] = NIL_GMM_PAGEID;
5270 return gmmR0SharedModuleCheckPageFirstTime(pGMM, pGVM, pModule, idxRegion, idxPage, pPageDesc, pGlobalRegion);
5271 }
5272
5273 Log(("Replace existing page guest host %RHp -> %RHp\n", pPageDesc->HCPhys, ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT));
5274
5275 /*
5276 * Calculate the virtual address of the local page.
5277 */
5278 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pPageDesc->idPage >> GMM_CHUNKID_SHIFT);
5279 AssertMsgReturn(pChunk, ("idPage=%#x (idxRegion=%#x idxPage=%#x) #4\n", pPageDesc->idPage, idxRegion, idxPage),
5280 VERR_PGM_PHYS_INVALID_PAGE_ID);
5281
5282 uint8_t *pbChunk;
5283 AssertMsgReturn(gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk),
5284 ("idPage=%#x (idxRegion=%#x idxPage=%#x) #3\n", pPageDesc->idPage, idxRegion, idxPage),
5285 VERR_PGM_PHYS_INVALID_PAGE_ID);
5286 uint8_t *pbLocalPage = pbChunk + ((pPageDesc->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5287
5288 /*
5289 * Calculate the virtual address of the shared page.
5290 */
5291 pChunk = gmmR0GetChunk(pGMM, pGlobalRegion->paidPages[idxPage] >> GMM_CHUNKID_SHIFT);
5292 Assert(pChunk); /* can't fail as gmmR0GetPage succeeded. */
5293
5294 /*
5295 * Get the virtual address of the physical page; map the chunk into the VM
5296 * process if not already done.
5297 */
5298 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5299 {
5300 Log(("Map chunk into process!\n"));
5301 rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5302 AssertRCReturn(rc, rc);
5303 }
5304 uint8_t *pbSharedPage = pbChunk + ((pGlobalRegion->paidPages[idxPage] & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5305
5306#ifdef VBOX_STRICT
5307 pPageDesc->u32StrictChecksum = RTCrc32(pbSharedPage, PAGE_SIZE);
5308 uint32_t uChecksum = pPageDesc->u32StrictChecksum & UINT32_C(0x00003fff);
5309 AssertMsg(!uChecksum || uChecksum == pPage->Shared.u14Checksum || !pPage->Shared.u14Checksum,
5310 ("%#x vs %#x - idPage=%#x - %s %s\n", uChecksum, pPage->Shared.u14Checksum,
5311 pGlobalRegion->paidPages[idxPage], pModule->szName, pModule->szVersion));
5312#endif
5313
5314 /** @todo write ASMMemComparePage. */
5315 if (memcmp(pbSharedPage, pbLocalPage, PAGE_SIZE))
5316 {
5317 Log(("Unexpected differences found between local and shared page; skip\n"));
5318 /* Signal to the caller that this one hasn't changed. */
5319 pPageDesc->idPage = NIL_GMM_PAGEID;
5320 return VINF_SUCCESS;
5321 }
5322
5323 /*
5324 * Free the old local page.
5325 */
5326 GMMFREEPAGEDESC PageDesc;
5327 PageDesc.idPage = pPageDesc->idPage;
5328 rc = gmmR0FreePages(pGMM, pGVM, 1, &PageDesc, GMMACCOUNT_BASE);
5329 AssertRCReturn(rc, rc);
5330
5331 gmmR0UseSharedPage(pGMM, pGVM, pPage);
5332
5333 /*
5334 * Pass along the new physical address & page id.
5335 */
5336 pPageDesc->HCPhys = ((uint64_t)pPage->Shared.pfn) << PAGE_SHIFT;
5337 pPageDesc->idPage = pGlobalRegion->paidPages[idxPage];
5338
5339 return VINF_SUCCESS;
5340}
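
/*
 * Illustrative per-page driver loop for the function above (a minimal sketch of
 * what a caller such as PGMR0SharedModuleCheck is assumed to do; the
 * pgmR0HypotheticalUpdateSharedMapping helper and the way GCPhysPage, HCPhysPage
 * and idPage are obtained are placeholders, not real PGM APIs):
 *
 * @code
 *  GMMSHAREDPAGEDESC PageDesc;
 *  PageDesc.GCPhys = GCPhysPage;   // guest physical address of the candidate page
 *  PageDesc.HCPhys = HCPhysPage;   // current host physical backing
 *  PageDesc.idPage = idPage;       // current (private) GMM page id
 *  int rc = GMMR0SharedModuleCheckPage(pGVM, pGblMod, idxRegion, idxPage, &PageDesc);
 *  if (RT_FAILURE(rc))
 *      return rc;
 *  if (PageDesc.idPage != NIL_GMM_PAGEID)
 *  {
 *      // The descriptor now identifies the shared page (HCPhys/idPage change when an
 *      // existing shared copy is reused), so PGM must update its own tracking here.
 *      rc = pgmR0HypotheticalUpdateSharedMapping(pGVM, GCPhysPage, PageDesc.HCPhys, PageDesc.idPage);
 *  }
 *  // else: the contents differed and the page was left private (nothing changed).
 * @endcode
 */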
5341
5342
5343/**
5344 * RTAvlGCPtrDestroy callback.
5345 *
5346 * @returns 0 or VERR_GMM_INSTANCE.
5347 * @param pNode The node to destroy.
5348 * @param pvArgs Pointer to an argument packet.
5349 */
5350static DECLCALLBACK(int) gmmR0CleanupSharedModule(PAVLGCPTRNODECORE pNode, void *pvArgs)
5351{
5352 gmmR0ShModDeletePerVM(((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGMM,
5353 ((GMMR0SHMODPERVMDTORARGS *)pvArgs)->pGVM,
5354 (PGMMSHAREDMODULEPERVM)pNode,
5355 false /*fRemove*/);
5356 return VINF_SUCCESS;
5357}
5358
5359
5360/**
5361 * Used by GMMR0CleanupVM to clean up shared modules.
5362 *
5363 * The caller must not hold the GMM lock; it is acquired here so that it
5364 * can be yielded as needed.
5365 *
5366 * @param pGMM The GMM handle.
5367 * @param pGVM The global VM handle.
5368 */
5369static void gmmR0SharedModuleCleanup(PGMM pGMM, PGVM pGVM)
5370{
5371 gmmR0MutexAcquire(pGMM);
5372 GMM_CHECK_SANITY_UPON_ENTERING(pGMM);
5373
5374 GMMR0SHMODPERVMDTORARGS Args;
5375 Args.pGVM = pGVM;
5376 Args.pGMM = pGMM;
5377 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5378
5379 AssertMsg(pGVM->gmm.s.Stats.cShareableModules == 0, ("%d\n", pGVM->gmm.s.Stats.cShareableModules));
5380 pGVM->gmm.s.Stats.cShareableModules = 0;
5381
5382 gmmR0MutexRelease(pGMM);
5383}
5384
5385#endif /* VBOX_WITH_PAGE_SHARING */
5386
5387/**
5388 * Removes all shared modules for the specified VM
5389 *
5390 * @returns VBox status code.
5391 * @param pGVM The global (ring-0) VM structure.
5392 * @param idCpu The VCPU id.
5393 */
5394GMMR0DECL(int) GMMR0ResetSharedModules(PGVM pGVM, VMCPUID idCpu)
5395{
5396#ifdef VBOX_WITH_PAGE_SHARING
5397 /*
5398 * Validate input and get the basics.
5399 */
5400 PGMM pGMM;
5401 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5402 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5403 if (RT_FAILURE(rc))
5404 return rc;
5405
5406 /*
5407 * Take the semaphore and do some more validations.
5408 */
5409 gmmR0MutexAcquire(pGMM);
5410 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5411 {
5412 Log(("GMMR0ResetSharedModules\n"));
5413 GMMR0SHMODPERVMDTORARGS Args;
5414 Args.pGVM = pGVM;
5415 Args.pGMM = pGMM;
5416 RTAvlGCPtrDestroy(&pGVM->gmm.s.pSharedModuleTree, gmmR0CleanupSharedModule, &Args);
5417 pGVM->gmm.s.Stats.cShareableModules = 0;
5418
5419 rc = VINF_SUCCESS;
5420 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5421 }
5422 else
5423 rc = VERR_GMM_IS_NOT_SANE;
5424
5425 gmmR0MutexRelease(pGMM);
5426 return rc;
5427#else
5428 RT_NOREF(pGVM, idCpu);
5429 return VERR_NOT_IMPLEMENTED;
5430#endif
5431}
5432
5433#ifdef VBOX_WITH_PAGE_SHARING
5434
5435/**
5436 * Tree enumeration callback for checking a shared module.
5437 */
5438static DECLCALLBACK(int) gmmR0CheckSharedModule(PAVLGCPTRNODECORE pNode, void *pvUser)
5439{
5440 GMMCHECKSHAREDMODULEINFO *pArgs = (GMMCHECKSHAREDMODULEINFO*)pvUser;
5441 PGMMSHAREDMODULEPERVM pRecVM = (PGMMSHAREDMODULEPERVM)pNode;
5442 PGMMSHAREDMODULE pGblMod = pRecVM->pGlobalModule;
5443
5444 Log(("gmmR0CheckSharedModule: check %s %s base=%RGv size=%x\n",
5445 pGblMod->szName, pGblMod->szVersion, pGblMod->Core.Key, pGblMod->cbModule));
5446
5447 int rc = PGMR0SharedModuleCheck(pArgs->pGVM, pArgs->pGVM, pArgs->idCpu, pGblMod, pRecVM->aRegionsGCPtrs);
5448 if (RT_FAILURE(rc))
5449 return rc;
5450 return VINF_SUCCESS;
5451}
5452
5453#endif /* VBOX_WITH_PAGE_SHARING */
5454
5455/**
5456 * Check all shared modules for the specified VM.
5457 *
5458 * @returns VBox status code.
5459 * @param pGVM The global (ring-0) VM structure.
5460 * @param idCpu The calling EMT number.
5461 * @thread EMT(idCpu)
5462 */
5463GMMR0DECL(int) GMMR0CheckSharedModules(PGVM pGVM, VMCPUID idCpu)
5464{
5465#ifdef VBOX_WITH_PAGE_SHARING
5466 /*
5467 * Validate input and get the basics.
5468 */
5469 PGMM pGMM;
5470 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5471 int rc = GVMMR0ValidateGVMandEMT(pGVM, idCpu);
5472 if (RT_FAILURE(rc))
5473 return rc;
5474
5475# ifndef DEBUG_sandervl
5476 /*
5477 * Take the semaphore and do some more validations.
5478 */
5479 gmmR0MutexAcquire(pGMM);
5480# endif
5481 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5482 {
5483 /*
5484 * Walk the tree, checking each module.
5485 */
5486 Log(("GMMR0CheckSharedModules\n"));
5487
5488 GMMCHECKSHAREDMODULEINFO Args;
5489 Args.pGVM = pGVM;
5490 Args.idCpu = idCpu;
5491 rc = RTAvlGCPtrDoWithAll(&pGVM->gmm.s.pSharedModuleTree, true /* fFromLeft */, gmmR0CheckSharedModule, &Args);
5492
5493 Log(("GMMR0CheckSharedModules done (rc=%Rrc)!\n", rc));
5494 GMM_CHECK_SANITY_UPON_LEAVING(pGMM);
5495 }
5496 else
5497 rc = VERR_GMM_IS_NOT_SANE;
5498
5499# ifndef DEBUG_sandervl
5500 gmmR0MutexRelease(pGMM);
5501# endif
5502 return rc;
5503#else
5504 RT_NOREF(pGVM, idCpu);
5505 return VERR_NOT_IMPLEMENTED;
5506#endif
5507}
5508
5509#if defined(VBOX_STRICT) && HC_ARCH_BITS == 64
5510
5511/**
5512 * Worker for GMMR0FindDuplicatePageReq.
5513 *
5514 * @returns true if duplicate, false if not.
5515 */
5516static bool gmmR0FindDupPageInChunk(PGMM pGMM, PGVM pGVM, PGMMCHUNK pChunk, uint8_t const *pbSourcePage)
5517{
5518 bool fFoundDuplicate = false;
5519 /* Only scan chunks that aren't already mapped into this VM process; already mapped chunks (including the source page's own) are skipped, so this isn't entirely correct. */
5520 uint8_t *pbChunk;
5521 if (!gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5522 {
5523 int rc = gmmR0MapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/, (PRTR3PTR)&pbChunk);
5524 if (RT_SUCCESS(rc))
5525 {
5526 /*
5527 * Look for duplicate pages
5528 */
5529 uintptr_t iPage = (GMM_CHUNK_SIZE >> PAGE_SHIFT);
5530 while (iPage-- > 0)
5531 {
5532 if (GMM_PAGE_IS_PRIVATE(&pChunk->aPages[iPage]))
5533 {
5534 uint8_t *pbDestPage = pbChunk + (iPage << PAGE_SHIFT);
5535 if (!memcmp(pbSourcePage, pbDestPage, PAGE_SIZE))
5536 {
5537 fFoundDuplicate = true;
5538 break;
5539 }
5540 }
5541 }
5542 gmmR0UnmapChunk(pGMM, pGVM, pChunk, false /*fRelaxedSem*/);
5543 }
5544 }
5545 return fFoundDuplicate;
5546}
5547
5548
5549/**
5550 * Finds a duplicate of the specified page in other active VMs.
5551 *
5552 * @returns VBox status code.
5553 * @param pGVM The global (ring-0) VM structure.
5554 * @param pReq Pointer to the request packet.
5555 */
5556GMMR0DECL(int) GMMR0FindDuplicatePageReq(PGVM pGVM, PGMMFINDDUPLICATEPAGEREQ pReq)
5557{
5558 /*
5559 * Validate input and pass it on.
5560 */
5561 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5562 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5563
5564 PGMM pGMM;
5565 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5566
5567 int rc = GVMMR0ValidateGVM(pGVM);
5568 if (RT_FAILURE(rc))
5569 return rc;
5570
5571 /*
5572 * Take the semaphore and do some more validations.
5573 */
5574 rc = gmmR0MutexAcquire(pGMM);
5575 if (GMM_CHECK_SANITY_UPON_ENTERING(pGMM))
5576 {
5577 uint8_t *pbChunk;
5578 PGMMCHUNK pChunk = gmmR0GetChunk(pGMM, pReq->idPage >> GMM_CHUNKID_SHIFT);
5579 if (pChunk)
5580 {
5581 if (gmmR0IsChunkMapped(pGMM, pGVM, pChunk, (PRTR3PTR)&pbChunk))
5582 {
5583 uint8_t *pbSourcePage = pbChunk + ((pReq->idPage & GMM_PAGEID_IDX_MASK) << PAGE_SHIFT);
5584 PGMMPAGE pPage = gmmR0GetPage(pGMM, pReq->idPage);
5585 if (pPage)
5586 {
5587 /*
5588 * Walk the chunks
5589 */
5590 pReq->fDuplicate = false;
5591 RTListForEach(&pGMM->ChunkList, pChunk, GMMCHUNK, ListNode)
5592 {
5593 if (gmmR0FindDupPageInChunk(pGMM, pGVM, pChunk, pbSourcePage))
5594 {
5595 pReq->fDuplicate = true;
5596 break;
5597 }
5598 }
5599 }
5600 else
5601 {
5602 AssertFailed();
5603 rc = VERR_PGM_PHYS_INVALID_PAGE_ID;
5604 }
5605 }
5606 else
5607 AssertFailed();
5608 }
5609 else
5610 AssertFailed();
5611 }
5612 else
5613 rc = VERR_GMM_IS_NOT_SANE;
5614
5615 gmmR0MutexRelease(pGMM);
5616 return rc;
5617}
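
/*
 * Illustrative strict-build usage from ring-3 (a minimal sketch; the VMMR3CallR0
 * signature and the VMMR0_DO_GMM_FIND_DUPLICATE_PAGE operation name are assumptions):
 *
 * @code
 *  GMMFINDDUPLICATEPAGEREQ Req;
 *  Req.Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;
 *  Req.Hdr.cbReq    = sizeof(Req);
 *  Req.idPage       = idPage;
 *  Req.fDuplicate   = false;
 *  int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_FIND_DUPLICATE_PAGE, 0, &Req.Hdr);
 *  if (RT_SUCCESS(rc) && Req.fDuplicate)
 *      LogRel(("Page %#x has an identical private page in another (unmapped) chunk\n", idPage));
 * @endcode
 */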
5618
5619#endif /* VBOX_STRICT && HC_ARCH_BITS == 64 */
5620
5621
5622/**
5623 * Retrieves the GMM statistics visible to the caller.
5624 *
5625 * @returns VBox status code.
5626 *
5627 * @param pStats Where to put the statistics.
5628 * @param pSession The current session.
5629 * @param pGVM The GVM to obtain statistics for. Optional.
5630 */
5631GMMR0DECL(int) GMMR0QueryStatistics(PGMMSTATS pStats, PSUPDRVSESSION pSession, PGVM pGVM)
5632{
5633 LogFlow(("GMMR0QueryStatistics: pStats=%p pSession=%p pGVM=%p\n", pStats, pSession, pGVM));
5634
5635 /*
5636 * Validate input.
5637 */
5638 AssertPtrReturn(pSession, VERR_INVALID_POINTER);
5639 AssertPtrReturn(pStats, VERR_INVALID_POINTER);
5640 pStats->cMaxPages = 0; /* (Touch the structure before taking the mutex so a bad pointer crashes early.) */
5641
5642 PGMM pGMM;
5643 GMM_GET_VALID_INSTANCE(pGMM, VERR_GMM_INSTANCE);
5644
5645 /*
5646 * Validate the VM handle, if not NULL, and lock the GMM.
5647 */
5648 int rc;
5649 if (pGVM)
5650 {
5651 rc = GVMMR0ValidateGVM(pGVM);
5652 if (RT_FAILURE(rc))
5653 return rc;
5654 }
5655
5656 rc = gmmR0MutexAcquire(pGMM);
5657 if (RT_FAILURE(rc))
5658 return rc;
5659
5660 /*
5661 * Copy out the GMM statistics.
5662 */
5663 pStats->cMaxPages = pGMM->cMaxPages;
5664 pStats->cReservedPages = pGMM->cReservedPages;
5665 pStats->cOverCommittedPages = pGMM->cOverCommittedPages;
5666 pStats->cAllocatedPages = pGMM->cAllocatedPages;
5667 pStats->cSharedPages = pGMM->cSharedPages;
5668 pStats->cDuplicatePages = pGMM->cDuplicatePages;
5669 pStats->cLeftBehindSharedPages = pGMM->cLeftBehindSharedPages;
5670 pStats->cBalloonedPages = pGMM->cBalloonedPages;
5671 pStats->cChunks = pGMM->cChunks;
5672 pStats->cFreedChunks = pGMM->cFreedChunks;
5673 pStats->cShareableModules = pGMM->cShareableModules;
5674 pStats->idFreeGeneration = pGMM->idFreeGeneration;
5675 RT_ZERO(pStats->au64Reserved);
5676
5677 /*
5678 * Copy out the VM statistics.
5679 */
5680 if (pGVM)
5681 pStats->VMStats = pGVM->gmm.s.Stats;
5682 else
5683 RT_ZERO(pStats->VMStats);
5684
5685 gmmR0MutexRelease(pGMM);
5686 return rc;
5687}
5688
5689
5690/**
5691 * VMMR0 request wrapper for GMMR0QueryStatistics.
5692 *
5693 * @returns see GMMR0QueryStatistics.
5694 * @param pGVM The global (ring-0) VM structure. Optional.
5695 * @param pReq Pointer to the request packet.
5696 */
5697GMMR0DECL(int) GMMR0QueryStatisticsReq(PGVM pGVM, PGMMQUERYSTATISTICSSREQ pReq)
5698{
5699 /*
5700 * Validate input and pass it on.
5701 */
5702 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5703 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5704
5705 return GMMR0QueryStatistics(&pReq->Stats, pReq->pSession, pGVM);
5706}
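
/*
 * Illustrative ring-3 statistics query (a minimal sketch; the VMMR3CallR0 signature,
 * the VMMR0_DO_GMM_QUERY_STATISTICS operation name and the use of pVM->pSession are
 * assumptions made for illustration):
 *
 * @code
 *  GMMQUERYSTATISTICSSREQ Req;
 *  RT_ZERO(Req);
 *  Req.Hdr.u32Magic = SUPVMMR0REQHDR_MAGIC;
 *  Req.Hdr.cbReq    = sizeof(Req);
 *  Req.pSession     = pVM->pSession;
 *  int rc = VMMR3CallR0(pVM, VMMR0_DO_GMM_QUERY_STATISTICS, 0, &Req.Hdr);
 *  if (RT_SUCCESS(rc))
 *      LogRel(("GMM: %RU64 of %RU64 pages allocated, %RU64 shared\n",
 *              Req.Stats.cAllocatedPages, Req.Stats.cMaxPages, Req.Stats.cSharedPages));
 * @endcode
 */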
5707
5708
5709/**
5710 * Resets the specified GMM statistics.
5711 *
5712 * @returns VBox status code.
5713 *
5714 * @param pStats Which statistics to reset; non-zero fields indicate
5715 * the ones to reset.
5716 * @param pSession The current session.
5717 * @param pGVM The GVM to reset statistics for. Optional.
5718 */
5719GMMR0DECL(int) GMMR0ResetStatistics(PCGMMSTATS pStats, PSUPDRVSESSION pSession, PGVM pGVM)
5720{
5721 NOREF(pStats); NOREF(pSession); NOREF(pGVM);
5722 /* Nothing to reset at the moment. */
5723 return VINF_SUCCESS;
5724}
5725
5726
5727/**
5728 * VMMR0 request wrapper for GMMR0ResetStatistics.
5729 *
5730 * @returns see GMMR0ResetStatistics.
5731 * @param pGVM The global (ring-0) VM structure. Optional.
5732 * @param pReq Pointer to the request packet.
5733 */
5734GMMR0DECL(int) GMMR0ResetStatisticsReq(PGVM pGVM, PGMMRESETSTATISTICSSREQ pReq)
5735{
5736 /*
5737 * Validate input and pass it on.
5738 */
5739 AssertPtrReturn(pReq, VERR_INVALID_POINTER);
5740 AssertMsgReturn(pReq->Hdr.cbReq == sizeof(*pReq), ("%#x != %#x\n", pReq->Hdr.cbReq, sizeof(*pReq)), VERR_INVALID_PARAMETER);
5741
5742 return GMMR0ResetStatistics(&pReq->Stats, pReq->pSession, pGVM);
5743}
5744