Changeset 4511 in vbox for trunk/src/VBox
Timestamp: Sep 4, 2007 10:17:48 AM
File: 1 edited
trunk/src/VBox/VMM/PGM.cpp
r4388 → r4511

/** @page pg_pgmPhys PGMPhys - Physical Guest Memory Management.
 *
 *
 * Objectives:
 *      - Guest RAM over-commitment using memory ballooning,
 *        zero pages and general page sharing.
 *      - Moving or mirroring a VM onto a different physical machine.
 *
 *
 * @subsection subsec_pg_pgmPhys_AllocPage Allocating a page.
 *
 * Initially we map *all* guest memory to the (per VM) zero page, which
 * means that none of the read functions will cause pages to be allocated.
 *
 * Exception: the access bit in page tables that have been shared. This must
 * be handled, but we must also make sure PGMGst*Modify doesn't make
 * unnecessary modifications.
 *
 * Allocation points:
 *      - PGMPhysWriteGCPhys and PGMPhysWrite.
 *      - Replacing a zero page mapping at \#PF.
 *      - Replacing a shared page mapping at \#PF.
 *      - ROM registration (currently MMR3RomRegister).
 *      - VM restore (pgmR3Load).
 *
 * For the first three it would make sense to keep a few pages handy
 * until we've reached the max memory commitment for the VM.
 *
 * For the ROM registration, we know exactly how many pages we need
 * and will request these from ring-0. For restore, we will save
 * the number of non-zero pages in the saved state and allocate
 * them up front. This allows the ring-0 component to refuse the
 * request if there isn't sufficient memory available for VM use.
 *
 * Btw. for both ROM and restore allocations we won't be requiring
 * zeroed pages as they are going to be filled instantly.
 *
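 * To make the on-demand scheme above concrete, here is a minimal,
 * self-contained sketch (not the actual PGM implementation; the SKETCHPAGE
 * type and the sketch* functions are illustrative placeholders): all guest
 * pages start out backed by the shared zero page, and a real page is only
 * allocated on the first write, which is also the point where an
 * over-committed allocation could be refused.
 *
 * @code
 *  #include <stdlib.h>
 *
 *  #define SKETCH_PAGE_SIZE 4096
 *
 *  // Hypothetical, simplified per-page tracking entry; the real PGMPAGE /
 *  // PGMRAMRANGE structures are considerably more involved.
 *  typedef struct SKETCHPAGE
 *  {
 *      void *pvPage;   // backing host page, or the shared zero page
 *      int   fZero;    // still mapped to the zero page?
 *  } SKETCHPAGE;
 *
 *  static unsigned char g_abZeroPage[SKETCH_PAGE_SIZE]; // the per-VM zero page
 *
 *  // All guest pages start out backed by the zero page, so reads never allocate.
 *  static void sketchPageInit(SKETCHPAGE *pPage)
 *  {
 *      pPage->pvPage = g_abZeroPage;
 *      pPage->fZero  = 1;
 *  }
 *
 *  // First write to a zero page: allocate a real (zeroed) page and switch the
 *  // mapping. calloc stands in for the ring-0 allocator, which may refuse the
 *  // request once the VM has reached its memory commitment.
 *  static int sketchPageMakeWritable(SKETCHPAGE *pPage)
 *  {
 *      if (!pPage->fZero)
 *          return 0;                           // already backed by a real page
 *      void *pvNew = calloc(1, SKETCH_PAGE_SIZE);
 *      if (!pvNew)
 *          return -1;                          // allocation refused
 *      pPage->pvPage = pvNew;
 *      pPage->fZero  = 0;
 *      return 0;                               // caller would now flush shadow PTEs
 *  }
 * @endcode
 *
 *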
 * @subsection subsec_pgmPhys_FreePage Freeing a page
 *
 * There are a few points where a page can be freed:
 *      - After being replaced by the zero page.
 *      - After being replaced by a shared page.
 *      - After being ballooned by the guest additions.
 *      - At reset.
 *      - At restore.
 *
 * When freeing one or more pages they will be returned to the ring-0
 * component and replaced by the zero page.
 *
 * The reasoning for clearing out all the pages on reset is that it will
 * return us to the exact same state as on power on, and may thereby help
 * us reduce the memory load on the system. Further, it might have a
 * (temporary) positive influence on memory fragmentation
 * (@see subsec_pgmPhys_Fragmentation).
 *
 * On restore, as mentioned under the allocation topic, pages should be
 * freed / allocated depending on how many are actually required by the
 * new VM state. The simplest approach is to do as on reset: free all
 * non-ROM pages and then allocate what we need.
 *
 * A measure to prevent some fragmentation would be to let each allocation
 * chunk have some affinity towards the VM that has allocated the most pages
 * from it. Also, try to make sure to allocate from allocation chunks that
 * are almost full. Admittedly, both these measures might work counter to
 * our intentions and it's probably not worth putting a lot of effort,
 * CPU time or memory into this.
 *
 *
 * @subsection subsec_pgmPhys_SharePage Sharing a page
 *
 * The basic idea is that there will be an idle priority kernel thread
 * walking the non-shared VM pages, hashing them and looking for pages
 * with the same checksum. If such pages are found, it will compare them
 * byte-by-byte to see if they actually are identical. If they are, it
 * will allocate a shared page, copy the content, check that the page
 * didn't change while doing this, and finally request both VMs to use
 * the shared page instead. If the page is all zeros (special checksum
 * and byte-by-byte check) it will request the VM that owns it to
 * replace it with the zero page.
 *
 * To make this efficient, we will have to make sure not to try to share
 * a page that will change its contents soon. This part requires the most
 * work. A simple idea would be to request the VM to write monitor the
 * page for a while, to make sure it isn't modified any time soon. Also,
 * it may make sense to skip pages that are being write monitored, since
 * this information is readily available to the thread if it works on the
 * per-VM guest memory structures (presently called PGMRAMRANGE).
 *
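 * As an illustration of the scan step (a self-contained sketch, not the
 * actual implementation; SKETCH_PAGE_SIZE, sketchPageChecksum and the other
 * sketch* names are made up for this example): a cheap checksum is used only
 * to find candidates, an all-zero page is special-cased, and identical
 * content is confirmed byte-by-byte before any request is sent to the
 * owning VM's EMT.
 *
 * @code
 *  #include <stdint.h>
 *  #include <string.h>
 *
 *  #define SKETCH_PAGE_SIZE 4096
 *
 *  // Cheap checksum used only to find sharing candidates; equal checksums
 *  // still require a byte-by-byte comparison before anything is shared.
 *  static uint32_t sketchPageChecksum(const uint8_t *pb)
 *  {
 *      uint32_t u32 = 0;
 *      for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
 *          u32 = u32 * 31 + pb[i];
 *      return u32;
 *  }
 *
 *  static int sketchPageIsAllZeros(const uint8_t *pb)
 *  {
 *      for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
 *          if (pb[i])
 *              return 0;
 *      return 1;
 *  }
 *
 *  // What the idle scanner would ask the owning EMT to do for a candidate
 *  // pair; the actual update is always left to that VM's EMT.
 *  typedef enum { SKETCH_NOTHING, SKETCH_USE_ZERO_PAGE, SKETCH_SHARE } SKETCHACTION;
 *
 *  static SKETCHACTION sketchClassifyPair(const uint8_t *pb1, const uint8_t *pb2)
 *  {
 *      if (sketchPageIsAllZeros(pb1))
 *          return SKETCH_USE_ZERO_PAGE;        // replace with the zero page
 *      if (   sketchPageChecksum(pb1) == sketchPageChecksum(pb2)
 *          && memcmp(pb1, pb2, SKETCH_PAGE_SIZE) == 0)
 *          return SKETCH_SHARE;                // identical: share one copy
 *      return SKETCH_NOTHING;
 *  }
 * @endcode
 *
 *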
 * @subsection subsec_pgmPhys_Fragmentation Fragmentation Concerns and Counter Measures
 *
 * The pages are organized in allocation chunks in ring-0; this is a necessity
 * if we wish to have an OS agnostic approach to this whole thing. (On Linux we
 * could easily work on a page-by-page basis if we liked. Whether this is possible
 * or efficient on NT I don't quite know.) Fragmentation within these chunks may
 * become a problem, as part of the idea here is that we wish to return memory to
 * the host system.
 *
 * For instance, if two VMs are started at the same time, they will both allocate
 * their guest memory on-demand and, if permitted, their page allocations will be
 * intermixed. Shut down one of the two VMs and it will be difficult to return
 * any memory to the host system because the page allocations for the two VMs are
 * mixed up in the same allocation chunks.
 *
 * To further complicate matters, when pages are freed because they have been
 * ballooned or become shared/zero, the whole idea is that the page is supposed
 * to be reused by another VM or returned to the host system. This will cause
 * allocation chunks to contain pages belonging to different VMs and prevent
 * returning memory to the host when one of those VMs shuts down.
 *
 * The only way to really deal with this problem is to move pages. This can
 * either be done at VM shutdown and/or by the idle priority worker thread
 * that will be responsible for finding sharable/zero pages. The mechanisms
 * involved in coercing a VM to move a page (or in doing it for it) will be
 * the same as when telling it to share/zero a page.
 *
 *
 * @subsection subsec_pgmPhys_Tracking Tracking Structures And Their Cost
 *
 * There's a difficult balance between keeping the per-page tracking structures
 * (global and guest page) easy to use and keeping them from eating too much
 * memory. We have limited virtual memory resources available when operating in
 * 32-bit kernel space (on 64-bit it's quite a different story). The tracking
 * structures will be designed such that we can deal with up to 32GB of memory
 * on a 32-bit system and essentially unlimited amounts on 64-bit ones.
 *
 * ...
 *
 *
 * @subsection subsec_pgmPhys_Serializing Serializing Access
 *
 * Initially, we'll try a simple scheme:
 *
 *      - The per-VM RAM tracking structures (PGMRAMRANGE) are only modified
 *        by the EMT thread of that VM while in the pgm critsect.
 *      - Other threads in the VM process that need to make reliable use of
 *        the per-VM RAM tracking structures will enter the critsect.
 *      - No process external thread or kernel thread will ever try to enter
 *        the pgm critical section, as that just won't work.
 *      - The idle thread (and similar threads) doesn't need 100% reliable
 *        data when performing its tasks, as the EMT thread will be the one
 *        to do the actual changes later anyway. So, as long as it only
 *        accesses the main ram range, it can do so by somehow preventing the
 *        VM from being destroyed while it works on it...
 *
 *      - The over-commitment management, including the allocating/freeing of
 *        chunks, is serialized by a ring-0 mutex lock (a fast one since the
 *        more mundane mutex implementation is broken on Linux).
 *      - A separate mutex protects the set of allocation chunks so
 *        that pages can be shared and/or freed up while some other VM is
 *        allocating more chunks. This mutex can be taken while the other one
 *        is held, but not the other way around.
 *
 *
 * @subsection subsec_pgmPhys_Request VM Request interface
 *
 * When in ring-0 it will become necessary to send requests to a VM so it can,
 * for instance, move a page while defragmenting during VM destruction. The idle
 * thread will make use of this interface to request VMs to set up shared
 * pages and to perform write monitoring of pages.
 *
 * I would propose an interface similar to the current VMReq interface, in
 * that it doesn't require locking and that the one sending the request may
 * wait for completion if it wishes to. This shouldn't be very difficult to
 * realize.
 *
 * The requests themselves are also pretty simple. They are basically:
 *      -# Check that some precondition is still true.
 *      -# Do the update.
 *      -# Update all shadow page tables involved with the page.
 *
 * The 3rd step is identical to what we're already doing when updating a
 * physical handler, see pgmHandlerPhysicalSetRamFlagsAndFlushShadowPTs.
 */
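The three request steps map naturally onto a small handler run on the owning
VM's EMT. The following is a self-contained sketch under that reading, not the
actual VMReq/PGM interface; the SKETCHVM and SKETCHPAGEREQ types and the
function pointer fields are hypothetical placeholders for the real routines.

#include <stdint.h>

/* Hypothetical request payload; the function pointers stand in for the real
   PGM routines so the sketch stays self-contained. */
typedef struct SKETCHVM SKETCHVM;   /* opaque stand-in for the VM handle */
typedef struct SKETCHPAGEREQ
{
    uint64_t    GCPhys;                                     /* page to operate on */
    uint32_t    u32Checksum;                                /* checksum seen by the scanner */
    uint32_t  (*pfnChecksum)(SKETCHVM *, uint64_t);         /* step 1: re-check precondition */
    void      (*pfnUpdate)(SKETCHVM *, uint64_t);           /* step 2: the update itself */
    int       (*pfnFlushShadowPTs)(SKETCHVM *, uint64_t);   /* step 3: fix up shadow PTs */
} SKETCHPAGEREQ;

/* Executed on the owning VM's EMT, inside the PGM critical section, in the
   same spirit as a VMReq handler. */
static int sketchPageRequestHandler(SKETCHVM *pVM, const SKETCHPAGEREQ *pReq)
{
    /* 1) Precondition: the page must not have changed since it was scanned. */
    if (pReq->pfnChecksum(pVM, pReq->GCPhys) != pReq->u32Checksum)
        return -1;                          /* give up or retry later */

    /* 2) Do the update (share the page, replace it with the zero page, ...). */
    pReq->pfnUpdate(pVM, pReq->GCPhys);

    /* 3) Update every shadow page table mapping the page, analogous to what
          pgmHandlerPhysicalSetRamFlagsAndFlushShadowPTs does for handlers. */
    return pReq->pfnFlushShadowPTs(pVM, pReq->GCPhys);
}

As with VMReq, waiting for completion would simply mean blocking the requester
until this handler returns its status.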