VirtualBox

Changeset 4511 in vbox for trunk


Timestamp:
Sep 4, 2007 10:17:48 AM
Author:
vboxsync
Message:

lunch commit.

File:
1 edited

  • trunk/src/VBox/VMM/PGM.cpp

r4388 r4511
 */

/** @page pg_pgmPhys PGMPhys - Physical Guest Memory Management.
 *
 *
 * Objectives:
 *      - Guest RAM over-commitment using memory ballooning,
 *        zero pages and general page sharing.
 *      - Moving or mirroring a VM onto a different physical machine.
 *
 *
 * @subsection subsec_pg_pgmPhys_AllocPage      Allocating a page
 *
 * Initially we map *all* guest memory to the (per VM) zero page, which
 * means that none of the read functions will cause pages to be allocated.
 *
 * One exception: the accessed bit in page tables that have been shared.
 * This must be handled, but we must also make sure PGMGst*Modify doesn't
 * make unnecessary modifications.
 *
 * Allocation points:
 *      - PGMPhysWriteGCPhys and PGMPhysWrite.
 *      - Replacing a zero page mapping at \#PF.
 *      - Replacing a shared page mapping at \#PF.
 *      - ROM registration (currently MMR3RomRegister).
 *      - VM restore (pgmR3Load).
 *
 * For the first three it would make sense to keep a few pages handy
 * until we've reached the max memory commitment for the VM.
 *
 * For the ROM registration, we know exactly how many pages we need
 * and will request these from ring-0. For restore, we will save
 * the number of non-zero pages in the saved state and allocate
 * them up front. This would allow the ring-0 component to refuse
 * the request if there isn't sufficient memory available for VM use.
 *
 * Btw. for both ROM and restore allocations we won't be requiring
 * zeroed pages as they are going to be filled instantly.
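 *
 * To illustrate the write-fault path, here is a rough sketch; the helper
 * names are hypothetical, not the actual PGM API:
 * @code
 *  // All guest pages start out mapped to the shared zero page; on the
 *  // first write we swap in a freshly allocated page.
 *  int pgmPhysHandleWriteFault(PVM pVM, RTGCPHYS GCPhys)
 *  {
 *      PPGMPAGE pPage = pgmPhysGetPage(pVM, GCPhys);       // hypothetical lookup
 *      if (pgmPhysPageIsZero(pPage))                       // hypothetical check
 *      {
 *          // Prefer the small per-VM stash of handy pages so we don't
 *          // have to go to ring-0 on every fault.
 *          void *pvNew = pgmPhysTakeHandyPage(pVM);        // hypothetical
 *          if (!pvNew)
 *              pvNew = pgmPhysAllocPageRing0(pVM);         // may fail at the commit limit
 *          if (!pvNew)
 *              return VERR_NO_MEMORY;
 *          pgmPhysReplaceMapping(pVM, GCPhys, pvNew);      // hypothetical remap
 *      }
 *      return VINF_SUCCESS;
 *  }
 * @endcode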
 *
 *
 * @subsection subsec_pgmPhys_FreePage          Freeing a page
 *
 * There are a few points where a page can be freed:
 *      - After being replaced by the zero page.
 *      - After being replaced by a shared page.
 *      - After being ballooned by the guest additions.
 *      - At reset.
 *      - At restore.
 *
 * When freeing one or more pages they will be returned to the ring-0
 * component and replaced by the zero page.
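 *
 * Sketched with hypothetical helpers (not the actual interfaces):
 * @code
 *  // Hand a page back to ring-0 and point the guest mapping at the
 *  // zero page again, e.g. after ballooning or page sharing.
 *  void pgmPhysFreePage(PVM pVM, RTGCPHYS GCPhys)
 *  {
 *      void *pvPage = pgmPhysDetachPage(pVM, GCPhys);      // hypothetical
 *      pgmPhysMapZeroPage(pVM, GCPhys);                    // hypothetical
 *      pgmPhysReturnPageRing0(pVM, pvPage);                // hypothetical
 *  }
 * @endcode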
 *
 * The reasoning for clearing out all the pages on reset is that it will
 * return us to the exact same state as on power on, and may thereby help
 * us reduce the memory load on the system. Further it might have a
 * (temporary) positive influence on memory fragmentation (see @ref subsec_pgmPhys_Fragmentation).
 *
 * On restore, as mentioned under the allocation topic, pages should be
 * freed / allocated depending on how many are actually required by the
 * new VM state. The simplest approach is to do like on reset, and free
 * all non-ROM pages and then allocate what we need.
 *
 * A measure to prevent some fragmentation would be to let each allocation
 * chunk have some affinity towards the VM having allocated the most pages
 * from it. Also, try to make sure to allocate from allocation chunks that
 * are almost full. Admittedly, both these measures might work counter to
 * our intentions and it's probably not worth putting a lot of effort,
 * CPU time or memory into this.
 *
 *
 * @subsection subsec_pgmPhys_SharePage         Sharing a page
 *
 * The basic idea is that there will be an idle-priority kernel
 * thread walking the non-shared VM pages, hashing them and looking for
 * pages with the same checksum. If such pages are found, it will compare
 * them byte-by-byte to see if they actually are identical. If found to be
 * identical it will allocate a shared page, copy the content, check that
 * the page didn't change while doing this, and finally request both the
 * VMs to use the shared page instead. If the page is all zeros (special
 * checksum and byte-by-byte check) it will request the VM that owns it
 * to replace it with the zero page.
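 *
 * A sketch of that scan step, again with hypothetical names:
 * @code
 *  // Idle-priority worker: hash the candidate page, look for an earlier
 *  // page with the same checksum, and only share after a full byte
 *  // compare confirms the match.
 *  void pgmPhysShareScanPage(PVM pVM, PPGMPAGE pPage)
 *  {
 *      uint32_t uChecksum = pgmPhysHashPage(pPage);        // hypothetical
 *      if (pgmPhysPageIsAllZeros(pPage))                   // hypothetical
 *      {
 *          pgmPhysRequestZeroPage(pVM, pPage);             // owner remaps to zero page
 *          return;
 *      }
 *      PPGMPAGE pMatch = pgmPhysShareLookup(uChecksum);    // hypothetical hash table
 *      if (pMatch && pgmPhysPageCompare(pMatch, pPage))    // identical bytes?
 *      {
 *          // Copy the content, re-check that nothing changed underneath
 *          // us, then ask both VMs to switch to the shared copy.
 *          PPGMPAGE pShared = pgmPhysShareAlloc(pMatch);   // hypothetical
 *          if (pgmPhysPageCompare(pShared, pPage))
 *              pgmPhysRequestSharePage(pVM, pPage, pShared);
 *      }
 *      else
 *          pgmPhysShareInsert(uChecksum, pPage);           // remember for later
 *  }
 * @endcode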
 *
 * To make this efficient, we will have to make sure not to try to share a
 * page that will change its contents soon. This part requires the most work.
 * A simple idea would be to request the VM to write monitor the page for
 * a while to make sure it isn't modified any time soon. Also, it may
 * make sense to skip pages that are being write monitored since this
 * information is readily available to the thread if it works on the
 * per-VM guest memory structures (presently called PGMRAMRANGE).
 *
 *
 * @subsection subsec_pgmPhys_Fragmentation     Fragmentation Concerns and Counter Measures
 *
 * The pages are organized in allocation chunks in ring-0; this is a necessity
 * if we wish to have an OS agnostic approach to this whole thing. (On Linux we
 * could easily work on a page-by-page basis if we liked. Whether this is possible
 * or efficient on NT I don't quite know.) Fragmentation within these chunks may
 * become a problem as part of the idea here is that we wish to return memory to
 * the host system.
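 *
 * For illustration only, a chunk might look something like this (the
 * layout is hypothetical, not the actual ring-0 structure):
 * @code
 *  #define PGMPHYS_CHUNK_NUM_PAGES  512    // e.g. 2MB chunks of 4KB pages
 *  typedef struct PGMPHYSCHUNK             // hypothetical
 *  {
 *      RTR0PTR              pvPages;       // the backing host memory
 *      uint32_t             cFreePages;    // pages not currently handed out
 *      uint32_t             idPreferredVM; // affinity hint (see the free-page subsection)
 *      uint64_t             bmAllocated[PGMPHYS_CHUNK_NUM_PAGES / 64]; // allocation bitmap
 *      struct PGMPHYSCHUNK *pNext;         // chunk list
 *  } PGMPHYSCHUNK;
 * @endcode
 * A chunk can only be unmapped and returned to the host once every page
 * in it is free, which is exactly why cross-VM intermixing hurts.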
 *
 * For instance, starting two VMs at the same time, they will both allocate the
 * guest memory on-demand and if permitted their page allocations will be
 * intermixed. Shut down one of the two VMs and it will be difficult to return
 * any memory to the host system because the page allocations for the two VMs
 * are mixed up in the same allocation chunks.
 *
 * To further complicate matters, when pages are freed because they have been
 * ballooned or become shared/zero the whole idea is that the page is supposed
 * to be reused by another VM or returned to the host system. This will cause
 * allocation chunks to contain pages belonging to different VMs and prevent
 * returning memory to the host when one of those VMs shuts down.
 *
 * The only way to really deal with this problem is to move pages. This can
 * either be done at VM shutdown and/or by the idle priority worker thread
 * that will be responsible for finding sharable/zero pages. The mechanisms
 * involved for coercing a VM to move a page (or to do it for it) will be
 * the same as when telling it to share/zero a page.
 *
 *
 * @subsection subsec_pgmPhys_Tracking          Tracking Structures And Their Cost
 *
 * There's a difficult balance between keeping the per-page tracking structures
 * (global and guest page) easy to use and keeping them from eating too much
 * memory. We have limited virtual memory resources available when operating in
 * 32-bit kernel space (on 64-bit it's quite a different story). The tracking
 * structures will be designed such that we can deal with up to 32GB of memory
 * on a 32-bit system and essentially unlimited on 64-bit ones.
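 *
 * To put rough (illustrative) numbers on that budget: 32GB of guest RAM
 * is 8 million 4KB pages, so even a compact 8-byte per-page tracking
 * entry costs 64MB of kernel address space, and 16 bytes per page would
 * cost 128MB. (The actual entry size is still to be decided.)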
 *
 * ...
 *
 *
 * @subsection subsec_pgmPhys_Serializing       Serializing Access
 *
 * Initially, we'll try a simple scheme:
 *
 *      - The per-VM RAM tracking structures (PGMRAMRANGE) are only modified
 *        by the EMT thread of that VM while in the pgm critsect.
 *      - Other threads in the VM process that need to make reliable use of
 *        the per-VM RAM tracking structures will enter the critsect.
 *      - No process external thread or kernel thread will ever try to enter
 *        the pgm critical section, as that just won't work.
 *      - The idle thread (and similar threads) does not need 100% reliable
 *        data when performing its tasks as the EMT thread will be the one to
 *        do the actual changes later anyway. So, as long as it only accesses
 *        the main ram range, it can do so by somehow preventing the VM from
 *        being destroyed while it works on it...
 *
 *      - The over-commitment management, including the allocating/freeing
 *        chunks, is serialized by a ring-0 mutex lock (a fast one since the
 *        more mundane mutex implementation is broken on Linux).
 *      - A separate mutex protects the set of allocation chunks so
 *        that pages can be shared and/or freed up while some other VM is
 *        allocating more chunks. This mutex can be taken from under the other
 *        one, but not the other way around; see the lock-order sketch below.
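 *
 * The intended lock order, sketched with hypothetical lock and function
 * names:
 * @code
 *  // Correct order: the over-commitment lock first, the chunk mutex
 *  // second.  Taking them the other way around risks deadlock.
 *  void pgmR0PhysAllocMoreChunks(PVM pVM)                      // hypothetical
 *  {
 *      RTSemFastMutexRequest(g_hMtxOverCommit);                // outer (fast) lock
 *      RTSemMutexRequest(g_hMtxChunkSet, RT_INDEFINITE_WAIT);  // inner lock
 *      // ... allocate chunks / hand out pages ...
 *      RTSemMutexRelease(g_hMtxChunkSet);
 *      RTSemFastMutexRelease(g_hMtxOverCommit);
 *  }
 * @endcode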
 *
 *
 * @subsection subsec_pgmPhys_Request           VM Request Interface
 *
 * When in ring-0 it will become necessary to send requests to a VM so it can,
 * for instance, move a page while defragmenting during VM destruction. The idle
 * thread will make use of this interface to request VMs to set up shared
 * pages and to perform write monitoring of pages.
 *
 * I would propose an interface similar to the current VMReq interface, in
 * that it doesn't require locking and that the one sending the request may
 * wait for completion if it wishes to. This shouldn't be very difficult to
 * realize.
 *
 * The requests themselves are also pretty simple. They are basically:
 *      -# Check that some precondition is still true.
 *      -# Do the update.
 *      -# Update all shadow page tables involved with the page.
 *
 * The 3rd step is identical to what we're already doing when updating a
 * physical handler, see pgmHandlerPhysicalSetRamFlagsAndFlushShadowPTs.
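 *
 * Sketched as a request handler, with a hypothetical request type and
 * helpers:
 * @code
 *  // Runs on the target VM's EMT via the proposed VMReq-like interface.
 *  int pgmR3PhysReqReplacePage(PVM pVM, PPGMPHYSREPLACEREQ pReq)    // hypothetical
 *  {
 *      // 1. Re-check the precondition; the page may have been modified
 *      //    since the idle thread queued the request.
 *      if (!pgmPhysPageMatches(pVM, pReq->GCPhys, pReq->uChecksum)) // hypothetical
 *          return VINF_SUCCESS;                    // stale request, drop it
 *
 *      // 2. Do the update, i.e. swap in the shared/zero/moved page.
 *      pgmPhysReplaceMapping(pVM, pReq->GCPhys, pReq->pvNewPage);   // hypothetical
 *
 *      // 3. Update the shadow page tables, like
 *      //    pgmHandlerPhysicalSetRamFlagsAndFlushShadowPTs does.
 *      pgmPhysFlushShadowPTs(pVM, pReq->GCPhys);                    // hypothetical
 *      return VINF_SUCCESS;
 *  }
 * @endcode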
 *
 */