Changeset 4511 in vbox for trunk/src/VBox/VMM
Timestamp: Sep 4, 2007 10:17:48 AM
Files: 1 edited
trunk/src/VBox/VMM/PGM.cpp
r4388 r4511
 */


/** @page pg_pgmPhys PGMPhys - Physical Guest Memory Management.
 *
 *
 * Objectives:
 *      - Guest RAM over-commitment using memory ballooning,
 *        zero pages and general page sharing.
 *      - Moving or mirroring a VM onto a different physical machine.
 *
 *
 * @subsection subsec_pg_pgmPhys_AllocPage Allocating a page.
 *
 * Initially we map *all* guest memory to the (per VM) zero page, which
 * means that none of the read functions will cause pages to be allocated.
 *
 * Exception: the access bit in page tables that have been shared. This must
 * be handled, but we must also make sure PGMGst*Modify doesn't make
 * unnecessary modifications.
 *
 * Allocation points:
 *      - PGMPhysWriteGCPhys and PGMPhysWrite.
 *      - Replacing a zero page mapping at \#PF.
 *      - Replacing a shared page mapping at \#PF.
 *      - ROM registration (currently MMR3RomRegister).
 *      - VM restore (pgmR3Load).
 *
 * For the first three it would make sense to keep a few pages handy
 * until we've reached the max memory commitment for the VM.
 *
 * For the ROM registration, we know exactly how many pages we need
 * and will request these from ring-0. For restore, we will save
 * the number of non-zero pages in the saved state and allocate
 * them up front. This would allow the ring-0 component to refuse
 * the request if there isn't sufficient memory available for VM use.
 *
 * Btw. for both ROM and restore allocations we won't be requiring
 * zeroed pages as they are going to be filled instantly.
 *
 *
 * @subsection subsec_pgmPhys_FreePage Freeing a page
 *
 * There are a few points where a page can be freed:
 *      - After being replaced by the zero page.
 *      - After being replaced by a shared page.
 *      - After being ballooned by the guest additions.
 *      - At reset.
 *      - At restore.
 *
 * When freeing one or more pages they will be returned to the ring-0
 * component and replaced by the zero page.
 *
 * The reasoning for clearing out all the pages on reset is that it will
 * return us to the exact same state as on power on, and may thereby help
 * us reduce the memory load on the system. Further, it might have a
 * (temporary) positive influence on memory fragmentation (@see subsec_pgmPhys_Fragmentation).
 *
 * On restore, as mentioned under the allocation topic, pages should be
 * freed / allocated depending on how many are actually required by the
 * new VM state. The simplest approach is to do as on reset, and free
 * all non-ROM pages and then allocate what we need.
 *
 * A measure to prevent some fragmentation would be to let each allocation
 * chunk have some affinity towards the VM having allocated the most pages
 * from it. Also, try to make sure to allocate from allocation chunks that
 * are almost full. Admittedly, both these measures might work counter to
 * our intentions and it's probably not worth putting a lot of effort,
 * cpu time or memory into this.
 *
 *
 * @subsection subsec_pgmPhys_SharePage Sharing a page
 *
 * The basic idea is that there will be an idle priority kernel
 * thread walking the non-shared VM pages, hashing them and looking for
 * pages with the same checksum. If such pages are found, it will compare
 * them byte-by-byte to see if they actually are identical. If found to be
 * identical it will allocate a shared page, copy the content, check that
 * the page didn't change while doing this, and finally request both the
 * VMs to use the shared page instead. If the page is all zeros (special
 * checksum and byte-by-byte check) it will request the VM that owns it
 * to replace it with the zero page.
 *
 * To make this efficient, we will have to make sure not to try to share a page
 * that will change its contents soon. This part requires the most work.
 * A simple idea would be to request the VM to write monitor the page for
 * a while to make sure it isn't modified any time soon. Also, it may
 * make sense to skip pages that are being write monitored since this
 * information is readily available to the thread if it works on the
 * per-VM guest memory structures (presently called PGMRAMRANGE).
 *
 *
 * @subsection subsec_pgmPhys_Fragmentation Fragmentation Concerns and Counter Measures
 *
 * The pages are organized in allocation chunks in ring-0; this is a necessity
 * if we wish to have an OS agnostic approach to this whole thing. (On Linux we
 * could easily work on a page-by-page basis if we liked. Whether this is possible
 * or efficient on NT I don't quite know.) Fragmentation within these chunks may
 * become a problem as part of the idea here is that we wish to return memory to
 * the host system.
 *
 * For instance, starting two VMs at the same time, they will both allocate the
 * guest memory on-demand and if permitted their page allocations will be
 * intermixed. Shut down one of the two VMs and it will be difficult to return
 * any memory to the host system because the page allocations for the two VMs are
 * mixed up in the same allocation chunks.
 *
 * To further complicate matters, when pages are freed because they have been
 * ballooned or become shared/zero the whole idea is that the page is supposed
 * to be reused by another VM or returned to the host system. This will cause
 * allocation chunks to contain pages belonging to different VMs and prevent
 * returning memory to the host when one of those VMs shuts down.
 *
 * The only way to really deal with this problem is to move pages. This can
 * be done at VM shutdown and/or by the idle priority worker thread
 * that will be responsible for finding sharable/zero pages. The mechanisms
 * involved for coercing a VM to move a page (or to do it for it) will be
 * the same as when telling it to share/zero a page.
 *
 *
 * @subsection subsec_pgmPhys_Serializing Tracking Structures And Their Cost
 *
 * There's a difficult balance between keeping the per-page tracking structures
 * (global and guest page) easy to use and keeping them from eating too much
 * memory. We have limited virtual memory resources available when operating in
 * 32-bit kernel space (on 64-bit it's quite a different story). We will
 * attempt to design the tracking structures such that we can deal with up
 * to 32GB of memory on a 32-bit system and essentially unlimited on 64-bit ones.
 *
 * ...
 *
 *
 * @subsection subsec_pgmPhys_Serializing Serializing Access
 *
 * Initially, we'll try a simple scheme:
 *
 *      - The per-VM RAM tracking structures (PGMRAMRANGE) are only modified
 *        by the EMT thread of that VM while in the pgm critsect.
 *      - Other threads in the VM process that need to make reliable use of
 *        the per-VM RAM tracking structures will enter the critsect.
 *      - No process external thread or kernel thread will ever try to enter
 *        the pgm critical section, as that just won't work.
 *      - The idle thread (and similar threads) don't need 100% reliable
 *        data when performing their tasks as the EMT thread will be the one to
 *        do the actual changes later anyway. So, as long as it only accesses
 *        the main ram range, it can do so by somehow preventing the VM from
 *        being destroyed while it works on it...
 *
 *      - The over-commitment management, including the allocating/freeing
 *        of chunks, is serialized by a ring-0 mutex lock (a fast one since the
 *        more mundane mutex implementation is broken on Linux).
 *      - A separate mutex is protecting the set of allocation chunks so
 *        that pages can be shared and/or freed up while some other VM is
 *        allocating more chunks. This mutex can be taken from under the other
 *        one, but not the other way around.
 *
 *
 * @subsection subsec_pgmPhys_Request VM Request interface
 *
 * When in ring-0 it will become necessary to send requests to a VM so it can
 * for instance move a page while defragmenting during VM destroy. The idle
 * thread will make use of this interface to request VMs to set up shared
 * pages and to perform write monitoring of pages.
 *
 * I would propose an interface similar to the current VMReq interface, similar
 * in that it doesn't require locking and that the one sending the request may
 * wait for completion if it wishes to. This shouldn't be very difficult to
 * realize.
 *
 * The requests themselves are also pretty simple. They are basically:
 *      -# Check that some precondition is still true.
 *      -# Do the update.
 *      -# Update all shadow page tables involved with the page.
 *
 * The 3rd step is identical to what we're already doing when updating a
 * physical handler, see pgmHandlerPhysicalSetRamFlagsAndFlushShadowPTs.
 *
 */


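The scan described under "Sharing a page" (hash each page, confirm candidate matches byte-by-byte, treat all-zero pages specially) can be sketched roughly as follows. This is an illustrative user-space sketch and not code from this changeset: the names Page, Action, Decision, hashPage, isZeroPage and scanPages are hypothetical, the 4 KiB page size is an assumption, and FNV-1a only stands in for whatever checksum ends up being used. The sketch leaves out the re-check that a page did not change while it was being copied, and it only records a decision instead of sending a request to the owning VM.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <vector>

constexpr std::size_t kPageSize = 4096;            // assumption: 4 KiB guest pages
using Page = std::array<std::uint8_t, kPageSize>;  // hypothetical stand-in for a guest page

enum class Action { Keep, UseZeroPage, Share };

struct Decision {
    Action action = Action::Keep;
    std::size_t shareWith = 0;  // index of the earlier identical page, valid when action == Share
};

// FNV-1a over the page contents; a placeholder checksum for illustration only.
std::uint64_t hashPage(const Page& page) {
    std::uint64_t h = 14695981039346656037ull;
    for (std::uint8_t b : page) {
        h ^= b;
        h *= 1099511628211ull;
    }
    return h;
}

// Byte-by-byte check for an all-zero page.
bool isZeroPage(const Page& page) {
    for (std::uint8_t b : page) {
        if (b != 0) return false;
    }
    return true;
}

// Walk the pages once and decide, per page, whether it could be replaced by
// the zero page, shared with an earlier identical page, or left alone.
std::vector<Decision> scanPages(const std::vector<Page>& pages) {
    std::vector<Decision> out(pages.size());
    std::unordered_multimap<std::uint64_t, std::size_t> seen;  // checksum -> page index

    for (std::size_t i = 0; i < pages.size(); ++i) {
        if (isZeroPage(pages[i])) {
            out[i].action = Action::UseZeroPage;
            continue;
        }
        const std::uint64_t h = hashPage(pages[i]);
        bool matched = false;
        auto range = seen.equal_range(h);
        for (auto it = range.first; it != range.second; ++it) {
            assert(it->second < i);  // only earlier pages are ever recorded
            // Equal checksums are only a hint; confirm with a full compare.
            if (std::memcmp(pages[it->second].data(), pages[i].data(), kPageSize) == 0) {
                out[i].action = Action::Share;
                out[i].shareWith = it->second;
                matched = true;
                break;
            }
        }
        if (!matched) seen.emplace(h, i);
    }
    return out;
}
```

In the design described in the comment, each Share or UseZeroPage decision would become a request to the owning VM, which re-checks its precondition before touching its own tracking structures.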