1 | /* $Id: VBox-CodingGuidelines.cpp 71660 2018-04-04 15:14:28Z vboxsync $ */
|
---|
2 | /** @file
|
---|
3 | * VBox - Coding Guidelines.
|
---|
4 | */
|
---|
5 |
|
---|
6 | /*
|
---|
7 | * Copyright (C) 2006-2017 Oracle Corporation
|
---|
8 | *
|
---|
9 | * This file is part of VirtualBox Open Source Edition (OSE), as
|
---|
10 | * available from http://www.virtualbox.org. This file is free software;
|
---|
11 | * you can redistribute it and/or modify it under the terms of the GNU
|
---|
12 | * General Public License (GPL) as published by the Free Software
|
---|
13 | * Foundation, in version 2 as it comes in the "COPYING" file of the
|
---|
14 | * VirtualBox OSE distribution. VirtualBox OSE is distributed in the
|
---|
15 | * hope that it will be useful, but WITHOUT ANY WARRANTY of any kind.
|
---|
16 | */
|
---|
17 |
|
---|
18 | /** @page pg_vbox_guideline VBox Coding Guidelines
|
---|
19 | *
|
---|
20 | * The VBox Coding guidelines are followed by all of VBox with the exception of
|
---|
21 | * qemu. Qemu is using whatever the frenchman does.
|
---|
22 | *
|
---|
23 | * There are a few compulsory rules and a bunch of optional ones. The following
|
---|
24 | * sections will describe these in detail. In addition there is a section of
|
---|
25 | * Subversion 'rules'.
|
---|
26 | *
|
---|
27 | *
|
---|
28 | *
|
---|
29 | * @section sec_vbox_guideline_compulsory Compulsory
|
---|
30 | *
|
---|
31 | * <ul>
|
---|
32 | *
|
---|
33 | * <li> The indentation size is 4 chars.
|
---|
34 | *
|
---|
35 | * <li> Tabs are only ever used in makefiles.
|
---|
36 | *
|
---|
37 | * <li> Use RT and VBOX types.
|
---|
38 | *
|
---|
39 | * <li> Use Runtime functions.
|
---|
40 | *
|
---|
41 | * <li> Use the standard bool, uintptr_t, intptr_t and [u]int[1-9]+_t types.
|
---|
42 | *
|
---|
43 | * <li> Avoid using plain unsigned and int.
|
---|
44 | *
|
---|
45 | * <li> Use static wherever possible. This makes the namespace less polluted
|
---|
46 | * and avoids nasty name clash problems which can occur, especially on
|
---|
47 | * Unix-like systems. (1) It also simplifies locating callers when
|
---|
48 | * changing it (single source file vs entire VBox tree).
|
---|
49 | *
|
---|
50 | * <li> Public names are of the form Domain[Subdomain[]]Method, using mixed
|
---|
51 | * casing to mark the words. The main domain is all uppercase.
|
---|
52 | * (Think like java, mapping domain and subdomain to packages/classes.)
|
---|
53 | *
|
---|
54 | * <li> Public names are always declared using the appropriate DECL macro. (2)
|
---|
55 | *
|
---|
56 | * <li> Internal names start with a lowercased main domain.
|
---|
57 | *
|
---|
58 | * <li> Defines are all uppercase and separate words with underscore.
|
---|
59 | * This applies to enum values too.
|
---|
60 | *
|
---|
61 | * <li> Typedefs are all uppercase and contain no underscores to distinguish
|
---|
62 | * them from defines.
|
---|
63 | *
|
---|
64 | * <li> Pointer typedefs start with 'P'. If pointer to const then 'PC'.
|
---|
65 | *
|
---|
66 | * <li> Function typedefs start with 'FN'. If pointer to FN then 'PFN'.
|
---|
67 | *
|
---|
68 | * <li> All files are case sensitive.
|
---|
69 | *
|
---|
70 | * <li> Slashes are unix slashes ('/'); the runtime converts them when necessary.
|
---|
71 | *
|
---|
72 | * <li> char strings are UTF-8.
|
---|
73 | *
|
---|
74 | * <li> Strings from any external source must be treated with utmost care as
|
---|
75 | * they do not have to be valid UTF-8. Only trust internal strings.
|
---|
76 | *
|
---|
77 | * <li> All functions return VBox status codes. There are three general
|
---|
78 | * exceptions from this:
|
---|
79 | *
|
---|
80 | * <ol>
|
---|
81 | * <li>Predicate functions. These are functions which are boolean in
|
---|
82 | * nature and usage. They return bool. The function name will
|
---|
83 | * include 'Has', 'Is' or similar.
|
---|
84 | * <li>Functions which by nature cannot possibly fail.
|
---|
85 | * These return void.
|
---|
86 | * <li>"Get"-functions which return what they ask for.
|
---|
87 | * A get function becomes a "Query" function if there is any
|
---|
88 | * doubt about getting what is asked for.
|
---|
89 | * </ol>
|
---|
90 | *
|
---|
91 | * <li> VBox status codes have three subdivisions:
|
---|
92 | * <ol>
|
---|
93 | * <li> Errors, which are VERR_ prefixed and negative.
|
---|
94 | * <li> Warnings, which are VWRN_ prefixed and positive.
|
---|
95 | * <li> Informational, which are VINF_ prefixed and positive.
|
---|
96 | * </ol>
|
---|
97 | *
|
---|
98 | * <li> Platform/OS operations are generalized and put in the IPRT.
|
---|
99 | *
|
---|
100 | * <li> Other useful constructs are also put in the IPRT.
|
---|
101 | *
|
---|
102 | * <li> The code shall not cause compiler warnings. Check this on ALL
|
---|
103 | * the platforms.
|
---|
104 | *
|
---|
105 | * <li> The use of symbols leading with single or double underscores is
|
---|
106 | * forbidden as that intrudes on reserved compiler/system namespace. (3)
|
---|
107 | *
|
---|
108 | * <li> All files have file headers with $Id and a file tag which describes
|
---|
109 | * the file in a sentence or two.
|
---|
110 | * Note: Use the svn-ps.cmd/svn-ps.sh utility with the -a option to add
|
---|
111 | * new sources with keyword expansion and exporting correctly
|
---|
112 | * configured.
|
---|
113 | *
|
---|
114 | * <li> All public functions are fully documented in Doxygen style using the
|
---|
115 | * javadoc dialect (using the 'at' instead of the 'backslash' as
|
---|
116 | * command prefix.)
|
---|
117 | *
|
---|
118 | * <li> All structures in header files are described, including all their
|
---|
119 | * members. (Doxygen style, of course.)
|
---|
120 | *
|
---|
121 | * <li> All modules have a documentation '\@page' in the main source file
|
---|
122 | * which describes the intent and actual implementation.
|
---|
123 | *
|
---|
124 | * <li> Code which is doing things that are not immediately comprehensible
|
---|
125 | * shall include explanatory comments.
|
---|
126 | *
|
---|
127 | * <li> Documentation and comments are kept up to date.
|
---|
128 | *
|
---|
129 | * <li> Headers in /include/VBox shall not contain any slash-slash C++
|
---|
130 | * comments, only ANSI C comments!
|
---|
131 | *
|
---|
132 | * <li> A comment on \#else indicates what begins while the comment on a
|
---|
133 | * \#endif indicates what ended. Only add these when there are more than
|
---|
134 | * a few lines (6-10) of \#ifdef'ed code, otherwise they're just clutter.
|
---|
135 | *
|
---|
136 | * <li> No 'else' after if block ending with 'return', 'break', or 'continue'.
|
---|
137 | *
|
---|
138 | * </ul>
|
---|
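 * The status code and function kind rules above can be sketched as follows.
 * This is only an illustration: every DEMO/demo name, including the status
 * code values and the DEMO_SUCCESS/DEMO_FAILURE macros, is a hypothetical
 * stand-in so that the sketch compiles without VBox headers. Only the sign
 * convention and the predicate / Get / Query split are taken from the rules.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-ins for VBox status codes. Only the sign convention
// (errors negative, warnings and informational codes positive) comes from
// the guidelines; the names and values are invented for this sketch.
#define DEMO_VINF_SUCCESS           0
#define DEMO_VWRN_TRUNCATED         1
#define DEMO_VERR_INVALID_ARGUMENT  (-1)
#define DEMO_VERR_NOT_FOUND         (-2)

// Stand-ins for the success/failure tests: anything non-negative is a success.
#define DEMO_SUCCESS(rc)            ((rc) >= DEMO_VINF_SUCCESS)
#define DEMO_FAILURE(rc)            (!DEMO_SUCCESS(rc))

typedef struct DEMOTABLE
{
    uint32_t    cEntries;       // Number of valid entries in aValues.
    uint32_t    aValues[4];     // The entries.
} DEMOTABLE;
typedef DEMOTABLE const *PCDEMOTABLE;

// Predicate: boolean in nature, returns bool, name contains 'Is'.
static bool demoTableIsEmpty(PCDEMOTABLE pTable)
{
    return pTable->cEntries == 0;
}

// "Get" function: there is no doubt about getting the value, so it is
// returned directly.
static uint32_t demoTableGetCount(PCDEMOTABLE pTable)
{
    return pTable->cEntries;
}

// "Query" function: the lookup may fail, so a status code is returned and
// the result is handed back through an output parameter.
static int demoTableQueryValue(PCDEMOTABLE pTable, uint32_t idx, uint32_t *puValue)
{
    assert(pTable != NULL);
    if (!puValue)
        return DEMO_VERR_INVALID_ARGUMENT;
    if (idx >= pTable->cEntries)
        return DEMO_VERR_NOT_FOUND;
    *puValue = pTable->aValues[idx];
    return DEMO_VINF_SUCCESS;
}
```

 * A caller would test the result with the success macro rather than comparing
 * against zero, since warnings and informational codes are also successes.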
139 | *
|
---|
140 | * (1) It is common practice on Unix to have a single symbol namespace for an
|
---|
141 | * entire process. If one is careless symbols might be resolved in a
|
---|
142 | * different way than one expects, leading to weird problems.
|
---|
143 | *
|
---|
144 | * (2) This is common practice among most projects dealing with modules in
|
---|
145 | * shared libraries. The Windows / PE __declspec(dllimport) and
|
---|
146 | * __declspec(dllexport) constructs are the main reason for this.
|
---|
147 | * OTOH, we do perhaps have a bit too detailed graining of this in VMM...
|
---|
148 | *
|
---|
149 | * (3) There are guys out there grepping public sources for symbols leading with
|
---|
150 | * single and double underscores as well as gotos and other things
|
---|
151 | * considered bad practice. They'll post statistics on how bad our sources
|
---|
152 | * are on some mailing list, forum or similar.
|
---|
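 * As a rough illustration of the naming rules in the compulsory list, the
 * sketch below puts defines, typedefs, pointer typedefs, function typedefs,
 * an internal name and a public name side by side. "DEMO" and "List" are
 * made-up domain and subdomain names, and the DECL macro that a real public
 * declaration needs is deliberately left out to keep the example standalone.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Everything named DEMO/demo below is hypothetical; "DEMO" plays the role of
// the main domain and "List" the subdomain.

// Defines (and enum values): all uppercase, words separated by underscores.
#define DEMO_LIST_MAX_ITEMS     8

// Typedefs: all uppercase, no underscores.
typedef struct DEMOLIST
{
    uint32_t    cItems;                         // Number of items in use.
    uint32_t    aItems[DEMO_LIST_MAX_ITEMS];    // The items.
} DEMOLIST;
typedef DEMOLIST *PDEMOLIST;                    // 'P':  pointer typedef.
typedef DEMOLIST const *PCDEMOLIST;             // 'PC': pointer to const.

// Function typedefs start with 'FN', pointers to them with 'PFN'.
typedef bool FNDEMOLISTFILTER(uint32_t uItem);
typedef FNDEMOLISTFILTER *PFNDEMOLISTFILTER;

// Internal name: lowercased main domain, and static so that the symbol stays
// local to this source file.
static bool demoListIsOdd(uint32_t uItem)
{
    return (uItem & 1) != 0;
}

// Public name: DomainSubdomainMethod with the main domain in uppercase. Real
// code would also use the appropriate DECL macro in the declaration.
uint32_t DEMOListCount(PCDEMOLIST pList, PFNDEMOLISTFILTER pfnFilter)
{
    assert(pList != NULL && pfnFilter != NULL);
    uint32_t cMatches = 0;
    for (uint32_t i = 0; i < pList->cItems; i++)
        if (pfnFilter(pList->aItems[i]))
            cMatches++;
    return cMatches;
}
```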
153 | *
|
---|
154 | *
|
---|
155 | * @subsection sec_vbox_guideline_compulsory_sub64 64-bit and 32-bit
|
---|
156 | *
|
---|
157 | * Here are some amendments which address 64-bit vs. 32-bit portability issues.
|
---|
158 | *
|
---|
159 | * Some facts first:
|
---|
160 | *
|
---|
161 | * <ul>
|
---|
162 | *
|
---|
163 | * <li> On 64-bit Windows the type long remains 32-bit. On nearly all other
|
---|
164 | * 64-bit platforms long is 64-bit.
|
---|
165 | *
|
---|
166 | * <li> On all 64-bit platforms we care about, int is 32-bit, short is 16-bit
|
---|
167 | * and char is 8-bit.
|
---|
168 | * (I don't know about any platforms yet where this isn't true.)
|
---|
169 | *
|
---|
170 | * <li> size_t, ssize_t, uintptr_t, ptrdiff_t and similar are all 64-bit on
|
---|
171 | * 64-bit platforms. (These are 32-bit on 32-bit platforms.)
|
---|
172 | *
|
---|
173 | * <li> There is no inline assembly support in the 64-bit Microsoft compilers.
|
---|
174 | *
|
---|
175 | * </ul>
|
---|
176 | *
|
---|
177 | * Now for the guidelines:
|
---|
178 | *
|
---|
179 | * <ul>
|
---|
180 | *
|
---|
181 | * <li> Never, ever, use int, long, ULONG, LONG, DWORD or similar to cast a
|
---|
182 | * pointer to integer. Use uintptr_t or intptr_t. If you have to use
|
---|
183 | * NT/Windows types, there is the choice of ULONG_PTR and DWORD_PTR.
|
---|
184 | *
|
---|
185 | * <li> Avoid wherever possible the use of the types 'long' and 'unsigned
|
---|
186 | * long' as these differ in size between windows and the other hosts
|
---|
187 | * (see above).
|
---|
188 | *
|
---|
189 | * <li> RT_OS_WINDOWS is defined to indicate Windows. Do not use __WIN32__,
|
---|
190 | * __WIN64__ and __WIN__ because they are all deprecated and scheduled
|
---|
191 | * for removal (if not removed already). Do not use the compiler
|
---|
192 | * defined _WIN32, _WIN64, or similar either. The bitness can be
|
---|
193 | * determined by testing ARCH_BITS.
|
---|
194 | * Example:
|
---|
195 | * @code
|
---|
196 | * #ifdef RT_OS_WINDOWS
|
---|
197 | * // call win32/64 api.
|
---|
198 | * #endif
|
---|
199 | * #ifdef RT_OS_WINDOWS
|
---|
200 | * # if ARCH_BITS == 64
|
---|
201 | * // call win64 api.
|
---|
202 | * # else // ARCH_BITS == 32
|
---|
203 | * // call win32 api.
|
---|
204 | * # endif // ARCH_BITS == 32
|
---|
205 | * #else // !RT_OS_WINDOWS
|
---|
206 | * // call posix api
|
---|
207 | * #endif // !RT_OS_WINDOWS
|
---|
208 | * @endcode
|
---|
209 | *
|
---|
210 | * <li> There are RT_OS_xxx defines for each OS, just like RT_OS_WINDOWS
|
---|
211 | * mentioned above. Use these defines instead of any predefined
|
---|
212 | * compiler stuff or defines from system headers.
|
---|
213 | *
|
---|
214 | * <li> RT_ARCH_X86 is defined when compiling for the x86 architecture.
|
---|
215 | * Do not use __x86__, __X86__, __[Ii]386__, __[Ii]586__, or similar
|
---|
216 | * for this purpose.
|
---|
217 | *
|
---|
218 | * <li> RT_ARCH_AMD64 is defined when compiling for the AMD64 architecture.
|
---|
219 | * Do not use __AMD64__, __amd64__ or __x86_64__.
|
---|
220 | *
|
---|
221 | * <li> Take care and use size_t when you have to, esp. when passing a pointer
|
---|
222 | * to a size_t as a parameter.
|
---|
223 | *
|
---|
224 | * <li> Be wary of type promotion to (signed) integer. For example the
|
---|
225 | * following will cause u8 to be promoted to int in the shift, and then
|
---|
226 | * sign extended in the 64-bit assignment:
|
---|
227 | * @code
|
---|
228 | * uint8_t u8 = 0xfe;
|
---|
229 | * uint64_t u64 = u8 << 24;
|
---|
230 | * // u64 == 0xfffffffffe000000
|
---|
231 | * @endcode
|
---|
232 | *
|
---|
233 | * </ul>
|
---|
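 * Two of the guidelines above lend themselves to a small standalone sketch:
 * widening a narrow unsigned value before shifting it, and converting a
 * pointer to an integer through uintptr_t. The function names are made up
 * for the illustration; only standard C/C++ types are used.

```cpp
#include <assert.h>
#include <stdint.h>

// Widen before shifting: casting u8 to uint64_t first keeps the whole
// expression unsigned and 64 bits wide, so nothing is promoted to a signed
// int and nothing gets sign extended.
static uint64_t demoShiftByteToBits24(uint8_t u8)
{
    return (uint64_t)u8 << 24;
}

// Pointer-to-integer conversion through uintptr_t, which is pointer sized on
// both 32-bit and 64-bit hosts (unlike long on 64-bit Windows).
static bool demoIsAligned(const void *pv, uintptr_t cbAlign)
{
    assert(cbAlign != 0 && (cbAlign & (cbAlign - 1)) == 0); // power of two
    return ((uintptr_t)pv & (cbAlign - 1)) == 0;
}
```

 * With u8 set to 0xfe, the widened shift yields 0x00000000fe000000 instead of
 * the sign extended value shown in the example above.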
234 | *
|
---|
235 | * @subsection sec_vbox_guideline_compulsory_cppmain C++ guidelines for Main
|
---|
236 | *
|
---|
237 | * Main is currently (2009) full of hard-to-maintain code that uses complicated
|
---|
238 | * templates. The new mid-term goal for Main is to have fewer custom templates
|
---|
239 | * instead of more for the following reasons:
|
---|
240 | *
|
---|
241 | * <ul>
|
---|
242 | *
|
---|
243 | * <li> Template code is harder to read and understand. Custom templates create
|
---|
244 | * territories which only the code writer understands.
|
---|
245 | *
|
---|
246 | * <li> Errors in using templates create terrible C++ compiler messages.
|
---|
247 | *
|
---|
248 | * <li> Template code is really hard to look at in a debugger.
|
---|
249 | *
|
---|
250 | * <li> Templates slow down the compiler a lot.
|
---|
251 | *
|
---|
252 | * </ul>
|
---|
253 | *
|
---|
254 | * In particular, the following bits should be considered deprecated and should
|
---|
255 | * NOT be used in new code:
|
---|
256 | *
|
---|
257 | * <ul>
|
---|
258 | *
|
---|
259 | * <li> everything in include/iprt/cpputils.h (auto_ref_ptr, exception_trap_base,
|
---|
260 | * char_auto_ptr and friends)
|
---|
261 | *
|
---|
262 | * </ul>
|
---|
263 | *
|
---|
264 | * In many cases, a simple class with a proper destructor can achieve
|
---|
265 | * the same effect as a 1,000-line template include file, and the code is
|
---|
266 | * much more accessible that way.
|
---|
267 | *
|
---|
268 | * Using standard STL templates like std::list, std::vector and std::map is OK.
|
---|
269 | * Exceptions are:
|
---|
270 | *
|
---|
271 | * <ul>
|
---|
272 | *
|
---|
273 | * <li> Guest Additions because we don't want to link against libstdc++ there.
|
---|
274 | *
|
---|
275 | * <li> std::string should not be used because we have iprt::MiniString and
|
---|
276 | * com::Utf8Str which can convert efficiently with COM's UTF-16 strings.
|
---|
277 | *
|
---|
278 | * <li> std::auto_ptr<> in general; that part of the C++ standard is just broken.
|
---|
279 | * Write a destructor that calls delete.
|
---|
280 | *
|
---|
281 | * </ul>
|
---|
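 * What "a simple class with a proper destructor" might look like is sketched
 * below. DemoScratchBuffer and demoCountZeroBytes are hypothetical names, not
 * existing Main classes, and the '= delete' syntax assumes a C++11 compiler
 * (older code would declare the copy operations private and leave them
 * undefined).

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical example class: owns one heap buffer and frees it in its
// destructor. No custom templates and no std::auto_ptr involved.
class DemoScratchBuffer
{
public:
    explicit DemoScratchBuffer(size_t cb)
        : m_pb(new uint8_t[cb]()), m_cb(cb)     // '()' zero-initializes
    {
    }

    ~DemoScratchBuffer()
    {
        delete[] m_pb;
    }

    uint8_t *raw()        { return m_pb; }
    size_t   size() const { return m_cb; }

private:
    // Copying would lead to a double delete, so it is disabled.
    DemoScratchBuffer(const DemoScratchBuffer &) = delete;
    DemoScratchBuffer &operator=(const DemoScratchBuffer &) = delete;

    uint8_t *m_pb;  // The buffer.
    size_t   m_cb;  // Size of the buffer in bytes.
};

// The buffer is released on every return path without explicit cleanup code.
static size_t demoCountZeroBytes(size_t cb)
{
    DemoScratchBuffer Buf(cb);
    size_t cZeros = 0;
    for (size_t off = 0; off < Buf.size(); off++)
        if (Buf.raw()[off] == 0)
            cZeros++;
    return cZeros;
}
```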
282 | *
|
---|
283 | * @subsection sec_vbox_guideline_compulsory_cppqtgui C++ guidelines for the Qt GUI
|
---|
284 | *
|
---|
285 | * The Qt GUI is currently (2010) on its way to becoming more consistent with the
|
---|
286 | * rest of VirtualBox in terms of coding style. From now on, all the coding style
|
---|
287 | * rules described in this file are also mandatory for the Qt GUI. Additionally
|
---|
288 | * the following rules should be respected:
|
---|
289 | *
|
---|
290 | * <ul>
|
---|
291 | *
|
---|
292 | * <li> GUI classes which correspond to GUI tasks should be prefixed by UI (no VBox anymore)
|
---|
293 | *
|
---|
294 | * <li> Classes which extend some of the Qt classes should be prefixed by QI
|
---|
295 | *
|
---|
296 | * <li> General task classes should be prefixed by C
|
---|
297 | *
|
---|
298 | * <li> Slots are prefixed by slt -> sltName
|
---|
299 | *
|
---|
300 | * <li> Signals are prefixed by sig -> sigName
|
---|
301 | *
|
---|
302 | * <li> Use Qt classes for lists, strings and so on; the use of STL classes should
|
---|
303 | * be avoided
|
---|
304 | *
|
---|
305 | * <li> All files like .cpp, .h, .ui, which belong together are located in the
|
---|
306 | * same directory and named the same
|
---|
307 | *
|
---|
308 | * </ul>
|
---|
309 | *
|
---|
310 | *
|
---|
311 | * @subsection sec_vbox_guideline_compulsory_xslt XSLT
|
---|
312 | *
|
---|
313 | * XSLT (eXtensible Stylesheet Language Transformations) is used quite a bit in
|
---|
314 | * the Main API area of VirtualBox to generate sources and bindings to that API.
|
---|
315 | * There are a couple of common pitfalls worth mentioning:
|
---|
316 | *
|
---|
317 | * <ul>
|
---|
318 | *
|
---|
319 | * <li> Never do repeated //interface[\@name=...] and //enum[\@name=...] lookups
|
---|
320 | * because they are expensive. Instead declare xsl:key elements for these
|
---|
321 | * searches and do the lookup using the key() function. xsltproc uses
|
---|
322 | * (per current document) hash tables for each xsl:key, i.e. very fast.
|
---|
323 | *
|
---|
324 | * <li> When output type is 'text' make sure to call xsltprocNewlineOutputHack
|
---|
325 | * from typemap-shared.inc.xsl every few KB of output, or xsltproc will
|
---|
326 | * end up wasting all the time reallocating the output buffer.
|
---|
327 | *
|
---|
328 | * </ul>
|
---|
329 | *
|
---|
330 | *
|
---|
331 | * @subsection sec_vbox_guideline_compulsory_doxygen Doxygen Comments
|
---|
332 | *
|
---|
333 | * As mentioned above, we shall use doxygen/javadoc style commenting of public
|
---|
334 | * functions, typedefs, classes and such. It is mandatory to use this style
|
---|
335 | * everywhere!
|
---|
336 | *
|
---|
337 | * A couple of hints on how to best write doxygen comments:
|
---|
338 | *
|
---|
339 | * <ul>
|
---|
340 | *
|
---|
341 | * <li> A good class, method, function, structure or enum doxygen comment
|
---|
342 | * starts with a one line sentence giving a brief description of the
|
---|
343 | * item. Details come in a new paragraph (after a blank line).
|
---|
344 | *
|
---|
345 | * <li> Except for list generators like \@todo, \@cfgm, \@gcfgm and others,
|
---|
346 | * all doxygen comments are related to things in the code. So, for
|
---|
347 | * instance you DO NOT add a doxygen \@note comment in the middle of a function
|
---|
348 | * because you've got something important to note, you add a normal
|
---|
349 | * comment like 'Note! blah, very important blah!'
|
---|
350 | *
|
---|
351 | * <li> We do NOT use TODO/XXX/BUGBUG or similar markers in the code to flag
|
---|
352 | * things needing fixing later, we always use \@todo doxygen comments.
|
---|
353 | *
|
---|
354 | * <li> There is no colon after the \@todo. And it is ALWAYS in a doxygen
|
---|
355 | * comment.
|
---|
356 | *
|
---|
357 | * <li> The \@retval tag is used to explain status codes a method/function may
|
---|
358 | * return. It is not used to describe output parameters; that is done
|
---|
359 | * using the \@param or \@param[out] tag.
|
---|
360 | *
|
---|
361 | * </ul>
|
---|
362 | *
|
---|
363 | * See https://www.stack.nl/~dimitri/doxygen/manual/index.html for the official
|
---|
364 | * doxygen documentation.
|
---|
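 * The hints above might come together as in the following sketch of a fully
 * documented function. The function, its parameters and the DEMO_ status
 * codes are hypothetical. The '///' comment form is used here only because
 * the example sits inside a block comment; VBox sources use the slash-star-
 * star form seen at the top of this file.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical status codes for the sketch (errors negative, success zero).
#define DEMO_VINF_SUCCESS           0
#define DEMO_VERR_INVALID_POINTER   (-4)
#define DEMO_VERR_BUFFER_TOO_SMALL  (-5)

///
/// Copies a byte sequence into a caller supplied buffer.
///
/// The details go into their own paragraph after a blank line: the source
/// and destination must not overlap, and nothing is written on failure.
///
/// @returns Status code.
/// @retval  DEMO_VINF_SUCCESS on success.
/// @retval  DEMO_VERR_INVALID_POINTER if a pointer argument is NULL.
/// @retval  DEMO_VERR_BUFFER_TOO_SMALL if cbDst is smaller than cbSrc.
///
/// @param      pbDst       Where to store the copy.
/// @param      cbDst       Size of the destination buffer in bytes.
/// @param      pbSrc       The bytes to copy.
/// @param      cbSrc       Number of bytes to copy.
/// @param[out] pcbCopied   Where to return the number of bytes copied.
///
/// @todo Add a variant that accepts overlapping buffers.
///
static int demoCopyBytes(uint8_t *pbDst, size_t cbDst, const uint8_t *pbSrc, size_t cbSrc,
                         size_t *pcbCopied)
{
    if (!pbDst || !pbSrc || !pcbCopied)
        return DEMO_VERR_INVALID_POINTER;
    if (cbDst < cbSrc)
        return DEMO_VERR_BUFFER_TOO_SMALL;

    // Note! A plain comment is used inside the function body, not @note.
    for (size_t off = 0; off < cbSrc; off++)
        pbDst[off] = pbSrc[off];
    *pcbCopied = cbSrc;
    return DEMO_VINF_SUCCESS;
}
```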
365 | *
|
---|
366 | *
|
---|
367 | *
|
---|
368 | * @subsection sec_vbox_guideline_compulsory_guest Handling of guest input
|
---|
369 | *
|
---|
370 | * First, guest input should ALWAYS be considered to be TOXIC and constructed with
|
---|
371 | * MALICIOUS intent! Max paranoia level!
|
---|
372 | *
|
---|
373 | * Second, when getting inputs from memory shared with the guest, be EXTREMELY
|
---|
374 | * careful to not re-read input from shared memory after validating it, because
|
---|
375 | * that will create TOCTOU problems. So, after reading input from shared memory
|
---|
376 | * always use the RT_UNTRUSTED_NONVOLATILE_COPY_FENCE() macro. For more details
|
---|
377 | * on TOCTOU: https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use
|
---|
378 | *
|
---|
379 | * Third, considering the recent speculation side channel issues, spectre v1
|
---|
380 | * in particular, we would like to be ready for future screwups. This means
|
---|
381 | * having input validation in a separate block of code that ends with one (or
|
---|
382 | * more) RT_UNTRUSTED_VALIDATED_FENCE().
|
---|
383 | *
|
---|
384 | * So the rules:
|
---|
385 | *
|
---|
386 | * <ul>
|
---|
387 | *
|
---|
388 | * <li> Mark all pointers to shared memory with RT_UNTRUSTED_VOLATILE_GUEST.
|
---|
389 | *
|
---|
390 | * <li> Copy volatile data into local variables or heap before validating
|
---|
391 | * them (see RT_COPY_VOLATILE() and RT_BCOPY_VOLATILE()).
|
---|
392 | *
|
---|
393 | * <li> Place RT_UNTRUSTED_NONVOLATILE_COPY_FENCE() after a block copying
|
---|
394 | * volatile data.
|
---|
395 | *
|
---|
396 | * <li> Always validate untrusted inputs in a block ending with a
|
---|
397 | * RT_UNTRUSTED_VALIDATED_FENCE().
|
---|
398 | *
|
---|
399 | * <li> Use the ASSERT_GUEST_XXXX macros from VBox/AssertGuest.h to validate
|
---|
400 | * guest input. (Do NOT use iprt/assert.h macros.)
|
---|
401 | *
|
---|
402 | * <li> Validation of an input B may require using another input A to look up
|
---|
403 | * some data, in which case it's necessary to insert an
|
---|
404 | * RT_UNTRUSTED_VALIDATED_FENCE() after validating A and before A is used
|
---|
405 | * for the lookup.
|
---|
406 | *
|
---|
407 | * For example A is a view identifier, idView, and B is an offset into
|
---|
408 | * the view's framebuffer area, offView. To validate offView (B) it is
|
---|
409 | * necessary to get the size of the view's framebuffer region:
|
---|
 * @code
 * uint32_t const idView  = pReq->idView;  // A
 * uint32_t const offView = pReq->offView; // B
 * RT_UNTRUSTED_NONVOLATILE_COPY_FENCE();
 *
 * ASSERT_GUEST_RETURN(idView < pThis->cView,
 *                     VERR_INVALID_PARAMETER);
 * RT_UNTRUSTED_VALIDATED_FENCE();
 * const MYVIEW *pView = &pThis->aViews[idView];
 * ASSERT_GUEST_RETURN(offView < pView->cbFramebufferArea,
 *                     VERR_OUT_OF_RANGE);
 * RT_UNTRUSTED_VALIDATED_FENCE();
 * @endcode
|
---|
423 | *
|
---|
424 | * <li> Take care to make sure input checks are not subject to integer overflow problems.
|
---|
425 | *
|
---|
426 | * For instance when validating an area, you must not just add cbDst + offDst
|
---|
427 | * and check against pThis->offEnd or something like that. Rather do:
|
---|
 * @code
 * uint32_t const offDst = pReq->offDst;
 * uint32_t const cbDst  = pReq->cbDst;
 * RT_UNTRUSTED_NONVOLATILE_COPY_FENCE();
 *
 * ASSERT_GUEST_RETURN(   cbDst <= pThis->cbSrc
 *                     && offDst < pThis->cbSrc - cbDst,
 *                     VERR_OUT_OF_RANGE);
 * RT_UNTRUSTED_VALIDATED_FENCE();
 * @endcode
|
---|
438 | *
|
---|
439 | * <li> Input validation does not only apply to shared data cases, but also to
|
---|
440 | * I/O port and MMIO handlers.
|
---|
441 | *
|
---|
442 | * <li> Ditto for kernel drivers working with usermode inputs.
|
---|
443 | *
|
---|
444 | * </ul>
|
---|
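 * To make the integer overflow and TOCTOU rules above concrete, here is a
 * standalone sketch without VBox headers. DEMOREQ and the demo functions are
 * hypothetical, and the two fence macros appear only as comments marking
 * where real code would place them. Note that this sketch accepts a range
 * ending exactly at the end of the area, which is a choice made for the
 * illustration.

```cpp
#include <assert.h>
#include <stdint.h>

// Hypothetical request layout living in memory shared with the guest.
typedef struct DEMOREQ
{
    uint32_t offDst;    // Offset into the destination area.
    uint32_t cbDst;     // Number of bytes to access.
} DEMOREQ;

// BROKEN: the 32-bit sum can wrap around, so a huge offDst passes the check.
static bool demoIsRangeValidNaive(uint32_t offDst, uint32_t cbDst, uint32_t cbArea)
{
    return (uint32_t)(offDst + cbDst) <= cbArea;
}

// Overflow-safe: check the size first, then compare the offset against what
// is left. The subtraction cannot wrap because cbDst <= cbArea at that point.
static bool demoIsRangeValid(uint32_t offDst, uint32_t cbDst, uint32_t cbArea)
{
    return    cbDst <= cbArea
           && offDst <= cbArea - cbDst;
}

// Copies the request fields out of shared memory exactly once and validates
// only the local copies, so the guest cannot change them after the check.
static bool demoValidateRequest(volatile const DEMOREQ *pReq, uint32_t cbArea)
{
    uint32_t const offDst = pReq->offDst;
    uint32_t const cbDst  = pReq->cbDst;
    // Real code places RT_UNTRUSTED_NONVOLATILE_COPY_FENCE() here ...

    bool const fValid = demoIsRangeValid(offDst, cbDst, cbArea);
    // ... and RT_UNTRUSTED_VALIDATED_FENCE() here, before using the values.
    return fValid;
}
```

 * With offDst = 0xfffffff0 and cbDst = 0x20 the naive sum wraps to 0x10 and is
 * accepted for a 0x100 byte area, while the overflow-safe variant rejects it.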
445 | *
|
---|
446 | *
|
---|
447 | * Problem patterns:
|
---|
448 | * - https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use
|
---|
449 | * - https://googleprojectzero.blogspot.de/2018/01/reading-privileged-memory-with-side.html
|
---|
450 | * (Variant 1 only).
|
---|
451 | * - https://en.wikipedia.org/wiki/Integer_overflow
|
---|
452 | *
|
---|
453 | *
|
---|
454 | *
|
---|
455 | * @section sec_vbox_guideline_optional Optional
|
---|
456 | *
|
---|
457 | * First part is the actual coding style and all the prefixes. The second part
|
---|
458 | * is a bunch of good advice.
|
---|
459 | *
|
---|
460 | *
|
---|
461 | * @subsection sec_vbox_guideline_optional_layout The code layout
|
---|
462 | *
|
---|
463 | * <ul>
|
---|
464 | *
|
---|
465 | * <li> Max line length is 130 chars. Exceptions are table-like
|
---|
466 | * code/initializers and Log*() statements (don't waste unnecessary
|
---|
467 | * vertical space on debug logging).
|
---|
468 | *
|
---|
469 | * <li> Comments should try to stay within the usual 80 columns as these are
|
---|
470 | * denser and too long lines may be harder to read.
|
---|
471 | *
|
---|
472 | * <li> Curly brackets are not indented. Example:
|
---|
 * @code
 * if (true)
 * {
 *     Something1();
 *     Something2();
 * }
 * else
 * {
 *     SomethingElse1();
 *     SomethingElse2();
 * }
 * @endcode
|
---|
485 | *
|
---|
486 | * <li> Space before the parentheses when it comes after a C keyword.
|
---|
487 | *
|
---|
488 | * <li> No space between argument and parentheses. Exception for complex
|
---|
489 | * expression. Example:
|
---|
490 | * @code
|
---|
491 | * if (PATMR3IsPatchGCAddr(pVM, GCPtr))
|
---|
492 | * @endcode
|
---|
493 | *
|
---|
494 | * <li> The else of an if is always the first statement on a line. (No curly
|
---|
495 | * stuff before it!)
|
---|
496 | *
|
---|
497 | * <li> else and if go on the same line if no { compound statement }
|
---|
498 | * follows the if. Example:
|
---|
 * @code
 * if (fFlags & MYFLAGS_1)
 *     fFlags &= ~MYFLAGS_10;
 * else if (fFlags & MYFLAGS_2)
 * {
 *     fFlags &= ~MYFLAGS_MASK;
 *     fFlags |= MYFLAGS_5;
 * }
 * else if (fFlags & MYFLAGS_3)
 * @endcode
|
---|
509 | *
|
---|
510 | * <li> Slightly complex boolean expressions are split into multiple lines,
|
---|
511 | * putting the operators first on the line and indenting it all according
|
---|
512 | * to the nesting of the expression. The purpose is to make it as easy as
|
---|
513 | * possible to read. Example:
|
---|
 * @code
 * if (   RT_SUCCESS(rc)
 *     || (fFlags & SOME_FLAG))
 * @endcode
|
---|
518 | *
|
---|
519 | * <li> When 'if' or 'while' statements get long, the closing parenthesis
|
---|
520 | * goes right below the opening parenthesis. This may be applied to
|
---|
521 | * sub-expressions. Example:
|
---|
 * @code
 * if (   RT_SUCCESS(rc)
 *     || (   fSomeStuff
 *         && fSomeOtherStuff
 *         && fEvenMoreStuff
 *        )
 *     || SomePredicateFunction()
 *    )
 * {
 *     ...
 * }
 * @endcode
|
---|
534 | *
|
---|
535 | * <li> The case is indented from the switch (to avoid having the braces for
|
---|
536 | * the 'case' at the same level as the 'switch' statement).
|
---|
537 | *
|
---|
538 | * <li> If a case needs curly brackets they contain the entire case, are not
|
---|
539 | * indented from the case, and the break or return is placed inside them.
|
---|
540 | * Example:
|
---|
 * @code
 * switch (pCur->eType)
 * {
 *     case PGMMAPPINGTYPE_PAGETABLES:
 *     {
 *         unsigned iPDE = pCur->GCPtr >> PGDIR_SHIFT;
 *         unsigned iPT = (pCur->GCPtrEnd - pCur->GCPtr) >> PGDIR_SHIFT;
 *         while (iPT-- > 0)
 *             if (pPD->a[iPDE + iPT].n.u1Present)
 *                 return VERR_HYPERVISOR_CONFLICT;
 *         break;
 *     }
 * }
 * @endcode
|
---|
555 | *
|
---|
556 | * <li> In a do while construction, the while is on the same line as the
|
---|
557 | * closing "}" if any are used.
|
---|
558 | * Example:
|
---|
 * @code
 * do
 * {
 *     stuff;
 *     i--;
 * } while (i > 0);
 * @endcode
|
---|
566 | *
|
---|
567 | * <li> Comments are in C style. C++ style comments are used for temporarily
|
---|
568 | * disabling a few lines of code.
|
---|
569 | *
|
---|
570 | * <li> No unnecessary parentheses in expressions (just don't overdo this
|
---|
571 | * so that gcc / msc starts bitching). Find a correct C/C++ operator
|
---|
572 | * precedence table if needed.
|
---|
573 | *
|
---|
574 | * <li> 'for (;;)' is preferred over 'while (true)' and 'while (1)'.
|
---|
575 | *
|
---|
576 | * <li> Parameters are indented to the opening parenthesis when breaking up
|
---|
577 | * function calls, declarations or prototypes. (This is in line with
|
---|
578 | * how 'if', 'for' and 'while' statements are done as well.) Example:
|
---|
 * @code
 * RTPROCESS hProcess;
 * int rc = RTProcCreateEx(papszArgs[0],
 *                         papszArgs,
 *                         RTENV_DEFAULT,
 *                         fFlags,
 *                         NULL, // phStdIn
 *                         NULL, // phStdOut
 *                         NULL, // phStdErr
 *                         NULL, // pszAsUser
 *                         NULL, // pszPassword
 *                         &hProcess);
 * @endcode
|
---|
592 | *
|
---|
593 | * <li> That Dijkstra is dead is no excuse for using gotos.
|
---|
594 | *
|
---|
595 | * <li> Using do-while-false loops to avoid gotos is considered very bad form.
|
---|
596 | * They create hard to read code. They tend to be either too short (i.e.
|
---|
597 | * pointless) or way too long (split up the function already), making
|
---|
598 | * tracking the state difficult and prone to bugs. Also, they cause
|
---|
599 | * the compiler to generate suboptimal code, because the break branches
|
---|
600 | * are preferred over the main code flow (MSC has no branch hinting!).
|
---|
601 | * Instead, do make use of the 130 columns (i.e. nested ifs) and split
|
---|
602 | * the code up into more functions!
|
---|
603 | *
|
---|
604 | * <li> Avoid code like
|
---|
 * @code
 * int foo;
 * int rc;
 * ...
 * rc = FooBar();
 * if (RT_SUCCESS(rc))
 * {
 *     foo = getFoo();
 *     ...
 *     pvBar = RTMemAlloc(sizeof(*pvBar));
 *     if (!pvBar)
 *         rc = VERR_NO_MEMORY;
 * }
 * if (RT_SUCCESS(rc))
 * {
 *     buzz = foo;
 *     ...
 * }
 * @endcode
|
---|
624 | * The intention of such code is probably to save some horizontal space
|
---|
625 | * but unfortunately it's hard to read and the scope of certain variables
|
---|
626 | * (e.g. foo in this example) is not optimal. Better use the following
|
---|
627 | * style:
|
---|
 * @code
 * int rc;
 * ...
 * rc = FooBar();
 * if (RT_SUCCESS(rc))
 * {
 *     int foo = getFoo();
 *     ...
 *     pvBar = RTMemAlloc(sizeof(*pvBar));
 *     if (pvBar)
 *     {
 *         buzz = foo;
 *         ...
 *     }
 *     else
 *         rc = VERR_NO_MEMORY;
 * }
 * @endcode
|
---|
646 | *
|
---|
647 | * </ul>
|
---|
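 * For readers who want to compile the preferred nested-if style, a minimal
 * sketch follows. It swaps in standard malloc() for RTMemAlloc() and uses
 * hypothetical DEMO_ status codes and demo functions, so it shows only the
 * shape of the code, not any real VBox API.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical status codes and success test standing in for the IPRT ones.
#define DEMO_VINF_SUCCESS           0
#define DEMO_VERR_INVALID_ARGUMENT  (-1)
#define DEMO_VERR_NO_MEMORY         (-3)
#define DEMO_SUCCESS(rc)            ((rc) >= DEMO_VINF_SUCCESS)

// Worker that can fail; it plays the part of FooBar() in the example above.
static int demoCheckLength(size_t cch)
{
    if (cch == 0)
        return DEMO_VERR_INVALID_ARGUMENT;
    return DEMO_VINF_SUCCESS;
}

// Duplicates a string using nested ifs: each variable lives in the narrowest
// scope that needs it and there is a single exit point without any goto.
static int demoDupString(const char *pszSrc, char **ppszDst)
{
    size_t const cch = strlen(pszSrc);
    int rc = demoCheckLength(cch);
    if (DEMO_SUCCESS(rc))
    {
        char *pszCopy = (char *)malloc(cch + 1);
        if (pszCopy)
        {
            memcpy(pszCopy, pszSrc, cch + 1);
            *ppszDst = pszCopy;
        }
        else
            rc = DEMO_VERR_NO_MEMORY;
    }
    return rc;
}
```

 * The caller owns the returned string and releases it with free().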
648 | *
|
---|
649 | * @subsection sec_vbox_guideline_optional_prefix Variable / Member Prefixes
|
---|
650 | *
|
---|
651 | * Prefixes are meant to provide extra context clues to a variable/member. We
|
---|
652 | * therefore avoid using prefixes that just indicate the type if a better
|
---|
653 | * choice is available.
|
---|
654 | *
|
---|
655 | *
|
---|
656 | * The prefixes:
|
---|
657 | *
|
---|
658 | * <ul>
|
---|
659 | *
|
---|
660 | * <li> The 'g_' (or 'g') prefix means a global variable, either on file or module level.
|
---|
661 | *
|
---|
662 | * <li> The 's_' (or 's') prefix means a static variable inside a function or
|
---|
663 | * class. This is not used for static variables on file level, use 'g_'
|
---|
664 | * for those (logical, right?).
|
---|
665 | *
|
---|
666 | * <li> The 'm_' (or 'm') prefix means a class data member.
|
---|
667 | *
|
---|
668 | * In new code in Main, use "m_" (and common sense). As an exception,
|
---|
669 | * in Main, if a class encapsulates its member variables in an anonymous
|
---|
670 | * structure which is declared in the class, but defined only in the
|
---|
671 | * implementation (like this: 'class X { struct Data; Data *m; }'), then
|
---|
672 | * the pointer to that struct is called 'm' itself and its members then
|
---|
673 | * need no prefix, because the members are accessed with 'm->member'
|
---|
674 | * already which is clear enough.
|
---|
675 | *
|
---|
676 | * <li> The 'a_' prefix means a parameter (argument) variable. This is
|
---|
677 | * sometimes written 'a' in parts of the source code that do not use
|
---|
678 | * the array prefix.
|
---|
679 | *
|
---|
680 | * <li> The 'p' prefix means pointer. For instance 'pVM' is pointer to VM.
|
---|
681 | *
|
---|
682 | * <li> The 'r' prefix means that something is passed by reference.
|
---|
683 | *
|
---|
684 | * <li> The 'k' prefix means that something is a constant. For instance
|
---|
685 | * 'enum { kStuff };'. This is usually not used in combination with
|
---|
686 | * 'p', 'r' or any such thing; its main use is to make enums
|
---|
687 | * easily identifiable.
|
---|
688 | *
|
---|
689 | * <li> The 'a' prefix means array. For instance 'aPages' could be read as
|
---|
690 | * array of pages.
|
---|
691 | *
|
---|
692 | * <li> The 'c' prefix means count. For instance 'cbBlock' could be read,
|
---|
693 | * count of bytes in block. (1)
|
---|
694 | *
|
---|
695 | * <li> The 'cx' prefix means width (count of 'x' units).
|
---|
696 | *
|
---|
697 | * <li> The 'cy' prefix means height (count of 'y' units).
|
---|
698 | *
|
---|
699 | * <li> The 'x', 'y' and 'z' prefixes refer to the x-, y-, and z-axis
|
---|
700 | * respectively.
|
---|
 *
 * <li> The 'off' prefix means offset.
 *
 * <li> The 'i' or 'idx' prefixes usually mean index, although the 'i' one
 * can sometimes just mean signed integer.
 *
 * <li> The 'i[1-9]+' prefix means a fixed bit size variable. Frequently
 * used with the int[1-9]+_t types where the width is really important.
 * In most cases 'i' is more appropriate. [type]
 *
 * <li> The 'e' (or 'enm') prefix means enum.
 *
 * <li> The 'u' prefix usually means unsigned integer. Exceptions follow.
 *
 * <li> The 'u[1-9]+' prefix means a fixed bit size variable. Frequently
 * used with the uint[1-9]+_t types and with bitfields where the width is
 * really important. In most cases 'u' or 'b' (byte) would be more
 * appropriate. [type]
 *
 * <li> The 'b' prefix means byte or bytes. [type]
 *
 * <li> The 'f' prefix means flags. Flags are unsigned integers of some kind
 * or booleans.
 *
 * <li> TODO: need prefix for real float. [type]
 *
 * <li> The 'rd' prefix means real double and is used for 'double' variables.
 * [type]
 *
 * <li> The 'lrd' prefix means long real double and is used for 'long double'
 * variables. [type]
 *
 * <li> The 'ch' prefix means a char, the (signed) char type. [type]
 *
 * <li> The 'wc' prefix means a wide/windows char, the RTUTF16 type. [type]
 *
 * <li> The 'uc' prefix means a Unicode Code point, the RTUNICP type. [type]
 *
 * <li> The 'uch' prefix means unsigned char. It's rarely used. [type]
 *
 * <li> The 'sz' prefix means zero terminated character string (array of
 * chars). (UTF-8)
 *
 * <li> The 'wsz' prefix means zero terminated wide/windows character string
 * (array of RTUTF16).
 *
 * <li> The 'usz' prefix means zero terminated Unicode string (array of
 * RTUNICP).
 *
 * <li> The 'str' prefix means C++ string; either a std::string or, in Main,
 * a Utf8Str or, in Qt, a QString. When used with 'p', 'r', 'a' or 'c'
 * the first letter should be capitalized.
 *
 * <li> The 'bstr' prefix, in Main, means a UTF-16 Bstr. When used with 'p',
 * 'r', 'a' or 'c' the first letter should be capitalized.
 *
 * <li> The 'pfn' prefix means pointer to function. Common usage is 'pfnCallback'
 * and such like.
 *
 * <li> The 'psz' prefix is a combination of 'p' and 'sz' and thus means
 * pointer to a zero terminated character string. (UTF-8)
 *
 * <li> The 'pcsz' prefix is used to indicate constant string pointers in
 * parts of the code. Most code uses 'psz' for const and non-const
 * string pointers, so please ignore this one.
 *
 * <li> The 'l' prefix means (signed) long. We try to avoid using this,
 * especially with the 'LONG' types in Main as these are not 'long' on
 * 64-bit non-Windows platforms and can cause confusion. Alternatives:
 * 'i' or 'i32'. [type]
 *
 * <li> The 'ul' prefix means unsigned long. We try to avoid using this,
 * especially with the 'ULONG' types in Main as these are not 'unsigned
 * long' on 64-bit non-Windows platforms and can cause confusion.
 * Alternatives: 'u' or 'u32'. [type]
 *
 * </ul>
 *
 * (1) Except in the occasional 'pcsz' prefix, the 'c' prefix is never ever
 * used in the meaning 'const'.
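 *
 * The prefixes listed above combine as in the following minimal sketch. All
 * names in it (EXAMPLEBUF, exampleCountChar, exampleRead, g_cCalls) are
 * hypothetical and not taken from the VBox sources, 'cch' is simply 'c'
 * combined with 'ch', and the standard assert() merely stands in for
 * whatever assertion macro the real code would use.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// 'g_' marks a file level variable, 'c' marks a count.
static uint32_t g_cCalls = 0;

// Hypothetical structure illustrating member prefixes.
struct EXAMPLEBUF
{
    const uint8_t *pbData;  // 'p' + 'b': pointer to bytes
    size_t         cbData;  // 'c' + 'b': count of bytes
    size_t         offRead; // 'off': offset
    bool           fEof;    // 'f': flag
};

// Counts how often chNeedle occurs in the zero terminated string pszHaystack.
static size_t exampleCountChar(const char *pszHaystack, char chNeedle)
{
    assert(pszHaystack != NULL);
    size_t       cHits       = 0;                   // 'c': count
    size_t const cchHaystack = strlen(pszHaystack); // 'c' + 'ch': count of chars
    for (size_t i = 0; i < cchHaystack; i++)        // 'i': index
        if (pszHaystack[i] == chNeedle)
            cHits++;
    g_cCalls++;
    return cHits;
}

// Copies up to cbDst bytes from pBuf to pvDst and returns the count of bytes copied.
static size_t exampleRead(EXAMPLEBUF *pBuf, void *pvDst, size_t cbDst)
{
    size_t const cbLeft   = pBuf->cbData - pBuf->offRead;
    size_t const cbToCopy = cbDst < cbLeft ? cbDst : cbLeft;
    memcpy(pvDst, pBuf->pbData + pBuf->offRead, cbToCopy);
    pBuf->offRead += cbToCopy;
    pBuf->fEof     = pBuf->offRead == pBuf->cbData;
    return cbToCopy;
}
```

 * Reading the names alone tells the caller that exampleRead() takes a byte
 * count and an untyped destination pointer, which is the point of the scheme.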
 *
 *
 * @subsection sec_vbox_guideline_optional_misc Misc / Advice / Stuff
 *
 * <ul>
 *
 * <li> When writing code think as the reader.
 *
 * <li> When writing code think as the compiler. (2)
 *
 * <li> When reading code think as if it's full of bugs - find them and fix them.
 *
 * <li> Pointer within range tests like:
 * @code
 * if ((uintptr_t)pv >= (uintptr_t)pvBase && (uintptr_t)pv < (uintptr_t)pvBase + cbRange)
 * @endcode
 * Can also be written as (assuming cbRange unsigned):
 * @code
 * if ((uintptr_t)pv - (uintptr_t)pvBase < cbRange)
 * @endcode
 * Which is shorter and potentially faster. (1)
 *
 * <li> Avoid unnecessary casting. All pointers convert implicitly to
 * void *, at least for non-class instance pointers.
 *
 * <li> It is very bad practice to write a function larger than a
 * screenful (1024x768) without any explanatory comments that make it
 * comprehensible.
 *
 * <li> More to come....
 *
 * </ul>
 *
 * (1) Important: be very careful with the casting. In particular, note that
 * a compiler might treat pointers as signed (IIRC).
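 *
 * The two forms of the pointer range test can be compared side by side in
 * the sketch below. The function names are hypothetical, and it assumes that
 * cbRange is unsigned and that pvBase + cbRange does not wrap around the
 * address space; under those assumptions both forms should agree.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Two-comparison form: lower bound first, then upper bound.
static bool exampleIsInRangeLong(const void *pv, const void *pvBase, size_t cbRange)
{
    return (uintptr_t)pv >= (uintptr_t)pvBase
        && (uintptr_t)pv <  (uintptr_t)pvBase + cbRange;
}

// Single-comparison form: when pv is below pvBase the unsigned subtraction
// wraps to a very large value, which fails the '< cbRange' test.
static bool exampleIsInRangeShort(const void *pv, const void *pvBase, size_t cbRange)
{
    return (uintptr_t)pv - (uintptr_t)pvBase < cbRange;
}

// Checks that both forms agree for every offset in abBuf, including one past the end.
static void exampleCheckForms(void)
{
    uint8_t abBuf[16];
    for (size_t off = 0; off <= sizeof(abBuf); off++)
        assert(   exampleIsInRangeLong(abBuf + off, abBuf, sizeof(abBuf))
               == exampleIsInRangeShort(abBuf + off, abBuf, sizeof(abBuf)));
}
```

 * The casts to uintptr_t are what make the subtraction well defined as
 * unsigned arithmetic; dropping them changes the meaning of the test.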
 *
 * (2) "A really advanced hacker comes to understand the true inner workings of
 * the machine - he sees through the language he's working in and glimpses
 * the secret functioning of the binary code - becomes a Ba'al Shem of
 * sorts." (Neal Stephenson "Snow Crash")
 *
 *
 *
 * @section sec_vbox_guideline_warnings Compiler Warnings
 *
 * The code should, when possible, compile on all platforms and compilers without any
 * warnings. However, if fixing a warning would make the code harder to read,
 * less portable, unreliable or similar, the warning should not be fixed.
 *
 * Some of the warnings can seem kind of innocent at first glance. So, let's take the
 * most common ones and explain them.
 *
 *
 * @subsection sec_vbox_guideline_warnings_signed_unsigned_compare Signed / Unsigned Compare
 *
 * GCC says: "warning: comparison between signed and unsigned integer expressions"
 * MSC says: "warning C4018: '<|<=|==|>=|>' : signed/unsigned mismatch"
 *
 * The following example will not output what you expect:
@code
#include <stdio.h>
int main()
{
    signed long a = -1;
    unsigned long b = 2294967295UL;
    // a is converted to unsigned long for the compare, so -1 becomes ULONG_MAX.
    if (a < b)
        printf("%ld < %lu: true\n", a, b);
    else
        printf("%ld < %lu: false\n", a, b);
    return 0;
}
@endcode
 * The usual arithmetic conversions convert a to unsigned long before the
 * compare, so -1 becomes ULONG_MAX and the comparison is reported as false.
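 *
 * One way to get the mathematically expected answer is to handle the sign
 * explicitly before comparing, as in this sketch. The function names are
 * hypothetical and this is only one possible approach, not a rule taken from
 * these guidelines.

```cpp
#include <assert.h>
#include <limits.h>

// Returns true if iLeft is mathematically less than uRight.
// A negative iLeft is below every unsigned value; otherwise the cast to
// unsigned long preserves the value and the compare is exact.
static bool exampleIsLess(signed long iLeft, unsigned long uRight)
{
    if (iLeft < 0)
        return true;
    return (unsigned long)iLeft < uRight;
}

// What the mixed compare in the example above effectively evaluates.
static bool exampleIsLessConverted(signed long iLeft, unsigned long uRight)
{
    return (unsigned long)iLeft < uRight; // -1 becomes ULONG_MAX here
}

// The converted compare misreports -1, the sign-aware one does not.
static void exampleCheckCompare(void)
{
    assert((unsigned long)-1L == ULONG_MAX);
    assert(!exampleIsLessConverted(-1, 2294967295UL));
    assert( exampleIsLess(-1, 2294967295UL));
}
```

 * The explicit cast in exampleIsLess() also silences the warning, but only
 * after the negative case has been dealt with, so it does not hide the bug.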
 *
 *
 *
 * @section sec_vbox_guideline_svn Subversion Commit Rules
 *
 *
 * Before checking in:
 *
 * <ul>
 *
 * <li> Check Tinderbox and make sure the tree is green across all platforms. If it's
 * red on a platform, don't check in. If you want, warn in the \#vbox channel and
 * help make the responsible person fix it.
 * NEVER CHECK IN TO A BROKEN BUILD.
 *
 * <li> When checking in keep in mind that a commit is atomic and that the Tinderbox and
 * developers are constantly checking out the tree. Therefore do not split up the
 * commit unless it's into 100% independent parts. If you need to split it up in order
 * to have sensible commit comments, make the sub-commits as rapid as possible.
 *
 * <li> If you make a user visible change, such as fixing a reported bug,
 * make sure you add an entry to doc/manual/user_ChangeLogImpl.xml.
 *
 * <li> If you are adding files, make sure to set the right attributes.
 * svn-ps.sh/cmd was created for this purpose, please make use of it.
 *
 * </ul>
 *
 * After checking in:
 *
 * <ul>
 *
 * <li> After checking in, you watch Tinderbox until your check-ins clear. You do not
 * go home. You do not sleep. You do not log out or experiment with drugs. You do
 * not become unavailable. If you break the tree, add a comment saying that you're
 * fixing it. If you can't fix it and need help, ask in the \#innotek channel or back
 * out the change.
 *
 * </ul>
 *
 * (Inspired by mozilla tree rules.)
 */