Changeset 89800 in vbox for trunk/include
- Timestamp: Jun 21, 2021 12:02:14 AM
- File: 1 edited
trunk/include/VBox/vmm/pdmaudioifs.h
 * Both "Output Sink" and "Input Sink" talk to all the attached driver chains
 * ("DrvAudio #0" and "DrvAudio #1"), but using different PDMAUDIOSTREAM
 * instances.  There can be an arbitrary number of driver chains attached to an
 * audio device; the mixer sinks will multiplex output to each of them and
 * blend input from all of them, taking care of format and rate conversions.
 * The mixer and mixer sinks do not fit into the PDM device/driver model,
 * because a driver can have exactly one or zero other drivers attached, so
 * they are implemented as a separate component that all the audio devices
 * share (see AudioMixer.h, AudioMixer.cpp, AudioMixBuffer.h and
 * AudioMixBuffer.cpp).
 ...
 * exposes PDMIAUDIOCONNECTOR upwards towards the device and mixer component,
 * and PDMIHOSTAUDIOPORT downwards towards DrvHostAudioWasApi and the other
 * backends.
 *
 * The backend exposes PDMIHOSTAUDIO upwards towards DrvAudio.  It is possible,
 * though, to have only the DrvAudio instance and no backend, in which case
 * DrvAudio works as if the NULL backend was attached.  Main does such setups
 * when the component we're interfacing with isn't currently active, as this
 * simplifies runtime activation.
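The multiplex/blend behaviour described for the mixer sinks can be sketched in a few lines of C. This is purely illustrative, with invented helper names; the real code lives in AudioMixer.cpp and AudioMixBuffer.cpp and operates on whole buffers of signed 32-bit PCM:

```c
#include <stdint.h>
#include <stddef.h>

/* Blend one audio sample from several driver chains into a single value,
 * the way an input sink conceptually merges its sources: sum in a wider
 * type, then clamp to the 32-bit range to avoid wrap-around distortion.
 * (Sketch only; not the actual VBox mixer code.) */
static int32_t audioSketchBlendFrame(const int32_t *paSamples, size_t cSources)
{
    int64_t iSum = 0;
    for (size_t i = 0; i < cSources; i++)
        iSum += paSamples[i];
    if (iSum > INT32_MAX) return INT32_MAX;
    if (iSum < INT32_MIN) return INT32_MIN;
    return (int32_t)iSum;
}

/* Multiplexing output is simpler: every attached chain gets a copy. */
static void audioSketchMultiplex(int32_t iSample, int32_t *paDst, size_t cChains)
{
    for (size_t i = 0; i < cChains; i++)
        paDst[i] = iSample;
}
```

The saturating sum is the important design point: naively adding two loud sources in 32-bit arithmetic would wrap and produce loud clicks, exactly the artifacts the audio stack tries to avoid.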
 *
 * The purpose of DrvAudio is to make the work of the backend as simple as
 ...
 * @section sec_pdm_audio_device       Virtual Audio Device
 *
 * The virtual device translates the settings of the emulated device into
 * mixing sinks with sample format, sample rate, volume control, and whatnot.
 *
 * It also implements a DMA engine for transferring samples to (input) or from
 * (output) guest memory.  The starting and stopping of the DMA engines is
 * communicated to the associated mixing sinks and from there on to the
 * PDMAUDIOSTREAM instance of each driver chain.  An RTCIRCBUF is used as an
 * intermediary between the DMA engine and the asynchronous worker thread of
 * the mixing sink.
 *
 *
 * @section sec_pdm_audio_mixing       Audio Mixing
 *
 * The audio mixer is a mandatory component of an audio device.  It consists of
 * a mixer and one or more sinks with mixer buffers.  The sinks are typically
 * one per virtual output/input connector, so for instance you could have a
 * device with a "PCM Output" sink and a "PCM Input" sink.
 *
 * The audio mixer takes care of:
 *      - Much of the driver chain (LUN) management work.
 *      - Multiplexing output to each active driver chain.
 *      - Blending input from each active driver chain into a single audio
 *        stream.
 *      - Format conversion (it uses signed 32-bit PCM internally) between the
 *        audio device and all of the LUNs (no common format needed).
 *      - Sample rate conversion between the device rate and that of the
 *        individual driver chains.
 *      - Applying the volume settings of the device to the audio stream.
 *      - Providing the asynchronous thread that pushes data from the device's
 *        internal DMA buffer all the way to the backend for output sinks, and
 *        vice versa for input.
 *
 * The term "active LUNs" above means that not all LUNs will actually produce
 * (input) or consume (output) audio.
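Two of the per-sample steps from the list above, conversion to the internal signed 32-bit format and volume application, can be illustrated like this. The helper names and the 1.15 fixed-point volume format are assumptions made for the sketch; the real conversion routines are in AudioMixBuffer.cpp:

```c
#include <stdint.h>

/* Widen a 16-bit PCM sample to the mixer's internal signed 32-bit format
 * by scaling it into the high bits, so full scale stays full scale.
 * (Illustrative sketch, not the actual VBox conversion routine.) */
static int32_t audioSketchS16ToS32(int16_t i16Sample)
{
    return (int32_t)i16Sample * 65536;
}

/* Apply a volume factor given in an assumed 1.15 fixed-point format,
 * where 0x8000 stands for unity gain (1.0): multiply in 64-bit and
 * shift the fraction bits back out. */
static int32_t audioSketchApplyVolume(int32_t iSample, uint16_t uVolume)
{
    return (int32_t)(((int64_t)iSample * uVolume) >> 15);
}
```

Working internally in a format wider than any supported input format is what lets the mixer blend and attenuate LUNs with different sample formats without an extra precision loss per LUN.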
 * The mixer checks the return value of PDMIHOSTAUDIO::pfnStreamGetState each
 * time it processes samples to see which streams are currently active and
 * which aren't.  Inactive streams are ignored.
 *
 * The AudioMixer API reference can be found here:
 *      - @ref grp_pdm_ifs_audio_mixing
 *      - @ref grp_pdm_ifs_audio_mixer_buffers
 *
 ...
 *
 * Handling audio data in a virtual environment is hard, as the human perception
 * is very sensitive to the slightest cracks and stutters in the audible data,
 * and the task of playing back and recording
audio is in the real-time domain.
 *
 * The virtual machine is not executed with any real-time guarantees, only best
 * effort, mainly because it is subject to preemptive scheduling on the host
 * side.  The audio processing done on the guest side is typically also subject
 * to preemptive scheduling on the guest side and to the CPU processing power
 * available there.
 *
 * Thus, the guest may be lagging behind because the host prioritizes other
 * processes/threads over the virtual machine.  If this gets too severe, it
 * will cause the virtual machine to speed up its sense of time while trying
 * to catch up.  So, we can easily have a bit of a seesaw execution going on
 * here, where in the playback case the guest produces data too slowly for a
 * while and then switches to producing it too quickly for a while to catch up.
 *
 * Our working principle is that the backends and the guest are producing and
 * consuming samples at the same rate, but we have to deal with the uneven
 * execution.
 *
 * To deal with this we employ (by default) 300ms of backend buffer and
 * pre-buffer 150ms of that for both input and output audio streams.  This
 * means we have about 150ms worth of samples to feed to the host audio device
 * should the virtual machine be starving and lagging behind.  Likewise, we
 * have about 150ms of buffer space we can fill when the VM is in catch-up
 * mode.  Now, 300ms and 150ms isn't much for the purpose of glossing over
 * scheduling/timing differences here, but we can't do much more or the lag
 * will grow rather annoying.  The pre-buffering is implemented by DrvAudio.
 *
 * In addition to the backend buffer that defaults to 300ms, we have the
 * internal DMA buffer of the device and the mixing buffer of the mixing sink.
 * The latter two are typically rather small, sized to fit the anticipated DMA
 * period currently in use by the guest.
 */
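The millisecond figures above only become concrete buffer sizes once a stream format is fixed. A small sketch of that arithmetic (invented names; DrvAudio keeps this kind of format information in its PCM properties structure): at 48 kHz stereo with 16-bit samples, the default 300ms backend buffer is 14400 frames or 57600 bytes, and the 150ms pre-buffer is half of that.

```c
#include <stdint.h>

/* Hypothetical, minimal stand-in for a PCM properties structure. */
typedef struct AUDIOSKETCHPROPS
{
    uint32_t uHz;        /* sample rate, e.g. 48000 */
    uint8_t  cChannels;  /* e.g. 2 for stereo */
    uint8_t  cbSample;   /* bytes per sample, e.g. 2 for S16 */
} AUDIOSKETCHPROPS;

/* Buffer time in milliseconds -> number of audio frames.
 * Use 64-bit intermediate math so uHz * cMs cannot overflow. */
static uint32_t audioSketchMsToFrames(const AUDIOSKETCHPROPS *pProps, uint32_t cMs)
{
    return (uint32_t)((uint64_t)pProps->uHz * cMs / 1000);
}

/* Buffer time in milliseconds -> byte count for the given format. */
static uint32_t audioSketchMsToBytes(const AUDIOSKETCHPROPS *pProps, uint32_t cMs)
{
    return audioSketchMsToFrames(pProps, cMs) * pProps->cChannels * pProps->cbSample;
}
```

This also shows why the buffers are specified in milliseconds rather than bytes: the same 300ms default yields the right byte count for whatever rate and format the guest happens to configure.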