.. SPDX-License-Identifier: GPL-2.0

.. _stateless_decoder:

**************************************************
Memory-to-memory Stateless Video Decoder Interface
**************************************************

A stateless decoder is a decoder that works without retaining any kind of state
between processed frames. This means that each frame is decoded independently
of any previous and future frames, and that the client is responsible for
maintaining the decoding state and providing it to the decoder with each
decoding request. This is in contrast to the stateful video decoder interface,
where the hardware and driver maintain the decoding state and all the client
has to do is to provide the raw encoded stream and dequeue decoded frames in
display order.

This section describes how user-space ("the client") is expected to communicate
with stateless decoders in order to successfully decode an encoded stream.
Compared to stateful codecs, the decoder/client sequence is simpler, but the
cost of this simplicity is extra complexity in the client which is responsible
for maintaining a consistent decoding state.

Stateless decoders make use of the :ref:`media-request-api`. A stateless
decoder must expose the ``V4L2_BUF_CAP_SUPPORTS_REQUESTS`` capability on its
``OUTPUT`` queue when :c:func:`VIDIOC_REQBUFS` or :c:func:`VIDIOC_CREATE_BUFS`
are invoked.

Depending on the encoded formats supported by the decoder, a single decoded
frame may be the result of several decode requests (for instance, H.264 streams
with multiple slices per frame). Decoders that support such formats must also
expose the ``V4L2_BUF_CAP_SUPPORTS_M2M_HOLD_CAPTURE_BUF`` capability on their
``OUTPUT`` queue.

Querying capabilities
=====================

1. To enumerate the set of coded formats supported by the decoder, the client
   calls :c:func:`VIDIOC_ENUM_FMT` on the ``OUTPUT`` queue.

   * The driver must always return the full set of supported ``OUTPUT`` formats,
     irrespective of the format currently set on the ``CAPTURE`` queue.

   * Simultaneously, the driver must restrain the set of values returned by
     codec-specific capability controls (such as H.264 profiles) to the set
     actually supported by the hardware.

2. To enumerate the set of supported raw formats, the client calls
   :c:func:`VIDIOC_ENUM_FMT` on the ``CAPTURE`` queue.

   * The driver must return only the formats supported for the format currently
     active on the ``OUTPUT`` queue.

   * Depending on the currently set ``OUTPUT`` format, the set of supported raw
     formats may depend on the value of some codec-dependent controls.
     The client is responsible for making sure that these controls are set
     before querying the ``CAPTURE`` queue. Failure to do so will result in the
     default values for these controls being used, and a returned set of formats
     that may not be usable for the media the client is trying to decode.

3. The client may use :c:func:`VIDIOC_ENUM_FRAMESIZES` to detect supported
   resolutions for a given format, passing desired pixel format in
   :c:type:`v4l2_frmsizeenum`'s ``pixel_format``.

4. Supported profiles and levels for the current ``OUTPUT`` format, if
   applicable, may be queried using their respective controls via
   :c:func:`VIDIOC_QUERYCTRL`.

Initialization
==============

1. Set the coded format on the ``OUTPUT`` queue via :c:func:`VIDIOC_S_FMT`.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

     ``pixelformat``
         a coded pixel format.

     ``width``, ``height``
         coded width and height parsed from the stream.

     other fields
         follow standard semantics.

   .. note::

      Changing the ``OUTPUT`` format may change the currently set ``CAPTURE``
      format.
      The driver will derive a new ``CAPTURE`` format from the
      ``OUTPUT`` format being set, including resolution, colorimetry
      parameters, etc. If the client needs a specific ``CAPTURE`` format,
      it must adjust it afterwards.

2. Call :c:func:`VIDIOC_S_EXT_CTRLS` to set all the controls (parsed headers,
   etc.) required by the ``OUTPUT`` format to enumerate the ``CAPTURE`` formats.

3. Call :c:func:`VIDIOC_G_FMT` for ``CAPTURE`` queue to get the format for the
   destination buffers parsed/decoded from the bytestream.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

   * **Returned fields:**

     ``width``, ``height``
         frame buffer resolution for the decoded frames.

     ``pixelformat``
         pixel format for decoded frames.

     ``num_planes`` (for _MPLANE ``type`` only)
         number of planes for pixelformat.

     ``sizeimage``, ``bytesperline``
         as per standard semantics; matching frame buffer format.

   .. note::

      The value of ``pixelformat`` may be any pixel format supported for the
      ``OUTPUT`` format, based on the hardware capabilities. It is suggested
      that the driver chooses the preferred/optimal format for the current
      configuration. For example, a YUV format may be preferred over an RGB
      format, if an additional conversion step would be required for RGB.

4. *[optional]* Enumerate ``CAPTURE`` formats via :c:func:`VIDIOC_ENUM_FMT` on
   the ``CAPTURE`` queue. The client may use this ioctl to discover which
   alternative raw formats are supported for the current ``OUTPUT`` format and
   select one of them via :c:func:`VIDIOC_S_FMT`.

   .. note::

      The driver will return only formats supported for the currently selected
      ``OUTPUT`` format and currently set controls, even if more formats may be
      supported by the decoder in general.

      For example, a decoder may support YUV and RGB formats for
      resolutions 1920x1088 and lower, but only YUV for higher resolutions (due
      to hardware limitations). After setting a resolution of 1920x1088 or lower
      as the ``OUTPUT`` format, :c:func:`VIDIOC_ENUM_FMT` may return a set of
      YUV and RGB pixel formats, but after setting a resolution higher than
      1920x1088, the driver will not return RGB pixel formats, since they are
      unsupported for this resolution.

5. *[optional]* Choose a different ``CAPTURE`` format than suggested via
   :c:func:`VIDIOC_S_FMT` on ``CAPTURE`` queue. It is possible for the client to
   choose a different format than selected/suggested by the driver in
   :c:func:`VIDIOC_G_FMT`.

   * **Required fields:**

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

     ``pixelformat``
         a raw pixel format.

     ``width``, ``height``
         frame buffer resolution of the decoded stream; typically unchanged from
         what was returned with :c:func:`VIDIOC_G_FMT`, but it may be different
         if the hardware supports composition and/or scaling.

   After performing this step, the client must perform step 3 again in order
   to obtain up-to-date information about the buffers size and layout.

6. Allocate source (bytestream) buffers via :c:func:`VIDIOC_REQBUFS` on
   ``OUTPUT`` queue.

   * **Required fields:**

     ``count``
         requested number of buffers to allocate; greater than zero.

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``OUTPUT``.

     ``memory``
         follows standard semantics.

   * **Returned fields:**

     ``count``
         actual number of buffers allocated.

   * If required, the driver will adjust ``count`` to be equal or bigger to the
     minimum of required number of ``OUTPUT`` buffers for the given format and
     requested count.
     The client must check this value after the ioctl returns
     to get the actual number of buffers allocated.

7. Allocate destination (raw format) buffers via :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue.

   * **Required fields:**

     ``count``
         requested number of buffers to allocate; greater than zero. The client
         is responsible for deducing the minimum number of buffers required
         for the stream to be properly decoded (taking e.g. reference frames
         into account) and pass an equal or bigger number.

     ``type``
         a ``V4L2_BUF_TYPE_*`` enum appropriate for ``CAPTURE``.

     ``memory``
         follows standard semantics. ``V4L2_MEMORY_USERPTR`` is not supported
         for ``CAPTURE`` buffers.

   * **Returned fields:**

     ``count``
         adjusted to allocated number of buffers, in case the codec requires
         more buffers than requested.

   * The driver must adjust count to the minimum of required number of
     ``CAPTURE`` buffers for the current format, stream configuration and
     requested count. The client must check this value after the ioctl
     returns to get the number of buffers allocated.

8. Allocate requests (likely one per ``OUTPUT`` buffer) via
   :c:func:`MEDIA_IOC_REQUEST_ALLOC` on the media device.

9. Start streaming on both ``OUTPUT`` and ``CAPTURE`` queues via
   :c:func:`VIDIOC_STREAMON`.

Decoding
========

For each frame, the client is responsible for submitting at least one request to
which the following is attached:

* The amount of encoded data expected by the codec for its current
  configuration, as a buffer submitted to the ``OUTPUT`` queue. Typically, this
  corresponds to one frame worth of encoded data, but some formats may allow (or
  require) different amounts per unit.
* All the metadata needed to decode the submitted encoded data, in the form of
  controls relevant to the format being decoded.

The amount of data and contents of the source ``OUTPUT`` buffer, as well as the
controls that must be set on the request, depend on the active coded pixel
format and might be affected by codec-specific extended controls, as stated in
documentation of each format.

If there is a possibility that the decoded frame will require one or more
decode requests after the current one in order to be produced, then the client
must set the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag on the ``OUTPUT``
buffer. This will result in the (potentially partially) decoded ``CAPTURE``
buffer not being made available for dequeueing, and reused for the next decode
request if the timestamp of the next ``OUTPUT`` buffer has not changed.

A typical frame would thus be decoded using the following sequence:

1. Queue an ``OUTPUT`` buffer containing one unit of encoded bytestream data for
   the decoding request, using :c:func:`VIDIOC_QBUF`.

   * **Required fields:**

     ``index``
         index of the buffer being queued.

     ``type``
         type of the buffer.

     ``bytesused``
         number of bytes taken by the encoded data frame in the buffer.

     ``flags``
         the ``V4L2_BUF_FLAG_REQUEST_FD`` flag must be set. Additionally, if
         we are not sure that the current decode request is the last one needed
         to produce a fully decoded frame, then
         ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` must also be set.

     ``request_fd``
         must be set to the file descriptor of the decoding request.

     ``timestamp``
         must be set to a unique value per frame. This value will be propagated
         into the decoded frame's buffer and can also be used to use this frame
         as the reference of another. If using multiple decode requests per
         frame, then the timestamps of all the ``OUTPUT`` buffers for a given
         frame must be identical.
         If the timestamp changes, then the currently
         held ``CAPTURE`` buffer will be made available for dequeuing and the
         current request will work on a new ``CAPTURE`` buffer.

2. Set the codec-specific controls for the decoding request, using
   :c:func:`VIDIOC_S_EXT_CTRLS`.

   * **Required fields:**

     ``which``
         must be ``V4L2_CTRL_WHICH_REQUEST_VAL``.

     ``request_fd``
         must be set to the file descriptor of the decoding request.

     other fields
         other fields are set as usual when setting controls. The ``controls``
         array must contain all the codec-specific controls required to decode
         a frame.

   .. note::

      It is possible to specify the controls in different invocations of
      :c:func:`VIDIOC_S_EXT_CTRLS`, or to overwrite a previously set control, as
      long as ``request_fd`` and ``which`` are properly set. The controls state
      at the moment of request submission is the one that will be considered.

   .. note::

      The order in which steps 1 and 2 take place is interchangeable.

3. Submit the request by invoking :c:func:`MEDIA_REQUEST_IOC_QUEUE` on the
   request FD.

   If the request is submitted without an ``OUTPUT`` buffer, or if some of the
   required controls are missing from the request, then
   :c:func:`MEDIA_REQUEST_IOC_QUEUE` will return ``-ENOENT``. If more than one
   ``OUTPUT`` buffer is queued, then it will return ``-EINVAL``.
   :c:func:`MEDIA_REQUEST_IOC_QUEUE` returning non-zero means that no
   ``CAPTURE`` buffer will be produced for this request.

``CAPTURE`` buffers must not be part of the request, and are queued
independently. They are returned in decode order (i.e. the same order as coded
frames were submitted to the ``OUTPUT`` queue).

Runtime decoding errors are signaled by the dequeued ``CAPTURE`` buffers
carrying the ``V4L2_BUF_FLAG_ERROR`` flag.
If a decoded reference frame has an
error, then all following decoded frames that refer to it also have the
``V4L2_BUF_FLAG_ERROR`` flag set, although the decoder will still try to
produce (likely corrupted) frames.

Buffer management while decoding
================================
Contrary to stateful decoders, a stateless decoder does not perform any kind of
buffer management: it only guarantees that dequeued ``CAPTURE`` buffers can be
used by the client for as long as they are not queued again. "Used" here
encompasses using the buffer for compositing or display.

A dequeued capture buffer can also be used as the reference frame of another
buffer.

A frame is specified as reference by converting its timestamp into nanoseconds,
and storing it into the relevant member of a codec-dependent control structure.
The :c:func:`v4l2_timeval_to_ns` function must be used to perform that
conversion. The timestamp of a frame can be used to reference it as soon as all
its units of encoded data are successfully submitted to the ``OUTPUT`` queue.

A decoded buffer containing a reference frame must not be reused as a decoding
target until all the frames referencing it have been decoded. The safest way to
achieve this is to refrain from queueing a reference buffer until all the
decoded frames referencing it have been dequeued. However, if the driver can
guarantee that buffers queued to the ``CAPTURE`` queue are processed in queued
order, then user-space can take advantage of this guarantee and queue a
reference buffer when the following conditions are met:

1. All the requests for frames affected by the reference frame have been
   queued, and

2. A sufficient number of ``CAPTURE`` buffers to cover all the decoded
   referencing frames have been queued.
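
As an illustration of the timestamp-based referencing described in this
section, the following is a minimal sketch of how a client might derive a
unique per-frame timestamp and the nanosecond value used to reference that
frame. Both helper names are hypothetical. ``example_timeval_to_ns`` only
mirrors the arithmetic of :c:func:`v4l2_timeval_to_ns` so that the sketch builds
without the kernel headers; real clients should call the kernel-provided
function. Deriving the timestamp from a frame counter is an assumption, and any
scheme that yields a unique value per frame would do.

```c
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical helper: same arithmetic as v4l2_timeval_to_ns(), written out
 * here only so that the sketch is self-contained.
 */
static uint64_t example_timeval_to_ns(const struct timeval *tv)
{
    return (uint64_t)tv->tv_sec * 1000000000ULL +
           (uint64_t)tv->tv_usec * 1000ULL;
}

/*
 * Hypothetical helper: build a per-frame timestamp from a frame counter,
 * using one microsecond per frame. All OUTPUT buffers that belong to the
 * same frame would be queued with this same value.
 */
static struct timeval example_frame_timestamp(uint64_t frame_index)
{
    struct timeval tv;

    tv.tv_sec = (time_t)(frame_index / 1000000ULL);
    tv.tv_usec = (suseconds_t)(frame_index % 1000000ULL);
    return tv;
}
```

The ``struct timeval`` would go into the ``timestamp`` field of the ``OUTPUT``
buffer, and the nanosecond value into the reference member of the
codec-dependent control (for instance, the ``reference_ts`` field of an H.264
DPB entry) of any later frame that refers to this one.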

When queuing a decoding request, the driver will increase the reference count of
all the resources associated with reference frames. This means that the client
can e.g. close the DMABUF file descriptors of reference frame buffers if it
won't need them afterwards.

Seeking
=======
In order to seek, the client just needs to submit requests using input buffers
corresponding to the new stream position. It must however be aware that
resolution may have changed and follow the dynamic resolution change sequence in
that case. Also depending on the codec used, picture parameters (e.g. SPS/PPS
for H.264) may have changed and the client is responsible for making sure that a
valid state is sent to the decoder.

The client is then free to ignore any returned ``CAPTURE`` buffer that comes
from the pre-seek position.

Pausing
=======

In order to pause, the client can just cease queuing buffers onto the ``OUTPUT``
queue. Without source bytestream data, there is no data to process and the codec
will remain idle.

Dynamic resolution change
=========================

If the client detects a resolution change in the stream, it will need to perform
the initialization sequence again with the new resolution:

1. If the last submitted request resulted in a ``CAPTURE`` buffer being
   held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
   last frame is not available on the ``CAPTURE`` queue. In this case, a
   ``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
   dequeue the held ``CAPTURE`` buffer.

2. Wait until all submitted requests have completed and dequeue the
   corresponding output buffers.

3. Call :c:func:`VIDIOC_STREAMOFF` on both the ``OUTPUT`` and ``CAPTURE``
   queues.

4. Free all ``CAPTURE`` buffers by calling :c:func:`VIDIOC_REQBUFS` on the
   ``CAPTURE`` queue with a buffer count of zero.

5. Perform the initialization sequence again (minus the allocation of
   ``OUTPUT`` buffers), with the new resolution set on the ``OUTPUT`` queue.
   Note that due to resolution constraints, a different format may need to be
   picked on the ``CAPTURE`` queue.

Drain
=====

If the last submitted request resulted in a ``CAPTURE`` buffer being
held by the use of the ``V4L2_BUF_FLAG_M2M_HOLD_CAPTURE_BUF`` flag, then the
last frame is not available on the ``CAPTURE`` queue. In this case, a
``V4L2_DEC_CMD_FLUSH`` command shall be sent. This will make the driver
dequeue the held ``CAPTURE`` buffer.

After that, in order to drain the stream on a stateless decoder, the client
just needs to wait until all the submitted requests are completed.