Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

doc: uapi: Add document describing dma-buf semantics

Since there's a lot of confusion around this, document both the rules
and the best practices around negotiating, allocating, importing, and
using buffers when crossing context/process/device/subsystem boundaries.

This ties up all of dma-buf, formats and modifiers, and their usage.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Sui Jingfeng <suijingfeng@loongson.cn>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20230803154908.105124-4-daniels@collabora.com

Authored by Daniel Stone and committed by Simon Ser
504245a5 09902f3a

+406
+8
Documentation/driver-api/dma-buf.rst
@@ -22,6 +22,14 @@
    allowing implicit (kernel-ordered) synchronization of work to
    preserve the illusion of coherent access
 
+
+Userspace API principles and use
+--------------------------------
+
+For more details on how to design your subsystem's API for dma-buf use, please
+see Documentation/userspace-api/dma-buf-alloc-exchange.rst.
+
+
 Shared DMA Buffers
 ------------------
 
+7
Documentation/gpu/drm-uapi.rst
@@ -486,3 +486,10 @@
 
 .. kernel-doc:: include/uapi/drm/drm_mode.h
    :internal:
+
+
+dma-buf interoperability
+========================
+
+Please see Documentation/userspace-api/dma-buf-alloc-exchange.rst for
+information on how dma-buf is integrated and exposed within DRM.
+389
Documentation/userspace-api/dma-buf-alloc-exchange.rst
@@ -0,0 +1,389 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright 2021-2023 Collabora Ltd.
+
+========================
+Exchanging pixel buffers
+========================
+
+As originally designed, the Linux graphics subsystem had extremely limited
+support for sharing pixel-buffer allocations between processes, devices, and
+subsystems. Modern systems require extensive integration between all three
+classes; this document details how applications and kernel subsystems should
+approach this sharing for two-dimensional image data.
+
+It is written with reference to the DRM subsystem for GPU and display devices,
+V4L2 for media devices, and also to Vulkan, EGL and Wayland, for userspace
+support, however any other subsystems should also follow this design and advice.
+
+
+Glossary of terms
+=================
+
+.. glossary::
+
+    image:
+      Conceptually a two-dimensional array of pixels. The pixels may be stored
+      in one or more memory buffers. Has width and height in pixels, pixel
+      format and modifier (implicit or explicit).
+
+    row:
+      A span along a single y-axis value, e.g. from co-ordinates (0,100) to
+      (200,100).
+
+    scanline:
+      Synonym for row.
+
+    column:
+      A span along a single x-axis value, e.g. from co-ordinates (100,0) to
+      (100,100).
+
+    memory buffer:
+      A piece of memory for storing (parts of) pixel data. Has stride and size
+      in bytes and at least one handle in some API. May contain one or more
+      planes.
+
+    plane:
+      A two-dimensional array of some or all of an image's color and alpha
+      channel values.
+
+    pixel:
+      A picture element. Has a single color value which is defined by one or
+      more color channel values, e.g. R, G and B, or Y, Cb and Cr. May also
+      have an alpha value as an additional channel.
+
+    pixel data:
+      Bytes or bits that represent some or all of the color/alpha channel values
+      of a pixel or an image. The data for one pixel may be spread over several
+      planes or memory buffers depending on format and modifier.
+
+    color value:
+      A tuple of numbers, representing a color. Each element in the tuple is a
+      color channel value.
+
+    color channel:
+      One of the dimensions in a color model. For example, RGB model has
+      channels R, G, and B. Alpha channel is sometimes counted as a color
+      channel as well.
+
+    pixel format:
+      A description of how pixel data represents the pixel's color and alpha
+      values.
+
+    modifier:
+      A description of how pixel data is laid out in memory buffers.
+
+    alpha:
+      A value that denotes the color coverage in a pixel. Sometimes used for
+      translucency instead.
+
+    stride:
+      A value that denotes the relationship between pixel-location co-ordinates
+      and byte-offset values. Typically used as the byte offset between two
+      pixels at the start of vertically-consecutive tiling blocks. For linear
+      layouts, the byte offset between two vertically-adjacent pixels. For
+      non-linear formats the stride must be computed in a consistent way, which
+      usually is done as-if the layout was linear.
+
+    pitch:
+      Synonym for stride.
+
+
+Formats and modifiers
+=====================
+
+Each buffer must have an underlying format. This format describes the color
+values provided for each pixel. Although each subsystem has its own format
+descriptions (e.g. V4L2 and fbdev), the ``DRM_FORMAT_*`` tokens should be reused
+wherever possible, as they are the standard descriptions used for interchange.
+These tokens are described in the ``drm_fourcc.h`` file, which is a part of
+DRM's uAPI.
+
+Each ``DRM_FORMAT_*`` token describes the translation between a pixel
+co-ordinate in an image, and the color values for that pixel contained within
+its memory buffers. The number and type of color channels are described:
+whether they are RGB or YUV, integer or floating-point, the size of each channel
+and their locations within the pixel memory, and the relationship between color
+planes.
+
+For example, ``DRM_FORMAT_ARGB8888`` describes a format in which each pixel has
+a single 32-bit value in memory. Alpha, red, green, and blue, color channels are
+available at 8-bit precision per channel, ordered respectively from most to
+least significant bits in little-endian storage. ``DRM_FORMAT_*`` is not
+affected by either CPU or device endianness; the byte pattern in memory is
+always as described in the format definition, which is usually little-endian.
+
+As a more complex example, ``DRM_FORMAT_NV12`` describes a format in which luma
+and chroma YUV samples are stored in separate planes, where the chroma plane is
+stored at half the resolution in both dimensions (i.e. one U/V chroma
+sample is stored for each 2x2 pixel grouping).
+
+Format modifiers describe a translation mechanism between these per-pixel memory
+samples, and the actual memory storage for the buffer. The most straightforward
+modifier is ``DRM_FORMAT_MOD_LINEAR``, describing a scheme in which each plane
+is laid out row-sequentially, from the top-left to the bottom-right corner.
+This is considered the baseline interchange format, and most convenient for CPU
+access.
+
+Modern hardware employs much more sophisticated access mechanisms, typically
+making use of tiled access and possibly also compression.
+For example, the
+``DRM_FORMAT_MOD_VIVANTE_TILED`` modifier describes memory storage where pixels
+are stored in 4x4 blocks arranged in row-major ordering, i.e. the first tile in
+a plane stores pixels (0,0) to (3,3) inclusive, and the second tile in a plane
+stores pixels (4,0) to (7,3) inclusive.
+
+Some modifiers may modify the number of planes required for an image; for
+example, the ``I915_FORMAT_MOD_Y_TILED_CCS`` modifier adds a second plane to RGB
+formats in which it stores data about the status of every tile, notably
+including whether the tile is fully populated with pixel data, or can be
+expanded from a single solid color.
+
+These extended layouts are highly vendor-specific, and even specific to
+particular generations or configurations of devices per-vendor. For this reason,
+support of modifiers must be explicitly enumerated and negotiated by all users
+in order to ensure a compatible and optimal pipeline, as discussed below.
+
+
+Dimensions and size
+===================
+
+Each pixel buffer must be accompanied by logical pixel dimensions. This refers
+to the number of unique samples which can be extracted from, or stored to, the
+underlying memory storage. For example, even though a 1920x1080
+``DRM_FORMAT_NV12`` buffer has a luma plane containing 1920x1080 samples for the Y
+component, and 960x540 samples for the U and V components, the overall buffer is
+still described as having dimensions of 1920x1080.
+
+The in-memory storage of a buffer is not guaranteed to begin immediately at the
+base address of the underlying memory, nor is it guaranteed that the memory
+storage is tightly clipped to either dimension.
+
+Each plane must therefore be described with an ``offset`` in bytes, which will be
+added to the base address of the memory storage before performing any per-pixel
+calculations.
+This may be used to combine multiple planes into a single memory
+buffer; for example, ``DRM_FORMAT_NV12`` may be stored in a single memory buffer
+where the luma plane's storage begins immediately at the start of the buffer
+with an offset of 0, and the chroma plane's storage follows within the same buffer
+beginning from the byte offset for that plane.
+
+Each plane must also have a ``stride`` in bytes, expressing the offset in memory
+between two contiguous rows. For example, a ``DRM_FORMAT_MOD_LINEAR`` buffer
+with dimensions of 1000x1000 may have been allocated as if it were 1024x1000, in
+order to allow for aligned access patterns. In this case, the buffer will still
+be described with a width of 1000, however the stride will be ``1024 * bpp``,
+indicating that there are 24 pixels at the positive extreme of the x axis whose
+values are not significant.
+
+Buffers may also be padded further in the y dimension, simply by allocating a
+larger area than would ordinarily be required. For example, many media decoders
+are not able to natively output buffers of height 1080, but instead require an
+effective height of 1088 pixels. In this case, the buffer continues to be
+described as having a height of 1080, with the memory allocation for each buffer
+being increased to account for the extra padding.
+
+
+Enumeration
+===========
+
+Every user of pixel buffers must be able to enumerate a set of supported formats
+and modifiers, described together. Within KMS, this is achieved with the
+``IN_FORMATS`` property on each DRM plane, listing the supported DRM formats, and
+the modifiers supported for each format.
+In userspace, this is supported through
+the `EGL_EXT_image_dma_buf_import_modifiers`_ extension entrypoints for EGL, the
+`VK_EXT_image_drm_format_modifier`_ extension for Vulkan, and the
+`zwp_linux_dmabuf_v1`_ extension for Wayland.
+
+Each of these interfaces allows users to query a set of supported
+format+modifier combinations.
+
+
+Negotiation
+===========
+
+It is the responsibility of userspace to negotiate an acceptable format+modifier
+combination for its usage. This is performed through a simple intersection of
+lists. For example, if a user wants to use Vulkan to render an image to be
+displayed on a KMS plane, it must:
+
+  - query KMS for the ``IN_FORMATS`` property for the given plane
+  - query Vulkan for the supported formats for its physical device, making sure
+    to pass the ``VkImageUsageFlagBits`` and ``VkImageCreateFlagBits``
+    corresponding to the intended rendering use
+  - intersect these formats to determine the most appropriate one
+  - for this format, intersect the lists of supported modifiers for both KMS and
+    Vulkan, to obtain a final list of acceptable modifiers for that format
+
+This intersection must be performed for all usages. For example, if the user
+also wishes to encode the image to a video stream, it must query the media API
+it intends to use for encoding for the set of modifiers it supports, and
+additionally intersect against this list.
+
+If the intersection of all lists is an empty list, it is not possible to share
+buffers in this way, and an alternate strategy must be considered (e.g. using
+CPU access routines to copy data between the different uses, with the
+corresponding performance cost).
+
+The resulting modifier list is unsorted; the order is not significant.
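The intersection described in the Negotiation section can be sketched in a few lines. This is only an illustration in Python, not part of the patch: the `negotiate` helper, the dictionary shape (format name mapped to a set of modifiers), and the vendor modifier values are all hypothetical stand-ins for what KMS `IN_FORMATS`, Vulkan, and a media API would actually report.

```python
from functools import reduce


def negotiate(*users):
    """Intersect the format -> modifier-set maps advertised by every user.

    Each argument is a hypothetical dict mapping a format token to the set
    of modifiers that user supports for it. The result keeps only formats
    supported by all users, each with its non-empty common modifier set.
    """
    if not users:
        return {}
    common_formats = reduce(set.intersection, (set(u) for u in users))
    result = {}
    for fmt in common_formats:
        mods = reduce(set.intersection, (set(u[fmt]) for u in users))
        if mods:  # an empty set means this format cannot be shared
            result[fmt] = mods
    return result


# Illustrative values only; these are not real enumeration results.
LINEAR = 0                          # DRM_FORMAT_MOD_LINEAR
TILED_A, TILED_B = 0x1001, 0x1002   # made-up vendor modifiers

kms = {"XRGB8888": {LINEAR, TILED_A}, "NV12": {LINEAR}}
vulkan = {"XRGB8888": {LINEAR, TILED_A, TILED_B}, "ABGR16161616F": {TILED_B}}
encoder = {"XRGB8888": {LINEAR}, "NV12": {LINEAR}}

shared = negotiate(kms, vulkan, encoder)
```

Sets are used because the resulting modifier list carries no ordering. An empty result corresponds to the case where buffers cannot be shared directly and a copy is needed.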
+
+
+Allocation
+==========
+
+Once userspace has determined an appropriate format, and corresponding list of
+acceptable modifiers, it must allocate the buffer. As there is no universal
+buffer-allocation interface available at either kernel or userspace level, the
+client makes an arbitrary choice of allocation interface such as Vulkan, GBM, or
+a media API.
+
+Each allocation request must take, at a minimum: the pixel format, a list of
+acceptable modifiers, and the buffer's width and height. Each API may extend
+this set of properties in different ways, such as allowing allocation in more
+than two dimensions, intended usage patterns, etc.
+
+The component which allocates the buffer will make an arbitrary choice of what
+it considers the 'best' modifier within the acceptable list for the requested
+allocation, any padding required, and further properties of the underlying
+memory buffers such as whether they are stored in system or device-specific
+memory, whether or not they are physically contiguous, and their cache mode.
+These properties of the memory buffer are not visible to userspace, however the
+``dma-heaps`` API is an effort to address this.
+
+After allocation, the client must query the allocator to determine the actual
+modifier selected for the buffer, as well as the per-plane offset and stride.
+Allocators are not permitted to vary the format in use, to select a modifier not
+provided within the acceptable list, nor to vary the pixel dimensions other than
+the padding expressed through offset, stride, and size.
+
+Communicating additional constraints, such as alignment of stride or offset,
+placement within a particular memory area, etc, is out of scope of dma-buf,
+and is not solved by format and modifier tokens.
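To make the per-plane offset and stride returned by a post-allocation query more concrete, here is a rough Python sketch of how an allocator might lay out a linear ``DRM_FORMAT_NV12`` image in a single memory buffer. The function name and the alignment values (64 bytes for stride, 16 rows for height) are assumptions chosen for illustration; real allocators apply their own constraints, which is why clients must query rather than compute these values.

```python
def align(value, alignment):
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def linear_nv12_layout(width, height, stride_align=64, height_align=16):
    """Sketch a single-buffer, linear NV12 layout.

    stride_align and height_align are hypothetical allocator constraints.
    The logical width and height are returned unchanged; padding shows up
    only in stride, offset, and size.
    """
    # Luma plane: one byte per pixel, rows padded to the stride alignment.
    stride = align(width, stride_align)
    padded_height = align(height, height_align)
    luma_size = stride * padded_height

    # Chroma plane: interleaved U/V at half resolution in both dimensions,
    # so each row has width/2 two-byte samples and fits the luma stride.
    chroma_rows = padded_height // 2

    return {
        "width": width,
        "height": height,
        "planes": [
            {"offset": 0, "stride": stride},
            {"offset": luma_size, "stride": stride},
        ],
        "size": luma_size + stride * chroma_rows,
    }


layout = linear_nv12_layout(1920, 1080)
```

With these assumed alignments, a 1920x1080 request is padded to 1088 rows internally while still being described as 1080 high, and a 1000-pixel-wide luma plane gets a 1024-byte stride, matching the padding cases described under "Dimensions and size".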
+
+
+Import
+======
+
+To use a buffer within a different context, device, or subsystem, the user
+passes these parameters (format, modifier, width, height, and per-plane offset
+and stride) to an importing API.
+
+Each memory buffer is referred to by a buffer handle, which may be unique or
+duplicated within an image. For example, a ``DRM_FORMAT_NV12`` buffer may have
+the luma and chroma buffers combined into a single memory buffer by use of the
+per-plane offset parameters, or they may be completely separate allocations in
+memory. For this reason, each import and allocation API must provide a separate
+handle for each plane.
+
+Each kernel subsystem has its own types and interfaces for buffer management.
+DRM uses GEM buffer objects (BOs), V4L2 has its own references, etc. These types
+are not portable between contexts, processes, devices, or subsystems.
+
+To address this, ``dma-buf`` handles are used as the universal interchange for
+buffers. Subsystem-specific operations are used to export native buffer handles
+to a ``dma-buf`` file descriptor, and to import those file descriptors into a
+native buffer handle. dma-buf file descriptors can be transferred between
+contexts, processes, devices, and subsystems.
+
+For example, a Wayland media player may use V4L2 to decode a video frame into a
+``DRM_FORMAT_NV12`` buffer. This will result in two memory planes (luma and
+chroma) being dequeued by the user from V4L2. These planes are then exported to
+one dma-buf file descriptor per plane, these descriptors are then sent along
+with the metadata (format, modifier, width, height, per-plane offset and stride)
+to the Wayland server.
+The Wayland server will then import these file
+descriptors as an EGLImage for use through EGL/OpenGL (ES), a VkImage for use
+through Vulkan, or a KMS framebuffer object; each of these import operations
+will take the same metadata and convert the dma-buf file descriptors into their
+native buffer handles.
+
+Having a non-empty intersection of supported modifiers does not guarantee that
+import will succeed into all consumers; they may have constraints beyond those
+implied by modifiers which must be satisfied.
+
+
+Implicit modifiers
+==================
+
+The concept of modifiers post-dates all of the subsystems mentioned above. As
+such, it has been retrofitted into all of these APIs, and in order to ensure
+backwards compatibility, support is needed for drivers and userspace which do
+not (yet) support modifiers.
+
+As an example, GBM is used to allocate buffers to be shared between EGL for
+rendering and KMS for display. It has two entrypoints for allocating buffers:
+``gbm_bo_create`` which only takes the format, width, height, and a usage token,
+and ``gbm_bo_create_with_modifiers`` which extends this with a list of modifiers.
+
+In the latter case, the allocation is as discussed above, being provided with a
+list of acceptable modifiers that the implementation can choose from (or fail if
+it is not possible to allocate within those constraints). In the former case
+where modifiers are not provided, the GBM implementation must make its own
+choice as to what is likely to be the 'best' layout. Such a choice is entirely
+implementation-specific: some will internally use tiled layouts which are not
+CPU-accessible if the implementation decides that is a good idea through
+whatever heuristic. It is the implementation's responsibility to ensure that
+this choice is appropriate.
+
+To support this case where the layout is not known because there is no awareness
+of modifiers, a special ``DRM_FORMAT_MOD_INVALID`` token has been defined. This
+pseudo-modifier declares that the layout is not known, and that the driver
+should use its own logic to determine what the underlying layout may be.
+
+.. note::
+
+  ``DRM_FORMAT_MOD_INVALID`` is a non-zero value. The modifier value zero is
+  ``DRM_FORMAT_MOD_LINEAR``, which is an explicit guarantee that the image
+  has the linear layout. Care and attention should be taken to ensure that
+  zero as a default value is not mixed up with either no modifier or the linear
+  modifier. Also note that in some APIs the invalid modifier value is specified
+  with an out-of-band flag, like in ``DRM_IOCTL_MODE_ADDFB2``.
+
+There are four cases where this token may be used:
+  - during enumeration, an interface may return ``DRM_FORMAT_MOD_INVALID``, either
+    as the sole member of a modifier list to declare that explicit modifiers are
+    not supported, or as part of a larger list to declare that implicit modifiers
+    may be used
+  - during allocation, a user may supply ``DRM_FORMAT_MOD_INVALID``, either as the
+    sole member of a modifier list (equivalent to not supplying a modifier list
+    at all) to declare that explicit modifiers are not supported and must not be
+    used, or as part of a larger list to declare that an allocation using implicit
+    modifiers is acceptable
+  - in a post-allocation query, an implementation may return
+    ``DRM_FORMAT_MOD_INVALID`` as the modifier of the allocated buffer to declare
+    that the underlying layout is implementation-defined and that an explicit
+    modifier description is not available; per the above rules, this may only be
+    returned when the user has included ``DRM_FORMAT_MOD_INVALID`` as part of the
+    list of acceptable modifiers, or not provided a list
+  - when importing a buffer, the user may supply ``DRM_FORMAT_MOD_INVALID`` as the
+    buffer modifier (or not supply a modifier) to indicate that the modifier is
+    unknown for whatever reason; this is only acceptable when the buffer has
+    not been allocated with an explicit modifier
+
+It follows from this that for any single buffer, the complete chain of operations
+formed by the producer and all the consumers must be either fully implicit or fully
+explicit. For example, if a user wishes to allocate a buffer for use between
+GPU, display, and media, but the media API does not support modifiers, then the
+user **must not** allocate the buffer with explicit modifiers and attempt to
+import the buffer into the media API with no modifier, but either perform the
+allocation using implicit modifiers, or allocate the buffer for media use
+separately and copy between the two buffers.
+
+As one exception to the above, allocations may be 'upgraded' from implicit
+to explicit modifiers. For example, if the buffer is allocated with
+``gbm_bo_create`` (taking no modifiers), the user may then query the modifier with
+``gbm_bo_get_modifier`` and then use this modifier as an explicit modifier token
+if a valid modifier is returned.
+
+When allocating buffers for exchange between different users and modifiers are
+not available, implementations are strongly encouraged to use
+``DRM_FORMAT_MOD_LINEAR`` for their allocation, as this is the universal baseline
+for exchange. However, it is not guaranteed that this will result in the correct
+interpretation of buffer content, as implicit modifier operation may still be
+subject to driver-specific heuristics.
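The rules around ``DRM_FORMAT_MOD_INVALID`` are easy to get subtly wrong, particularly because the linear modifier is zero. The Python sketch below models three of them under one possible strict reading of the text. The helper names are hypothetical and do not correspond to any real API. The two constants carry the values defined in ``drm_fourcc.h``.

```python
DRM_FORMAT_MOD_LINEAR = 0x0
DRM_FORMAT_MOD_INVALID = 0x00FFFFFFFFFFFFFF  # non-zero by design


def normalize_request(modifiers=None):
    """Not supplying a modifier list is equivalent to [INVALID]."""
    if modifiers is None:
        return [DRM_FORMAT_MOD_INVALID]
    return list(modifiers)


def check_reported_modifier(requested, reported):
    """Check a post-allocation query result against the request.

    Strict reading, assumed for this sketch: INVALID may only be reported
    if it was acceptable; a purely implicit request may be 'upgraded' to
    any explicit modifier; otherwise the result must come from the list.
    """
    acceptable = normalize_request(requested)
    if reported == DRM_FORMAT_MOD_INVALID:
        return DRM_FORMAT_MOD_INVALID in acceptable
    if acceptable == [DRM_FORMAT_MOD_INVALID]:
        return True  # implicit allocation, layout revealed by the query
    return reported in acceptable


def import_modifier_ok(allocated_explicitly, supplied=None):
    """An implicit import is only acceptable for a buffer that was not
    allocated with an explicit modifier."""
    # Compare against None, not truthiness: 0 is DRM_FORMAT_MOD_LINEAR.
    if supplied is None:
        supplied = DRM_FORMAT_MOD_INVALID
    if supplied == DRM_FORMAT_MOD_INVALID:
        return not allocated_explicitly
    return True  # explicit import; assumed to use the queried modifier
```

The `supplied is None` comparison is deliberate. A check such as `if not supplied` would treat an explicit linear modifier as "no modifier", which is exactly the mix-up the note on zero default values warns about.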
+
+Any new users - userspace programs and protocols, kernel subsystems, etc -
+wishing to exchange buffers must offer interoperability through dma-buf file
+descriptors for memory planes, DRM format tokens to describe the format, DRM
+format modifiers to describe the layout in memory, at least width and height for
+dimensions, and at least offset and stride for each memory plane.
+
+.. _zwp_linux_dmabuf_v1: https://gitlab.freedesktop.org/wayland/wayland-protocols/-/blob/main/unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml
+.. _VK_EXT_image_drm_format_modifier: https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_EXT_image_drm_format_modifier.html
+.. _EGL_EXT_image_dma_buf_import_modifiers: https://registry.khronos.org/EGL/extensions/EXT/EGL_EXT_image_dma_buf_import_modifiers.txt
+1
Documentation/userspace-api/index.rst
@@ -22,6 +22,7 @@
    unshare
    spec_ctrl
    accelerators/ocxl
+   dma-buf-alloc-exchange
    ebpf/index
    ELF
    ioctl/index
+1
MAINTAINERS
@@ -6106,6 +6106,7 @@
 S:	Maintained
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 F:	Documentation/driver-api/dma-buf.rst
+F:	Documentation/userspace-api/dma-buf-alloc-exchange.rst
 F:	drivers/dma-buf/
 F:	include/linux/*fence.h
 F:	include/linux/dma-buf.h