drm/ttm: Allow drivers to specify maximum beneficial TTM pool size

Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

GPUs typically benefit from contiguous memory through reduced TLB pressure
and improved caching performance. The maximum size of a contiguous block
that still adds a performance benefit depends on the hardware design.

By default the TTM pool allocator tries (hard) to allocate blocks of up to
the system MAX_PAGE_ORDER. This limit varies by CPU platform and can also
be configured via Kconfig.
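
To put the order limit in perspective, a block of order n spans 2^n pages. The following is a minimal userspace sketch of that arithmetic; `block_bytes` is a hypothetical helper for illustration, not a kernel function, and the page size is passed in as an assumption.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel function: size in bytes of one
 * contiguous block of the given page order.
 */
static size_t block_bytes(unsigned int order, size_t page_size)
{
	/* Guard against shifting past the width of size_t. */
	assert(order < sizeof(size_t) * 8);
	return page_size << order;
}
```

Assuming 4 KiB pages, `block_bytes(10, 4096)` gives 4 MiB and `block_bytes(9, 4096)` gives 2 MiB.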

If that limit is set higher than what the GPU can make additional use of,
let's allow individual drivers to tell TTM above which allocation order
the pool allocator can afford to make a little less effort.

We implement this by disabling direct reclaim for those allocations, which
reduces the allocation latency and lowers the demands on the page
allocator, in cases where expending this effort is not critical for the
GPU in question.
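
The decision described above can be modelled in a few lines of userspace C. This is only a sketch: the `MODEL_*` bit values and `model_adjust_gfp` are stand-ins invented for illustration and do not match the kernel's real GFP flag values.

```c
#include <assert.h>

/* Stand-in bits for illustration only; not the kernel's GFP values. */
#define MODEL_GFP_DIRECT_RECLAIM	(1u << 0)
#define MODEL_GFP_OTHER			(1u << 1)

/*
 * Hypothetical model of the check: drop direct reclaim only when the
 * driver supplied a beneficial order (non-zero) and the requested
 * order exceeds it.
 */
static unsigned int model_adjust_gfp(unsigned int gfp, unsigned int order,
				     unsigned int beneficial_order)
{
	if (beneficial_order && order > beneficial_order)
		gfp &= ~MODEL_GFP_DIRECT_RECLAIM;

	return gfp;
}
```

With a beneficial order of 9, an order 10 request loses the reclaim bit while an order 9 request keeps it. A beneficial order of 0 means the driver gave no hint, so nothing changes.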

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
Link: https://lore.kernel.org/r/20251020115411.36818-5-tvrtko.ursulin@igalia.com

Authored and committed by Tvrtko Ursulin 7 months ago 7e9c548d 77e19f8d

+16 -2

3 changed files

drivers/gpu/drm/ttm/ttm_pool.c
drivers/gpu/drm/ttm/ttm_pool_internal.h
include/drm/ttm/ttm_allocation.h

drivers/gpu/drm/ttm/ttm_pool.c

···
 static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
 					unsigned int order)
 {
+	const unsigned int beneficial_order = ttm_pool_beneficial_order(pool);
 	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
 	struct ttm_pool_dma *dma;
 	struct page *p;
···
 	if (order)
 		gfp_flags |= __GFP_NOMEMALLOC | __GFP_NORETRY | __GFP_NOWARN |
 			__GFP_THISNODE;
+
+	/*
+	 * Do not add latency to the allocation path for allocation orders
+	 * the device told us do not bring additional performance gains.
+	 */
+	if (beneficial_order && order > beneficial_order)
+		gfp_flags &= ~__GFP_DIRECT_RECLAIM;
 
 	if (!ttm_pool_uses_dma_alloc(pool)) {
 		p = alloc_pages_node(pool->nid, gfp_flags, order);

drivers/gpu/drm/ttm/ttm_pool_internal.h

···
 	return pool->alloc_flags & TTM_ALLOCATION_POOL_USE_DMA32;
 }
 
+static inline unsigned int ttm_pool_beneficial_order(struct ttm_pool *pool)
+{
+	return pool->alloc_flags & 0xff;
+}
+
 #endif

+3 -2

include/drm/ttm/ttm_allocation.h

···
 #ifndef _TTM_ALLOCATION_H_
 #define _TTM_ALLOCATION_H_
 
-#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(0) /* Use coherent DMA allocations. */
-#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(1) /* Use GFP_DMA32 allocations. */
+#define TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n)	((n) & 0xff) /* Max order which caller can benefit from */
+#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(8) /* Use coherent DMA allocations. */
+#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9) /* Use GFP_DMA32 allocations. */
 
 #endif
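
As a rough illustration of the new layout, the beneficial order occupies the low 8 bits of the allocation flags and the existing boolean flags move up to bits 8 and 9. The sketch below is userspace C: the three `TTM_ALLOCATION_POOL_*` macros are copied from the header in this commit, `BIT()` is redefined locally, and `example_driver_flags` and `decode_beneficial_order` are hypothetical helpers. The order value 9 is an assumption chosen for the example, not a recommendation for any particular GPU.

```c
#include <assert.h>

/* Local stand-in for the kernel's BIT() macro. */
#define BIT(n)	(1UL << (n))

/* Copied from include/drm/ttm/ttm_allocation.h as changed by this commit. */
#define TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(n)	((n) & 0xff)
#define TTM_ALLOCATION_POOL_USE_DMA_ALLOC	BIT(8)
#define TTM_ALLOCATION_POOL_USE_DMA32		BIT(9)

/* Hypothetical driver-side helper: build flags with an assumed order of 9. */
static unsigned long example_driver_flags(void)
{
	return TTM_ALLOCATION_POOL_USE_DMA32 |
	       TTM_ALLOCATION_POOL_BENEFICIAL_ORDER(9);
}

/* Hypothetical pool-side helper: recover the order from the low 8 bits. */
static unsigned int decode_beneficial_order(unsigned long alloc_flags)
{
	return alloc_flags & 0xff;
}
```

Here `example_driver_flags()` evaluates to 0x209, and `decode_beneficial_order()` recovers 9 from it without disturbing the flag bits above.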
