Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amdgpu: rework how we handle TLB fences

Add a new VM flag to indicate whether a TLB fence is needed.
User queues (KFD or KGD) require a TLB fence. Kernel queues
do not strictly require one, although it should not hurt.
Enabling the fence unconditionally should therefore be fine,
but it seems to trigger some issues in KIQ/MES. Only enable
it for KFD, or when KGD user queues are enabled (currently
via a module parameter).

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4749
Fixes: f3854e04b708 ("drm/amdgpu: attach tlb fence to the PTs update")
Cc: Christian König <christian.koenig@amd.com>
Cc: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 69c5fbd2b93b5ced77c6e79afe83371bca84c788)
Cc: stable@vger.kernel.org

+8 -1
+6 -1
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
···
 	}
 
 	/* Prepare a TLB flush fence to be attached to PTs */
-	if (!params->unlocked) {
+	/* The check for need_tlb_fence should be dropped once we
+	 * sort out the issues with KIQ/MES TLB invalidation timeouts.
+	 */
+	if (!params->unlocked && vm->need_tlb_fence) {
 		amdgpu_vm_tlb_fence_create(params->adev, vm, fence);
 
 		/* Makes sure no PD/PT is freed before the flush */
···
 	ttm_lru_bulk_move_init(&vm->lru_bulk_move);
 
 	vm->is_compute_context = false;
+	vm->need_tlb_fence = amdgpu_userq_enabled(&adev->ddev);
 
 	vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
 				    AMDGPU_VM_USE_CPU_FOR_GFX);
···
 	dma_fence_put(vm->last_update);
 	vm->last_update = dma_fence_get_stub();
 	vm->is_compute_context = true;
+	vm->need_tlb_fence = true;
 
 unreserve_bo:
 	amdgpu_bo_unreserve(vm->root.bo);
+2
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
···
 	struct ttm_lru_bulk_move lru_bulk_move;
 	/* Flag to indicate if VM is used for compute */
 	bool is_compute_context;
+	/* Flag to indicate if VM needs a TLB fence (KFD or KGD) */
+	bool need_tlb_fence;
 
 	/* Memory partition number, -1 means any partition */
 	int8_t mem_id;
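
The flag's lifecycle can be sketched outside the kernel as a small standalone model. This is an illustration only, not driver code: `struct vm_model` and the `vm_model_*` helpers are hypothetical names, the `userq_enabled` argument stands in for the result of `amdgpu_userq_enabled()`, and `unlocked` stands in for `params->unlocked`. The block defines helpers only and has no `main`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct amdgpu_vm; only the two flags matter here. */
struct vm_model {
	bool is_compute_context;
	bool need_tlb_fence;
};

/* Models VM init: a non-compute VM needs the fence only if user queues are on. */
void vm_model_init(struct vm_model *vm, bool userq_enabled)
{
	vm->is_compute_context = false;
	vm->need_tlb_fence = userq_enabled;
}

/* Models conversion to a compute (KFD) VM: the fence is always needed. */
void vm_model_make_compute(struct vm_model *vm)
{
	vm->is_compute_context = true;
	vm->need_tlb_fence = true;
}

/* Models the guard in the update path: true if a TLB fence would be created. */
bool vm_model_update_creates_fence(const struct vm_model *vm, bool unlocked)
{
	return !unlocked && vm->need_tlb_fence;
}

/* Walks through the cases named in the commit message. */
void vm_model_self_check(void)
{
	struct vm_model vm;

	/* Kernel queues only, user queues disabled: no fence. */
	vm_model_init(&vm, false);
	assert(!vm_model_update_creates_fence(&vm, false));

	/* Same VM converted to compute: fence on locked updates only. */
	vm_model_make_compute(&vm);
	assert(vm_model_update_creates_fence(&vm, false));
	assert(!vm_model_update_creates_fence(&vm, true));

	/* User queues enabled at init: fence even without compute. */
	vm_model_init(&vm, true);
	assert(vm_model_update_creates_fence(&vm, false));
}
```

In this model, as in the patch, the decision is stored on the VM once, at init or at compute conversion, so the update path only has to test a single boolean alongside the existing `unlocked` check.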