Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amd: Add per-ring reset for vcn v4.0.5 use

On VCN 4.0.5, a job can time out in some situations. The timeout then
causes a GPU reset for recovery. That has exposed a number of issues
with GPU reset that have since been fixed. However, a full GPU reset
isn't actually needed in this circumstance; restarting the ring is
enough.

Add a reset callback for the ring which will stop and start VCN if the
issue happens.

Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/12528
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3909
Link: https://lore.kernel.org/r/20250506204948.12048-2-mario.limonciello@amd.com
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Authored by Mario Limonciello and committed by Alex Deucher
d1a46cdd 5c937b4a

+22
drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c
···
 			adev->vcn.inst[i].pause_dpg_mode = vcn_v4_0_5_pause_dpg_mode;
 	}
 
+	adev->vcn.supported_reset = amdgpu_get_soft_full_reset_mask(&adev->vcn.inst[0].ring_enc[0]);
+	adev->vcn.supported_reset |= AMDGPU_RESET_TYPE_PER_QUEUE;
+
+	r = amdgpu_vcn_sysfs_reset_mask_init(adev);
+	if (r)
+		return r;
+
 	if (amdgpu_sriov_vf(adev)) {
 		r = amdgpu_virt_alloc_mm_table(adev);
 		if (r)
···
 	}
 }
 
+static int vcn_v4_0_5_ring_reset(struct amdgpu_ring *ring, unsigned int vmid)
+{
+	struct amdgpu_device *adev = ring->adev;
+	struct amdgpu_vcn_inst *vinst = &adev->vcn.inst[ring->me];
+
+	if (!(adev->vcn.supported_reset & AMDGPU_RESET_TYPE_PER_QUEUE))
+		return -EOPNOTSUPP;
+
+	vcn_v4_0_5_stop(vinst);
+	vcn_v4_0_5_start(vinst);
+
+	return amdgpu_ring_test_helper(ring);
+}
+
 static struct amdgpu_ring_funcs vcn_v4_0_5_unified_ring_vm_funcs = {
 	.type = AMDGPU_RING_TYPE_VCN_ENC,
 	.align_mask = 0x3f,
···
 	.emit_wreg = vcn_v2_0_enc_ring_emit_wreg,
 	.emit_reg_wait = vcn_v2_0_enc_ring_emit_reg_wait,
 	.emit_reg_write_reg_wait = amdgpu_ring_emit_reg_write_reg_wait_helper,
+	.reset = vcn_v4_0_5_ring_reset,
 };
 
 /**
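
The control flow of the new reset callback (check the capability mask, stop the engine, start it again, then run a ring test) can be tried out in user space with a small model. This is only a sketch under stated assumptions: every `model_*` name, the bit values, and the idea that a restart clears the hang are hypothetical stand-ins for illustration, not the amdgpu driver's API or behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reset-type bits; the values are illustrative only. */
enum model_reset_type {
	MODEL_RESET_TYPE_FULL      = 1 << 0,
	MODEL_RESET_TYPE_PER_QUEUE = 1 << 1,
};

/* Hypothetical stand-in for one VCN instance and its ring. */
struct model_engine {
	unsigned int supported_reset; /* bitmask of model_reset_type */
	bool running;
	bool hung;                    /* set when a job has timed out */
	unsigned int restarts;        /* number of stop/start cycles */
};

static void model_engine_stop(struct model_engine *e)
{
	e->running = false;
}

/* Assumption of this model: a fresh start clears the hang. */
static void model_engine_start(struct model_engine *e)
{
	e->running = true;
	e->hung = false;
	e->restarts++;
}

/* Stand-in for a ring test: succeeds only on a running, non-hung engine. */
static int model_ring_test(const struct model_engine *e)
{
	return (e->running && !e->hung) ? 0 : -ETIMEDOUT;
}

/* Same shape as the callback in the patch: gate on the mask, restart, test. */
static int model_ring_reset(struct model_engine *e)
{
	assert(e != NULL);

	if (!(e->supported_reset & MODEL_RESET_TYPE_PER_QUEUE))
		return -EOPNOTSUPP;

	model_engine_stop(e);
	model_engine_start(e);

	return model_ring_test(e);
}
```

In this model, calling `model_ring_reset()` on a hung engine whose mask includes `MODEL_RESET_TYPE_PER_QUEUE` restarts it once and returns the ring test result, while an engine without that bit is left untouched and the call returns `-EOPNOTSUPP`, leaving the caller free to fall back to a heavier recovery path.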