Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amd/pm/smu7: Add SCLK cap for quirky Hawaii board

On a specific Radeon R9 390X board, the GPU can "randomly" hang
while gaming. Initially I thought this was a RADV bug and tried
to work around this in Mesa:
commit 8ea08747b86b ("radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR")

However, I got some feedback from other users who are reporting
that the above mitigation causes a significant performance
regression for them, and they didn't experience the hang on their
GPU in the first place.

After some further investigation, it turns out that the problem
is that the highest SCLK DPM level on this board isn't stable.
Lowering SCLK to 1040 MHz (from 1070 MHz) works around the issue,
and has a negligible impact on performance compared to the Mesa
patch. (Note that increasing the voltage can also work around it,
but we felt that lowering the SCLK is the safer option.)

To solve the above issue, add an "sclk_cap" field to smu7_hwmgr
and set this field for the affected board. The capped SCLK value
correctly appears on the sysfs interface and shows up in GUI
tools such as LACT.

Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

authored by Timur Kristóf and committed by Alex Deucher
4724bc5b baf28ec5

+27 -4
+26 -4
drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
···
 		hwmgr->dyn_state.vddc_dependency_on_mclk;
 	struct phm_cac_leakage_table *std_voltage_table =
 		hwmgr->dyn_state.cac_leakage_table;
-	uint32_t i;
+	uint32_t i, clk;

 	PP_ASSERT_WITH_CODE(allowed_vdd_sclk_table != NULL,
 		"SCLK dependency table is missing. This table is mandatory", return -EINVAL);
···
 	data->dpm_table.sclk_table.count = 0;

 	for (i = 0; i < allowed_vdd_sclk_table->count; i++) {
+		clk = min(allowed_vdd_sclk_table->entries[i].clk, data->sclk_cap);
+
 		if (i == 0 || data->dpm_table.sclk_table.dpm_levels[data->dpm_table.sclk_table.count-1].value !=
-				allowed_vdd_sclk_table->entries[i].clk) {
+				clk) {
 			data->dpm_table.sclk_table.dpm_levels[data->dpm_table.sclk_table.count].value =
-				allowed_vdd_sclk_table->entries[i].clk;
+				clk;
 			data->dpm_table.sclk_table.dpm_levels[data->dpm_table.sclk_table.count].enabled = (i == 0) ? 1 : 0;
 			data->dpm_table.sclk_table.count++;
 		}
···
 	return 0;
 }

+static void smu7_set_sclk_cap(struct pp_hwmgr *hwmgr)
+{
+	struct amdgpu_device *adev = hwmgr->adev;
+	struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
+
+	data->sclk_cap = 0xffffffff;
+
+	if (hwmgr->od_enabled)
+		return;
+
+	/* R9 390X board: last sclk dpm level is unstable, use lower sclk */
+	if (adev->pdev->device == 0x67B0 &&
+	    adev->pdev->subsystem_vendor == 0x1043)
+		data->sclk_cap = 104000; /* 1040 MHz */
+
+	if (data->sclk_cap != 0xffffffff)
+		dev_info(adev->dev, "sclk cap: %u kHz on quirky ASIC\n", data->sclk_cap * 10);
+}
+
 static int smu7_hwmgr_backend_init(struct pp_hwmgr *hwmgr)
 {
 	struct amdgpu_device *adev = hwmgr->adev;
···
 		return -ENOMEM;

 	hwmgr->backend = data;
+	smu7_set_sclk_cap(hwmgr);
 	smu7_patch_voltage_workaround(hwmgr);
 	smu7_init_dpm_defaults(hwmgr);

···

 	/* Performance levels are arranged from low to high. */
 	performance_level->memory_clock = memory_clock;
-	performance_level->engine_clock = engine_clock;
+	performance_level->engine_clock = min(engine_clock, data->sclk_cap);

 	pcie_gen_from_bios = visland_clk_info->ucPCIEGen;

+1
drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.h
···
 	uint32_t pcie_gen_cap;
 	uint32_t pcie_lane_cap;
 	uint32_t pcie_spc_cap;
+	uint32_t sclk_cap;
 	struct smu7_leakage_voltage vddc_leakage;
 	struct smu7_leakage_voltage vddci_leakage;
 	struct smu7_leakage_voltage vddcgfx_leakage;
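
The effect of the cap on the SCLK DPM table can be sketched in user space. The following is a minimal illustration, not the driver code: `struct dpm_table`, `build_sclk_table()`, `sclk_to_khz()`, `min_u32()`, `MAX_LEVELS` and `SCLK_NO_CAP` are hypothetical names introduced here. It assumes, as the patch's `dev_info()` call suggests, that clocks are stored in 10 kHz units, and it mirrors the patch's loop: each entry is clamped to the cap, and an entry equal to the previous level is skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LEVELS  8            /* hypothetical table size for this sketch */
#define SCLK_NO_CAP 0xffffffffu  /* the patch uses 0xffffffff to mean "no cap" */

struct dpm_level {
	uint32_t value;   /* clock in 10 kHz units (assumed, see lead-in) */
	int enabled;
};

struct dpm_table {
	uint32_t count;
	struct dpm_level levels[MAX_LEVELS];
};

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Convert a clock in 10 kHz units to kHz, as the patch's log message does. */
static uint32_t sclk_to_khz(uint32_t clk_10khz)
{
	return clk_10khz * 10;
}

/*
 * Build a DPM table from a list of clocks sorted low to high.
 * Each clock is clamped to 'cap'; an entry that equals the previously
 * stored level is skipped, so capped top levels collapse into one.
 */
static void build_sclk_table(struct dpm_table *t, const uint32_t *clks,
			     uint32_t n, uint32_t cap)
{
	uint32_t i, clk;

	assert(t != NULL && clks != NULL && n <= MAX_LEVELS);

	t->count = 0;
	for (i = 0; i < n; i++) {
		clk = min_u32(clks[i], cap);

		if (i == 0 || t->levels[t->count - 1].value != clk) {
			t->levels[t->count].value = clk;
			t->levels[t->count].enabled = (i == 0) ? 1 : 0;
			t->count++;
		}
	}
}
```

With an input list whose top entry is 107000 (1070 MHz) and a cap of 104000 (1040 MHz), the top level is stored as 104000; if the level below it already equals 104000, the two collapse into a single level. Passing `SCLK_NO_CAP` leaves the list unchanged, which corresponds to the patch's behaviour for unaffected boards and when overdrive is enabled.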