drm/msm/a6xx: Fix excessive stack usage

Clang-19 and above sometimes end up with multiple copies of the large
a6xx_hfi_msg_bw_table structure on the stack. The problem is that
a6xx_hfi_send_bw_table() calls a number of device specific functions to
fill the structure, but these create another copy of the structure on
the stack which gets copied to the first.

If the functions get inlined, that busts the warning limit:

drivers/gpu/drm/msm/adreno/a6xx_hfi.c:631:12: error: stack frame size (1032) exceeds limit (1024) in 'a6xx_hfi_send_bw_table' [-Werror,-Wframe-larger-than]
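
As a rough illustration of how a second copy can appear, here is a minimal
userspace sketch. It assumes a helper that assigns a compound literal through
the pointer; whether the driver's helpers do exactly this is not shown in this
commit. The names struct big_table, build_with_temporary() and
build_in_place() are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a large message structure (about 1 KiB). */
struct big_table {
	uint32_t header;
	uint32_t num_levels;
	uint32_t cmds[254];
};

/*
 * Assigning a compound literal through the pointer allows the compiler to
 * build a temporary struct big_table on the stack and then copy it to *t.
 * If this helper is inlined, the temporary lands in the caller's frame.
 */
static void build_with_temporary(struct big_table *t)
{
	*t = (struct big_table){
		.num_levels = 1,
		.cmds = { 0x50004, 0x50000 },
	};
}

/* Writing the fields in place needs no full-size temporary. */
static void build_in_place(struct big_table *t)
{
	memset(t, 0, sizeof(*t));
	t->num_levels = 1;
	t->cmds[0] = 0x50004;
	t->cmds[1] = 0x50000;
}
```

Both helpers leave the same contents in the table; they differ only in how
much stack the compiler may need to produce it.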

Fix this by allocating struct a6xx_hfi_msg_bw_table with devm_kzalloc()
instead of placing it on the stack. Also, use this opportunity to build
the table only once and reuse it on later calls, which reduces GPU
wake-up latency.
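
The allocate-once-and-reuse approach can be sketched in userspace C as
follows. This is only an approximation of the change: calloc() stands in for
devm_kzalloc(), and struct bw_table, struct gmu, build_table(), send_msg()
and the two counters are hypothetical names introduced for the example.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the driver's real types. */
struct bw_table {
	uint32_t num_levels;
	uint32_t cmds[254];
};

struct gmu {
	struct bw_table *bw_table;	/* cached after the first build */
	unsigned int builds;		/* how often the table was built */
	unsigned int sends;		/* how often it was sent */
};

static void build_table(struct bw_table *t)
{
	t->num_levels = 1;
	t->cmds[0] = 0x50004;
}

/* Placeholder for handing the message to the firmware. */
static int send_msg(struct gmu *gmu, const void *data, size_t size)
{
	(void)data;
	(void)size;
	gmu->sends++;
	return 0;
}

static int send_bw_table(struct gmu *gmu)
{
	struct bw_table *t;

	if (!gmu->bw_table) {
		/* calloc() zero-fills, as devm_kzalloc() does in the kernel. */
		t = calloc(1, sizeof(*t));
		if (!t)
			return -ENOMEM;

		build_table(t);
		gmu->builds++;
		gmu->bw_table = t;
	}

	/* Later calls skip the build and send the cached table. */
	return send_msg(gmu, gmu->bw_table, sizeof(*gmu->bw_table));
}
```

In this sketch the caller must eventually free() the cached table. In the
driver, devm_kzalloc() ties the allocation to gmu->dev, so it is released
when the device is unbound.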

Cc: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/621814/
Signed-off-by: Rob Clark <robdclark@chromium.org>

Authored by Akhil P Oommen and committed by Rob Clark 2 years ago d6d1ad32 8f32ddd8

+35 -24

2 changed files

drivers/gpu/drm/msm/adreno/a6xx_gmu.h

@@ -99,6 +99,7 @@
 	struct completion pd_gate;
 
 	struct qmp *qmp;
+	struct a6xx_hfi_msg_bw_table *bw_table;
 };
 
 static inline u32 gmu_read(struct a6xx_gmu *gmu, u32 offset)

+34 -24

drivers/gpu/drm/msm/adreno/a6xx_hfi.c

@@ -661,34 +661,44 @@
 
 static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
 {
-	struct a6xx_hfi_msg_bw_table msg = { 0 };
+	struct a6xx_hfi_msg_bw_table *msg;
 	struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
 	struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
 
-	if (adreno_is_a618(adreno_gpu))
-		a618_build_bw_table(&msg);
-	else if (adreno_is_a619(adreno_gpu))
-		a619_build_bw_table(&msg);
-	else if (adreno_is_a640_family(adreno_gpu))
-		a640_build_bw_table(&msg);
-	else if (adreno_is_a650(adreno_gpu))
-		a650_build_bw_table(&msg);
-	else if (adreno_is_7c3(adreno_gpu))
-		adreno_7c3_build_bw_table(&msg);
-	else if (adreno_is_a660(adreno_gpu))
-		a660_build_bw_table(&msg);
-	else if (adreno_is_a663(adreno_gpu))
-		a663_build_bw_table(&msg);
-	else if (adreno_is_a690(adreno_gpu))
-		a690_build_bw_table(&msg);
-	else if (adreno_is_a730(adreno_gpu))
-		a730_build_bw_table(&msg);
-	else if (adreno_is_a740_family(adreno_gpu))
-		a740_build_bw_table(&msg);
-	else
-		a6xx_build_bw_table(&msg);
+	if (gmu->bw_table)
+		goto send;
 
-	return a6xx_hfi_send_msg(gmu, HFI_H2F_MSG_BW_TABLE, &msg, sizeof(msg),
+	msg = devm_kzalloc(gmu->dev, sizeof(*msg), GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	if (adreno_is_a618(adreno_gpu))
+		a618_build_bw_table(msg);
+	else if (adreno_is_a619(adreno_gpu))
+		a619_build_bw_table(msg);
+	else if (adreno_is_a640_family(adreno_gpu))
+		a640_build_bw_table(msg);
+	else if (adreno_is_a650(adreno_gpu))
+		a650_build_bw_table(msg);
+	else if (adreno_is_7c3(adreno_gpu))
+		adreno_7c3_build_bw_table(msg);
+	else if (adreno_is_a660(adreno_gpu))
+		a660_build_bw_table(msg);
+	else if (adreno_is_a663(adreno_gpu))
+		a663_build_bw_table(msg);
+	else if (adreno_is_a690(adreno_gpu))
+		a690_build_bw_table(msg);
+	else if (adreno_is_a730(adreno_gpu))
+		a730_build_bw_table(msg);
+	else if (adreno_is_a740_family(adreno_gpu))
+		a740_build_bw_table(msg);
+	else
+		a6xx_build_bw_table(msg);
+
+	gmu->bw_table = msg;
+
+send:
+	return a6xx_hfi_send_msg(gmu, HFI_H2F_MSG_BW_TABLE, gmu->bw_table, sizeof(*(gmu->bw_table)),
 		NULL, 0);
 }
 
