Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amdgpu: Fix error handling in amdgpu_ras_add_bad_pages

Ensure that amdgpu_ras_add_bad_pages() returns an appropriate error
code when an error condition is detected.

Fixes the following warnings:
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2849 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_umc_pages_in_a_row()' failed.
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:2884 amdgpu_ras_add_bad_pages() warn: missing error code here? 'amdgpu_ras_mca2pa()' failed.

v2: s/-EIO/-EINVAL/, retained the use of -EINVAL from
amdgpu_umc_pages_in_a_row and amdgpu_ras_mca2pa_by_idx, when the
RAS context is not initialized or the convert_ras_err_addr function is
unavailable. (Thomas)

v3: Return 0 when eh_data is absent, as that is acceptable. (Tao)

Fixes: a8d133e625ce ("drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modes")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: YiPeng Chai <yipeng.chai@amd.com>
Cc: Tao Zhou <tao.zhou1@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Authored by Srinivasan Shanmugam and committed by Alex Deucher
9095567b 2774ef76

+16 -5
drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
···

 	mutex_lock(&con->recovery_lock);
 	data = con->eh_data;
-	if (!data)
+	if (!data) {
+		/* Returning 0 as the absence of eh_data is acceptable */
 		goto free;
+	}

 	for (i = 0; i < pages; i++) {
 		if (from_rom &&
···
 						 * one row
 						 */
 						if (amdgpu_umc_pages_in_a_row(adev, &err_data,
-								bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
+								bps[i].retired_page <<
+								AMDGPU_GPU_PAGE_SHIFT)) {
+							ret = -EINVAL;
 							goto free;
-						else
+						} else {
 							find_pages_per_pa = true;
+						}
 					} else {
 						/* unsupported cases */
+						ret = -EOPNOTSUPP;
 						goto free;
 					}
 				}
 			} else {
 				if (amdgpu_umc_pages_in_a_row(adev, &err_data,
-						bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT))
+						bps[i].retired_page << AMDGPU_GPU_PAGE_SHIFT)) {
+					ret = -EINVAL;
 					goto free;
+				}
 			}
 		} else {
 			if (from_rom && !find_pages_per_pa) {
 				if (bps[i].retired_page & UMC_CHANNEL_IDX_V2) {
 					/* bad page in any NPS mode in eeprom */
-					if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data))
+					if (amdgpu_ras_mca2pa_by_idx(adev, &bps[i], &err_data)) {
+						ret = -EINVAL;
 						goto free;
+					}
 				} else {
 					/* legacy bad page in eeprom, generated only in
 					 * NPS1 mode
···
 							/* non-nps1 mode, old RAS TA
 							 * can't support it
 							 */
+							ret = -EOPNOTSUPP;
 							goto free;
 						}
 					}
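
The pattern this patch corrects can be shown outside the driver. The following is a minimal userspace sketch, not the driver code: convert_page(), add_pages_before() and add_pages_after() are hypothetical names, and the boolean parameters stand in for the conditions the real function checks. It assumes, as the changelog note about returning 0 suggests, that ret holds 0 when the failure paths are reached.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for helpers such as amdgpu_umc_pages_in_a_row():
 * returns 0 on success and a nonzero value on failure.
 */
static int convert_page(bool fail)
{
	return fail ? -1 : 0;
}

/* Shape of the code before the fix: ret is still 0 when a failure
 * path jumps to the cleanup label, so the caller sees success.
 */
static int add_pages_before(bool have_data, bool supported, bool convert_fails)
{
	int ret = 0;

	if (!have_data)
		goto free;
	if (!supported)
		goto free;		/* missing error code */
	if (convert_page(convert_fails))
		goto free;		/* missing error code */
	/* ... the real function would record the bad page here ... */
free:
	return ret;
}

/* Shape after the fix: each failure path sets ret before the jump. */
static int add_pages_after(bool have_data, bool supported, bool convert_fails)
{
	int ret = 0;

	if (!have_data) {
		/* absence of data is acceptable, so ret stays 0 */
		goto free;
	}
	if (!supported) {
		ret = -EOPNOTSUPP;
		goto free;
	}
	if (convert_page(convert_fails)) {
		ret = -EINVAL;
		goto free;
	}
	/* ... the real function would record the bad page here ... */
free:
	return ret;
}
```

With a failing conversion, add_pages_before() returns 0 while add_pages_after() returns -EINVAL. That difference is what the "missing error code" warnings point at.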