Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amdkfd: sever xgmi io link if host driver has disable sharing

Host drivers can create partial hives per guest by disabling xgmi sharing
between certain peers in the main hive.
Typically, these partial hives are fully connected per guest session.
In the event that the host makes a mistake by adding a non-shared node
to a guest session, have the KFD reflect sharing disabled by severing
the IO link.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Tested-by: James Yao <yiqing.yao@amd.com>
Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

Authored by Jonathan Kim and committed by Alex Deucher
e46738a5 46186667

+22
+17
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c
···
 	return -EINVAL;
 }
 
+bool amdgpu_xgmi_get_is_sharing_enabled(struct amdgpu_device *adev,
+					struct amdgpu_device *peer_adev)
+{
+	struct psp_xgmi_topology_info *top = &adev->psp.xgmi_context.top_info;
+	int i;
+
+	/* Sharing should always be enabled for non-SRIOV. */
+	if (!amdgpu_sriov_vf(adev))
+		return true;
+
+	for (i = 0 ; i < top->num_nodes; ++i)
+		if (top->nodes[i].node_id == peer_adev->gmc.xgmi.node_id)
+			return !!top->nodes[i].is_sharing_enabled;
+
+	return false;
+}
+
 /*
  * Devices that support extended data require the entire hive to initialize with
  * the shared memory buffer flag set.
+2
drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h
···
 					struct amdgpu_device *peer_adev);
 int amdgpu_xgmi_get_num_links(struct amdgpu_device *adev,
 					struct amdgpu_device *peer_adev);
+bool amdgpu_xgmi_get_is_sharing_enabled(struct amdgpu_device *adev,
+					struct amdgpu_device *peer_adev);
 uint64_t amdgpu_xgmi_get_relative_phy_addr(struct amdgpu_device *adev,
 					   uint64_t addr);
 static inline bool amdgpu_xgmi_same_hive(struct amdgpu_device *adev,
+3
drivers/gpu/drm/amd/amdkfd/kfd_crat.c
···
 #include "kfd_topology.h"
 #include "amdgpu.h"
 #include "amdgpu_amdkfd.h"
+#include "amdgpu_xgmi.h"
 
 /* GPU Processor ID base for dGPUs for which VCRAT needs to be created.
  * GPU processor ID are expressed with Bit[31]=1.
···
 			if (!peer_dev->gpu)
 				continue;
 			if (peer_dev->gpu->kfd->hive_id != kdev->kfd->hive_id)
+				continue;
+			if (!amdgpu_xgmi_get_is_sharing_enabled(kdev->adev, peer_dev->gpu->adev))
 				continue;
 			sub_type_hdr = (typeof(sub_type_hdr))(
 				(char *)sub_type_hdr +
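
For readers who want to try the filtering logic outside the kernel, the following is a minimal user-space sketch of the behaviour described in the commit message. It is not the driver code: `struct device`, `struct topology`, `struct node_info`, `sharing_enabled()` and `count_io_links()` are hypothetical stand-ins, and the `is_vf` field is an assumed substitute for the `amdgpu_sriov_vf()` check. The sketch only mirrors the decision order shown in the diff: bare-metal devices always share, a guest looks the peer up in its topology table, and a peer missing from that table is treated as not shared.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct node_info {
	uint64_t node_id;
	uint8_t is_sharing_enabled;
};

struct topology {
	size_t num_nodes;
	const struct node_info *nodes;
};

struct device {
	bool is_vf;		/* assumed stand-in for amdgpu_sriov_vf() */
	uint64_t hive_id;
	uint64_t node_id;
	struct topology top;	/* this device's view of its peers */
};

/* Mirrors the lookup order of the new helper in the diff. */
static bool sharing_enabled(const struct device *dev,
			    const struct device *peer)
{
	size_t i;

	assert(dev && peer);

	/* Non-virtualized devices always share with hive peers. */
	if (!dev->is_vf)
		return true;

	for (i = 0; i < dev->top.num_nodes; ++i)
		if (dev->top.nodes[i].node_id == peer->node_id)
			return !!dev->top.nodes[i].is_sharing_enabled;

	/* Peer not listed in the topology: treat as not shared. */
	return false;
}

/* Counts the peers that would still receive an IO link. */
static size_t count_io_links(const struct device *dev,
			     const struct device *const *peers,
			     size_t num_peers)
{
	size_t i, links = 0;

	for (i = 0; i < num_peers; ++i) {
		if (!peers[i])
			continue;
		if (peers[i]->hive_id != dev->hive_id)
			continue;
		if (!sharing_enabled(dev, peers[i]))
			continue;	/* link severed */
		++links;
	}
	return links;
}
```

In this model, a guest whose topology table marks one peer as shared and another as not shared ends up with a single link. The same device with `is_vf` cleared keeps a link to every peer in its hive.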