Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

drm/amdkfd: Check if there are kfd processes using adev by kfd_processes_count

During GPU hot-unplug, the driver needs to check whether any kfd processes
are still using the GPU being removed before cleaning up the device's
resources. The current driver checks whether kfd_processes_table is empty.
However, kfd processes are not terminated immediately after being removed
from kfd_processes_table. They remain alive and may access the device until
the kfd_process_wq work queue has run.

Instead, check the kfd->kfd_processes_count value, which is updated after a
kfd process has been uninitialized once its refcount drops to zero.

Fixes: 6cca686dfce7 ("drm/amdkfd: kfd driver supports hot unplug/replug amdgpu devices")
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d12d05c4bc4c15585130af43e897923ff292df7b)

Authored by Xiaogang Chen and committed by Alex Deucher
81665e35 e6c2e6c2

+1 -32
drivers/gpu/drm/amd/amdkfd/kfd_device.c
···
 	return false;
 }
 
-/* check if there is kfd process still uses adev */
-static bool kgd2kfd_check_device_idle(struct amdgpu_device *adev)
-{
-	struct kfd_process *p;
-	struct hlist_node *p_temp;
-	unsigned int temp;
-	struct kfd_node *dev;
-
-	mutex_lock(&kfd_processes_mutex);
-
-	if (hash_empty(kfd_processes_table)) {
-		mutex_unlock(&kfd_processes_mutex);
-		return true;
-	}
-
-	/* check if there is device still use adev */
-	hash_for_each_safe(kfd_processes_table, temp, p_temp, p, kfd_processes) {
-		for (int i = 0; i < p->n_pdds; i++) {
-			dev = p->pdds[i]->dev;
-			if (dev->adev == adev) {
-				mutex_unlock(&kfd_processes_mutex);
-				return false;
-			}
-		}
-	}
-
-	mutex_unlock(&kfd_processes_mutex);
-
-	return true;
-}
-
 /** kgd2kfd_teardown_processes - gracefully tear down existing
  * kfd processes that use adev
  *
···
 	mutex_unlock(&kfd_processes_mutex);
 
 	/* wait all kfd processes use adev terminate */
-	while (!kgd2kfd_check_device_idle(adev))
+	while (!!atomic_read(&adev->kfd.dev->kfd_processes_count))
 		cond_resched();
 }
 
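
The difference between the two idle checks can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the driver code: the `fake_*` names and structures are hypothetical, C11 atomics stand in for the kernel's `atomic_t`, and the deferred release that the commit message attributes to the kfd_process_wq work queue is modeled as an explicit function call. It shows why a process that has left the lookup table can still count as a user of the device until its final release has run.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-device state; not the real struct kfd_dev. */
struct fake_kfd_dev {
	atomic_int processes_count;	/* processes not yet fully released */
};

/* Hypothetical stand-in for a kfd process. */
struct fake_process {
	struct fake_kfd_dev *dev;
	bool in_table;			/* models membership in the process table */
};

/* Creating a process adds it to the table and bumps the device counter. */
static inline void fake_process_create(struct fake_process *p,
				       struct fake_kfd_dev *dev)
{
	p->dev = dev;
	p->in_table = true;
	atomic_fetch_add(&dev->processes_count, 1);
}

/* Removing the process from the table does not release it yet. */
static inline void fake_process_unhash(struct fake_process *p)
{
	p->in_table = false;
}

/* Models the deferred release: only here does the counter drop. */
static inline void fake_process_release(struct fake_process *p)
{
	atomic_fetch_sub(&p->dev->processes_count, 1);
	p->dev = NULL;
}

/* Counter-based idle check, analogous in spirit to the new wait condition. */
static inline bool fake_device_idle(struct fake_kfd_dev *dev)
{
	return atomic_load(&dev->processes_count) == 0;
}
```

In this model, a table-based check would report the device as idle right after `fake_process_unhash()`, while `fake_device_idle()` keeps returning false until `fake_process_release()` has run, which is the window the commit message describes.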