Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

thermal: core: Address thermal zone removal races with resume

Since thermal_zone_pm_complete() and thermal_zone_device_resume()
re-initialize the poll_queue delayed work for the given thermal zone,
the cancel_delayed_work_sync() in thermal_zone_device_unregister()
may miss some already running work items and the thermal zone may
be freed prematurely [1].

There are two failing scenarios that both start with
running thermal_pm_notify_complete() right before invoking
thermal_zone_device_unregister() for one of the thermal zones.

In the first scenario, there is a work item already running for
the given thermal zone when thermal_pm_notify_complete() calls
thermal_zone_pm_complete() for that thermal zone and it continues to
run when thermal_zone_device_unregister() starts. Since the poll_queue
delayed work has been re-initialized by thermal_pm_notify_complete(), the
running work item will be missed by the cancel_delayed_work_sync() in
thermal_zone_device_unregister() and if it continues to run past the
freeing of the thermal zone object, a use-after-free will occur.

In the second scenario, thermal_zone_device_resume() queued up by
thermal_pm_notify_complete() runs right after the thermal_zone_exit()
called by thermal_zone_device_unregister() has returned.
thermal_zone_device_resume() then re-initializes the poll_queue delayed
work before cancel_delayed_work_sync() is called by
thermal_zone_device_unregister(), so the work may continue to run after
the thermal zone object has been freed, which also leads to a
use-after-free.

Address the first failing scenario by ensuring that no thermal work
items will be running when thermal_pm_notify_complete() is called.
For this purpose, first move the cancel_delayed_work() call from
thermal_zone_pm_complete() to thermal_zone_pm_prepare() to prevent
new work from entering the workqueue going forward. Next, switch
over to using a dedicated workqueue for thermal events and update
the code in thermal_pm_notify() to flush that workqueue after
thermal_pm_notify_prepare() has returned, which will take care of
all leftover thermal work already on the workqueue (that leftover
work would do nothing useful anyway because all of the thermal zones
have been flagged as suspended).
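
The ordering described above can be sketched as a small userspace model.
This is only an illustration under stated assumptions: struct model_zone
and the model_*() helpers are hypothetical names, not kernel APIs, and a
single boolean stands in for each of the "pending" and "running" states
that the real workqueue code tracks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one thermal zone's polling work. */
struct model_zone {
	bool suspended;    /* models TZ_STATE_FLAG_SUSPENDED */
	bool work_pending; /* delayed work armed but not yet running */
	bool work_running; /* handler currently executing */
};

/* Models the suspend-prepare step: flag the zone, then cancel pending work. */
static void model_pm_prepare(struct model_zone *tz)
{
	tz->suspended = true;
	/* Like cancel_delayed_work(): drops pending work, does not wait. */
	tz->work_pending = false;
}

/* Models the poll handler finishing: it re-arms only if not suspended. */
static void model_work_finish(struct model_zone *tz)
{
	tz->work_running = false;
	if (!tz->suspended)
		tz->work_pending = true;
}

/* Models flush_workqueue(): wait for a running handler to finish. */
static void model_flush(struct model_zone *tz)
{
	if (tz->work_running)
		model_work_finish(tz);
}

/* Models the resume-complete step, which re-initializes the work item. */
static void model_pm_complete(struct model_zone *tz)
{
	/* Re-initialization is only safe with nothing pending or running. */
	assert(!tz->work_pending && !tz->work_running);
	tz->work_pending = true; /* resume work queued without a delay */
}
```

In this model, calling model_pm_prepare() and then model_flush() on a
zone whose handler is still running leaves nothing pending or running,
so the assertion in model_pm_complete() holds.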

The second failing scenario is addressed by adding a tz->state check
to thermal_zone_device_resume() to prevent it from re-initializing
the poll_queue delayed work if the thermal zone is going away.
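
The effect of that check can be illustrated with a minimal userspace
sketch. The MODEL_FLAG_* bits, struct model_zone and model_zone_resume()
are hypothetical stand-ins for the kernel's TZ_STATE_FLAG_* bits and
resume handler, and a counter stands in for re-initializing the
poll_queue delayed work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's TZ_STATE_FLAG_* bits. */
enum {
	MODEL_FLAG_SUSPENDED = 1 << 0,
	MODEL_FLAG_RESUMING  = 1 << 1,
	MODEL_FLAG_EXIT      = 1 << 2,
};

struct model_zone {
	unsigned int state;
	int poll_reinit_count; /* times the poll work was re-initialized */
};

/*
 * Models the resume handler: if the zone is going away, return early
 * and leave the poll work untouched so a later synchronous cancel can
 * still see it. Returns true if the zone was actually resumed.
 */
static bool model_zone_resume(struct model_zone *tz)
{
	assert(tz != NULL);

	if (tz->state & MODEL_FLAG_EXIT)
		return false;

	tz->state &= ~(unsigned int)(MODEL_FLAG_SUSPENDED | MODEL_FLAG_RESUMING);
	tz->poll_reinit_count++;
	return true;
}
```

For a zone whose state includes MODEL_FLAG_EXIT, model_zone_resume()
returns false and poll_reinit_count stays at 0. Otherwise the suspended
and resuming bits are cleared and the counter is incremented.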

Note that the above changes will also facilitate relocating the suspend
and resume of thermal zones closer to the suspend and resume of devices,
respectively.

Fixes: 5a5efdaffda5 ("thermal: core: Resume thermal zones asynchronously")
Reported-by: syzbot+3b3852c6031d0f30dfaf@syzkaller.appspotmail.com
Closes: https://syzbot.org/bug?extid=3b3852c6031d0f30dfaf
Reported-by: Mauricio Faria de Oliveira <mfo@igalia.com>
Closes: https://lore.kernel.org/linux-pm/20260324-thermal-core-uaf-init_delayed_work-v1-1-6611ae76a8a1@igalia.com/ [1]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mauricio Faria de Oliveira <mfo@igalia.com>
Tested-by: Mauricio Faria de Oliveira <mfo@igalia.com>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Cc: All applicable <stable@vger.kernel.org>
Link: https://patch.msgid.link/6267615.lOV4Wx5bFT@rafael.j.wysocki

+26 -5
drivers/thermal/thermal_core.c
···
 
 static bool thermal_pm_suspended;
 
+static struct workqueue_struct *thermal_wq __ro_after_init;
+
 /*
  * Governor section: set of functions to handle thermal governors
  *
···
 	if (delay > HZ)
 		delay = round_jiffies_relative(delay);
 
-	mod_delayed_work(system_freezable_power_efficient_wq, &tz->poll_queue, delay);
+	mod_delayed_work(thermal_wq, &tz->poll_queue, delay);
 }
 
 static void thermal_zone_recheck(struct thermal_zone_device *tz, int error)
···
 
 	guard(thermal_zone)(tz);
 
+	/* If the thermal zone is going away, there's nothing to do. */
+	if (tz->state & TZ_STATE_FLAG_EXIT)
+		return;
+
 	tz->state &= ~(TZ_STATE_FLAG_SUSPENDED | TZ_STATE_FLAG_RESUMING);
 
 	thermal_debug_tz_resume(tz);
···
 	}
 
 	tz->state |= TZ_STATE_FLAG_SUSPENDED;
+
+	/* Prevent new work from getting to the workqueue subsequently. */
+	cancel_delayed_work(&tz->poll_queue);
 }
 
 static void thermal_pm_notify_prepare(void)
···
 {
 	guard(thermal_zone)(tz);
 
-	cancel_delayed_work(&tz->poll_queue);
-
 	reinit_completion(&tz->resume);
 	tz->state |= TZ_STATE_FLAG_RESUMING;
 
···
 	 */
 	INIT_DELAYED_WORK(&tz->poll_queue, thermal_zone_device_resume);
 	/* Queue up the work without a delay. */
-	mod_delayed_work(system_freezable_power_efficient_wq, &tz->poll_queue, 0);
+	mod_delayed_work(thermal_wq, &tz->poll_queue, 0);
 }
 
 static void thermal_pm_notify_complete(void)
···
 	case PM_RESTORE_PREPARE:
 	case PM_SUSPEND_PREPARE:
 		thermal_pm_notify_prepare();
+		/*
+		 * Allow any leftover thermal work items already on the
+		 * workqueue to complete so they don't get in the way later.
+		 */
+		flush_workqueue(thermal_wq);
 		break;
 	case PM_POST_HIBERNATION:
 	case PM_POST_RESTORE:
···
 	if (result)
 		goto error;
 
+	thermal_wq = alloc_workqueue("thermal_events",
+				      WQ_FREEZABLE | WQ_POWER_EFFICIENT | WQ_PERCPU, 0);
+	if (!thermal_wq) {
+		result = -ENOMEM;
+		goto unregister_netlink;
+	}
+
 	result = thermal_register_governors();
 	if (result)
-		goto unregister_netlink;
+		goto destroy_workqueue;
 
 	thermal_class = kzalloc_obj(*thermal_class);
 	if (!thermal_class) {
···
 
 unregister_governors:
 	thermal_unregister_governors();
+destroy_workqueue:
+	destroy_workqueue(thermal_wq);
 unregister_netlink:
 	thermal_netlink_exit();
 error:
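
The init-path hunks insert the workqueue allocation between the netlink
and governor setup steps and add a matching cleanup label, so teardown
still runs in reverse order of setup. A userspace sketch of that
goto-unwind pattern follows. model_init(), model_step() and model_log
are hypothetical names used only for illustration, and each setup or
teardown step is reduced to logging a single character.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical log of setup/teardown steps, one character per step. */
static char model_log[16];

static void model_step(char c)
{
	size_t n = strlen(model_log);

	if (n + 1 < sizeof(model_log))
		model_log[n] = c;
}

/*
 * Models an init function with three setup steps: netlink ('n'),
 * workqueue ('w') and governors ('g'). fail_at selects the step that
 * fails (0 = none). Teardown steps are logged in upper case.
 */
static int model_init(int fail_at)
{
	assert(fail_at >= 0 && fail_at <= 3);
	memset(model_log, 0, sizeof(model_log));

	if (fail_at == 1)
		goto error;
	model_step('n');

	if (fail_at == 2)
		goto unregister_netlink;
	model_step('w');

	if (fail_at == 3)
		goto destroy_workqueue;
	model_step('g');

	return 0;

destroy_workqueue:
	model_step('W');
unregister_netlink:
	model_step('N');
error:
	return -1;
}
```

A failure in the governor step (fail_at == 3) tears down the workqueue
and then netlink, while a failure in the workqueue step (fail_at == 2)
tears down only netlink. This mirrors the label order in the diff.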