Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

io_uring/tw: serialize ctx->retry_llist with ->uring_lock

The DEFER_TASKRUN local task work paths all run under ctx->uring_lock,
which serializes them with each other and with the rest of the ring's
hot paths. io_move_task_work_from_local() is the exception - it's called
from io_ring_exit_work() on a kworker without holding the lock and from
the iopoll cancelation side right after dropping it.

->work_llist is fine with this, as it's only ever updated via the
expected paths. But the ->retry_llist is updated while running, and hence
it could potentially race between normal task_work running and the
task-has-exited shutdown path.

Simply grab ->uring_lock while moving the local work to the fallback
list for exit purposes, which nicely serializes it across both the
normal additions and the exit prune path.

Cc: stable@vger.kernel.org
Fixes: f46b9cdb22f7 ("io_uring: limit local tw done")
Reported-by: Robert Femmer <robert.femmer@x41-dsec.de>
Reported-by: Christian Reitter <invd@inhq.net>
Reported-by: Michael Rodler <michael.rodler@x41-dsec.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

+11 -1
io_uring/tw.c
@@ -273,8 +273,18 @@
 
 void __cold io_move_task_work_from_local(struct io_ring_ctx *ctx)
 {
-	struct llist_node *node = llist_del_all(&ctx->work_llist);
+	struct llist_node *node;
 
+	/*
+	 * Running the work items may utilize ->retry_llist as a means
+	 * for capping the number of task_work entries run at the same
+	 * time. But that list can potentially race with moving the work
+	 * from here, if the task is exiting. As any normal task_work
+	 * running holds ->uring_lock already, just guard this slow path
+	 * with ->uring_lock to avoid racing on ->retry_llist.
+	 */
+	guard(mutex)(&ctx->uring_lock);
+	node = llist_del_all(&ctx->work_llist);
 	__io_fallback_tw(node, false);
 	node = llist_del_all(&ctx->retry_llist);
 	__io_fallback_tw(node, false);