Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

rcu: Fix racy re-initialization of irq_work causing hangs

RCU re-initializes the deferred QS irq work every time before attempting
to queue it. However, there are situations where queueing is attempted
even though the irq work is already queued. In that case, re-initializing
corrupts the irq work queue that is about to be handled.

This is more likely to happen when the architecture doesn't support
self-IPIs, in which case all irq works are lazy, as in the following
sequence:

1) rcu_read_unlock() is called when IRQs are disabled and there is a
grace period involving blocked tasks on the node. The irq work
is then initialized and queued.

2) The related tasks are unblocked and the CPU quiescent state
is reported. rdp->defer_qs_iw_pending is reset to DEFER_QS_IDLE,
allowing the irq work to be requeued in the future (note the previous
one hasn't fired yet).

3) A new grace period starts and the node has blocked tasks.

4) rcu_read_unlock() is called when IRQs are disabled again. The irq work
is re-initialized even though it is still queued, which clears its list
node, and is then requeued. As a result it is requeued to itself.

5) The irq work finally fires with the tick. But since it was requeued
to itself, it loops and hangs.

Fix this by initializing the irq work only once, before the CPU boots.
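
The sequence above can be approximated with a small user-space model. This is only a sketch under stated assumptions: the toy_* names are hypothetical and are not the kernel's irq_work API, the lazy list is reduced to a plain singly linked list, and the rdp->defer_qs_iw_pending guard is left out because step 2 resets it before the second attempt anyway. The model only shows why clearing the node and the claim bit of a work item that is still queued lets the next queue attempt link the item to itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the kernel's irq_work types. */
struct toy_work {
	struct toy_work *next;	/* link in the lazy list */
	bool pending;		/* models the "already claimed" bit */
};

struct toy_list {
	struct toy_work *first;
};

/* Models re-initialization: wipes both the link and the claim bit. */
static void toy_work_init(struct toy_work *w)
{
	w->next = NULL;
	w->pending = false;
}

/* Models queueing: refuse if already claimed, else push at the head. */
static bool toy_work_queue(struct toy_list *l, struct toy_work *w)
{
	assert(l != NULL && w != NULL);
	if (w->pending)
		return false;
	w->pending = true;
	w->next = l->first;	/* if l->first == w, this links w to itself */
	l->first = w;
	return true;
}

/* Returns true if some entry links to itself, so a walk would never end. */
static bool toy_list_has_self_loop(const struct toy_list *l)
{
	const struct toy_work *w;

	for (w = l->first; w != NULL; w = w->next)
		if (w->next == w)
			return true;
	return false;
}

/*
 * Replays two queue attempts with no handler run in between (steps 1 and 4).
 * reinit_each_time == true models the old behavior; false models the fix,
 * where the work is initialized once up front.
 */
static bool toy_replay(bool reinit_each_time)
{
	struct toy_list lazy = { .first = NULL };
	struct toy_work iw;
	int attempt;

	toy_work_init(&iw);	/* one-time init, as done before the CPU boots */

	for (attempt = 0; attempt < 2; attempt++) {
		if (reinit_each_time)
			toy_work_init(&iw);
		toy_work_queue(&lazy, &iw);
		/* The work has not fired yet, so it stays on the list. */
	}
	return toy_list_has_self_loop(&lazy);
}
```

In this model, toy_replay(true) ends with the entry linked to itself, while toy_replay(false) leaves a normal one-entry list because the second queue attempt is refused by the claim bit.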

Fixes: b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202508071303.c1134cce-lkp@intel.com
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>
Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.upadhyay@kernel.org>

Authored by Frederic Weisbecker and committed by Neeraj Upadhyay (AMD)
61399e0c 8f5ae30d

+9 -2
+2
kernel/rcu/tree.c
···
 	rdp->rcu_iw_gp_seq = rdp->gp_seq - 1;
 	trace_rcu_grace_period(rcu_state.name, rdp->gp_seq, TPS("cpuonl"));
 	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
+
+	rcu_preempt_deferred_qs_init(rdp);
 	rcu_spawn_rnp_kthreads(rnp);
 	rcu_spawn_cpu_nocb_kthread(cpu);
 	ASSERT_EXCLUSIVE_WRITER(rcu_state.n_online_cpus);
+1
kernel/rcu/tree.h
···
 static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp);
 static void rcu_flavor_sched_clock_irq(int user);
 static void dump_blkd_tasks(struct rcu_node *rnp, int ncheck);
+static void rcu_preempt_deferred_qs_init(struct rcu_data *rdp);
 static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags);
 static void rcu_preempt_boost_start_gp(struct rcu_node *rnp);
 static bool rcu_is_callbacks_kthread(struct rcu_data *rdp);
+6 -2
kernel/rcu/tree_plugin.h
···
 			    cpu_online(rdp->cpu)) {
 				// Get scheduler to re-evaluate and call hooks.
 				// If !IRQ_WORK, FQS scan will eventually IPI.
-				rdp->defer_qs_iw =
-					IRQ_WORK_INIT_HARD(rcu_preempt_deferred_qs_handler);
 				rdp->defer_qs_iw_pending = DEFER_QS_PENDING;
 				irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
 			}
···
 	}
 }
 
+static void rcu_preempt_deferred_qs_init(struct rcu_data *rdp)
+{
+	rdp->defer_qs_iw = IRQ_WORK_INIT_HARD(rcu_preempt_deferred_qs_handler);
+}
 #else /* #ifdef CONFIG_PREEMPT_RCU */
 
 /*
···
 {
 	WARN_ON_ONCE(!list_empty(&rnp->blkd_tasks));
 }
+
+static void rcu_preempt_deferred_qs_init(struct rcu_data *rdp) { }
 
 #endif /* #else #ifdef CONFIG_PREEMPT_RCU */