Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

cputimer: Cure lock inversion

There's a lock inversion between the cputimer->lock and rq->lock;
notably the two callchains involved are:

  update_rlimit_cpu()
    sighand->siglock
    set_process_cpu_timer()
      cpu_timer_sample_group()
        thread_group_cputimer()
          cputimer->lock
          thread_group_cputime()
            task_sched_runtime()
              ->pi_lock
              rq->lock

  scheduler_tick()
    rq->lock
    task_tick_fair()
      update_curr()
        account_group_exec()
          cputimer->lock

The first chain enables a CLOCK_PROCESS_CPUTIME_ID timer, and the
second keeps it up to date.
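
The inversion in the two callchains above can be illustrated with a
small user-space sketch. It is not kernel code and not how lockdep is
implemented; the lock_id enum, struct lock_order and
lock_order_add_chain() are hypothetical names introduced only for
this example. Each chain is recorded as the list of locks it takes,
outermost first, and a chain is flagged when it takes two locks in
the opposite order from an earlier chain. Only direct pairwise
inversions are caught; longer cycles across three or more chains are
not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical identifiers for the locks named in the callchains. */
enum lock_id { SIGLOCK, CPUTIMER_LOCK, PI_LOCK, RQ_LOCK, NR_LOCKS };

/* held_before[a][b] is true once some chain took b while holding a. */
struct lock_order {
	bool held_before[NR_LOCKS][NR_LOCKS];
};

/*
 * Record one acquisition chain (outermost lock first).
 * Returns true if the chain takes any pair of locks in the opposite
 * order from a previously recorded chain.
 */
static bool lock_order_add_chain(struct lock_order *lo,
				 const enum lock_id *chain, size_t n)
{
	bool inverted = false;

	for (size_t i = 0; i < n; i++) {
		for (size_t j = i + 1; j < n; j++) {
			if (lo->held_before[chain[j]][chain[i]])
				inverted = true;
			lo->held_before[chain[i]][chain[j]] = true;
		}
	}
	return inverted;
}
```

Feeding it the enable path (siglock, cputimer->lock, ->pi_lock,
rq->lock) and then the tick path (rq->lock, cputimer->lock) reports
the inversion on the second chain.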

This problem was introduced by e8abccb7193 ("posix-cpu-timers: Cure
SMP accounting oddities").

Cure the problem by removing the cputimer->lock and rq->lock nesting.
This leaves concurrent enablers doing duplicate work, but the time
wasted should be on the same order as the time otherwise wasted
spinning on the lock, and the greater-than assignment filter should
ensure we preserve monotonicity.
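
As a rough user-space sketch of this pattern (not the kernel code):
the expensive sample is taken before the lock, and the result is
merged under the lock with a greater-than filter so the stored times
never move backwards. The struct names, group_cputimer_read() and
fake_group_sample() are hypothetical stand-ins, a pthread mutex
replaces the spinlock, and an atomic flag is assumed for the unlocked
check of running.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for struct task_cputime. */
struct cputime_sample {
	uint64_t utime;
	uint64_t stime;
	uint64_t sum_exec_runtime;
};

/* Hypothetical stand-in for the group cputimer. */
struct group_cputimer {
	pthread_mutex_t lock;
	atomic_int running;
	struct cputime_sample cputime;
};

static void group_cputimer_init(struct group_cputimer *t)
{
	pthread_mutex_init(&t->lock, NULL);
	atomic_init(&t->running, 0);
	t->cputime = (struct cputime_sample){ 0, 0, 0 };
}

/* Greater-than assignment filter: each field only ever moves forward. */
static void update_gt(struct cputime_sample *a, const struct cputime_sample *b)
{
	if (b->utime > a->utime)
		a->utime = b->utime;
	if (b->stime > a->stime)
		a->stime = b->stime;
	if (b->sum_exec_runtime > a->sum_exec_runtime)
		a->sum_exec_runtime = b->sum_exec_runtime;
}

/* Fake sampler standing in for thread_group_cputime(). */
static struct cputime_sample fake_group_total;

static void fake_group_sample(struct cputime_sample *out)
{
	*out = fake_group_total;
}

static void group_cputimer_read(struct group_cputimer *t,
				void (*sample)(struct cputime_sample *),
				struct cputime_sample *times)
{
	struct cputime_sample sum;

	if (!atomic_load(&t->running)) {
		/* Sample first, without t->lock held, so nothing nests under it. */
		sample(&sum);
		pthread_mutex_lock(&t->lock);
		atomic_store(&t->running, 1);
		/* Concurrent enablers may both get here; the filter keeps the max. */
		update_gt(&t->cputime, &sum);
	} else {
		pthread_mutex_lock(&t->lock);
	}
	*times = t->cputime;
	pthread_mutex_unlock(&t->lock);
}
```

If two enablers race, both sample and both merge; the one holding the
older sample changes nothing, which is the duplicate work the message
accepts.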

Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Simon Kirby <sim@hostway.ca>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: stable@kernel.org
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/1318928713.21167.4.camel@twins
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

authored by Peter Zijlstra and committed by Thomas Gleixner
bcd5cff7 899e3ee4

+4 -3
kernel/posix-cpu-timers.c
···
 	struct task_cputime sum;
 	unsigned long flags;
 
-	spin_lock_irqsave(&cputimer->lock, flags);
 	if (!cputimer->running) {
-		cputimer->running = 1;
 		/*
 		 * The POSIX timer interface allows for absolute time expiry
 		 * values through the TIMER_ABSTIME flag, therefore we have
···
 		 * it.
 		 */
 		thread_group_cputime(tsk, &sum);
+		spin_lock_irqsave(&cputimer->lock, flags);
+		cputimer->running = 1;
 		update_gt_cputime(&cputimer->cputime, &sum);
-	}
+	} else
+		spin_lock_irqsave(&cputimer->lock, flags);
 	*times = cputimer->cputime;
 	spin_unlock_irqrestore(&cputimer->lock, flags);
 }