Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

percpu_counter: add a cmpxchg-based _add_batch variant

Interrupt disable/enable trips are quite expensive on x86-64 compared to a
mere cmpxchg (note: no lock prefix!) and percpu counters are used quite
often.

With this change I get a bump of 1% ops/s for negative path lookups,
plugged into will-it-scale:

void testcase(unsigned long long *iterations, unsigned long nr)
{
	while (1) {
		int fd = open("/tmp/nonexistent", O_RDONLY);
		assert(fd == -1);

		(*iterations)++;
	}
}

The win would be higher if it was not for other slowdowns, but one has
to start somewhere.

Link: https://lkml.kernel.org/r/20240528204257.434817-1-mjguzik@gmail.com
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Dennis Zhou <dennis@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Mateusz Guzik and committed by Andrew Morton
51d82165 727759d7

+39 -5
lib/percpu_counter.c
···
 EXPORT_SYMBOL(percpu_counter_set);
 
 /*
- * local_irq_save() is needed to make the function irq safe:
- * - The slow path would be ok as protected by an irq-safe spinlock.
- * - this_cpu_add would be ok as it is irq-safe by definition.
- * But:
- * The decision slow path/fast path and the actual update must be atomic, too.
+ * Add to a counter while respecting batch size.
+ *
+ * There are 2 implementations, both dealing with the following problem:
+ *
+ * The decision slow path/fast path and the actual update must be atomic.
  * Otherwise a call in process context could check the current values and
  * decide that the fast path can be used. If now an interrupt occurs before
  * the this_cpu_add(), and the interrupt updates this_cpu(*fbc->counters),
  * then the this_cpu_add() that is executed after the interrupt has completed
  * can produce values larger than "batch" or even overflows.
+ */
+#ifdef CONFIG_HAVE_CMPXCHG_LOCAL
+/*
+ * Safety against interrupts is achieved in 2 ways:
+ * 1. the fast path uses local cmpxchg (note: no lock prefix)
+ * 2. the slow path operates with interrupts disabled
+ */
+void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
+{
+	s64 count;
+	unsigned long flags;
+
+	count = this_cpu_read(*fbc->counters);
+	do {
+		if (unlikely(abs(count + amount) >= batch)) {
+			raw_spin_lock_irqsave(&fbc->lock, flags);
+			/*
+			 * Note: by now we might have migrated to another CPU
+			 * or the value might have changed.
+			 */
+			count = __this_cpu_read(*fbc->counters);
+			fbc->count += count + amount;
+			__this_cpu_sub(*fbc->counters, count);
+			raw_spin_unlock_irqrestore(&fbc->lock, flags);
+			return;
+		}
+	} while (!this_cpu_try_cmpxchg(*fbc->counters, &count, count + amount));
+}
+#else
+/*
+ * local_irq_save() is used to make the function irq safe:
+ * - The slow path would be ok as protected by an irq-safe spinlock.
+ * - this_cpu_add would be ok as it is irq-safe by definition.
  */
 void percpu_counter_add_batch(struct percpu_counter *fbc, s64 amount, s32 batch)
 {
···
 	}
 	local_irq_restore(flags);
 }
+#endif
 EXPORT_SYMBOL(percpu_counter_add_batch);
 
 /*
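
The retry structure of the cmpxchg-based variant can be tried outside the kernel with a rough userspace sketch. This is not the kernel code. The names `sketch_counter`, `sketch_add_batch` and `sketch_sum` are hypothetical, a single C11 atomic slot stands in for the per-CPU counter, and the slow path takes no lock and disables no interrupts, so it only illustrates how the batch check is re-evaluated each time the compare-exchange fails.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for struct percpu_counter:
 * one "local" slot instead of real per-CPU storage, and no spinlock.
 */
struct sketch_counter {
	int64_t count;         /* global value, updated only on the slow path */
	_Atomic int64_t local; /* stands in for this CPU's *fbc->counters */
};

static void sketch_add_batch(struct sketch_counter *c, int64_t amount,
			     int32_t batch)
{
	int64_t old;

	assert(batch > 0);
	old = atomic_load_explicit(&c->local, memory_order_relaxed);
	do {
		if (llabs(old + amount) >= batch) {
			/*
			 * Slow path. The kernel version holds fbc->lock with
			 * interrupts disabled here; this sketch does neither.
			 */
			old = atomic_load_explicit(&c->local,
						   memory_order_relaxed);
			c->count += old + amount;
			atomic_fetch_sub_explicit(&c->local, old,
						  memory_order_relaxed);
			return;
		}
		/*
		 * Fast path: publish old + amount only if the slot still
		 * holds old. On failure, old is refreshed with the current
		 * value and the batch check above runs again.
		 */
	} while (!atomic_compare_exchange_weak_explicit(&c->local, &old,
							old + amount,
							memory_order_relaxed,
							memory_order_relaxed));
}

/* Global part plus whatever has not been folded in yet. */
static int64_t sketch_sum(struct sketch_counter *c)
{
	return c->count + atomic_load_explicit(&c->local, memory_order_relaxed);
}
```

With a batch of 4, three calls adding 1 stay in the local slot. The fourth call reaches the threshold and folds the accumulated value into `count`.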