Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

s390/percpu: Provide arch_raw_cpu_ptr()

Provide an s390 specific arch_raw_cpu_ptr() implementation which avoids the
detour over get_lowcore() to get the lowcore pointer. The inline assembly
is implemented with an alternative so that relocated lowcore (percpu offset
is at a different address) is handled correctly.
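
As a rough user-space illustration of what the macro computes, the sketch below models the lowcore as a plain byte array and picks the address of percpu_offset depending on whether the lowcore is relocated. All names (model_lowcore, model_memory, model_raw_cpu_ptr, ...) are hypothetical, and the value 0x70000 used for the alternate lowcore address is an assumption that should be checked against the s390 headers. Only the displacement 952 is taken from the disassembly in this commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct lowcore; only the field position matters. */
struct model_lowcore {
	unsigned char pad[952];		/* 952 is the displacement in the disassembly */
	unsigned long percpu_offset;
};

/* ASSUMPTION: stand-in for LOWCORE_ALT_ADDRESS, not taken from this commit. */
#define MODEL_LOWCORE_ALT_ADDRESS 0x70000UL

/* Simulated memory holding one lowcore at address 0 and one at the alternate address. */
unsigned char model_memory[MODEL_LOWCORE_ALT_ADDRESS + sizeof(struct model_lowcore)];

/* Address that the "offzero" or "offalt" variant of the instruction would read from. */
unsigned long model_percpu_offset_addr(int lowcore_relocated)
{
	unsigned long off = offsetof(struct model_lowcore, percpu_offset);

	return lowcore_relocated ? off + MODEL_LOWCORE_ALT_ADDRESS : off;
}

/* Store a per-CPU offset into the selected lowcore copy. */
void model_set_percpu_offset(int lowcore_relocated, unsigned long value)
{
	unsigned long addr = model_percpu_offset_addr(lowcore_relocated);

	assert(addr + sizeof(value) <= sizeof(model_memory));
	memcpy(&model_memory[addr], &value, sizeof(value));
}

/* Model of the single "ag": ptr += value stored at a fixed address, no base register. */
unsigned long model_raw_cpu_ptr(unsigned long ptr, int lowcore_relocated)
{
	unsigned long percpu_offset;

	memcpy(&percpu_offset, &model_memory[model_percpu_offset_addr(lowcore_relocated)],
	       sizeof(percpu_offset));
	return ptr + percpu_offset;
}
```

In the kernel the choice between the two addresses is not a runtime branch as in this model: the alternative patches the instruction itself. If the assumed alternate address is right, 0x70000 + 952 still fits into a signed 20-bit displacement, which would explain why neither variant needs a base register.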

This turns code like this

102f78: a7 39 00 00 lghi %r3,0
102f7c: e3 20 33 b8 00 08 ag %r2,952(%r3)

which adds the percpu offset to register r2 into a single instruction

102f7c: e3 20 03 b8 00 08 ag %r2,952(%r0)

and also avoids the need of a base register, thus reducing register
pressure.
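
To make the register-pressure point concrete, the following sketch decodes the six bytes of the original "ag %r2,952(%r3)" instruction. It assumes the usual RXY instruction layout (opcode, R1/X2, B2 plus the low 12 displacement bits, the high 8 displacement bits, opcode extension); the struct and function names are hypothetical. A base field of 0 means that no register contributes to the address, so a form that uses 952(%r0) needs no register to be loaded with zero first.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded operand fields of an RXY-format instruction such as "ag". */
struct rxy_fields {
	unsigned int r1;	/* first operand register */
	unsigned int x2;	/* index register field, 0 means no index */
	unsigned int b2;	/* base register field, 0 means no base */
	int32_t disp;		/* signed 20-bit displacement */
};

/* "ag" is opcode 0xe3 with opcode extension 0x08 in the last byte. */
int is_ag(const uint8_t insn[6])
{
	return insn[0] == 0xe3 && insn[5] == 0x08;
}

/* Split an "ag" instruction into its register and displacement fields. */
struct rxy_fields decode_ag(const uint8_t insn[6])
{
	struct rxy_fields f;
	uint32_t dl = ((uint32_t)(insn[2] & 0x0f) << 8) | insn[3];	/* low 12 bits */
	uint32_t dh = insn[4];						/* high 8 bits */
	uint32_t raw = (dh << 12) | dl;

	assert(is_ag(insn));
	f.r1 = insn[1] >> 4;
	f.x2 = insn[1] & 0x0f;
	f.b2 = insn[2] >> 4;
	/* Sign-extend the 20-bit displacement. */
	f.disp = (raw & 0x80000) ? (int32_t)raw - 0x100000 : (int32_t)raw;
	return f;
}
```

Applied to the bytes e3 20 33 b8 00 08, this yields r1 = 2, x2 = 0, b2 = 3 and a displacement of 952, matching the disassembly "ag %r2,952(%r3)".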

With defconfig bloat-o-meter -t provides this result:

add/remove: 12/26 grow/shrink: 183/3391 up/down: 14880/-41950 (-27070)
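
The figure in parentheses is the net size change, that is the sum of the "up" and "down" totals. A small sketch that parses such a summary line and checks the arithmetic (the struct and function names are hypothetical, and the format string only assumes the layout of the line quoted above):

```c
#include <assert.h>
#include <stdio.h>

/* Fields of a bloat-o-meter summary line. */
struct bloat_summary {
	long add, remove;	/* symbols added / removed */
	long grow, shrink;	/* symbols that grew / shrank */
	long up, down;		/* total growth / total shrinkage */
	long net;		/* value printed in parentheses */
};

/* Returns 1 if all seven fields were parsed, 0 otherwise. */
int parse_bloat_summary(const char *line, struct bloat_summary *s)
{
	int n;

	assert(line && s);
	n = sscanf(line,
		   "add/remove: %ld/%ld grow/shrink: %ld/%ld up/down: %ld/%ld (%ld)",
		   &s->add, &s->remove, &s->grow, &s->shrink,
		   &s->up, &s->down, &s->net);
	return n == 7;
}

/* Net change recomputed from the up/down totals. */
long bloat_net(const struct bloat_summary *s)
{
	return s->up + s->down;
}
```

For the line above, 14880 + (-41950) gives -27070, which agrees with the reported net value.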

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>

Authored by Heiko Carstens and committed by Vasily Gorbik
fa8be59c 7bbf7013

+18
arch/s390/include/asm/percpu.h
@@ -12,6 +12,24 @@
  */
 #define __my_cpu_offset get_lowcore()->percpu_offset
 
+#define arch_raw_cpu_ptr(_ptr)						\
+({									\
+	unsigned long lc_percpu, tcp_ptr__;				\
+									\
+	tcp_ptr__ = (__force unsigned long)(_ptr);			\
+	lc_percpu = offsetof(struct lowcore, percpu_offset);		\
+	asm_inline volatile(						\
+		ALTERNATIVE("ag %[__ptr__],%[offzero](%%r0)\n",		\
+			    "ag %[__ptr__],%[offalt](%%r0)\n",		\
+			    ALT_FEATURE(MFEATURE_LOWCORE))		\
+		: [__ptr__] "+d" (tcp_ptr__)				\
+		: [offzero] "i" (lc_percpu),				\
+		  [offalt] "i" (lc_percpu + LOWCORE_ALT_ADDRESS),	\
+		  "m" (((struct lowcore *)0)->percpu_offset)		\
+		: "cc");						\
+	(TYPEOF_UNQUAL(*(_ptr)) __force __kernel *)tcp_ptr__;		\
+})
+
 /*
  * We use a compare-and-swap loop since that uses less cpu cycles than
  * disabling and enabling interrupts like the generic variant would do.