Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs

AVX-512 supports 3-input XORs via the vpternlogd (or vpternlogq)
instruction with immediate 0x96. This approach, vs. the alternative of
two vpxor instructions, is already used in the CRC, AES-GCM, and AES-XTS
code, since it reduces the instruction count and is faster on some CPUs.
Make blake2s_compress_avx512() take advantage of it too.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20251102234209.62133-7-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>

+2 -4
lib/crypto/x86/blake2s-core.S
@@ -278,10 +278,8 @@
 	jne		.Lavx512_roundloop
 
 	// Compute the new h: h[0..7] ^= v[0..7] ^ v[8..15]
-	vpxor		%xmm10,%xmm0,%xmm0
-	vpxor		%xmm11,%xmm1,%xmm1
-	vpxor		%xmm2,%xmm0,%xmm0
-	vpxor		%xmm3,%xmm1,%xmm1
+	vpternlogd	$0x96,%xmm10,%xmm2,%xmm0
+	vpternlogd	$0x96,%xmm11,%xmm3,%xmm1
 	decq		NBLOCKS
 	jne		.Lavx512_mainloop
 
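
The commit message says that immediate 0x96 turns vpternlogd into a 3-input XOR. The C sketch below is not part of the patch. It is a scalar model for checking that claim, and the helper names xor3_truth_table() and ternlog32() are hypothetical. It assumes the usual reading of the immediate as an 8-entry truth table indexed by one bit from each of the three operands. For XOR the operand order does not affect the result.

```c
#include <stdint.h>

/*
 * Build the 8-entry truth table for f(a, b, c) = a ^ b ^ c.
 * Entry idx holds f for the input bits a = idx bit 2, b = idx bit 1,
 * c = idx bit 0. (Hypothetical helper, for illustration only.)
 */
uint8_t xor3_truth_table(void)
{
	uint8_t imm = 0;

	for (unsigned int idx = 0; idx < 8; idx++) {
		unsigned int a = (idx >> 2) & 1;
		unsigned int b = (idx >> 1) & 1;
		unsigned int c = idx & 1;

		imm |= (uint8_t)((a ^ b ^ c) << idx);
	}
	return imm;
}

/*
 * Scalar model of one 32-bit lane of a ternary-logic instruction:
 * for each bit position, the three input bits select one bit of imm.
 * (Assumed semantics; a sketch, not the hardware definition.)
 */
uint32_t ternlog32(uint32_t a, uint32_t b, uint32_t c, uint8_t imm)
{
	uint32_t result = 0;

	for (unsigned int bit = 0; bit < 32; bit++) {
		unsigned int idx = (((a >> bit) & 1) << 2) |
				   (((b >> bit) & 1) << 1) |
				   ((c >> bit) & 1);

		result |= (uint32_t)((imm >> idx) & 1) << bit;
	}
	return result;
}
```

Under this model, xor3_truth_table() yields 0x96, and ternlog32(a, b, c, 0x96) matches a ^ b ^ c. That is why each pair of vpxor instructions in the hunk above can be replaced by a single vpternlogd.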