Linux kernel mirror (for testing): git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull crypto library updates from Eric Biggers:
"This is the main crypto library pull request for 6.19. It includes:

- Add SHA-3 support to lib/crypto/, including support for both the
hash functions and the extendable-output functions. Reimplement the
existing SHA-3 crypto_shash support on top of the library.

This is motivated mainly by the upcoming support for the ML-DSA
signature algorithm, which needs the SHAKE128 and SHAKE256
functions. But even on its own it's a useful cleanup.

This also fixes the longstanding issue where the
architecture-optimized SHA-3 code was disabled by default.

- Add BLAKE2b support to lib/crypto/, and reimplement the existing
BLAKE2b crypto_shash support on top of the library.

This is motivated mainly by btrfs, which supports BLAKE2b
checksums. With this change, all btrfs checksum algorithms now have
library APIs. btrfs is planned to start just using the library
directly.

This refactor also improves consistency between the BLAKE2b code
and BLAKE2s code. And as usual, it also fixes the issue where the
architecture-optimized BLAKE2b code was disabled by default.

- Add POLYVAL support to lib/crypto/, replacing the existing POLYVAL
support in crypto_shash. Reimplement HCTR2 on top of the library.

This simplifies the code and improves HCTR2 performance. As usual,
it also makes the architecture-optimized code be enabled by
default. The generic implementation of POLYVAL is greatly improved
as well.

- Clean up the BLAKE2s code

- Add FIPS self-tests for SHA-1, SHA-2, and SHA-3"
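
For a sense of the new library surface the first bullet describes, here is a
minimal sketch of a SHAKE256 call. The shake256() signature is taken verbatim
from the documentation added below (Documentation/crypto/sha3.rst); the
wrapper function around it is hypothetical:

	#include <crypto/sha3.h>

	/*
	 * Hypothetical helper: derive a 64-byte value from a message with
	 * the new one-shot SHAKE256 library function (absorb msg, then
	 * squeeze exactly 64 bytes of XOF output).
	 */
	static void derive_seed(const u8 *msg, size_t msg_len, u8 seed[64])
	{
		shake256(msg, msg_len, seed, 64);
	}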

* tag 'libcrypto-updates-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (37 commits)
fscrypt: Drop obsolete recommendation to enable optimized POLYVAL
crypto: polyval - Remove the polyval crypto_shash
crypto: hctr2 - Convert to use POLYVAL library
lib/crypto: x86/polyval: Migrate optimized code into library
lib/crypto: arm64/polyval: Migrate optimized code into library
lib/crypto: polyval: Add POLYVAL library
crypto: polyval - Rename conflicting functions
lib/crypto: x86/blake2s: Use vpternlogd for 3-input XORs
lib/crypto: x86/blake2s: Avoid writing back unchanged 'f' value
lib/crypto: x86/blake2s: Improve readability
lib/crypto: x86/blake2s: Use local labels for data
lib/crypto: x86/blake2s: Drop check for nblocks == 0
lib/crypto: x86/blake2s: Fix 32-bit arg treated as 64-bit
lib/crypto: arm, arm64: Drop filenames from file comments
lib/crypto: arm/blake2s: Fix some comments
crypto: s390/sha3 - Remove superseded SHA-3 code
crypto: sha3 - Reimplement using library API
crypto: jitterentropy - Use default sha3 implementation
lib/crypto: s390/sha3: Add optimized one-shot SHA-3 digest functions
lib/crypto: sha3: Support arch overrides of one-shot digest functions
...

+3074 -2526
+1
Documentation/crypto/index.rst
···
  descore-readme
  device_drivers/index
  krb5
+ sha3
+119
Documentation/crypto/sha3.rst
···
.. SPDX-License-Identifier: GPL-2.0-or-later

==========================
SHA-3 Algorithm Collection
==========================

.. contents::

Overview
========

The SHA-3 family of algorithms, as specified in NIST FIPS-202 [1]_, contains six
algorithms based on the Keccak sponge function. The differences between them
are: the "rate" (how much of the state buffer gets updated with new data between
invocations of the Keccak function and analogous to the "block size"), what
domain separation suffix gets appended to the input data, and how much output
data is extracted at the end. The Keccak sponge function is designed such that
arbitrary amounts of output can be obtained for certain algorithms.

Four digest algorithms are provided:

- SHA3-224
- SHA3-256
- SHA3-384
- SHA3-512

Additionally, two Extendable-Output Functions (XOFs) are provided:

- SHAKE128
- SHAKE256

The SHA-3 library API supports all six of these algorithms. The four digest
algorithms are also supported by the crypto_shash and crypto_ahash APIs.

This document describes the SHA-3 library API.


Digests
=======

The following functions compute SHA-3 digests::

  void sha3_224(const u8 *in, size_t in_len, u8 out[SHA3_224_DIGEST_SIZE]);
  void sha3_256(const u8 *in, size_t in_len, u8 out[SHA3_256_DIGEST_SIZE]);
  void sha3_384(const u8 *in, size_t in_len, u8 out[SHA3_384_DIGEST_SIZE]);
  void sha3_512(const u8 *in, size_t in_len, u8 out[SHA3_512_DIGEST_SIZE]);

For users that need to pass in data incrementally, an incremental API is also
provided. The incremental API uses the following struct::

  struct sha3_ctx { ... };

Initialization is done with one of::

  void sha3_224_init(struct sha3_ctx *ctx);
  void sha3_256_init(struct sha3_ctx *ctx);
  void sha3_384_init(struct sha3_ctx *ctx);
  void sha3_512_init(struct sha3_ctx *ctx);

Input data is then added with any number of calls to::

  void sha3_update(struct sha3_ctx *ctx, const u8 *in, size_t in_len);

Finally, the digest is generated using::

  void sha3_final(struct sha3_ctx *ctx, u8 *out);

which also zeroizes the context. The length of the digest is determined by the
initialization function that was called.


Extendable-Output Functions
===========================

The following functions compute the SHA-3 extendable-output functions (XOFs)::

  void shake128(const u8 *in, size_t in_len, u8 *out, size_t out_len);
  void shake256(const u8 *in, size_t in_len, u8 *out, size_t out_len);

For users that need to provide the input data incrementally and/or receive the
output data incrementally, an incremental API is also provided. The incremental
API uses the following struct::

  struct shake_ctx { ... };

Initialization is done with one of::

  void shake128_init(struct shake_ctx *ctx);
  void shake256_init(struct shake_ctx *ctx);

Input data is then added with any number of calls to::

  void shake_update(struct shake_ctx *ctx, const u8 *in, size_t in_len);

Finally, the output data is extracted with any number of calls to::

  void shake_squeeze(struct shake_ctx *ctx, u8 *out, size_t out_len);

each call telling it how much data should be extracted. Note that performing
multiple squeezes, with the output laid consecutively in a buffer, gets exactly
the same output as doing a single squeeze for the combined amount over the same
buffer.

More input data cannot be added after squeezing has started.

Once all the desired output has been extracted, zeroize the context::

  void shake_zeroize_ctx(struct shake_ctx *ctx);


References
==========

.. [1] https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf


API Function Reference
======================

.. kernel-doc:: include/crypto/sha3.h
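
The incremental APIs above compose in the obvious way. A short illustrative
sketch (the calling code is hypothetical; every function and constant name
comes from the documentation just added):

	#include <crypto/sha3.h>

	static void sha3_doc_examples(const u8 *a, size_t a_len,
				      const u8 *b, size_t b_len)
	{
		struct sha3_ctx ctx;
		struct shake_ctx xof;
		u8 digest[SHA3_256_DIGEST_SIZE];
		u8 out[64];

		/* Incremental SHA3-256 over the concatenation of a and b. */
		sha3_256_init(&ctx);
		sha3_update(&ctx, a, a_len);
		sha3_update(&ctx, b, b_len);
		sha3_final(&ctx, digest);	/* also zeroizes ctx */

		/* SHAKE128: two 32-byte squeezes laid consecutively give
		 * the same bytes as one 64-byte squeeze, as noted above. */
		shake128_init(&xof);
		shake_update(&xof, a, a_len);
		shake_squeeze(&xof, out, 32);
		shake_squeeze(&xof, out + 32, 32);
		shake_zeroize_ctx(&xof);
	}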
-2
Documentation/filesystems/fscrypt.rst
···
    - CONFIG_CRYPTO_HCTR2
  - Recommended:
    - arm64: CONFIG_CRYPTO_AES_ARM64_CE_BLK
-   - arm64: CONFIG_CRYPTO_POLYVAL_ARM64_CE
    - x86: CONFIG_CRYPTO_AES_NI_INTEL
-   - x86: CONFIG_CRYPTO_POLYVAL_CLMUL_NI

  - Adiantum
  - Mandatory:
-16
arch/arm/crypto/Kconfig
···
	  Architecture: arm using:
	  - NEON (Advanced SIMD) extensions

- config CRYPTO_BLAKE2B_NEON
- 	tristate "Hash functions: BLAKE2b (NEON)"
- 	depends on KERNEL_MODE_NEON
- 	select CRYPTO_BLAKE2B
- 	help
- 	  BLAKE2b cryptographic hash function (RFC 7693)
-
- 	  Architecture: arm using
- 	  - NEON (Advanced SIMD) extensions
-
- 	  BLAKE2b digest algorithm optimized with ARM NEON instructions.
- 	  On ARM processors that have NEON support but not the ARMv8
- 	  Crypto Extensions, typically this BLAKE2b implementation is
- 	  much faster than the SHA-2 family and slightly faster than
- 	  SHA-1.
-
  config CRYPTO_AES_ARM
  	tristate "Ciphers: AES"
  	select CRYPTO_ALGAPI
-2
arch/arm/crypto/Makefile
···

  obj-$(CONFIG_CRYPTO_AES_ARM) += aes-arm.o
  obj-$(CONFIG_CRYPTO_AES_ARM_BS) += aes-arm-bs.o
- obj-$(CONFIG_CRYPTO_BLAKE2B_NEON) += blake2b-neon.o
  obj-$(CONFIG_CRYPTO_NHPOLY1305_NEON) += nhpoly1305-neon.o

  obj-$(CONFIG_CRYPTO_AES_ARM_CE) += aes-arm-ce.o
···
  aes-arm-y := aes-cipher-core.o aes-cipher-glue.o
  aes-arm-bs-y := aes-neonbs-core.o aes-neonbs-glue.o
- blake2b-neon-y := blake2b-neon-core.o blake2b-neon-glue.o
  aes-arm-ce-y := aes-ce-core.o aes-ce-glue.o
  ghash-arm-ce-y := ghash-ce-core.o ghash-ce-glue.o
  nhpoly1305-neon-y := nh-neon-core.o nhpoly1305-neon-glue.o
+16 -13
arch/arm/crypto/blake2b-neon-core.S → lib/crypto/arm/blake2b-neon-core.S
···
  /* SPDX-License-Identifier: GPL-2.0-or-later */
  /*
-  * BLAKE2b digest algorithm, NEON accelerated
+  * BLAKE2b digest algorithm optimized with ARM NEON instructions. On ARM
+  * processors that have NEON support but not the ARMv8 Crypto Extensions,
+  * typically this BLAKE2b implementation is much faster than the SHA-2 family
+  * and slightly faster than SHA-1.
   *
   * Copyright 2020 Google LLC
   *
···
  	.fpu		neon

  	// The arguments to blake2b_compress_neon()
- STATE		.req	r0
- BLOCK		.req	r1
+ CTX		.req	r0
+ DATA		.req	r1
  NBLOCKS	.req	r2
  INC		.req	r3
···
  .endm

  //
- // void blake2b_compress_neon(struct blake2b_state *state,
- //			       const u8 *block, size_t nblocks, u32 inc);
+ // void blake2b_compress_neon(struct blake2b_ctx *ctx,
+ //			       const u8 *data, size_t nblocks, u32 inc);
  //
- // Only the first three fields of struct blake2b_state are used:
+ // Only the first three fields of struct blake2b_ctx are used:
  //	u64 h[8];	(inout)
  //	u64 t[2];	(inout)
  //	u64 f[2];	(in)
···
  	adr		ROR24_TABLE, .Lror24_table
  	adr		ROR16_TABLE, .Lror16_table

- 	mov		ip, STATE
+ 	mov		ip, CTX
  	vld1.64		{q0-q1}, [ip]!		// Load h[0..3]
  	vld1.64		{q2-q3}, [ip]!		// Load h[4..7]
  .Lnext_block:
···
  	// (q8-q9) in an aligned buffer on the stack so that they can be
  	// reloaded when needed. (We could just reload directly from the
  	// message buffer, but it's faster to use aligned loads.)
- 	vld1.8		{q8-q9}, [BLOCK]!
+ 	vld1.8		{q8-q9}, [DATA]!
  	veor		q6, q6, q14	// v[12..13] = IV[4..5] ^ t[0..1]
- 	vld1.8		{q10-q11}, [BLOCK]!
+ 	vld1.8		{q10-q11}, [DATA]!
  	veor		q7, q7, q15	// v[14..15] = IV[6..7] ^ f[0..1]
- 	vld1.8		{q12-q13}, [BLOCK]!
+ 	vld1.8		{q12-q13}, [DATA]!
  	vst1.8		{q8-q9}, [sp, :256]
- 	mov		ip, STATE
- 	vld1.8		{q14-q15}, [BLOCK]!
+ 	mov		ip, CTX
+ 	vld1.8		{q14-q15}, [DATA]!

  	// Execute the rounds. Each round is provided the order in which it
  	// needs to use the message words.
···
  	veor		q3, q3, q7	// v[6..7] ^= v[14..15]
  	veor		q0, q0, q8	// v[0..1] ^= h[0..1]
  	veor		q1, q1, q9	// v[2..3] ^= h[2..3]
- 	mov		ip, STATE
+ 	mov		ip, CTX
  	subs		NBLOCKS, NBLOCKS, #1	// nblocks--
  	vst1.64		{q0-q1}, [ip]!	// Store new h[0..3]
  	veor		q2, q2, q10	// v[4..5] ^= h[4..5]
-104
arch/arm/crypto/blake2b-neon-glue.c
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * BLAKE2b digest algorithm, NEON accelerated 4 - * 5 - * Copyright 2020 Google LLC 6 - */ 7 - 8 - #include <crypto/internal/blake2b.h> 9 - #include <crypto/internal/hash.h> 10 - 11 - #include <linux/module.h> 12 - #include <linux/sizes.h> 13 - 14 - #include <asm/neon.h> 15 - #include <asm/simd.h> 16 - 17 - asmlinkage void blake2b_compress_neon(struct blake2b_state *state, 18 - const u8 *block, size_t nblocks, u32 inc); 19 - 20 - static void blake2b_compress_arch(struct blake2b_state *state, 21 - const u8 *block, size_t nblocks, u32 inc) 22 - { 23 - do { 24 - const size_t blocks = min_t(size_t, nblocks, 25 - SZ_4K / BLAKE2B_BLOCK_SIZE); 26 - 27 - kernel_neon_begin(); 28 - blake2b_compress_neon(state, block, blocks, inc); 29 - kernel_neon_end(); 30 - 31 - nblocks -= blocks; 32 - block += blocks * BLAKE2B_BLOCK_SIZE; 33 - } while (nblocks); 34 - } 35 - 36 - static int crypto_blake2b_update_neon(struct shash_desc *desc, 37 - const u8 *in, unsigned int inlen) 38 - { 39 - return crypto_blake2b_update_bo(desc, in, inlen, blake2b_compress_arch); 40 - } 41 - 42 - static int crypto_blake2b_finup_neon(struct shash_desc *desc, const u8 *in, 43 - unsigned int inlen, u8 *out) 44 - { 45 - return crypto_blake2b_finup(desc, in, inlen, out, 46 - blake2b_compress_arch); 47 - } 48 - 49 - #define BLAKE2B_ALG(name, driver_name, digest_size) \ 50 - { \ 51 - .base.cra_name = name, \ 52 - .base.cra_driver_name = driver_name, \ 53 - .base.cra_priority = 200, \ 54 - .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY | \ 55 - CRYPTO_AHASH_ALG_BLOCK_ONLY | \ 56 - CRYPTO_AHASH_ALG_FINAL_NONZERO, \ 57 - .base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \ 58 - .base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \ 59 - .base.cra_module = THIS_MODULE, \ 60 - .digestsize = digest_size, \ 61 - .setkey = crypto_blake2b_setkey, \ 62 - .init = crypto_blake2b_init, \ 63 - .update = crypto_blake2b_update_neon, \ 64 - .finup = crypto_blake2b_finup_neon, \ 65 - .descsize = sizeof(struct blake2b_state), \ 66 - .statesize = BLAKE2B_STATE_SIZE, \ 67 - } 68 - 69 - static struct shash_alg blake2b_neon_algs[] = { 70 - BLAKE2B_ALG("blake2b-160", "blake2b-160-neon", BLAKE2B_160_HASH_SIZE), 71 - BLAKE2B_ALG("blake2b-256", "blake2b-256-neon", BLAKE2B_256_HASH_SIZE), 72 - BLAKE2B_ALG("blake2b-384", "blake2b-384-neon", BLAKE2B_384_HASH_SIZE), 73 - BLAKE2B_ALG("blake2b-512", "blake2b-512-neon", BLAKE2B_512_HASH_SIZE), 74 - }; 75 - 76 - static int __init blake2b_neon_mod_init(void) 77 - { 78 - if (!(elf_hwcap & HWCAP_NEON)) 79 - return -ENODEV; 80 - 81 - return crypto_register_shashes(blake2b_neon_algs, 82 - ARRAY_SIZE(blake2b_neon_algs)); 83 - } 84 - 85 - static void __exit blake2b_neon_mod_exit(void) 86 - { 87 - crypto_unregister_shashes(blake2b_neon_algs, 88 - ARRAY_SIZE(blake2b_neon_algs)); 89 - } 90 - 91 - module_init(blake2b_neon_mod_init); 92 - module_exit(blake2b_neon_mod_exit); 93 - 94 - MODULE_DESCRIPTION("BLAKE2b digest algorithm, NEON accelerated"); 95 - MODULE_LICENSE("GPL"); 96 - MODULE_AUTHOR("Eric Biggers <ebiggers@google.com>"); 97 - MODULE_ALIAS_CRYPTO("blake2b-160"); 98 - MODULE_ALIAS_CRYPTO("blake2b-160-neon"); 99 - MODULE_ALIAS_CRYPTO("blake2b-256"); 100 - MODULE_ALIAS_CRYPTO("blake2b-256-neon"); 101 - MODULE_ALIAS_CRYPTO("blake2b-384"); 102 - MODULE_ALIAS_CRYPTO("blake2b-384-neon"); 103 - MODULE_ALIAS_CRYPTO("blake2b-512"); 104 - MODULE_ALIAS_CRYPTO("blake2b-512-neon");
+1 -1
arch/arm64/configs/defconfig
···
  CONFIG_CRYPTO_BENCHMARK=m
  CONFIG_CRYPTO_ECHAINIV=y
  CONFIG_CRYPTO_MICHAEL_MIC=m
+ CONFIG_CRYPTO_SHA3=m
  CONFIG_CRYPTO_ANSI_CPRNG=y
  CONFIG_CRYPTO_USER_API_RNG=m
  CONFIG_CRYPTO_GHASH_ARM64_CE=y
- CONFIG_CRYPTO_SHA3_ARM64=m
  CONFIG_CRYPTO_SM3_ARM64_CE=m
  CONFIG_CRYPTO_AES_ARM64_CE_BLK=y
  CONFIG_CRYPTO_AES_ARM64_BS=m
-21
arch/arm64/crypto/Kconfig
···
	  Architecture: arm64 using:
	  - NEON (Advanced SIMD) extensions

- config CRYPTO_SHA3_ARM64
- 	tristate "Hash functions: SHA-3 (ARMv8.2 Crypto Extensions)"
- 	depends on KERNEL_MODE_NEON
- 	select CRYPTO_HASH
- 	select CRYPTO_SHA3
- 	help
- 	  SHA-3 secure hash algorithms (FIPS 202)
-
- 	  Architecture: arm64 using:
- 	  - ARMv8.2 Crypto Extensions
-
  config CRYPTO_SM3_NEON
  	tristate "Hash functions: SM3 (NEON)"
  	depends on KERNEL_MODE_NEON
···

	  Architecture: arm64 using:
	  - ARMv8.2 Crypto Extensions
-
- config CRYPTO_POLYVAL_ARM64_CE
- 	tristate "Hash functions: POLYVAL (ARMv8 Crypto Extensions)"
- 	depends on KERNEL_MODE_NEON
- 	select CRYPTO_POLYVAL
- 	help
- 	  POLYVAL hash function for HCTR2
-
- 	  Architecture: arm64 using:
- 	  - ARMv8 Crypto Extensions

  config CRYPTO_AES_ARM64
  	tristate "Ciphers: AES, modes: ECB, CBC, CTR, CTS, XCTR, XTS"
-6
arch/arm64/crypto/Makefile
···
  # Copyright (C) 2014 Linaro Ltd <ard.biesheuvel@linaro.org>
  #

- obj-$(CONFIG_CRYPTO_SHA3_ARM64) += sha3-ce.o
- sha3-ce-y := sha3-ce-glue.o sha3-ce-core.o
-
  obj-$(CONFIG_CRYPTO_SM3_NEON) += sm3-neon.o
  sm3-neon-y := sm3-neon-glue.o sm3-neon-core.o
···

  obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
  ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
-
- obj-$(CONFIG_CRYPTO_POLYVAL_ARM64_CE) += polyval-ce.o
- polyval-ce-y := polyval-ce-glue.o polyval-ce-core.o

  obj-$(CONFIG_CRYPTO_AES_ARM64_CE) += aes-ce-cipher.o
  aes-ce-cipher-y := aes-ce-core.o aes-ce-glue.o
+18 -20
arch/arm64/crypto/polyval-ce-core.S → lib/crypto/arm64/polyval-ce-core.S
···
  #include <linux/linkage.h>
  #define STRIDE_BLOCKS 8

- KEY_POWERS	.req	x0
- MSG		.req	x1
- BLOCKS_LEFT	.req	x2
- ACCUMULATOR	.req	x3
+ ACCUMULATOR	.req	x0
+ KEY_POWERS	.req	x1
+ MSG		.req	x2
+ BLOCKS_LEFT	.req	x3
  KEY_START	.req	x10
  EXTRA_BYTES	.req	x11
  TMP		.req	x13
···
  .endm

  /*
-  * Perform montgomery multiplication in GF(2^128) and store result in op1.
+  * Computes a = a * b * x^{-128} mod x^128 + x^127 + x^126 + x^121 + 1.
   *
-  * Computes op1*op2*x^{-128} mod x^128 + x^127 + x^126 + x^121 + 1
-  * If op1, op2 are in montgomery form, this computes the montgomery
-  * form of op1*op2.
-  *
-  * void pmull_polyval_mul(u8 *op1, const u8 *op2);
+  * void polyval_mul_pmull(struct polyval_elem *a,
+  *			   const struct polyval_elem *b);
   */
- SYM_FUNC_START(pmull_polyval_mul)
+ SYM_FUNC_START(polyval_mul_pmull)
  	adr	TMP, .Lgstar
  	ld1	{GSTAR.2d}, [TMP]
  	ld1	{v0.16b}, [x0]
···
  	montgomery_reduction SUM
  	st1	{SUM.16b}, [x0]
  	ret
- SYM_FUNC_END(pmull_polyval_mul)
+ SYM_FUNC_END(polyval_mul_pmull)

  /*
   * Perform polynomial evaluation as specified by POLYVAL. This computes:
   *	h^n * accumulator + h^n * m_0 + ... + h^1 * m_{n-1}
   * where n=nblocks, h is the hash key, and m_i are the message blocks.
   *
-  * x0 - pointer to precomputed key powers h^8 ... h^1
-  * x1 - pointer to message blocks
-  * x2 - number of blocks to hash
-  * x3 - pointer to accumulator
+  * x0 - pointer to accumulator
+  * x1 - pointer to precomputed key powers h^8 ... h^1
+  * x2 - pointer to message blocks
+  * x3 - number of blocks to hash
   *
-  * void pmull_polyval_update(const struct polyval_ctx *ctx, const u8 *in,
-  *			      size_t nblocks, u8 *accumulator);
+  * void polyval_blocks_pmull(struct polyval_elem *acc,
+  *			      const struct polyval_key *key,
+  *			      const u8 *data, size_t nblocks);
   */
- SYM_FUNC_START(pmull_polyval_update)
+ SYM_FUNC_START(polyval_blocks_pmull)
  	adr	TMP, .Lgstar
  	mov	KEY_START, KEY_POWERS
  	ld1	{GSTAR.2d}, [TMP]
···
  .LskipPartial:
  	st1	{SUM.16b}, [ACCUMULATOR]
  	ret
- SYM_FUNC_END(pmull_polyval_update)
+ SYM_FUNC_END(polyval_blocks_pmull)
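
The renamed routines above now take the accumulator first, matching how a
generic caller would drive them. A sketch of such a caller, under stated
assumptions: the polyval_blocks_pmull() signature is from the assembly
comments above, but this wrapper is a guess at the shape of the new
lib/crypto code, not a copy of it; the 4 KiB chunking mirrors the "allow
rescheduling every 4K bytes" pattern in the glue code removed below:

	#include <asm/neon.h>

	/* POLYVAL blocks are 16 bytes, so 256 blocks per 4 KiB chunk. */
	static void polyval_blocks(struct polyval_elem *acc,
				   const struct polyval_key *key,
				   const u8 *data, size_t nblocks)
	{
		do {
			size_t n = min(nblocks, (size_t)256);

			kernel_neon_begin();
			polyval_blocks_pmull(acc, key, data, n);
			kernel_neon_end();
			data += n * 16;
			nblocks -= n;
		} while (nblocks);
	}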
-158
arch/arm64/crypto/polyval-ce-glue.c
··· 1 - // SPDX-License-Identifier: GPL-2.0-only 2 - /* 3 - * Glue code for POLYVAL using ARMv8 Crypto Extensions 4 - * 5 - * Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi> 6 - * Copyright (c) 2009 Intel Corp. 7 - * Author: Huang Ying <ying.huang@intel.com> 8 - * Copyright 2021 Google LLC 9 - */ 10 - 11 - /* 12 - * Glue code based on ghash-clmulni-intel_glue.c. 13 - * 14 - * This implementation of POLYVAL uses montgomery multiplication accelerated by 15 - * ARMv8 Crypto Extensions instructions to implement the finite field operations. 16 - */ 17 - 18 - #include <asm/neon.h> 19 - #include <crypto/internal/hash.h> 20 - #include <crypto/polyval.h> 21 - #include <crypto/utils.h> 22 - #include <linux/cpufeature.h> 23 - #include <linux/errno.h> 24 - #include <linux/kernel.h> 25 - #include <linux/module.h> 26 - #include <linux/string.h> 27 - 28 - #define NUM_KEY_POWERS 8 29 - 30 - struct polyval_tfm_ctx { 31 - /* 32 - * These powers must be in the order h^8, ..., h^1. 33 - */ 34 - u8 key_powers[NUM_KEY_POWERS][POLYVAL_BLOCK_SIZE]; 35 - }; 36 - 37 - struct polyval_desc_ctx { 38 - u8 buffer[POLYVAL_BLOCK_SIZE]; 39 - }; 40 - 41 - asmlinkage void pmull_polyval_update(const struct polyval_tfm_ctx *keys, 42 - const u8 *in, size_t nblocks, u8 *accumulator); 43 - asmlinkage void pmull_polyval_mul(u8 *op1, const u8 *op2); 44 - 45 - static void internal_polyval_update(const struct polyval_tfm_ctx *keys, 46 - const u8 *in, size_t nblocks, u8 *accumulator) 47 - { 48 - kernel_neon_begin(); 49 - pmull_polyval_update(keys, in, nblocks, accumulator); 50 - kernel_neon_end(); 51 - } 52 - 53 - static void internal_polyval_mul(u8 *op1, const u8 *op2) 54 - { 55 - kernel_neon_begin(); 56 - pmull_polyval_mul(op1, op2); 57 - kernel_neon_end(); 58 - } 59 - 60 - static int polyval_arm64_setkey(struct crypto_shash *tfm, 61 - const u8 *key, unsigned int keylen) 62 - { 63 - struct polyval_tfm_ctx *tctx = crypto_shash_ctx(tfm); 64 - int i; 65 - 66 - if (keylen != POLYVAL_BLOCK_SIZE) 67 - return -EINVAL; 68 - 69 - memcpy(tctx->key_powers[NUM_KEY_POWERS-1], key, POLYVAL_BLOCK_SIZE); 70 - 71 - for (i = NUM_KEY_POWERS-2; i >= 0; i--) { 72 - memcpy(tctx->key_powers[i], key, POLYVAL_BLOCK_SIZE); 73 - internal_polyval_mul(tctx->key_powers[i], 74 - tctx->key_powers[i+1]); 75 - } 76 - 77 - return 0; 78 - } 79 - 80 - static int polyval_arm64_init(struct shash_desc *desc) 81 - { 82 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 83 - 84 - memset(dctx, 0, sizeof(*dctx)); 85 - 86 - return 0; 87 - } 88 - 89 - static int polyval_arm64_update(struct shash_desc *desc, 90 - const u8 *src, unsigned int srclen) 91 - { 92 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 93 - const struct polyval_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); 94 - unsigned int nblocks; 95 - 96 - do { 97 - /* allow rescheduling every 4K bytes */ 98 - nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE; 99 - internal_polyval_update(tctx, src, nblocks, dctx->buffer); 100 - srclen -= nblocks * POLYVAL_BLOCK_SIZE; 101 - src += nblocks * POLYVAL_BLOCK_SIZE; 102 - } while (srclen >= POLYVAL_BLOCK_SIZE); 103 - 104 - return srclen; 105 - } 106 - 107 - static int polyval_arm64_finup(struct shash_desc *desc, const u8 *src, 108 - unsigned int len, u8 *dst) 109 - { 110 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 111 - const struct polyval_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); 112 - 113 - if (len) { 114 - crypto_xor(dctx->buffer, src, len); 115 - internal_polyval_mul(dctx->buffer, 116 - tctx->key_powers[NUM_KEY_POWERS-1]); 
117 - } 118 - 119 - memcpy(dst, dctx->buffer, POLYVAL_BLOCK_SIZE); 120 - 121 - return 0; 122 - } 123 - 124 - static struct shash_alg polyval_alg = { 125 - .digestsize = POLYVAL_DIGEST_SIZE, 126 - .init = polyval_arm64_init, 127 - .update = polyval_arm64_update, 128 - .finup = polyval_arm64_finup, 129 - .setkey = polyval_arm64_setkey, 130 - .descsize = sizeof(struct polyval_desc_ctx), 131 - .base = { 132 - .cra_name = "polyval", 133 - .cra_driver_name = "polyval-ce", 134 - .cra_priority = 200, 135 - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 136 - .cra_blocksize = POLYVAL_BLOCK_SIZE, 137 - .cra_ctxsize = sizeof(struct polyval_tfm_ctx), 138 - .cra_module = THIS_MODULE, 139 - }, 140 - }; 141 - 142 - static int __init polyval_ce_mod_init(void) 143 - { 144 - return crypto_register_shash(&polyval_alg); 145 - } 146 - 147 - static void __exit polyval_ce_mod_exit(void) 148 - { 149 - crypto_unregister_shash(&polyval_alg); 150 - } 151 - 152 - module_cpu_feature_match(PMULL, polyval_ce_mod_init) 153 - module_exit(polyval_ce_mod_exit); 154 - 155 - MODULE_LICENSE("GPL"); 156 - MODULE_DESCRIPTION("POLYVAL hash function accelerated by ARMv8 Crypto Extensions"); 157 - MODULE_ALIAS_CRYPTO("polyval"); 158 - MODULE_ALIAS_CRYPTO("polyval-ce");
+35 -34
arch/arm64/crypto/sha3-ce-core.S → lib/crypto/arm64/sha3-ce-core.S
···
  /* SPDX-License-Identifier: GPL-2.0 */
  /*
-  * sha3-ce-core.S - core SHA-3 transform using v8.2 Crypto Extensions
+  * Core SHA-3 transform using v8.2 Crypto Extensions
   *
   * Copyright (C) 2018 Linaro Ltd <ard.biesheuvel@linaro.org>
   *
···
  .endm

  /*
-  * int sha3_ce_transform(u64 *st, const u8 *data, int blocks, int dg_size)
+  * size_t sha3_ce_transform(struct sha3_state *state, const u8 *data,
+  *			     size_t nblocks, size_t block_size)
+  *
+  * block_size is assumed to be one of 72 (SHA3-512), 104 (SHA3-384), 136
+  * (SHA3-256 and SHAKE256), 144 (SHA3-224), or 168 (SHAKE128).
   */
  .text
  SYM_FUNC_START(sha3_ce_transform)
···
  	ld1	{v20.1d-v23.1d}, [x8], #32
  	ld1	{v24.1d}, [x8]

- 0:	sub	w2, w2, #1
+ 0:	sub	x2, x2, #1
  	mov	w8, #24
  	adr_l	x9, .Lsha3_rcon

  	/* load input */
  	ld1	{v25.8b-v28.8b}, [x1], #32
- 	ld1	{v29.8b-v31.8b}, [x1], #24
+ 	ld1	{v29.8b}, [x1], #8
  	eor	v0.8b, v0.8b, v25.8b
  	eor	v1.8b, v1.8b, v26.8b
  	eor	v2.8b, v2.8b, v27.8b
  	eor	v3.8b, v3.8b, v28.8b
  	eor	v4.8b, v4.8b, v29.8b
- 	eor	v5.8b, v5.8b, v30.8b
- 	eor	v6.8b, v6.8b, v31.8b
-
- 	tbnz	x3, #6, 2f		// SHA3-512

  	ld1	{v25.8b-v28.8b}, [x1], #32
- 	ld1	{v29.8b-v30.8b}, [x1], #16
- 	eor	v7.8b, v7.8b, v25.8b
- 	eor	v8.8b, v8.8b, v26.8b
- 	eor	v9.8b, v9.8b, v27.8b
- 	eor	v10.8b, v10.8b, v28.8b
- 	eor	v11.8b, v11.8b, v29.8b
- 	eor	v12.8b, v12.8b, v30.8b
+ 	eor	v5.8b, v5.8b, v25.8b
+ 	eor	v6.8b, v6.8b, v26.8b
+ 	eor	v7.8b, v7.8b, v27.8b
+ 	eor	v8.8b, v8.8b, v28.8b
+ 	cmp	x3, #72
+ 	b.eq	3f		/* SHA3-512 (block_size=72)? */

- 	tbnz	x3, #4, 1f		// SHA3-384 or SHA3-224
+ 	ld1	{v25.8b-v28.8b}, [x1], #32
+ 	eor	v9.8b, v9.8b, v25.8b
+ 	eor	v10.8b, v10.8b, v26.8b
+ 	eor	v11.8b, v11.8b, v27.8b
+ 	eor	v12.8b, v12.8b, v28.8b
+ 	cmp	x3, #104
+ 	b.eq	3f		/* SHA3-384 (block_size=104)? */

- 	// SHA3-256
  	ld1	{v25.8b-v28.8b}, [x1], #32
  	eor	v13.8b, v13.8b, v25.8b
  	eor	v14.8b, v14.8b, v26.8b
  	eor	v15.8b, v15.8b, v27.8b
  	eor	v16.8b, v16.8b, v28.8b
- 	b	3f
+ 	cmp	x3, #144
+ 	b.lt	3f		/* SHA3-256 or SHAKE256 (block_size=136)? */
+ 	b.eq	2f		/* SHA3-224 (block_size=144)? */

- 1:	tbz	x3, #2, 3f		// bit 2 cleared? SHA-384
-
- 	// SHA3-224
+ 	/* SHAKE128 (block_size=168) */
  	ld1	{v25.8b-v28.8b}, [x1], #32
- 	ld1	{v29.8b}, [x1], #8
- 	eor	v13.8b, v13.8b, v25.8b
- 	eor	v14.8b, v14.8b, v26.8b
- 	eor	v15.8b, v15.8b, v27.8b
- 	eor	v16.8b, v16.8b, v28.8b
- 	eor	v17.8b, v17.8b, v29.8b
+ 	eor	v17.8b, v17.8b, v25.8b
+ 	eor	v18.8b, v18.8b, v26.8b
+ 	eor	v19.8b, v19.8b, v27.8b
+ 	eor	v20.8b, v20.8b, v28.8b
  	b	3f
-
- 	// SHA3-512
- 2:	ld1	{v25.8b-v26.8b}, [x1], #16
- 	eor	v7.8b, v7.8b, v25.8b
- 	eor	v8.8b, v8.8b, v26.8b
+ 2:
+ 	/* SHA3-224 (block_size=144) */
+ 	ld1	{v25.8b}, [x1], #8
+ 	eor	v17.8b, v17.8b, v25.8b

  3:	sub	w8, w8, #1
···
  	cbnz	w8, 3b
  	cond_yield	4f, x8, x9
- 	cbnz	w2, 0b
+ 	cbnz	x2, 0b

  	/* save state */
  4:	st1	{ v0.1d- v3.1d}, [x0], #32
···
  	st1	{v16.1d-v19.1d}, [x0], #32
  	st1	{v20.1d-v23.1d}, [x0], #32
  	st1	{v24.1d}, [x0]
- 	mov	w0, w2
+ 	mov	x0, x2
  	ret
  SYM_FUNC_END(sha3_ce_transform)
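
The block sizes the new dispatch compares against are just the Keccak rates:
the Keccak-p[1600] state is 200 bytes, and each algorithm reserves twice its
security strength in bytes as capacity. A quick sanity check of the constants
(the SHA3_*_BLOCK_SIZE names appear in <crypto/sha3.h>; framing them as
BUILD_BUG_ON checks here is purely illustrative):

	BUILD_BUG_ON(SHA3_512_BLOCK_SIZE != 200 - 2 * 64);	/*  72 */
	BUILD_BUG_ON(SHA3_384_BLOCK_SIZE != 200 - 2 * 48);	/* 104 */
	BUILD_BUG_ON(SHA3_256_BLOCK_SIZE != 200 - 2 * 32);	/* 136 */
	BUILD_BUG_ON(SHA3_224_BLOCK_SIZE != 200 - 2 * 28);	/* 144 */
	/* SHAKE256 shares the 136-byte rate; SHAKE128 uses 200 - 2*16 = 168. */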
-151
arch/arm64/crypto/sha3-ce-glue.c
··· 1 - // SPDX-License-Identifier: GPL-2.0 2 - /* 3 - * sha3-ce-glue.c - core SHA-3 transform using v8.2 Crypto Extensions 4 - * 5 - * Copyright (C) 2018 Linaro Ltd <ard.biesheuvel@linaro.org> 6 - * 7 - * This program is free software; you can redistribute it and/or modify 8 - * it under the terms of the GNU General Public License version 2 as 9 - * published by the Free Software Foundation. 10 - */ 11 - 12 - #include <asm/hwcap.h> 13 - #include <asm/neon.h> 14 - #include <asm/simd.h> 15 - #include <crypto/internal/hash.h> 16 - #include <crypto/sha3.h> 17 - #include <linux/cpufeature.h> 18 - #include <linux/kernel.h> 19 - #include <linux/module.h> 20 - #include <linux/string.h> 21 - #include <linux/unaligned.h> 22 - 23 - MODULE_DESCRIPTION("SHA3 secure hash using ARMv8 Crypto Extensions"); 24 - MODULE_AUTHOR("Ard Biesheuvel <ard.biesheuvel@linaro.org>"); 25 - MODULE_LICENSE("GPL v2"); 26 - MODULE_ALIAS_CRYPTO("sha3-224"); 27 - MODULE_ALIAS_CRYPTO("sha3-256"); 28 - MODULE_ALIAS_CRYPTO("sha3-384"); 29 - MODULE_ALIAS_CRYPTO("sha3-512"); 30 - 31 - asmlinkage int sha3_ce_transform(u64 *st, const u8 *data, int blocks, 32 - int md_len); 33 - 34 - static int sha3_update(struct shash_desc *desc, const u8 *data, 35 - unsigned int len) 36 - { 37 - struct sha3_state *sctx = shash_desc_ctx(desc); 38 - struct crypto_shash *tfm = desc->tfm; 39 - unsigned int bs, ds; 40 - int blocks; 41 - 42 - ds = crypto_shash_digestsize(tfm); 43 - bs = crypto_shash_blocksize(tfm); 44 - blocks = len / bs; 45 - len -= blocks * bs; 46 - do { 47 - int rem; 48 - 49 - kernel_neon_begin(); 50 - rem = sha3_ce_transform(sctx->st, data, blocks, ds); 51 - kernel_neon_end(); 52 - data += (blocks - rem) * bs; 53 - blocks = rem; 54 - } while (blocks); 55 - return len; 56 - } 57 - 58 - static int sha3_finup(struct shash_desc *desc, const u8 *src, unsigned int len, 59 - u8 *out) 60 - { 61 - struct sha3_state *sctx = shash_desc_ctx(desc); 62 - struct crypto_shash *tfm = desc->tfm; 63 - __le64 *digest = (__le64 *)out; 64 - u8 block[SHA3_224_BLOCK_SIZE]; 65 - unsigned int bs, ds; 66 - int i; 67 - 68 - ds = crypto_shash_digestsize(tfm); 69 - bs = crypto_shash_blocksize(tfm); 70 - memcpy(block, src, len); 71 - 72 - block[len++] = 0x06; 73 - memset(block + len, 0, bs - len); 74 - block[bs - 1] |= 0x80; 75 - 76 - kernel_neon_begin(); 77 - sha3_ce_transform(sctx->st, block, 1, ds); 78 - kernel_neon_end(); 79 - memzero_explicit(block , sizeof(block)); 80 - 81 - for (i = 0; i < ds / 8; i++) 82 - put_unaligned_le64(sctx->st[i], digest++); 83 - 84 - if (ds & 4) 85 - put_unaligned_le32(sctx->st[i], (__le32 *)digest); 86 - 87 - return 0; 88 - } 89 - 90 - static struct shash_alg algs[] = { { 91 - .digestsize = SHA3_224_DIGEST_SIZE, 92 - .init = crypto_sha3_init, 93 - .update = sha3_update, 94 - .finup = sha3_finup, 95 - .descsize = SHA3_STATE_SIZE, 96 - .base.cra_name = "sha3-224", 97 - .base.cra_driver_name = "sha3-224-ce", 98 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 99 - .base.cra_blocksize = SHA3_224_BLOCK_SIZE, 100 - .base.cra_module = THIS_MODULE, 101 - .base.cra_priority = 200, 102 - }, { 103 - .digestsize = SHA3_256_DIGEST_SIZE, 104 - .init = crypto_sha3_init, 105 - .update = sha3_update, 106 - .finup = sha3_finup, 107 - .descsize = SHA3_STATE_SIZE, 108 - .base.cra_name = "sha3-256", 109 - .base.cra_driver_name = "sha3-256-ce", 110 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 111 - .base.cra_blocksize = SHA3_256_BLOCK_SIZE, 112 - .base.cra_module = THIS_MODULE, 113 - .base.cra_priority = 200, 114 - }, { 115 - .digestsize = 
SHA3_384_DIGEST_SIZE, 116 - .init = crypto_sha3_init, 117 - .update = sha3_update, 118 - .finup = sha3_finup, 119 - .descsize = SHA3_STATE_SIZE, 120 - .base.cra_name = "sha3-384", 121 - .base.cra_driver_name = "sha3-384-ce", 122 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 123 - .base.cra_blocksize = SHA3_384_BLOCK_SIZE, 124 - .base.cra_module = THIS_MODULE, 125 - .base.cra_priority = 200, 126 - }, { 127 - .digestsize = SHA3_512_DIGEST_SIZE, 128 - .init = crypto_sha3_init, 129 - .update = sha3_update, 130 - .finup = sha3_finup, 131 - .descsize = SHA3_STATE_SIZE, 132 - .base.cra_name = "sha3-512", 133 - .base.cra_driver_name = "sha3-512-ce", 134 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 135 - .base.cra_blocksize = SHA3_512_BLOCK_SIZE, 136 - .base.cra_module = THIS_MODULE, 137 - .base.cra_priority = 200, 138 - } }; 139 - 140 - static int __init sha3_neon_mod_init(void) 141 - { 142 - return crypto_register_shashes(algs, ARRAY_SIZE(algs)); 143 - } 144 - 145 - static void __exit sha3_neon_mod_fini(void) 146 - { 147 - crypto_unregister_shashes(algs, ARRAY_SIZE(algs)); 148 - } 149 - 150 - module_cpu_feature_match(SHA3, sha3_neon_mod_init); 151 - module_exit(sha3_neon_mod_fini);
+1 -2
arch/s390/configs/debug_defconfig
···
  CONFIG_CRYPTO_MD5=y
  CONFIG_CRYPTO_MICHAEL_MIC=m
  CONFIG_CRYPTO_RMD160=m
+ CONFIG_CRYPTO_SHA3=m
  CONFIG_CRYPTO_SM3_GENERIC=m
  CONFIG_CRYPTO_WP512=m
  CONFIG_CRYPTO_XCBC=m
···
  CONFIG_CRYPTO_USER_API_SKCIPHER=m
  CONFIG_CRYPTO_USER_API_RNG=m
  CONFIG_CRYPTO_USER_API_AEAD=m
- CONFIG_CRYPTO_SHA3_256_S390=m
- CONFIG_CRYPTO_SHA3_512_S390=m
  CONFIG_CRYPTO_GHASH_S390=m
  CONFIG_CRYPTO_AES_S390=m
  CONFIG_CRYPTO_DES_S390=m
+1 -2
arch/s390/configs/defconfig
···
  CONFIG_CRYPTO_MD5=y
  CONFIG_CRYPTO_MICHAEL_MIC=m
  CONFIG_CRYPTO_RMD160=m
+ CONFIG_CRYPTO_SHA3=m
  CONFIG_CRYPTO_SM3_GENERIC=m
  CONFIG_CRYPTO_WP512=m
  CONFIG_CRYPTO_XCBC=m
···
  CONFIG_CRYPTO_USER_API_SKCIPHER=m
  CONFIG_CRYPTO_USER_API_RNG=m
  CONFIG_CRYPTO_USER_API_AEAD=m
- CONFIG_CRYPTO_SHA3_256_S390=m
- CONFIG_CRYPTO_SHA3_512_S390=m
  CONFIG_CRYPTO_GHASH_S390=m
  CONFIG_CRYPTO_AES_S390=m
  CONFIG_CRYPTO_DES_S390=m
-20
arch/s390/crypto/Kconfig
···

  menu "Accelerated Cryptographic Algorithms for CPU (s390)"

- config CRYPTO_SHA3_256_S390
- 	tristate "Hash functions: SHA3-224 and SHA3-256"
- 	select CRYPTO_HASH
- 	help
- 	  SHA3-224 and SHA3-256 secure hash algorithms (FIPS 202)
-
- 	  Architecture: s390
-
- 	  It is available as of z14.
-
- config CRYPTO_SHA3_512_S390
- 	tristate "Hash functions: SHA3-384 and SHA3-512"
- 	select CRYPTO_HASH
- 	help
- 	  SHA3-384 and SHA3-512 secure hash algorithms (FIPS 202)
-
- 	  Architecture: s390
-
- 	  It is available as of z14.
-
  config CRYPTO_GHASH_S390
  	tristate "Hash functions: GHASH"
  	select CRYPTO_HASH
-2
arch/s390/crypto/Makefile
···
  # Cryptographic API
  #

- obj-$(CONFIG_CRYPTO_SHA3_256_S390) += sha3_256_s390.o sha_common.o
- obj-$(CONFIG_CRYPTO_SHA3_512_S390) += sha3_512_s390.o sha_common.o
  obj-$(CONFIG_CRYPTO_DES_S390) += des_s390.o
  obj-$(CONFIG_CRYPTO_AES_S390) += aes_s390.o
  obj-$(CONFIG_CRYPTO_PAES_S390) += paes_s390.o
-51
arch/s390/crypto/sha.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0+ */ 2 - /* 3 - * Cryptographic API. 4 - * 5 - * s390 generic implementation of the SHA Secure Hash Algorithms. 6 - * 7 - * Copyright IBM Corp. 2007 8 - * Author(s): Jan Glauber (jang@de.ibm.com) 9 - */ 10 - #ifndef _CRYPTO_ARCH_S390_SHA_H 11 - #define _CRYPTO_ARCH_S390_SHA_H 12 - 13 - #include <crypto/hash.h> 14 - #include <crypto/sha2.h> 15 - #include <crypto/sha3.h> 16 - #include <linux/build_bug.h> 17 - #include <linux/types.h> 18 - 19 - /* must be big enough for the largest SHA variant */ 20 - #define CPACF_MAX_PARMBLOCK_SIZE SHA3_STATE_SIZE 21 - #define SHA_MAX_BLOCK_SIZE SHA3_224_BLOCK_SIZE 22 - 23 - struct s390_sha_ctx { 24 - u64 count; /* message length in bytes */ 25 - union { 26 - u32 state[CPACF_MAX_PARMBLOCK_SIZE / sizeof(u32)]; 27 - struct { 28 - u64 state[SHA512_DIGEST_SIZE / sizeof(u64)]; 29 - u64 count_hi; 30 - } sha512; 31 - struct { 32 - __le64 state[SHA3_STATE_SIZE / sizeof(u64)]; 33 - } sha3; 34 - }; 35 - int func; /* KIMD function to use */ 36 - bool first_message_part; 37 - }; 38 - 39 - struct shash_desc; 40 - 41 - int s390_sha_update_blocks(struct shash_desc *desc, const u8 *data, 42 - unsigned int len); 43 - int s390_sha_finup(struct shash_desc *desc, const u8 *src, unsigned int len, 44 - u8 *out); 45 - 46 - static inline void __check_s390_sha_ctx_size(void) 47 - { 48 - BUILD_BUG_ON(S390_SHA_CTX_SIZE != sizeof(struct s390_sha_ctx)); 49 - } 50 - 51 - #endif
-157
arch/s390/crypto/sha3_256_s390.c
··· 1 - // SPDX-License-Identifier: GPL-2.0+ 2 - /* 3 - * Cryptographic API. 4 - * 5 - * s390 implementation of the SHA256 and SHA224 Secure Hash Algorithm. 6 - * 7 - * s390 Version: 8 - * Copyright IBM Corp. 2019 9 - * Author(s): Joerg Schmidbauer (jschmidb@de.ibm.com) 10 - */ 11 - #include <asm/cpacf.h> 12 - #include <crypto/internal/hash.h> 13 - #include <crypto/sha3.h> 14 - #include <linux/cpufeature.h> 15 - #include <linux/errno.h> 16 - #include <linux/kernel.h> 17 - #include <linux/module.h> 18 - #include <linux/string.h> 19 - 20 - #include "sha.h" 21 - 22 - static int sha3_256_init(struct shash_desc *desc) 23 - { 24 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 25 - 26 - sctx->first_message_part = test_facility(86); 27 - if (!sctx->first_message_part) 28 - memset(sctx->state, 0, sizeof(sctx->state)); 29 - sctx->count = 0; 30 - sctx->func = CPACF_KIMD_SHA3_256; 31 - 32 - return 0; 33 - } 34 - 35 - static int sha3_256_export(struct shash_desc *desc, void *out) 36 - { 37 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 38 - union { 39 - u8 *u8; 40 - u64 *u64; 41 - } p = { .u8 = out }; 42 - int i; 43 - 44 - if (sctx->first_message_part) { 45 - memset(out, 0, SHA3_STATE_SIZE); 46 - return 0; 47 - } 48 - for (i = 0; i < SHA3_STATE_SIZE / 8; i++) 49 - put_unaligned(le64_to_cpu(sctx->sha3.state[i]), p.u64++); 50 - return 0; 51 - } 52 - 53 - static int sha3_256_import(struct shash_desc *desc, const void *in) 54 - { 55 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 56 - union { 57 - const u8 *u8; 58 - const u64 *u64; 59 - } p = { .u8 = in }; 60 - int i; 61 - 62 - for (i = 0; i < SHA3_STATE_SIZE / 8; i++) 63 - sctx->sha3.state[i] = cpu_to_le64(get_unaligned(p.u64++)); 64 - sctx->count = 0; 65 - sctx->first_message_part = 0; 66 - sctx->func = CPACF_KIMD_SHA3_256; 67 - 68 - return 0; 69 - } 70 - 71 - static int sha3_224_import(struct shash_desc *desc, const void *in) 72 - { 73 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 74 - 75 - sha3_256_import(desc, in); 76 - sctx->func = CPACF_KIMD_SHA3_224; 77 - return 0; 78 - } 79 - 80 - static struct shash_alg sha3_256_alg = { 81 - .digestsize = SHA3_256_DIGEST_SIZE, /* = 32 */ 82 - .init = sha3_256_init, 83 - .update = s390_sha_update_blocks, 84 - .finup = s390_sha_finup, 85 - .export = sha3_256_export, 86 - .import = sha3_256_import, 87 - .descsize = S390_SHA_CTX_SIZE, 88 - .statesize = SHA3_STATE_SIZE, 89 - .base = { 90 - .cra_name = "sha3-256", 91 - .cra_driver_name = "sha3-256-s390", 92 - .cra_priority = 300, 93 - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 94 - .cra_blocksize = SHA3_256_BLOCK_SIZE, 95 - .cra_module = THIS_MODULE, 96 - } 97 - }; 98 - 99 - static int sha3_224_init(struct shash_desc *desc) 100 - { 101 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 102 - 103 - sha3_256_init(desc); 104 - sctx->func = CPACF_KIMD_SHA3_224; 105 - return 0; 106 - } 107 - 108 - static struct shash_alg sha3_224_alg = { 109 - .digestsize = SHA3_224_DIGEST_SIZE, 110 - .init = sha3_224_init, 111 - .update = s390_sha_update_blocks, 112 - .finup = s390_sha_finup, 113 - .export = sha3_256_export, /* same as for 256 */ 114 - .import = sha3_224_import, /* function code different! 
*/ 115 - .descsize = S390_SHA_CTX_SIZE, 116 - .statesize = SHA3_STATE_SIZE, 117 - .base = { 118 - .cra_name = "sha3-224", 119 - .cra_driver_name = "sha3-224-s390", 120 - .cra_priority = 300, 121 - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 122 - .cra_blocksize = SHA3_224_BLOCK_SIZE, 123 - .cra_module = THIS_MODULE, 124 - } 125 - }; 126 - 127 - static int __init sha3_256_s390_init(void) 128 - { 129 - int ret; 130 - 131 - if (!cpacf_query_func(CPACF_KIMD, CPACF_KIMD_SHA3_256)) 132 - return -ENODEV; 133 - 134 - ret = crypto_register_shash(&sha3_256_alg); 135 - if (ret < 0) 136 - goto out; 137 - 138 - ret = crypto_register_shash(&sha3_224_alg); 139 - if (ret < 0) 140 - crypto_unregister_shash(&sha3_256_alg); 141 - out: 142 - return ret; 143 - } 144 - 145 - static void __exit sha3_256_s390_fini(void) 146 - { 147 - crypto_unregister_shash(&sha3_224_alg); 148 - crypto_unregister_shash(&sha3_256_alg); 149 - } 150 - 151 - module_cpu_feature_match(S390_CPU_FEATURE_MSA, sha3_256_s390_init); 152 - module_exit(sha3_256_s390_fini); 153 - 154 - MODULE_ALIAS_CRYPTO("sha3-256"); 155 - MODULE_ALIAS_CRYPTO("sha3-224"); 156 - MODULE_LICENSE("GPL"); 157 - MODULE_DESCRIPTION("SHA3-256 and SHA3-224 Secure Hash Algorithm");
-157
arch/s390/crypto/sha3_512_s390.c
··· 1 - // SPDX-License-Identifier: GPL-2.0+ 2 - /* 3 - * Cryptographic API. 4 - * 5 - * s390 implementation of the SHA512 and SHA384 Secure Hash Algorithm. 6 - * 7 - * Copyright IBM Corp. 2019 8 - * Author(s): Joerg Schmidbauer (jschmidb@de.ibm.com) 9 - */ 10 - #include <asm/cpacf.h> 11 - #include <crypto/internal/hash.h> 12 - #include <crypto/sha3.h> 13 - #include <linux/cpufeature.h> 14 - #include <linux/errno.h> 15 - #include <linux/kernel.h> 16 - #include <linux/module.h> 17 - #include <linux/string.h> 18 - 19 - #include "sha.h" 20 - 21 - static int sha3_512_init(struct shash_desc *desc) 22 - { 23 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 24 - 25 - sctx->first_message_part = test_facility(86); 26 - if (!sctx->first_message_part) 27 - memset(sctx->state, 0, sizeof(sctx->state)); 28 - sctx->count = 0; 29 - sctx->func = CPACF_KIMD_SHA3_512; 30 - 31 - return 0; 32 - } 33 - 34 - static int sha3_512_export(struct shash_desc *desc, void *out) 35 - { 36 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 37 - union { 38 - u8 *u8; 39 - u64 *u64; 40 - } p = { .u8 = out }; 41 - int i; 42 - 43 - if (sctx->first_message_part) { 44 - memset(out, 0, SHA3_STATE_SIZE); 45 - return 0; 46 - } 47 - for (i = 0; i < SHA3_STATE_SIZE / 8; i++) 48 - put_unaligned(le64_to_cpu(sctx->sha3.state[i]), p.u64++); 49 - return 0; 50 - } 51 - 52 - static int sha3_512_import(struct shash_desc *desc, const void *in) 53 - { 54 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 55 - union { 56 - const u8 *u8; 57 - const u64 *u64; 58 - } p = { .u8 = in }; 59 - int i; 60 - 61 - for (i = 0; i < SHA3_STATE_SIZE / 8; i++) 62 - sctx->sha3.state[i] = cpu_to_le64(get_unaligned(p.u64++)); 63 - sctx->count = 0; 64 - sctx->first_message_part = 0; 65 - sctx->func = CPACF_KIMD_SHA3_512; 66 - 67 - return 0; 68 - } 69 - 70 - static int sha3_384_import(struct shash_desc *desc, const void *in) 71 - { 72 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 73 - 74 - sha3_512_import(desc, in); 75 - sctx->func = CPACF_KIMD_SHA3_384; 76 - return 0; 77 - } 78 - 79 - static struct shash_alg sha3_512_alg = { 80 - .digestsize = SHA3_512_DIGEST_SIZE, 81 - .init = sha3_512_init, 82 - .update = s390_sha_update_blocks, 83 - .finup = s390_sha_finup, 84 - .export = sha3_512_export, 85 - .import = sha3_512_import, 86 - .descsize = S390_SHA_CTX_SIZE, 87 - .statesize = SHA3_STATE_SIZE, 88 - .base = { 89 - .cra_name = "sha3-512", 90 - .cra_driver_name = "sha3-512-s390", 91 - .cra_priority = 300, 92 - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 93 - .cra_blocksize = SHA3_512_BLOCK_SIZE, 94 - .cra_module = THIS_MODULE, 95 - } 96 - }; 97 - 98 - MODULE_ALIAS_CRYPTO("sha3-512"); 99 - 100 - static int sha3_384_init(struct shash_desc *desc) 101 - { 102 - struct s390_sha_ctx *sctx = shash_desc_ctx(desc); 103 - 104 - sha3_512_init(desc); 105 - sctx->func = CPACF_KIMD_SHA3_384; 106 - return 0; 107 - } 108 - 109 - static struct shash_alg sha3_384_alg = { 110 - .digestsize = SHA3_384_DIGEST_SIZE, 111 - .init = sha3_384_init, 112 - .update = s390_sha_update_blocks, 113 - .finup = s390_sha_finup, 114 - .export = sha3_512_export, /* same as for 512 */ 115 - .import = sha3_384_import, /* function code different! 
*/ 116 - .descsize = S390_SHA_CTX_SIZE, 117 - .statesize = SHA3_STATE_SIZE, 118 - .base = { 119 - .cra_name = "sha3-384", 120 - .cra_driver_name = "sha3-384-s390", 121 - .cra_priority = 300, 122 - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 123 - .cra_blocksize = SHA3_384_BLOCK_SIZE, 124 - .cra_ctxsize = sizeof(struct s390_sha_ctx), 125 - .cra_module = THIS_MODULE, 126 - } 127 - }; 128 - 129 - MODULE_ALIAS_CRYPTO("sha3-384"); 130 - 131 - static int __init init(void) 132 - { 133 - int ret; 134 - 135 - if (!cpacf_query_func(CPACF_KIMD, CPACF_KIMD_SHA3_512)) 136 - return -ENODEV; 137 - ret = crypto_register_shash(&sha3_512_alg); 138 - if (ret < 0) 139 - goto out; 140 - ret = crypto_register_shash(&sha3_384_alg); 141 - if (ret < 0) 142 - crypto_unregister_shash(&sha3_512_alg); 143 - out: 144 - return ret; 145 - } 146 - 147 - static void __exit fini(void) 148 - { 149 - crypto_unregister_shash(&sha3_512_alg); 150 - crypto_unregister_shash(&sha3_384_alg); 151 - } 152 - 153 - module_cpu_feature_match(S390_CPU_FEATURE_MSA, init); 154 - module_exit(fini); 155 - 156 - MODULE_LICENSE("GPL"); 157 - MODULE_DESCRIPTION("SHA3-512 and SHA3-384 Secure Hash Algorithm");
-117
arch/s390/crypto/sha_common.c
··· 1 - // SPDX-License-Identifier: GPL-2.0+ 2 - /* 3 - * Cryptographic API. 4 - * 5 - * s390 generic implementation of the SHA Secure Hash Algorithms. 6 - * 7 - * Copyright IBM Corp. 2007 8 - * Author(s): Jan Glauber (jang@de.ibm.com) 9 - */ 10 - 11 - #include <crypto/internal/hash.h> 12 - #include <linux/export.h> 13 - #include <linux/module.h> 14 - #include <asm/cpacf.h> 15 - #include "sha.h" 16 - 17 - int s390_sha_update_blocks(struct shash_desc *desc, const u8 *data, 18 - unsigned int len) 19 - { 20 - unsigned int bsize = crypto_shash_blocksize(desc->tfm); 21 - struct s390_sha_ctx *ctx = shash_desc_ctx(desc); 22 - unsigned int n; 23 - int fc; 24 - 25 - fc = ctx->func; 26 - if (ctx->first_message_part) 27 - fc |= CPACF_KIMD_NIP; 28 - 29 - /* process as many blocks as possible */ 30 - n = (len / bsize) * bsize; 31 - ctx->count += n; 32 - switch (ctx->func) { 33 - case CPACF_KLMD_SHA_512: 34 - case CPACF_KLMD_SHA3_384: 35 - if (ctx->count < n) 36 - ctx->sha512.count_hi++; 37 - break; 38 - } 39 - cpacf_kimd(fc, ctx->state, data, n); 40 - ctx->first_message_part = 0; 41 - return len - n; 42 - } 43 - EXPORT_SYMBOL_GPL(s390_sha_update_blocks); 44 - 45 - static int s390_crypto_shash_parmsize(int func) 46 - { 47 - switch (func) { 48 - case CPACF_KLMD_SHA_1: 49 - return 20; 50 - case CPACF_KLMD_SHA_256: 51 - return 32; 52 - case CPACF_KLMD_SHA_512: 53 - return 64; 54 - case CPACF_KLMD_SHA3_224: 55 - case CPACF_KLMD_SHA3_256: 56 - case CPACF_KLMD_SHA3_384: 57 - case CPACF_KLMD_SHA3_512: 58 - return 200; 59 - default: 60 - return -EINVAL; 61 - } 62 - } 63 - 64 - int s390_sha_finup(struct shash_desc *desc, const u8 *src, unsigned int len, 65 - u8 *out) 66 - { 67 - struct s390_sha_ctx *ctx = shash_desc_ctx(desc); 68 - int mbl_offset, fc; 69 - u64 bits; 70 - 71 - ctx->count += len; 72 - 73 - bits = ctx->count * 8; 74 - mbl_offset = s390_crypto_shash_parmsize(ctx->func); 75 - if (mbl_offset < 0) 76 - return -EINVAL; 77 - 78 - mbl_offset = mbl_offset / sizeof(u32); 79 - 80 - /* set total msg bit length (mbl) in CPACF parmblock */ 81 - switch (ctx->func) { 82 - case CPACF_KLMD_SHA_512: 83 - /* The SHA512 parmblock has a 128-bit mbl field. */ 84 - if (ctx->count < len) 85 - ctx->sha512.count_hi++; 86 - ctx->sha512.count_hi <<= 3; 87 - ctx->sha512.count_hi |= ctx->count >> 61; 88 - mbl_offset += sizeof(u64) / sizeof(u32); 89 - fallthrough; 90 - case CPACF_KLMD_SHA_1: 91 - case CPACF_KLMD_SHA_256: 92 - memcpy(ctx->state + mbl_offset, &bits, sizeof(bits)); 93 - break; 94 - case CPACF_KLMD_SHA3_224: 95 - case CPACF_KLMD_SHA3_256: 96 - case CPACF_KLMD_SHA3_384: 97 - case CPACF_KLMD_SHA3_512: 98 - break; 99 - default: 100 - return -EINVAL; 101 - } 102 - 103 - fc = ctx->func; 104 - fc |= test_facility(86) ? CPACF_KLMD_DUFOP : 0; 105 - if (ctx->first_message_part) 106 - fc |= CPACF_KLMD_NIP; 107 - cpacf_klmd(fc, ctx->state, src, len); 108 - 109 - /* copy digest to out */ 110 - memcpy(out, ctx->state, crypto_shash_digestsize(desc->tfm)); 111 - 112 - return 0; 113 - } 114 - EXPORT_SYMBOL_GPL(s390_sha_finup); 115 - 116 - MODULE_LICENSE("GPL"); 117 - MODULE_DESCRIPTION("s390 SHA cipher common functions");
-10
arch/x86/crypto/Kconfig
···
	  Architecture: x86_64 using:
	  - AVX2 (Advanced Vector Extensions 2)

- config CRYPTO_POLYVAL_CLMUL_NI
- 	tristate "Hash functions: POLYVAL (CLMUL-NI)"
- 	depends on 64BIT
- 	select CRYPTO_POLYVAL
- 	help
- 	  POLYVAL hash function for HCTR2
-
- 	  Architecture: x86_64 using:
- 	  - CLMUL-NI (carry-less multiplication new instructions)
-
  config CRYPTO_SM3_AVX_X86_64
  	tristate "Hash functions: SM3 (AVX)"
  	depends on 64BIT
-3
arch/x86/crypto/Makefile
···
  obj-$(CONFIG_CRYPTO_GHASH_CLMUL_NI_INTEL) += ghash-clmulni-intel.o
  ghash-clmulni-intel-y := ghash-clmulni-intel_asm.o ghash-clmulni-intel_glue.o

- obj-$(CONFIG_CRYPTO_POLYVAL_CLMUL_NI) += polyval-clmulni.o
- polyval-clmulni-y := polyval-clmulni_asm.o polyval-clmulni_glue.o
-
  obj-$(CONFIG_CRYPTO_NHPOLY1305_SSE2) += nhpoly1305-sse2.o
  nhpoly1305-sse2-y := nh-sse2-x86_64.o nhpoly1305-sse2-glue.o
  obj-$(CONFIG_CRYPTO_NHPOLY1305_AVX2) += nhpoly1305-avx2.o
+19 -21
arch/x86/crypto/polyval-clmulni_asm.S → lib/crypto/x86/polyval-pclmul-avx.S
···
  #define MI	%xmm14
  #define SUM	%xmm15

- #define KEY_POWERS	%rdi
- #define MSG		%rsi
- #define BLOCKS_LEFT	%rdx
- #define ACCUMULATOR	%rcx
+ #define ACCUMULATOR	%rdi
+ #define KEY_POWERS	%rsi
+ #define MSG		%rdx
+ #define BLOCKS_LEFT	%rcx
  #define TMP		%rax

  .section .rodata.cst16.gstar, "aM", @progbits, 16
···
  	movups	(MSG), %xmm0
  	pxor	SUM, %xmm0
- 	movaps	(KEY_POWERS), %xmm1
+ 	movups	(KEY_POWERS), %xmm1
  	schoolbook1_noload
  	dec	BLOCKS_LEFT
  	addq	$16, MSG
···
  .endm

  /*
-  * Perform montgomery multiplication in GF(2^128) and store result in op1.
+  * Computes a = a * b * x^{-128} mod x^128 + x^127 + x^126 + x^121 + 1.
   *
-  * Computes op1*op2*x^{-128} mod x^128 + x^127 + x^126 + x^121 + 1
-  * If op1, op2 are in montgomery form, this computes the montgomery
-  * form of op1*op2.
-  *
-  * void clmul_polyval_mul(u8 *op1, const u8 *op2);
+  * void polyval_mul_pclmul_avx(struct polyval_elem *a,
+  *			        const struct polyval_elem *b);
   */
- SYM_FUNC_START(clmul_polyval_mul)
+ SYM_FUNC_START(polyval_mul_pclmul_avx)
  	FRAME_BEGIN
  	vmovdqa	.Lgstar(%rip), GSTAR
  	movups	(%rdi), %xmm0
···
  	movups	SUM, (%rdi)
  	FRAME_END
  	RET
- SYM_FUNC_END(clmul_polyval_mul)
+ SYM_FUNC_END(polyval_mul_pclmul_avx)

  /*
   * Perform polynomial evaluation as specified by POLYVAL. This computes:
   *	h^n * accumulator + h^n * m_0 + ... + h^1 * m_{n-1}
   * where n=nblocks, h is the hash key, and m_i are the message blocks.
   *
-  * rdi - pointer to precomputed key powers h^8 ... h^1
-  * rsi - pointer to message blocks
-  * rdx - number of blocks to hash
-  * rcx - pointer to the accumulator
+  * rdi - pointer to the accumulator
+  * rsi - pointer to precomputed key powers h^8 ... h^1
+  * rdx - pointer to message blocks
+  * rcx - number of blocks to hash
   *
-  * void clmul_polyval_update(const struct polyval_tfm_ctx *keys,
-  *			      const u8 *in, size_t nblocks, u8 *accumulator);
+  * void polyval_blocks_pclmul_avx(struct polyval_elem *acc,
+  *				   const struct polyval_key *key,
+  *				   const u8 *data, size_t nblocks);
   */
- SYM_FUNC_START(clmul_polyval_update)
+ SYM_FUNC_START(polyval_blocks_pclmul_avx)
  	FRAME_BEGIN
  	vmovdqa	.Lgstar(%rip), GSTAR
  	movups	(ACCUMULATOR), SUM
···
  	movups	SUM, (ACCUMULATOR)
  	FRAME_END
  	RET
- SYM_FUNC_END(clmul_polyval_update)
+ SYM_FUNC_END(polyval_blocks_pclmul_avx)
-180
arch/x86/crypto/polyval-clmulni_glue.c
··· 1 - // SPDX-License-Identifier: GPL-2.0-only 2 - /* 3 - * Glue code for POLYVAL using PCMULQDQ-NI 4 - * 5 - * Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi> 6 - * Copyright (c) 2009 Intel Corp. 7 - * Author: Huang Ying <ying.huang@intel.com> 8 - * Copyright 2021 Google LLC 9 - */ 10 - 11 - /* 12 - * Glue code based on ghash-clmulni-intel_glue.c. 13 - * 14 - * This implementation of POLYVAL uses montgomery multiplication 15 - * accelerated by PCLMULQDQ-NI to implement the finite field 16 - * operations. 17 - */ 18 - 19 - #include <asm/cpu_device_id.h> 20 - #include <asm/fpu/api.h> 21 - #include <crypto/internal/hash.h> 22 - #include <crypto/polyval.h> 23 - #include <crypto/utils.h> 24 - #include <linux/errno.h> 25 - #include <linux/kernel.h> 26 - #include <linux/module.h> 27 - #include <linux/string.h> 28 - 29 - #define POLYVAL_ALIGN 16 30 - #define POLYVAL_ALIGN_ATTR __aligned(POLYVAL_ALIGN) 31 - #define POLYVAL_ALIGN_EXTRA ((POLYVAL_ALIGN - 1) & ~(CRYPTO_MINALIGN - 1)) 32 - #define POLYVAL_CTX_SIZE (sizeof(struct polyval_tfm_ctx) + POLYVAL_ALIGN_EXTRA) 33 - #define NUM_KEY_POWERS 8 34 - 35 - struct polyval_tfm_ctx { 36 - /* 37 - * These powers must be in the order h^8, ..., h^1. 38 - */ 39 - u8 key_powers[NUM_KEY_POWERS][POLYVAL_BLOCK_SIZE] POLYVAL_ALIGN_ATTR; 40 - }; 41 - 42 - struct polyval_desc_ctx { 43 - u8 buffer[POLYVAL_BLOCK_SIZE]; 44 - }; 45 - 46 - asmlinkage void clmul_polyval_update(const struct polyval_tfm_ctx *keys, 47 - const u8 *in, size_t nblocks, u8 *accumulator); 48 - asmlinkage void clmul_polyval_mul(u8 *op1, const u8 *op2); 49 - 50 - static inline struct polyval_tfm_ctx *polyval_tfm_ctx(struct crypto_shash *tfm) 51 - { 52 - return PTR_ALIGN(crypto_shash_ctx(tfm), POLYVAL_ALIGN); 53 - } 54 - 55 - static void internal_polyval_update(const struct polyval_tfm_ctx *keys, 56 - const u8 *in, size_t nblocks, u8 *accumulator) 57 - { 58 - kernel_fpu_begin(); 59 - clmul_polyval_update(keys, in, nblocks, accumulator); 60 - kernel_fpu_end(); 61 - } 62 - 63 - static void internal_polyval_mul(u8 *op1, const u8 *op2) 64 - { 65 - kernel_fpu_begin(); 66 - clmul_polyval_mul(op1, op2); 67 - kernel_fpu_end(); 68 - } 69 - 70 - static int polyval_x86_setkey(struct crypto_shash *tfm, 71 - const u8 *key, unsigned int keylen) 72 - { 73 - struct polyval_tfm_ctx *tctx = polyval_tfm_ctx(tfm); 74 - int i; 75 - 76 - if (keylen != POLYVAL_BLOCK_SIZE) 77 - return -EINVAL; 78 - 79 - memcpy(tctx->key_powers[NUM_KEY_POWERS-1], key, POLYVAL_BLOCK_SIZE); 80 - 81 - for (i = NUM_KEY_POWERS-2; i >= 0; i--) { 82 - memcpy(tctx->key_powers[i], key, POLYVAL_BLOCK_SIZE); 83 - internal_polyval_mul(tctx->key_powers[i], 84 - tctx->key_powers[i+1]); 85 - } 86 - 87 - return 0; 88 - } 89 - 90 - static int polyval_x86_init(struct shash_desc *desc) 91 - { 92 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 93 - 94 - memset(dctx, 0, sizeof(*dctx)); 95 - 96 - return 0; 97 - } 98 - 99 - static int polyval_x86_update(struct shash_desc *desc, 100 - const u8 *src, unsigned int srclen) 101 - { 102 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 103 - const struct polyval_tfm_ctx *tctx = polyval_tfm_ctx(desc->tfm); 104 - unsigned int nblocks; 105 - 106 - do { 107 - /* Allow rescheduling every 4K bytes. 
*/ 108 - nblocks = min(srclen, 4096U) / POLYVAL_BLOCK_SIZE; 109 - internal_polyval_update(tctx, src, nblocks, dctx->buffer); 110 - srclen -= nblocks * POLYVAL_BLOCK_SIZE; 111 - src += nblocks * POLYVAL_BLOCK_SIZE; 112 - } while (srclen >= POLYVAL_BLOCK_SIZE); 113 - 114 - return srclen; 115 - } 116 - 117 - static int polyval_x86_finup(struct shash_desc *desc, const u8 *src, 118 - unsigned int len, u8 *dst) 119 - { 120 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 121 - const struct polyval_tfm_ctx *tctx = polyval_tfm_ctx(desc->tfm); 122 - 123 - if (len) { 124 - crypto_xor(dctx->buffer, src, len); 125 - internal_polyval_mul(dctx->buffer, 126 - tctx->key_powers[NUM_KEY_POWERS-1]); 127 - } 128 - 129 - memcpy(dst, dctx->buffer, POLYVAL_BLOCK_SIZE); 130 - 131 - return 0; 132 - } 133 - 134 - static struct shash_alg polyval_alg = { 135 - .digestsize = POLYVAL_DIGEST_SIZE, 136 - .init = polyval_x86_init, 137 - .update = polyval_x86_update, 138 - .finup = polyval_x86_finup, 139 - .setkey = polyval_x86_setkey, 140 - .descsize = sizeof(struct polyval_desc_ctx), 141 - .base = { 142 - .cra_name = "polyval", 143 - .cra_driver_name = "polyval-clmulni", 144 - .cra_priority = 200, 145 - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 146 - .cra_blocksize = POLYVAL_BLOCK_SIZE, 147 - .cra_ctxsize = POLYVAL_CTX_SIZE, 148 - .cra_module = THIS_MODULE, 149 - }, 150 - }; 151 - 152 - __maybe_unused static const struct x86_cpu_id pcmul_cpu_id[] = { 153 - X86_MATCH_FEATURE(X86_FEATURE_PCLMULQDQ, NULL), 154 - {} 155 - }; 156 - MODULE_DEVICE_TABLE(x86cpu, pcmul_cpu_id); 157 - 158 - static int __init polyval_clmulni_mod_init(void) 159 - { 160 - if (!x86_match_cpu(pcmul_cpu_id)) 161 - return -ENODEV; 162 - 163 - if (!boot_cpu_has(X86_FEATURE_AVX)) 164 - return -ENODEV; 165 - 166 - return crypto_register_shash(&polyval_alg); 167 - } 168 - 169 - static void __exit polyval_clmulni_mod_exit(void) 170 - { 171 - crypto_unregister_shash(&polyval_alg); 172 - } 173 - 174 - module_init(polyval_clmulni_mod_init); 175 - module_exit(polyval_clmulni_mod_exit); 176 - 177 - MODULE_LICENSE("GPL"); 178 - MODULE_DESCRIPTION("POLYVAL hash function accelerated by PCLMULQDQ-NI"); 179 - MODULE_ALIAS_CRYPTO("polyval"); 180 - MODULE_ALIAS_CRYPTO("polyval-clmulni");
+3 -11
crypto/Kconfig
··· 696 696 config CRYPTO_HCTR2 697 697 tristate "HCTR2" 698 698 select CRYPTO_XCTR 699 - select CRYPTO_POLYVAL 699 + select CRYPTO_LIB_POLYVAL 700 700 select CRYPTO_MANAGER 701 701 help 702 702 HCTR2 length-preserving encryption mode ··· 881 881 config CRYPTO_BLAKE2B 882 882 tristate "BLAKE2b" 883 883 select CRYPTO_HASH 884 + select CRYPTO_LIB_BLAKE2B 884 885 help 885 886 BLAKE2b cryptographic hash function (RFC 7693) 886 887 ··· 948 947 This algorithm is required for TKIP, but it should not be used for 949 948 other purposes because of the weakness of the algorithm. 950 949 951 - config CRYPTO_POLYVAL 952 - tristate 953 - select CRYPTO_HASH 954 - select CRYPTO_LIB_GF128MUL 955 - help 956 - POLYVAL hash function for HCTR2 957 - 958 - This is used in HCTR2. It is not a general-purpose 959 - cryptographic hash function. 960 - 961 950 config CRYPTO_RMD160 962 951 tristate "RIPEMD-160" 963 952 select CRYPTO_HASH ··· 996 1005 config CRYPTO_SHA3 997 1006 tristate "SHA-3" 998 1007 select CRYPTO_HASH 1008 + select CRYPTO_LIB_SHA3 999 1009 help 1000 1010 SHA-3 secure hash algorithms (FIPS 202, ISO/IEC 10118-3) 1001 1011
+2 -4
crypto/Makefile
··· 78 78 obj-$(CONFIG_CRYPTO_SHA1) += sha1.o 79 79 obj-$(CONFIG_CRYPTO_SHA256) += sha256.o 80 80 obj-$(CONFIG_CRYPTO_SHA512) += sha512.o 81 - obj-$(CONFIG_CRYPTO_SHA3) += sha3_generic.o 81 + obj-$(CONFIG_CRYPTO_SHA3) += sha3.o 82 82 obj-$(CONFIG_CRYPTO_SM3_GENERIC) += sm3_generic.o 83 83 obj-$(CONFIG_CRYPTO_STREEBOG) += streebog_generic.o 84 84 obj-$(CONFIG_CRYPTO_WP512) += wp512.o 85 85 CFLAGS_wp512.o := $(call cc-option,-fno-schedule-insns) # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79149 86 - obj-$(CONFIG_CRYPTO_BLAKE2B) += blake2b_generic.o 87 - CFLAGS_blake2b_generic.o := -Wframe-larger-than=4096 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930 86 + obj-$(CONFIG_CRYPTO_BLAKE2B) += blake2b.o 88 87 obj-$(CONFIG_CRYPTO_ECB) += ecb.o 89 88 obj-$(CONFIG_CRYPTO_CBC) += cbc.o 90 89 obj-$(CONFIG_CRYPTO_PCBC) += pcbc.o ··· 172 173 obj-$(CONFIG_CRYPTO_JITTERENTROPY_TESTINTERFACE) += jitterentropy-testing.o 173 174 obj-$(CONFIG_CRYPTO_BENCHMARK) += tcrypt.o 174 175 obj-$(CONFIG_CRYPTO_GHASH) += ghash-generic.o 175 - obj-$(CONFIG_CRYPTO_POLYVAL) += polyval-generic.o 176 176 obj-$(CONFIG_CRYPTO_USER_API) += af_alg.o 177 177 obj-$(CONFIG_CRYPTO_USER_API_HASH) += algif_hash.o 178 178 obj-$(CONFIG_CRYPTO_USER_API_SKCIPHER) += algif_skcipher.o
+111
crypto/blake2b.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-or-later 2 + /* 3 + * Crypto API support for BLAKE2b 4 + * 5 + * Copyright 2025 Google LLC 6 + */ 7 + #include <crypto/blake2b.h> 8 + #include <crypto/internal/hash.h> 9 + #include <linux/kernel.h> 10 + #include <linux/module.h> 11 + 12 + struct blake2b_tfm_ctx { 13 + unsigned int keylen; 14 + u8 key[BLAKE2B_KEY_SIZE]; 15 + }; 16 + 17 + static int crypto_blake2b_setkey(struct crypto_shash *tfm, 18 + const u8 *key, unsigned int keylen) 19 + { 20 + struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(tfm); 21 + 22 + if (keylen > BLAKE2B_KEY_SIZE) 23 + return -EINVAL; 24 + memcpy(tctx->key, key, keylen); 25 + tctx->keylen = keylen; 26 + return 0; 27 + } 28 + 29 + #define BLAKE2B_CTX(desc) ((struct blake2b_ctx *)shash_desc_ctx(desc)) 30 + 31 + static int crypto_blake2b_init(struct shash_desc *desc) 32 + { 33 + const struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); 34 + unsigned int digestsize = crypto_shash_digestsize(desc->tfm); 35 + 36 + blake2b_init_key(BLAKE2B_CTX(desc), digestsize, 37 + tctx->key, tctx->keylen); 38 + return 0; 39 + } 40 + 41 + static int crypto_blake2b_update(struct shash_desc *desc, 42 + const u8 *data, unsigned int len) 43 + { 44 + blake2b_update(BLAKE2B_CTX(desc), data, len); 45 + return 0; 46 + } 47 + 48 + static int crypto_blake2b_final(struct shash_desc *desc, u8 *out) 49 + { 50 + blake2b_final(BLAKE2B_CTX(desc), out); 51 + return 0; 52 + } 53 + 54 + static int crypto_blake2b_digest(struct shash_desc *desc, 55 + const u8 *data, unsigned int len, u8 *out) 56 + { 57 + const struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); 58 + unsigned int digestsize = crypto_shash_digestsize(desc->tfm); 59 + 60 + blake2b(tctx->key, tctx->keylen, data, len, out, digestsize); 61 + return 0; 62 + } 63 + 64 + #define BLAKE2B_ALG(name, digest_size) \ 65 + { \ 66 + .base.cra_name = name, \ 67 + .base.cra_driver_name = name "-lib", \ 68 + .base.cra_priority = 300, \ 69 + .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY, \ 70 + .base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \ 71 + .base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \ 72 + .base.cra_module = THIS_MODULE, \ 73 + .digestsize = digest_size, \ 74 + .setkey = crypto_blake2b_setkey, \ 75 + .init = crypto_blake2b_init, \ 76 + .update = crypto_blake2b_update, \ 77 + .final = crypto_blake2b_final, \ 78 + .digest = crypto_blake2b_digest, \ 79 + .descsize = sizeof(struct blake2b_ctx), \ 80 + } 81 + 82 + static struct shash_alg algs[] = { 83 + BLAKE2B_ALG("blake2b-160", BLAKE2B_160_HASH_SIZE), 84 + BLAKE2B_ALG("blake2b-256", BLAKE2B_256_HASH_SIZE), 85 + BLAKE2B_ALG("blake2b-384", BLAKE2B_384_HASH_SIZE), 86 + BLAKE2B_ALG("blake2b-512", BLAKE2B_512_HASH_SIZE), 87 + }; 88 + 89 + static int __init crypto_blake2b_mod_init(void) 90 + { 91 + return crypto_register_shashes(algs, ARRAY_SIZE(algs)); 92 + } 93 + module_init(crypto_blake2b_mod_init); 94 + 95 + static void __exit crypto_blake2b_mod_exit(void) 96 + { 97 + crypto_unregister_shashes(algs, ARRAY_SIZE(algs)); 98 + } 99 + module_exit(crypto_blake2b_mod_exit); 100 + 101 + MODULE_LICENSE("GPL"); 102 + MODULE_DESCRIPTION("Crypto API support for BLAKE2b"); 103 + 104 + MODULE_ALIAS_CRYPTO("blake2b-160"); 105 + MODULE_ALIAS_CRYPTO("blake2b-160-lib"); 106 + MODULE_ALIAS_CRYPTO("blake2b-256"); 107 + MODULE_ALIAS_CRYPTO("blake2b-256-lib"); 108 + MODULE_ALIAS_CRYPTO("blake2b-384"); 109 + MODULE_ALIAS_CRYPTO("blake2b-384-lib"); 110 + MODULE_ALIAS_CRYPTO("blake2b-512"); 111 + MODULE_ALIAS_CRYPTO("blake2b-512-lib");
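Note: the shash glue above is a thin wrapper, and in-kernel users can call the BLAKE2b library directly instead. A minimal hypothetical sketch (not part of this commit), using only the one-shot blake2b() helper declared in include/crypto/blake2b.h further down in this diff; data and len are placeholders:

    #include <crypto/blake2b.h>

    /* Unkeyed one-shot BLAKE2b-256 over a buffer. */
    static void example_blake2b_256(const u8 *data, size_t len,
                                    u8 out[BLAKE2B_256_HASH_SIZE])
    {
            blake2b(/* key */ NULL, /* keylen */ 0, data, len,
                    out, BLAKE2B_256_HASH_SIZE);
    }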
-192
crypto/blake2b_generic.c
··· 1 - // SPDX-License-Identifier: (GPL-2.0-only OR Apache-2.0) 2 - /* 3 - * Generic implementation of the BLAKE2b digest algorithm. Based on the BLAKE2b 4 - * reference implementation, but it has been heavily modified for use in the 5 - * kernel. The reference implementation was: 6 - * 7 - * Copyright 2012, Samuel Neves <sneves@dei.uc.pt>. You may use this under 8 - * the terms of the CC0, the OpenSSL Licence, or the Apache Public License 9 - * 2.0, at your option. The terms of these licenses can be found at: 10 - * 11 - * - CC0 1.0 Universal : http://creativecommons.org/publicdomain/zero/1.0 12 - * - OpenSSL license : https://www.openssl.org/source/license.html 13 - * - Apache 2.0 : https://www.apache.org/licenses/LICENSE-2.0 14 - * 15 - * More information about BLAKE2 can be found at https://blake2.net. 16 - */ 17 - 18 - #include <crypto/internal/blake2b.h> 19 - #include <crypto/internal/hash.h> 20 - #include <linux/kernel.h> 21 - #include <linux/module.h> 22 - #include <linux/string.h> 23 - #include <linux/unaligned.h> 24 - 25 - static const u8 blake2b_sigma[12][16] = { 26 - { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }, 27 - { 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 }, 28 - { 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 }, 29 - { 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 }, 30 - { 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 }, 31 - { 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 }, 32 - { 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 }, 33 - { 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 }, 34 - { 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 }, 35 - { 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 }, 36 - { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }, 37 - { 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 } 38 - }; 39 - 40 - static void blake2b_increment_counter(struct blake2b_state *S, const u64 inc) 41 - { 42 - S->t[0] += inc; 43 - S->t[1] += (S->t[0] < inc); 44 - } 45 - 46 - #define G(r,i,a,b,c,d) \ 47 - do { \ 48 - a = a + b + m[blake2b_sigma[r][2*i+0]]; \ 49 - d = ror64(d ^ a, 32); \ 50 - c = c + d; \ 51 - b = ror64(b ^ c, 24); \ 52 - a = a + b + m[blake2b_sigma[r][2*i+1]]; \ 53 - d = ror64(d ^ a, 16); \ 54 - c = c + d; \ 55 - b = ror64(b ^ c, 63); \ 56 - } while (0) 57 - 58 - #define ROUND(r) \ 59 - do { \ 60 - G(r,0,v[ 0],v[ 4],v[ 8],v[12]); \ 61 - G(r,1,v[ 1],v[ 5],v[ 9],v[13]); \ 62 - G(r,2,v[ 2],v[ 6],v[10],v[14]); \ 63 - G(r,3,v[ 3],v[ 7],v[11],v[15]); \ 64 - G(r,4,v[ 0],v[ 5],v[10],v[15]); \ 65 - G(r,5,v[ 1],v[ 6],v[11],v[12]); \ 66 - G(r,6,v[ 2],v[ 7],v[ 8],v[13]); \ 67 - G(r,7,v[ 3],v[ 4],v[ 9],v[14]); \ 68 - } while (0) 69 - 70 - static void blake2b_compress_one_generic(struct blake2b_state *S, 71 - const u8 block[BLAKE2B_BLOCK_SIZE]) 72 - { 73 - u64 m[16]; 74 - u64 v[16]; 75 - size_t i; 76 - 77 - for (i = 0; i < 16; ++i) 78 - m[i] = get_unaligned_le64(block + i * sizeof(m[i])); 79 - 80 - for (i = 0; i < 8; ++i) 81 - v[i] = S->h[i]; 82 - 83 - v[ 8] = BLAKE2B_IV0; 84 - v[ 9] = BLAKE2B_IV1; 85 - v[10] = BLAKE2B_IV2; 86 - v[11] = BLAKE2B_IV3; 87 - v[12] = BLAKE2B_IV4 ^ S->t[0]; 88 - v[13] = BLAKE2B_IV5 ^ S->t[1]; 89 - v[14] = BLAKE2B_IV6 ^ S->f[0]; 90 - v[15] = BLAKE2B_IV7 ^ S->f[1]; 91 - 92 - ROUND(0); 93 - ROUND(1); 94 - ROUND(2); 95 - ROUND(3); 96 - ROUND(4); 97 - ROUND(5); 98 - ROUND(6); 99 - ROUND(7); 100 - ROUND(8); 101 - ROUND(9); 102 - ROUND(10); 103 - ROUND(11); 104 - #ifdef CONFIG_CC_IS_CLANG 105 - #pragma nounroll /* https://llvm.org/pr45803 */ 106 - #endif 
107 - for (i = 0; i < 8; ++i) 108 - S->h[i] = S->h[i] ^ v[i] ^ v[i + 8]; 109 - } 110 - 111 - #undef G 112 - #undef ROUND 113 - 114 - static void blake2b_compress_generic(struct blake2b_state *state, 115 - const u8 *block, size_t nblocks, u32 inc) 116 - { 117 - do { 118 - blake2b_increment_counter(state, inc); 119 - blake2b_compress_one_generic(state, block); 120 - block += BLAKE2B_BLOCK_SIZE; 121 - } while (--nblocks); 122 - } 123 - 124 - static int crypto_blake2b_update_generic(struct shash_desc *desc, 125 - const u8 *in, unsigned int inlen) 126 - { 127 - return crypto_blake2b_update_bo(desc, in, inlen, 128 - blake2b_compress_generic); 129 - } 130 - 131 - static int crypto_blake2b_finup_generic(struct shash_desc *desc, const u8 *in, 132 - unsigned int inlen, u8 *out) 133 - { 134 - return crypto_blake2b_finup(desc, in, inlen, out, 135 - blake2b_compress_generic); 136 - } 137 - 138 - #define BLAKE2B_ALG(name, driver_name, digest_size) \ 139 - { \ 140 - .base.cra_name = name, \ 141 - .base.cra_driver_name = driver_name, \ 142 - .base.cra_priority = 100, \ 143 - .base.cra_flags = CRYPTO_ALG_OPTIONAL_KEY | \ 144 - CRYPTO_AHASH_ALG_BLOCK_ONLY | \ 145 - CRYPTO_AHASH_ALG_FINAL_NONZERO, \ 146 - .base.cra_blocksize = BLAKE2B_BLOCK_SIZE, \ 147 - .base.cra_ctxsize = sizeof(struct blake2b_tfm_ctx), \ 148 - .base.cra_module = THIS_MODULE, \ 149 - .digestsize = digest_size, \ 150 - .setkey = crypto_blake2b_setkey, \ 151 - .init = crypto_blake2b_init, \ 152 - .update = crypto_blake2b_update_generic, \ 153 - .finup = crypto_blake2b_finup_generic, \ 154 - .descsize = BLAKE2B_DESC_SIZE, \ 155 - .statesize = BLAKE2B_STATE_SIZE, \ 156 - } 157 - 158 - static struct shash_alg blake2b_algs[] = { 159 - BLAKE2B_ALG("blake2b-160", "blake2b-160-generic", 160 - BLAKE2B_160_HASH_SIZE), 161 - BLAKE2B_ALG("blake2b-256", "blake2b-256-generic", 162 - BLAKE2B_256_HASH_SIZE), 163 - BLAKE2B_ALG("blake2b-384", "blake2b-384-generic", 164 - BLAKE2B_384_HASH_SIZE), 165 - BLAKE2B_ALG("blake2b-512", "blake2b-512-generic", 166 - BLAKE2B_512_HASH_SIZE), 167 - }; 168 - 169 - static int __init blake2b_mod_init(void) 170 - { 171 - return crypto_register_shashes(blake2b_algs, ARRAY_SIZE(blake2b_algs)); 172 - } 173 - 174 - static void __exit blake2b_mod_fini(void) 175 - { 176 - crypto_unregister_shashes(blake2b_algs, ARRAY_SIZE(blake2b_algs)); 177 - } 178 - 179 - module_init(blake2b_mod_init); 180 - module_exit(blake2b_mod_fini); 181 - 182 - MODULE_AUTHOR("David Sterba <kdave@kernel.org>"); 183 - MODULE_DESCRIPTION("BLAKE2b generic implementation"); 184 - MODULE_LICENSE("GPL"); 185 - MODULE_ALIAS_CRYPTO("blake2b-160"); 186 - MODULE_ALIAS_CRYPTO("blake2b-160-generic"); 187 - MODULE_ALIAS_CRYPTO("blake2b-256"); 188 - MODULE_ALIAS_CRYPTO("blake2b-256-generic"); 189 - MODULE_ALIAS_CRYPTO("blake2b-384"); 190 - MODULE_ALIAS_CRYPTO("blake2b-384-generic"); 191 - MODULE_ALIAS_CRYPTO("blake2b-512"); 192 - MODULE_ALIAS_CRYPTO("blake2b-512-generic");
+62 -160
crypto/hctr2.c
··· 17 17 */ 18 18 19 19 #include <crypto/internal/cipher.h> 20 - #include <crypto/internal/hash.h> 21 20 #include <crypto/internal/skcipher.h> 22 21 #include <crypto/polyval.h> 23 22 #include <crypto/scatterwalk.h> ··· 36 37 struct hctr2_instance_ctx { 37 38 struct crypto_cipher_spawn blockcipher_spawn; 38 39 struct crypto_skcipher_spawn xctr_spawn; 39 - struct crypto_shash_spawn polyval_spawn; 40 40 }; 41 41 42 42 struct hctr2_tfm_ctx { 43 43 struct crypto_cipher *blockcipher; 44 44 struct crypto_skcipher *xctr; 45 - struct crypto_shash *polyval; 45 + struct polyval_key poly_key; 46 + struct polyval_elem hashed_tweaklens[2]; 46 47 u8 L[BLOCKCIPHER_BLOCK_SIZE]; 47 - int hashed_tweak_offset; 48 - /* 49 - * This struct is allocated with extra space for two exported hash 50 - * states. Since the hash state size is not known at compile-time, we 51 - * can't add these to the struct directly. 52 - * 53 - * hashed_tweaklen_divisible; 54 - * hashed_tweaklen_remainder; 55 - */ 56 48 }; 57 49 58 50 struct hctr2_request_ctx { ··· 53 63 struct scatterlist *bulk_part_src; 54 64 struct scatterlist sg_src[2]; 55 65 struct scatterlist sg_dst[2]; 66 + struct polyval_elem hashed_tweak; 56 67 /* 57 - * Sub-request sizes are unknown at compile-time, so they need to go 58 - * after the members with known sizes. 68 + * skcipher sub-request size is unknown at compile-time, so it needs to 69 + * go after the members with known sizes. 59 70 */ 60 71 union { 61 - struct shash_desc hash_desc; 72 + struct polyval_ctx poly_ctx; 62 73 struct skcipher_request xctr_req; 63 74 } u; 64 - /* 65 - * This struct is allocated with extra space for one exported hash 66 - * state. Since the hash state size is not known at compile-time, we 67 - * can't add it to the struct directly. 68 - * 69 - * hashed_tweak; 70 - */ 71 75 }; 72 - 73 - static inline u8 *hctr2_hashed_tweaklen(const struct hctr2_tfm_ctx *tctx, 74 - bool has_remainder) 75 - { 76 - u8 *p = (u8 *)tctx + sizeof(*tctx); 77 - 78 - if (has_remainder) /* For messages not a multiple of block length */ 79 - p += crypto_shash_statesize(tctx->polyval); 80 - return p; 81 - } 82 - 83 - static inline u8 *hctr2_hashed_tweak(const struct hctr2_tfm_ctx *tctx, 84 - struct hctr2_request_ctx *rctx) 85 - { 86 - return (u8 *)rctx + tctx->hashed_tweak_offset; 87 - } 88 76 89 77 /* 90 78 * The input data for each HCTR2 hash step begins with a 16-byte block that ··· 74 106 * 75 107 * These precomputed hashes are stored in hctr2_tfm_ctx. 
76 108 */ 77 - static int hctr2_hash_tweaklen(struct hctr2_tfm_ctx *tctx, bool has_remainder) 109 + static void hctr2_hash_tweaklens(struct hctr2_tfm_ctx *tctx) 78 110 { 79 - SHASH_DESC_ON_STACK(shash, tfm->polyval); 80 - __le64 tweak_length_block[2]; 81 - int err; 111 + struct polyval_ctx ctx; 82 112 83 - shash->tfm = tctx->polyval; 84 - memset(tweak_length_block, 0, sizeof(tweak_length_block)); 113 + for (int has_remainder = 0; has_remainder < 2; has_remainder++) { 114 + const __le64 tweak_length_block[2] = { 115 + cpu_to_le64(TWEAK_SIZE * 8 * 2 + 2 + has_remainder), 116 + }; 85 117 86 - tweak_length_block[0] = cpu_to_le64(TWEAK_SIZE * 8 * 2 + 2 + has_remainder); 87 - err = crypto_shash_init(shash); 88 - if (err) 89 - return err; 90 - err = crypto_shash_update(shash, (u8 *)tweak_length_block, 91 - POLYVAL_BLOCK_SIZE); 92 - if (err) 93 - return err; 94 - return crypto_shash_export(shash, hctr2_hashed_tweaklen(tctx, has_remainder)); 118 + polyval_init(&ctx, &tctx->poly_key); 119 + polyval_update(&ctx, (const u8 *)&tweak_length_block, 120 + sizeof(tweak_length_block)); 121 + static_assert(sizeof(tweak_length_block) == POLYVAL_BLOCK_SIZE); 122 + polyval_export_blkaligned( 123 + &ctx, &tctx->hashed_tweaklens[has_remainder]); 124 + } 125 + memzero_explicit(&ctx, sizeof(ctx)); 95 126 } 96 127 97 128 static int hctr2_setkey(struct crypto_skcipher *tfm, const u8 *key, ··· 123 156 tctx->L[0] = 0x01; 124 157 crypto_cipher_encrypt_one(tctx->blockcipher, tctx->L, tctx->L); 125 158 126 - crypto_shash_clear_flags(tctx->polyval, CRYPTO_TFM_REQ_MASK); 127 - crypto_shash_set_flags(tctx->polyval, crypto_skcipher_get_flags(tfm) & 128 - CRYPTO_TFM_REQ_MASK); 129 - err = crypto_shash_setkey(tctx->polyval, hbar, BLOCKCIPHER_BLOCK_SIZE); 130 - if (err) 131 - return err; 159 + static_assert(sizeof(hbar) == POLYVAL_BLOCK_SIZE); 160 + polyval_preparekey(&tctx->poly_key, hbar); 132 161 memzero_explicit(hbar, sizeof(hbar)); 133 162 134 - return hctr2_hash_tweaklen(tctx, true) ?: hctr2_hash_tweaklen(tctx, false); 163 + hctr2_hash_tweaklens(tctx); 164 + return 0; 135 165 } 136 166 137 - static int hctr2_hash_tweak(struct skcipher_request *req) 167 + static void hctr2_hash_tweak(struct skcipher_request *req) 138 168 { 139 169 struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); 140 170 const struct hctr2_tfm_ctx *tctx = crypto_skcipher_ctx(tfm); 141 171 struct hctr2_request_ctx *rctx = skcipher_request_ctx(req); 142 - struct shash_desc *hash_desc = &rctx->u.hash_desc; 143 - int err; 172 + struct polyval_ctx *poly_ctx = &rctx->u.poly_ctx; 144 173 bool has_remainder = req->cryptlen % POLYVAL_BLOCK_SIZE; 145 174 146 - hash_desc->tfm = tctx->polyval; 147 - err = crypto_shash_import(hash_desc, hctr2_hashed_tweaklen(tctx, has_remainder)); 148 - if (err) 149 - return err; 150 - err = crypto_shash_update(hash_desc, req->iv, TWEAK_SIZE); 151 - if (err) 152 - return err; 175 + polyval_import_blkaligned(poly_ctx, &tctx->poly_key, 176 + &tctx->hashed_tweaklens[has_remainder]); 177 + polyval_update(poly_ctx, req->iv, TWEAK_SIZE); 153 178 154 179 // Store the hashed tweak, since we need it when computing both 155 180 // H(T || N) and H(T || V). 
156 - return crypto_shash_export(hash_desc, hctr2_hashed_tweak(tctx, rctx)); 181 + static_assert(TWEAK_SIZE % POLYVAL_BLOCK_SIZE == 0); 182 + polyval_export_blkaligned(poly_ctx, &rctx->hashed_tweak); 157 183 } 158 184 159 - static int hctr2_hash_message(struct skcipher_request *req, 160 - struct scatterlist *sgl, 161 - u8 digest[POLYVAL_DIGEST_SIZE]) 185 + static void hctr2_hash_message(struct skcipher_request *req, 186 + struct scatterlist *sgl, 187 + u8 digest[POLYVAL_DIGEST_SIZE]) 162 188 { 163 - static const u8 padding[BLOCKCIPHER_BLOCK_SIZE] = { 0x1 }; 189 + static const u8 padding = 0x1; 164 190 struct hctr2_request_ctx *rctx = skcipher_request_ctx(req); 165 - struct shash_desc *hash_desc = &rctx->u.hash_desc; 191 + struct polyval_ctx *poly_ctx = &rctx->u.poly_ctx; 166 192 const unsigned int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE; 167 193 struct sg_mapping_iter miter; 168 - unsigned int remainder = bulk_len % BLOCKCIPHER_BLOCK_SIZE; 169 194 int i; 170 - int err = 0; 171 195 int n = 0; 172 196 173 197 sg_miter_start(&miter, sgl, sg_nents(sgl), ··· 166 208 for (i = 0; i < bulk_len; i += n) { 167 209 sg_miter_next(&miter); 168 210 n = min_t(unsigned int, miter.length, bulk_len - i); 169 - err = crypto_shash_update(hash_desc, miter.addr, n); 170 - if (err) 171 - break; 211 + polyval_update(poly_ctx, miter.addr, n); 172 212 } 173 213 sg_miter_stop(&miter); 174 214 175 - if (err) 176 - return err; 177 - 178 - if (remainder) { 179 - err = crypto_shash_update(hash_desc, padding, 180 - BLOCKCIPHER_BLOCK_SIZE - remainder); 181 - if (err) 182 - return err; 183 - } 184 - return crypto_shash_final(hash_desc, digest); 215 + if (req->cryptlen % BLOCKCIPHER_BLOCK_SIZE) 216 + polyval_update(poly_ctx, &padding, 1); 217 + polyval_final(poly_ctx, digest); 185 218 } 186 219 187 220 static int hctr2_finish(struct skcipher_request *req) ··· 180 231 struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); 181 232 const struct hctr2_tfm_ctx *tctx = crypto_skcipher_ctx(tfm); 182 233 struct hctr2_request_ctx *rctx = skcipher_request_ctx(req); 234 + struct polyval_ctx *poly_ctx = &rctx->u.poly_ctx; 183 235 u8 digest[POLYVAL_DIGEST_SIZE]; 184 - struct shash_desc *hash_desc = &rctx->u.hash_desc; 185 - int err; 186 236 187 237 // U = UU ^ H(T || V) 188 238 // or M = MM ^ H(T || N) 189 - hash_desc->tfm = tctx->polyval; 190 - err = crypto_shash_import(hash_desc, hctr2_hashed_tweak(tctx, rctx)); 191 - if (err) 192 - return err; 193 - err = hctr2_hash_message(req, rctx->bulk_part_dst, digest); 194 - if (err) 195 - return err; 239 + polyval_import_blkaligned(poly_ctx, &tctx->poly_key, 240 + &rctx->hashed_tweak); 241 + hctr2_hash_message(req, rctx->bulk_part_dst, digest); 196 242 crypto_xor(rctx->first_block, digest, BLOCKCIPHER_BLOCK_SIZE); 197 243 198 244 // Copy U (or M) into dst scatterlist ··· 213 269 struct hctr2_request_ctx *rctx = skcipher_request_ctx(req); 214 270 u8 digest[POLYVAL_DIGEST_SIZE]; 215 271 int bulk_len = req->cryptlen - BLOCKCIPHER_BLOCK_SIZE; 216 - int err; 217 272 218 273 // Requests must be at least one block 219 274 if (req->cryptlen < BLOCKCIPHER_BLOCK_SIZE) ··· 230 287 231 288 // MM = M ^ H(T || N) 232 289 // or UU = U ^ H(T || V) 233 - err = hctr2_hash_tweak(req); 234 - if (err) 235 - return err; 236 - err = hctr2_hash_message(req, rctx->bulk_part_src, digest); 237 - if (err) 238 - return err; 290 + hctr2_hash_tweak(req); 291 + hctr2_hash_message(req, rctx->bulk_part_src, digest); 239 292 crypto_xor(digest, rctx->first_block, BLOCKCIPHER_BLOCK_SIZE); 240 293 241 294 // UU = 
E(MM) ··· 277 338 struct hctr2_tfm_ctx *tctx = crypto_skcipher_ctx(tfm); 278 339 struct crypto_skcipher *xctr; 279 340 struct crypto_cipher *blockcipher; 280 - struct crypto_shash *polyval; 281 - unsigned int subreq_size; 282 341 int err; 283 342 284 343 xctr = crypto_spawn_skcipher(&ictx->xctr_spawn); ··· 289 352 goto err_free_xctr; 290 353 } 291 354 292 - polyval = crypto_spawn_shash(&ictx->polyval_spawn); 293 - if (IS_ERR(polyval)) { 294 - err = PTR_ERR(polyval); 295 - goto err_free_blockcipher; 296 - } 297 - 298 355 tctx->xctr = xctr; 299 356 tctx->blockcipher = blockcipher; 300 - tctx->polyval = polyval; 301 357 302 358 BUILD_BUG_ON(offsetofend(struct hctr2_request_ctx, u) != 303 359 sizeof(struct hctr2_request_ctx)); 304 - subreq_size = max(sizeof_field(struct hctr2_request_ctx, u.hash_desc) + 305 - crypto_shash_descsize(polyval), 306 - sizeof_field(struct hctr2_request_ctx, u.xctr_req) + 307 - crypto_skcipher_reqsize(xctr)); 308 - 309 - tctx->hashed_tweak_offset = offsetof(struct hctr2_request_ctx, u) + 310 - subreq_size; 311 - crypto_skcipher_set_reqsize(tfm, tctx->hashed_tweak_offset + 312 - crypto_shash_statesize(polyval)); 360 + crypto_skcipher_set_reqsize( 361 + tfm, max(sizeof(struct hctr2_request_ctx), 362 + offsetofend(struct hctr2_request_ctx, u.xctr_req) + 363 + crypto_skcipher_reqsize(xctr))); 313 364 return 0; 314 365 315 - err_free_blockcipher: 316 - crypto_free_cipher(blockcipher); 317 366 err_free_xctr: 318 367 crypto_free_skcipher(xctr); 319 368 return err; ··· 311 388 312 389 crypto_free_cipher(tctx->blockcipher); 313 390 crypto_free_skcipher(tctx->xctr); 314 - crypto_free_shash(tctx->polyval); 315 391 } 316 392 317 393 static void hctr2_free_instance(struct skcipher_instance *inst) ··· 319 397 320 398 crypto_drop_cipher(&ictx->blockcipher_spawn); 321 399 crypto_drop_skcipher(&ictx->xctr_spawn); 322 - crypto_drop_shash(&ictx->polyval_spawn); 323 400 kfree(inst); 324 401 } 325 402 326 - static int hctr2_create_common(struct crypto_template *tmpl, 327 - struct rtattr **tb, 328 - const char *xctr_name, 329 - const char *polyval_name) 403 + static int hctr2_create_common(struct crypto_template *tmpl, struct rtattr **tb, 404 + const char *xctr_name) 330 405 { 331 406 struct skcipher_alg_common *xctr_alg; 332 407 u32 mask; 333 408 struct skcipher_instance *inst; 334 409 struct hctr2_instance_ctx *ictx; 335 410 struct crypto_alg *blockcipher_alg; 336 - struct shash_alg *polyval_alg; 337 411 char blockcipher_name[CRYPTO_MAX_ALG_NAME]; 338 412 int len; 339 413 int err; ··· 375 457 if (blockcipher_alg->cra_blocksize != BLOCKCIPHER_BLOCK_SIZE) 376 458 goto err_free_inst; 377 459 378 - /* Polyval ε-∆U hash function */ 379 - err = crypto_grab_shash(&ictx->polyval_spawn, 380 - skcipher_crypto_instance(inst), 381 - polyval_name, 0, mask); 382 - if (err) 383 - goto err_free_inst; 384 - polyval_alg = crypto_spawn_shash_alg(&ictx->polyval_spawn); 385 - 386 - /* Ensure Polyval is being used */ 387 - err = -EINVAL; 388 - if (strcmp(polyval_alg->base.cra_name, "polyval") != 0) 389 - goto err_free_inst; 390 - 391 460 /* Instance fields */ 392 461 393 462 err = -ENAMETOOLONG; ··· 382 477 blockcipher_alg->cra_name) >= CRYPTO_MAX_ALG_NAME) 383 478 goto err_free_inst; 384 479 if (snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME, 385 - "hctr2_base(%s,%s)", 386 - xctr_alg->base.cra_driver_name, 387 - polyval_alg->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME) 480 + "hctr2_base(%s,polyval-lib)", 481 + xctr_alg->base.cra_driver_name) >= CRYPTO_MAX_ALG_NAME) 388 482 goto err_free_inst; 
389 483 390 484 inst->alg.base.cra_blocksize = BLOCKCIPHER_BLOCK_SIZE; 391 - inst->alg.base.cra_ctxsize = sizeof(struct hctr2_tfm_ctx) + 392 - polyval_alg->statesize * 2; 485 + inst->alg.base.cra_ctxsize = sizeof(struct hctr2_tfm_ctx); 393 486 inst->alg.base.cra_alignmask = xctr_alg->base.cra_alignmask; 394 - /* 395 - * The hash function is called twice, so it is weighted higher than the 396 - * xctr and blockcipher. 397 - */ 398 487 inst->alg.base.cra_priority = (2 * xctr_alg->base.cra_priority + 399 - 4 * polyval_alg->base.cra_priority + 400 - blockcipher_alg->cra_priority) / 7; 488 + blockcipher_alg->cra_priority) / 489 + 3; 401 490 402 491 inst->alg.setkey = hctr2_setkey; 403 492 inst->alg.encrypt = hctr2_encrypt; ··· 424 525 polyval_name = crypto_attr_alg_name(tb[2]); 425 526 if (IS_ERR(polyval_name)) 426 527 return PTR_ERR(polyval_name); 528 + if (strcmp(polyval_name, "polyval") != 0 && 529 + strcmp(polyval_name, "polyval-lib") != 0) 530 + return -ENOENT; 427 531 428 - return hctr2_create_common(tmpl, tb, xctr_name, polyval_name); 532 + return hctr2_create_common(tmpl, tb, xctr_name); 429 533 } 430 534 431 535 static int hctr2_create(struct crypto_template *tmpl, struct rtattr **tb) ··· 444 542 blockcipher_name) >= CRYPTO_MAX_ALG_NAME) 445 543 return -ENAMETOOLONG; 446 544 447 - return hctr2_create_common(tmpl, tb, xctr_name, "polyval"); 545 + return hctr2_create_common(tmpl, tb, xctr_name); 448 546 } 449 547 450 548 static struct crypto_template hctr2_tmpls[] = {
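For reference, the POLYVAL library calling sequence that replaces the crypto_shash plumbing above, sketched with placeholder raw_key/msg/msg_len buffers. That polyval_final() zero-pads a trailing partial block is an assumption inferred from hctr2_hash_message() now feeding only a single 0x01 padding byte:

    struct polyval_key key;
    struct polyval_ctx ctx;
    u8 digest[POLYVAL_DIGEST_SIZE];

    polyval_preparekey(&key, raw_key);   /* raw_key: 16-byte POLYVAL key */
    polyval_init(&ctx, &key);
    polyval_update(&ctx, msg, msg_len);  /* may be called any number of times */
    polyval_final(&ctx, digest);         /* 16-byte digest out */

Unlike the shash route, none of these calls can fail, which is what lets the converted code above drop all of its error propagation.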
+2 -10
crypto/jitterentropy-kcapi.c
··· 48 48 49 49 #include "jitterentropy.h" 50 50 51 - #define JENT_CONDITIONING_HASH "sha3-256-generic" 51 + #define JENT_CONDITIONING_HASH "sha3-256" 52 52 53 53 /*************************************************************************** 54 54 * Helper function ··· 230 230 231 231 spin_lock_init(&rng->jent_lock); 232 232 233 - /* 234 - * Use SHA3-256 as conditioner. We allocate only the generic 235 - * implementation as we are not interested in high-performance. The 236 - * execution time of the SHA3 operation is measured and adds to the 237 - * Jitter RNG's unpredictable behavior. If we have a slower hash 238 - * implementation, the execution timing variations are larger. When 239 - * using a fast implementation, we would need to call it more often 240 - * as its variations are lower. 241 - */ 233 + /* Use SHA3-256 as conditioner */ 242 234 hash = crypto_alloc_shash(JENT_CONDITIONING_HASH, 0, 0); 243 235 if (IS_ERR(hash)) { 244 236 pr_err("Cannot allocate conditioning digest\n");
-205
crypto/polyval-generic.c
··· 1 - // SPDX-License-Identifier: GPL-2.0-only 2 - /* 3 - * POLYVAL: hash function for HCTR2. 4 - * 5 - * Copyright (c) 2007 Nokia Siemens Networks - Mikko Herranen <mh1@iki.fi> 6 - * Copyright (c) 2009 Intel Corp. 7 - * Author: Huang Ying <ying.huang@intel.com> 8 - * Copyright 2021 Google LLC 9 - */ 10 - 11 - /* 12 - * Code based on crypto/ghash-generic.c 13 - * 14 - * POLYVAL is a keyed hash function similar to GHASH. POLYVAL uses a different 15 - * modulus for finite field multiplication which makes hardware accelerated 16 - * implementations on little-endian machines faster. POLYVAL is used in the 17 - * kernel to implement HCTR2, but was originally specified for AES-GCM-SIV 18 - * (RFC 8452). 19 - * 20 - * For more information see: 21 - * Length-preserving encryption with HCTR2: 22 - * https://eprint.iacr.org/2021/1441.pdf 23 - * AES-GCM-SIV: Nonce Misuse-Resistant Authenticated Encryption: 24 - * https://datatracker.ietf.org/doc/html/rfc8452 25 - * 26 - * Like GHASH, POLYVAL is not a cryptographic hash function and should 27 - * not be used outside of crypto modes explicitly designed to use POLYVAL. 28 - * 29 - * This implementation uses a convenient trick involving the GHASH and POLYVAL 30 - * fields. This trick allows multiplication in the POLYVAL field to be 31 - * implemented by using multiplication in the GHASH field as a subroutine. An 32 - * element of the POLYVAL field can be converted to an element of the GHASH 33 - * field by computing x*REVERSE(a), where REVERSE reverses the byte-ordering of 34 - * a. Similarly, an element of the GHASH field can be converted back to the 35 - * POLYVAL field by computing REVERSE(x^{-1}*a). For more information, see: 36 - * https://datatracker.ietf.org/doc/html/rfc8452#appendix-A 37 - * 38 - * By using this trick, we do not need to implement the POLYVAL field for the 39 - * generic implementation. 40 - * 41 - * Warning: this generic implementation is not intended to be used in practice 42 - * and is not constant time. For practical use, a hardware accelerated 43 - * implementation of POLYVAL should be used instead. 
44 - * 45 - */ 46 - 47 - #include <crypto/gf128mul.h> 48 - #include <crypto/internal/hash.h> 49 - #include <crypto/polyval.h> 50 - #include <crypto/utils.h> 51 - #include <linux/errno.h> 52 - #include <linux/kernel.h> 53 - #include <linux/module.h> 54 - #include <linux/string.h> 55 - #include <linux/unaligned.h> 56 - 57 - struct polyval_tfm_ctx { 58 - struct gf128mul_4k *gf128; 59 - }; 60 - 61 - struct polyval_desc_ctx { 62 - union { 63 - u8 buffer[POLYVAL_BLOCK_SIZE]; 64 - be128 buffer128; 65 - }; 66 - }; 67 - 68 - static void copy_and_reverse(u8 dst[POLYVAL_BLOCK_SIZE], 69 - const u8 src[POLYVAL_BLOCK_SIZE]) 70 - { 71 - u64 a = get_unaligned((const u64 *)&src[0]); 72 - u64 b = get_unaligned((const u64 *)&src[8]); 73 - 74 - put_unaligned(swab64(a), (u64 *)&dst[8]); 75 - put_unaligned(swab64(b), (u64 *)&dst[0]); 76 - } 77 - 78 - static int polyval_setkey(struct crypto_shash *tfm, 79 - const u8 *key, unsigned int keylen) 80 - { 81 - struct polyval_tfm_ctx *ctx = crypto_shash_ctx(tfm); 82 - be128 k; 83 - 84 - if (keylen != POLYVAL_BLOCK_SIZE) 85 - return -EINVAL; 86 - 87 - gf128mul_free_4k(ctx->gf128); 88 - 89 - BUILD_BUG_ON(sizeof(k) != POLYVAL_BLOCK_SIZE); 90 - copy_and_reverse((u8 *)&k, key); 91 - gf128mul_x_lle(&k, &k); 92 - 93 - ctx->gf128 = gf128mul_init_4k_lle(&k); 94 - memzero_explicit(&k, POLYVAL_BLOCK_SIZE); 95 - 96 - if (!ctx->gf128) 97 - return -ENOMEM; 98 - 99 - return 0; 100 - } 101 - 102 - static int polyval_init(struct shash_desc *desc) 103 - { 104 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 105 - 106 - memset(dctx, 0, sizeof(*dctx)); 107 - 108 - return 0; 109 - } 110 - 111 - static int polyval_update(struct shash_desc *desc, 112 - const u8 *src, unsigned int srclen) 113 - { 114 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 115 - const struct polyval_tfm_ctx *ctx = crypto_shash_ctx(desc->tfm); 116 - u8 tmp[POLYVAL_BLOCK_SIZE]; 117 - 118 - do { 119 - copy_and_reverse(tmp, src); 120 - crypto_xor(dctx->buffer, tmp, POLYVAL_BLOCK_SIZE); 121 - gf128mul_4k_lle(&dctx->buffer128, ctx->gf128); 122 - src += POLYVAL_BLOCK_SIZE; 123 - srclen -= POLYVAL_BLOCK_SIZE; 124 - } while (srclen >= POLYVAL_BLOCK_SIZE); 125 - 126 - return srclen; 127 - } 128 - 129 - static int polyval_finup(struct shash_desc *desc, const u8 *src, 130 - unsigned int len, u8 *dst) 131 - { 132 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 133 - 134 - if (len) { 135 - u8 tmp[POLYVAL_BLOCK_SIZE] = {}; 136 - 137 - memcpy(tmp, src, len); 138 - polyval_update(desc, tmp, POLYVAL_BLOCK_SIZE); 139 - } 140 - copy_and_reverse(dst, dctx->buffer); 141 - return 0; 142 - } 143 - 144 - static int polyval_export(struct shash_desc *desc, void *out) 145 - { 146 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 147 - 148 - copy_and_reverse(out, dctx->buffer); 149 - return 0; 150 - } 151 - 152 - static int polyval_import(struct shash_desc *desc, const void *in) 153 - { 154 - struct polyval_desc_ctx *dctx = shash_desc_ctx(desc); 155 - 156 - copy_and_reverse(dctx->buffer, in); 157 - return 0; 158 - } 159 - 160 - static void polyval_exit_tfm(struct crypto_shash *tfm) 161 - { 162 - struct polyval_tfm_ctx *ctx = crypto_shash_ctx(tfm); 163 - 164 - gf128mul_free_4k(ctx->gf128); 165 - } 166 - 167 - static struct shash_alg polyval_alg = { 168 - .digestsize = POLYVAL_DIGEST_SIZE, 169 - .init = polyval_init, 170 - .update = polyval_update, 171 - .finup = polyval_finup, 172 - .setkey = polyval_setkey, 173 - .export = polyval_export, 174 - .import = polyval_import, 175 - .exit_tfm = polyval_exit_tfm, 176 - 
.statesize = sizeof(struct polyval_desc_ctx), 177 - .descsize = sizeof(struct polyval_desc_ctx), 178 - .base = { 179 - .cra_name = "polyval", 180 - .cra_driver_name = "polyval-generic", 181 - .cra_priority = 100, 182 - .cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 183 - .cra_blocksize = POLYVAL_BLOCK_SIZE, 184 - .cra_ctxsize = sizeof(struct polyval_tfm_ctx), 185 - .cra_module = THIS_MODULE, 186 - }, 187 - }; 188 - 189 - static int __init polyval_mod_init(void) 190 - { 191 - return crypto_register_shash(&polyval_alg); 192 - } 193 - 194 - static void __exit polyval_mod_exit(void) 195 - { 196 - crypto_unregister_shash(&polyval_alg); 197 - } 198 - 199 - module_init(polyval_mod_init); 200 - module_exit(polyval_mod_exit); 201 - 202 - MODULE_LICENSE("GPL"); 203 - MODULE_DESCRIPTION("POLYVAL hash function"); 204 - MODULE_ALIAS_CRYPTO("polyval"); 205 - MODULE_ALIAS_CRYPTO("polyval-generic");
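The field-conversion trick that this deleted generic implementation relied on is the identity from RFC 8452, Appendix A, cited in the file comment above:

    POLYVAL(H, X_1, ..., X_n) =
        ByteReverse(GHASH(mulX_GHASH(ByteReverse(H)),
                          ByteReverse(X_1), ..., ByteReverse(X_n)))

Folding the mulX_GHASH step into the key once in polyval_setkey() (the copy_and_reverse() plus gf128mul_x_lle() pair) left only byte reversals on the per-block update path.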
+166
crypto/sha3.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-or-later 2 + /* 3 + * Crypto API support for SHA-3 4 + * (https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf) 5 + */ 6 + #include <crypto/internal/hash.h> 7 + #include <crypto/sha3.h> 8 + #include <linux/kernel.h> 9 + #include <linux/module.h> 10 + 11 + #define SHA3_CTX(desc) ((struct sha3_ctx *)shash_desc_ctx(desc)) 12 + 13 + static int crypto_sha3_224_init(struct shash_desc *desc) 14 + { 15 + sha3_224_init(SHA3_CTX(desc)); 16 + return 0; 17 + } 18 + 19 + static int crypto_sha3_256_init(struct shash_desc *desc) 20 + { 21 + sha3_256_init(SHA3_CTX(desc)); 22 + return 0; 23 + } 24 + 25 + static int crypto_sha3_384_init(struct shash_desc *desc) 26 + { 27 + sha3_384_init(SHA3_CTX(desc)); 28 + return 0; 29 + } 30 + 31 + static int crypto_sha3_512_init(struct shash_desc *desc) 32 + { 33 + sha3_512_init(SHA3_CTX(desc)); 34 + return 0; 35 + } 36 + 37 + static int crypto_sha3_update(struct shash_desc *desc, const u8 *data, 38 + unsigned int len) 39 + { 40 + sha3_update(SHA3_CTX(desc), data, len); 41 + return 0; 42 + } 43 + 44 + static int crypto_sha3_final(struct shash_desc *desc, u8 *out) 45 + { 46 + sha3_final(SHA3_CTX(desc), out); 47 + return 0; 48 + } 49 + 50 + static int crypto_sha3_224_digest(struct shash_desc *desc, 51 + const u8 *data, unsigned int len, u8 *out) 52 + { 53 + sha3_224(data, len, out); 54 + return 0; 55 + } 56 + 57 + static int crypto_sha3_256_digest(struct shash_desc *desc, 58 + const u8 *data, unsigned int len, u8 *out) 59 + { 60 + sha3_256(data, len, out); 61 + return 0; 62 + } 63 + 64 + static int crypto_sha3_384_digest(struct shash_desc *desc, 65 + const u8 *data, unsigned int len, u8 *out) 66 + { 67 + sha3_384(data, len, out); 68 + return 0; 69 + } 70 + 71 + static int crypto_sha3_512_digest(struct shash_desc *desc, 72 + const u8 *data, unsigned int len, u8 *out) 73 + { 74 + sha3_512(data, len, out); 75 + return 0; 76 + } 77 + 78 + static int crypto_sha3_export_core(struct shash_desc *desc, void *out) 79 + { 80 + memcpy(out, SHA3_CTX(desc), sizeof(struct sha3_ctx)); 81 + return 0; 82 + } 83 + 84 + static int crypto_sha3_import_core(struct shash_desc *desc, const void *in) 85 + { 86 + memcpy(SHA3_CTX(desc), in, sizeof(struct sha3_ctx)); 87 + return 0; 88 + } 89 + 90 + static struct shash_alg algs[] = { { 91 + .digestsize = SHA3_224_DIGEST_SIZE, 92 + .init = crypto_sha3_224_init, 93 + .update = crypto_sha3_update, 94 + .final = crypto_sha3_final, 95 + .digest = crypto_sha3_224_digest, 96 + .export_core = crypto_sha3_export_core, 97 + .import_core = crypto_sha3_import_core, 98 + .descsize = sizeof(struct sha3_ctx), 99 + .base.cra_name = "sha3-224", 100 + .base.cra_driver_name = "sha3-224-lib", 101 + .base.cra_blocksize = SHA3_224_BLOCK_SIZE, 102 + .base.cra_module = THIS_MODULE, 103 + }, { 104 + .digestsize = SHA3_256_DIGEST_SIZE, 105 + .init = crypto_sha3_256_init, 106 + .update = crypto_sha3_update, 107 + .final = crypto_sha3_final, 108 + .digest = crypto_sha3_256_digest, 109 + .export_core = crypto_sha3_export_core, 110 + .import_core = crypto_sha3_import_core, 111 + .descsize = sizeof(struct sha3_ctx), 112 + .base.cra_name = "sha3-256", 113 + .base.cra_driver_name = "sha3-256-lib", 114 + .base.cra_blocksize = SHA3_256_BLOCK_SIZE, 115 + .base.cra_module = THIS_MODULE, 116 + }, { 117 + .digestsize = SHA3_384_DIGEST_SIZE, 118 + .init = crypto_sha3_384_init, 119 + .update = crypto_sha3_update, 120 + .final = crypto_sha3_final, 121 + .digest = crypto_sha3_384_digest, 122 + .export_core = crypto_sha3_export_core, 123 + 
.import_core = crypto_sha3_import_core, 124 + .descsize = sizeof(struct sha3_ctx), 125 + .base.cra_name = "sha3-384", 126 + .base.cra_driver_name = "sha3-384-lib", 127 + .base.cra_blocksize = SHA3_384_BLOCK_SIZE, 128 + .base.cra_module = THIS_MODULE, 129 + }, { 130 + .digestsize = SHA3_512_DIGEST_SIZE, 131 + .init = crypto_sha3_512_init, 132 + .update = crypto_sha3_update, 133 + .final = crypto_sha3_final, 134 + .digest = crypto_sha3_512_digest, 135 + .export_core = crypto_sha3_export_core, 136 + .import_core = crypto_sha3_import_core, 137 + .descsize = sizeof(struct sha3_ctx), 138 + .base.cra_name = "sha3-512", 139 + .base.cra_driver_name = "sha3-512-lib", 140 + .base.cra_blocksize = SHA3_512_BLOCK_SIZE, 141 + .base.cra_module = THIS_MODULE, 142 + } }; 143 + 144 + static int __init crypto_sha3_mod_init(void) 145 + { 146 + return crypto_register_shashes(algs, ARRAY_SIZE(algs)); 147 + } 148 + module_init(crypto_sha3_mod_init); 149 + 150 + static void __exit crypto_sha3_mod_exit(void) 151 + { 152 + crypto_unregister_shashes(algs, ARRAY_SIZE(algs)); 153 + } 154 + module_exit(crypto_sha3_mod_exit); 155 + 156 + MODULE_LICENSE("GPL"); 157 + MODULE_DESCRIPTION("Crypto API support for SHA-3"); 158 + 159 + MODULE_ALIAS_CRYPTO("sha3-224"); 160 + MODULE_ALIAS_CRYPTO("sha3-224-lib"); 161 + MODULE_ALIAS_CRYPTO("sha3-256"); 162 + MODULE_ALIAS_CRYPTO("sha3-256-lib"); 163 + MODULE_ALIAS_CRYPTO("sha3-384"); 164 + MODULE_ALIAS_CRYPTO("sha3-384-lib"); 165 + MODULE_ALIAS_CRYPTO("sha3-512"); 166 + MODULE_ALIAS_CRYPTO("sha3-512-lib");
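Since each shash op above delegates one-for-one to the library, the crypto API route and a direct library call produce identical digests. A hypothetical sketch, with allocation error handling omitted; crypto_shash_tfm_digest() is the crypto API's existing one-shot helper:

    u8 d1[SHA3_256_DIGEST_SIZE], d2[SHA3_256_DIGEST_SIZE];
    struct crypto_shash *tfm = crypto_alloc_shash("sha3-256", 0, 0);

    crypto_shash_tfm_digest(tfm, data, len, d1);  /* via the crypto API */
    sha3_256(data, len, d2);                      /* via lib/crypto directly */
    crypto_free_shash(tfm);
    /* d1 and d2 now hold the same SHA3-256 digest */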
-290
crypto/sha3_generic.c
··· 1 - // SPDX-License-Identifier: GPL-2.0-or-later 2 - /* 3 - * Cryptographic API. 4 - * 5 - * SHA-3, as specified in 6 - * https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf 7 - * 8 - * SHA-3 code by Jeff Garzik <jeff@garzik.org> 9 - * Ard Biesheuvel <ard.biesheuvel@linaro.org> 10 - */ 11 - #include <crypto/internal/hash.h> 12 - #include <crypto/sha3.h> 13 - #include <linux/kernel.h> 14 - #include <linux/module.h> 15 - #include <linux/string.h> 16 - #include <linux/unaligned.h> 17 - 18 - /* 19 - * On some 32-bit architectures (h8300), GCC ends up using 20 - * over 1 KB of stack if we inline the round calculation into the loop 21 - * in keccakf(). On the other hand, on 64-bit architectures with plenty 22 - * of [64-bit wide] general purpose registers, not inlining it severely 23 - * hurts performance. So let's use 64-bitness as a heuristic to decide 24 - * whether to inline or not. 25 - */ 26 - #ifdef CONFIG_64BIT 27 - #define SHA3_INLINE inline 28 - #else 29 - #define SHA3_INLINE noinline 30 - #endif 31 - 32 - #define KECCAK_ROUNDS 24 33 - 34 - static const u64 keccakf_rndc[24] = { 35 - 0x0000000000000001ULL, 0x0000000000008082ULL, 0x800000000000808aULL, 36 - 0x8000000080008000ULL, 0x000000000000808bULL, 0x0000000080000001ULL, 37 - 0x8000000080008081ULL, 0x8000000000008009ULL, 0x000000000000008aULL, 38 - 0x0000000000000088ULL, 0x0000000080008009ULL, 0x000000008000000aULL, 39 - 0x000000008000808bULL, 0x800000000000008bULL, 0x8000000000008089ULL, 40 - 0x8000000000008003ULL, 0x8000000000008002ULL, 0x8000000000000080ULL, 41 - 0x000000000000800aULL, 0x800000008000000aULL, 0x8000000080008081ULL, 42 - 0x8000000000008080ULL, 0x0000000080000001ULL, 0x8000000080008008ULL 43 - }; 44 - 45 - /* update the state with given number of rounds */ 46 - 47 - static SHA3_INLINE void keccakf_round(u64 st[25]) 48 - { 49 - u64 t[5], tt, bc[5]; 50 - 51 - /* Theta */ 52 - bc[0] = st[0] ^ st[5] ^ st[10] ^ st[15] ^ st[20]; 53 - bc[1] = st[1] ^ st[6] ^ st[11] ^ st[16] ^ st[21]; 54 - bc[2] = st[2] ^ st[7] ^ st[12] ^ st[17] ^ st[22]; 55 - bc[3] = st[3] ^ st[8] ^ st[13] ^ st[18] ^ st[23]; 56 - bc[4] = st[4] ^ st[9] ^ st[14] ^ st[19] ^ st[24]; 57 - 58 - t[0] = bc[4] ^ rol64(bc[1], 1); 59 - t[1] = bc[0] ^ rol64(bc[2], 1); 60 - t[2] = bc[1] ^ rol64(bc[3], 1); 61 - t[3] = bc[2] ^ rol64(bc[4], 1); 62 - t[4] = bc[3] ^ rol64(bc[0], 1); 63 - 64 - st[0] ^= t[0]; 65 - 66 - /* Rho Pi */ 67 - tt = st[1]; 68 - st[ 1] = rol64(st[ 6] ^ t[1], 44); 69 - st[ 6] = rol64(st[ 9] ^ t[4], 20); 70 - st[ 9] = rol64(st[22] ^ t[2], 61); 71 - st[22] = rol64(st[14] ^ t[4], 39); 72 - st[14] = rol64(st[20] ^ t[0], 18); 73 - st[20] = rol64(st[ 2] ^ t[2], 62); 74 - st[ 2] = rol64(st[12] ^ t[2], 43); 75 - st[12] = rol64(st[13] ^ t[3], 25); 76 - st[13] = rol64(st[19] ^ t[4], 8); 77 - st[19] = rol64(st[23] ^ t[3], 56); 78 - st[23] = rol64(st[15] ^ t[0], 41); 79 - st[15] = rol64(st[ 4] ^ t[4], 27); 80 - st[ 4] = rol64(st[24] ^ t[4], 14); 81 - st[24] = rol64(st[21] ^ t[1], 2); 82 - st[21] = rol64(st[ 8] ^ t[3], 55); 83 - st[ 8] = rol64(st[16] ^ t[1], 45); 84 - st[16] = rol64(st[ 5] ^ t[0], 36); 85 - st[ 5] = rol64(st[ 3] ^ t[3], 28); 86 - st[ 3] = rol64(st[18] ^ t[3], 21); 87 - st[18] = rol64(st[17] ^ t[2], 15); 88 - st[17] = rol64(st[11] ^ t[1], 10); 89 - st[11] = rol64(st[ 7] ^ t[2], 6); 90 - st[ 7] = rol64(st[10] ^ t[0], 3); 91 - st[10] = rol64( tt ^ t[1], 1); 92 - 93 - /* Chi */ 94 - bc[ 0] = ~st[ 1] & st[ 2]; 95 - bc[ 1] = ~st[ 2] & st[ 3]; 96 - bc[ 2] = ~st[ 3] & st[ 4]; 97 - bc[ 3] = ~st[ 4] & st[ 0]; 98 - bc[ 4] = ~st[ 0] & st[ 1]; 99 - st[ 
0] ^= bc[ 0]; 100 - st[ 1] ^= bc[ 1]; 101 - st[ 2] ^= bc[ 2]; 102 - st[ 3] ^= bc[ 3]; 103 - st[ 4] ^= bc[ 4]; 104 - 105 - bc[ 0] = ~st[ 6] & st[ 7]; 106 - bc[ 1] = ~st[ 7] & st[ 8]; 107 - bc[ 2] = ~st[ 8] & st[ 9]; 108 - bc[ 3] = ~st[ 9] & st[ 5]; 109 - bc[ 4] = ~st[ 5] & st[ 6]; 110 - st[ 5] ^= bc[ 0]; 111 - st[ 6] ^= bc[ 1]; 112 - st[ 7] ^= bc[ 2]; 113 - st[ 8] ^= bc[ 3]; 114 - st[ 9] ^= bc[ 4]; 115 - 116 - bc[ 0] = ~st[11] & st[12]; 117 - bc[ 1] = ~st[12] & st[13]; 118 - bc[ 2] = ~st[13] & st[14]; 119 - bc[ 3] = ~st[14] & st[10]; 120 - bc[ 4] = ~st[10] & st[11]; 121 - st[10] ^= bc[ 0]; 122 - st[11] ^= bc[ 1]; 123 - st[12] ^= bc[ 2]; 124 - st[13] ^= bc[ 3]; 125 - st[14] ^= bc[ 4]; 126 - 127 - bc[ 0] = ~st[16] & st[17]; 128 - bc[ 1] = ~st[17] & st[18]; 129 - bc[ 2] = ~st[18] & st[19]; 130 - bc[ 3] = ~st[19] & st[15]; 131 - bc[ 4] = ~st[15] & st[16]; 132 - st[15] ^= bc[ 0]; 133 - st[16] ^= bc[ 1]; 134 - st[17] ^= bc[ 2]; 135 - st[18] ^= bc[ 3]; 136 - st[19] ^= bc[ 4]; 137 - 138 - bc[ 0] = ~st[21] & st[22]; 139 - bc[ 1] = ~st[22] & st[23]; 140 - bc[ 2] = ~st[23] & st[24]; 141 - bc[ 3] = ~st[24] & st[20]; 142 - bc[ 4] = ~st[20] & st[21]; 143 - st[20] ^= bc[ 0]; 144 - st[21] ^= bc[ 1]; 145 - st[22] ^= bc[ 2]; 146 - st[23] ^= bc[ 3]; 147 - st[24] ^= bc[ 4]; 148 - } 149 - 150 - static void keccakf(u64 st[25]) 151 - { 152 - int round; 153 - 154 - for (round = 0; round < KECCAK_ROUNDS; round++) { 155 - keccakf_round(st); 156 - /* Iota */ 157 - st[0] ^= keccakf_rndc[round]; 158 - } 159 - } 160 - 161 - int crypto_sha3_init(struct shash_desc *desc) 162 - { 163 - struct sha3_state *sctx = shash_desc_ctx(desc); 164 - 165 - memset(sctx->st, 0, sizeof(sctx->st)); 166 - return 0; 167 - } 168 - EXPORT_SYMBOL(crypto_sha3_init); 169 - 170 - static int crypto_sha3_update(struct shash_desc *desc, const u8 *data, 171 - unsigned int len) 172 - { 173 - unsigned int rsiz = crypto_shash_blocksize(desc->tfm); 174 - struct sha3_state *sctx = shash_desc_ctx(desc); 175 - unsigned int rsizw = rsiz / 8; 176 - 177 - do { 178 - int i; 179 - 180 - for (i = 0; i < rsizw; i++) 181 - sctx->st[i] ^= get_unaligned_le64(data + 8 * i); 182 - keccakf(sctx->st); 183 - 184 - data += rsiz; 185 - len -= rsiz; 186 - } while (len >= rsiz); 187 - return len; 188 - } 189 - 190 - static int crypto_sha3_finup(struct shash_desc *desc, const u8 *src, 191 - unsigned int len, u8 *out) 192 - { 193 - unsigned int digest_size = crypto_shash_digestsize(desc->tfm); 194 - unsigned int rsiz = crypto_shash_blocksize(desc->tfm); 195 - struct sha3_state *sctx = shash_desc_ctx(desc); 196 - __le64 block[SHA3_224_BLOCK_SIZE / 8] = {}; 197 - __le64 *digest = (__le64 *)out; 198 - unsigned int rsizw = rsiz / 8; 199 - u8 *p; 200 - int i; 201 - 202 - p = memcpy(block, src, len); 203 - p[len++] = 0x06; 204 - p[rsiz - 1] |= 0x80; 205 - 206 - for (i = 0; i < rsizw; i++) 207 - sctx->st[i] ^= le64_to_cpu(block[i]); 208 - memzero_explicit(block, sizeof(block)); 209 - 210 - keccakf(sctx->st); 211 - 212 - for (i = 0; i < digest_size / 8; i++) 213 - put_unaligned_le64(sctx->st[i], digest++); 214 - 215 - if (digest_size & 4) 216 - put_unaligned_le32(sctx->st[i], (__le32 *)digest); 217 - 218 - return 0; 219 - } 220 - 221 - static struct shash_alg algs[] = { { 222 - .digestsize = SHA3_224_DIGEST_SIZE, 223 - .init = crypto_sha3_init, 224 - .update = crypto_sha3_update, 225 - .finup = crypto_sha3_finup, 226 - .descsize = SHA3_STATE_SIZE, 227 - .base.cra_name = "sha3-224", 228 - .base.cra_driver_name = "sha3-224-generic", 229 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 
230 - .base.cra_blocksize = SHA3_224_BLOCK_SIZE, 231 - .base.cra_module = THIS_MODULE, 232 - }, { 233 - .digestsize = SHA3_256_DIGEST_SIZE, 234 - .init = crypto_sha3_init, 235 - .update = crypto_sha3_update, 236 - .finup = crypto_sha3_finup, 237 - .descsize = SHA3_STATE_SIZE, 238 - .base.cra_name = "sha3-256", 239 - .base.cra_driver_name = "sha3-256-generic", 240 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 241 - .base.cra_blocksize = SHA3_256_BLOCK_SIZE, 242 - .base.cra_module = THIS_MODULE, 243 - }, { 244 - .digestsize = SHA3_384_DIGEST_SIZE, 245 - .init = crypto_sha3_init, 246 - .update = crypto_sha3_update, 247 - .finup = crypto_sha3_finup, 248 - .descsize = SHA3_STATE_SIZE, 249 - .base.cra_name = "sha3-384", 250 - .base.cra_driver_name = "sha3-384-generic", 251 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 252 - .base.cra_blocksize = SHA3_384_BLOCK_SIZE, 253 - .base.cra_module = THIS_MODULE, 254 - }, { 255 - .digestsize = SHA3_512_DIGEST_SIZE, 256 - .init = crypto_sha3_init, 257 - .update = crypto_sha3_update, 258 - .finup = crypto_sha3_finup, 259 - .descsize = SHA3_STATE_SIZE, 260 - .base.cra_name = "sha3-512", 261 - .base.cra_driver_name = "sha3-512-generic", 262 - .base.cra_flags = CRYPTO_AHASH_ALG_BLOCK_ONLY, 263 - .base.cra_blocksize = SHA3_512_BLOCK_SIZE, 264 - .base.cra_module = THIS_MODULE, 265 - } }; 266 - 267 - static int __init sha3_generic_mod_init(void) 268 - { 269 - return crypto_register_shashes(algs, ARRAY_SIZE(algs)); 270 - } 271 - 272 - static void __exit sha3_generic_mod_fini(void) 273 - { 274 - crypto_unregister_shashes(algs, ARRAY_SIZE(algs)); 275 - } 276 - 277 - module_init(sha3_generic_mod_init); 278 - module_exit(sha3_generic_mod_fini); 279 - 280 - MODULE_LICENSE("GPL"); 281 - MODULE_DESCRIPTION("SHA-3 Secure Hash Algorithm"); 282 - 283 - MODULE_ALIAS_CRYPTO("sha3-224"); 284 - MODULE_ALIAS_CRYPTO("sha3-224-generic"); 285 - MODULE_ALIAS_CRYPTO("sha3-256"); 286 - MODULE_ALIAS_CRYPTO("sha3-256-generic"); 287 - MODULE_ALIAS_CRYPTO("sha3-384"); 288 - MODULE_ALIAS_CRYPTO("sha3-384-generic"); 289 - MODULE_ALIAS_CRYPTO("sha3-512"); 290 - MODULE_ALIAS_CRYPTO("sha3-512-generic");
+13 -2
crypto/testmgr.c
··· 4332 4332 .fips_allowed = 1, 4333 4333 }, { 4334 4334 .alg = "blake2b-160", 4335 + .generic_driver = "blake2b-160-lib", 4335 4336 .test = alg_test_hash, 4336 4337 .fips_allowed = 0, 4337 4338 .suite = { ··· 4340 4339 } 4341 4340 }, { 4342 4341 .alg = "blake2b-256", 4342 + .generic_driver = "blake2b-256-lib", 4343 4343 .test = alg_test_hash, 4344 4344 .fips_allowed = 0, 4345 4345 .suite = { ··· 4348 4346 } 4349 4347 }, { 4350 4348 .alg = "blake2b-384", 4349 + .generic_driver = "blake2b-384-lib", 4351 4350 .test = alg_test_hash, 4352 4351 .fips_allowed = 0, 4353 4352 .suite = { ··· 4356 4353 } 4357 4354 }, { 4358 4355 .alg = "blake2b-512", 4356 + .generic_driver = "blake2b-512-lib", 4359 4357 .test = alg_test_hash, 4360 4358 .fips_allowed = 0, 4361 4359 .suite = { ··· 5059 5055 } 5060 5056 }, { 5061 5057 .alg = "hctr2(aes)", 5062 - .generic_driver = 5063 - "hctr2_base(xctr(aes-generic),polyval-generic)", 5058 + .generic_driver = "hctr2_base(xctr(aes-generic),polyval-lib)", 5064 5059 .test = alg_test_skcipher, 5065 5060 .suite = { 5066 5061 .cipher = __VECS(aes_hctr2_tv_template) ··· 5103 5100 } 5104 5101 }, { 5105 5102 .alg = "hmac(sha3-224)", 5103 + .generic_driver = "hmac(sha3-224-lib)", 5106 5104 .test = alg_test_hash, 5107 5105 .fips_allowed = 1, 5108 5106 .suite = { ··· 5111 5107 } 5112 5108 }, { 5113 5109 .alg = "hmac(sha3-256)", 5110 + .generic_driver = "hmac(sha3-256-lib)", 5114 5111 .test = alg_test_hash, 5115 5112 .fips_allowed = 1, 5116 5113 .suite = { ··· 5119 5114 } 5120 5115 }, { 5121 5116 .alg = "hmac(sha3-384)", 5117 + .generic_driver = "hmac(sha3-384-lib)", 5122 5118 .test = alg_test_hash, 5123 5119 .fips_allowed = 1, 5124 5120 .suite = { ··· 5127 5121 } 5128 5122 }, { 5129 5123 .alg = "hmac(sha3-512)", 5124 + .generic_driver = "hmac(sha3-512-lib)", 5130 5125 .test = alg_test_hash, 5131 5126 .fips_allowed = 1, 5132 5127 .suite = { ··· 5481 5474 } 5482 5475 }, { 5483 5476 .alg = "sha3-224", 5477 + .generic_driver = "sha3-224-lib", 5484 5478 .test = alg_test_hash, 5485 5479 .fips_allowed = 1, 5486 5480 .suite = { ··· 5489 5481 } 5490 5482 }, { 5491 5483 .alg = "sha3-256", 5484 + .generic_driver = "sha3-256-lib", 5492 5485 .test = alg_test_hash, 5493 5486 .fips_allowed = 1, 5494 5487 .suite = { ··· 5497 5488 } 5498 5489 }, { 5499 5490 .alg = "sha3-384", 5491 + .generic_driver = "sha3-384-lib", 5500 5492 .test = alg_test_hash, 5501 5493 .fips_allowed = 1, 5502 5494 .suite = { ··· 5505 5495 } 5506 5496 }, { 5507 5497 .alg = "sha3-512", 5498 + .generic_driver = "sha3-512-lib", 5508 5499 .test = alg_test_hash, 5509 5500 .fips_allowed = 1, 5510 5501 .suite = {
+3 -3
drivers/char/random.c
··· 636 636 }; 637 637 638 638 static struct { 639 - struct blake2s_state hash; 639 + struct blake2s_ctx hash; 640 640 spinlock_t lock; 641 641 unsigned int init_bits; 642 642 } input_pool = { ··· 701 701 702 702 /* next_key = HASHPRF(seed, RDSEED || 0) */ 703 703 block.counter = 0; 704 - blake2s(next_key, (u8 *)&block, seed, sizeof(next_key), sizeof(block), sizeof(seed)); 704 + blake2s(seed, sizeof(seed), (const u8 *)&block, sizeof(block), next_key, sizeof(next_key)); 705 705 blake2s_init_key(&input_pool.hash, BLAKE2S_HASH_SIZE, next_key, sizeof(next_key)); 706 706 707 707 spin_unlock_irqrestore(&input_pool.lock, flags); ··· 711 711 i = min_t(size_t, len, BLAKE2S_HASH_SIZE); 712 712 /* output = HASHPRF(seed, RDSEED || ++counter) */ 713 713 ++block.counter; 714 - blake2s(buf, (u8 *)&block, seed, i, sizeof(block), sizeof(seed)); 714 + blake2s(seed, sizeof(seed), (const u8 *)&block, sizeof(block), buf, i); 715 715 len -= i; 716 716 buf += i; 717 717 }
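The churn at these call sites is purely an argument reorder. As inferred from the removed and added lines (the prototypes themselves are outside this hunk), the blake2s() one-shot changed from output-first to key-first, matching the blake2b() helper added later in this diff:

    /* Old order, as inferred from the '-' lines above: */
    void blake2s(u8 *out, const u8 *in, const u8 *key,
                 size_t outlen, size_t inlen, size_t keylen);

    /* New order: key and input first, output last: */
    void blake2s(const u8 *key, size_t keylen, const u8 *in, size_t inlen,
                 u8 *out, size_t outlen);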
+9 -9
drivers/net/wireguard/cookie.c
··· 33 33 const u8 pubkey[NOISE_PUBLIC_KEY_LEN], 34 34 const u8 label[COOKIE_KEY_LABEL_LEN]) 35 35 { 36 - struct blake2s_state blake; 36 + struct blake2s_ctx blake; 37 37 38 38 blake2s_init(&blake, NOISE_SYMMETRIC_KEY_LEN); 39 39 blake2s_update(&blake, label, COOKIE_KEY_LABEL_LEN); ··· 77 77 { 78 78 len = len - sizeof(struct message_macs) + 79 79 offsetof(struct message_macs, mac1); 80 - blake2s(mac1, message, key, COOKIE_LEN, len, NOISE_SYMMETRIC_KEY_LEN); 80 + blake2s(key, NOISE_SYMMETRIC_KEY_LEN, message, len, mac1, COOKIE_LEN); 81 81 } 82 82 83 83 static void compute_mac2(u8 mac2[COOKIE_LEN], const void *message, size_t len, ··· 85 85 { 86 86 len = len - sizeof(struct message_macs) + 87 87 offsetof(struct message_macs, mac2); 88 - blake2s(mac2, message, cookie, COOKIE_LEN, len, COOKIE_LEN); 88 + blake2s(cookie, COOKIE_LEN, message, len, mac2, COOKIE_LEN); 89 89 } 90 90 91 91 static void make_cookie(u8 cookie[COOKIE_LEN], struct sk_buff *skb, 92 92 struct cookie_checker *checker) 93 93 { 94 - struct blake2s_state state; 94 + struct blake2s_ctx blake; 95 95 96 96 if (wg_birthdate_has_expired(checker->secret_birthdate, 97 97 COOKIE_SECRET_MAX_AGE)) { ··· 103 103 104 104 down_read(&checker->secret_lock); 105 105 106 - blake2s_init_key(&state, COOKIE_LEN, checker->secret, NOISE_HASH_LEN); 106 + blake2s_init_key(&blake, COOKIE_LEN, checker->secret, NOISE_HASH_LEN); 107 107 if (skb->protocol == htons(ETH_P_IP)) 108 - blake2s_update(&state, (u8 *)&ip_hdr(skb)->saddr, 108 + blake2s_update(&blake, (u8 *)&ip_hdr(skb)->saddr, 109 109 sizeof(struct in_addr)); 110 110 else if (skb->protocol == htons(ETH_P_IPV6)) 111 - blake2s_update(&state, (u8 *)&ipv6_hdr(skb)->saddr, 111 + blake2s_update(&blake, (u8 *)&ipv6_hdr(skb)->saddr, 112 112 sizeof(struct in6_addr)); 113 - blake2s_update(&state, (u8 *)&udp_hdr(skb)->source, sizeof(__be16)); 114 - blake2s_final(&state, cookie); 113 + blake2s_update(&blake, (u8 *)&udp_hdr(skb)->source, sizeof(__be16)); 114 + blake2s_final(&blake, cookie); 115 115 116 116 up_read(&checker->secret_lock); 117 117 }
+16 -16
drivers/net/wireguard/noise.c
··· 33 33 34 34 void __init wg_noise_init(void) 35 35 { 36 - struct blake2s_state blake; 36 + struct blake2s_ctx blake; 37 37 38 - blake2s(handshake_init_chaining_key, handshake_name, NULL, 39 - NOISE_HASH_LEN, sizeof(handshake_name), 0); 38 + blake2s(NULL, 0, handshake_name, sizeof(handshake_name), 39 + handshake_init_chaining_key, NOISE_HASH_LEN); 40 40 blake2s_init(&blake, NOISE_HASH_LEN); 41 41 blake2s_update(&blake, handshake_init_chaining_key, NOISE_HASH_LEN); 42 42 blake2s_update(&blake, identifier_name, sizeof(identifier_name)); ··· 304 304 305 305 static void hmac(u8 *out, const u8 *in, const u8 *key, const size_t inlen, const size_t keylen) 306 306 { 307 - struct blake2s_state state; 307 + struct blake2s_ctx blake; 308 308 u8 x_key[BLAKE2S_BLOCK_SIZE] __aligned(__alignof__(u32)) = { 0 }; 309 309 u8 i_hash[BLAKE2S_HASH_SIZE] __aligned(__alignof__(u32)); 310 310 int i; 311 311 312 312 if (keylen > BLAKE2S_BLOCK_SIZE) { 313 - blake2s_init(&state, BLAKE2S_HASH_SIZE); 314 - blake2s_update(&state, key, keylen); 315 - blake2s_final(&state, x_key); 313 + blake2s_init(&blake, BLAKE2S_HASH_SIZE); 314 + blake2s_update(&blake, key, keylen); 315 + blake2s_final(&blake, x_key); 316 316 } else 317 317 memcpy(x_key, key, keylen); 318 318 319 319 for (i = 0; i < BLAKE2S_BLOCK_SIZE; ++i) 320 320 x_key[i] ^= 0x36; 321 321 322 - blake2s_init(&state, BLAKE2S_HASH_SIZE); 323 - blake2s_update(&state, x_key, BLAKE2S_BLOCK_SIZE); 324 - blake2s_update(&state, in, inlen); 325 - blake2s_final(&state, i_hash); 322 + blake2s_init(&blake, BLAKE2S_HASH_SIZE); 323 + blake2s_update(&blake, x_key, BLAKE2S_BLOCK_SIZE); 324 + blake2s_update(&blake, in, inlen); 325 + blake2s_final(&blake, i_hash); 326 326 327 327 for (i = 0; i < BLAKE2S_BLOCK_SIZE; ++i) 328 328 x_key[i] ^= 0x5c ^ 0x36; 329 329 330 - blake2s_init(&state, BLAKE2S_HASH_SIZE); 331 - blake2s_update(&state, x_key, BLAKE2S_BLOCK_SIZE); 332 - blake2s_update(&state, i_hash, BLAKE2S_HASH_SIZE); 333 - blake2s_final(&state, i_hash); 330 + blake2s_init(&blake, BLAKE2S_HASH_SIZE); 331 + blake2s_update(&blake, x_key, BLAKE2S_BLOCK_SIZE); 332 + blake2s_update(&blake, i_hash, BLAKE2S_HASH_SIZE); 333 + blake2s_final(&blake, i_hash); 334 334 335 335 memcpy(out, i_hash, BLAKE2S_HASH_SIZE); 336 336 memzero_explicit(x_key, BLAKE2S_BLOCK_SIZE); ··· 431 431 432 432 static void mix_hash(u8 hash[NOISE_HASH_LEN], const u8 *src, size_t src_len) 433 433 { 434 - struct blake2s_state blake; 434 + struct blake2s_ctx blake; 435 435 436 436 blake2s_init(&blake, NOISE_HASH_LEN); 437 437 blake2s_update(&blake, hash, NOISE_HASH_LEN);
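The hmac() helper above is a by-hand HMAC-BLAKE2s, i.e. HMAC(K, m) = H((K' ^ opad) || H((K' ^ ipad) || m)) with ipad = 0x36 and opad = 0x5c, where K' is the key zero-padded (or pre-hashed, if longer than a block) to BLAKE2S_BLOCK_SIZE. The second XOR loop uses 0x5c ^ 0x36 so that a single in-place pass both undoes the earlier 0x36 and applies 0x5c.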
+121 -22
include/crypto/blake2b.h
··· 7 7 #include <linux/types.h> 8 8 #include <linux/string.h> 9 9 10 - struct blake2b_state { 11 - /* 'h', 't', and 'f' are used in assembly code, so keep them as-is. */ 12 - u64 h[8]; 13 - u64 t[2]; 14 - /* The true state ends here. The rest is temporary storage. */ 15 - u64 f[2]; 16 - }; 17 - 18 10 enum blake2b_lengths { 19 11 BLAKE2B_BLOCK_SIZE = 128, 20 12 BLAKE2B_HASH_SIZE = 64, 21 13 BLAKE2B_KEY_SIZE = 64, 22 - BLAKE2B_STATE_SIZE = offsetof(struct blake2b_state, f), 23 - BLAKE2B_DESC_SIZE = sizeof(struct blake2b_state), 24 14 25 15 BLAKE2B_160_HASH_SIZE = 20, 26 16 BLAKE2B_256_HASH_SIZE = 32, 27 17 BLAKE2B_384_HASH_SIZE = 48, 28 18 BLAKE2B_512_HASH_SIZE = 64, 19 + }; 20 + 21 + /** 22 + * struct blake2b_ctx - Context for hashing a message with BLAKE2b 23 + * @h: compression function state 24 + * @t: block counter 25 + * @f: finalization indicator 26 + * @buf: partial block buffer; 'buflen' bytes are valid 27 + * @buflen: number of bytes buffered in @buf 28 + * @outlen: length of output hash value in bytes, at most BLAKE2B_HASH_SIZE 29 + */ 30 + struct blake2b_ctx { 31 + /* 'h', 't', and 'f' are used in assembly code, so keep them as-is. */ 32 + u64 h[8]; 33 + u64 t[2]; 34 + u64 f[2]; 35 + u8 buf[BLAKE2B_BLOCK_SIZE]; 36 + unsigned int buflen; 37 + unsigned int outlen; 29 38 }; 30 39 31 40 enum blake2b_iv { ··· 48 39 BLAKE2B_IV7 = 0x5BE0CD19137E2179ULL, 49 40 }; 50 41 51 - static inline void __blake2b_init(struct blake2b_state *state, size_t outlen, 52 - size_t keylen) 42 + static inline void __blake2b_init(struct blake2b_ctx *ctx, size_t outlen, 43 + const void *key, size_t keylen) 53 44 { 54 - state->h[0] = BLAKE2B_IV0 ^ (0x01010000 | keylen << 8 | outlen); 55 - state->h[1] = BLAKE2B_IV1; 56 - state->h[2] = BLAKE2B_IV2; 57 - state->h[3] = BLAKE2B_IV3; 58 - state->h[4] = BLAKE2B_IV4; 59 - state->h[5] = BLAKE2B_IV5; 60 - state->h[6] = BLAKE2B_IV6; 61 - state->h[7] = BLAKE2B_IV7; 62 - state->t[0] = 0; 63 - state->t[1] = 0; 45 + ctx->h[0] = BLAKE2B_IV0 ^ (0x01010000 | keylen << 8 | outlen); 46 + ctx->h[1] = BLAKE2B_IV1; 47 + ctx->h[2] = BLAKE2B_IV2; 48 + ctx->h[3] = BLAKE2B_IV3; 49 + ctx->h[4] = BLAKE2B_IV4; 50 + ctx->h[5] = BLAKE2B_IV5; 51 + ctx->h[6] = BLAKE2B_IV6; 52 + ctx->h[7] = BLAKE2B_IV7; 53 + ctx->t[0] = 0; 54 + ctx->t[1] = 0; 55 + ctx->f[0] = 0; 56 + ctx->f[1] = 0; 57 + ctx->buflen = 0; 58 + ctx->outlen = outlen; 59 + if (keylen) { 60 + memcpy(ctx->buf, key, keylen); 61 + memset(&ctx->buf[keylen], 0, BLAKE2B_BLOCK_SIZE - keylen); 62 + ctx->buflen = BLAKE2B_BLOCK_SIZE; 63 + } 64 + } 65 + 66 + /** 67 + * blake2b_init() - Initialize a BLAKE2b context for a new message (unkeyed) 68 + * @ctx: the context to initialize 69 + * @outlen: length of output hash value in bytes, at most BLAKE2B_HASH_SIZE 70 + * 71 + * Context: Any context. 72 + */ 73 + static inline void blake2b_init(struct blake2b_ctx *ctx, size_t outlen) 74 + { 75 + __blake2b_init(ctx, outlen, NULL, 0); 76 + } 77 + 78 + /** 79 + * blake2b_init_key() - Initialize a BLAKE2b context for a new message (keyed) 80 + * @ctx: the context to initialize 81 + * @outlen: length of output hash value in bytes, at most BLAKE2B_HASH_SIZE 82 + * @key: the key 83 + * @keylen: the key length in bytes, at most BLAKE2B_KEY_SIZE 84 + * 85 + * Context: Any context. 
86 + */ 87 + static inline void blake2b_init_key(struct blake2b_ctx *ctx, size_t outlen, 88 + const void *key, size_t keylen) 89 + { 90 + WARN_ON(IS_ENABLED(DEBUG) && (!outlen || outlen > BLAKE2B_HASH_SIZE || 91 + !key || !keylen || keylen > BLAKE2B_KEY_SIZE)); 92 + 93 + __blake2b_init(ctx, outlen, key, keylen); 94 + } 95 + 96 + /** 97 + * blake2b_update() - Update a BLAKE2b context with message data 98 + * @ctx: the context to update; must have been initialized 99 + * @in: the message data 100 + * @inlen: the data length in bytes 101 + * 102 + * This can be called any number of times. 103 + * 104 + * Context: Any context. 105 + */ 106 + void blake2b_update(struct blake2b_ctx *ctx, const u8 *in, size_t inlen); 107 + 108 + /** 109 + * blake2b_final() - Finish computing a BLAKE2b hash 110 + * @ctx: the context to finalize; must have been initialized 111 + * @out: (output) the resulting BLAKE2b hash. Its length will be equal to the 112 + * @outlen that was passed to blake2b_init() or blake2b_init_key(). 113 + * 114 + * After finishing, this zeroizes @ctx. So the caller does not need to do it. 115 + * 116 + * Context: Any context. 117 + */ 118 + void blake2b_final(struct blake2b_ctx *ctx, u8 *out); 119 + 120 + /** 121 + * blake2b() - Compute BLAKE2b hash in one shot 122 + * @key: the key, or NULL for an unkeyed hash 123 + * @keylen: the key length in bytes (at most BLAKE2B_KEY_SIZE), or 0 for an 124 + * unkeyed hash 125 + * @in: the message data 126 + * @inlen: the data length in bytes 127 + * @out: (output) the resulting BLAKE2b hash, with length @outlen 128 + * @outlen: length of output hash value in bytes, at most BLAKE2B_HASH_SIZE 129 + * 130 + * Context: Any context. 131 + */ 132 + static inline void blake2b(const u8 *key, size_t keylen, 133 + const u8 *in, size_t inlen, 134 + u8 *out, size_t outlen) 135 + { 136 + struct blake2b_ctx ctx; 137 + 138 + WARN_ON(IS_ENABLED(DEBUG) && ((!in && inlen > 0) || !out || !outlen || 139 + outlen > BLAKE2B_HASH_SIZE || keylen > BLAKE2B_KEY_SIZE || 140 + (!key && keylen))); 141 + 142 + __blake2b_init(&ctx, outlen, key, keylen); 143 + blake2b_update(&ctx, in, inlen); 144 + blake2b_final(&ctx, out); 64 145 } 65 146 66 147 #endif /* _CRYPTO_BLAKE2B_H */
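The header above is the whole BLAKE2b library surface. A minimal usage sketch of it (blake2b_demo is a hypothetical caller; the key and digest length are illustrative, not from the patch):

    static void blake2b_demo(const u8 *msg, size_t msg_len,
                             const u8 key[BLAKE2B_KEY_SIZE])
    {
            struct blake2b_ctx ctx;
            u8 digest[BLAKE2B_256_HASH_SIZE];

            /* One-shot keyed hash. */
            blake2b(key, BLAKE2B_KEY_SIZE, msg, msg_len,
                    digest, sizeof(digest));

            /* Equivalent incremental computation. */
            blake2b_init_key(&ctx, sizeof(digest), key, BLAKE2B_KEY_SIZE);
            blake2b_update(&ctx, msg, msg_len);
            blake2b_final(&ctx, digest);    /* also zeroizes ctx */
    }

Both paths produce the same digest, and blake2b_final() zeroizes the context, so no explicit cleanup is needed.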
+91 -35
include/crypto/blake2s.h
··· 22 22 BLAKE2S_256_HASH_SIZE = 32, 23 23 }; 24 24 25 - struct blake2s_state { 25 + /** 26 + * struct blake2s_ctx - Context for hashing a message with BLAKE2s 27 + * @h: compression function state 28 + * @t: block counter 29 + * @f: finalization indicator 30 + * @buf: partial block buffer; 'buflen' bytes are valid 31 + * @buflen: number of bytes buffered in @buf 32 + * @outlen: length of output hash value in bytes, at most BLAKE2S_HASH_SIZE 33 + */ 34 + struct blake2s_ctx { 26 35 /* 'h', 't', and 'f' are used in assembly code, so keep them as-is. */ 27 36 u32 h[8]; 28 37 u32 t[2]; ··· 52 43 BLAKE2S_IV7 = 0x5BE0CD19UL, 53 44 }; 54 45 55 - static inline void __blake2s_init(struct blake2s_state *state, size_t outlen, 46 + static inline void __blake2s_init(struct blake2s_ctx *ctx, size_t outlen, 56 47 const void *key, size_t keylen) 57 48 { 58 - state->h[0] = BLAKE2S_IV0 ^ (0x01010000 | keylen << 8 | outlen); 59 - state->h[1] = BLAKE2S_IV1; 60 - state->h[2] = BLAKE2S_IV2; 61 - state->h[3] = BLAKE2S_IV3; 62 - state->h[4] = BLAKE2S_IV4; 63 - state->h[5] = BLAKE2S_IV5; 64 - state->h[6] = BLAKE2S_IV6; 65 - state->h[7] = BLAKE2S_IV7; 66 - state->t[0] = 0; 67 - state->t[1] = 0; 68 - state->f[0] = 0; 69 - state->f[1] = 0; 70 - state->buflen = 0; 71 - state->outlen = outlen; 49 + ctx->h[0] = BLAKE2S_IV0 ^ (0x01010000 | keylen << 8 | outlen); 50 + ctx->h[1] = BLAKE2S_IV1; 51 + ctx->h[2] = BLAKE2S_IV2; 52 + ctx->h[3] = BLAKE2S_IV3; 53 + ctx->h[4] = BLAKE2S_IV4; 54 + ctx->h[5] = BLAKE2S_IV5; 55 + ctx->h[6] = BLAKE2S_IV6; 56 + ctx->h[7] = BLAKE2S_IV7; 57 + ctx->t[0] = 0; 58 + ctx->t[1] = 0; 59 + ctx->f[0] = 0; 60 + ctx->f[1] = 0; 61 + ctx->buflen = 0; 62 + ctx->outlen = outlen; 72 63 if (keylen) { 73 - memcpy(state->buf, key, keylen); 74 - memset(&state->buf[keylen], 0, BLAKE2S_BLOCK_SIZE - keylen); 75 - state->buflen = BLAKE2S_BLOCK_SIZE; 64 + memcpy(ctx->buf, key, keylen); 65 + memset(&ctx->buf[keylen], 0, BLAKE2S_BLOCK_SIZE - keylen); 66 + ctx->buflen = BLAKE2S_BLOCK_SIZE; 76 67 } 77 68 } 78 69 79 - static inline void blake2s_init(struct blake2s_state *state, 80 - const size_t outlen) 70 + /** 71 + * blake2s_init() - Initialize a BLAKE2s context for a new message (unkeyed) 72 + * @ctx: the context to initialize 73 + * @outlen: length of output hash value in bytes, at most BLAKE2S_HASH_SIZE 74 + * 75 + * Context: Any context. 76 + */ 77 + static inline void blake2s_init(struct blake2s_ctx *ctx, size_t outlen) 81 78 { 82 - __blake2s_init(state, outlen, NULL, 0); 79 + __blake2s_init(ctx, outlen, NULL, 0); 83 80 } 84 81 85 - static inline void blake2s_init_key(struct blake2s_state *state, 86 - const size_t outlen, const void *key, 87 - const size_t keylen) 82 + /** 83 + * blake2s_init_key() - Initialize a BLAKE2s context for a new message (keyed) 84 + * @ctx: the context to initialize 85 + * @outlen: length of output hash value in bytes, at most BLAKE2S_HASH_SIZE 86 + * @key: the key 87 + * @keylen: the key length in bytes, at most BLAKE2S_KEY_SIZE 88 + * 89 + * Context: Any context. 
90 + */ 91 + static inline void blake2s_init_key(struct blake2s_ctx *ctx, size_t outlen, 92 + const void *key, size_t keylen) 88 93 { 89 94 WARN_ON(IS_ENABLED(DEBUG) && (!outlen || outlen > BLAKE2S_HASH_SIZE || 90 95 !key || !keylen || keylen > BLAKE2S_KEY_SIZE)); 91 96 92 - __blake2s_init(state, outlen, key, keylen); 97 + __blake2s_init(ctx, outlen, key, keylen); 93 98 } 94 99 95 - void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen); 96 - void blake2s_final(struct blake2s_state *state, u8 *out); 100 + /** 101 + * blake2s_update() - Update a BLAKE2s context with message data 102 + * @ctx: the context to update; must have been initialized 103 + * @in: the message data 104 + * @inlen: the data length in bytes 105 + * 106 + * This can be called any number of times. 107 + * 108 + * Context: Any context. 109 + */ 110 + void blake2s_update(struct blake2s_ctx *ctx, const u8 *in, size_t inlen); 97 111 98 - static inline void blake2s(u8 *out, const u8 *in, const u8 *key, 99 - const size_t outlen, const size_t inlen, 100 - const size_t keylen) 112 + /** 113 + * blake2s_final() - Finish computing a BLAKE2s hash 114 + * @ctx: the context to finalize; must have been initialized 115 + * @out: (output) the resulting BLAKE2s hash. Its length will be equal to the 116 + * @outlen that was passed to blake2s_init() or blake2s_init_key(). 117 + * 118 + * After finishing, this zeroizes @ctx. So the caller does not need to do it. 119 + * 120 + * Context: Any context. 121 + */ 122 + void blake2s_final(struct blake2s_ctx *ctx, u8 *out); 123 + 124 + /** 125 + * blake2s() - Compute BLAKE2s hash in one shot 126 + * @key: the key, or NULL for an unkeyed hash 127 + * @keylen: the key length in bytes (at most BLAKE2S_KEY_SIZE), or 0 for an 128 + * unkeyed hash 129 + * @in: the message data 130 + * @inlen: the data length in bytes 131 + * @out: (output) the resulting BLAKE2s hash, with length @outlen 132 + * @outlen: length of output hash value in bytes, at most BLAKE2S_HASH_SIZE 133 + * 134 + * Context: Any context. 135 + */ 136 + static inline void blake2s(const u8 *key, size_t keylen, 137 + const u8 *in, size_t inlen, 138 + u8 *out, size_t outlen) 101 139 { 102 - struct blake2s_state state; 140 + struct blake2s_ctx ctx; 103 141 104 142 WARN_ON(IS_ENABLED(DEBUG) && ((!in && inlen > 0) || !out || !outlen || 105 143 outlen > BLAKE2S_HASH_SIZE || keylen > BLAKE2S_KEY_SIZE || 106 144 (!key && keylen))); 107 145 108 - __blake2s_init(&state, outlen, key, keylen); 109 - blake2s_update(&state, in, inlen); 110 - blake2s_final(&state, out); 146 + __blake2s_init(&ctx, outlen, key, keylen); 147 + blake2s_update(&ctx, in, inlen); 148 + blake2s_final(&ctx, out); 111 149 } 112 150 113 151 #endif /* _CRYPTO_BLAKE2S_H */
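The blake2s() one-shot now takes (key, keylen, in, inlen, out, outlen), matching blake2b(); the WireGuard call sites earlier in this diff were updated to this order. A brief sketch of a keyed MAC with the new convention (compute_mac is a hypothetical caller):

    static void compute_mac(u8 mac[BLAKE2S_HASH_SIZE],
                            const u8 key[BLAKE2S_KEY_SIZE],
                            const u8 *data, size_t data_len)
    {
            /* Argument order: key, keylen, in, inlen, out, outlen. */
            blake2s(key, BLAKE2S_KEY_SIZE, data, data_len,
                    mac, BLAKE2S_HASH_SIZE);
    }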
-101
include/crypto/internal/blake2b.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 OR MIT */ 2 - /* 3 - * Helper functions for BLAKE2b implementations. 4 - * Keep this in sync with the corresponding BLAKE2s header. 5 - */ 6 - 7 - #ifndef _CRYPTO_INTERNAL_BLAKE2B_H 8 - #define _CRYPTO_INTERNAL_BLAKE2B_H 9 - 10 - #include <asm/byteorder.h> 11 - #include <crypto/blake2b.h> 12 - #include <crypto/internal/hash.h> 13 - #include <linux/array_size.h> 14 - #include <linux/compiler.h> 15 - #include <linux/build_bug.h> 16 - #include <linux/errno.h> 17 - #include <linux/math.h> 18 - #include <linux/string.h> 19 - #include <linux/types.h> 20 - 21 - static inline void blake2b_set_lastblock(struct blake2b_state *state) 22 - { 23 - state->f[0] = -1; 24 - state->f[1] = 0; 25 - } 26 - 27 - static inline void blake2b_set_nonlast(struct blake2b_state *state) 28 - { 29 - state->f[0] = 0; 30 - state->f[1] = 0; 31 - } 32 - 33 - typedef void (*blake2b_compress_t)(struct blake2b_state *state, 34 - const u8 *block, size_t nblocks, u32 inc); 35 - 36 - /* Helper functions for shash implementations of BLAKE2b */ 37 - 38 - struct blake2b_tfm_ctx { 39 - u8 key[BLAKE2B_BLOCK_SIZE]; 40 - unsigned int keylen; 41 - }; 42 - 43 - static inline int crypto_blake2b_setkey(struct crypto_shash *tfm, 44 - const u8 *key, unsigned int keylen) 45 - { 46 - struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(tfm); 47 - 48 - if (keylen > BLAKE2B_KEY_SIZE) 49 - return -EINVAL; 50 - 51 - BUILD_BUG_ON(BLAKE2B_KEY_SIZE > BLAKE2B_BLOCK_SIZE); 52 - 53 - memcpy(tctx->key, key, keylen); 54 - memset(tctx->key + keylen, 0, BLAKE2B_BLOCK_SIZE - keylen); 55 - tctx->keylen = keylen; 56 - 57 - return 0; 58 - } 59 - 60 - static inline int crypto_blake2b_init(struct shash_desc *desc) 61 - { 62 - const struct blake2b_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); 63 - struct blake2b_state *state = shash_desc_ctx(desc); 64 - unsigned int outlen = crypto_shash_digestsize(desc->tfm); 65 - 66 - __blake2b_init(state, outlen, tctx->keylen); 67 - return tctx->keylen ? 68 - crypto_shash_update(desc, tctx->key, BLAKE2B_BLOCK_SIZE) : 0; 69 - } 70 - 71 - static inline int crypto_blake2b_update_bo(struct shash_desc *desc, 72 - const u8 *in, unsigned int inlen, 73 - blake2b_compress_t compress) 74 - { 75 - struct blake2b_state *state = shash_desc_ctx(desc); 76 - 77 - blake2b_set_nonlast(state); 78 - compress(state, in, inlen / BLAKE2B_BLOCK_SIZE, BLAKE2B_BLOCK_SIZE); 79 - return inlen - round_down(inlen, BLAKE2B_BLOCK_SIZE); 80 - } 81 - 82 - static inline int crypto_blake2b_finup(struct shash_desc *desc, const u8 *in, 83 - unsigned int inlen, u8 *out, 84 - blake2b_compress_t compress) 85 - { 86 - struct blake2b_state *state = shash_desc_ctx(desc); 87 - u8 buf[BLAKE2B_BLOCK_SIZE]; 88 - int i; 89 - 90 - memcpy(buf, in, inlen); 91 - memset(buf + inlen, 0, BLAKE2B_BLOCK_SIZE - inlen); 92 - blake2b_set_lastblock(state); 93 - compress(state, buf, 1, inlen); 94 - for (i = 0; i < ARRAY_SIZE(state->h); i++) 95 - __cpu_to_le64s(&state->h[i]); 96 - memcpy(out, state->h, crypto_shash_digestsize(desc->tfm)); 97 - memzero_explicit(buf, sizeof(buf)); 98 - return 0; 99 - } 100 - 101 - #endif /* _CRYPTO_INTERNAL_BLAKE2B_H */
+179 -3
include/crypto/polyval.h
··· 1 - /* SPDX-License-Identifier: GPL-2.0 */ 1 + /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 2 /* 3 - * Common values for the Polyval hash algorithm 3 + * POLYVAL library API 4 4 * 5 - * Copyright 2021 Google LLC 5 + * Copyright 2025 Google LLC 6 6 */ 7 7 8 8 #ifndef _CRYPTO_POLYVAL_H 9 9 #define _CRYPTO_POLYVAL_H 10 10 11 + #include <linux/string.h> 12 + #include <linux/types.h> 13 + 11 14 #define POLYVAL_BLOCK_SIZE 16 12 15 #define POLYVAL_DIGEST_SIZE 16 13 16 17 + /** 18 + * struct polyval_elem - An element of the POLYVAL finite field 19 + * @bytes: View of the element as a byte array (unioned with @lo and @hi) 20 + * @lo: The low 64 terms of the element's polynomial 21 + * @hi: The high 64 terms of the element's polynomial 22 + * 23 + * This represents an element of the finite field GF(2^128), using the POLYVAL 24 + * convention: little-endian byte order and natural bit order. 25 + */ 26 + struct polyval_elem { 27 + union { 28 + u8 bytes[POLYVAL_BLOCK_SIZE]; 29 + struct { 30 + __le64 lo; 31 + __le64 hi; 32 + }; 33 + }; 34 + }; 35 + 36 + /** 37 + * struct polyval_key - Prepared key for POLYVAL 38 + * 39 + * This may contain just the raw key H, or it may contain precomputed key 40 + * powers, depending on the platform's POLYVAL implementation. Use 41 + * polyval_preparekey() to initialize this. 42 + * 43 + * By H^i we mean H^(i-1) * H * x^-128, with base case H^1 = H. I.e. the 44 + * exponentiation repeats the POLYVAL dot operation, with its "extra" x^-128. 45 + */ 46 + struct polyval_key { 47 + #ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH 48 + #ifdef CONFIG_ARM64 49 + /** @h_powers: Powers of the hash key H^8 through H^1 */ 50 + struct polyval_elem h_powers[8]; 51 + #elif defined(CONFIG_X86) 52 + /** @h_powers: Powers of the hash key H^8 through H^1 */ 53 + struct polyval_elem h_powers[8]; 54 + #else 55 + #error "Unhandled arch" 14 56 #endif 57 + #else /* CONFIG_CRYPTO_LIB_POLYVAL_ARCH */ 58 + /** @h: The hash key H */ 59 + struct polyval_elem h; 60 + #endif /* !CONFIG_CRYPTO_LIB_POLYVAL_ARCH */ 61 + }; 62 + 63 + /** 64 + * struct polyval_ctx - Context for computing a POLYVAL value 65 + * @key: Pointer to the prepared POLYVAL key. The user of the API is 66 + * responsible for ensuring that the key lives as long as the context. 67 + * @acc: The accumulator 68 + * @partial: Number of data bytes processed so far modulo POLYVAL_BLOCK_SIZE 69 + */ 70 + struct polyval_ctx { 71 + const struct polyval_key *key; 72 + struct polyval_elem acc; 73 + size_t partial; 74 + }; 75 + 76 + /** 77 + * polyval_preparekey() - Prepare a POLYVAL key 78 + * @key: (output) The key structure to initialize 79 + * @raw_key: The raw hash key 80 + * 81 + * Initialize a POLYVAL key structure from a raw key. This may be a simple 82 + * copy, or it may involve precomputing powers of the key, depending on the 83 + * platform's POLYVAL implementation. 84 + * 85 + * Context: Any context. 86 + */ 87 + #ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH 88 + void polyval_preparekey(struct polyval_key *key, 89 + const u8 raw_key[POLYVAL_BLOCK_SIZE]); 90 + 91 + #else 92 + static inline void polyval_preparekey(struct polyval_key *key, 93 + const u8 raw_key[POLYVAL_BLOCK_SIZE]) 94 + { 95 + /* Just a simple copy, so inline it. */ 96 + memcpy(key->h.bytes, raw_key, POLYVAL_BLOCK_SIZE); 97 + } 98 + #endif 99 + 100 + /** 101 + * polyval_init() - Initialize a POLYVAL context for a new message 102 + * @ctx: The context to initialize 103 + * @key: The key to use. 
Note that a pointer to the key is saved in the 104 + * context, so the key must live at least as long as the context. 105 + */ 106 + static inline void polyval_init(struct polyval_ctx *ctx, 107 + const struct polyval_key *key) 108 + { 109 + *ctx = (struct polyval_ctx){ .key = key }; 110 + } 111 + 112 + /** 113 + * polyval_import_blkaligned() - Import a POLYVAL accumulator value 114 + * @ctx: The context to initialize 115 + * @key: The key to use. Note that a pointer to the key is saved in the 116 + * context, so the key must live at least as long as the context. 117 + * @acc: The accumulator value to import. 118 + * 119 + * This imports an accumulator that was saved by polyval_export_blkaligned(). 120 + * The same key must be used. 121 + */ 122 + static inline void 123 + polyval_import_blkaligned(struct polyval_ctx *ctx, 124 + const struct polyval_key *key, 125 + const struct polyval_elem *acc) 126 + { 127 + *ctx = (struct polyval_ctx){ .key = key, .acc = *acc }; 128 + } 129 + 130 + /** 131 + * polyval_export_blkaligned() - Export a POLYVAL accumulator value 132 + * @ctx: The context to export the accumulator value from 133 + * @acc: (output) The exported accumulator value 134 + * 135 + * This exports the accumulator from a POLYVAL context. The number of data 136 + * bytes processed so far must be a multiple of POLYVAL_BLOCK_SIZE. 137 + */ 138 + static inline void polyval_export_blkaligned(const struct polyval_ctx *ctx, 139 + struct polyval_elem *acc) 140 + { 141 + *acc = ctx->acc; 142 + } 143 + 144 + /** 145 + * polyval_update() - Update a POLYVAL context with message data 146 + * @ctx: The context to update; must have been initialized 147 + * @data: The message data 148 + * @len: The data length in bytes. Doesn't need to be block-aligned. 149 + * 150 + * This can be called any number of times. 151 + * 152 + * Context: Any context. 153 + */ 154 + void polyval_update(struct polyval_ctx *ctx, const u8 *data, size_t len); 155 + 156 + /** 157 + * polyval_final() - Finish computing a POLYVAL value 158 + * @ctx: The context to finalize 159 + * @out: The output value 160 + * 161 + * If the total data length isn't a multiple of POLYVAL_BLOCK_SIZE, then the 162 + * final block is automatically zero-padded. 163 + * 164 + * After finishing, this zeroizes @ctx. So the caller does not need to do it. 165 + * 166 + * Context: Any context. 167 + */ 168 + void polyval_final(struct polyval_ctx *ctx, u8 out[POLYVAL_BLOCK_SIZE]); 169 + 170 + /** 171 + * polyval() - Compute a POLYVAL value 172 + * @key: The prepared key 173 + * @data: The message data 174 + * @len: The data length in bytes. Doesn't need to be block-aligned. 175 + * @out: The output value 176 + * 177 + * Context: Any context. 178 + */ 179 + static inline void polyval(const struct polyval_key *key, 180 + const u8 *data, size_t len, 181 + u8 out[POLYVAL_BLOCK_SIZE]) 182 + { 183 + struct polyval_ctx ctx; 184 + 185 + polyval_init(&ctx, key); 186 + polyval_update(&ctx, data, len); 187 + polyval_final(&ctx, out); 188 + } 189 + 190 + #endif /* _CRYPTO_POLYVAL_H */
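A minimal sketch of the library flow declared above (polyval_demo is hypothetical; real users such as HCTR2 derive the raw key from their mode of operation):

    static void polyval_demo(const u8 raw_key[POLYVAL_BLOCK_SIZE],
                             const u8 *data, size_t len,
                             u8 out[POLYVAL_BLOCK_SIZE])
    {
            struct polyval_key key;
            struct polyval_ctx ctx;

            polyval_preparekey(&key, raw_key);

            /* 'key' must outlive 'ctx': polyval_init() stores a pointer. */
            polyval_init(&ctx, &key);
            polyval_update(&ctx, data, len);  /* any length */
            polyval_final(&ctx, out);         /* zero-pads; zeroizes ctx */
    }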
+315 -5
include/crypto/sha3.h
··· 1 1 /* SPDX-License-Identifier: GPL-2.0 */ 2 2 /* 3 3 * Common values for SHA-3 algorithms 4 + * 5 + * See also Documentation/crypto/sha3.rst 4 6 */ 5 7 #ifndef __CRYPTO_SHA3_H__ 6 8 #define __CRYPTO_SHA3_H__ 7 9 8 10 #include <linux/types.h> 11 + #include <linux/string.h> 9 12 10 13 #define SHA3_224_DIGEST_SIZE (224 / 8) 11 14 #define SHA3_224_BLOCK_SIZE (200 - 2 * SHA3_224_DIGEST_SIZE) ··· 26 23 #define SHA3_512_BLOCK_SIZE (200 - 2 * SHA3_512_DIGEST_SIZE) 27 24 #define SHA3_512_EXPORT_SIZE SHA3_STATE_SIZE + SHA3_512_BLOCK_SIZE + 1 28 25 26 + /* 27 + * SHAKE128 and SHAKE256 actually have variable output size, but this is used to 28 + * calculate the block size (rate) analogously to the above. 29 + */ 30 + #define SHAKE128_DEFAULT_SIZE (128 / 8) 31 + #define SHAKE128_BLOCK_SIZE (200 - 2 * SHAKE128_DEFAULT_SIZE) 32 + #define SHAKE256_DEFAULT_SIZE (256 / 8) 33 + #define SHAKE256_BLOCK_SIZE (200 - 2 * SHAKE256_DEFAULT_SIZE) 34 + 29 35 #define SHA3_STATE_SIZE 200 30 36 31 - struct shash_desc; 32 - 37 + /* 38 + * State for the Keccak-f[1600] permutation: 25 64-bit words. 39 + * 40 + * We usually keep the state words as little-endian, to make absorbing and 41 + * squeezing easier. (It means that absorbing and squeezing can just treat the 42 + * state as a byte array.) The state words are converted to native-endian only 43 + * temporarily by implementations of the permutation that need native-endian 44 + * words. Of course, that conversion is a no-op on little-endian machines. 45 + */ 33 46 struct sha3_state { 34 - u64 st[SHA3_STATE_SIZE / 8]; 47 + union { 48 + __le64 words[SHA3_STATE_SIZE / 8]; 49 + u8 bytes[SHA3_STATE_SIZE]; 50 + 51 + u64 native_words[SHA3_STATE_SIZE / 8]; /* see comment above */ 52 + }; 35 53 }; 36 54 37 - int crypto_sha3_init(struct shash_desc *desc); 55 + /* Internal context, shared by the digests (SHA3-*) and the XOFs (SHAKE*) */ 56 + struct __sha3_ctx { 57 + struct sha3_state state; 58 + u8 digest_size; /* Digests only: the digest size in bytes */ 59 + u8 block_size; /* Block size in bytes */ 60 + u8 absorb_offset; /* Index of next state byte to absorb into */ 61 + u8 squeeze_offset; /* XOFs only: index of next state byte to extract */ 62 + }; 38 63 39 - #endif 64 + void __sha3_update(struct __sha3_ctx *ctx, const u8 *in, size_t in_len); 65 + 66 + /** 67 + * struct sha3_ctx - Context for SHA3-224, SHA3-256, SHA3-384, or SHA3-512 68 + * @ctx: private 69 + */ 70 + struct sha3_ctx { 71 + struct __sha3_ctx ctx; 72 + }; 73 + 74 + /** 75 + * sha3_zeroize_ctx() - Zeroize a SHA-3 context 76 + * @ctx: The context to zeroize 77 + * 78 + * This is already called by sha3_final(). Call this explicitly when abandoning 79 + * a context without calling sha3_final(). 80 + */ 81 + static inline void sha3_zeroize_ctx(struct sha3_ctx *ctx) 82 + { 83 + memzero_explicit(ctx, sizeof(*ctx)); 84 + } 85 + 86 + /** 87 + * struct shake_ctx - Context for SHAKE128 or SHAKE256 88 + * @ctx: private 89 + */ 90 + struct shake_ctx { 91 + struct __sha3_ctx ctx; 92 + }; 93 + 94 + /** 95 + * shake_zeroize_ctx() - Zeroize a SHAKE context 96 + * @ctx: The context to zeroize 97 + * 98 + * Call this after the last squeeze. 99 + */ 100 + static inline void shake_zeroize_ctx(struct shake_ctx *ctx) 101 + { 102 + memzero_explicit(ctx, sizeof(*ctx)); 103 + } 104 + 105 + /** 106 + * sha3_224_init() - Initialize a context for SHA3-224 107 + * @ctx: The context to initialize 108 + * 109 + * This begins a new SHA3-224 message digest computation. 110 + * 111 + * Context: Any context. 
112 + */ 113 + static inline void sha3_224_init(struct sha3_ctx *ctx) 114 + { 115 + *ctx = (struct sha3_ctx){ 116 + .ctx.digest_size = SHA3_224_DIGEST_SIZE, 117 + .ctx.block_size = SHA3_224_BLOCK_SIZE, 118 + }; 119 + } 120 + 121 + /** 122 + * sha3_256_init() - Initialize a context for SHA3-256 123 + * @ctx: The context to initialize 124 + * 125 + * This begins a new SHA3-256 message digest computation. 126 + * 127 + * Context: Any context. 128 + */ 129 + static inline void sha3_256_init(struct sha3_ctx *ctx) 130 + { 131 + *ctx = (struct sha3_ctx){ 132 + .ctx.digest_size = SHA3_256_DIGEST_SIZE, 133 + .ctx.block_size = SHA3_256_BLOCK_SIZE, 134 + }; 135 + } 136 + 137 + /** 138 + * sha3_384_init() - Initialize a context for SHA3-384 139 + * @ctx: The context to initialize 140 + * 141 + * This begins a new SHA3-384 message digest computation. 142 + * 143 + * Context: Any context. 144 + */ 145 + static inline void sha3_384_init(struct sha3_ctx *ctx) 146 + { 147 + *ctx = (struct sha3_ctx){ 148 + .ctx.digest_size = SHA3_384_DIGEST_SIZE, 149 + .ctx.block_size = SHA3_384_BLOCK_SIZE, 150 + }; 151 + } 152 + 153 + /** 154 + * sha3_512_init() - Initialize a context for SHA3-512 155 + * @ctx: The context to initialize 156 + * 157 + * This begins a new SHA3-512 message digest computation. 158 + * 159 + * Context: Any context. 160 + */ 161 + static inline void sha3_512_init(struct sha3_ctx *ctx) 162 + { 163 + *ctx = (struct sha3_ctx){ 164 + .ctx.digest_size = SHA3_512_DIGEST_SIZE, 165 + .ctx.block_size = SHA3_512_BLOCK_SIZE, 166 + }; 167 + } 168 + 169 + /** 170 + * sha3_update() - Update a SHA-3 digest context with input data 171 + * @ctx: The context to update; must have been initialized 172 + * @in: The input data 173 + * @in_len: Length of the input data in bytes 174 + * 175 + * This can be called any number of times to add data to a SHA3-224, SHA3-256, 176 + * SHA3-384, or SHA3-512 digest (depending on which init function was called). 177 + * 178 + * Context: Any context. 179 + */ 180 + static inline void sha3_update(struct sha3_ctx *ctx, 181 + const u8 *in, size_t in_len) 182 + { 183 + __sha3_update(&ctx->ctx, in, in_len); 184 + } 185 + 186 + /** 187 + * sha3_final() - Finish computing a SHA-3 message digest 188 + * @ctx: The context to finalize; must have been initialized 189 + * @out: (output) The resulting SHA3-224, SHA3-256, SHA3-384, or SHA3-512 190 + * message digest, matching the init function that was called. Note that 191 + * the size differs for each one; see SHA3_*_DIGEST_SIZE. 192 + * 193 + * After finishing, this zeroizes @ctx. So the caller does not need to do it. 194 + * 195 + * Context: Any context. 196 + */ 197 + void sha3_final(struct sha3_ctx *ctx, u8 *out); 198 + 199 + /** 200 + * shake128_init() - Initialize a context for SHAKE128 201 + * @ctx: The context to initialize 202 + * 203 + * This begins a new SHAKE128 extendable-output function (XOF) computation. 204 + * 205 + * Context: Any context. 206 + */ 207 + static inline void shake128_init(struct shake_ctx *ctx) 208 + { 209 + *ctx = (struct shake_ctx){ 210 + .ctx.block_size = SHAKE128_BLOCK_SIZE, 211 + }; 212 + } 213 + 214 + /** 215 + * shake256_init() - Initialize a context for SHAKE256 216 + * @ctx: The context to initialize 217 + * 218 + * This begins a new SHAKE256 extendable-output function (XOF) computation. 219 + * 220 + * Context: Any context. 
221 + */ 222 + static inline void shake256_init(struct shake_ctx *ctx) 223 + { 224 + *ctx = (struct shake_ctx){ 225 + .ctx.block_size = SHAKE256_BLOCK_SIZE, 226 + }; 227 + } 228 + 229 + /** 230 + * shake_update() - Update a SHAKE context with input data 231 + * @ctx: The context to update; must have been initialized 232 + * @in: The input data 233 + * @in_len: Length of the input data in bytes 234 + * 235 + * This can be called any number of times to add more input data to SHAKE128 or 236 + * SHAKE256. This cannot be called after squeezing has begun. 237 + * 238 + * Context: Any context. 239 + */ 240 + static inline void shake_update(struct shake_ctx *ctx, 241 + const u8 *in, size_t in_len) 242 + { 243 + __sha3_update(&ctx->ctx, in, in_len); 244 + } 245 + 246 + /** 247 + * shake_squeeze() - Generate output from SHAKE128 or SHAKE256 248 + * @ctx: The context to squeeze; must have been initialized 249 + * @out: Where to write the resulting output data 250 + * @out_len: The amount of data to extract to @out in bytes 251 + * 252 + * This may be called multiple times. A number of consecutive squeezes laid 253 + * end-to-end will yield the same output as one big squeeze generating the same 254 + * total amount of output. More input cannot be provided after squeezing has 255 + * begun. After the last squeeze, call shake_zeroize_ctx(). 256 + * 257 + * Context: Any context. 258 + */ 259 + void shake_squeeze(struct shake_ctx *ctx, u8 *out, size_t out_len); 260 + 261 + /** 262 + * sha3_224() - Compute SHA3-224 digest in one shot 263 + * @in: The input data to be digested 264 + * @in_len: Length of the input data in bytes 265 + * @out: The buffer into which the digest will be stored 266 + * 267 + * Convenience function that computes a SHA3-224 digest. Use this instead of 268 + * the incremental API if you're able to provide all the input at once. 269 + * 270 + * Context: Any context. 271 + */ 272 + void sha3_224(const u8 *in, size_t in_len, u8 out[SHA3_224_DIGEST_SIZE]); 273 + 274 + /** 275 + * sha3_256() - Compute SHA3-256 digest in one shot 276 + * @in: The input data to be digested 277 + * @in_len: Length of the input data in bytes 278 + * @out: The buffer into which the digest will be stored 279 + * 280 + * Convenience function that computes a SHA3-256 digest. Use this instead of 281 + * the incremental API if you're able to provide all the input at once. 282 + * 283 + * Context: Any context. 284 + */ 285 + void sha3_256(const u8 *in, size_t in_len, u8 out[SHA3_256_DIGEST_SIZE]); 286 + 287 + /** 288 + * sha3_384() - Compute SHA3-384 digest in one shot 289 + * @in: The input data to be digested 290 + * @in_len: Length of the input data in bytes 291 + * @out: The buffer into which the digest will be stored 292 + * 293 + * Convenience function that computes a SHA3-384 digest. Use this instead of 294 + * the incremental API if you're able to provide all the input at once. 295 + * 296 + * Context: Any context. 297 + */ 298 + void sha3_384(const u8 *in, size_t in_len, u8 out[SHA3_384_DIGEST_SIZE]); 299 + 300 + /** 301 + * sha3_512() - Compute SHA3-512 digest in one shot 302 + * @in: The input data to be digested 303 + * @in_len: Length of the input data in bytes 304 + * @out: The buffer into which the digest will be stored 305 + * 306 + * Convenience function that computes a SHA3-512 digest. Use this instead of 307 + * the incremental API if you're able to provide all the input at once. 308 + * 309 + * Context: Any context. 
310 + */ 311 + void sha3_512(const u8 *in, size_t in_len, u8 out[SHA3_512_DIGEST_SIZE]); 312 + 313 + /** 314 + * shake128() - Compute SHAKE128 in one shot 315 + * @in: The input data to be used 316 + * @in_len: Length of the input data in bytes 317 + * @out: The buffer into which the output will be stored 318 + * @out_len: Length of the output to produce in bytes 319 + * 320 + * Convenience function that computes SHAKE128 in one shot. Use this instead of 321 + * the incremental API if you're able to provide all the input at once as well 322 + * as receive all the output at once. All output lengths are supported. 323 + * 324 + * Context: Any context. 325 + */ 326 + void shake128(const u8 *in, size_t in_len, u8 *out, size_t out_len); 327 + 328 + /** 329 + * shake256() - Compute SHAKE256 in one shot 330 + * @in: The input data to be used 331 + * @in_len: Length of the input data in bytes 332 + * @out: The buffer into which the output will be stored 333 + * @out_len: Length of the output to produce in bytes 334 + * 335 + * Convenience function that computes SHAKE256 in one shot. Use this instead of 336 + * the incremental API if you're able to provide all the input at once as well 337 + * as receive all the output at once. All output lengths are supported. 338 + * 339 + * Context: Any context. 340 + */ 341 + void shake256(const u8 *in, size_t in_len, u8 *out, size_t out_len); 342 + 343 + #endif /* __CRYPTO_SHA3_H__ */
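A brief sketch of the incremental SHAKE flow declared above (shake256_demo is hypothetical; the 32-byte squeeze sizes are arbitrary):

    static void shake256_demo(const u8 *in, size_t in_len,
                              u8 out1[32], u8 out2[32])
    {
            struct shake_ctx ctx;

            shake256_init(&ctx);
            shake_update(&ctx, in, in_len);
            /* Two squeezes laid end-to-end equal one 64-byte squeeze. */
            shake_squeeze(&ctx, out1, 32);
            shake_squeeze(&ctx, out2, 32);
            shake_zeroize_ctx(&ctx);    /* required after the last squeeze */
    }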
+16
include/linux/byteorder/generic.h
··· 173 173 } 174 174 } 175 175 176 + static inline void le64_to_cpu_array(u64 *buf, unsigned int words) 177 + { 178 + while (words--) { 179 + __le64_to_cpus(buf); 180 + buf++; 181 + } 182 + } 183 + 184 + static inline void cpu_to_le64_array(u64 *buf, unsigned int words) 185 + { 186 + while (words--) { 187 + __cpu_to_le64s(buf); 188 + buf++; 189 + } 190 + } 191 + 176 192 static inline void memcpy_from_le32(u32 *dst, const __le32 *src, size_t words) 177 193 { 178 194 size_t i;
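These 64-bit helpers mirror the 32-bit array converters above them; the new BLAKE2b code uses them to decode message blocks and encode the final state. A trivial sketch (load_block is a hypothetical helper, not part of the patch):

    static void load_block(u64 m[16], const u8 block[128])
    {
            /* Little-endian block to native-endian words; byte-wise
             * this is a no-op on little-endian machines. */
            memcpy(m, block, 16 * sizeof(u64));
            le64_to_cpu_array(m, 16);
    }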
+36
lib/crypto/Kconfig
··· 28 28 config CRYPTO_LIB_GF128MUL 29 29 tristate 30 30 31 + config CRYPTO_LIB_BLAKE2B 32 + tristate 33 + help 34 + The BLAKE2b library functions. Select this if your module uses any of 35 + the functions from <crypto/blake2b.h>. 36 + 37 + config CRYPTO_LIB_BLAKE2B_ARCH 38 + bool 39 + depends on CRYPTO_LIB_BLAKE2B && !UML 40 + default y if ARM && KERNEL_MODE_NEON 41 + 31 42 # BLAKE2s support is always built-in, so there's no CRYPTO_LIB_BLAKE2S option. 32 43 33 44 config CRYPTO_LIB_BLAKE2S_ARCH ··· 135 124 default 9 if ARM || ARM64 136 125 default 1 137 126 127 + config CRYPTO_LIB_POLYVAL 128 + tristate 129 + help 130 + The POLYVAL library functions. Select this if your module uses any of 131 + the functions from <crypto/polyval.h>. 132 + 133 + config CRYPTO_LIB_POLYVAL_ARCH 134 + bool 135 + depends on CRYPTO_LIB_POLYVAL && !UML 136 + default y if ARM64 && KERNEL_MODE_NEON 137 + default y if X86_64 138 + 138 139 config CRYPTO_LIB_CHACHA20POLY1305 139 140 tristate 140 141 select CRYPTO_LIB_CHACHA ··· 206 183 default y if S390 207 184 default y if SPARC64 208 185 default y if X86_64 186 + 187 + config CRYPTO_LIB_SHA3 188 + tristate 189 + select CRYPTO_LIB_UTILS 190 + help 191 + The SHA3 library functions. Select this if your module uses any of 192 + the functions from <crypto/sha3.h>. 193 + 194 + config CRYPTO_LIB_SHA3_ARCH 195 + bool 196 + depends on CRYPTO_LIB_SHA3 && !UML 197 + default y if ARM64 && KERNEL_MODE_NEON 198 + default y if S390 209 199 210 200 config CRYPTO_LIB_SM3 211 201 tristate
+30
lib/crypto/Makefile
··· 31 31 32 32 ################################################################################ 33 33 34 + obj-$(CONFIG_CRYPTO_LIB_BLAKE2B) += libblake2b.o 35 + libblake2b-y := blake2b.o 36 + CFLAGS_blake2b.o := -Wframe-larger-than=4096 # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105930 37 + ifeq ($(CONFIG_CRYPTO_LIB_BLAKE2B_ARCH),y) 38 + CFLAGS_blake2b.o += -I$(src)/$(SRCARCH) 39 + libblake2b-$(CONFIG_ARM) += arm/blake2b-neon-core.o 40 + endif # CONFIG_CRYPTO_LIB_BLAKE2B_ARCH 41 + 42 + ################################################################################ 43 + 34 44 # blake2s is used by the /dev/random driver which is always builtin 35 45 obj-y += blake2s.o 36 46 ifeq ($(CONFIG_CRYPTO_LIB_BLAKE2S_ARCH),y) ··· 198 188 199 189 ################################################################################ 200 190 191 + obj-$(CONFIG_CRYPTO_LIB_POLYVAL) += libpolyval.o 192 + libpolyval-y := polyval.o 193 + ifeq ($(CONFIG_CRYPTO_LIB_POLYVAL_ARCH),y) 194 + CFLAGS_polyval.o += -I$(src)/$(SRCARCH) 195 + libpolyval-$(CONFIG_ARM64) += arm64/polyval-ce-core.o 196 + libpolyval-$(CONFIG_X86) += x86/polyval-pclmul-avx.o 197 + endif 198 + 199 + ################################################################################ 200 + 201 201 obj-$(CONFIG_CRYPTO_LIB_SHA1) += libsha1.o 202 202 libsha1-y := sha1.o 203 203 ifeq ($(CONFIG_CRYPTO_LIB_SHA1_ARCH),y) ··· 285 265 x86/sha512-avx-asm.o \ 286 266 x86/sha512-avx2-asm.o 287 267 endif # CONFIG_CRYPTO_LIB_SHA512_ARCH 268 + 269 + ################################################################################ 270 + 271 + obj-$(CONFIG_CRYPTO_LIB_SHA3) += libsha3.o 272 + libsha3-y := sha3.o 273 + 274 + ifeq ($(CONFIG_CRYPTO_LIB_SHA3_ARCH),y) 275 + CFLAGS_sha3.o += -I$(src)/$(SRCARCH) 276 + libsha3-$(CONFIG_ARM64) += arm64/sha3-ce-core.o 277 + endif # CONFIG_CRYPTO_LIB_SHA3_ARCH 288 278 289 279 ################################################################################ 290 280
+41
lib/crypto/arm/blake2b.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 + /* 3 + * BLAKE2b digest algorithm, NEON accelerated 4 + * 5 + * Copyright 2020 Google LLC 6 + */ 7 + 8 + #include <asm/neon.h> 9 + #include <asm/simd.h> 10 + 11 + static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_neon); 12 + 13 + asmlinkage void blake2b_compress_neon(struct blake2b_ctx *ctx, 14 + const u8 *data, size_t nblocks, u32 inc); 15 + 16 + static void blake2b_compress(struct blake2b_ctx *ctx, 17 + const u8 *data, size_t nblocks, u32 inc) 18 + { 19 + if (!static_branch_likely(&have_neon) || !may_use_simd()) { 20 + blake2b_compress_generic(ctx, data, nblocks, inc); 21 + return; 22 + } 23 + do { 24 + const size_t blocks = min_t(size_t, nblocks, 25 + SZ_4K / BLAKE2B_BLOCK_SIZE); 26 + 27 + kernel_neon_begin(); 28 + blake2b_compress_neon(ctx, data, blocks, inc); 29 + kernel_neon_end(); 30 + 31 + data += blocks * BLAKE2B_BLOCK_SIZE; 32 + nblocks -= blocks; 33 + } while (nblocks); 34 + } 35 + 36 + #define blake2b_mod_init_arch blake2b_mod_init_arch 37 + static void blake2b_mod_init_arch(void) 38 + { 39 + if (elf_hwcap & HWCAP_NEON) 40 + static_branch_enable(&have_neon); 41 + }
+11 -11
lib/crypto/arm/blake2s-core.S
··· 115 115 116 116 // Execute one round of BLAKE2s by updating the state matrix v[0..15]. v[0..9] 117 117 // are in r0..r9. The stack pointer points to 8 bytes of scratch space for 118 - // spilling v[8..9], then to v[9..15], then to the message block. r10-r12 and 118 + // spilling v[8..9], then to v[10..15], then to the message block. r10-r12 and 119 119 // r14 are free to use. The macro arguments s0-s15 give the order in which the 120 120 // message words are used in this round. 121 121 // ··· 170 170 .endm 171 171 172 172 // 173 - // void blake2s_compress(struct blake2s_state *state, 174 - // const u8 *block, size_t nblocks, u32 inc); 173 + // void blake2s_compress(struct blake2s_ctx *ctx, 174 + // const u8 *data, size_t nblocks, u32 inc); 175 175 // 176 - // Only the first three fields of struct blake2s_state are used: 176 + // Only the first three fields of struct blake2s_ctx are used: 177 177 // u32 h[8]; (inout) 178 178 // u32 t[2]; (inout) 179 179 // u32 f[2]; (in) ··· 183 183 push {r0-r2,r4-r11,lr} // keep this an even number 184 184 185 185 .Lnext_block: 186 - // r0 is 'state' 187 - // r1 is 'block' 186 + // r0 is 'ctx' 187 + // r1 is 'data' 188 188 // r3 is 'inc' 189 189 190 190 // Load and increment the counter t[0..1]. ··· 209 209 .Lcopy_block_done: 210 210 str r1, [sp, #68] // Update message pointer 211 211 212 - // Calculate v[8..15]. Push v[9..15] onto the stack, and leave space 212 + // Calculate v[8..15]. Push v[10..15] onto the stack, and leave space 213 213 // for spilling v[8..9]. Leave v[8..9] in r8-r9. 214 - mov r14, r0 // r14 = state 214 + mov r14, r0 // r14 = ctx 215 215 adr r12, .Lblake2s_IV 216 216 ldmia r12!, {r8-r9} // load IV[0..1] 217 217 __ldrd r0, r1, r14, 40 // load f[0..1] 218 - ldm r12, {r2-r7} // load IV[3..7] 218 + ldm r12, {r2-r7} // load IV[2..7] 219 219 eor r4, r4, r10 // v[12] = IV[4] ^ t[0] 220 220 eor r5, r5, r11 // v[13] = IV[5] ^ t[1] 221 221 eor r6, r6, r0 // v[14] = IV[6] ^ f[0] 222 222 eor r7, r7, r1 // v[15] = IV[7] ^ f[1] 223 - push {r2-r7} // push v[9..15] 223 + push {r2-r7} // push v[10..15] 224 224 sub sp, sp, #8 // leave space for v[8..9] 225 225 226 226 // Load h[0..7] == v[0..7]. ··· 275 275 // Advance to the next block, if there is one. Note that if there are 276 276 // multiple blocks, then 'inc' (the counter increment amount) must be 277 277 // 64. So we can simply set it to 64 without re-loading it. 278 - ldm sp, {r0, r1, r2} // load (state, block, nblocks) 278 + ldm sp, {r0, r1, r2} // load (ctx, data, nblocks) 279 279 mov r3, #64 // set 'inc' 280 280 subs r2, r2, #1 // nblocks-- 281 281 str r2, [sp, #8]
+2 -2
lib/crypto/arm/blake2s.h
··· 1 1 /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 2 3 3 /* defined in blake2s-core.S */ 4 - void blake2s_compress(struct blake2s_state *state, const u8 *block, 5 - size_t nblocks, u32 inc); 4 + void blake2s_compress(struct blake2s_ctx *ctx, 5 + const u8 *data, size_t nblocks, u32 inc);
+1 -1
lib/crypto/arm/sha1-armv7-neon.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 - /* sha1-armv7-neon.S - ARM/NEON accelerated SHA-1 transform function 2 + /* ARM/NEON accelerated SHA-1 transform function 3 3 * 4 4 * Copyright © 2013-2014 Jussi Kivilinna <jussi.kivilinna@iki.fi> 5 5 */
+1 -1
lib/crypto/arm/sha1-ce-core.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0-only */ 2 2 /* 3 - * sha1-ce-core.S - SHA-1 secure hash using ARMv8 Crypto Extensions 3 + * SHA-1 secure hash using ARMv8 Crypto Extensions 4 4 * 5 5 * Copyright (C) 2015 Linaro Ltd. 6 6 * Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
+1 -1
lib/crypto/arm/sha256-ce.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0-only */ 2 2 /* 3 - * sha256-ce.S - SHA-224/256 secure hash using ARMv8 Crypto Extensions 3 + * SHA-224/256 secure hash using ARMv8 Crypto Extensions 4 4 * 5 5 * Copyright (C) 2015 Linaro Ltd. 6 6 * Author: Ard Biesheuvel <ard.biesheuvel@linaro.org>
+82
lib/crypto/arm64/polyval.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 + /* 3 + * POLYVAL library functions, arm64 optimized 4 + * 5 + * Copyright 2025 Google LLC 6 + */ 7 + #include <asm/neon.h> 8 + #include <asm/simd.h> 9 + #include <linux/cpufeature.h> 10 + 11 + #define NUM_H_POWERS 8 12 + 13 + static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_pmull); 14 + 15 + asmlinkage void polyval_mul_pmull(struct polyval_elem *a, 16 + const struct polyval_elem *b); 17 + asmlinkage void polyval_blocks_pmull(struct polyval_elem *acc, 18 + const struct polyval_key *key, 19 + const u8 *data, size_t nblocks); 20 + 21 + static void polyval_preparekey_arch(struct polyval_key *key, 22 + const u8 raw_key[POLYVAL_BLOCK_SIZE]) 23 + { 24 + static_assert(ARRAY_SIZE(key->h_powers) == NUM_H_POWERS); 25 + memcpy(&key->h_powers[NUM_H_POWERS - 1], raw_key, POLYVAL_BLOCK_SIZE); 26 + if (static_branch_likely(&have_pmull) && may_use_simd()) { 27 + kernel_neon_begin(); 28 + for (int i = NUM_H_POWERS - 2; i >= 0; i--) { 29 + key->h_powers[i] = key->h_powers[i + 1]; 30 + polyval_mul_pmull(&key->h_powers[i], 31 + &key->h_powers[NUM_H_POWERS - 1]); 32 + } 33 + kernel_neon_end(); 34 + } else { 35 + for (int i = NUM_H_POWERS - 2; i >= 0; i--) { 36 + key->h_powers[i] = key->h_powers[i + 1]; 37 + polyval_mul_generic(&key->h_powers[i], 38 + &key->h_powers[NUM_H_POWERS - 1]); 39 + } 40 + } 41 + } 42 + 43 + static void polyval_mul_arch(struct polyval_elem *acc, 44 + const struct polyval_key *key) 45 + { 46 + if (static_branch_likely(&have_pmull) && may_use_simd()) { 47 + kernel_neon_begin(); 48 + polyval_mul_pmull(acc, &key->h_powers[NUM_H_POWERS - 1]); 49 + kernel_neon_end(); 50 + } else { 51 + polyval_mul_generic(acc, &key->h_powers[NUM_H_POWERS - 1]); 52 + } 53 + } 54 + 55 + static void polyval_blocks_arch(struct polyval_elem *acc, 56 + const struct polyval_key *key, 57 + const u8 *data, size_t nblocks) 58 + { 59 + if (static_branch_likely(&have_pmull) && may_use_simd()) { 60 + do { 61 + /* Allow rescheduling every 4 KiB. */ 62 + size_t n = min_t(size_t, nblocks, 63 + 4096 / POLYVAL_BLOCK_SIZE); 64 + 65 + kernel_neon_begin(); 66 + polyval_blocks_pmull(acc, key, data, n); 67 + kernel_neon_end(); 68 + data += n * POLYVAL_BLOCK_SIZE; 69 + nblocks -= n; 70 + } while (nblocks); 71 + } else { 72 + polyval_blocks_generic(acc, &key->h_powers[NUM_H_POWERS - 1], 73 + data, nblocks); 74 + } 75 + } 76 + 77 + #define polyval_mod_init_arch polyval_mod_init_arch 78 + static void polyval_mod_init_arch(void) 79 + { 80 + if (cpu_have_named_feature(PMULL)) 81 + static_branch_enable(&have_pmull); 82 + }
+1 -1
lib/crypto/arm64/sha1-ce-core.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0-only */ 2 2 /* 3 - * sha1-ce-core.S - SHA-1 secure hash using ARMv8 Crypto Extensions 3 + * SHA-1 secure hash using ARMv8 Crypto Extensions 4 4 * 5 5 * Copyright (C) 2014 Linaro Ltd <ard.biesheuvel@linaro.org> 6 6 */
+1 -1
lib/crypto/arm64/sha256-ce.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0-only */ 2 2 /* 3 - * sha2-ce-core.S - core SHA-224/SHA-256 transform using v8 Crypto Extensions 3 + * Core SHA-224/SHA-256 transform using v8 Crypto Extensions 4 4 * 5 5 * Copyright (C) 2014 Linaro Ltd <ard.biesheuvel@linaro.org> 6 6 */
+62
lib/crypto/arm64/sha3.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0 */ 2 + /* 3 + * Copyright (C) 2018 Linaro Ltd <ard.biesheuvel@linaro.org> 4 + * 5 + * This program is free software; you can redistribute it and/or modify 6 + * it under the terms of the GNU General Public License version 2 as 7 + * published by the Free Software Foundation. 8 + */ 9 + 10 + #include <asm/neon.h> 11 + #include <asm/simd.h> 12 + #include <linux/cpufeature.h> 13 + 14 + static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_sha3); 15 + 16 + asmlinkage size_t sha3_ce_transform(struct sha3_state *state, const u8 *data, 17 + size_t nblocks, size_t block_size); 18 + 19 + static void sha3_absorb_blocks(struct sha3_state *state, const u8 *data, 20 + size_t nblocks, size_t block_size) 21 + { 22 + if (static_branch_likely(&have_sha3) && likely(may_use_simd())) { 23 + do { 24 + size_t rem; 25 + 26 + kernel_neon_begin(); 27 + rem = sha3_ce_transform(state, data, nblocks, 28 + block_size); 29 + kernel_neon_end(); 30 + data += (nblocks - rem) * block_size; 31 + nblocks = rem; 32 + } while (nblocks); 33 + } else { 34 + sha3_absorb_blocks_generic(state, data, nblocks, block_size); 35 + } 36 + } 37 + 38 + static void sha3_keccakf(struct sha3_state *state) 39 + { 40 + if (static_branch_likely(&have_sha3) && likely(may_use_simd())) { 41 + /* 42 + * Passing zeroes into sha3_ce_transform() gives the plain 43 + * Keccak-f permutation, which is what we want here. Any 44 + * supported block size may be used. Use SHA3_512_BLOCK_SIZE 45 + * since it's the shortest. 46 + */ 47 + static const u8 zeroes[SHA3_512_BLOCK_SIZE]; 48 + 49 + kernel_neon_begin(); 50 + sha3_ce_transform(state, zeroes, 1, sizeof(zeroes)); 51 + kernel_neon_end(); 52 + } else { 53 + sha3_keccakf_generic(state); 54 + } 55 + } 56 + 57 + #define sha3_mod_init_arch sha3_mod_init_arch 58 + static void sha3_mod_init_arch(void) 59 + { 60 + if (cpu_have_named_feature(SHA3)) 61 + static_branch_enable(&have_sha3); 62 + }
+1 -1
lib/crypto/arm64/sha512-ce-core.S
··· 1 1 /* SPDX-License-Identifier: GPL-2.0 */ 2 2 /* 3 - * sha512-ce-core.S - core SHA-384/SHA-512 transform using v8 Crypto Extensions 3 + * Core SHA-384/SHA-512 transform using v8 Crypto Extensions 4 4 * 5 5 * Copyright (C) 2018 Linaro Ltd <ard.biesheuvel@linaro.org> 6 6 *
+174
lib/crypto/blake2b.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 OR MIT 2 + /* 3 + * Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. 4 + * Copyright 2025 Google LLC 5 + * 6 + * This is an implementation of the BLAKE2b hash and PRF functions. 7 + * 8 + * Information: https://blake2.net/ 9 + */ 10 + 11 + #include <crypto/blake2b.h> 12 + #include <linux/bug.h> 13 + #include <linux/export.h> 14 + #include <linux/kernel.h> 15 + #include <linux/module.h> 16 + #include <linux/string.h> 17 + #include <linux/types.h> 18 + 19 + static const u8 blake2b_sigma[12][16] = { 20 + { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }, 21 + { 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 }, 22 + { 11, 8, 12, 0, 5, 2, 15, 13, 10, 14, 3, 6, 7, 1, 9, 4 }, 23 + { 7, 9, 3, 1, 13, 12, 11, 14, 2, 6, 5, 10, 4, 0, 15, 8 }, 24 + { 9, 0, 5, 7, 2, 4, 10, 15, 14, 1, 11, 12, 6, 8, 3, 13 }, 25 + { 2, 12, 6, 10, 0, 11, 8, 3, 4, 13, 7, 5, 15, 14, 1, 9 }, 26 + { 12, 5, 1, 15, 14, 13, 4, 10, 0, 7, 6, 3, 9, 2, 8, 11 }, 27 + { 13, 11, 7, 14, 12, 1, 3, 9, 5, 0, 15, 4, 8, 6, 2, 10 }, 28 + { 6, 15, 14, 9, 11, 3, 0, 8, 12, 2, 13, 7, 1, 4, 10, 5 }, 29 + { 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 }, 30 + { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 }, 31 + { 14, 10, 4, 8, 9, 15, 13, 6, 1, 12, 0, 2, 11, 7, 5, 3 } 32 + }; 33 + 34 + static inline void blake2b_increment_counter(struct blake2b_ctx *ctx, u32 inc) 35 + { 36 + ctx->t[0] += inc; 37 + ctx->t[1] += (ctx->t[0] < inc); 38 + } 39 + 40 + static void __maybe_unused 41 + blake2b_compress_generic(struct blake2b_ctx *ctx, 42 + const u8 *data, size_t nblocks, u32 inc) 43 + { 44 + u64 m[16]; 45 + u64 v[16]; 46 + int i; 47 + 48 + WARN_ON(IS_ENABLED(DEBUG) && 49 + (nblocks > 1 && inc != BLAKE2B_BLOCK_SIZE)); 50 + 51 + while (nblocks > 0) { 52 + blake2b_increment_counter(ctx, inc); 53 + memcpy(m, data, BLAKE2B_BLOCK_SIZE); 54 + le64_to_cpu_array(m, ARRAY_SIZE(m)); 55 + memcpy(v, ctx->h, 64); 56 + v[ 8] = BLAKE2B_IV0; 57 + v[ 9] = BLAKE2B_IV1; 58 + v[10] = BLAKE2B_IV2; 59 + v[11] = BLAKE2B_IV3; 60 + v[12] = BLAKE2B_IV4 ^ ctx->t[0]; 61 + v[13] = BLAKE2B_IV5 ^ ctx->t[1]; 62 + v[14] = BLAKE2B_IV6 ^ ctx->f[0]; 63 + v[15] = BLAKE2B_IV7 ^ ctx->f[1]; 64 + 65 + #define G(r, i, a, b, c, d) do { \ 66 + a += b + m[blake2b_sigma[r][2 * i + 0]]; \ 67 + d = ror64(d ^ a, 32); \ 68 + c += d; \ 69 + b = ror64(b ^ c, 24); \ 70 + a += b + m[blake2b_sigma[r][2 * i + 1]]; \ 71 + d = ror64(d ^ a, 16); \ 72 + c += d; \ 73 + b = ror64(b ^ c, 63); \ 74 + } while (0) 75 + 76 + #define ROUND(r) do { \ 77 + G(r, 0, v[0], v[ 4], v[ 8], v[12]); \ 78 + G(r, 1, v[1], v[ 5], v[ 9], v[13]); \ 79 + G(r, 2, v[2], v[ 6], v[10], v[14]); \ 80 + G(r, 3, v[3], v[ 7], v[11], v[15]); \ 81 + G(r, 4, v[0], v[ 5], v[10], v[15]); \ 82 + G(r, 5, v[1], v[ 6], v[11], v[12]); \ 83 + G(r, 6, v[2], v[ 7], v[ 8], v[13]); \ 84 + G(r, 7, v[3], v[ 4], v[ 9], v[14]); \ 85 + } while (0) 86 + ROUND(0); 87 + ROUND(1); 88 + ROUND(2); 89 + ROUND(3); 90 + ROUND(4); 91 + ROUND(5); 92 + ROUND(6); 93 + ROUND(7); 94 + ROUND(8); 95 + ROUND(9); 96 + ROUND(10); 97 + ROUND(11); 98 + 99 + #undef G 100 + #undef ROUND 101 + 102 + for (i = 0; i < 8; ++i) 103 + ctx->h[i] ^= v[i] ^ v[i + 8]; 104 + 105 + data += BLAKE2B_BLOCK_SIZE; 106 + --nblocks; 107 + } 108 + } 109 + 110 + #ifdef CONFIG_CRYPTO_LIB_BLAKE2B_ARCH 111 + #include "blake2b.h" /* $(SRCARCH)/blake2b.h */ 112 + #else 113 + #define blake2b_compress blake2b_compress_generic 114 + #endif 115 + 116 + static inline void blake2b_set_lastblock(struct blake2b_ctx *ctx) 
117 + { 118 + ctx->f[0] = -1; 119 + } 120 + 121 + void blake2b_update(struct blake2b_ctx *ctx, const u8 *in, size_t inlen) 122 + { 123 + const size_t fill = BLAKE2B_BLOCK_SIZE - ctx->buflen; 124 + 125 + if (unlikely(!inlen)) 126 + return; 127 + if (inlen > fill) { 128 + memcpy(ctx->buf + ctx->buflen, in, fill); 129 + blake2b_compress(ctx, ctx->buf, 1, BLAKE2B_BLOCK_SIZE); 130 + ctx->buflen = 0; 131 + in += fill; 132 + inlen -= fill; 133 + } 134 + if (inlen > BLAKE2B_BLOCK_SIZE) { 135 + const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2B_BLOCK_SIZE); 136 + 137 + blake2b_compress(ctx, in, nblocks - 1, BLAKE2B_BLOCK_SIZE); 138 + in += BLAKE2B_BLOCK_SIZE * (nblocks - 1); 139 + inlen -= BLAKE2B_BLOCK_SIZE * (nblocks - 1); 140 + } 141 + memcpy(ctx->buf + ctx->buflen, in, inlen); 142 + ctx->buflen += inlen; 143 + } 144 + EXPORT_SYMBOL(blake2b_update); 145 + 146 + void blake2b_final(struct blake2b_ctx *ctx, u8 *out) 147 + { 148 + WARN_ON(IS_ENABLED(DEBUG) && !out); 149 + blake2b_set_lastblock(ctx); 150 + memset(ctx->buf + ctx->buflen, 0, 151 + BLAKE2B_BLOCK_SIZE - ctx->buflen); /* Padding */ 152 + blake2b_compress(ctx, ctx->buf, 1, ctx->buflen); 153 + cpu_to_le64_array(ctx->h, ARRAY_SIZE(ctx->h)); 154 + memcpy(out, ctx->h, ctx->outlen); 155 + memzero_explicit(ctx, sizeof(*ctx)); 156 + } 157 + EXPORT_SYMBOL(blake2b_final); 158 + 159 + #ifdef blake2b_mod_init_arch 160 + static int __init blake2b_mod_init(void) 161 + { 162 + blake2b_mod_init_arch(); 163 + return 0; 164 + } 165 + subsys_initcall(blake2b_mod_init); 166 + 167 + static void __exit blake2b_mod_exit(void) 168 + { 169 + } 170 + module_exit(blake2b_mod_exit); 171 + #endif 172 + 173 + MODULE_DESCRIPTION("BLAKE2b hash function"); 174 + MODULE_LICENSE("GPL");
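The buffering in blake2b_update() above is what makes arbitrary input splits equivalent to a one-shot hash; a minimal self-check sketch (hypothetical, not part of the patch):

    static void blake2b_split_check(const u8 *msg, size_t len)
    {
            u8 d1[BLAKE2B_HASH_SIZE], d2[BLAKE2B_HASH_SIZE];
            struct blake2b_ctx ctx;
            size_t half = len / 2;

            blake2b(NULL, 0, msg, len, d1, sizeof(d1));

            blake2b_init(&ctx, sizeof(d2));
            blake2b_update(&ctx, msg, half);
            blake2b_update(&ctx, msg + half, len - half);
            blake2b_final(&ctx, d2);

            WARN_ON(memcmp(d1, d2, sizeof(d1)) != 0);
    }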
+33 -33
lib/crypto/blake2s.c
··· 29 29 { 10, 2, 8, 4, 7, 6, 1, 5, 15, 11, 9, 14, 3, 12, 13, 0 }, 30 30 }; 31 31 32 - static inline void blake2s_increment_counter(struct blake2s_state *state, 33 - const u32 inc) 32 + static inline void blake2s_increment_counter(struct blake2s_ctx *ctx, u32 inc) 34 33 { 35 - state->t[0] += inc; 36 - state->t[1] += (state->t[0] < inc); 34 + ctx->t[0] += inc; 35 + ctx->t[1] += (ctx->t[0] < inc); 37 36 } 38 37 39 38 static void __maybe_unused 40 - blake2s_compress_generic(struct blake2s_state *state, const u8 *block, 41 - size_t nblocks, const u32 inc) 39 + blake2s_compress_generic(struct blake2s_ctx *ctx, 40 + const u8 *data, size_t nblocks, u32 inc) 42 41 { 43 42 u32 m[16]; 44 43 u32 v[16]; ··· 47 48 (nblocks > 1 && inc != BLAKE2S_BLOCK_SIZE)); 48 49 49 50 while (nblocks > 0) { 50 - blake2s_increment_counter(state, inc); 51 - memcpy(m, block, BLAKE2S_BLOCK_SIZE); 51 + blake2s_increment_counter(ctx, inc); 52 + memcpy(m, data, BLAKE2S_BLOCK_SIZE); 52 53 le32_to_cpu_array(m, ARRAY_SIZE(m)); 53 - memcpy(v, state->h, 32); 54 + memcpy(v, ctx->h, 32); 54 55 v[ 8] = BLAKE2S_IV0; 55 56 v[ 9] = BLAKE2S_IV1; 56 57 v[10] = BLAKE2S_IV2; 57 58 v[11] = BLAKE2S_IV3; 58 - v[12] = BLAKE2S_IV4 ^ state->t[0]; 59 - v[13] = BLAKE2S_IV5 ^ state->t[1]; 60 - v[14] = BLAKE2S_IV6 ^ state->f[0]; 61 - v[15] = BLAKE2S_IV7 ^ state->f[1]; 59 + v[12] = BLAKE2S_IV4 ^ ctx->t[0]; 60 + v[13] = BLAKE2S_IV5 ^ ctx->t[1]; 61 + v[14] = BLAKE2S_IV6 ^ ctx->f[0]; 62 + v[15] = BLAKE2S_IV7 ^ ctx->f[1]; 62 63 63 64 #define G(r, i, a, b, c, d) do { \ 64 65 a += b + m[blake2s_sigma[r][2 * i + 0]]; \ ··· 96 97 #undef ROUND 97 98 98 99 for (i = 0; i < 8; ++i) 99 - state->h[i] ^= v[i] ^ v[i + 8]; 100 + ctx->h[i] ^= v[i] ^ v[i + 8]; 100 101 101 - block += BLAKE2S_BLOCK_SIZE; 102 + data += BLAKE2S_BLOCK_SIZE; 102 103 --nblocks; 103 104 } 104 105 } ··· 109 110 #define blake2s_compress blake2s_compress_generic 110 111 #endif 111 112 112 - static inline void blake2s_set_lastblock(struct blake2s_state *state) 113 + static inline void blake2s_set_lastblock(struct blake2s_ctx *ctx) 113 114 { 114 - state->f[0] = -1; 115 + ctx->f[0] = -1; 115 116 } 116 117 117 - void blake2s_update(struct blake2s_state *state, const u8 *in, size_t inlen) 118 + void blake2s_update(struct blake2s_ctx *ctx, const u8 *in, size_t inlen) 118 119 { 119 - const size_t fill = BLAKE2S_BLOCK_SIZE - state->buflen; 120 + const size_t fill = BLAKE2S_BLOCK_SIZE - ctx->buflen; 120 121 121 122 if (unlikely(!inlen)) 122 123 return; 123 124 if (inlen > fill) { 124 - memcpy(state->buf + state->buflen, in, fill); 125 - blake2s_compress(state, state->buf, 1, BLAKE2S_BLOCK_SIZE); 126 - state->buflen = 0; 125 + memcpy(ctx->buf + ctx->buflen, in, fill); 126 + blake2s_compress(ctx, ctx->buf, 1, BLAKE2S_BLOCK_SIZE); 127 + ctx->buflen = 0; 127 128 in += fill; 128 129 inlen -= fill; 129 130 } 130 131 if (inlen > BLAKE2S_BLOCK_SIZE) { 131 132 const size_t nblocks = DIV_ROUND_UP(inlen, BLAKE2S_BLOCK_SIZE); 132 - blake2s_compress(state, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); 133 + 134 + blake2s_compress(ctx, in, nblocks - 1, BLAKE2S_BLOCK_SIZE); 133 135 in += BLAKE2S_BLOCK_SIZE * (nblocks - 1); 134 136 inlen -= BLAKE2S_BLOCK_SIZE * (nblocks - 1); 135 137 } 136 - memcpy(state->buf + state->buflen, in, inlen); 137 - state->buflen += inlen; 138 + memcpy(ctx->buf + ctx->buflen, in, inlen); 139 + ctx->buflen += inlen; 138 140 } 139 141 EXPORT_SYMBOL(blake2s_update); 140 142 141 - void blake2s_final(struct blake2s_state *state, u8 *out) 143 + void blake2s_final(struct blake2s_ctx *ctx, u8 *out) 142 144 { 
143 145 WARN_ON(IS_ENABLED(DEBUG) && !out); 144 - blake2s_set_lastblock(state); 145 - memset(state->buf + state->buflen, 0, 146 - BLAKE2S_BLOCK_SIZE - state->buflen); /* Padding */ 147 - blake2s_compress(state, state->buf, 1, state->buflen); 148 - cpu_to_le32_array(state->h, ARRAY_SIZE(state->h)); 149 - memcpy(out, state->h, state->outlen); 150 - memzero_explicit(state, sizeof(*state)); 146 + blake2s_set_lastblock(ctx); 147 + memset(ctx->buf + ctx->buflen, 0, 148 + BLAKE2S_BLOCK_SIZE - ctx->buflen); /* Padding */ 149 + blake2s_compress(ctx, ctx->buf, 1, ctx->buflen); 150 + cpu_to_le32_array(ctx->h, ARRAY_SIZE(ctx->h)); 151 + memcpy(out, ctx->h, ctx->outlen); 152 + memzero_explicit(ctx, sizeof(*ctx)); 151 153 } 152 154 EXPORT_SYMBOL(blake2s_final); 153 155
+45
lib/crypto/fips.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 + /* This file was generated by: gen-fips-testvecs.py */ 3 + 4 + #include <linux/fips.h> 5 + 6 + static const u8 fips_test_data[] __initconst __maybe_unused = { 7 + 0x66, 0x69, 0x70, 0x73, 0x20, 0x74, 0x65, 0x73, 8 + 0x74, 0x20, 0x64, 0x61, 0x74, 0x61, 0x00, 0x00, 9 + }; 10 + 11 + static const u8 fips_test_key[] __initconst __maybe_unused = { 12 + 0x66, 0x69, 0x70, 0x73, 0x20, 0x74, 0x65, 0x73, 13 + 0x74, 0x20, 0x6b, 0x65, 0x79, 0x00, 0x00, 0x00, 14 + }; 15 + 16 + static const u8 fips_test_hmac_sha1_value[] __initconst __maybe_unused = { 17 + 0x29, 0xa9, 0x88, 0xb8, 0x5c, 0xb4, 0xaf, 0x4b, 18 + 0x97, 0x2a, 0xee, 0x87, 0x5b, 0x0a, 0x02, 0x55, 19 + 0x99, 0xbf, 0x86, 0x78, 20 + }; 21 + 22 + static const u8 fips_test_hmac_sha256_value[] __initconst __maybe_unused = { 23 + 0x59, 0x25, 0x85, 0xcc, 0x40, 0xe9, 0x64, 0x2f, 24 + 0xe9, 0xbf, 0x82, 0xb7, 0xd3, 0x15, 0x3d, 0x43, 25 + 0x22, 0x0b, 0x4c, 0x00, 0x90, 0x14, 0x25, 0xcf, 26 + 0x9e, 0x13, 0x2b, 0xc2, 0x30, 0xe6, 0xe8, 0x93, 27 + }; 28 + 29 + static const u8 fips_test_hmac_sha512_value[] __initconst __maybe_unused = { 30 + 0x6b, 0xea, 0x5d, 0x27, 0x49, 0x5b, 0x3f, 0xea, 31 + 0xde, 0x2d, 0xfa, 0x32, 0x75, 0xdb, 0x77, 0xc8, 32 + 0x26, 0xe9, 0x4e, 0x95, 0x4d, 0xad, 0x88, 0x02, 33 + 0x87, 0xf9, 0x52, 0x0a, 0xd1, 0x92, 0x80, 0x1d, 34 + 0x92, 0x7e, 0x3c, 0xbd, 0xb1, 0x3c, 0x49, 0x98, 35 + 0x44, 0x9c, 0x8f, 0xee, 0x3f, 0x02, 0x71, 0x51, 36 + 0x57, 0x0b, 0x15, 0x38, 0x95, 0xd8, 0xa3, 0x81, 37 + 0xba, 0xb3, 0x15, 0x37, 0x5c, 0x6d, 0x57, 0x2b, 38 + }; 39 + 40 + static const u8 fips_test_sha3_256_value[] __initconst __maybe_unused = { 41 + 0x77, 0xc4, 0x8b, 0x69, 0x70, 0x5f, 0x0a, 0xb1, 42 + 0xb1, 0xa5, 0x82, 0x0a, 0x22, 0x2b, 0x49, 0x31, 43 + 0xba, 0x9b, 0xb6, 0xaa, 0x32, 0xa7, 0x97, 0x00, 44 + 0x98, 0xdb, 0xff, 0xe7, 0xc6, 0xde, 0xb5, 0x82, 45 + };
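[Editor's note] The vectors above are derived from short ASCII strings NUL-padded to 16 bytes, so they are easy to cross-check outside the kernel. A hedged userspace sketch using OpenSSL's HMAC() (build with -lcrypto; OpenSSL is an outside tool, not part of this series). Its output should reproduce fips_test_hmac_sha1_value:

  #include <openssl/evp.h>
  #include <openssl/hmac.h>
  #include <stdio.h>

  int main(void)
  {
          /* Same bytes as fips_test_key / fips_test_data above. */
          const unsigned char key[16]  = "fips test key\0\0";
          const unsigned char data[16] = "fips test data\0";
          unsigned char mac[EVP_MAX_MD_SIZE];
          unsigned int mac_len;

          HMAC(EVP_sha1(), key, sizeof(key), data, sizeof(data), mac, &mac_len);
          for (unsigned int i = 0; i < mac_len; i++)
                  printf("0x%02x,%c", mac[i], (i % 8 == 7) ? '\n' : ' ');
          return 0;
  }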
+307
lib/crypto/polyval.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-or-later 2 + /* 3 + * POLYVAL library functions 4 + * 5 + * Copyright 2025 Google LLC 6 + */ 7 + 8 + #include <crypto/polyval.h> 9 + #include <linux/export.h> 10 + #include <linux/module.h> 11 + #include <linux/string.h> 12 + #include <linux/unaligned.h> 13 + 14 + /* 15 + * POLYVAL is an almost-XOR-universal hash function. Similar to GHASH, POLYVAL 16 + * interprets the message as the coefficients of a polynomial in GF(2^128) and 17 + * evaluates that polynomial at a secret point. POLYVAL has a simple 18 + * mathematical relationship with GHASH, but it uses a better field convention 19 + * which makes it easier and faster to implement. 20 + * 21 + * POLYVAL is not a cryptographic hash function, and it should be used only by 22 + * algorithms that are specifically designed to use it. 23 + * 24 + * POLYVAL is specified by "AES-GCM-SIV: Nonce Misuse-Resistant Authenticated 25 + * Encryption" (https://datatracker.ietf.org/doc/html/rfc8452) 26 + * 27 + * POLYVAL is also used by HCTR2. See "Length-preserving encryption with HCTR2" 28 + * (https://eprint.iacr.org/2021/1441.pdf). 29 + * 30 + * This file provides a library API for POLYVAL. This API can delegate to 31 + * either a generic implementation or an architecture-optimized implementation. 32 + * 33 + * For the generic implementation, we don't use the traditional table approach 34 + * to GF(2^128) multiplication. That approach is not constant-time and requires 35 + * a lot of memory. Instead, we use a different approach which emulates 36 + * carryless multiplication using standard multiplications by spreading the data 37 + * bits apart using "holes". This allows the carries to spill harmlessly. This 38 + * approach is borrowed from BoringSSL, which in turn credits BearSSL's 39 + * documentation (https://bearssl.org/constanttime.html#ghash-for-gcm) for the 40 + * "holes" trick and a presentation by Shay Gueron 41 + * (https://crypto.stanford.edu/RealWorldCrypto/slides/gueron.pdf) for the 42 + * 256-bit => 128-bit reduction algorithm. 43 + */ 44 + 45 + #ifdef CONFIG_ARCH_SUPPORTS_INT128 46 + 47 + /* Do a 64 x 64 => 128 bit carryless multiplication. */ 48 + static void clmul64(u64 a, u64 b, u64 *out_lo, u64 *out_hi) 49 + { 50 + /* 51 + * With 64-bit multiplicands and one term every 4 bits, there would be 52 + * up to 64 / 4 = 16 one bits per column when each multiplication is 53 + * written out as a series of additions in the schoolbook manner. 54 + * Unfortunately, that doesn't work since the value 16 is 1 too large to 55 + * fit in 4 bits. Carries would sometimes overflow into the next term. 56 + * 57 + * Using one term every 5 bits would work. However, that would cost 58 + * 5 x 5 = 25 multiplications instead of 4 x 4 = 16. 59 + * 60 + * Instead, mask off 4 bits from one multiplicand, giving a max of 15 61 + * one bits per column. Then handle those 4 bits separately. 62 + */ 63 + u64 a0 = a & 0x1111111111111110; 64 + u64 a1 = a & 0x2222222222222220; 65 + u64 a2 = a & 0x4444444444444440; 66 + u64 a3 = a & 0x8888888888888880; 67 + 68 + u64 b0 = b & 0x1111111111111111; 69 + u64 b1 = b & 0x2222222222222222; 70 + u64 b2 = b & 0x4444444444444444; 71 + u64 b3 = b & 0x8888888888888888; 72 + 73 + /* Multiply the high 60 bits of @a by @b. 
*/ 74 + u128 c0 = (a0 * (u128)b0) ^ (a1 * (u128)b3) ^ 75 + (a2 * (u128)b2) ^ (a3 * (u128)b1); 76 + u128 c1 = (a0 * (u128)b1) ^ (a1 * (u128)b0) ^ 77 + (a2 * (u128)b3) ^ (a3 * (u128)b2); 78 + u128 c2 = (a0 * (u128)b2) ^ (a1 * (u128)b1) ^ 79 + (a2 * (u128)b0) ^ (a3 * (u128)b3); 80 + u128 c3 = (a0 * (u128)b3) ^ (a1 * (u128)b2) ^ 81 + (a2 * (u128)b1) ^ (a3 * (u128)b0); 82 + 83 + /* Multiply the low 4 bits of @a by @b. */ 84 + u64 e0 = -(a & 1) & b; 85 + u64 e1 = -((a >> 1) & 1) & b; 86 + u64 e2 = -((a >> 2) & 1) & b; 87 + u64 e3 = -((a >> 3) & 1) & b; 88 + u64 extra_lo = e0 ^ (e1 << 1) ^ (e2 << 2) ^ (e3 << 3); 89 + u64 extra_hi = (e1 >> 63) ^ (e2 >> 62) ^ (e3 >> 61); 90 + 91 + /* Add all the intermediate products together. */ 92 + *out_lo = (((u64)c0) & 0x1111111111111111) ^ 93 + (((u64)c1) & 0x2222222222222222) ^ 94 + (((u64)c2) & 0x4444444444444444) ^ 95 + (((u64)c3) & 0x8888888888888888) ^ extra_lo; 96 + *out_hi = (((u64)(c0 >> 64)) & 0x1111111111111111) ^ 97 + (((u64)(c1 >> 64)) & 0x2222222222222222) ^ 98 + (((u64)(c2 >> 64)) & 0x4444444444444444) ^ 99 + (((u64)(c3 >> 64)) & 0x8888888888888888) ^ extra_hi; 100 + } 101 + 102 + #else /* CONFIG_ARCH_SUPPORTS_INT128 */ 103 + 104 + /* Do a 32 x 32 => 64 bit carryless multiplication. */ 105 + static u64 clmul32(u32 a, u32 b) 106 + { 107 + /* 108 + * With 32-bit multiplicands and one term every 4 bits, there are up to 109 + * 32 / 4 = 8 one bits per column when each multiplication is written 110 + * out as a series of additions in the schoolbook manner. The value 8 111 + * fits in 4 bits, so the carries don't overflow into the next term. 112 + */ 113 + u32 a0 = a & 0x11111111; 114 + u32 a1 = a & 0x22222222; 115 + u32 a2 = a & 0x44444444; 116 + u32 a3 = a & 0x88888888; 117 + 118 + u32 b0 = b & 0x11111111; 119 + u32 b1 = b & 0x22222222; 120 + u32 b2 = b & 0x44444444; 121 + u32 b3 = b & 0x88888888; 122 + 123 + u64 c0 = (a0 * (u64)b0) ^ (a1 * (u64)b3) ^ 124 + (a2 * (u64)b2) ^ (a3 * (u64)b1); 125 + u64 c1 = (a0 * (u64)b1) ^ (a1 * (u64)b0) ^ 126 + (a2 * (u64)b3) ^ (a3 * (u64)b2); 127 + u64 c2 = (a0 * (u64)b2) ^ (a1 * (u64)b1) ^ 128 + (a2 * (u64)b0) ^ (a3 * (u64)b3); 129 + u64 c3 = (a0 * (u64)b3) ^ (a1 * (u64)b2) ^ 130 + (a2 * (u64)b1) ^ (a3 * (u64)b0); 131 + 132 + /* Add all the intermediate products together. */ 133 + return (c0 & 0x1111111111111111) ^ 134 + (c1 & 0x2222222222222222) ^ 135 + (c2 & 0x4444444444444444) ^ 136 + (c3 & 0x8888888888888888); 137 + } 138 + 139 + /* Do a 64 x 64 => 128 bit carryless multiplication. */ 140 + static void clmul64(u64 a, u64 b, u64 *out_lo, u64 *out_hi) 141 + { 142 + u32 a_lo = (u32)a; 143 + u32 a_hi = a >> 32; 144 + u32 b_lo = (u32)b; 145 + u32 b_hi = b >> 32; 146 + 147 + /* Karatsuba multiplication */ 148 + u64 lo = clmul32(a_lo, b_lo); 149 + u64 hi = clmul32(a_hi, b_hi); 150 + u64 mi = clmul32(a_lo ^ a_hi, b_lo ^ b_hi) ^ lo ^ hi; 151 + 152 + *out_lo = lo ^ (mi << 32); 153 + *out_hi = hi ^ (mi >> 32); 154 + } 155 + #endif /* !CONFIG_ARCH_SUPPORTS_INT128 */ 156 + 157 + /* Compute @a = @a * @b * x^-128 in the POLYVAL field. */ 158 + static void __maybe_unused 159 + polyval_mul_generic(struct polyval_elem *a, const struct polyval_elem *b) 160 + { 161 + u64 c0, c1, c2, c3, mi0, mi1; 162 + 163 + /* 164 + * Carryless-multiply @a by @b using Karatsuba multiplication. Store 165 + * the 256-bit product in @c0 (low) through @c3 (high). 
166 + */ 167 + clmul64(le64_to_cpu(a->lo), le64_to_cpu(b->lo), &c0, &c1); 168 + clmul64(le64_to_cpu(a->hi), le64_to_cpu(b->hi), &c2, &c3); 169 + clmul64(le64_to_cpu(a->lo ^ a->hi), le64_to_cpu(b->lo ^ b->hi), 170 + &mi0, &mi1); 171 + mi0 ^= c0 ^ c2; 172 + mi1 ^= c1 ^ c3; 173 + c1 ^= mi0; 174 + c2 ^= mi1; 175 + 176 + /* 177 + * Cancel out the low 128 bits of the product by adding multiples of 178 + * G(x) = x^128 + x^127 + x^126 + x^121 + 1. Do this in two steps, each 179 + * of which cancels out 64 bits. Note that we break G(x) into three 180 + * parts: 1, x^64 * (x^63 + x^62 + x^57), and x^128 * 1. 181 + */ 182 + 183 + /* 184 + * First, add G(x) times c0 as follows: 185 + * 186 + * (c0, c1, c2) = (0, 187 + * c1 + (c0 * (x^63 + x^62 + x^57) mod x^64), 188 + * c2 + c0 + floor((c0 * (x^63 + x^62 + x^57)) / x^64)) 189 + */ 190 + c1 ^= (c0 << 63) ^ (c0 << 62) ^ (c0 << 57); 191 + c2 ^= c0 ^ (c0 >> 1) ^ (c0 >> 2) ^ (c0 >> 7); 192 + 193 + /* 194 + * Second, add G(x) times the new c1: 195 + * 196 + * (c1, c2, c3) = (0, 197 + * c2 + (c1 * (x^63 + x^62 + x^57) mod x^64), 198 + * c3 + c1 + floor((c1 * (x^63 + x^62 + x^57)) / x^64)) 199 + */ 200 + c2 ^= (c1 << 63) ^ (c1 << 62) ^ (c1 << 57); 201 + c3 ^= c1 ^ (c1 >> 1) ^ (c1 >> 2) ^ (c1 >> 7); 202 + 203 + /* Return (c2, c3). This implicitly multiplies by x^-128. */ 204 + a->lo = cpu_to_le64(c2); 205 + a->hi = cpu_to_le64(c3); 206 + } 207 + 208 + static void __maybe_unused 209 + polyval_blocks_generic(struct polyval_elem *acc, const struct polyval_elem *key, 210 + const u8 *data, size_t nblocks) 211 + { 212 + do { 213 + acc->lo ^= get_unaligned((__le64 *)data); 214 + acc->hi ^= get_unaligned((__le64 *)(data + 8)); 215 + polyval_mul_generic(acc, key); 216 + data += POLYVAL_BLOCK_SIZE; 217 + } while (--nblocks); 218 + } 219 + 220 + /* Include the arch-optimized implementation of POLYVAL, if one is available. */ 221 + #ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH 222 + #include "polyval.h" /* $(SRCARCH)/polyval.h */ 223 + void polyval_preparekey(struct polyval_key *key, 224 + const u8 raw_key[POLYVAL_BLOCK_SIZE]) 225 + { 226 + polyval_preparekey_arch(key, raw_key); 227 + } 228 + EXPORT_SYMBOL_GPL(polyval_preparekey); 229 + #endif /* Else, polyval_preparekey() is an inline function. */ 230 + 231 + /* 232 + * polyval_mul_generic() and polyval_blocks_generic() take the key as a 233 + * polyval_elem rather than a polyval_key, so that arch-optimized 234 + * implementations with a different key format can use it as a fallback (if they 235 + * have H^1 stored somewhere in their struct). Thus, the following dispatch 236 + * code is needed to pass the appropriate key argument. 
237 + */ 238 + 239 + static void polyval_mul(struct polyval_ctx *ctx) 240 + { 241 + #ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH 242 + polyval_mul_arch(&ctx->acc, ctx->key); 243 + #else 244 + polyval_mul_generic(&ctx->acc, &ctx->key->h); 245 + #endif 246 + } 247 + 248 + static void polyval_blocks(struct polyval_ctx *ctx, 249 + const u8 *data, size_t nblocks) 250 + { 251 + #ifdef CONFIG_CRYPTO_LIB_POLYVAL_ARCH 252 + polyval_blocks_arch(&ctx->acc, ctx->key, data, nblocks); 253 + #else 254 + polyval_blocks_generic(&ctx->acc, &ctx->key->h, data, nblocks); 255 + #endif 256 + } 257 + 258 + void polyval_update(struct polyval_ctx *ctx, const u8 *data, size_t len) 259 + { 260 + if (unlikely(ctx->partial)) { 261 + size_t n = min(len, POLYVAL_BLOCK_SIZE - ctx->partial); 262 + 263 + len -= n; 264 + while (n--) 265 + ctx->acc.bytes[ctx->partial++] ^= *data++; 266 + if (ctx->partial < POLYVAL_BLOCK_SIZE) 267 + return; 268 + polyval_mul(ctx); 269 + } 270 + if (len >= POLYVAL_BLOCK_SIZE) { 271 + size_t nblocks = len / POLYVAL_BLOCK_SIZE; 272 + 273 + polyval_blocks(ctx, data, nblocks); 274 + data += len & ~(POLYVAL_BLOCK_SIZE - 1); 275 + len &= POLYVAL_BLOCK_SIZE - 1; 276 + } 277 + for (size_t i = 0; i < len; i++) 278 + ctx->acc.bytes[i] ^= data[i]; 279 + ctx->partial = len; 280 + } 281 + EXPORT_SYMBOL_GPL(polyval_update); 282 + 283 + void polyval_final(struct polyval_ctx *ctx, u8 out[POLYVAL_BLOCK_SIZE]) 284 + { 285 + if (unlikely(ctx->partial)) 286 + polyval_mul(ctx); 287 + memcpy(out, &ctx->acc, POLYVAL_BLOCK_SIZE); 288 + memzero_explicit(ctx, sizeof(*ctx)); 289 + } 290 + EXPORT_SYMBOL_GPL(polyval_final); 291 + 292 + #ifdef polyval_mod_init_arch 293 + static int __init polyval_mod_init(void) 294 + { 295 + polyval_mod_init_arch(); 296 + return 0; 297 + } 298 + subsys_initcall(polyval_mod_init); 299 + 300 + static void __exit polyval_mod_exit(void) 301 + { 302 + } 303 + module_exit(polyval_mod_exit); 304 + #endif 305 + 306 + MODULE_DESCRIPTION("POLYVAL almost-XOR-universal hash function"); 307 + MODULE_LICENSE("GPL");
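[Editor's note] The "holes" technique documented at the top of this file is easy to sanity-check in ordinary userspace C. A standalone sketch (not kernel code) comparing the masked 32 x 32 multiply, with the same term pairing as clmul32() above, against a naive bit-by-bit carryless multiply:

  #include <stdint.h>
  #include <stdio.h>

  /* Reference: schoolbook carryless multiply (not constant-time). */
  static uint64_t clmul32_ref(uint32_t a, uint32_t b)
  {
          uint64_t r = 0;

          for (int i = 0; i < 32; i++)
                  if ((a >> i) & 1)
                          r ^= (uint64_t)b << i;
          return r;
  }

  /* "Holes" trick: one term every 4 bits, so carries stay inside their hole. */
  static uint64_t clmul32_holes(uint32_t a, uint32_t b)
  {
          uint32_t a0 = a & 0x11111111, a1 = a & 0x22222222;
          uint32_t a2 = a & 0x44444444, a3 = a & 0x88888888;
          uint32_t b0 = b & 0x11111111, b1 = b & 0x22222222;
          uint32_t b2 = b & 0x44444444, b3 = b & 0x88888888;
          uint64_t c0 = (a0 * (uint64_t)b0) ^ (a1 * (uint64_t)b3) ^
                        (a2 * (uint64_t)b2) ^ (a3 * (uint64_t)b1);
          uint64_t c1 = (a0 * (uint64_t)b1) ^ (a1 * (uint64_t)b0) ^
                        (a2 * (uint64_t)b3) ^ (a3 * (uint64_t)b2);
          uint64_t c2 = (a0 * (uint64_t)b2) ^ (a1 * (uint64_t)b1) ^
                        (a2 * (uint64_t)b0) ^ (a3 * (uint64_t)b3);
          uint64_t c3 = (a0 * (uint64_t)b3) ^ (a1 * (uint64_t)b2) ^
                        (a2 * (uint64_t)b1) ^ (a3 * (uint64_t)b0);

          return (c0 & 0x1111111111111111ULL) ^ (c1 & 0x2222222222222222ULL) ^
                 (c2 & 0x4444444444444444ULL) ^ (c3 & 0x8888888888888888ULL);
  }

  int main(void)
  {
          uint32_t a = 0x9e3779b9, b = 0x7f4a7c15;

          for (int i = 0; i < 1000; i++) {
                  if (clmul32_ref(a, b) != clmul32_holes(a, b))
                          return 1;
                  a = a * 2654435761u + 1;
                  b = b * 1103515245u + 12345;
          }
          printf("holes trick matches the reference\n");
          return 0;
  }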
+151
lib/crypto/s390/sha3.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 + /* 3 + * SHA-3 optimized using the CP Assist for Cryptographic Functions (CPACF) 4 + * 5 + * Copyright 2025 Google LLC 6 + */ 7 + #include <asm/cpacf.h> 8 + #include <linux/cpufeature.h> 9 + 10 + static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_sha3); 11 + static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_sha3_init_optim); 12 + 13 + static void sha3_absorb_blocks(struct sha3_state *state, const u8 *data, 14 + size_t nblocks, size_t block_size) 15 + { 16 + if (static_branch_likely(&have_sha3)) { 17 + /* 18 + * Note that KIMD assumes little-endian order of the state 19 + * words. sha3_state already uses that order, though, so 20 + * there's no need for a byteswap. 21 + */ 22 + switch (block_size) { 23 + case SHA3_224_BLOCK_SIZE: 24 + cpacf_kimd(CPACF_KIMD_SHA3_224, state, 25 + data, nblocks * block_size); 26 + return; 27 + case SHA3_256_BLOCK_SIZE: 28 + /* 29 + * This case handles both SHA3-256 and SHAKE256, since 30 + * they have the same block size. 31 + */ 32 + cpacf_kimd(CPACF_KIMD_SHA3_256, state, 33 + data, nblocks * block_size); 34 + return; 35 + case SHA3_384_BLOCK_SIZE: 36 + cpacf_kimd(CPACF_KIMD_SHA3_384, state, 37 + data, nblocks * block_size); 38 + return; 39 + case SHA3_512_BLOCK_SIZE: 40 + cpacf_kimd(CPACF_KIMD_SHA3_512, state, 41 + data, nblocks * block_size); 42 + return; 43 + } 44 + } 45 + sha3_absorb_blocks_generic(state, data, nblocks, block_size); 46 + } 47 + 48 + static void sha3_keccakf(struct sha3_state *state) 49 + { 50 + if (static_branch_likely(&have_sha3)) { 51 + /* 52 + * Passing zeroes into any of CPACF_KIMD_SHA3_* gives the plain 53 + * Keccak-f permutation, which is what we want here. Use 54 + * SHA3-512 since it has the smallest block size. 55 + */ 56 + static const u8 zeroes[SHA3_512_BLOCK_SIZE]; 57 + 58 + cpacf_kimd(CPACF_KIMD_SHA3_512, state, zeroes, sizeof(zeroes)); 59 + } else { 60 + sha3_keccakf_generic(state); 61 + } 62 + } 63 + 64 + static inline bool s390_sha3(int func, const u8 *in, size_t in_len, 65 + u8 *out, size_t out_len) 66 + { 67 + struct sha3_state state; 68 + 69 + if (!static_branch_likely(&have_sha3)) 70 + return false; 71 + 72 + if (static_branch_likely(&have_sha3_init_optim)) 73 + func |= CPACF_KLMD_NIP | CPACF_KLMD_DUFOP; 74 + else 75 + memset(&state, 0, sizeof(state)); 76 + 77 + cpacf_klmd(func, &state, in, in_len); 78 + 79 + if (static_branch_likely(&have_sha3_init_optim)) 80 + kmsan_unpoison_memory(&state, out_len); 81 + 82 + memcpy(out, &state, out_len); 83 + memzero_explicit(&state, sizeof(state)); 84 + return true; 85 + } 86 + 87 + #define sha3_224_arch sha3_224_arch 88 + static bool sha3_224_arch(const u8 *in, size_t in_len, 89 + u8 out[SHA3_224_DIGEST_SIZE]) 90 + { 91 + return s390_sha3(CPACF_KLMD_SHA3_224, in, in_len, 92 + out, SHA3_224_DIGEST_SIZE); 93 + } 94 + 95 + #define sha3_256_arch sha3_256_arch 96 + static bool sha3_256_arch(const u8 *in, size_t in_len, 97 + u8 out[SHA3_256_DIGEST_SIZE]) 98 + { 99 + return s390_sha3(CPACF_KLMD_SHA3_256, in, in_len, 100 + out, SHA3_256_DIGEST_SIZE); 101 + } 102 + 103 + #define sha3_384_arch sha3_384_arch 104 + static bool sha3_384_arch(const u8 *in, size_t in_len, 105 + u8 out[SHA3_384_DIGEST_SIZE]) 106 + { 107 + return s390_sha3(CPACF_KLMD_SHA3_384, in, in_len, 108 + out, SHA3_384_DIGEST_SIZE); 109 + } 110 + 111 + #define sha3_512_arch sha3_512_arch 112 + static bool sha3_512_arch(const u8 *in, size_t in_len, 113 + u8 out[SHA3_512_DIGEST_SIZE]) 114 + { 115 + return s390_sha3(CPACF_KLMD_SHA3_512, in, in_len, 116 + out, 
SHA3_512_DIGEST_SIZE); 117 + } 118 + 119 + #define sha3_mod_init_arch sha3_mod_init_arch 120 + static void sha3_mod_init_arch(void) 121 + { 122 + int num_present = 0; 123 + int num_possible = 0; 124 + 125 + if (!cpu_have_feature(S390_CPU_FEATURE_MSA)) 126 + return; 127 + /* 128 + * Since all the SHA-3 functions are in Message-Security-Assist 129 + * Extension 6, just treat them as all or nothing. This way we need 130 + * only one static_key. 131 + */ 132 + #define QUERY(opcode, func) \ 133 + ({ num_present += !!cpacf_query_func(opcode, func); num_possible++; }) 134 + QUERY(CPACF_KIMD, CPACF_KIMD_SHA3_224); 135 + QUERY(CPACF_KIMD, CPACF_KIMD_SHA3_256); 136 + QUERY(CPACF_KIMD, CPACF_KIMD_SHA3_384); 137 + QUERY(CPACF_KIMD, CPACF_KIMD_SHA3_512); 138 + QUERY(CPACF_KLMD, CPACF_KLMD_SHA3_224); 139 + QUERY(CPACF_KLMD, CPACF_KLMD_SHA3_256); 140 + QUERY(CPACF_KLMD, CPACF_KLMD_SHA3_384); 141 + QUERY(CPACF_KLMD, CPACF_KLMD_SHA3_512); 142 + #undef QUERY 143 + 144 + if (num_present == num_possible) { 145 + static_branch_enable(&have_sha3); 146 + if (test_facility(86)) 147 + static_branch_enable(&have_sha3_init_optim); 148 + } else if (num_present != 0) { 149 + pr_warn("Unsupported combination of SHA-3 facilities\n"); 150 + } 151 + }
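[Editor's note] On the zero-block trick in sha3_keccakf() above: a sponge absorb step is "XOR the block into the state, then run Keccak-f", so feeding KIMD an all-zero block degenerates into the bare permutation. Schematically, where keccakf() is a hypothetical stand-in (compare sha3_absorb_blocks_generic() in lib/crypto/sha3.c below):

  static void absorb_one_block(u64 st[25], const u8 *block, size_t block_size)
  {
          for (size_t i = 0; i < block_size; i += 8)
                  st[i / 8] ^= get_unaligned_le64(block + i); /* no-op for all-zero input */
          keccakf(st); /* so a zero block leaves only the permutation */
  }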
+18 -1
lib/crypto/sha1.c
··· 12 12 #include <linux/string.h> 13 13 #include <linux/unaligned.h> 14 14 #include <linux/wordpart.h> 15 + #include "fips.h" 15 16 16 17 static const struct sha1_block_state sha1_iv = { 17 18 .h = { SHA1_H0, SHA1_H1, SHA1_H2, SHA1_H3, SHA1_H4 }, ··· 331 330 } 332 331 EXPORT_SYMBOL_GPL(hmac_sha1_usingrawkey); 333 332 334 - #ifdef sha1_mod_init_arch 333 + #if defined(sha1_mod_init_arch) || defined(CONFIG_CRYPTO_FIPS) 335 334 static int __init sha1_mod_init(void) 336 335 { 336 + #ifdef sha1_mod_init_arch 337 337 sha1_mod_init_arch(); 338 + #endif 339 + if (fips_enabled) { 340 + /* 341 + * FIPS cryptographic algorithm self-test. As per the FIPS 342 + * Implementation Guidance, testing HMAC-SHA1 satisfies the test 343 + * requirement for SHA-1 too. 344 + */ 345 + u8 mac[SHA1_DIGEST_SIZE]; 346 + 347 + hmac_sha1_usingrawkey(fips_test_key, sizeof(fips_test_key), 348 + fips_test_data, sizeof(fips_test_data), 349 + mac); 350 + if (memcmp(fips_test_hmac_sha1_value, mac, sizeof(mac)) != 0) 351 + panic("sha1: FIPS self-test failed\n"); 352 + } 338 353 return 0; 339 354 } 340 355 subsys_initcall(sha1_mod_init);
+22 -4
lib/crypto/sha256.c
··· 17 17 #include <linux/string.h> 18 18 #include <linux/unaligned.h> 19 19 #include <linux/wordpart.h> 20 + #include "fips.h" 20 21 21 22 static const struct sha256_block_state sha224_iv = { 22 23 .h = { ··· 270 269 EXPORT_SYMBOL(sha256); 271 270 272 271 /* 273 - * Pre-boot environment (as indicated by __DISABLE_EXPORTS being defined) 274 - * doesn't need either HMAC support or interleaved hashing support 272 + * Pre-boot environments (as indicated by __DISABLE_EXPORTS being defined) just 273 + * need the generic SHA-256 code. Omit all other features from them. 275 274 */ 276 275 #ifndef __DISABLE_EXPORTS 277 276 ··· 478 477 hmac_sha256_final(&ctx, out); 479 478 } 480 479 EXPORT_SYMBOL_GPL(hmac_sha256_usingrawkey); 481 - #endif /* !__DISABLE_EXPORTS */ 482 480 483 - #ifdef sha256_mod_init_arch 481 + #if defined(sha256_mod_init_arch) || defined(CONFIG_CRYPTO_FIPS) 484 482 static int __init sha256_mod_init(void) 485 483 { 484 + #ifdef sha256_mod_init_arch 486 485 sha256_mod_init_arch(); 486 + #endif 487 + if (fips_enabled) { 488 + /* 489 + * FIPS cryptographic algorithm self-test. As per the FIPS 490 + * Implementation Guidance, testing HMAC-SHA256 satisfies the 491 + * test requirement for SHA-224, SHA-256, and HMAC-SHA224 too. 492 + */ 493 + u8 mac[SHA256_DIGEST_SIZE]; 494 + 495 + hmac_sha256_usingrawkey(fips_test_key, sizeof(fips_test_key), 496 + fips_test_data, sizeof(fips_test_data), 497 + mac); 498 + if (memcmp(fips_test_hmac_sha256_value, mac, sizeof(mac)) != 0) 499 + panic("sha256: FIPS self-test failed\n"); 500 + } 487 501 return 0; 488 502 } 489 503 subsys_initcall(sha256_mod_init); ··· 508 492 } 509 493 module_exit(sha256_mod_exit); 510 494 #endif 495 + 496 + #endif /* !__DISABLE_EXPORTS */ 511 497 512 498 MODULE_DESCRIPTION("SHA-224, SHA-256, HMAC-SHA224, and HMAC-SHA256 library functions"); 513 499 MODULE_LICENSE("GPL");
+411
lib/crypto/sha3.c
··· 1 + // SPDX-License-Identifier: GPL-2.0-or-later 2 + /* 3 + * SHA-3, as specified in 4 + * https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.202.pdf 5 + * 6 + * SHA-3 code by Jeff Garzik <jeff@garzik.org> 7 + * Ard Biesheuvel <ard.biesheuvel@linaro.org> 8 + * David Howells <dhowells@redhat.com> 9 + * 10 + * See also Documentation/crypto/sha3.rst 11 + */ 12 + 13 + #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt 14 + #include <crypto/sha3.h> 15 + #include <crypto/utils.h> 16 + #include <linux/export.h> 17 + #include <linux/kernel.h> 18 + #include <linux/module.h> 19 + #include <linux/unaligned.h> 20 + #include "fips.h" 21 + 22 + /* 23 + * On some 32-bit architectures, such as h8300, GCC ends up using over 1 KB of 24 + * stack if the round calculation gets inlined into the loop in 25 + * sha3_keccakf_generic(). On the other hand, on 64-bit architectures with 26 + * plenty of [64-bit wide] general purpose registers, not inlining it severely 27 + * hurts performance. So let's use 64-bitness as a heuristic to decide whether 28 + * to inline or not. 29 + */ 30 + #ifdef CONFIG_64BIT 31 + #define SHA3_INLINE inline 32 + #else 33 + #define SHA3_INLINE noinline 34 + #endif 35 + 36 + #define SHA3_KECCAK_ROUNDS 24 37 + 38 + static const u64 sha3_keccakf_rndc[SHA3_KECCAK_ROUNDS] = { 39 + 0x0000000000000001ULL, 0x0000000000008082ULL, 0x800000000000808aULL, 40 + 0x8000000080008000ULL, 0x000000000000808bULL, 0x0000000080000001ULL, 41 + 0x8000000080008081ULL, 0x8000000000008009ULL, 0x000000000000008aULL, 42 + 0x0000000000000088ULL, 0x0000000080008009ULL, 0x000000008000000aULL, 43 + 0x000000008000808bULL, 0x800000000000008bULL, 0x8000000000008089ULL, 44 + 0x8000000000008003ULL, 0x8000000000008002ULL, 0x8000000000000080ULL, 45 + 0x000000000000800aULL, 0x800000008000000aULL, 0x8000000080008081ULL, 46 + 0x8000000000008080ULL, 0x0000000080000001ULL, 0x8000000080008008ULL 47 + }; 48 + 49 + /* 50 + * Perform a single round of Keccak mixing. 
51 + */ 52 + static SHA3_INLINE void sha3_keccakf_one_round_generic(u64 st[25], int round) 53 + { 54 + u64 t[5], tt, bc[5]; 55 + 56 + /* Theta */ 57 + bc[0] = st[0] ^ st[5] ^ st[10] ^ st[15] ^ st[20]; 58 + bc[1] = st[1] ^ st[6] ^ st[11] ^ st[16] ^ st[21]; 59 + bc[2] = st[2] ^ st[7] ^ st[12] ^ st[17] ^ st[22]; 60 + bc[3] = st[3] ^ st[8] ^ st[13] ^ st[18] ^ st[23]; 61 + bc[4] = st[4] ^ st[9] ^ st[14] ^ st[19] ^ st[24]; 62 + 63 + t[0] = bc[4] ^ rol64(bc[1], 1); 64 + t[1] = bc[0] ^ rol64(bc[2], 1); 65 + t[2] = bc[1] ^ rol64(bc[3], 1); 66 + t[3] = bc[2] ^ rol64(bc[4], 1); 67 + t[4] = bc[3] ^ rol64(bc[0], 1); 68 + 69 + st[0] ^= t[0]; 70 + 71 + /* Rho Pi */ 72 + tt = st[1]; 73 + st[ 1] = rol64(st[ 6] ^ t[1], 44); 74 + st[ 6] = rol64(st[ 9] ^ t[4], 20); 75 + st[ 9] = rol64(st[22] ^ t[2], 61); 76 + st[22] = rol64(st[14] ^ t[4], 39); 77 + st[14] = rol64(st[20] ^ t[0], 18); 78 + st[20] = rol64(st[ 2] ^ t[2], 62); 79 + st[ 2] = rol64(st[12] ^ t[2], 43); 80 + st[12] = rol64(st[13] ^ t[3], 25); 81 + st[13] = rol64(st[19] ^ t[4], 8); 82 + st[19] = rol64(st[23] ^ t[3], 56); 83 + st[23] = rol64(st[15] ^ t[0], 41); 84 + st[15] = rol64(st[ 4] ^ t[4], 27); 85 + st[ 4] = rol64(st[24] ^ t[4], 14); 86 + st[24] = rol64(st[21] ^ t[1], 2); 87 + st[21] = rol64(st[ 8] ^ t[3], 55); 88 + st[ 8] = rol64(st[16] ^ t[1], 45); 89 + st[16] = rol64(st[ 5] ^ t[0], 36); 90 + st[ 5] = rol64(st[ 3] ^ t[3], 28); 91 + st[ 3] = rol64(st[18] ^ t[3], 21); 92 + st[18] = rol64(st[17] ^ t[2], 15); 93 + st[17] = rol64(st[11] ^ t[1], 10); 94 + st[11] = rol64(st[ 7] ^ t[2], 6); 95 + st[ 7] = rol64(st[10] ^ t[0], 3); 96 + st[10] = rol64( tt ^ t[1], 1); 97 + 98 + /* Chi */ 99 + bc[ 0] = ~st[ 1] & st[ 2]; 100 + bc[ 1] = ~st[ 2] & st[ 3]; 101 + bc[ 2] = ~st[ 3] & st[ 4]; 102 + bc[ 3] = ~st[ 4] & st[ 0]; 103 + bc[ 4] = ~st[ 0] & st[ 1]; 104 + st[ 0] ^= bc[ 0]; 105 + st[ 1] ^= bc[ 1]; 106 + st[ 2] ^= bc[ 2]; 107 + st[ 3] ^= bc[ 3]; 108 + st[ 4] ^= bc[ 4]; 109 + 110 + bc[ 0] = ~st[ 6] & st[ 7]; 111 + bc[ 1] = ~st[ 7] & st[ 8]; 112 + bc[ 2] = ~st[ 8] & st[ 9]; 113 + bc[ 3] = ~st[ 9] & st[ 5]; 114 + bc[ 4] = ~st[ 5] & st[ 6]; 115 + st[ 5] ^= bc[ 0]; 116 + st[ 6] ^= bc[ 1]; 117 + st[ 7] ^= bc[ 2]; 118 + st[ 8] ^= bc[ 3]; 119 + st[ 9] ^= bc[ 4]; 120 + 121 + bc[ 0] = ~st[11] & st[12]; 122 + bc[ 1] = ~st[12] & st[13]; 123 + bc[ 2] = ~st[13] & st[14]; 124 + bc[ 3] = ~st[14] & st[10]; 125 + bc[ 4] = ~st[10] & st[11]; 126 + st[10] ^= bc[ 0]; 127 + st[11] ^= bc[ 1]; 128 + st[12] ^= bc[ 2]; 129 + st[13] ^= bc[ 3]; 130 + st[14] ^= bc[ 4]; 131 + 132 + bc[ 0] = ~st[16] & st[17]; 133 + bc[ 1] = ~st[17] & st[18]; 134 + bc[ 2] = ~st[18] & st[19]; 135 + bc[ 3] = ~st[19] & st[15]; 136 + bc[ 4] = ~st[15] & st[16]; 137 + st[15] ^= bc[ 0]; 138 + st[16] ^= bc[ 1]; 139 + st[17] ^= bc[ 2]; 140 + st[18] ^= bc[ 3]; 141 + st[19] ^= bc[ 4]; 142 + 143 + bc[ 0] = ~st[21] & st[22]; 144 + bc[ 1] = ~st[22] & st[23]; 145 + bc[ 2] = ~st[23] & st[24]; 146 + bc[ 3] = ~st[24] & st[20]; 147 + bc[ 4] = ~st[20] & st[21]; 148 + st[20] ^= bc[ 0]; 149 + st[21] ^= bc[ 1]; 150 + st[22] ^= bc[ 2]; 151 + st[23] ^= bc[ 3]; 152 + st[24] ^= bc[ 4]; 153 + 154 + /* Iota */ 155 + st[0] ^= sha3_keccakf_rndc[round]; 156 + } 157 + 158 + /* Generic implementation of the Keccak-f[1600] permutation */ 159 + static void sha3_keccakf_generic(struct sha3_state *state) 160 + { 161 + /* 162 + * Temporarily convert the state words from little-endian to native- 163 + * endian so that they can be operated on. Note that on little-endian 164 + * machines this conversion is a no-op and is optimized out. 
165 + */ 166 + 167 + for (int i = 0; i < ARRAY_SIZE(state->words); i++) 168 + state->native_words[i] = le64_to_cpu(state->words[i]); 169 + 170 + for (int round = 0; round < SHA3_KECCAK_ROUNDS; round++) 171 + sha3_keccakf_one_round_generic(state->native_words, round); 172 + 173 + for (int i = 0; i < ARRAY_SIZE(state->words); i++) 174 + state->words[i] = cpu_to_le64(state->native_words[i]); 175 + } 176 + 177 + /* 178 + * Generic implementation of absorbing the given nonzero number of full blocks 179 + * into the sponge function Keccak[r=8*block_size, c=1600-8*block_size]. 180 + */ 181 + static void __maybe_unused 182 + sha3_absorb_blocks_generic(struct sha3_state *state, const u8 *data, 183 + size_t nblocks, size_t block_size) 184 + { 185 + do { 186 + for (size_t i = 0; i < block_size; i += 8) 187 + state->words[i / 8] ^= get_unaligned((__le64 *)&data[i]); 188 + sha3_keccakf_generic(state); 189 + data += block_size; 190 + } while (--nblocks); 191 + } 192 + 193 + #ifdef CONFIG_CRYPTO_LIB_SHA3_ARCH 194 + #include "sha3.h" /* $(SRCARCH)/sha3.h */ 195 + #else 196 + #define sha3_keccakf sha3_keccakf_generic 197 + #define sha3_absorb_blocks sha3_absorb_blocks_generic 198 + #endif 199 + 200 + void __sha3_update(struct __sha3_ctx *ctx, const u8 *in, size_t in_len) 201 + { 202 + const size_t block_size = ctx->block_size; 203 + size_t absorb_offset = ctx->absorb_offset; 204 + 205 + /* Warn if squeezing has already begun. */ 206 + WARN_ON_ONCE(absorb_offset >= block_size); 207 + 208 + if (absorb_offset && absorb_offset + in_len >= block_size) { 209 + crypto_xor(&ctx->state.bytes[absorb_offset], in, 210 + block_size - absorb_offset); 211 + in += block_size - absorb_offset; 212 + in_len -= block_size - absorb_offset; 213 + sha3_keccakf(&ctx->state); 214 + absorb_offset = 0; 215 + } 216 + 217 + if (in_len >= block_size) { 218 + size_t nblocks = in_len / block_size; 219 + 220 + sha3_absorb_blocks(&ctx->state, in, nblocks, block_size); 221 + in += nblocks * block_size; 222 + in_len -= nblocks * block_size; 223 + } 224 + 225 + if (in_len) { 226 + crypto_xor(&ctx->state.bytes[absorb_offset], in, in_len); 227 + absorb_offset += in_len; 228 + } 229 + ctx->absorb_offset = absorb_offset; 230 + } 231 + EXPORT_SYMBOL_GPL(__sha3_update); 232 + 233 + void sha3_final(struct sha3_ctx *sha3_ctx, u8 *out) 234 + { 235 + struct __sha3_ctx *ctx = &sha3_ctx->ctx; 236 + 237 + ctx->state.bytes[ctx->absorb_offset] ^= 0x06; 238 + ctx->state.bytes[ctx->block_size - 1] ^= 0x80; 239 + sha3_keccakf(&ctx->state); 240 + memcpy(out, ctx->state.bytes, ctx->digest_size); 241 + sha3_zeroize_ctx(sha3_ctx); 242 + } 243 + EXPORT_SYMBOL_GPL(sha3_final); 244 + 245 + void shake_squeeze(struct shake_ctx *shake_ctx, u8 *out, size_t out_len) 246 + { 247 + struct __sha3_ctx *ctx = &shake_ctx->ctx; 248 + const size_t block_size = ctx->block_size; 249 + size_t squeeze_offset = ctx->squeeze_offset; 250 + 251 + if (ctx->absorb_offset < block_size) { 252 + /* First squeeze: */ 253 + 254 + /* Add the domain separation suffix and padding. */ 255 + ctx->state.bytes[ctx->absorb_offset] ^= 0x1f; 256 + ctx->state.bytes[block_size - 1] ^= 0x80; 257 + 258 + /* Indicate that squeezing has begun. */ 259 + ctx->absorb_offset = block_size; 260 + 261 + /* 262 + * Indicate that no output is pending yet, i.e. sha3_keccakf() 263 + * will need to be called before the first copy. 
264 + */ 265 + squeeze_offset = block_size; 266 + } 267 + while (out_len) { 268 + if (squeeze_offset == block_size) { 269 + sha3_keccakf(&ctx->state); 270 + squeeze_offset = 0; 271 + } 272 + size_t copy = min(out_len, block_size - squeeze_offset); 273 + 274 + memcpy(out, &ctx->state.bytes[squeeze_offset], copy); 275 + out += copy; 276 + out_len -= copy; 277 + squeeze_offset += copy; 278 + } 279 + ctx->squeeze_offset = squeeze_offset; 280 + } 281 + EXPORT_SYMBOL_GPL(shake_squeeze); 282 + 283 + #ifndef sha3_224_arch 284 + static inline bool sha3_224_arch(const u8 *in, size_t in_len, 285 + u8 out[SHA3_224_DIGEST_SIZE]) 286 + { 287 + return false; 288 + } 289 + #endif 290 + #ifndef sha3_256_arch 291 + static inline bool sha3_256_arch(const u8 *in, size_t in_len, 292 + u8 out[SHA3_256_DIGEST_SIZE]) 293 + { 294 + return false; 295 + } 296 + #endif 297 + #ifndef sha3_384_arch 298 + static inline bool sha3_384_arch(const u8 *in, size_t in_len, 299 + u8 out[SHA3_384_DIGEST_SIZE]) 300 + { 301 + return false; 302 + } 303 + #endif 304 + #ifndef sha3_512_arch 305 + static inline bool sha3_512_arch(const u8 *in, size_t in_len, 306 + u8 out[SHA3_512_DIGEST_SIZE]) 307 + { 308 + return false; 309 + } 310 + #endif 311 + 312 + void sha3_224(const u8 *in, size_t in_len, u8 out[SHA3_224_DIGEST_SIZE]) 313 + { 314 + struct sha3_ctx ctx; 315 + 316 + if (sha3_224_arch(in, in_len, out)) 317 + return; 318 + sha3_224_init(&ctx); 319 + sha3_update(&ctx, in, in_len); 320 + sha3_final(&ctx, out); 321 + } 322 + EXPORT_SYMBOL_GPL(sha3_224); 323 + 324 + void sha3_256(const u8 *in, size_t in_len, u8 out[SHA3_256_DIGEST_SIZE]) 325 + { 326 + struct sha3_ctx ctx; 327 + 328 + if (sha3_256_arch(in, in_len, out)) 329 + return; 330 + sha3_256_init(&ctx); 331 + sha3_update(&ctx, in, in_len); 332 + sha3_final(&ctx, out); 333 + } 334 + EXPORT_SYMBOL_GPL(sha3_256); 335 + 336 + void sha3_384(const u8 *in, size_t in_len, u8 out[SHA3_384_DIGEST_SIZE]) 337 + { 338 + struct sha3_ctx ctx; 339 + 340 + if (sha3_384_arch(in, in_len, out)) 341 + return; 342 + sha3_384_init(&ctx); 343 + sha3_update(&ctx, in, in_len); 344 + sha3_final(&ctx, out); 345 + } 346 + EXPORT_SYMBOL_GPL(sha3_384); 347 + 348 + void sha3_512(const u8 *in, size_t in_len, u8 out[SHA3_512_DIGEST_SIZE]) 349 + { 350 + struct sha3_ctx ctx; 351 + 352 + if (sha3_512_arch(in, in_len, out)) 353 + return; 354 + sha3_512_init(&ctx); 355 + sha3_update(&ctx, in, in_len); 356 + sha3_final(&ctx, out); 357 + } 358 + EXPORT_SYMBOL_GPL(sha3_512); 359 + 360 + void shake128(const u8 *in, size_t in_len, u8 *out, size_t out_len) 361 + { 362 + struct shake_ctx ctx; 363 + 364 + shake128_init(&ctx); 365 + shake_update(&ctx, in, in_len); 366 + shake_squeeze(&ctx, out, out_len); 367 + shake_zeroize_ctx(&ctx); 368 + } 369 + EXPORT_SYMBOL_GPL(shake128); 370 + 371 + void shake256(const u8 *in, size_t in_len, u8 *out, size_t out_len) 372 + { 373 + struct shake_ctx ctx; 374 + 375 + shake256_init(&ctx); 376 + shake_update(&ctx, in, in_len); 377 + shake_squeeze(&ctx, out, out_len); 378 + shake_zeroize_ctx(&ctx); 379 + } 380 + EXPORT_SYMBOL_GPL(shake256); 381 + 382 + #if defined(sha3_mod_init_arch) || defined(CONFIG_CRYPTO_FIPS) 383 + static int __init sha3_mod_init(void) 384 + { 385 + #ifdef sha3_mod_init_arch 386 + sha3_mod_init_arch(); 387 + #endif 388 + if (fips_enabled) { 389 + /* 390 + * FIPS cryptographic algorithm self-test. As per the FIPS 391 + * Implementation Guidance, testing any SHA-3 algorithm 392 + * satisfies the test requirement for all of them. 
393 + */ 394 + u8 hash[SHA3_256_DIGEST_SIZE]; 395 + 396 + sha3_256(fips_test_data, sizeof(fips_test_data), hash); 397 + if (memcmp(fips_test_sha3_256_value, hash, sizeof(hash)) != 0) 398 + panic("sha3: FIPS self-test failed\n"); 399 + } 400 + return 0; 401 + } 402 + subsys_initcall(sha3_mod_init); 403 + 404 + static void __exit sha3_mod_exit(void) 405 + { 406 + } 407 + module_exit(sha3_mod_exit); 408 + #endif 409 + 410 + MODULE_DESCRIPTION("SHA-3 library functions"); 411 + MODULE_LICENSE("GPL");
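[Editor's note] To make the squeeze-offset bookkeeping above concrete, here is a minimal XOF usage sketch built only from the shake_* functions defined in this file; msg and msg_len are hypothetical inputs. Consecutive squeezes continue a single output stream rather than restarting it:

  #include <crypto/sha3.h>

  static void shake256_example(const u8 *msg, size_t msg_len)
  {
          struct shake_ctx ctx;
          u8 head[32], tail[64];

          shake256_init(&ctx);
          shake_update(&ctx, msg, msg_len);
          shake_squeeze(&ctx, head, sizeof(head));
          shake_squeeze(&ctx, tail, sizeof(tail)); /* bytes 32..95 of the stream */
          shake_zeroize_ctx(&ctx);
  }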
+18 -1
lib/crypto/sha512.c
··· 17 17 #include <linux/string.h> 18 18 #include <linux/unaligned.h> 19 19 #include <linux/wordpart.h> 20 + #include "fips.h" 20 21 21 22 static const struct sha512_block_state sha384_iv = { 22 23 .h = { ··· 406 405 } 407 406 EXPORT_SYMBOL_GPL(hmac_sha512_usingrawkey); 408 407 409 - #ifdef sha512_mod_init_arch 408 + #if defined(sha512_mod_init_arch) || defined(CONFIG_CRYPTO_FIPS) 410 409 static int __init sha512_mod_init(void) 411 410 { 411 + #ifdef sha512_mod_init_arch 412 412 sha512_mod_init_arch(); 413 + #endif 414 + if (fips_enabled) { 415 + /* 416 + * FIPS cryptographic algorithm self-test. As per the FIPS 417 + * Implementation Guidance, testing HMAC-SHA512 satisfies the 418 + * test requirement for SHA-384, SHA-512, and HMAC-SHA384 too. 419 + */ 420 + u8 mac[SHA512_DIGEST_SIZE]; 421 + 422 + hmac_sha512_usingrawkey(fips_test_key, sizeof(fips_test_key), 423 + fips_test_data, sizeof(fips_test_data), 424 + mac); 425 + if (memcmp(fips_test_hmac_sha512_value, mac, sizeof(mac)) != 0) 426 + panic("sha512: FIPS self-test failed\n"); 427 + } 413 428 return 0; 414 429 } 415 430 subsys_initcall(sha512_mod_init);
+19 -20
lib/crypto/tests/blake2s_kunit.c
··· 14 14 static void blake2s_default(const u8 *data, size_t len, 15 15 u8 out[BLAKE2S_HASH_SIZE]) 16 16 { 17 - blake2s(out, data, NULL, BLAKE2S_HASH_SIZE, len, 0); 17 + blake2s(NULL, 0, data, len, out, BLAKE2S_HASH_SIZE); 18 18 } 19 19 20 - static void blake2s_init_default(struct blake2s_state *state) 20 + static void blake2s_init_default(struct blake2s_ctx *ctx) 21 21 { 22 - blake2s_init(state, BLAKE2S_HASH_SIZE); 22 + blake2s_init(ctx, BLAKE2S_HASH_SIZE); 23 23 } 24 24 25 25 /* ··· 27 27 * with a key length of 0 and a hash length of BLAKE2S_HASH_SIZE. 28 28 */ 29 29 #define HASH blake2s_default 30 - #define HASH_CTX blake2s_state 30 + #define HASH_CTX blake2s_ctx 31 31 #define HASH_SIZE BLAKE2S_HASH_SIZE 32 32 #define HASH_INIT blake2s_init_default 33 33 #define HASH_UPDATE blake2s_update ··· 44 44 u8 *data = &test_buf[0]; 45 45 u8 *key = data + data_len; 46 46 u8 *hash = key + BLAKE2S_KEY_SIZE; 47 - struct blake2s_state main_state; 47 + struct blake2s_ctx main_ctx; 48 48 u8 main_hash[BLAKE2S_HASH_SIZE]; 49 49 50 50 rand_bytes_seeded_from_len(data, data_len); 51 - blake2s_init(&main_state, BLAKE2S_HASH_SIZE); 51 + blake2s_init(&main_ctx, BLAKE2S_HASH_SIZE); 52 52 for (int key_len = 0; key_len <= BLAKE2S_KEY_SIZE; key_len++) { 53 53 rand_bytes_seeded_from_len(key, key_len); 54 54 for (int out_len = 1; out_len <= BLAKE2S_HASH_SIZE; out_len++) { 55 - blake2s(hash, data, key, out_len, data_len, key_len); 56 - blake2s_update(&main_state, hash, out_len); 55 + blake2s(key, key_len, data, data_len, hash, out_len); 56 + blake2s_update(&main_ctx, hash, out_len); 57 57 } 58 58 } 59 - blake2s_final(&main_state, main_hash); 59 + blake2s_final(&main_ctx, main_hash); 60 60 KUNIT_ASSERT_MEMEQ(test, main_hash, blake2s_keyed_testvec_consolidated, 61 61 BLAKE2S_HASH_SIZE); 62 62 } ··· 75 75 u8 *guarded_key = &test_buf[TEST_BUF_LEN - key_len]; 76 76 u8 hash1[BLAKE2S_HASH_SIZE]; 77 77 u8 hash2[BLAKE2S_HASH_SIZE]; 78 - struct blake2s_state state; 78 + struct blake2s_ctx ctx; 79 79 80 80 rand_bytes(key, key_len); 81 81 memcpy(guarded_key, key, key_len); 82 82 83 - blake2s(hash1, test_buf, key, 84 - BLAKE2S_HASH_SIZE, data_len, key_len); 85 - blake2s(hash2, test_buf, guarded_key, 86 - BLAKE2S_HASH_SIZE, data_len, key_len); 83 + blake2s(key, key_len, test_buf, data_len, 84 + hash1, BLAKE2S_HASH_SIZE); 85 + blake2s(guarded_key, key_len, test_buf, data_len, 86 + hash2, BLAKE2S_HASH_SIZE); 87 87 KUNIT_ASSERT_MEMEQ(test, hash1, hash2, BLAKE2S_HASH_SIZE); 88 88 89 - blake2s_init_key(&state, BLAKE2S_HASH_SIZE, 90 - guarded_key, key_len); 91 - blake2s_update(&state, test_buf, data_len); 92 - blake2s_final(&state, hash2); 89 + blake2s_init_key(&ctx, BLAKE2S_HASH_SIZE, guarded_key, key_len); 90 + blake2s_update(&ctx, test_buf, data_len); 91 + blake2s_final(&ctx, hash2); 93 92 KUNIT_ASSERT_MEMEQ(test, hash1, hash2, BLAKE2S_HASH_SIZE); 94 93 } 95 94 } ··· 106 107 u8 hash[BLAKE2S_HASH_SIZE]; 107 108 u8 *guarded_hash = &test_buf[TEST_BUF_LEN - out_len]; 108 109 109 - blake2s(hash, test_buf, NULL, out_len, data_len, 0); 110 - blake2s(guarded_hash, test_buf, NULL, out_len, data_len, 0); 110 + blake2s(NULL, 0, test_buf, data_len, hash, out_len); 111 + blake2s(NULL, 0, test_buf, data_len, guarded_hash, out_len); 111 112 KUNIT_ASSERT_MEMEQ(test, hash, guarded_hash, out_len); 112 113 } 113 114 }
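[Editor's note] The mechanical churn above comes from the reordered one-shot signature: as exercised by these tests, blake2s() now takes its arguments inputs-first, outputs-last. A quick sketch of both call styles under the new order; key, msg, and their lengths are hypothetical:

  u8 hash[BLAKE2S_HASH_SIZE];

  blake2s(NULL, 0, msg, msg_len, hash, BLAKE2S_HASH_SIZE);          /* unkeyed */
  blake2s(key, BLAKE2S_KEY_SIZE, msg, msg_len, hash, sizeof(hash)); /* keyed */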
+157 -118
lib/crypto/x86/blake2s-core.S
··· 6 6 7 7 #include <linux/linkage.h> 8 8 9 - .section .rodata.cst32.BLAKE2S_IV, "aM", @progbits, 32 9 + .section .rodata.cst32.iv, "aM", @progbits, 32 10 10 .align 32 11 - IV: .octa 0xA54FF53A3C6EF372BB67AE856A09E667 11 + .Liv: 12 + .octa 0xA54FF53A3C6EF372BB67AE856A09E667 12 13 .octa 0x5BE0CD191F83D9AB9B05688C510E527F 13 - .section .rodata.cst16.ROT16, "aM", @progbits, 16 14 + 15 + .section .rodata.cst16.ror16, "aM", @progbits, 16 14 16 .align 16 15 - ROT16: .octa 0x0D0C0F0E09080B0A0504070601000302 16 - .section .rodata.cst16.ROR328, "aM", @progbits, 16 17 + .Lror16: 18 + .octa 0x0D0C0F0E09080B0A0504070601000302 19 + 20 + .section .rodata.cst16.ror8, "aM", @progbits, 16 17 21 .align 16 18 - ROR328: .octa 0x0C0F0E0D080B0A090407060500030201 19 - .section .rodata.cst64.BLAKE2S_SIGMA, "aM", @progbits, 160 22 + .Lror8: 23 + .octa 0x0C0F0E0D080B0A090407060500030201 24 + 25 + .section .rodata.cst64.sigma, "aM", @progbits, 160 20 26 .align 64 21 - SIGMA: 27 + .Lsigma: 22 28 .byte 0, 2, 4, 6, 1, 3, 5, 7, 14, 8, 10, 12, 15, 9, 11, 13 23 29 .byte 14, 4, 9, 13, 10, 8, 15, 6, 5, 1, 0, 11, 3, 12, 2, 7 24 30 .byte 11, 12, 5, 15, 8, 0, 2, 13, 9, 10, 3, 7, 4, 14, 6, 1 ··· 35 29 .byte 13, 7, 12, 3, 11, 14, 1, 9, 2, 5, 15, 8, 10, 0, 4, 6 36 30 .byte 6, 14, 11, 0, 15, 9, 3, 8, 10, 12, 13, 1, 5, 2, 7, 4 37 31 .byte 10, 8, 7, 1, 2, 4, 6, 5, 13, 15, 9, 3, 0, 11, 14, 12 38 - .section .rodata.cst64.BLAKE2S_SIGMA2, "aM", @progbits, 160 32 + 33 + .section .rodata.cst64.sigma2, "aM", @progbits, 160 39 34 .align 64 40 - SIGMA2: 35 + .Lsigma2: 41 36 .byte 0, 2, 4, 6, 1, 3, 5, 7, 14, 8, 10, 12, 15, 9, 11, 13 42 37 .byte 8, 2, 13, 15, 10, 9, 12, 3, 6, 4, 0, 14, 5, 11, 1, 7 43 38 .byte 11, 13, 8, 6, 5, 10, 14, 3, 2, 4, 12, 15, 1, 0, 7, 9 ··· 50 43 .byte 15, 5, 4, 13, 10, 7, 3, 11, 12, 2, 0, 6, 9, 8, 1, 14 51 44 .byte 8, 7, 14, 11, 13, 15, 0, 12, 10, 4, 5, 6, 3, 2, 1, 9 52 45 46 + #define CTX %rdi 47 + #define DATA %rsi 48 + #define NBLOCKS %rdx 49 + #define INC %ecx 50 + 53 51 .text 52 + // 53 + // void blake2s_compress_ssse3(struct blake2s_ctx *ctx, 54 + // const u8 *data, size_t nblocks, u32 inc); 55 + // 56 + // Only the first three fields of struct blake2s_ctx are used: 57 + // u32 h[8]; (inout) 58 + // u32 t[2]; (inout) 59 + // u32 f[2]; (in) 60 + // 54 61 SYM_FUNC_START(blake2s_compress_ssse3) 55 - testq %rdx,%rdx 56 - je .Lendofloop 57 - movdqu (%rdi),%xmm0 58 - movdqu 0x10(%rdi),%xmm1 59 - movdqa ROT16(%rip),%xmm12 60 - movdqa ROR328(%rip),%xmm13 61 - movdqu 0x20(%rdi),%xmm14 62 - movq %rcx,%xmm15 63 - leaq SIGMA+0xa0(%rip),%r8 64 - jmp .Lbeginofloop 62 + movdqu (CTX),%xmm0 // Load h[0..3] 63 + movdqu 16(CTX),%xmm1 // Load h[4..7] 64 + movdqa .Lror16(%rip),%xmm12 65 + movdqa .Lror8(%rip),%xmm13 66 + movdqu 32(CTX),%xmm14 // Load t and f 67 + movd INC,%xmm15 // Load inc 68 + leaq .Lsigma+160(%rip),%r8 69 + jmp .Lssse3_mainloop 70 + 65 71 .align 32 66 - .Lbeginofloop: 67 - movdqa %xmm0,%xmm10 68 - movdqa %xmm1,%xmm11 69 - paddq %xmm15,%xmm14 70 - movdqa IV(%rip),%xmm2 72 + .Lssse3_mainloop: 73 + // Main loop: each iteration processes one 64-byte block. 
74 + movdqa %xmm0,%xmm10 // Save h[0..3] and let v[0..3] = h[0..3] 75 + movdqa %xmm1,%xmm11 // Save h[4..7] and let v[4..7] = h[4..7] 76 + paddq %xmm15,%xmm14 // t += inc (64-bit addition) 77 + movdqa .Liv(%rip),%xmm2 // v[8..11] = iv[0..3] 71 78 movdqa %xmm14,%xmm3 72 - pxor IV+0x10(%rip),%xmm3 73 - leaq SIGMA(%rip),%rcx 74 - .Lroundloop: 79 + pxor .Liv+16(%rip),%xmm3 // v[12..15] = iv[4..7] ^ [t, f] 80 + leaq .Lsigma(%rip),%rcx 81 + 82 + .Lssse3_roundloop: 83 + // Round loop: each iteration does 1 round (of 10 rounds total). 75 84 movzbl (%rcx),%eax 76 - movd (%rsi,%rax,4),%xmm4 77 - movzbl 0x1(%rcx),%eax 78 - movd (%rsi,%rax,4),%xmm5 79 - movzbl 0x2(%rcx),%eax 80 - movd (%rsi,%rax,4),%xmm6 81 - movzbl 0x3(%rcx),%eax 82 - movd (%rsi,%rax,4),%xmm7 85 + movd (DATA,%rax,4),%xmm4 86 + movzbl 1(%rcx),%eax 87 + movd (DATA,%rax,4),%xmm5 88 + movzbl 2(%rcx),%eax 89 + movd (DATA,%rax,4),%xmm6 90 + movzbl 3(%rcx),%eax 91 + movd (DATA,%rax,4),%xmm7 83 92 punpckldq %xmm5,%xmm4 84 93 punpckldq %xmm7,%xmm6 85 94 punpcklqdq %xmm6,%xmm4 ··· 106 83 paddd %xmm3,%xmm2 107 84 pxor %xmm2,%xmm1 108 85 movdqa %xmm1,%xmm8 109 - psrld $0xc,%xmm1 110 - pslld $0x14,%xmm8 86 + psrld $12,%xmm1 87 + pslld $20,%xmm8 111 88 por %xmm8,%xmm1 112 - movzbl 0x4(%rcx),%eax 113 - movd (%rsi,%rax,4),%xmm5 114 - movzbl 0x5(%rcx),%eax 115 - movd (%rsi,%rax,4),%xmm6 116 - movzbl 0x6(%rcx),%eax 117 - movd (%rsi,%rax,4),%xmm7 118 - movzbl 0x7(%rcx),%eax 119 - movd (%rsi,%rax,4),%xmm4 89 + movzbl 4(%rcx),%eax 90 + movd (DATA,%rax,4),%xmm5 91 + movzbl 5(%rcx),%eax 92 + movd (DATA,%rax,4),%xmm6 93 + movzbl 6(%rcx),%eax 94 + movd (DATA,%rax,4),%xmm7 95 + movzbl 7(%rcx),%eax 96 + movd (DATA,%rax,4),%xmm4 120 97 punpckldq %xmm6,%xmm5 121 98 punpckldq %xmm4,%xmm7 122 99 punpcklqdq %xmm7,%xmm5 ··· 127 104 paddd %xmm3,%xmm2 128 105 pxor %xmm2,%xmm1 129 106 movdqa %xmm1,%xmm8 130 - psrld $0x7,%xmm1 131 - pslld $0x19,%xmm8 107 + psrld $7,%xmm1 108 + pslld $25,%xmm8 132 109 por %xmm8,%xmm1 133 110 pshufd $0x93,%xmm0,%xmm0 134 111 pshufd $0x4e,%xmm3,%xmm3 135 112 pshufd $0x39,%xmm2,%xmm2 136 - movzbl 0x8(%rcx),%eax 137 - movd (%rsi,%rax,4),%xmm6 138 - movzbl 0x9(%rcx),%eax 139 - movd (%rsi,%rax,4),%xmm7 140 - movzbl 0xa(%rcx),%eax 141 - movd (%rsi,%rax,4),%xmm4 142 - movzbl 0xb(%rcx),%eax 143 - movd (%rsi,%rax,4),%xmm5 113 + movzbl 8(%rcx),%eax 114 + movd (DATA,%rax,4),%xmm6 115 + movzbl 9(%rcx),%eax 116 + movd (DATA,%rax,4),%xmm7 117 + movzbl 10(%rcx),%eax 118 + movd (DATA,%rax,4),%xmm4 119 + movzbl 11(%rcx),%eax 120 + movd (DATA,%rax,4),%xmm5 144 121 punpckldq %xmm7,%xmm6 145 122 punpckldq %xmm5,%xmm4 146 123 punpcklqdq %xmm4,%xmm6 ··· 151 128 paddd %xmm3,%xmm2 152 129 pxor %xmm2,%xmm1 153 130 movdqa %xmm1,%xmm8 154 - psrld $0xc,%xmm1 155 - pslld $0x14,%xmm8 131 + psrld $12,%xmm1 132 + pslld $20,%xmm8 156 133 por %xmm8,%xmm1 157 - movzbl 0xc(%rcx),%eax 158 - movd (%rsi,%rax,4),%xmm7 159 - movzbl 0xd(%rcx),%eax 160 - movd (%rsi,%rax,4),%xmm4 161 - movzbl 0xe(%rcx),%eax 162 - movd (%rsi,%rax,4),%xmm5 163 - movzbl 0xf(%rcx),%eax 164 - movd (%rsi,%rax,4),%xmm6 134 + movzbl 12(%rcx),%eax 135 + movd (DATA,%rax,4),%xmm7 136 + movzbl 13(%rcx),%eax 137 + movd (DATA,%rax,4),%xmm4 138 + movzbl 14(%rcx),%eax 139 + movd (DATA,%rax,4),%xmm5 140 + movzbl 15(%rcx),%eax 141 + movd (DATA,%rax,4),%xmm6 165 142 punpckldq %xmm4,%xmm7 166 143 punpckldq %xmm6,%xmm5 167 144 punpcklqdq %xmm5,%xmm7 ··· 172 149 paddd %xmm3,%xmm2 173 150 pxor %xmm2,%xmm1 174 151 movdqa %xmm1,%xmm8 175 - psrld $0x7,%xmm1 176 - pslld $0x19,%xmm8 152 + psrld $7,%xmm1 153 + pslld $25,%xmm8 
177 154 por %xmm8,%xmm1 178 155 pshufd $0x39,%xmm0,%xmm0 179 156 pshufd $0x4e,%xmm3,%xmm3 180 157 pshufd $0x93,%xmm2,%xmm2 181 - addq $0x10,%rcx 158 + addq $16,%rcx 182 159 cmpq %r8,%rcx 183 - jnz .Lroundloop 160 + jnz .Lssse3_roundloop 161 + 162 + // Compute the new h: h[0..7] ^= v[0..7] ^ v[8..15] 184 163 pxor %xmm2,%xmm0 185 164 pxor %xmm3,%xmm1 186 165 pxor %xmm10,%xmm0 187 166 pxor %xmm11,%xmm1 188 - addq $0x40,%rsi 189 - decq %rdx 190 - jnz .Lbeginofloop 191 - movdqu %xmm0,(%rdi) 192 - movdqu %xmm1,0x10(%rdi) 193 - movdqu %xmm14,0x20(%rdi) 194 - .Lendofloop: 167 + addq $64,DATA 168 + decq NBLOCKS 169 + jnz .Lssse3_mainloop 170 + 171 + movdqu %xmm0,(CTX) // Store new h[0..3] 172 + movdqu %xmm1,16(CTX) // Store new h[4..7] 173 + movq %xmm14,32(CTX) // Store new t (f is unchanged) 195 174 RET 196 175 SYM_FUNC_END(blake2s_compress_ssse3) 197 176 177 + // 178 + // void blake2s_compress_avx512(struct blake2s_ctx *ctx, 179 + // const u8 *data, size_t nblocks, u32 inc); 180 + // 181 + // Only the first three fields of struct blake2s_ctx are used: 182 + // u32 h[8]; (inout) 183 + // u32 t[2]; (inout) 184 + // u32 f[2]; (in) 185 + // 198 186 SYM_FUNC_START(blake2s_compress_avx512) 199 - vmovdqu (%rdi),%xmm0 200 - vmovdqu 0x10(%rdi),%xmm1 201 - vmovdqu 0x20(%rdi),%xmm4 202 - vmovq %rcx,%xmm5 203 - vmovdqa IV(%rip),%xmm14 204 - vmovdqa IV+16(%rip),%xmm15 205 - jmp .Lblake2s_compress_avx512_mainloop 206 - .align 32 207 - .Lblake2s_compress_avx512_mainloop: 208 - vmovdqa %xmm0,%xmm10 209 - vmovdqa %xmm1,%xmm11 210 - vpaddq %xmm5,%xmm4,%xmm4 211 - vmovdqa %xmm14,%xmm2 212 - vpxor %xmm15,%xmm4,%xmm3 213 - vmovdqu (%rsi),%ymm6 214 - vmovdqu 0x20(%rsi),%ymm7 215 - addq $0x40,%rsi 216 - leaq SIGMA2(%rip),%rax 217 - movb $0xa,%cl 218 - .Lblake2s_compress_avx512_roundloop: 187 + vmovdqu (CTX),%xmm0 // Load h[0..3] 188 + vmovdqu 16(CTX),%xmm1 // Load h[4..7] 189 + vmovdqu 32(CTX),%xmm4 // Load t and f 190 + vmovd INC,%xmm5 // Load inc 191 + vmovdqa .Liv(%rip),%xmm14 // Load iv[0..3] 192 + vmovdqa .Liv+16(%rip),%xmm15 // Load iv[4..7] 193 + jmp .Lavx512_mainloop 194 + 195 + .align 32 196 + .Lavx512_mainloop: 197 + // Main loop: each iteration processes one 64-byte block. 198 + vmovdqa %xmm0,%xmm10 // Save h[0..3] and let v[0..3] = h[0..3] 199 + vmovdqa %xmm1,%xmm11 // Save h[4..7] and let v[4..7] = h[4..7] 200 + vpaddq %xmm5,%xmm4,%xmm4 // t += inc (64-bit addition) 201 + vmovdqa %xmm14,%xmm2 // v[8..11] = iv[0..3] 202 + vpxor %xmm15,%xmm4,%xmm3 // v[12..15] = iv[4..7] ^ [t, f] 203 + vmovdqu (DATA),%ymm6 // Load first 8 data words 204 + vmovdqu 32(DATA),%ymm7 // Load second 8 data words 205 + addq $64,DATA 206 + leaq .Lsigma2(%rip),%rax 207 + movb $10,%cl // Set num rounds remaining 208 + 209 + .Lavx512_roundloop: 210 + // Round loop: each iteration does 1 round (of 10 rounds total). 
219 211 vpmovzxbd (%rax),%ymm8 220 - vpmovzxbd 0x8(%rax),%ymm9 221 - addq $0x10,%rax 212 + vpmovzxbd 8(%rax),%ymm9 213 + addq $16,%rax 222 214 vpermi2d %ymm7,%ymm6,%ymm8 223 215 vpermi2d %ymm7,%ymm6,%ymm9 224 216 vmovdqa %ymm8,%ymm6 ··· 241 203 vpaddd %xmm8,%xmm0,%xmm0 242 204 vpaddd %xmm1,%xmm0,%xmm0 243 205 vpxor %xmm0,%xmm3,%xmm3 244 - vprord $0x10,%xmm3,%xmm3 206 + vprord $16,%xmm3,%xmm3 245 207 vpaddd %xmm3,%xmm2,%xmm2 246 208 vpxor %xmm2,%xmm1,%xmm1 247 - vprord $0xc,%xmm1,%xmm1 248 - vextracti128 $0x1,%ymm8,%xmm8 209 + vprord $12,%xmm1,%xmm1 210 + vextracti128 $1,%ymm8,%xmm8 249 211 vpaddd %xmm8,%xmm0,%xmm0 250 212 vpaddd %xmm1,%xmm0,%xmm0 251 213 vpxor %xmm0,%xmm3,%xmm3 252 - vprord $0x8,%xmm3,%xmm3 214 + vprord $8,%xmm3,%xmm3 253 215 vpaddd %xmm3,%xmm2,%xmm2 254 216 vpxor %xmm2,%xmm1,%xmm1 255 - vprord $0x7,%xmm1,%xmm1 217 + vprord $7,%xmm1,%xmm1 256 218 vpshufd $0x93,%xmm0,%xmm0 257 219 vpshufd $0x4e,%xmm3,%xmm3 258 220 vpshufd $0x39,%xmm2,%xmm2 259 221 vpaddd %xmm9,%xmm0,%xmm0 260 222 vpaddd %xmm1,%xmm0,%xmm0 261 223 vpxor %xmm0,%xmm3,%xmm3 262 - vprord $0x10,%xmm3,%xmm3 224 + vprord $16,%xmm3,%xmm3 263 225 vpaddd %xmm3,%xmm2,%xmm2 264 226 vpxor %xmm2,%xmm1,%xmm1 265 - vprord $0xc,%xmm1,%xmm1 266 - vextracti128 $0x1,%ymm9,%xmm9 227 + vprord $12,%xmm1,%xmm1 228 + vextracti128 $1,%ymm9,%xmm9 267 229 vpaddd %xmm9,%xmm0,%xmm0 268 230 vpaddd %xmm1,%xmm0,%xmm0 269 231 vpxor %xmm0,%xmm3,%xmm3 270 - vprord $0x8,%xmm3,%xmm3 232 + vprord $8,%xmm3,%xmm3 271 233 vpaddd %xmm3,%xmm2,%xmm2 272 234 vpxor %xmm2,%xmm1,%xmm1 273 - vprord $0x7,%xmm1,%xmm1 235 + vprord $7,%xmm1,%xmm1 274 236 vpshufd $0x39,%xmm0,%xmm0 275 237 vpshufd $0x4e,%xmm3,%xmm3 276 238 vpshufd $0x93,%xmm2,%xmm2 277 239 decb %cl 278 - jne .Lblake2s_compress_avx512_roundloop 279 - vpxor %xmm10,%xmm0,%xmm0 280 - vpxor %xmm11,%xmm1,%xmm1 281 - vpxor %xmm2,%xmm0,%xmm0 282 - vpxor %xmm3,%xmm1,%xmm1 283 - decq %rdx 284 - jne .Lblake2s_compress_avx512_mainloop 285 - vmovdqu %xmm0,(%rdi) 286 - vmovdqu %xmm1,0x10(%rdi) 287 - vmovdqu %xmm4,0x20(%rdi) 240 + jne .Lavx512_roundloop 241 + 242 + // Compute the new h: h[0..7] ^= v[0..7] ^ v[8..15] 243 + vpternlogd $0x96,%xmm10,%xmm2,%xmm0 244 + vpternlogd $0x96,%xmm11,%xmm3,%xmm1 245 + decq NBLOCKS 246 + jne .Lavx512_mainloop 247 + 248 + vmovdqu %xmm0,(CTX) // Store new h[0..3] 249 + vmovdqu %xmm1,16(CTX) // Store new h[4..7] 250 + vmovq %xmm4,32(CTX) // Store new t (f is unchanged) 288 251 vzeroupper 289 252 RET 290 253 SYM_FUNC_END(blake2s_compress_avx512)
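[Editor's note] One detail worth calling out in the AVX-512 epilogue above: vpternlogd $0x96 computes the three-input XOR h ^ v[0..7] ^ v[8..15] in one instruction, collapsing the four removed vpxor instructions into two. The immediate byte is the truth table of the ternary function, indexed by the three source bits, and 0x96 encodes exactly A ^ B ^ C. A standalone check of that encoding:

  #include <stdio.h>

  int main(void)
  {
          for (int a = 0; a <= 1; a++)
                  for (int b = 0; b <= 1; b++)
                          for (int c = 0; c <= 1; c++) {
                                  int idx = (a << 2) | (b << 1) | c;

                                  if (((0x96 >> idx) & 1) != (a ^ b ^ c))
                                          return 1;
                          }
          printf("0x96 encodes A ^ B ^ C\n");
          return 0;
  }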
+10 -12
lib/crypto/x86/blake2s.h
··· 11 11 #include <linux/kernel.h> 12 12 #include <linux/sizes.h> 13 13 14 - asmlinkage void blake2s_compress_ssse3(struct blake2s_state *state, 15 - const u8 *block, const size_t nblocks, 16 - const u32 inc); 17 - asmlinkage void blake2s_compress_avx512(struct blake2s_state *state, 18 - const u8 *block, const size_t nblocks, 19 - const u32 inc); 14 + asmlinkage void blake2s_compress_ssse3(struct blake2s_ctx *ctx, 15 + const u8 *data, size_t nblocks, u32 inc); 16 + asmlinkage void blake2s_compress_avx512(struct blake2s_ctx *ctx, 17 + const u8 *data, size_t nblocks, u32 inc); 20 18 21 19 static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_ssse3); 22 20 static __ro_after_init DEFINE_STATIC_KEY_FALSE(blake2s_use_avx512); 23 21 24 - static void blake2s_compress(struct blake2s_state *state, const u8 *block, 25 - size_t nblocks, const u32 inc) 22 + static void blake2s_compress(struct blake2s_ctx *ctx, 23 + const u8 *data, size_t nblocks, u32 inc) 26 24 { 27 25 /* SIMD disables preemption, so relax after processing each page. */ 28 26 BUILD_BUG_ON(SZ_4K / BLAKE2S_BLOCK_SIZE < 8); 29 27 30 28 if (!static_branch_likely(&blake2s_use_ssse3) || !may_use_simd()) { 31 - blake2s_compress_generic(state, block, nblocks, inc); 29 + blake2s_compress_generic(ctx, data, nblocks, inc); 32 30 return; 33 31 } 34 32 ··· 36 38 37 39 kernel_fpu_begin(); 38 40 if (static_branch_likely(&blake2s_use_avx512)) 39 - blake2s_compress_avx512(state, block, blocks, inc); 41 + blake2s_compress_avx512(ctx, data, blocks, inc); 40 42 else 41 - blake2s_compress_ssse3(state, block, blocks, inc); 43 + blake2s_compress_ssse3(ctx, data, blocks, inc); 42 44 kernel_fpu_end(); 43 45 46 + data += blocks * BLAKE2S_BLOCK_SIZE; 44 47 nblocks -= blocks; 45 - block += blocks * BLAKE2S_BLOCK_SIZE; 46 48 } while (nblocks); 47 49 } 48 50
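[Editor's note] The BUILD_BUG_ON above guarantees at least eight blocks fit in the 4 KiB budget, and the do/while loop keeps each kernel_fpu_begin()/kernel_fpu_end() section to at most one page of message, since in-kernel SIMD sections disable preemption. The elided context presumably clamps the per-iteration count along these lines (a hedged sketch, not the verbatim source):

  const size_t blocks = min_t(size_t, nblocks, SZ_4K / BLAKE2S_BLOCK_SIZE);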
+83
lib/crypto/x86/polyval.h
··· 1 + /* SPDX-License-Identifier: GPL-2.0-or-later */ 2 + /* 3 + * POLYVAL library functions, x86_64 optimized 4 + * 5 + * Copyright 2025 Google LLC 6 + */ 7 + #include <asm/fpu/api.h> 8 + #include <linux/cpufeature.h> 9 + 10 + #define NUM_H_POWERS 8 11 + 12 + static __ro_after_init DEFINE_STATIC_KEY_FALSE(have_pclmul_avx); 13 + 14 + asmlinkage void polyval_mul_pclmul_avx(struct polyval_elem *a, 15 + const struct polyval_elem *b); 16 + asmlinkage void polyval_blocks_pclmul_avx(struct polyval_elem *acc, 17 + const struct polyval_key *key, 18 + const u8 *data, size_t nblocks); 19 + 20 + static void polyval_preparekey_arch(struct polyval_key *key, 21 + const u8 raw_key[POLYVAL_BLOCK_SIZE]) 22 + { 23 + static_assert(ARRAY_SIZE(key->h_powers) == NUM_H_POWERS); 24 + memcpy(&key->h_powers[NUM_H_POWERS - 1], raw_key, POLYVAL_BLOCK_SIZE); 25 + if (static_branch_likely(&have_pclmul_avx) && irq_fpu_usable()) { 26 + kernel_fpu_begin(); 27 + for (int i = NUM_H_POWERS - 2; i >= 0; i--) { 28 + key->h_powers[i] = key->h_powers[i + 1]; 29 + polyval_mul_pclmul_avx( 30 + &key->h_powers[i], 31 + &key->h_powers[NUM_H_POWERS - 1]); 32 + } 33 + kernel_fpu_end(); 34 + } else { 35 + for (int i = NUM_H_POWERS - 2; i >= 0; i--) { 36 + key->h_powers[i] = key->h_powers[i + 1]; 37 + polyval_mul_generic(&key->h_powers[i], 38 + &key->h_powers[NUM_H_POWERS - 1]); 39 + } 40 + } 41 + } 42 + 43 + static void polyval_mul_arch(struct polyval_elem *acc, 44 + const struct polyval_key *key) 45 + { 46 + if (static_branch_likely(&have_pclmul_avx) && irq_fpu_usable()) { 47 + kernel_fpu_begin(); 48 + polyval_mul_pclmul_avx(acc, &key->h_powers[NUM_H_POWERS - 1]); 49 + kernel_fpu_end(); 50 + } else { 51 + polyval_mul_generic(acc, &key->h_powers[NUM_H_POWERS - 1]); 52 + } 53 + } 54 + 55 + static void polyval_blocks_arch(struct polyval_elem *acc, 56 + const struct polyval_key *key, 57 + const u8 *data, size_t nblocks) 58 + { 59 + if (static_branch_likely(&have_pclmul_avx) && irq_fpu_usable()) { 60 + do { 61 + /* Allow rescheduling every 4 KiB. */ 62 + size_t n = min_t(size_t, nblocks, 63 + 4096 / POLYVAL_BLOCK_SIZE); 64 + 65 + kernel_fpu_begin(); 66 + polyval_blocks_pclmul_avx(acc, key, data, n); 67 + kernel_fpu_end(); 68 + data += n * POLYVAL_BLOCK_SIZE; 69 + nblocks -= n; 70 + } while (nblocks); 71 + } else { 72 + polyval_blocks_generic(acc, &key->h_powers[NUM_H_POWERS - 1], 73 + data, nblocks); 74 + } 75 + } 76 + 77 + #define polyval_mod_init_arch polyval_mod_init_arch 78 + static void polyval_mod_init_arch(void) 79 + { 80 + if (boot_cpu_has(X86_FEATURE_PCLMULQDQ) && 81 + boot_cpu_has(X86_FEATURE_AVX)) 82 + static_branch_enable(&have_pclmul_avx); 83 + }
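[Editor's note] On the key schedule above: after polyval_preparekey_arch() runs, h_powers[i] holds H^(NUM_H_POWERS - i), i.e. h_powers[0] = H^8 down to h_powers[NUM_H_POWERS - 1] = H^1 (the raw key). Precomputing the powers presumably lets polyval_blocks_pclmul_avx() fold up to eight blocks with independent carryless multiplies, using the identity

  acc = (acc ^ m[0])*H^8 ^ m[1]*H^7 ^ ... ^ m[7]*H^1

which equals eight serial applications of acc = (acc ^ m[i]) * H; every multiply here is the POLYVAL field multiply, including its x^-128 factor.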
+36
scripts/crypto/gen-fips-testvecs.py
··· 1 + #!/usr/bin/env python3 2 + # SPDX-License-Identifier: GPL-2.0-or-later 3 + # 4 + # Script that generates lib/crypto/fips.h 5 + # 6 + # Copyright 2025 Google LLC 7 + 8 + import hashlib 9 + import hmac 10 + 11 + fips_test_data = b"fips test data\0\0" 12 + fips_test_key = b"fips test key\0\0\0" 13 + 14 + def print_static_u8_array_definition(name, value): 15 + print('') 16 + print(f'static const u8 {name}[] __initconst __maybe_unused = {{') 17 + for i in range(0, len(value), 8): 18 + line = '\t' + ''.join(f'0x{b:02x}, ' for b in value[i:i+8]) 19 + print(f'{line.rstrip()}') 20 + print('};') 21 + 22 + print('/* SPDX-License-Identifier: GPL-2.0-or-later */') 23 + print(f'/* This file was generated by: gen-fips-testvecs.py */') 24 + print() 25 + print('#include <linux/fips.h>') 26 + 27 + print_static_u8_array_definition("fips_test_data", fips_test_data) 28 + print_static_u8_array_definition("fips_test_key", fips_test_key) 29 + 30 + for alg in 'sha1', 'sha256', 'sha512': 31 + ctx = hmac.new(fips_test_key, digestmod=alg) 32 + ctx.update(fips_test_data) 33 + print_static_u8_array_definition(f'fips_test_hmac_{alg}_value', ctx.digest()) 34 + 35 + print_static_u8_array_definition(f'fips_test_sha3_256_value', 36 + hashlib.sha3_256(fips_test_data).digest())
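[Editor's note] A usage note, hedged since no build rule for the script appears in this diff: it writes the header to stdout, so lib/crypto/fips.h is presumably regenerated with an invocation like python3 scripts/crypto/gen-fips-testvecs.py > lib/crypto/fips.h. Both hmac and hashlib.sha3_256 come from the Python standard library, so regeneration needs no third-party packages.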