Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'ipsec-next-2025-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2025-11-18

1) Relax a lock contention bottleneck to improve IPsec crypto
offload performance. From Jianbo Liu.

2) Deprecate pfkey, the interface will be removed in 2027.

3) Update xfrm documentation and move it to ipsec maintenance.
From Bagas Sanjaya.

* tag 'ipsec-next-2025-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
MAINTAINERS: Add entry for XFRM documentation
net: Move XFRM documentation into its own subdirectory
Documentation: xfrm_sync: Number the fifth section
Documentation: xfrm_sysctl: Trim trailing colon in section heading
Documentation: xfrm_sync: Trim excess section heading characters
Documentation: xfrm_sync: Properly reindent list text
Documentation: xfrm_device: Separate hardware offload sublists
Documentation: xfrm_device: Use numbered list for offloading steps
Documentation: xfrm_device: Wrap iproute2 snippets in literal code block
pfkey: Deprecate pfkey
xfrm: Skip redundant replay recheck for the hardware offload path
xfrm: Refactor xfrm_input lock to reduce contention with RSS
====================

Link: https://patch.msgid.link/20251118092610.2223552-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+103 -80
+1 -4
Documentation/networking/index.rst
···
    vxlan
    x25
    x25-iface
-   xfrm_device
-   xfrm_proc
-   xfrm_sync
-   xfrm_sysctl
+   xfrm/index
    xdp-rx-metadata
    xsk-tx-metadata
+13
Documentation/networking/xfrm/index.rst
···
+.. SPDX-License-Identifier: GPL-2.0
+
+==============
+XFRM Framework
+==============
+
+.. toctree::
+   :maxdepth: 2
+
+   xfrm_device
+   xfrm_proc
+   xfrm_sync
+   xfrm_sysctl
+12 -8
Documentation/networking/xfrm_device.rst -> Documentation/networking/xfrm/xfrm_device.rst
···
 Device interface allows NIC drivers to offer to the stack access to the
 hardware offload.

-Right now, there are two types of hardware offload that kernel supports.
+Right now, there are two types of hardware offload that kernel supports:
+
  * IPsec crypto offload:
+
    * NIC performs encrypt/decrypt
    * Kernel does everything else
+
  * IPsec packet offload:
+
    * NIC performs encrypt/decrypt
    * NIC does encapsulation
    * Kernel and NIC have SA and policy in-sync
···
 Userland access to the offload is typically through a system such as
 libreswan or KAME/raccoon, but the iproute2 'ip xfrm' command set can
 be handy when experimenting. An example command might look something
-like this for crypto offload:
+like this for crypto offload::

   ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
···
     sel src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp \
     offload dev eth4 dir in

-and for packet offload
+and for packet offload::

   ip x s add proto esp dst 14.0.0.70 src 14.0.0.52 spi 0x07 mode transport \
     reqid 0x07 replay-window 32 \
···
 IPsec headers are still in the packet data; they are removed later up
 the stack in xfrm_input().

-	find and hold the SA that was used to the Rx skb::
+1. Find and hold the SA that was used to the Rx skb::

-		get spi, protocol, and destination IP from packet headers
+		/* get spi, protocol, and destination IP from packet headers */
 		xs = find xs from (spi, protocol, dest_IP)
 		xfrm_state_hold(xs);

-	store the state information into the skb::
+2. Store the state information into the skb::

 		sp = secpath_set(skb);
 		if (!sp) return;
 		sp->xvec[sp->len++] = xs;
 		sp->olen++;

-	indicate the success and/or error status of the offload::
+3. Indicate the success and/or error status of the offload::

 		xo = xfrm_offload(skb);
 		xo->flags = CRYPTO_DONE;
 		xo->status = crypto_status;

-	hand the packet to napi_gro_receive() as usual
+4. Hand the packet to napi_gro_receive() as usual.

 In ESN mode, xdo_dev_state_advance_esn() is called from
 xfrm_replay_advance_esn() for RX, and xfrm_replay_overflow_offload_esn for TX.
Documentation/networking/xfrm_proc.rst -> Documentation/networking/xfrm/xfrm_proc.rst
+50 -47
Documentation/networking/xfrm_sync.rst -> Documentation/networking/xfrm/xfrm_sync.rst
···
 .. SPDX-License-Identifier: GPL-2.0

-====
-XFRM
-====
+=========
+XFRM sync
+=========

 The sync patches work is based on initial patches from
 Krisztian <hidden@balabit.hu> and others and additional patches
···
    - the replay sequence for both inbound and outbound

 1) Message Structure
-----------------------
+--------------------

 nlmsghdr:aevent_id:optional-TLVs.

···
 A program needs to subscribe to multicast group XFRMNLGRP_AEVENTS
 to get notified of these events.

-2) TLVS reflect the different parameters:
------------------------------------------
+2) TLVS reflect the different parameters
+----------------------------------------

 a) byte value (XFRMA_LTIME_VAL)

-This TLV carries the running/current counter for byte lifetime since
-last event.
+   This TLV carries the running/current counter for byte lifetime since
+   last event.

-b)replay value (XFRMA_REPLAY_VAL)
+b) replay value (XFRMA_REPLAY_VAL)

-This TLV carries the running/current counter for replay sequence since
-last event.
+   This TLV carries the running/current counter for replay sequence since
+   last event.

-c)replay threshold (XFRMA_REPLAY_THRESH)
+c) replay threshold (XFRMA_REPLAY_THRESH)

-This TLV carries the threshold being used by the kernel to trigger events
-when the replay sequence is exceeded.
+   This TLV carries the threshold being used by the kernel to trigger events
+   when the replay sequence is exceeded.

 d) expiry timer (XFRMA_ETIMER_THRESH)

-This is a timer value in milliseconds which is used as the nagle
-value to rate limit the events.
+   This is a timer value in milliseconds which is used as the nagle
+   value to rate limit the events.

-3) Default configurations for the parameters:
----------------------------------------------
+3) Default configurations for the parameters
+--------------------------------------------

 By default these events should be turned off unless there is
 at least one listener registered to listen to the multicast
···
 the two sysctls/proc entries are:

 a) /proc/sys/net/core/sysctl_xfrm_aevent_etime
-used to provide default values for the XFRMA_ETIMER_THRESH in incremental
-units of time of 100ms. The default is 10 (1 second)
+
+   Used to provide default values for the XFRMA_ETIMER_THRESH in incremental
+   units of time of 100ms. The default is 10 (1 second)

 b) /proc/sys/net/core/sysctl_xfrm_aevent_rseqth
-used to provide default values for XFRMA_REPLAY_THRESH parameter
-in incremental packet count. The default is two packets.
+
+   Used to provide default values for XFRMA_REPLAY_THRESH parameter
+   in incremental packet count. The default is two packets.

 4) Message types
 ----------------
···
 a) XFRM_MSG_GETAE issued by user-->kernel.
    XFRM_MSG_GETAE does not carry any TLVs.

-The response is a XFRM_MSG_NEWAE which is formatted based on what
-XFRM_MSG_GETAE queried for.
+   The response is a XFRM_MSG_NEWAE which is formatted based on what
+   XFRM_MSG_GETAE queried for.

-The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
-* if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
-* if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved
+   The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
+
+     * if XFRM_AE_RTHR flag is set, then XFRMA_REPLAY_THRESH is also retrieved
+     * if XFRM_AE_ETHR flag is set, then XFRMA_ETIMER_THRESH is also retrieved

 b) XFRM_MSG_NEWAE is issued by either user space to configure
    or kernel to announce events or respond to a XFRM_MSG_GETAE.

-i) user --> kernel to configure a specific SA.
+   i) user --> kernel to configure a specific SA.

-any of the values or threshold parameters can be updated by passing the
-appropriate TLV.
+      any of the values or threshold parameters can be updated by passing the
+      appropriate TLV.

-A response is issued back to the sender in user space to indicate success
-or failure.
+      A response is issued back to the sender in user space to indicate success
+      or failure.

-In the case of success, additionally an event with
-XFRM_MSG_NEWAE is also issued to any listeners as described in iii).
+      In the case of success, additionally an event with
+      XFRM_MSG_NEWAE is also issued to any listeners as described in iii).

-ii) kernel->user direction as a response to XFRM_MSG_GETAE
+   ii) kernel->user direction as a response to XFRM_MSG_GETAE

-The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
+       The response will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

-The threshold TLVs will be included if explicitly requested in
-the XFRM_MSG_GETAE message.
+       The threshold TLVs will be included if explicitly requested in
+       the XFRM_MSG_GETAE message.

-iii) kernel->user to report as event if someone sets any values or
-thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above).
-In such a case XFRM_AE_CU flag is set to inform the user that
-the change happened as a result of an update.
-The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.
+   iii) kernel->user to report as event if someone sets any values or
+        thresholds for an SA using XFRM_MSG_NEWAE (as described in #i above).
+        In such a case XFRM_AE_CU flag is set to inform the user that
+        the change happened as a result of an update.
+        The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

-iv) kernel->user to report event when replay threshold or a timeout
-is exceeded.
+   iv) kernel->user to report event when replay threshold or a timeout
+       is exceeded.

 In such a case either XFRM_AE_CR (replay exceeded) or XFRM_AE_CE (timeout
 happened) is set to inform the user what happened.
 Note the two flags are mutually exclusive.
 The message will always have XFRMA_LTIME_VAL and XFRMA_REPLAY_VAL TLVs.

-Exceptions to threshold settings
---------------------------------
+5) Exceptions to threshold settings
+-----------------------------------

 If you have an SA that is getting hit by traffic in bursts such that
 there is a period where the timer threshold expires with no packets
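The xfrm_sync text above states two rules that are easy to check in code: which TLVs a XFRM_MSG_NEWAE reply carries for a given XFRM_MSG_GETAE request, and how the etime sysctl unit maps to milliseconds. The following is a small sketch of those documented rules only, not kernel code. The `AE_*` and `TLV_*` constants, `newae_reply_tlvs()` and `aevent_etime_to_ms()` are hypothetical stand-ins invented for this illustration; their numeric values are arbitrary and are not the UAPI values from `<linux/xfrm.h>`.

```c
/* Stand-ins for the request flags named in the document (values are arbitrary). */
enum ae_flag {
	AE_RTHR = 1 << 0,	/* stands in for XFRM_AE_RTHR */
	AE_ETHR = 1 << 1,	/* stands in for XFRM_AE_ETHR */
};

/* Stand-ins for the TLV types named in the document, as a bitmask. */
enum ae_tlv {
	TLV_LTIME_VAL     = 1 << 0,	/* XFRMA_LTIME_VAL */
	TLV_REPLAY_VAL    = 1 << 1,	/* XFRMA_REPLAY_VAL */
	TLV_REPLAY_THRESH = 1 << 2,	/* XFRMA_REPLAY_THRESH */
	TLV_ETIMER_THRESH = 1 << 3,	/* XFRMA_ETIMER_THRESH */
};

/*
 * Which TLVs the reply to a GETAE request carries, per the text:
 * the two value TLVs always, the threshold TLVs only when requested.
 */
unsigned int newae_reply_tlvs(unsigned int getae_flags)
{
	unsigned int tlvs = TLV_LTIME_VAL | TLV_REPLAY_VAL;

	if (getae_flags & AE_RTHR)
		tlvs |= TLV_REPLAY_THRESH;
	if (getae_flags & AE_ETHR)
		tlvs |= TLV_ETIMER_THRESH;
	return tlvs;
}

/* The etime sysctl counts in units of 100 ms, so the default of 10 is 1 second. */
unsigned int aevent_etime_to_ms(unsigned int sysctl_units)
{
	return sysctl_units * 100;
}
```

A caller that wants both thresholds would pass `AE_RTHR | AE_ETHR` and expect all four TLVs back; a bare request yields only the two value TLVs.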
+2 -2
Documentation/networking/xfrm_sysctl.rst -> Documentation/networking/xfrm/xfrm_sysctl.rst
···
 XFRM Syscall
 ============

-/proc/sys/net/core/xfrm_* Variables:
-====================================
+/proc/sys/net/core/xfrm_* Variables
+===================================

 xfrm_acq_expires - INTEGER
 	default 30 - hard timeout in seconds for acquire requests
+1
MAINTAINERS
···
 S:	Maintained
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec.git
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next.git
+F:	Documentation/networking/xfrm/
 F:	include/net/xfrm.h
 F:	include/uapi/linux/xfrm.h
 F:	net/ipv4/ah4.c
+2
net/key/af_key.c
···
 {
 	int err = proto_register(&key_proto, 0);

+	pr_warn_once("PFKEY is deprecated and scheduled to be removed in 2027, "
+		     "please contact the netdev mailing list\n");
 	if (err != 0)
 		goto out;

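The hunk above relies on the kernel's `pr_warn_once()` so the deprecation notice is logged a single time. As a rough userspace analogue, and only as an illustration of the warn-once idea, the sketch below guards the message with a flag. `warn_pfkey_deprecated_once()` and `pfkey_warnings_emitted()` are hypothetical names invented here, the flag is not thread-safe, and nothing in it reflects how the kernel macro is implemented.

```c
#include <stdbool.h>
#include <stdio.h>

/* Set after the first warning; later calls return without printing. */
static bool pfkey_warned;
/* Counter kept only so the example's behavior can be observed. */
static unsigned int pfkey_warning_count;

/* Print the deprecation notice the first time this is called, then stay quiet. */
void warn_pfkey_deprecated_once(void)
{
	if (pfkey_warned)
		return;
	pfkey_warned = true;
	pfkey_warning_count++;
	fputs("PFKEY is deprecated and scheduled to be removed in 2027, "
	      "please contact the netdev mailing list\n", stderr);
}

/* Number of times the notice was actually printed. */
unsigned int pfkey_warnings_emitted(void)
{
	return pfkey_warning_count;
}
```

Calling `warn_pfkey_deprecated_once()` from every entry point of a deprecated interface keeps the log readable while still reaching anyone who uses it.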
+7 -4
net/xfrm/Kconfig
···
 	select CRYPTO_DEFLATE

 config NET_KEY
-	tristate "PF_KEY sockets"
+	tristate "PF_KEY sockets (deprecated)"
 	select XFRM_ALGO
 	help
 	  PF_KEYv2 socket family, compatible to KAME ones.
-	  They are required if you are going to use IPsec tools ported
-	  from KAME.

-	  Say Y unless you know what you are doing.
+	  The PF_KEYv2 socket interface is deprecated and
+	  scheduled for removal. All maintained IKE daemons
+	  no longer need PF_KEY sockets. Please use the netlink
+	  interface (XFRM_USER) to configure IPsec.
+
+	  If unsure, say N.

 config NET_KEY_MIGRATE
 	bool "PF_KEY MIGRATE"
+15 -15
net/xfrm/xfrm_input.c
···
 			async = 1;
 			dev_put(skb->dev);
 			seq = XFRM_SKB_CB(skb)->seq.input.low;
+			spin_lock(&x->lock);
 			goto resume;
 		}
 		/* GRO call */
···
 				XFRM_INC_STATS(net, LINUX_MIB_XFRMINHDRERROR);
 				goto drop;
 			}
+
+			nexthdr = x->type_offload->input_tail(x, skb);
 		}

-		goto lock;
+		goto process;
 	}

 	family = XFRM_SPI_SKB_CB(skb)->family;
···
 			goto drop;
 		}

-lock:
+process:
+		seq_hi = htonl(xfrm_replay_seqhi(x, seq));
+
+		XFRM_SKB_CB(skb)->seq.input.low = seq;
+		XFRM_SKB_CB(skb)->seq.input.hi = seq_hi;
+
 		spin_lock(&x->lock);

 		if (unlikely(x->km.state != XFRM_STATE_VALID)) {
···
 			goto drop_unlock;
 		}

-		spin_unlock(&x->lock);
-
 		if (xfrm_tunnel_check(skb, x, family)) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEMODEERROR);
-			goto drop;
+			goto drop_unlock;
 		}

-		seq_hi = htonl(xfrm_replay_seqhi(x, seq));
-
-		XFRM_SKB_CB(skb)->seq.input.low = seq;
-		XFRM_SKB_CB(skb)->seq.input.hi = seq_hi;
-
-		if (crypto_done) {
-			nexthdr = x->type_offload->input_tail(x, skb);
-		} else {
+		if (!crypto_done) {
+			spin_unlock(&x->lock);
 			dev_hold(skb->dev);

 			nexthdr = x->type->input(x, skb);
···
 				return 0;

 			dev_put(skb->dev);
+			spin_lock(&x->lock);
 		}
 resume:
-		spin_lock(&x->lock);
 		if (nexthdr < 0) {
 			if (nexthdr == -EBADMSG) {
 				xfrm_audit_state_icvfail(x, skb,
···
 		/* only the first xfrm gets the encap type */
 		encap_type = 0;

-		if (xfrm_replay_recheck(x, skb, seq)) {
+		if (!crypto_done && xfrm_replay_recheck(x, skb, seq)) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATESEQERROR);
 			goto drop_unlock;
 		}
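The locking change in xfrm_input() can be hard to follow from the hunks alone. The following is a minimal userspace sketch of the resulting pattern, not the kernel code: the state lock is taken once for validation and stays held through the replay update when the NIC has already done the crypto, and it is released only around the software decrypt call, after which the replay check is repeated. `struct sa_state`, `sa_input()`, `sw_decrypt()` and the `lock_acquisitions` counter are hypothetical names invented for this illustration, and a pthread mutex stands in for the kernel spinlock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-SA state guarded by x->lock. */
struct sa_state {
	pthread_mutex_t lock;
	bool valid;
	unsigned int replay_seq;	/* highest sequence number accepted so far */
	unsigned int lock_acquisitions;	/* instrumentation for this example only */
};

static void sa_lock(struct sa_state *x)
{
	pthread_mutex_lock(&x->lock);
	x->lock_acquisitions++;
}

static void sa_unlock(struct sa_state *x)
{
	pthread_mutex_unlock(&x->lock);
}

/* Placeholder for software decryption; it runs with the lock dropped. */
static int sw_decrypt(void)
{
	return 0;
}

/* Returns 0 if the packet is accepted, -1 if it is dropped. */
int sa_input(struct sa_state *x, unsigned int seq, bool crypto_done)
{
	sa_lock(x);
	if (!x->valid || seq <= x->replay_seq)
		goto drop_unlock;

	if (!crypto_done) {
		int err;

		/* Software path: do not hold the lock across the slow call. */
		sa_unlock(x);
		err = sw_decrypt();
		sa_lock(x);
		/* The state may have moved while unlocked, so check replay again. */
		if (err || seq <= x->replay_seq)
			goto drop_unlock;
	}
	/* Offload path: the lock was never released, so no recheck is needed. */

	x->replay_seq = seq;
	sa_unlock(x);
	return 0;

drop_unlock:
	sa_unlock(x);
	return -1;
}
```

With this shape an offloaded packet takes the lock once, while a software-decrypted packet takes it twice and repeats the replay check after re-locking.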