Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mptcp: sysctl: add syn_retrans_before_tcp_fallback

The number of SYN + MPC retransmissions before falling back to TCP was
fixed to 2. This is certainly a good default value, but having a fixed
number can be a problem in some environments.

The current behaviour means that if all packets are dropped, there will
be:

- The initial SYN + MPC

- 2 retransmissions with MPC

- The next ones will be without MPTCP.

So it typically takes ~3 seconds before falling back to TCP. In networks
where temporary blackholes are unfortunately frequent, or when a client
tries to initiate connections while the network is not ready yet, this
can cause new connections to be established without MPTCP.

In such environments, it is now possible to increase the number of SYN
retransmissions with MPTCP options to make sure MPTCP is used.

Interesting values are:

- 0: the first retransmission will be done without MPTCP options: quite
aggressive, with a higher risk of detecting false-positive MPTCP
blackholes.

- >= 128: all SYN retransmissions will keep the MPTCP options: back to
the < 6.12 behaviour.

The default behaviour is not changed here.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250117-net-next-mptcp-syn_retrans_before_tcp_fallback-v1-1-ab4b187099b0@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Authored by Matthieu Baerts (NGI0) and committed by Jakub Kicinski
5b4fd353 50309d38

+33 -4
+16
Documentation/networking/mptcp-sysctl.rst
···
 	This is a per-namespace sysctl.
 
 	Default: 4
+
+syn_retrans_before_tcp_fallback - INTEGER
+	The number of SYN + MP_CAPABLE retransmissions before falling back to
+	TCP, i.e. dropping the MPTCP options. In other words, if all the packets
+	are dropped on the way, there will be:
+
+	* The initial SYN with MPTCP support
+	* This number of SYN retransmitted with MPTCP support
+	* The next SYN retransmissions will be without MPTCP support
+
+	0 means the first retransmission will be done without MPTCP options.
+	>= 128 means that all SYN retransmissions will keep the MPTCP options. A
+	lower number might increase false-positive MPTCP blackholes detections.
+	This is a per-namespace sysctl.
+
+	Default: 2
+17 -4
net/mptcp/ctrl.c
···
 	unsigned int close_timeout;
 	unsigned int stale_loss_cnt;
 	atomic_t active_disable_times;
+	u8 syn_retrans_before_tcp_fallback;
 	unsigned long active_disable_stamp;
 	u8 mptcp_enabled;
 	u8 checksum_enabled;
···
 	pernet->mptcp_enabled = 1;
 	pernet->add_addr_timeout = TCP_RTO_MAX;
 	pernet->blackhole_timeout = 3600;
+	pernet->syn_retrans_before_tcp_fallback = 2;
 	atomic_set(&pernet->active_disable_times, 0);
 	pernet->close_timeout = TCP_TIMEWAIT_LEN;
 	pernet->checksum_enabled = 0;
···
 		.proc_handler = proc_blackhole_detect_timeout,
 		.extra1 = SYSCTL_ZERO,
 	},
+	{
+		.procname = "syn_retrans_before_tcp_fallback",
+		.maxlen = sizeof(u8),
+		.mode = 0644,
+		.proc_handler = proc_dou8vec_minmax,
+	},
 };
 
 static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
···
 	/* table[7] is for available_schedulers which is read-only info */
 	table[8].data = &pernet->close_timeout;
 	table[9].data = &pernet->blackhole_timeout;
+	table[10].data = &pernet->syn_retrans_before_tcp_fallback;
 
 	hdr = register_net_sysctl_sz(net, MPTCP_SYSCTL_PATH, table,
 				     ARRAY_SIZE(mptcp_sysctl_table));
···
 void mptcp_active_detect_blackhole(struct sock *ssk, bool expired)
 {
 	struct mptcp_subflow_context *subflow;
-	u32 timeouts;
 
 	if (!sk_is_mptcp(ssk))
 		return;
 
-	timeouts = inet_csk(ssk)->icsk_retransmits;
 	subflow = mptcp_subflow_ctx(ssk);
 
 	if (subflow->request_mptcp && ssk->sk_state == TCP_SYN_SENT) {
-		if (timeouts == 2 || (timeouts < 2 && expired)) {
-			MPTCP_INC_STATS(sock_net(ssk), MPTCP_MIB_MPCAPABLEACTIVEDROP);
+		struct net *net = sock_net(ssk);
+		u8 timeouts, to_max;
+
+		timeouts = inet_csk(ssk)->icsk_retransmits;
+		to_max = mptcp_get_pernet(net)->syn_retrans_before_tcp_fallback;
+
+		if (timeouts == to_max || (timeouts < to_max && expired)) {
+			MPTCP_INC_STATS(net, MPTCP_MIB_MPCAPABLEACTIVEDROP);
 			subflow->mpc_drop = 1;
 			mptcp_subflow_early_fallback(mptcp_sk(subflow->conn), subflow);
 		} else {