Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

Merge branch 'mptcp-pm-code-reorganisation'

Matthieu Baerts says:

====================
mptcp: pm: code reorganisation

Before this series, the PM code was dispersed in different places:

- pm.c had common code for all PMs.

- pm_netlink.c was initially only about the in-kernel PM, but ended up
also getting exported common helpers, callbacks used by the different
PMs, NL events for PM userspace daemon, etc. quite confusing.

- pm_userspace.c had userspace PM only code, but it was using "specific"
in-kernel PM helpers according to their names.

To clarify the code, a reorganisation is suggested here, only by moving
code around, and small helper renaming to avoid confusions:

- pm_netlink.c now only contains common PM generic Netlink code:
- PM events: this code was already there
- shared helpers around Netlink code that were already there as well
- shared Netlink commands code from pm.c

- pm_kernel.c now contains only code that is specific to the in-kernel
PM. Now all functions are either called from:
- pm.c: events coming from the core, when this PM is being used
- pm_netlink.c: for shared Netlink commands
- mptcp_pm_gen.c: for Netlink commands specific to the in-kernel PM
- sockopt.c: for the exported counters per netns

- pm.c got many code from pm_netlink.c:
- helpers used from both PMs and not linked to Netlink
- callbacks used by different PMs, e.g. ADD_ADDR management
- some helpers have been renamed to remove the '_nl' prefix, and some
have been marked as 'static'.

- protocol.h has been updated accordingly:
- some helpers no longer need to be exported
- new ones needed to be exported: they have been prefixed if needed.

The code around the PM is now less confusing, which should help for the
maintenance in the long term, and the introduction of a PM Ops.

This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.

Note that checkpatch, when used with --max-line-length=80, will complain
about lines being over the 80 limits, but these warnings were already
there before moving the code around.

Also, patch 1 is not directly related to the code reorganisation, but it
was a remaining cleanup that we didn't upstream before, because it was
conflicting with another patch that has been sent for inclusion to the
net tree.
====================

Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-0-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+2045 -2032
+1 -1
net/mptcp/Makefile
··· 3 3 4 4 mptcp-y := protocol.o subflow.o options.o token.o crypto.o ctrl.o pm.o diag.o \ 5 5 mib.o pm_netlink.o sockopt.o pm_userspace.o fastopen.o sched.o \ 6 - mptcp_pm_gen.o 6 + mptcp_pm_gen.o pm_kernel.o 7 7 8 8 obj-$(CONFIG_SYN_COOKIES) += syncookies.o 9 9 obj-$(CONFIG_INET_MPTCP_DIAG) += mptcp_diag.o
+515 -138
net/mptcp/pm.c
··· 5 5 */ 6 6 #define pr_fmt(fmt) "MPTCP: " fmt 7 7 8 - #include <linux/kernel.h> 9 - #include <net/mptcp.h> 10 8 #include "protocol.h" 11 - 12 9 #include "mib.h" 13 - #include "mptcp_pm_gen.h" 10 + 11 + #define ADD_ADDR_RETRANS_MAX 3 12 + 13 + struct mptcp_pm_add_entry { 14 + struct list_head list; 15 + struct mptcp_addr_info addr; 16 + u8 retrans_times; 17 + struct timer_list add_timer; 18 + struct mptcp_sock *sock; 19 + }; 20 + 21 + /* path manager helpers */ 22 + 23 + /* if sk is ipv4 or ipv6_only allows only same-family local and remote addresses, 24 + * otherwise allow any matching local/remote pair 25 + */ 26 + bool mptcp_pm_addr_families_match(const struct sock *sk, 27 + const struct mptcp_addr_info *loc, 28 + const struct mptcp_addr_info *rem) 29 + { 30 + bool mptcp_is_v4 = sk->sk_family == AF_INET; 31 + 32 + #if IS_ENABLED(CONFIG_MPTCP_IPV6) 33 + bool loc_is_v4 = loc->family == AF_INET || ipv6_addr_v4mapped(&loc->addr6); 34 + bool rem_is_v4 = rem->family == AF_INET || ipv6_addr_v4mapped(&rem->addr6); 35 + 36 + if (mptcp_is_v4) 37 + return loc_is_v4 && rem_is_v4; 38 + 39 + if (ipv6_only_sock(sk)) 40 + return !loc_is_v4 && !rem_is_v4; 41 + 42 + return loc_is_v4 == rem_is_v4; 43 + #else 44 + return mptcp_is_v4 && loc->family == AF_INET && rem->family == AF_INET; 45 + #endif 46 + } 47 + 48 + bool mptcp_addresses_equal(const struct mptcp_addr_info *a, 49 + const struct mptcp_addr_info *b, bool use_port) 50 + { 51 + bool addr_equals = false; 52 + 53 + if (a->family == b->family) { 54 + if (a->family == AF_INET) 55 + addr_equals = a->addr.s_addr == b->addr.s_addr; 56 + #if IS_ENABLED(CONFIG_MPTCP_IPV6) 57 + else 58 + addr_equals = ipv6_addr_equal(&a->addr6, &b->addr6); 59 + } else if (a->family == AF_INET) { 60 + if (ipv6_addr_v4mapped(&b->addr6)) 61 + addr_equals = a->addr.s_addr == b->addr6.s6_addr32[3]; 62 + } else if (b->family == AF_INET) { 63 + if (ipv6_addr_v4mapped(&a->addr6)) 64 + addr_equals = a->addr6.s6_addr32[3] == b->addr.s_addr; 65 + #endif 66 + } 67 + 68 + if (!addr_equals) 69 + return false; 70 + if (!use_port) 71 + return true; 72 + 73 + return a->port == b->port; 74 + } 75 + 76 + void mptcp_local_address(const struct sock_common *skc, 77 + struct mptcp_addr_info *addr) 78 + { 79 + addr->family = skc->skc_family; 80 + addr->port = htons(skc->skc_num); 81 + if (addr->family == AF_INET) 82 + addr->addr.s_addr = skc->skc_rcv_saddr; 83 + #if IS_ENABLED(CONFIG_MPTCP_IPV6) 84 + else if (addr->family == AF_INET6) 85 + addr->addr6 = skc->skc_v6_rcv_saddr; 86 + #endif 87 + } 88 + 89 + void mptcp_remote_address(const struct sock_common *skc, 90 + struct mptcp_addr_info *addr) 91 + { 92 + addr->family = skc->skc_family; 93 + addr->port = skc->skc_dport; 94 + if (addr->family == AF_INET) 95 + addr->addr.s_addr = skc->skc_daddr; 96 + #if IS_ENABLED(CONFIG_MPTCP_IPV6) 97 + else if (addr->family == AF_INET6) 98 + addr->addr6 = skc->skc_v6_daddr; 99 + #endif 100 + } 101 + 102 + static bool mptcp_pm_is_init_remote_addr(struct mptcp_sock *msk, 103 + const struct mptcp_addr_info *remote) 104 + { 105 + struct mptcp_addr_info mpc_remote; 106 + 107 + mptcp_remote_address((struct sock_common *)msk, &mpc_remote); 108 + return mptcp_addresses_equal(&mpc_remote, remote, remote->port); 109 + } 110 + 111 + bool mptcp_lookup_subflow_by_saddr(const struct list_head *list, 112 + const struct mptcp_addr_info *saddr) 113 + { 114 + struct mptcp_subflow_context *subflow; 115 + struct mptcp_addr_info cur; 116 + struct sock_common *skc; 117 + 118 + list_for_each_entry(subflow, list, node) { 119 + skc = (struct sock_common *)mptcp_subflow_tcp_sock(subflow); 120 + 121 + mptcp_local_address(skc, &cur); 122 + if (mptcp_addresses_equal(&cur, saddr, saddr->port)) 123 + return true; 124 + } 125 + 126 + return false; 127 + } 128 + 129 + static struct mptcp_pm_add_entry * 130 + mptcp_lookup_anno_list_by_saddr(const struct mptcp_sock *msk, 131 + const struct mptcp_addr_info *addr) 132 + { 133 + struct mptcp_pm_add_entry *entry; 134 + 135 + lockdep_assert_held(&msk->pm.lock); 136 + 137 + list_for_each_entry(entry, &msk->pm.anno_list, list) { 138 + if (mptcp_addresses_equal(&entry->addr, addr, true)) 139 + return entry; 140 + } 141 + 142 + return NULL; 143 + } 144 + 145 + bool mptcp_remove_anno_list_by_saddr(struct mptcp_sock *msk, 146 + const struct mptcp_addr_info *addr) 147 + { 148 + struct mptcp_pm_add_entry *entry; 149 + 150 + entry = mptcp_pm_del_add_timer(msk, addr, false); 151 + kfree(entry); 152 + return entry; 153 + } 154 + 155 + bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk) 156 + { 157 + struct mptcp_pm_add_entry *entry; 158 + struct mptcp_addr_info saddr; 159 + bool ret = false; 160 + 161 + mptcp_local_address((struct sock_common *)sk, &saddr); 162 + 163 + spin_lock_bh(&msk->pm.lock); 164 + list_for_each_entry(entry, &msk->pm.anno_list, list) { 165 + if (mptcp_addresses_equal(&entry->addr, &saddr, true)) { 166 + ret = true; 167 + goto out; 168 + } 169 + } 170 + 171 + out: 172 + spin_unlock_bh(&msk->pm.lock); 173 + return ret; 174 + } 175 + 176 + static void __mptcp_pm_send_ack(struct mptcp_sock *msk, 177 + struct mptcp_subflow_context *subflow, 178 + bool prio, bool backup) 179 + { 180 + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 181 + bool slow; 182 + 183 + pr_debug("send ack for %s\n", 184 + prio ? "mp_prio" : 185 + (mptcp_pm_should_add_signal(msk) ? "add_addr" : "rm_addr")); 186 + 187 + slow = lock_sock_fast(ssk); 188 + if (prio) { 189 + subflow->send_mp_prio = 1; 190 + subflow->request_bkup = backup; 191 + } 192 + 193 + __mptcp_subflow_send_ack(ssk); 194 + unlock_sock_fast(ssk, slow); 195 + } 196 + 197 + void mptcp_pm_send_ack(struct mptcp_sock *msk, 198 + struct mptcp_subflow_context *subflow, 199 + bool prio, bool backup) 200 + { 201 + spin_unlock_bh(&msk->pm.lock); 202 + __mptcp_pm_send_ack(msk, subflow, prio, backup); 203 + spin_lock_bh(&msk->pm.lock); 204 + } 205 + 206 + void mptcp_pm_addr_send_ack(struct mptcp_sock *msk) 207 + { 208 + struct mptcp_subflow_context *subflow, *alt = NULL; 209 + 210 + msk_owned_by_me(msk); 211 + lockdep_assert_held(&msk->pm.lock); 212 + 213 + if (!mptcp_pm_should_add_signal(msk) && 214 + !mptcp_pm_should_rm_signal(msk)) 215 + return; 216 + 217 + mptcp_for_each_subflow(msk, subflow) { 218 + if (__mptcp_subflow_active(subflow)) { 219 + if (!subflow->stale) { 220 + mptcp_pm_send_ack(msk, subflow, false, false); 221 + return; 222 + } 223 + 224 + if (!alt) 225 + alt = subflow; 226 + } 227 + } 228 + 229 + if (alt) 230 + mptcp_pm_send_ack(msk, alt, false, false); 231 + } 232 + 233 + int mptcp_pm_mp_prio_send_ack(struct mptcp_sock *msk, 234 + struct mptcp_addr_info *addr, 235 + struct mptcp_addr_info *rem, 236 + u8 bkup) 237 + { 238 + struct mptcp_subflow_context *subflow; 239 + 240 + pr_debug("bkup=%d\n", bkup); 241 + 242 + mptcp_for_each_subflow(msk, subflow) { 243 + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 244 + struct mptcp_addr_info local, remote; 245 + 246 + mptcp_local_address((struct sock_common *)ssk, &local); 247 + if (!mptcp_addresses_equal(&local, addr, addr->port)) 248 + continue; 249 + 250 + if (rem && rem->family != AF_UNSPEC) { 251 + mptcp_remote_address((struct sock_common *)ssk, &remote); 252 + if (!mptcp_addresses_equal(&remote, rem, rem->port)) 253 + continue; 254 + } 255 + 256 + __mptcp_pm_send_ack(msk, subflow, true, bkup); 257 + return 0; 258 + } 259 + 260 + return -EINVAL; 261 + } 262 + 263 + static void mptcp_pm_add_timer(struct timer_list *timer) 264 + { 265 + struct mptcp_pm_add_entry *entry = from_timer(entry, timer, add_timer); 266 + struct mptcp_sock *msk = entry->sock; 267 + struct sock *sk = (struct sock *)msk; 268 + 269 + pr_debug("msk=%p\n", msk); 270 + 271 + if (!msk) 272 + return; 273 + 274 + if (inet_sk_state_load(sk) == TCP_CLOSE) 275 + return; 276 + 277 + if (!entry->addr.id) 278 + return; 279 + 280 + if (mptcp_pm_should_add_signal_addr(msk)) { 281 + sk_reset_timer(sk, timer, jiffies + TCP_RTO_MAX / 8); 282 + goto out; 283 + } 284 + 285 + spin_lock_bh(&msk->pm.lock); 286 + 287 + if (!mptcp_pm_should_add_signal_addr(msk)) { 288 + pr_debug("retransmit ADD_ADDR id=%d\n", entry->addr.id); 289 + mptcp_pm_announce_addr(msk, &entry->addr, false); 290 + mptcp_pm_add_addr_send_ack(msk); 291 + entry->retrans_times++; 292 + } 293 + 294 + if (entry->retrans_times < ADD_ADDR_RETRANS_MAX) 295 + sk_reset_timer(sk, timer, 296 + jiffies + mptcp_get_add_addr_timeout(sock_net(sk))); 297 + 298 + spin_unlock_bh(&msk->pm.lock); 299 + 300 + if (entry->retrans_times == ADD_ADDR_RETRANS_MAX) 301 + mptcp_pm_subflow_established(msk); 302 + 303 + out: 304 + __sock_put(sk); 305 + } 306 + 307 + struct mptcp_pm_add_entry * 308 + mptcp_pm_del_add_timer(struct mptcp_sock *msk, 309 + const struct mptcp_addr_info *addr, bool check_id) 310 + { 311 + struct mptcp_pm_add_entry *entry; 312 + struct sock *sk = (struct sock *)msk; 313 + struct timer_list *add_timer = NULL; 314 + 315 + spin_lock_bh(&msk->pm.lock); 316 + entry = mptcp_lookup_anno_list_by_saddr(msk, addr); 317 + if (entry && (!check_id || entry->addr.id == addr->id)) { 318 + entry->retrans_times = ADD_ADDR_RETRANS_MAX; 319 + add_timer = &entry->add_timer; 320 + } 321 + if (!check_id && entry) 322 + list_del(&entry->list); 323 + spin_unlock_bh(&msk->pm.lock); 324 + 325 + /* no lock, because sk_stop_timer_sync() is calling del_timer_sync() */ 326 + if (add_timer) 327 + sk_stop_timer_sync(sk, add_timer); 328 + 329 + return entry; 330 + } 331 + 332 + bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, 333 + const struct mptcp_addr_info *addr) 334 + { 335 + struct mptcp_pm_add_entry *add_entry = NULL; 336 + struct sock *sk = (struct sock *)msk; 337 + struct net *net = sock_net(sk); 338 + 339 + lockdep_assert_held(&msk->pm.lock); 340 + 341 + add_entry = mptcp_lookup_anno_list_by_saddr(msk, addr); 342 + 343 + if (add_entry) { 344 + if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk))) 345 + return false; 346 + 347 + sk_reset_timer(sk, &add_entry->add_timer, 348 + jiffies + mptcp_get_add_addr_timeout(net)); 349 + return true; 350 + } 351 + 352 + add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC); 353 + if (!add_entry) 354 + return false; 355 + 356 + list_add(&add_entry->list, &msk->pm.anno_list); 357 + 358 + add_entry->addr = *addr; 359 + add_entry->sock = msk; 360 + add_entry->retrans_times = 0; 361 + 362 + timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); 363 + sk_reset_timer(sk, &add_entry->add_timer, 364 + jiffies + mptcp_get_add_addr_timeout(net)); 365 + 366 + return true; 367 + } 368 + 369 + static void mptcp_pm_free_anno_list(struct mptcp_sock *msk) 370 + { 371 + struct mptcp_pm_add_entry *entry, *tmp; 372 + struct sock *sk = (struct sock *)msk; 373 + LIST_HEAD(free_list); 374 + 375 + pr_debug("msk=%p\n", msk); 376 + 377 + spin_lock_bh(&msk->pm.lock); 378 + list_splice_init(&msk->pm.anno_list, &free_list); 379 + spin_unlock_bh(&msk->pm.lock); 380 + 381 + list_for_each_entry_safe(entry, tmp, &free_list, list) { 382 + sk_stop_timer_sync(sk, &entry->add_timer); 383 + kfree(entry); 384 + } 385 + } 14 386 15 387 /* path manager command handlers */ 16 388 ··· 429 57 msk->pm.rm_list_tx = *rm_list; 430 58 rm_addr |= BIT(MPTCP_RM_ADDR_SIGNAL); 431 59 WRITE_ONCE(msk->pm.addr_signal, rm_addr); 432 - mptcp_pm_nl_addr_send_ack(msk); 60 + mptcp_pm_addr_send_ack(msk); 433 61 return 0; 434 62 } 435 63 ··· 603 231 __MPTCP_INC_STATS(sock_net((struct sock *)msk), MPTCP_MIB_ADDADDRDROP); 604 232 } 605 233 /* id0 should not have a different address */ 606 - } else if ((addr->id == 0 && !mptcp_pm_nl_is_init_remote_addr(msk, addr)) || 234 + } else if ((addr->id == 0 && !mptcp_pm_is_init_remote_addr(msk, addr)) || 607 235 (addr->id > 0 && !READ_ONCE(pm->accept_addr))) { 608 236 mptcp_pm_announce_addr(msk, addr, true); 609 237 mptcp_pm_add_addr_send_ack(msk); ··· 640 268 return; 641 269 642 270 mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_SEND_ACK); 271 + } 272 + 273 + static void mptcp_pm_rm_addr_or_subflow(struct mptcp_sock *msk, 274 + const struct mptcp_rm_list *rm_list, 275 + enum linux_mptcp_mib_field rm_type) 276 + { 277 + struct mptcp_subflow_context *subflow, *tmp; 278 + struct sock *sk = (struct sock *)msk; 279 + u8 i; 280 + 281 + pr_debug("%s rm_list_nr %d\n", 282 + rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow", rm_list->nr); 283 + 284 + msk_owned_by_me(msk); 285 + 286 + if (sk->sk_state == TCP_LISTEN) 287 + return; 288 + 289 + if (!rm_list->nr) 290 + return; 291 + 292 + if (list_empty(&msk->conn_list)) 293 + return; 294 + 295 + for (i = 0; i < rm_list->nr; i++) { 296 + u8 rm_id = rm_list->ids[i]; 297 + bool removed = false; 298 + 299 + mptcp_for_each_subflow_safe(msk, subflow, tmp) { 300 + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 301 + u8 remote_id = READ_ONCE(subflow->remote_id); 302 + int how = RCV_SHUTDOWN | SEND_SHUTDOWN; 303 + u8 id = subflow_get_local_id(subflow); 304 + 305 + if ((1 << inet_sk_state_load(ssk)) & 306 + (TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING | TCPF_CLOSE)) 307 + continue; 308 + if (rm_type == MPTCP_MIB_RMADDR && remote_id != rm_id) 309 + continue; 310 + if (rm_type == MPTCP_MIB_RMSUBFLOW && id != rm_id) 311 + continue; 312 + 313 + pr_debug(" -> %s rm_list_ids[%d]=%u local_id=%u remote_id=%u mpc_id=%u\n", 314 + rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow", 315 + i, rm_id, id, remote_id, msk->mpc_endpoint_id); 316 + spin_unlock_bh(&msk->pm.lock); 317 + mptcp_subflow_shutdown(sk, ssk, how); 318 + removed |= subflow->request_join; 319 + 320 + /* the following takes care of updating the subflows counter */ 321 + mptcp_close_ssk(sk, ssk, subflow); 322 + spin_lock_bh(&msk->pm.lock); 323 + 324 + if (rm_type == MPTCP_MIB_RMSUBFLOW) 325 + __MPTCP_INC_STATS(sock_net(sk), rm_type); 326 + } 327 + 328 + if (rm_type == MPTCP_MIB_RMADDR) { 329 + __MPTCP_INC_STATS(sock_net(sk), rm_type); 330 + if (removed && mptcp_pm_is_kernel(msk)) 331 + mptcp_pm_nl_rm_addr(msk, rm_id); 332 + } 333 + } 334 + } 335 + 336 + static void mptcp_pm_rm_addr_recv(struct mptcp_sock *msk) 337 + { 338 + mptcp_pm_rm_addr_or_subflow(msk, &msk->pm.rm_list_rx, MPTCP_MIB_RMADDR); 339 + } 340 + 341 + void mptcp_pm_rm_subflow(struct mptcp_sock *msk, 342 + const struct mptcp_rm_list *rm_list) 343 + { 344 + mptcp_pm_rm_addr_or_subflow(msk, rm_list, MPTCP_MIB_RMSUBFLOW); 643 345 } 644 346 645 347 void mptcp_pm_rm_addr_received(struct mptcp_sock *msk, ··· 770 324 WRITE_ONCE(subflow->fail_tout, 0); 771 325 } 772 326 } 773 - 774 - /* path manager helpers */ 775 327 776 328 bool mptcp_pm_add_addr_signal(struct mptcp_sock *msk, const struct sk_buff *skb, 777 329 unsigned int opt_size, unsigned int remaining, ··· 850 406 851 407 int mptcp_pm_get_local_id(struct mptcp_sock *msk, struct sock_common *skc) 852 408 { 853 - struct mptcp_addr_info skc_local; 409 + struct mptcp_pm_addr_entry skc_local = { 0 }; 854 410 struct mptcp_addr_info msk_local; 855 411 856 412 if (WARN_ON_ONCE(!msk)) ··· 860 416 * addr 861 417 */ 862 418 mptcp_local_address((struct sock_common *)msk, &msk_local); 863 - mptcp_local_address((struct sock_common *)skc, &skc_local); 864 - if (mptcp_addresses_equal(&msk_local, &skc_local, false)) 419 + mptcp_local_address((struct sock_common *)skc, &skc_local.addr); 420 + if (mptcp_addresses_equal(&msk_local, &skc_local.addr, false)) 865 421 return 0; 422 + 423 + skc_local.addr.id = 0; 424 + skc_local.flags = MPTCP_PM_ADDR_FLAG_IMPLICIT; 866 425 867 426 if (mptcp_pm_is_userspace(msk)) 868 427 return mptcp_userspace_pm_get_local_id(msk, &skc_local); ··· 884 437 return mptcp_pm_nl_is_backup(msk, &skc_local); 885 438 } 886 439 887 - static int mptcp_pm_get_addr(u8 id, struct mptcp_pm_addr_entry *addr, 888 - struct genl_info *info) 440 + static void mptcp_pm_subflows_chk_stale(const struct mptcp_sock *msk, struct sock *ssk) 889 441 { 890 - if (info->attrs[MPTCP_PM_ATTR_TOKEN]) 891 - return mptcp_userspace_pm_get_addr(id, addr, info); 892 - return mptcp_pm_nl_get_addr(id, addr, info); 893 - } 442 + struct mptcp_subflow_context *iter, *subflow = mptcp_subflow_ctx(ssk); 443 + struct sock *sk = (struct sock *)msk; 444 + unsigned int active_max_loss_cnt; 445 + struct net *net = sock_net(sk); 446 + unsigned int stale_loss_cnt; 447 + bool slow; 894 448 895 - int mptcp_pm_nl_get_addr_doit(struct sk_buff *skb, struct genl_info *info) 896 - { 897 - struct mptcp_pm_addr_entry addr; 898 - struct nlattr *attr; 899 - struct sk_buff *msg; 900 - void *reply; 901 - int ret; 449 + stale_loss_cnt = mptcp_stale_loss_cnt(net); 450 + if (subflow->stale || !stale_loss_cnt || subflow->stale_count <= stale_loss_cnt) 451 + return; 902 452 903 - if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ENDPOINT_ADDR)) 904 - return -EINVAL; 453 + /* look for another available subflow not in loss state */ 454 + active_max_loss_cnt = max_t(int, stale_loss_cnt - 1, 1); 455 + mptcp_for_each_subflow(msk, iter) { 456 + if (iter != subflow && mptcp_subflow_active(iter) && 457 + iter->stale_count < active_max_loss_cnt) { 458 + /* we have some alternatives, try to mark this subflow as idle ...*/ 459 + slow = lock_sock_fast(ssk); 460 + if (!tcp_rtx_and_write_queues_empty(ssk)) { 461 + subflow->stale = 1; 462 + __mptcp_retransmit_pending_data(sk); 463 + MPTCP_INC_STATS(net, MPTCP_MIB_SUBFLOWSTALE); 464 + } 465 + unlock_sock_fast(ssk, slow); 905 466 906 - attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR]; 907 - ret = mptcp_pm_parse_entry(attr, info, false, &addr); 908 - if (ret < 0) 909 - return ret; 910 - 911 - msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); 912 - if (!msg) 913 - return -ENOMEM; 914 - 915 - reply = genlmsg_put_reply(msg, info, &mptcp_genl_family, 0, 916 - info->genlhdr->cmd); 917 - if (!reply) { 918 - GENL_SET_ERR_MSG(info, "not enough space in Netlink message"); 919 - ret = -EMSGSIZE; 920 - goto fail; 467 + /* always try to push the pending data regardless of re-injections: 468 + * we can possibly use backup subflows now, and subflow selection 469 + * is cheap under the msk socket lock 470 + */ 471 + __mptcp_push_pending(sk, 0); 472 + return; 473 + } 921 474 } 922 - 923 - ret = mptcp_pm_get_addr(addr.addr.id, &addr, info); 924 - if (ret) { 925 - NL_SET_ERR_MSG_ATTR(info->extack, attr, "address not found"); 926 - goto fail; 927 - } 928 - 929 - ret = mptcp_nl_fill_addr(msg, &addr); 930 - if (ret) 931 - goto fail; 932 - 933 - genlmsg_end(msg, reply); 934 - ret = genlmsg_reply(msg, info); 935 - return ret; 936 - 937 - fail: 938 - nlmsg_free(msg); 939 - return ret; 940 - } 941 - 942 - int mptcp_pm_genl_fill_addr(struct sk_buff *msg, 943 - struct netlink_callback *cb, 944 - struct mptcp_pm_addr_entry *entry) 945 - { 946 - void *hdr; 947 - 948 - hdr = genlmsg_put(msg, NETLINK_CB(cb->skb).portid, 949 - cb->nlh->nlmsg_seq, &mptcp_genl_family, 950 - NLM_F_MULTI, MPTCP_PM_CMD_GET_ADDR); 951 - if (!hdr) 952 - return -EINVAL; 953 - 954 - if (mptcp_nl_fill_addr(msg, entry) < 0) { 955 - genlmsg_cancel(msg, hdr); 956 - return -EINVAL; 957 - } 958 - 959 - genlmsg_end(msg, hdr); 960 - return 0; 961 - } 962 - 963 - static int mptcp_pm_dump_addr(struct sk_buff *msg, struct netlink_callback *cb) 964 - { 965 - const struct genl_info *info = genl_info_dump(cb); 966 - 967 - if (info->attrs[MPTCP_PM_ATTR_TOKEN]) 968 - return mptcp_userspace_pm_dump_addr(msg, cb); 969 - return mptcp_pm_nl_dump_addr(msg, cb); 970 - } 971 - 972 - int mptcp_pm_nl_get_addr_dumpit(struct sk_buff *msg, 973 - struct netlink_callback *cb) 974 - { 975 - return mptcp_pm_dump_addr(msg, cb); 976 - } 977 - 978 - static int mptcp_pm_set_flags(struct genl_info *info) 979 - { 980 - struct mptcp_pm_addr_entry loc = { .addr = { .family = AF_UNSPEC }, }; 981 - struct nlattr *attr_loc; 982 - int ret = -EINVAL; 983 - 984 - if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ATTR_ADDR)) 985 - return ret; 986 - 987 - attr_loc = info->attrs[MPTCP_PM_ATTR_ADDR]; 988 - ret = mptcp_pm_parse_entry(attr_loc, info, false, &loc); 989 - if (ret < 0) 990 - return ret; 991 - 992 - if (info->attrs[MPTCP_PM_ATTR_TOKEN]) 993 - return mptcp_userspace_pm_set_flags(&loc, info); 994 - return mptcp_pm_nl_set_flags(&loc, info); 995 - } 996 - 997 - int mptcp_pm_nl_set_flags_doit(struct sk_buff *skb, struct genl_info *info) 998 - { 999 - return mptcp_pm_set_flags(info); 1000 475 } 1001 476 1002 477 void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk) ··· 933 564 } else if (subflow->stale_rcv_tstamp == rcv_tstamp) { 934 565 if (subflow->stale_count < U8_MAX) 935 566 subflow->stale_count++; 936 - mptcp_pm_nl_subflow_chk_stale(msk, ssk); 567 + mptcp_pm_subflows_chk_stale(msk, ssk); 937 568 } else { 938 569 subflow->stale_count = 0; 939 570 mptcp_subflow_set_active(subflow); 940 571 } 941 572 } 942 573 943 - /* if sk is ipv4 or ipv6_only allows only same-family local and remote addresses, 944 - * otherwise allow any matching local/remote pair 945 - */ 946 - bool mptcp_pm_addr_families_match(const struct sock *sk, 947 - const struct mptcp_addr_info *loc, 948 - const struct mptcp_addr_info *rem) 574 + void mptcp_pm_worker(struct mptcp_sock *msk) 949 575 { 950 - bool mptcp_is_v4 = sk->sk_family == AF_INET; 576 + struct mptcp_pm_data *pm = &msk->pm; 951 577 952 - #if IS_ENABLED(CONFIG_MPTCP_IPV6) 953 - bool loc_is_v4 = loc->family == AF_INET || ipv6_addr_v4mapped(&loc->addr6); 954 - bool rem_is_v4 = rem->family == AF_INET || ipv6_addr_v4mapped(&rem->addr6); 578 + msk_owned_by_me(msk); 955 579 956 - if (mptcp_is_v4) 957 - return loc_is_v4 && rem_is_v4; 580 + if (!(pm->status & MPTCP_PM_WORK_MASK)) 581 + return; 958 582 959 - if (ipv6_only_sock(sk)) 960 - return !loc_is_v4 && !rem_is_v4; 583 + spin_lock_bh(&msk->pm.lock); 961 584 962 - return loc_is_v4 == rem_is_v4; 963 - #else 964 - return mptcp_is_v4 && loc->family == AF_INET && rem->family == AF_INET; 965 - #endif 585 + pr_debug("msk=%p status=%x\n", msk, pm->status); 586 + if (pm->status & BIT(MPTCP_PM_ADD_ADDR_SEND_ACK)) { 587 + pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_SEND_ACK); 588 + mptcp_pm_addr_send_ack(msk); 589 + } 590 + if (pm->status & BIT(MPTCP_PM_RM_ADDR_RECEIVED)) { 591 + pm->status &= ~BIT(MPTCP_PM_RM_ADDR_RECEIVED); 592 + mptcp_pm_rm_addr_recv(msk); 593 + } 594 + __mptcp_pm_kernel_worker(msk); 595 + 596 + spin_unlock_bh(&msk->pm.lock); 597 + } 598 + 599 + void mptcp_pm_destroy(struct mptcp_sock *msk) 600 + { 601 + mptcp_pm_free_anno_list(msk); 602 + 603 + if (mptcp_pm_is_userspace(msk)) 604 + mptcp_userspace_pm_free_local_addr_list(msk); 966 605 } 967 606 968 607 void mptcp_pm_data_reset(struct mptcp_sock *msk)
+1410
net/mptcp/pm_kernel.c
··· 1 + // SPDX-License-Identifier: GPL-2.0 2 + /* Multipath TCP 3 + * 4 + * Copyright (c) 2025, Matthieu Baerts. 5 + */ 6 + 7 + #define pr_fmt(fmt) "MPTCP: " fmt 8 + 9 + #include <net/netns/generic.h> 10 + 11 + #include "protocol.h" 12 + #include "mib.h" 13 + #include "mptcp_pm_gen.h" 14 + 15 + static int pm_nl_pernet_id; 16 + 17 + struct pm_nl_pernet { 18 + /* protects pernet updates */ 19 + spinlock_t lock; 20 + struct list_head local_addr_list; 21 + unsigned int addrs; 22 + unsigned int stale_loss_cnt; 23 + unsigned int add_addr_signal_max; 24 + unsigned int add_addr_accept_max; 25 + unsigned int local_addr_max; 26 + unsigned int subflows_max; 27 + unsigned int next_id; 28 + DECLARE_BITMAP(id_bitmap, MPTCP_PM_MAX_ADDR_ID + 1); 29 + }; 30 + 31 + #define MPTCP_PM_ADDR_MAX 8 32 + 33 + static struct pm_nl_pernet *pm_nl_get_pernet(const struct net *net) 34 + { 35 + return net_generic(net, pm_nl_pernet_id); 36 + } 37 + 38 + static struct pm_nl_pernet * 39 + pm_nl_get_pernet_from_msk(const struct mptcp_sock *msk) 40 + { 41 + return pm_nl_get_pernet(sock_net((struct sock *)msk)); 42 + } 43 + 44 + static struct pm_nl_pernet *genl_info_pm_nl(struct genl_info *info) 45 + { 46 + return pm_nl_get_pernet(genl_info_net(info)); 47 + } 48 + 49 + unsigned int mptcp_pm_get_add_addr_signal_max(const struct mptcp_sock *msk) 50 + { 51 + const struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 52 + 53 + return READ_ONCE(pernet->add_addr_signal_max); 54 + } 55 + EXPORT_SYMBOL_GPL(mptcp_pm_get_add_addr_signal_max); 56 + 57 + unsigned int mptcp_pm_get_add_addr_accept_max(const struct mptcp_sock *msk) 58 + { 59 + struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 60 + 61 + return READ_ONCE(pernet->add_addr_accept_max); 62 + } 63 + EXPORT_SYMBOL_GPL(mptcp_pm_get_add_addr_accept_max); 64 + 65 + unsigned int mptcp_pm_get_subflows_max(const struct mptcp_sock *msk) 66 + { 67 + struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 68 + 69 + return READ_ONCE(pernet->subflows_max); 70 + } 71 + EXPORT_SYMBOL_GPL(mptcp_pm_get_subflows_max); 72 + 73 + unsigned int mptcp_pm_get_local_addr_max(const struct mptcp_sock *msk) 74 + { 75 + struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 76 + 77 + return READ_ONCE(pernet->local_addr_max); 78 + } 79 + EXPORT_SYMBOL_GPL(mptcp_pm_get_local_addr_max); 80 + 81 + static bool lookup_subflow_by_daddr(const struct list_head *list, 82 + const struct mptcp_addr_info *daddr) 83 + { 84 + struct mptcp_subflow_context *subflow; 85 + struct mptcp_addr_info cur; 86 + 87 + list_for_each_entry(subflow, list, node) { 88 + struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 89 + 90 + if (!((1 << inet_sk_state_load(ssk)) & 91 + (TCPF_ESTABLISHED | TCPF_SYN_SENT | TCPF_SYN_RECV))) 92 + continue; 93 + 94 + mptcp_remote_address((struct sock_common *)ssk, &cur); 95 + if (mptcp_addresses_equal(&cur, daddr, daddr->port)) 96 + return true; 97 + } 98 + 99 + return false; 100 + } 101 + 102 + static bool 103 + select_local_address(const struct pm_nl_pernet *pernet, 104 + const struct mptcp_sock *msk, 105 + struct mptcp_pm_local *new_local) 106 + { 107 + struct mptcp_pm_addr_entry *entry; 108 + bool found = false; 109 + 110 + msk_owned_by_me(msk); 111 + 112 + rcu_read_lock(); 113 + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { 114 + if (!(entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW)) 115 + continue; 116 + 117 + if (!test_bit(entry->addr.id, msk->pm.id_avail_bitmap)) 118 + continue; 119 + 120 + new_local->addr = entry->addr; 121 + new_local->flags = entry->flags; 122 + new_local->ifindex = entry->ifindex; 123 + found = true; 124 + break; 125 + } 126 + rcu_read_unlock(); 127 + 128 + return found; 129 + } 130 + 131 + static bool 132 + select_signal_address(struct pm_nl_pernet *pernet, const struct mptcp_sock *msk, 133 + struct mptcp_pm_local *new_local) 134 + { 135 + struct mptcp_pm_addr_entry *entry; 136 + bool found = false; 137 + 138 + rcu_read_lock(); 139 + /* do not keep any additional per socket state, just signal 140 + * the address list in order. 141 + * Note: removal from the local address list during the msk life-cycle 142 + * can lead to additional addresses not being announced. 143 + */ 144 + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { 145 + if (!test_bit(entry->addr.id, msk->pm.id_avail_bitmap)) 146 + continue; 147 + 148 + if (!(entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL)) 149 + continue; 150 + 151 + new_local->addr = entry->addr; 152 + new_local->flags = entry->flags; 153 + new_local->ifindex = entry->ifindex; 154 + found = true; 155 + break; 156 + } 157 + rcu_read_unlock(); 158 + 159 + return found; 160 + } 161 + 162 + /* Fill all the remote addresses into the array addrs[], 163 + * and return the array size. 164 + */ 165 + static unsigned int fill_remote_addresses_vec(struct mptcp_sock *msk, 166 + struct mptcp_addr_info *local, 167 + bool fullmesh, 168 + struct mptcp_addr_info *addrs) 169 + { 170 + bool deny_id0 = READ_ONCE(msk->pm.remote_deny_join_id0); 171 + struct sock *sk = (struct sock *)msk, *ssk; 172 + struct mptcp_subflow_context *subflow; 173 + struct mptcp_addr_info remote = { 0 }; 174 + unsigned int subflows_max; 175 + int i = 0; 176 + 177 + subflows_max = mptcp_pm_get_subflows_max(msk); 178 + mptcp_remote_address((struct sock_common *)sk, &remote); 179 + 180 + /* Non-fullmesh endpoint, fill in the single entry 181 + * corresponding to the primary MPC subflow remote address 182 + */ 183 + if (!fullmesh) { 184 + if (deny_id0) 185 + return 0; 186 + 187 + if (!mptcp_pm_addr_families_match(sk, local, &remote)) 188 + return 0; 189 + 190 + msk->pm.subflows++; 191 + addrs[i++] = remote; 192 + } else { 193 + DECLARE_BITMAP(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1); 194 + 195 + /* Forbid creation of new subflows matching existing 196 + * ones, possibly already created by incoming ADD_ADDR 197 + */ 198 + bitmap_zero(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1); 199 + mptcp_for_each_subflow(msk, subflow) 200 + if (READ_ONCE(subflow->local_id) == local->id) 201 + __set_bit(subflow->remote_id, unavail_id); 202 + 203 + mptcp_for_each_subflow(msk, subflow) { 204 + ssk = mptcp_subflow_tcp_sock(subflow); 205 + mptcp_remote_address((struct sock_common *)ssk, &addrs[i]); 206 + addrs[i].id = READ_ONCE(subflow->remote_id); 207 + if (deny_id0 && !addrs[i].id) 208 + continue; 209 + 210 + if (test_bit(addrs[i].id, unavail_id)) 211 + continue; 212 + 213 + if (!mptcp_pm_addr_families_match(sk, local, &addrs[i])) 214 + continue; 215 + 216 + if (msk->pm.subflows < subflows_max) { 217 + /* forbid creating multiple address towards 218 + * this id 219 + */ 220 + __set_bit(addrs[i].id, unavail_id); 221 + msk->pm.subflows++; 222 + i++; 223 + } 224 + } 225 + } 226 + 227 + return i; 228 + } 229 + 230 + static struct mptcp_pm_addr_entry * 231 + __lookup_addr_by_id(struct pm_nl_pernet *pernet, unsigned int id) 232 + { 233 + struct mptcp_pm_addr_entry *entry; 234 + 235 + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list, 236 + lockdep_is_held(&pernet->lock)) { 237 + if (entry->addr.id == id) 238 + return entry; 239 + } 240 + return NULL; 241 + } 242 + 243 + static struct mptcp_pm_addr_entry * 244 + __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) 245 + { 246 + struct mptcp_pm_addr_entry *entry; 247 + 248 + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list, 249 + lockdep_is_held(&pernet->lock)) { 250 + if (mptcp_addresses_equal(&entry->addr, info, entry->addr.port)) 251 + return entry; 252 + } 253 + return NULL; 254 + } 255 + 256 + static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) 257 + { 258 + struct sock *sk = (struct sock *)msk; 259 + unsigned int add_addr_signal_max; 260 + bool signal_and_subflow = false; 261 + unsigned int local_addr_max; 262 + struct pm_nl_pernet *pernet; 263 + struct mptcp_pm_local local; 264 + unsigned int subflows_max; 265 + 266 + pernet = pm_nl_get_pernet(sock_net(sk)); 267 + 268 + add_addr_signal_max = mptcp_pm_get_add_addr_signal_max(msk); 269 + local_addr_max = mptcp_pm_get_local_addr_max(msk); 270 + subflows_max = mptcp_pm_get_subflows_max(msk); 271 + 272 + /* do lazy endpoint usage accounting for the MPC subflows */ 273 + if (unlikely(!(msk->pm.status & BIT(MPTCP_PM_MPC_ENDPOINT_ACCOUNTED))) && msk->first) { 274 + struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(msk->first); 275 + struct mptcp_pm_addr_entry *entry; 276 + struct mptcp_addr_info mpc_addr; 277 + bool backup = false; 278 + 279 + mptcp_local_address((struct sock_common *)msk->first, &mpc_addr); 280 + rcu_read_lock(); 281 + entry = __lookup_addr(pernet, &mpc_addr); 282 + if (entry) { 283 + __clear_bit(entry->addr.id, msk->pm.id_avail_bitmap); 284 + msk->mpc_endpoint_id = entry->addr.id; 285 + backup = !!(entry->flags & MPTCP_PM_ADDR_FLAG_BACKUP); 286 + } 287 + rcu_read_unlock(); 288 + 289 + if (backup) 290 + mptcp_pm_send_ack(msk, subflow, true, backup); 291 + 292 + msk->pm.status |= BIT(MPTCP_PM_MPC_ENDPOINT_ACCOUNTED); 293 + } 294 + 295 + pr_debug("local %d:%d signal %d:%d subflows %d:%d\n", 296 + msk->pm.local_addr_used, local_addr_max, 297 + msk->pm.add_addr_signaled, add_addr_signal_max, 298 + msk->pm.subflows, subflows_max); 299 + 300 + /* check first for announce */ 301 + if (msk->pm.add_addr_signaled < add_addr_signal_max) { 302 + /* due to racing events on both ends we can reach here while 303 + * previous add address is still running: if we invoke now 304 + * mptcp_pm_announce_addr(), that will fail and the 305 + * corresponding id will be marked as used. 306 + * Instead let the PM machinery reschedule us when the 307 + * current address announce will be completed. 308 + */ 309 + if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) 310 + return; 311 + 312 + if (!select_signal_address(pernet, msk, &local)) 313 + goto subflow; 314 + 315 + /* If the alloc fails, we are on memory pressure, not worth 316 + * continuing, and trying to create subflows. 317 + */ 318 + if (!mptcp_pm_alloc_anno_list(msk, &local.addr)) 319 + return; 320 + 321 + __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); 322 + msk->pm.add_addr_signaled++; 323 + 324 + /* Special case for ID0: set the correct ID */ 325 + if (local.addr.id == msk->mpc_endpoint_id) 326 + local.addr.id = 0; 327 + 328 + mptcp_pm_announce_addr(msk, &local.addr, false); 329 + mptcp_pm_addr_send_ack(msk); 330 + 331 + if (local.flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) 332 + signal_and_subflow = true; 333 + } 334 + 335 + subflow: 336 + /* check if should create a new subflow */ 337 + while (msk->pm.local_addr_used < local_addr_max && 338 + msk->pm.subflows < subflows_max) { 339 + struct mptcp_addr_info addrs[MPTCP_PM_ADDR_MAX]; 340 + bool fullmesh; 341 + int i, nr; 342 + 343 + if (signal_and_subflow) 344 + signal_and_subflow = false; 345 + else if (!select_local_address(pernet, msk, &local)) 346 + break; 347 + 348 + fullmesh = !!(local.flags & MPTCP_PM_ADDR_FLAG_FULLMESH); 349 + 350 + __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); 351 + 352 + /* Special case for ID0: set the correct ID */ 353 + if (local.addr.id == msk->mpc_endpoint_id) 354 + local.addr.id = 0; 355 + else /* local_addr_used is not decr for ID 0 */ 356 + msk->pm.local_addr_used++; 357 + 358 + nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); 359 + if (nr == 0) 360 + continue; 361 + 362 + spin_unlock_bh(&msk->pm.lock); 363 + for (i = 0; i < nr; i++) 364 + __mptcp_subflow_connect(sk, &local, &addrs[i]); 365 + spin_lock_bh(&msk->pm.lock); 366 + } 367 + mptcp_pm_nl_check_work_pending(msk); 368 + } 369 + 370 + static void mptcp_pm_nl_fully_established(struct mptcp_sock *msk) 371 + { 372 + mptcp_pm_create_subflow_or_signal_addr(msk); 373 + } 374 + 375 + static void mptcp_pm_nl_subflow_established(struct mptcp_sock *msk) 376 + { 377 + mptcp_pm_create_subflow_or_signal_addr(msk); 378 + } 379 + 380 + /* Fill all the local addresses into the array addrs[], 381 + * and return the array size. 382 + */ 383 + static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk, 384 + struct mptcp_addr_info *remote, 385 + struct mptcp_pm_local *locals) 386 + { 387 + struct sock *sk = (struct sock *)msk; 388 + struct mptcp_pm_addr_entry *entry; 389 + struct mptcp_addr_info mpc_addr; 390 + struct pm_nl_pernet *pernet; 391 + unsigned int subflows_max; 392 + int i = 0; 393 + 394 + pernet = pm_nl_get_pernet_from_msk(msk); 395 + subflows_max = mptcp_pm_get_subflows_max(msk); 396 + 397 + mptcp_local_address((struct sock_common *)msk, &mpc_addr); 398 + 399 + rcu_read_lock(); 400 + list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { 401 + if (!(entry->flags & MPTCP_PM_ADDR_FLAG_FULLMESH)) 402 + continue; 403 + 404 + if (!mptcp_pm_addr_families_match(sk, &entry->addr, remote)) 405 + continue; 406 + 407 + if (msk->pm.subflows < subflows_max) { 408 + locals[i].addr = entry->addr; 409 + locals[i].flags = entry->flags; 410 + locals[i].ifindex = entry->ifindex; 411 + 412 + /* Special case for ID0: set the correct ID */ 413 + if (mptcp_addresses_equal(&locals[i].addr, &mpc_addr, locals[i].addr.port)) 414 + locals[i].addr.id = 0; 415 + 416 + msk->pm.subflows++; 417 + i++; 418 + } 419 + } 420 + rcu_read_unlock(); 421 + 422 + /* If the array is empty, fill in the single 423 + * 'IPADDRANY' local address 424 + */ 425 + if (!i) { 426 + memset(&locals[i], 0, sizeof(locals[i])); 427 + locals[i].addr.family = 428 + #if IS_ENABLED(CONFIG_MPTCP_IPV6) 429 + remote->family == AF_INET6 && 430 + ipv6_addr_v4mapped(&remote->addr6) ? AF_INET : 431 + #endif 432 + remote->family; 433 + 434 + if (!mptcp_pm_addr_families_match(sk, &locals[i].addr, remote)) 435 + return 0; 436 + 437 + msk->pm.subflows++; 438 + i++; 439 + } 440 + 441 + return i; 442 + } 443 + 444 + static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) 445 + { 446 + struct mptcp_pm_local locals[MPTCP_PM_ADDR_MAX]; 447 + struct sock *sk = (struct sock *)msk; 448 + unsigned int add_addr_accept_max; 449 + struct mptcp_addr_info remote; 450 + unsigned int subflows_max; 451 + bool sf_created = false; 452 + int i, nr; 453 + 454 + add_addr_accept_max = mptcp_pm_get_add_addr_accept_max(msk); 455 + subflows_max = mptcp_pm_get_subflows_max(msk); 456 + 457 + pr_debug("accepted %d:%d remote family %d\n", 458 + msk->pm.add_addr_accepted, add_addr_accept_max, 459 + msk->pm.remote.family); 460 + 461 + remote = msk->pm.remote; 462 + mptcp_pm_announce_addr(msk, &remote, true); 463 + mptcp_pm_addr_send_ack(msk); 464 + 465 + if (lookup_subflow_by_daddr(&msk->conn_list, &remote)) 466 + return; 467 + 468 + /* pick id 0 port, if none is provided the remote address */ 469 + if (!remote.port) 470 + remote.port = sk->sk_dport; 471 + 472 + /* connect to the specified remote address, using whatever 473 + * local address the routing configuration will pick. 474 + */ 475 + nr = fill_local_addresses_vec(msk, &remote, locals); 476 + if (nr == 0) 477 + return; 478 + 479 + spin_unlock_bh(&msk->pm.lock); 480 + for (i = 0; i < nr; i++) 481 + if (__mptcp_subflow_connect(sk, &locals[i], &remote) == 0) 482 + sf_created = true; 483 + spin_lock_bh(&msk->pm.lock); 484 + 485 + if (sf_created) { 486 + /* add_addr_accepted is not decr for ID 0 */ 487 + if (remote.id) 488 + msk->pm.add_addr_accepted++; 489 + if (msk->pm.add_addr_accepted >= add_addr_accept_max || 490 + msk->pm.subflows >= subflows_max) 491 + WRITE_ONCE(msk->pm.accept_addr, false); 492 + } 493 + } 494 + 495 + void mptcp_pm_nl_rm_addr(struct mptcp_sock *msk, u8 rm_id) 496 + { 497 + if (rm_id && WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) { 498 + /* Note: if the subflow has been closed before, this 499 + * add_addr_accepted counter will not be decremented. 500 + */ 501 + if (--msk->pm.add_addr_accepted < mptcp_pm_get_add_addr_accept_max(msk)) 502 + WRITE_ONCE(msk->pm.accept_addr, true); 503 + } 504 + } 505 + 506 + static bool address_use_port(struct mptcp_pm_addr_entry *entry) 507 + { 508 + return (entry->flags & 509 + (MPTCP_PM_ADDR_FLAG_SIGNAL | MPTCP_PM_ADDR_FLAG_SUBFLOW)) == 510 + MPTCP_PM_ADDR_FLAG_SIGNAL; 511 + } 512 + 513 + /* caller must ensure the RCU grace period is already elapsed */ 514 + static void __mptcp_pm_release_addr_entry(struct mptcp_pm_addr_entry *entry) 515 + { 516 + if (entry->lsk) 517 + sock_release(entry->lsk); 518 + kfree(entry); 519 + } 520 + 521 + static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet, 522 + struct mptcp_pm_addr_entry *entry, 523 + bool needs_id, bool replace) 524 + { 525 + struct mptcp_pm_addr_entry *cur, *del_entry = NULL; 526 + unsigned int addr_max; 527 + int ret = -EINVAL; 528 + 529 + spin_lock_bh(&pernet->lock); 530 + /* to keep the code simple, don't do IDR-like allocation for address ID, 531 + * just bail when we exceed limits 532 + */ 533 + if (pernet->next_id == MPTCP_PM_MAX_ADDR_ID) 534 + pernet->next_id = 1; 535 + if (pernet->addrs >= MPTCP_PM_ADDR_MAX) { 536 + ret = -ERANGE; 537 + goto out; 538 + } 539 + if (test_bit(entry->addr.id, pernet->id_bitmap)) { 540 + ret = -EBUSY; 541 + goto out; 542 + } 543 + 544 + /* do not insert duplicate address, differentiate on port only 545 + * singled addresses 546 + */ 547 + if (!address_use_port(entry)) 548 + entry->addr.port = 0; 549 + list_for_each_entry(cur, &pernet->local_addr_list, list) { 550 + if (mptcp_addresses_equal(&cur->addr, &entry->addr, 551 + cur->addr.port || entry->addr.port)) { 552 + /* allow replacing the exiting endpoint only if such 553 + * endpoint is an implicit one and the user-space 554 + * did not provide an endpoint id 555 + */ 556 + if (!(cur->flags & MPTCP_PM_ADDR_FLAG_IMPLICIT)) { 557 + ret = -EEXIST; 558 + goto out; 559 + } 560 + if (entry->addr.id) 561 + goto out; 562 + 563 + /* allow callers that only need to look up the local 564 + * addr's id to skip replacement. This allows them to 565 + * avoid calling synchronize_rcu in the packet recv 566 + * path. 567 + */ 568 + if (!replace) { 569 + kfree(entry); 570 + ret = cur->addr.id; 571 + goto out; 572 + } 573 + 574 + pernet->addrs--; 575 + entry->addr.id = cur->addr.id; 576 + list_del_rcu(&cur->list); 577 + del_entry = cur; 578 + break; 579 + } 580 + } 581 + 582 + if (!entry->addr.id && needs_id) { 583 + find_next: 584 + entry->addr.id = find_next_zero_bit(pernet->id_bitmap, 585 + MPTCP_PM_MAX_ADDR_ID + 1, 586 + pernet->next_id); 587 + if (!entry->addr.id && pernet->next_id != 1) { 588 + pernet->next_id = 1; 589 + goto find_next; 590 + } 591 + } 592 + 593 + if (!entry->addr.id && needs_id) 594 + goto out; 595 + 596 + __set_bit(entry->addr.id, pernet->id_bitmap); 597 + if (entry->addr.id > pernet->next_id) 598 + pernet->next_id = entry->addr.id; 599 + 600 + if (entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL) { 601 + addr_max = pernet->add_addr_signal_max; 602 + WRITE_ONCE(pernet->add_addr_signal_max, addr_max + 1); 603 + } 604 + if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) { 605 + addr_max = pernet->local_addr_max; 606 + WRITE_ONCE(pernet->local_addr_max, addr_max + 1); 607 + } 608 + 609 + pernet->addrs++; 610 + if (!entry->addr.port) 611 + list_add_tail_rcu(&entry->list, &pernet->local_addr_list); 612 + else 613 + list_add_rcu(&entry->list, &pernet->local_addr_list); 614 + ret = entry->addr.id; 615 + 616 + out: 617 + spin_unlock_bh(&pernet->lock); 618 + 619 + /* just replaced an existing entry, free it */ 620 + if (del_entry) { 621 + synchronize_rcu(); 622 + __mptcp_pm_release_addr_entry(del_entry); 623 + } 624 + return ret; 625 + } 626 + 627 + static struct lock_class_key mptcp_slock_keys[2]; 628 + static struct lock_class_key mptcp_keys[2]; 629 + 630 + static int mptcp_pm_nl_create_listen_socket(struct sock *sk, 631 + struct mptcp_pm_addr_entry *entry) 632 + { 633 + bool is_ipv6 = sk->sk_family == AF_INET6; 634 + int addrlen = sizeof(struct sockaddr_in); 635 + struct sockaddr_storage addr; 636 + struct sock *newsk, *ssk; 637 + int backlog = 1024; 638 + int err; 639 + 640 + err = sock_create_kern(sock_net(sk), entry->addr.family, 641 + SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk); 642 + if (err) 643 + return err; 644 + 645 + newsk = entry->lsk->sk; 646 + if (!newsk) 647 + return -EINVAL; 648 + 649 + /* The subflow socket lock is acquired in a nested to the msk one 650 + * in several places, even by the TCP stack, and this msk is a kernel 651 + * socket: lockdep complains. Instead of propagating the _nested 652 + * modifiers in several places, re-init the lock class for the msk 653 + * socket to an mptcp specific one. 654 + */ 655 + sock_lock_init_class_and_name(newsk, 656 + is_ipv6 ? "mlock-AF_INET6" : "mlock-AF_INET", 657 + &mptcp_slock_keys[is_ipv6], 658 + is_ipv6 ? "msk_lock-AF_INET6" : "msk_lock-AF_INET", 659 + &mptcp_keys[is_ipv6]); 660 + 661 + lock_sock(newsk); 662 + ssk = __mptcp_nmpc_sk(mptcp_sk(newsk)); 663 + release_sock(newsk); 664 + if (IS_ERR(ssk)) 665 + return PTR_ERR(ssk); 666 + 667 + mptcp_info2sockaddr(&entry->addr, &addr, entry->addr.family); 668 + #if IS_ENABLED(CONFIG_MPTCP_IPV6) 669 + if (entry->addr.family == AF_INET6) 670 + addrlen = sizeof(struct sockaddr_in6); 671 + #endif 672 + if (ssk->sk_family == AF_INET) 673 + err = inet_bind_sk(ssk, (struct sockaddr *)&addr, addrlen); 674 + #if IS_ENABLED(CONFIG_MPTCP_IPV6) 675 + else if (ssk->sk_family == AF_INET6) 676 + err = inet6_bind_sk(ssk, (struct sockaddr *)&addr, addrlen); 677 + #endif 678 + if (err) 679 + return err; 680 + 681 + /* We don't use mptcp_set_state() here because it needs to be called 682 + * under the msk socket lock. For the moment, that will not bring 683 + * anything more than only calling inet_sk_state_store(), because the 684 + * old status is known (TCP_CLOSE). 685 + */ 686 + inet_sk_state_store(newsk, TCP_LISTEN); 687 + lock_sock(ssk); 688 + WRITE_ONCE(mptcp_subflow_ctx(ssk)->pm_listener, true); 689 + err = __inet_listen_sk(ssk, backlog); 690 + if (!err) 691 + mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CREATED); 692 + release_sock(ssk); 693 + return err; 694 + } 695 + 696 + int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, 697 + struct mptcp_pm_addr_entry *skc) 698 + { 699 + struct mptcp_pm_addr_entry *entry; 700 + struct pm_nl_pernet *pernet; 701 + int ret; 702 + 703 + pernet = pm_nl_get_pernet_from_msk(msk); 704 + 705 + rcu_read_lock(); 706 + entry = __lookup_addr(pernet, &skc->addr); 707 + ret = entry ? entry->addr.id : -1; 708 + rcu_read_unlock(); 709 + if (ret >= 0) 710 + return ret; 711 + 712 + /* address not found, add to local list */ 713 + entry = kmalloc(sizeof(*entry), GFP_ATOMIC); 714 + if (!entry) 715 + return -ENOMEM; 716 + 717 + *entry = *skc; 718 + entry->addr.port = 0; 719 + ret = mptcp_pm_nl_append_new_local_addr(pernet, entry, true, false); 720 + if (ret < 0) 721 + kfree(entry); 722 + 723 + return ret; 724 + } 725 + 726 + bool mptcp_pm_nl_is_backup(struct mptcp_sock *msk, struct mptcp_addr_info *skc) 727 + { 728 + struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 729 + struct mptcp_pm_addr_entry *entry; 730 + bool backup; 731 + 732 + rcu_read_lock(); 733 + entry = __lookup_addr(pernet, skc); 734 + backup = entry && !!(entry->flags & MPTCP_PM_ADDR_FLAG_BACKUP); 735 + rcu_read_unlock(); 736 + 737 + return backup; 738 + } 739 + 740 + static int mptcp_nl_add_subflow_or_signal_addr(struct net *net, 741 + struct mptcp_addr_info *addr) 742 + { 743 + struct mptcp_sock *msk; 744 + long s_slot = 0, s_num = 0; 745 + 746 + while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 747 + struct sock *sk = (struct sock *)msk; 748 + struct mptcp_addr_info mpc_addr; 749 + 750 + if (!READ_ONCE(msk->fully_established) || 751 + mptcp_pm_is_userspace(msk)) 752 + goto next; 753 + 754 + /* if the endp linked to the init sf is re-added with a != ID */ 755 + mptcp_local_address((struct sock_common *)msk, &mpc_addr); 756 + 757 + lock_sock(sk); 758 + spin_lock_bh(&msk->pm.lock); 759 + if (mptcp_addresses_equal(addr, &mpc_addr, addr->port)) 760 + msk->mpc_endpoint_id = addr->id; 761 + mptcp_pm_create_subflow_or_signal_addr(msk); 762 + spin_unlock_bh(&msk->pm.lock); 763 + release_sock(sk); 764 + 765 + next: 766 + sock_put(sk); 767 + cond_resched(); 768 + } 769 + 770 + return 0; 771 + } 772 + 773 + static bool mptcp_pm_has_addr_attr_id(const struct nlattr *attr, 774 + struct genl_info *info) 775 + { 776 + struct nlattr *tb[MPTCP_PM_ADDR_ATTR_MAX + 1]; 777 + 778 + if (!nla_parse_nested_deprecated(tb, MPTCP_PM_ADDR_ATTR_MAX, attr, 779 + mptcp_pm_address_nl_policy, info->extack) && 780 + tb[MPTCP_PM_ADDR_ATTR_ID]) 781 + return true; 782 + return false; 783 + } 784 + 785 + /* Add an MPTCP endpoint */ 786 + int mptcp_pm_nl_add_addr_doit(struct sk_buff *skb, struct genl_info *info) 787 + { 788 + struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 789 + struct mptcp_pm_addr_entry addr, *entry; 790 + struct nlattr *attr; 791 + int ret; 792 + 793 + if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ENDPOINT_ADDR)) 794 + return -EINVAL; 795 + 796 + attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR]; 797 + ret = mptcp_pm_parse_entry(attr, info, true, &addr); 798 + if (ret < 0) 799 + return ret; 800 + 801 + if (addr.addr.port && !address_use_port(&addr)) { 802 + NL_SET_ERR_MSG_ATTR(info->extack, attr, 803 + "flags must have signal and not subflow when using port"); 804 + return -EINVAL; 805 + } 806 + 807 + if (addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL && 808 + addr.flags & MPTCP_PM_ADDR_FLAG_FULLMESH) { 809 + NL_SET_ERR_MSG_ATTR(info->extack, attr, 810 + "flags mustn't have both signal and fullmesh"); 811 + return -EINVAL; 812 + } 813 + 814 + if (addr.flags & MPTCP_PM_ADDR_FLAG_IMPLICIT) { 815 + NL_SET_ERR_MSG_ATTR(info->extack, attr, 816 + "can't create IMPLICIT endpoint"); 817 + return -EINVAL; 818 + } 819 + 820 + entry = kzalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT); 821 + if (!entry) { 822 + GENL_SET_ERR_MSG(info, "can't allocate addr"); 823 + return -ENOMEM; 824 + } 825 + 826 + *entry = addr; 827 + if (entry->addr.port) { 828 + ret = mptcp_pm_nl_create_listen_socket(skb->sk, entry); 829 + if (ret) { 830 + GENL_SET_ERR_MSG_FMT(info, "create listen socket error: %d", ret); 831 + goto out_free; 832 + } 833 + } 834 + ret = mptcp_pm_nl_append_new_local_addr(pernet, entry, 835 + !mptcp_pm_has_addr_attr_id(attr, info), 836 + true); 837 + if (ret < 0) { 838 + GENL_SET_ERR_MSG_FMT(info, "too many addresses or duplicate one: %d", ret); 839 + goto out_free; 840 + } 841 + 842 + mptcp_nl_add_subflow_or_signal_addr(sock_net(skb->sk), &entry->addr); 843 + return 0; 844 + 845 + out_free: 846 + __mptcp_pm_release_addr_entry(entry); 847 + return ret; 848 + } 849 + 850 + static u8 mptcp_endp_get_local_id(struct mptcp_sock *msk, 851 + const struct mptcp_addr_info *addr) 852 + { 853 + return msk->mpc_endpoint_id == addr->id ? 0 : addr->id; 854 + } 855 + 856 + static bool mptcp_pm_remove_anno_addr(struct mptcp_sock *msk, 857 + const struct mptcp_addr_info *addr, 858 + bool force) 859 + { 860 + struct mptcp_rm_list list = { .nr = 0 }; 861 + bool ret; 862 + 863 + list.ids[list.nr++] = mptcp_endp_get_local_id(msk, addr); 864 + 865 + ret = mptcp_remove_anno_list_by_saddr(msk, addr); 866 + if (ret || force) { 867 + spin_lock_bh(&msk->pm.lock); 868 + if (ret) { 869 + __set_bit(addr->id, msk->pm.id_avail_bitmap); 870 + msk->pm.add_addr_signaled--; 871 + } 872 + mptcp_pm_remove_addr(msk, &list); 873 + spin_unlock_bh(&msk->pm.lock); 874 + } 875 + return ret; 876 + } 877 + 878 + static void __mark_subflow_endp_available(struct mptcp_sock *msk, u8 id) 879 + { 880 + /* If it was marked as used, and not ID 0, decrement local_addr_used */ 881 + if (!__test_and_set_bit(id ? : msk->mpc_endpoint_id, msk->pm.id_avail_bitmap) && 882 + id && !WARN_ON_ONCE(msk->pm.local_addr_used == 0)) 883 + msk->pm.local_addr_used--; 884 + } 885 + 886 + static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net, 887 + const struct mptcp_pm_addr_entry *entry) 888 + { 889 + const struct mptcp_addr_info *addr = &entry->addr; 890 + struct mptcp_rm_list list = { .nr = 1 }; 891 + long s_slot = 0, s_num = 0; 892 + struct mptcp_sock *msk; 893 + 894 + pr_debug("remove_id=%d\n", addr->id); 895 + 896 + while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 897 + struct sock *sk = (struct sock *)msk; 898 + bool remove_subflow; 899 + 900 + if (mptcp_pm_is_userspace(msk)) 901 + goto next; 902 + 903 + lock_sock(sk); 904 + remove_subflow = mptcp_lookup_subflow_by_saddr(&msk->conn_list, addr); 905 + mptcp_pm_remove_anno_addr(msk, addr, remove_subflow && 906 + !(entry->flags & MPTCP_PM_ADDR_FLAG_IMPLICIT)); 907 + 908 + list.ids[0] = mptcp_endp_get_local_id(msk, addr); 909 + if (remove_subflow) { 910 + spin_lock_bh(&msk->pm.lock); 911 + mptcp_pm_rm_subflow(msk, &list); 912 + spin_unlock_bh(&msk->pm.lock); 913 + } 914 + 915 + if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) { 916 + spin_lock_bh(&msk->pm.lock); 917 + __mark_subflow_endp_available(msk, list.ids[0]); 918 + spin_unlock_bh(&msk->pm.lock); 919 + } 920 + 921 + if (msk->mpc_endpoint_id == entry->addr.id) 922 + msk->mpc_endpoint_id = 0; 923 + release_sock(sk); 924 + 925 + next: 926 + sock_put(sk); 927 + cond_resched(); 928 + } 929 + 930 + return 0; 931 + } 932 + 933 + static int mptcp_nl_remove_id_zero_address(struct net *net, 934 + struct mptcp_addr_info *addr) 935 + { 936 + struct mptcp_rm_list list = { .nr = 0 }; 937 + long s_slot = 0, s_num = 0; 938 + struct mptcp_sock *msk; 939 + 940 + list.ids[list.nr++] = 0; 941 + 942 + while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 943 + struct sock *sk = (struct sock *)msk; 944 + struct mptcp_addr_info msk_local; 945 + 946 + if (list_empty(&msk->conn_list) || mptcp_pm_is_userspace(msk)) 947 + goto next; 948 + 949 + mptcp_local_address((struct sock_common *)msk, &msk_local); 950 + if (!mptcp_addresses_equal(&msk_local, addr, addr->port)) 951 + goto next; 952 + 953 + lock_sock(sk); 954 + spin_lock_bh(&msk->pm.lock); 955 + mptcp_pm_remove_addr(msk, &list); 956 + mptcp_pm_rm_subflow(msk, &list); 957 + __mark_subflow_endp_available(msk, 0); 958 + spin_unlock_bh(&msk->pm.lock); 959 + release_sock(sk); 960 + 961 + next: 962 + sock_put(sk); 963 + cond_resched(); 964 + } 965 + 966 + return 0; 967 + } 968 + 969 + /* Remove an MPTCP endpoint */ 970 + int mptcp_pm_nl_del_addr_doit(struct sk_buff *skb, struct genl_info *info) 971 + { 972 + struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 973 + struct mptcp_pm_addr_entry addr, *entry; 974 + unsigned int addr_max; 975 + struct nlattr *attr; 976 + int ret; 977 + 978 + if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ENDPOINT_ADDR)) 979 + return -EINVAL; 980 + 981 + attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR]; 982 + ret = mptcp_pm_parse_entry(attr, info, false, &addr); 983 + if (ret < 0) 984 + return ret; 985 + 986 + /* the zero id address is special: the first address used by the msk 987 + * always gets such an id, so different subflows can have different zero 988 + * id addresses. Additionally zero id is not accounted for in id_bitmap. 989 + * Let's use an 'mptcp_rm_list' instead of the common remove code. 990 + */ 991 + if (addr.addr.id == 0) 992 + return mptcp_nl_remove_id_zero_address(sock_net(skb->sk), &addr.addr); 993 + 994 + spin_lock_bh(&pernet->lock); 995 + entry = __lookup_addr_by_id(pernet, addr.addr.id); 996 + if (!entry) { 997 + NL_SET_ERR_MSG_ATTR(info->extack, attr, "address not found"); 998 + spin_unlock_bh(&pernet->lock); 999 + return -EINVAL; 1000 + } 1001 + if (entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL) { 1002 + addr_max = pernet->add_addr_signal_max; 1003 + WRITE_ONCE(pernet->add_addr_signal_max, addr_max - 1); 1004 + } 1005 + if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) { 1006 + addr_max = pernet->local_addr_max; 1007 + WRITE_ONCE(pernet->local_addr_max, addr_max - 1); 1008 + } 1009 + 1010 + pernet->addrs--; 1011 + list_del_rcu(&entry->list); 1012 + __clear_bit(entry->addr.id, pernet->id_bitmap); 1013 + spin_unlock_bh(&pernet->lock); 1014 + 1015 + mptcp_nl_remove_subflow_and_signal_addr(sock_net(skb->sk), entry); 1016 + synchronize_rcu(); 1017 + __mptcp_pm_release_addr_entry(entry); 1018 + 1019 + return ret; 1020 + } 1021 + 1022 + static void mptcp_pm_flush_addrs_and_subflows(struct mptcp_sock *msk, 1023 + struct list_head *rm_list) 1024 + { 1025 + struct mptcp_rm_list alist = { .nr = 0 }, slist = { .nr = 0 }; 1026 + struct mptcp_pm_addr_entry *entry; 1027 + 1028 + list_for_each_entry(entry, rm_list, list) { 1029 + if (slist.nr < MPTCP_RM_IDS_MAX && 1030 + mptcp_lookup_subflow_by_saddr(&msk->conn_list, &entry->addr)) 1031 + slist.ids[slist.nr++] = mptcp_endp_get_local_id(msk, &entry->addr); 1032 + 1033 + if (alist.nr < MPTCP_RM_IDS_MAX && 1034 + mptcp_remove_anno_list_by_saddr(msk, &entry->addr)) 1035 + alist.ids[alist.nr++] = mptcp_endp_get_local_id(msk, &entry->addr); 1036 + } 1037 + 1038 + spin_lock_bh(&msk->pm.lock); 1039 + if (alist.nr) { 1040 + msk->pm.add_addr_signaled -= alist.nr; 1041 + mptcp_pm_remove_addr(msk, &alist); 1042 + } 1043 + if (slist.nr) 1044 + mptcp_pm_rm_subflow(msk, &slist); 1045 + /* Reset counters: maybe some subflows have been removed before */ 1046 + bitmap_fill(msk->pm.id_avail_bitmap, MPTCP_PM_MAX_ADDR_ID + 1); 1047 + msk->pm.local_addr_used = 0; 1048 + spin_unlock_bh(&msk->pm.lock); 1049 + } 1050 + 1051 + static void mptcp_nl_flush_addrs_list(struct net *net, 1052 + struct list_head *rm_list) 1053 + { 1054 + long s_slot = 0, s_num = 0; 1055 + struct mptcp_sock *msk; 1056 + 1057 + if (list_empty(rm_list)) 1058 + return; 1059 + 1060 + while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 1061 + struct sock *sk = (struct sock *)msk; 1062 + 1063 + if (!mptcp_pm_is_userspace(msk)) { 1064 + lock_sock(sk); 1065 + mptcp_pm_flush_addrs_and_subflows(msk, rm_list); 1066 + release_sock(sk); 1067 + } 1068 + 1069 + sock_put(sk); 1070 + cond_resched(); 1071 + } 1072 + } 1073 + 1074 + /* caller must ensure the RCU grace period is already elapsed */ 1075 + static void __flush_addrs(struct list_head *list) 1076 + { 1077 + while (!list_empty(list)) { 1078 + struct mptcp_pm_addr_entry *cur; 1079 + 1080 + cur = list_entry(list->next, 1081 + struct mptcp_pm_addr_entry, list); 1082 + list_del_rcu(&cur->list); 1083 + __mptcp_pm_release_addr_entry(cur); 1084 + } 1085 + } 1086 + 1087 + static void __reset_counters(struct pm_nl_pernet *pernet) 1088 + { 1089 + WRITE_ONCE(pernet->add_addr_signal_max, 0); 1090 + WRITE_ONCE(pernet->add_addr_accept_max, 0); 1091 + WRITE_ONCE(pernet->local_addr_max, 0); 1092 + pernet->addrs = 0; 1093 + } 1094 + 1095 + int mptcp_pm_nl_flush_addrs_doit(struct sk_buff *skb, struct genl_info *info) 1096 + { 1097 + struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 1098 + LIST_HEAD(free_list); 1099 + 1100 + spin_lock_bh(&pernet->lock); 1101 + list_splice_init(&pernet->local_addr_list, &free_list); 1102 + __reset_counters(pernet); 1103 + pernet->next_id = 1; 1104 + bitmap_zero(pernet->id_bitmap, MPTCP_PM_MAX_ADDR_ID + 1); 1105 + spin_unlock_bh(&pernet->lock); 1106 + mptcp_nl_flush_addrs_list(sock_net(skb->sk), &free_list); 1107 + synchronize_rcu(); 1108 + __flush_addrs(&free_list); 1109 + return 0; 1110 + } 1111 + 1112 + int mptcp_pm_nl_get_addr(u8 id, struct mptcp_pm_addr_entry *addr, 1113 + struct genl_info *info) 1114 + { 1115 + struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 1116 + struct mptcp_pm_addr_entry *entry; 1117 + int ret = -EINVAL; 1118 + 1119 + rcu_read_lock(); 1120 + entry = __lookup_addr_by_id(pernet, id); 1121 + if (entry) { 1122 + *addr = *entry; 1123 + ret = 0; 1124 + } 1125 + rcu_read_unlock(); 1126 + 1127 + return ret; 1128 + } 1129 + 1130 + int mptcp_pm_nl_dump_addr(struct sk_buff *msg, 1131 + struct netlink_callback *cb) 1132 + { 1133 + struct net *net = sock_net(msg->sk); 1134 + struct mptcp_pm_addr_entry *entry; 1135 + struct pm_nl_pernet *pernet; 1136 + int id = cb->args[0]; 1137 + int i; 1138 + 1139 + pernet = pm_nl_get_pernet(net); 1140 + 1141 + rcu_read_lock(); 1142 + for (i = id; i < MPTCP_PM_MAX_ADDR_ID + 1; i++) { 1143 + if (test_bit(i, pernet->id_bitmap)) { 1144 + entry = __lookup_addr_by_id(pernet, i); 1145 + if (!entry) 1146 + break; 1147 + 1148 + if (entry->addr.id <= id) 1149 + continue; 1150 + 1151 + if (mptcp_pm_genl_fill_addr(msg, cb, entry) < 0) 1152 + break; 1153 + 1154 + id = entry->addr.id; 1155 + } 1156 + } 1157 + rcu_read_unlock(); 1158 + 1159 + cb->args[0] = id; 1160 + return msg->len; 1161 + } 1162 + 1163 + static int parse_limit(struct genl_info *info, int id, unsigned int *limit) 1164 + { 1165 + struct nlattr *attr = info->attrs[id]; 1166 + 1167 + if (!attr) 1168 + return 0; 1169 + 1170 + *limit = nla_get_u32(attr); 1171 + if (*limit > MPTCP_PM_ADDR_MAX) { 1172 + NL_SET_ERR_MSG_ATTR_FMT(info->extack, attr, 1173 + "limit greater than maximum (%u)", 1174 + MPTCP_PM_ADDR_MAX); 1175 + return -EINVAL; 1176 + } 1177 + return 0; 1178 + } 1179 + 1180 + int mptcp_pm_nl_set_limits_doit(struct sk_buff *skb, struct genl_info *info) 1181 + { 1182 + struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 1183 + unsigned int rcv_addrs, subflows; 1184 + int ret; 1185 + 1186 + spin_lock_bh(&pernet->lock); 1187 + rcv_addrs = pernet->add_addr_accept_max; 1188 + ret = parse_limit(info, MPTCP_PM_ATTR_RCV_ADD_ADDRS, &rcv_addrs); 1189 + if (ret) 1190 + goto unlock; 1191 + 1192 + subflows = pernet->subflows_max; 1193 + ret = parse_limit(info, MPTCP_PM_ATTR_SUBFLOWS, &subflows); 1194 + if (ret) 1195 + goto unlock; 1196 + 1197 + WRITE_ONCE(pernet->add_addr_accept_max, rcv_addrs); 1198 + WRITE_ONCE(pernet->subflows_max, subflows); 1199 + 1200 + unlock: 1201 + spin_unlock_bh(&pernet->lock); 1202 + return ret; 1203 + } 1204 + 1205 + int mptcp_pm_nl_get_limits_doit(struct sk_buff *skb, struct genl_info *info) 1206 + { 1207 + struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 1208 + struct sk_buff *msg; 1209 + void *reply; 1210 + 1211 + msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); 1212 + if (!msg) 1213 + return -ENOMEM; 1214 + 1215 + reply = genlmsg_put_reply(msg, info, &mptcp_genl_family, 0, 1216 + MPTCP_PM_CMD_GET_LIMITS); 1217 + if (!reply) 1218 + goto fail; 1219 + 1220 + if (nla_put_u32(msg, MPTCP_PM_ATTR_RCV_ADD_ADDRS, 1221 + READ_ONCE(pernet->add_addr_accept_max))) 1222 + goto fail; 1223 + 1224 + if (nla_put_u32(msg, MPTCP_PM_ATTR_SUBFLOWS, 1225 + READ_ONCE(pernet->subflows_max))) 1226 + goto fail; 1227 + 1228 + genlmsg_end(msg, reply); 1229 + return genlmsg_reply(msg, info); 1230 + 1231 + fail: 1232 + GENL_SET_ERR_MSG(info, "not enough space in Netlink message"); 1233 + nlmsg_free(msg); 1234 + return -EMSGSIZE; 1235 + } 1236 + 1237 + static void mptcp_pm_nl_fullmesh(struct mptcp_sock *msk, 1238 + struct mptcp_addr_info *addr) 1239 + { 1240 + struct mptcp_rm_list list = { .nr = 0 }; 1241 + 1242 + list.ids[list.nr++] = mptcp_endp_get_local_id(msk, addr); 1243 + 1244 + spin_lock_bh(&msk->pm.lock); 1245 + mptcp_pm_rm_subflow(msk, &list); 1246 + __mark_subflow_endp_available(msk, list.ids[0]); 1247 + mptcp_pm_create_subflow_or_signal_addr(msk); 1248 + spin_unlock_bh(&msk->pm.lock); 1249 + } 1250 + 1251 + static void mptcp_pm_nl_set_flags_all(struct net *net, 1252 + struct mptcp_pm_addr_entry *local, 1253 + u8 changed) 1254 + { 1255 + u8 is_subflow = !!(local->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW); 1256 + u8 bkup = !!(local->flags & MPTCP_PM_ADDR_FLAG_BACKUP); 1257 + long s_slot = 0, s_num = 0; 1258 + struct mptcp_sock *msk; 1259 + 1260 + if (changed == MPTCP_PM_ADDR_FLAG_FULLMESH && !is_subflow) 1261 + return; 1262 + 1263 + while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 1264 + struct sock *sk = (struct sock *)msk; 1265 + 1266 + if (list_empty(&msk->conn_list) || mptcp_pm_is_userspace(msk)) 1267 + goto next; 1268 + 1269 + lock_sock(sk); 1270 + if (changed & MPTCP_PM_ADDR_FLAG_BACKUP) 1271 + mptcp_pm_mp_prio_send_ack(msk, &local->addr, NULL, bkup); 1272 + /* Subflows will only be recreated if the SUBFLOW flag is set */ 1273 + if (is_subflow && (changed & MPTCP_PM_ADDR_FLAG_FULLMESH)) 1274 + mptcp_pm_nl_fullmesh(msk, &local->addr); 1275 + release_sock(sk); 1276 + 1277 + next: 1278 + sock_put(sk); 1279 + cond_resched(); 1280 + } 1281 + } 1282 + 1283 + int mptcp_pm_nl_set_flags(struct mptcp_pm_addr_entry *local, 1284 + struct genl_info *info) 1285 + { 1286 + struct nlattr *attr = info->attrs[MPTCP_PM_ATTR_ADDR]; 1287 + u8 changed, mask = MPTCP_PM_ADDR_FLAG_BACKUP | 1288 + MPTCP_PM_ADDR_FLAG_FULLMESH; 1289 + struct net *net = genl_info_net(info); 1290 + struct mptcp_pm_addr_entry *entry; 1291 + struct pm_nl_pernet *pernet; 1292 + u8 lookup_by_id = 0; 1293 + 1294 + pernet = pm_nl_get_pernet(net); 1295 + 1296 + if (local->addr.family == AF_UNSPEC) { 1297 + lookup_by_id = 1; 1298 + if (!local->addr.id) { 1299 + NL_SET_ERR_MSG_ATTR(info->extack, attr, 1300 + "missing address ID"); 1301 + return -EOPNOTSUPP; 1302 + } 1303 + } 1304 + 1305 + spin_lock_bh(&pernet->lock); 1306 + entry = lookup_by_id ? __lookup_addr_by_id(pernet, local->addr.id) : 1307 + __lookup_addr(pernet, &local->addr); 1308 + if (!entry) { 1309 + spin_unlock_bh(&pernet->lock); 1310 + NL_SET_ERR_MSG_ATTR(info->extack, attr, "address not found"); 1311 + return -EINVAL; 1312 + } 1313 + if ((local->flags & MPTCP_PM_ADDR_FLAG_FULLMESH) && 1314 + (entry->flags & (MPTCP_PM_ADDR_FLAG_SIGNAL | 1315 + MPTCP_PM_ADDR_FLAG_IMPLICIT))) { 1316 + spin_unlock_bh(&pernet->lock); 1317 + NL_SET_ERR_MSG_ATTR(info->extack, attr, "invalid addr flags"); 1318 + return -EINVAL; 1319 + } 1320 + 1321 + changed = (local->flags ^ entry->flags) & mask; 1322 + entry->flags = (entry->flags & ~mask) | (local->flags & mask); 1323 + *local = *entry; 1324 + spin_unlock_bh(&pernet->lock); 1325 + 1326 + mptcp_pm_nl_set_flags_all(net, local, changed); 1327 + return 0; 1328 + } 1329 + 1330 + bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk) 1331 + { 1332 + struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 1333 + 1334 + if (msk->pm.subflows == mptcp_pm_get_subflows_max(msk) || 1335 + (find_next_and_bit(pernet->id_bitmap, msk->pm.id_avail_bitmap, 1336 + MPTCP_PM_MAX_ADDR_ID + 1, 0) == MPTCP_PM_MAX_ADDR_ID + 1)) { 1337 + WRITE_ONCE(msk->pm.work_pending, false); 1338 + return false; 1339 + } 1340 + return true; 1341 + } 1342 + 1343 + /* Called under PM lock */ 1344 + void __mptcp_pm_kernel_worker(struct mptcp_sock *msk) 1345 + { 1346 + struct mptcp_pm_data *pm = &msk->pm; 1347 + 1348 + if (pm->status & BIT(MPTCP_PM_ADD_ADDR_RECEIVED)) { 1349 + pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_RECEIVED); 1350 + mptcp_pm_nl_add_addr_received(msk); 1351 + } 1352 + if (pm->status & BIT(MPTCP_PM_ESTABLISHED)) { 1353 + pm->status &= ~BIT(MPTCP_PM_ESTABLISHED); 1354 + mptcp_pm_nl_fully_established(msk); 1355 + } 1356 + if (pm->status & BIT(MPTCP_PM_SUBFLOW_ESTABLISHED)) { 1357 + pm->status &= ~BIT(MPTCP_PM_SUBFLOW_ESTABLISHED); 1358 + mptcp_pm_nl_subflow_established(msk); 1359 + } 1360 + } 1361 + 1362 + static int __net_init pm_nl_init_net(struct net *net) 1363 + { 1364 + struct pm_nl_pernet *pernet = pm_nl_get_pernet(net); 1365 + 1366 + INIT_LIST_HEAD_RCU(&pernet->local_addr_list); 1367 + 1368 + /* Cit. 2 subflows ought to be enough for anybody. */ 1369 + pernet->subflows_max = 2; 1370 + pernet->next_id = 1; 1371 + pernet->stale_loss_cnt = 4; 1372 + spin_lock_init(&pernet->lock); 1373 + 1374 + /* No need to initialize other pernet fields, the struct is zeroed at 1375 + * allocation time. 1376 + */ 1377 + 1378 + return 0; 1379 + } 1380 + 1381 + static void __net_exit pm_nl_exit_net(struct list_head *net_list) 1382 + { 1383 + struct net *net; 1384 + 1385 + list_for_each_entry(net, net_list, exit_list) { 1386 + struct pm_nl_pernet *pernet = pm_nl_get_pernet(net); 1387 + 1388 + /* net is removed from namespace list, can't race with 1389 + * other modifiers, also netns core already waited for a 1390 + * RCU grace period. 1391 + */ 1392 + __flush_addrs(&pernet->local_addr_list); 1393 + } 1394 + } 1395 + 1396 + static struct pernet_operations mptcp_pm_pernet_ops = { 1397 + .init = pm_nl_init_net, 1398 + .exit_batch = pm_nl_exit_net, 1399 + .id = &pm_nl_pernet_id, 1400 + .size = sizeof(struct pm_nl_pernet), 1401 + }; 1402 + 1403 + void __init mptcp_pm_nl_init(void) 1404 + { 1405 + if (register_pernet_subsys(&mptcp_pm_pernet_ops) < 0) 1406 + panic("Failed to register MPTCP PM pernet subsystem.\n"); 1407 + 1408 + if (genl_register_family(&mptcp_genl_family)) 1409 + panic("Failed to register MPTCP PM netlink family\n"); 1410 + }
+84 -1853
net/mptcp/pm_netlink.c
··· 6 6 7 7 #define pr_fmt(fmt) "MPTCP: " fmt 8 8 9 - #include <linux/inet.h> 10 - #include <linux/kernel.h> 11 - #include <net/inet_common.h> 12 - #include <net/netns/generic.h> 13 - #include <net/mptcp.h> 14 - 15 9 #include "protocol.h" 16 - #include "mib.h" 17 10 #include "mptcp_pm_gen.h" 18 - 19 - static int pm_nl_pernet_id; 20 - 21 - struct mptcp_pm_add_entry { 22 - struct list_head list; 23 - struct mptcp_addr_info addr; 24 - u8 retrans_times; 25 - struct timer_list add_timer; 26 - struct mptcp_sock *sock; 27 - }; 28 - 29 - struct pm_nl_pernet { 30 - /* protects pernet updates */ 31 - spinlock_t lock; 32 - struct list_head local_addr_list; 33 - unsigned int addrs; 34 - unsigned int stale_loss_cnt; 35 - unsigned int add_addr_signal_max; 36 - unsigned int add_addr_accept_max; 37 - unsigned int local_addr_max; 38 - unsigned int subflows_max; 39 - unsigned int next_id; 40 - DECLARE_BITMAP(id_bitmap, MPTCP_PM_MAX_ADDR_ID + 1); 41 - }; 42 - 43 - #define MPTCP_PM_ADDR_MAX 8 44 - #define ADD_ADDR_RETRANS_MAX 3 45 - 46 - static struct pm_nl_pernet *pm_nl_get_pernet(const struct net *net) 47 - { 48 - return net_generic(net, pm_nl_pernet_id); 49 - } 50 - 51 - static struct pm_nl_pernet * 52 - pm_nl_get_pernet_from_msk(const struct mptcp_sock *msk) 53 - { 54 - return pm_nl_get_pernet(sock_net((struct sock *)msk)); 55 - } 56 - 57 - bool mptcp_addresses_equal(const struct mptcp_addr_info *a, 58 - const struct mptcp_addr_info *b, bool use_port) 59 - { 60 - bool addr_equals = false; 61 - 62 - if (a->family == b->family) { 63 - if (a->family == AF_INET) 64 - addr_equals = a->addr.s_addr == b->addr.s_addr; 65 - #if IS_ENABLED(CONFIG_MPTCP_IPV6) 66 - else 67 - addr_equals = ipv6_addr_equal(&a->addr6, &b->addr6); 68 - } else if (a->family == AF_INET) { 69 - if (ipv6_addr_v4mapped(&b->addr6)) 70 - addr_equals = a->addr.s_addr == b->addr6.s6_addr32[3]; 71 - } else if (b->family == AF_INET) { 72 - if (ipv6_addr_v4mapped(&a->addr6)) 73 - addr_equals = a->addr6.s6_addr32[3] == b->addr.s_addr; 74 - #endif 75 - } 76 - 77 - if (!addr_equals) 78 - return false; 79 - if (!use_port) 80 - return true; 81 - 82 - return a->port == b->port; 83 - } 84 - 85 - void mptcp_local_address(const struct sock_common *skc, struct mptcp_addr_info *addr) 86 - { 87 - addr->family = skc->skc_family; 88 - addr->port = htons(skc->skc_num); 89 - if (addr->family == AF_INET) 90 - addr->addr.s_addr = skc->skc_rcv_saddr; 91 - #if IS_ENABLED(CONFIG_MPTCP_IPV6) 92 - else if (addr->family == AF_INET6) 93 - addr->addr6 = skc->skc_v6_rcv_saddr; 94 - #endif 95 - } 96 - 97 - static void remote_address(const struct sock_common *skc, 98 - struct mptcp_addr_info *addr) 99 - { 100 - addr->family = skc->skc_family; 101 - addr->port = skc->skc_dport; 102 - if (addr->family == AF_INET) 103 - addr->addr.s_addr = skc->skc_daddr; 104 - #if IS_ENABLED(CONFIG_MPTCP_IPV6) 105 - else if (addr->family == AF_INET6) 106 - addr->addr6 = skc->skc_v6_daddr; 107 - #endif 108 - } 109 - 110 - bool mptcp_lookup_subflow_by_saddr(const struct list_head *list, 111 - const struct mptcp_addr_info *saddr) 112 - { 113 - struct mptcp_subflow_context *subflow; 114 - struct mptcp_addr_info cur; 115 - struct sock_common *skc; 116 - 117 - list_for_each_entry(subflow, list, node) { 118 - skc = (struct sock_common *)mptcp_subflow_tcp_sock(subflow); 119 - 120 - mptcp_local_address(skc, &cur); 121 - if (mptcp_addresses_equal(&cur, saddr, saddr->port)) 122 - return true; 123 - } 124 - 125 - return false; 126 - } 127 - 128 - static bool lookup_subflow_by_daddr(const struct list_head *list, 129 - const struct mptcp_addr_info *daddr) 130 - { 131 - struct mptcp_subflow_context *subflow; 132 - struct mptcp_addr_info cur; 133 - 134 - list_for_each_entry(subflow, list, node) { 135 - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 136 - 137 - if (!((1 << inet_sk_state_load(ssk)) & 138 - (TCPF_ESTABLISHED | TCPF_SYN_SENT | TCPF_SYN_RECV))) 139 - continue; 140 - 141 - remote_address((struct sock_common *)ssk, &cur); 142 - if (mptcp_addresses_equal(&cur, daddr, daddr->port)) 143 - return true; 144 - } 145 - 146 - return false; 147 - } 148 - 149 - static bool 150 - select_local_address(const struct pm_nl_pernet *pernet, 151 - const struct mptcp_sock *msk, 152 - struct mptcp_pm_local *new_local) 153 - { 154 - struct mptcp_pm_addr_entry *entry; 155 - bool found = false; 156 - 157 - msk_owned_by_me(msk); 158 - 159 - rcu_read_lock(); 160 - list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { 161 - if (!(entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW)) 162 - continue; 163 - 164 - if (!test_bit(entry->addr.id, msk->pm.id_avail_bitmap)) 165 - continue; 166 - 167 - new_local->addr = entry->addr; 168 - new_local->flags = entry->flags; 169 - new_local->ifindex = entry->ifindex; 170 - found = true; 171 - break; 172 - } 173 - rcu_read_unlock(); 174 - 175 - return found; 176 - } 177 - 178 - static bool 179 - select_signal_address(struct pm_nl_pernet *pernet, const struct mptcp_sock *msk, 180 - struct mptcp_pm_local *new_local) 181 - { 182 - struct mptcp_pm_addr_entry *entry; 183 - bool found = false; 184 - 185 - rcu_read_lock(); 186 - /* do not keep any additional per socket state, just signal 187 - * the address list in order. 188 - * Note: removal from the local address list during the msk life-cycle 189 - * can lead to additional addresses not being announced. 190 - */ 191 - list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { 192 - if (!test_bit(entry->addr.id, msk->pm.id_avail_bitmap)) 193 - continue; 194 - 195 - if (!(entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL)) 196 - continue; 197 - 198 - new_local->addr = entry->addr; 199 - new_local->flags = entry->flags; 200 - new_local->ifindex = entry->ifindex; 201 - found = true; 202 - break; 203 - } 204 - rcu_read_unlock(); 205 - 206 - return found; 207 - } 208 - 209 - unsigned int mptcp_pm_get_add_addr_signal_max(const struct mptcp_sock *msk) 210 - { 211 - const struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 212 - 213 - return READ_ONCE(pernet->add_addr_signal_max); 214 - } 215 - EXPORT_SYMBOL_GPL(mptcp_pm_get_add_addr_signal_max); 216 - 217 - unsigned int mptcp_pm_get_add_addr_accept_max(const struct mptcp_sock *msk) 218 - { 219 - struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 220 - 221 - return READ_ONCE(pernet->add_addr_accept_max); 222 - } 223 - EXPORT_SYMBOL_GPL(mptcp_pm_get_add_addr_accept_max); 224 - 225 - unsigned int mptcp_pm_get_subflows_max(const struct mptcp_sock *msk) 226 - { 227 - struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 228 - 229 - return READ_ONCE(pernet->subflows_max); 230 - } 231 - EXPORT_SYMBOL_GPL(mptcp_pm_get_subflows_max); 232 - 233 - unsigned int mptcp_pm_get_local_addr_max(const struct mptcp_sock *msk) 234 - { 235 - struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 236 - 237 - return READ_ONCE(pernet->local_addr_max); 238 - } 239 - EXPORT_SYMBOL_GPL(mptcp_pm_get_local_addr_max); 240 - 241 - bool mptcp_pm_nl_check_work_pending(struct mptcp_sock *msk) 242 - { 243 - struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 244 - 245 - if (msk->pm.subflows == mptcp_pm_get_subflows_max(msk) || 246 - (find_next_and_bit(pernet->id_bitmap, msk->pm.id_avail_bitmap, 247 - MPTCP_PM_MAX_ADDR_ID + 1, 0) == MPTCP_PM_MAX_ADDR_ID + 1)) { 248 - WRITE_ONCE(msk->pm.work_pending, false); 249 - return false; 250 - } 251 - return true; 252 - } 253 - 254 - struct mptcp_pm_add_entry * 255 - mptcp_lookup_anno_list_by_saddr(const struct mptcp_sock *msk, 256 - const struct mptcp_addr_info *addr) 257 - { 258 - struct mptcp_pm_add_entry *entry; 259 - 260 - lockdep_assert_held(&msk->pm.lock); 261 - 262 - list_for_each_entry(entry, &msk->pm.anno_list, list) { 263 - if (mptcp_addresses_equal(&entry->addr, addr, true)) 264 - return entry; 265 - } 266 - 267 - return NULL; 268 - } 269 - 270 - bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk) 271 - { 272 - struct mptcp_pm_add_entry *entry; 273 - struct mptcp_addr_info saddr; 274 - bool ret = false; 275 - 276 - mptcp_local_address((struct sock_common *)sk, &saddr); 277 - 278 - spin_lock_bh(&msk->pm.lock); 279 - list_for_each_entry(entry, &msk->pm.anno_list, list) { 280 - if (mptcp_addresses_equal(&entry->addr, &saddr, true)) { 281 - ret = true; 282 - goto out; 283 - } 284 - } 285 - 286 - out: 287 - spin_unlock_bh(&msk->pm.lock); 288 - return ret; 289 - } 290 - 291 - static void mptcp_pm_add_timer(struct timer_list *timer) 292 - { 293 - struct mptcp_pm_add_entry *entry = from_timer(entry, timer, add_timer); 294 - struct mptcp_sock *msk = entry->sock; 295 - struct sock *sk = (struct sock *)msk; 296 - 297 - pr_debug("msk=%p\n", msk); 298 - 299 - if (!msk) 300 - return; 301 - 302 - if (inet_sk_state_load(sk) == TCP_CLOSE) 303 - return; 304 - 305 - if (!entry->addr.id) 306 - return; 307 - 308 - if (mptcp_pm_should_add_signal_addr(msk)) { 309 - sk_reset_timer(sk, timer, jiffies + TCP_RTO_MAX / 8); 310 - goto out; 311 - } 312 - 313 - spin_lock_bh(&msk->pm.lock); 314 - 315 - if (!mptcp_pm_should_add_signal_addr(msk)) { 316 - pr_debug("retransmit ADD_ADDR id=%d\n", entry->addr.id); 317 - mptcp_pm_announce_addr(msk, &entry->addr, false); 318 - mptcp_pm_add_addr_send_ack(msk); 319 - entry->retrans_times++; 320 - } 321 - 322 - if (entry->retrans_times < ADD_ADDR_RETRANS_MAX) 323 - sk_reset_timer(sk, timer, 324 - jiffies + mptcp_get_add_addr_timeout(sock_net(sk))); 325 - 326 - spin_unlock_bh(&msk->pm.lock); 327 - 328 - if (entry->retrans_times == ADD_ADDR_RETRANS_MAX) 329 - mptcp_pm_subflow_established(msk); 330 - 331 - out: 332 - __sock_put(sk); 333 - } 334 - 335 - struct mptcp_pm_add_entry * 336 - mptcp_pm_del_add_timer(struct mptcp_sock *msk, 337 - const struct mptcp_addr_info *addr, bool check_id) 338 - { 339 - struct mptcp_pm_add_entry *entry; 340 - struct sock *sk = (struct sock *)msk; 341 - struct timer_list *add_timer = NULL; 342 - 343 - spin_lock_bh(&msk->pm.lock); 344 - entry = mptcp_lookup_anno_list_by_saddr(msk, addr); 345 - if (entry && (!check_id || entry->addr.id == addr->id)) { 346 - entry->retrans_times = ADD_ADDR_RETRANS_MAX; 347 - add_timer = &entry->add_timer; 348 - } 349 - if (!check_id && entry) 350 - list_del(&entry->list); 351 - spin_unlock_bh(&msk->pm.lock); 352 - 353 - /* no lock, because sk_stop_timer_sync() is calling del_timer_sync() */ 354 - if (add_timer) 355 - sk_stop_timer_sync(sk, add_timer); 356 - 357 - return entry; 358 - } 359 - 360 - bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, 361 - const struct mptcp_addr_info *addr) 362 - { 363 - struct mptcp_pm_add_entry *add_entry = NULL; 364 - struct sock *sk = (struct sock *)msk; 365 - struct net *net = sock_net(sk); 366 - 367 - lockdep_assert_held(&msk->pm.lock); 368 - 369 - add_entry = mptcp_lookup_anno_list_by_saddr(msk, addr); 370 - 371 - if (add_entry) { 372 - if (WARN_ON_ONCE(mptcp_pm_is_kernel(msk))) 373 - return false; 374 - 375 - sk_reset_timer(sk, &add_entry->add_timer, 376 - jiffies + mptcp_get_add_addr_timeout(net)); 377 - return true; 378 - } 379 - 380 - add_entry = kmalloc(sizeof(*add_entry), GFP_ATOMIC); 381 - if (!add_entry) 382 - return false; 383 - 384 - list_add(&add_entry->list, &msk->pm.anno_list); 385 - 386 - add_entry->addr = *addr; 387 - add_entry->sock = msk; 388 - add_entry->retrans_times = 0; 389 - 390 - timer_setup(&add_entry->add_timer, mptcp_pm_add_timer, 0); 391 - sk_reset_timer(sk, &add_entry->add_timer, 392 - jiffies + mptcp_get_add_addr_timeout(net)); 393 - 394 - return true; 395 - } 396 - 397 - void mptcp_pm_free_anno_list(struct mptcp_sock *msk) 398 - { 399 - struct mptcp_pm_add_entry *entry, *tmp; 400 - struct sock *sk = (struct sock *)msk; 401 - LIST_HEAD(free_list); 402 - 403 - pr_debug("msk=%p\n", msk); 404 - 405 - spin_lock_bh(&msk->pm.lock); 406 - list_splice_init(&msk->pm.anno_list, &free_list); 407 - spin_unlock_bh(&msk->pm.lock); 408 - 409 - list_for_each_entry_safe(entry, tmp, &free_list, list) { 410 - sk_stop_timer_sync(sk, &entry->add_timer); 411 - kfree(entry); 412 - } 413 - } 414 - 415 - /* Fill all the remote addresses into the array addrs[], 416 - * and return the array size. 417 - */ 418 - static unsigned int fill_remote_addresses_vec(struct mptcp_sock *msk, 419 - struct mptcp_addr_info *local, 420 - bool fullmesh, 421 - struct mptcp_addr_info *addrs) 422 - { 423 - bool deny_id0 = READ_ONCE(msk->pm.remote_deny_join_id0); 424 - struct sock *sk = (struct sock *)msk, *ssk; 425 - struct mptcp_subflow_context *subflow; 426 - struct mptcp_addr_info remote = { 0 }; 427 - unsigned int subflows_max; 428 - int i = 0; 429 - 430 - subflows_max = mptcp_pm_get_subflows_max(msk); 431 - remote_address((struct sock_common *)sk, &remote); 432 - 433 - /* Non-fullmesh endpoint, fill in the single entry 434 - * corresponding to the primary MPC subflow remote address 435 - */ 436 - if (!fullmesh) { 437 - if (deny_id0) 438 - return 0; 439 - 440 - if (!mptcp_pm_addr_families_match(sk, local, &remote)) 441 - return 0; 442 - 443 - msk->pm.subflows++; 444 - addrs[i++] = remote; 445 - } else { 446 - DECLARE_BITMAP(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1); 447 - 448 - /* Forbid creation of new subflows matching existing 449 - * ones, possibly already created by incoming ADD_ADDR 450 - */ 451 - bitmap_zero(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1); 452 - mptcp_for_each_subflow(msk, subflow) 453 - if (READ_ONCE(subflow->local_id) == local->id) 454 - __set_bit(subflow->remote_id, unavail_id); 455 - 456 - mptcp_for_each_subflow(msk, subflow) { 457 - ssk = mptcp_subflow_tcp_sock(subflow); 458 - remote_address((struct sock_common *)ssk, &addrs[i]); 459 - addrs[i].id = READ_ONCE(subflow->remote_id); 460 - if (deny_id0 && !addrs[i].id) 461 - continue; 462 - 463 - if (test_bit(addrs[i].id, unavail_id)) 464 - continue; 465 - 466 - if (!mptcp_pm_addr_families_match(sk, local, &addrs[i])) 467 - continue; 468 - 469 - if (msk->pm.subflows < subflows_max) { 470 - /* forbid creating multiple address towards 471 - * this id 472 - */ 473 - __set_bit(addrs[i].id, unavail_id); 474 - msk->pm.subflows++; 475 - i++; 476 - } 477 - } 478 - } 479 - 480 - return i; 481 - } 482 - 483 - static void __mptcp_pm_send_ack(struct mptcp_sock *msk, struct mptcp_subflow_context *subflow, 484 - bool prio, bool backup) 485 - { 486 - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 487 - bool slow; 488 - 489 - pr_debug("send ack for %s\n", 490 - prio ? "mp_prio" : (mptcp_pm_should_add_signal(msk) ? "add_addr" : "rm_addr")); 491 - 492 - slow = lock_sock_fast(ssk); 493 - if (prio) { 494 - subflow->send_mp_prio = 1; 495 - subflow->request_bkup = backup; 496 - } 497 - 498 - __mptcp_subflow_send_ack(ssk); 499 - unlock_sock_fast(ssk, slow); 500 - } 501 - 502 - static void mptcp_pm_send_ack(struct mptcp_sock *msk, struct mptcp_subflow_context *subflow, 503 - bool prio, bool backup) 504 - { 505 - spin_unlock_bh(&msk->pm.lock); 506 - __mptcp_pm_send_ack(msk, subflow, prio, backup); 507 - spin_lock_bh(&msk->pm.lock); 508 - } 509 - 510 - static struct mptcp_pm_addr_entry * 511 - __lookup_addr_by_id(struct pm_nl_pernet *pernet, unsigned int id) 512 - { 513 - struct mptcp_pm_addr_entry *entry; 514 - 515 - list_for_each_entry_rcu(entry, &pernet->local_addr_list, list, 516 - lockdep_is_held(&pernet->lock)) { 517 - if (entry->addr.id == id) 518 - return entry; 519 - } 520 - return NULL; 521 - } 522 - 523 - static struct mptcp_pm_addr_entry * 524 - __lookup_addr(struct pm_nl_pernet *pernet, const struct mptcp_addr_info *info) 525 - { 526 - struct mptcp_pm_addr_entry *entry; 527 - 528 - list_for_each_entry_rcu(entry, &pernet->local_addr_list, list, 529 - lockdep_is_held(&pernet->lock)) { 530 - if (mptcp_addresses_equal(&entry->addr, info, entry->addr.port)) 531 - return entry; 532 - } 533 - return NULL; 534 - } 535 - 536 - static void mptcp_pm_create_subflow_or_signal_addr(struct mptcp_sock *msk) 537 - { 538 - struct sock *sk = (struct sock *)msk; 539 - unsigned int add_addr_signal_max; 540 - bool signal_and_subflow = false; 541 - unsigned int local_addr_max; 542 - struct pm_nl_pernet *pernet; 543 - struct mptcp_pm_local local; 544 - unsigned int subflows_max; 545 - 546 - pernet = pm_nl_get_pernet(sock_net(sk)); 547 - 548 - add_addr_signal_max = mptcp_pm_get_add_addr_signal_max(msk); 549 - local_addr_max = mptcp_pm_get_local_addr_max(msk); 550 - subflows_max = mptcp_pm_get_subflows_max(msk); 551 - 552 - /* do lazy endpoint usage accounting for the MPC subflows */ 553 - if (unlikely(!(msk->pm.status & BIT(MPTCP_PM_MPC_ENDPOINT_ACCOUNTED))) && msk->first) { 554 - struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(msk->first); 555 - struct mptcp_pm_addr_entry *entry; 556 - struct mptcp_addr_info mpc_addr; 557 - bool backup = false; 558 - 559 - mptcp_local_address((struct sock_common *)msk->first, &mpc_addr); 560 - rcu_read_lock(); 561 - entry = __lookup_addr(pernet, &mpc_addr); 562 - if (entry) { 563 - __clear_bit(entry->addr.id, msk->pm.id_avail_bitmap); 564 - msk->mpc_endpoint_id = entry->addr.id; 565 - backup = !!(entry->flags & MPTCP_PM_ADDR_FLAG_BACKUP); 566 - } 567 - rcu_read_unlock(); 568 - 569 - if (backup) 570 - mptcp_pm_send_ack(msk, subflow, true, backup); 571 - 572 - msk->pm.status |= BIT(MPTCP_PM_MPC_ENDPOINT_ACCOUNTED); 573 - } 574 - 575 - pr_debug("local %d:%d signal %d:%d subflows %d:%d\n", 576 - msk->pm.local_addr_used, local_addr_max, 577 - msk->pm.add_addr_signaled, add_addr_signal_max, 578 - msk->pm.subflows, subflows_max); 579 - 580 - /* check first for announce */ 581 - if (msk->pm.add_addr_signaled < add_addr_signal_max) { 582 - /* due to racing events on both ends we can reach here while 583 - * previous add address is still running: if we invoke now 584 - * mptcp_pm_announce_addr(), that will fail and the 585 - * corresponding id will be marked as used. 586 - * Instead let the PM machinery reschedule us when the 587 - * current address announce will be completed. 588 - */ 589 - if (msk->pm.addr_signal & BIT(MPTCP_ADD_ADDR_SIGNAL)) 590 - return; 591 - 592 - if (!select_signal_address(pernet, msk, &local)) 593 - goto subflow; 594 - 595 - /* If the alloc fails, we are on memory pressure, not worth 596 - * continuing, and trying to create subflows. 597 - */ 598 - if (!mptcp_pm_alloc_anno_list(msk, &local.addr)) 599 - return; 600 - 601 - __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); 602 - msk->pm.add_addr_signaled++; 603 - 604 - /* Special case for ID0: set the correct ID */ 605 - if (local.addr.id == msk->mpc_endpoint_id) 606 - local.addr.id = 0; 607 - 608 - mptcp_pm_announce_addr(msk, &local.addr, false); 609 - mptcp_pm_nl_addr_send_ack(msk); 610 - 611 - if (local.flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) 612 - signal_and_subflow = true; 613 - } 614 - 615 - subflow: 616 - /* check if should create a new subflow */ 617 - while (msk->pm.local_addr_used < local_addr_max && 618 - msk->pm.subflows < subflows_max) { 619 - struct mptcp_addr_info addrs[MPTCP_PM_ADDR_MAX]; 620 - bool fullmesh; 621 - int i, nr; 622 - 623 - if (signal_and_subflow) 624 - signal_and_subflow = false; 625 - else if (!select_local_address(pernet, msk, &local)) 626 - break; 627 - 628 - fullmesh = !!(local.flags & MPTCP_PM_ADDR_FLAG_FULLMESH); 629 - 630 - __clear_bit(local.addr.id, msk->pm.id_avail_bitmap); 631 - 632 - /* Special case for ID0: set the correct ID */ 633 - if (local.addr.id == msk->mpc_endpoint_id) 634 - local.addr.id = 0; 635 - else /* local_addr_used is not decr for ID 0 */ 636 - msk->pm.local_addr_used++; 637 - 638 - nr = fill_remote_addresses_vec(msk, &local.addr, fullmesh, addrs); 639 - if (nr == 0) 640 - continue; 641 - 642 - spin_unlock_bh(&msk->pm.lock); 643 - for (i = 0; i < nr; i++) 644 - __mptcp_subflow_connect(sk, &local, &addrs[i]); 645 - spin_lock_bh(&msk->pm.lock); 646 - } 647 - mptcp_pm_nl_check_work_pending(msk); 648 - } 649 - 650 - static void mptcp_pm_nl_fully_established(struct mptcp_sock *msk) 651 - { 652 - mptcp_pm_create_subflow_or_signal_addr(msk); 653 - } 654 - 655 - static void mptcp_pm_nl_subflow_established(struct mptcp_sock *msk) 656 - { 657 - mptcp_pm_create_subflow_or_signal_addr(msk); 658 - } 659 - 660 - /* Fill all the local addresses into the array addrs[], 661 - * and return the array size. 662 - */ 663 - static unsigned int fill_local_addresses_vec(struct mptcp_sock *msk, 664 - struct mptcp_addr_info *remote, 665 - struct mptcp_pm_local *locals) 666 - { 667 - struct sock *sk = (struct sock *)msk; 668 - struct mptcp_pm_addr_entry *entry; 669 - struct mptcp_addr_info mpc_addr; 670 - struct pm_nl_pernet *pernet; 671 - unsigned int subflows_max; 672 - int i = 0; 673 - 674 - pernet = pm_nl_get_pernet_from_msk(msk); 675 - subflows_max = mptcp_pm_get_subflows_max(msk); 676 - 677 - mptcp_local_address((struct sock_common *)msk, &mpc_addr); 678 - 679 - rcu_read_lock(); 680 - list_for_each_entry_rcu(entry, &pernet->local_addr_list, list) { 681 - if (!(entry->flags & MPTCP_PM_ADDR_FLAG_FULLMESH)) 682 - continue; 683 - 684 - if (!mptcp_pm_addr_families_match(sk, &entry->addr, remote)) 685 - continue; 686 - 687 - if (msk->pm.subflows < subflows_max) { 688 - locals[i].addr = entry->addr; 689 - locals[i].flags = entry->flags; 690 - locals[i].ifindex = entry->ifindex; 691 - 692 - /* Special case for ID0: set the correct ID */ 693 - if (mptcp_addresses_equal(&locals[i].addr, &mpc_addr, locals[i].addr.port)) 694 - locals[i].addr.id = 0; 695 - 696 - msk->pm.subflows++; 697 - i++; 698 - } 699 - } 700 - rcu_read_unlock(); 701 - 702 - /* If the array is empty, fill in the single 703 - * 'IPADDRANY' local address 704 - */ 705 - if (!i) { 706 - memset(&locals[i], 0, sizeof(locals[i])); 707 - locals[i].addr.family = 708 - #if IS_ENABLED(CONFIG_MPTCP_IPV6) 709 - remote->family == AF_INET6 && 710 - ipv6_addr_v4mapped(&remote->addr6) ? AF_INET : 711 - #endif 712 - remote->family; 713 - 714 - if (!mptcp_pm_addr_families_match(sk, &locals[i].addr, remote)) 715 - return 0; 716 - 717 - msk->pm.subflows++; 718 - i++; 719 - } 720 - 721 - return i; 722 - } 723 - 724 - static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) 725 - { 726 - struct mptcp_pm_local locals[MPTCP_PM_ADDR_MAX]; 727 - struct sock *sk = (struct sock *)msk; 728 - unsigned int add_addr_accept_max; 729 - struct mptcp_addr_info remote; 730 - unsigned int subflows_max; 731 - bool sf_created = false; 732 - int i, nr; 733 - 734 - add_addr_accept_max = mptcp_pm_get_add_addr_accept_max(msk); 735 - subflows_max = mptcp_pm_get_subflows_max(msk); 736 - 737 - pr_debug("accepted %d:%d remote family %d\n", 738 - msk->pm.add_addr_accepted, add_addr_accept_max, 739 - msk->pm.remote.family); 740 - 741 - remote = msk->pm.remote; 742 - mptcp_pm_announce_addr(msk, &remote, true); 743 - mptcp_pm_nl_addr_send_ack(msk); 744 - 745 - if (lookup_subflow_by_daddr(&msk->conn_list, &remote)) 746 - return; 747 - 748 - /* pick id 0 port, if none is provided the remote address */ 749 - if (!remote.port) 750 - remote.port = sk->sk_dport; 751 - 752 - /* connect to the specified remote address, using whatever 753 - * local address the routing configuration will pick. 754 - */ 755 - nr = fill_local_addresses_vec(msk, &remote, locals); 756 - if (nr == 0) 757 - return; 758 - 759 - spin_unlock_bh(&msk->pm.lock); 760 - for (i = 0; i < nr; i++) 761 - if (__mptcp_subflow_connect(sk, &locals[i], &remote) == 0) 762 - sf_created = true; 763 - spin_lock_bh(&msk->pm.lock); 764 - 765 - if (sf_created) { 766 - /* add_addr_accepted is not decr for ID 0 */ 767 - if (remote.id) 768 - msk->pm.add_addr_accepted++; 769 - if (msk->pm.add_addr_accepted >= add_addr_accept_max || 770 - msk->pm.subflows >= subflows_max) 771 - WRITE_ONCE(msk->pm.accept_addr, false); 772 - } 773 - } 774 - 775 - bool mptcp_pm_nl_is_init_remote_addr(struct mptcp_sock *msk, 776 - const struct mptcp_addr_info *remote) 777 - { 778 - struct mptcp_addr_info mpc_remote; 779 - 780 - remote_address((struct sock_common *)msk, &mpc_remote); 781 - return mptcp_addresses_equal(&mpc_remote, remote, remote->port); 782 - } 783 - 784 - void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk) 785 - { 786 - struct mptcp_subflow_context *subflow, *alt = NULL; 787 - 788 - msk_owned_by_me(msk); 789 - lockdep_assert_held(&msk->pm.lock); 790 - 791 - if (!mptcp_pm_should_add_signal(msk) && 792 - !mptcp_pm_should_rm_signal(msk)) 793 - return; 794 - 795 - mptcp_for_each_subflow(msk, subflow) { 796 - if (__mptcp_subflow_active(subflow)) { 797 - if (!subflow->stale) { 798 - mptcp_pm_send_ack(msk, subflow, false, false); 799 - return; 800 - } 801 - 802 - if (!alt) 803 - alt = subflow; 804 - } 805 - } 806 - 807 - if (alt) 808 - mptcp_pm_send_ack(msk, alt, false, false); 809 - } 810 - 811 - int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk, 812 - struct mptcp_addr_info *addr, 813 - struct mptcp_addr_info *rem, 814 - u8 bkup) 815 - { 816 - struct mptcp_subflow_context *subflow; 817 - 818 - pr_debug("bkup=%d\n", bkup); 819 - 820 - mptcp_for_each_subflow(msk, subflow) { 821 - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 822 - struct mptcp_addr_info local, remote; 823 - 824 - mptcp_local_address((struct sock_common *)ssk, &local); 825 - if (!mptcp_addresses_equal(&local, addr, addr->port)) 826 - continue; 827 - 828 - if (rem && rem->family != AF_UNSPEC) { 829 - remote_address((struct sock_common *)ssk, &remote); 830 - if (!mptcp_addresses_equal(&remote, rem, rem->port)) 831 - continue; 832 - } 833 - 834 - __mptcp_pm_send_ack(msk, subflow, true, bkup); 835 - return 0; 836 - } 837 - 838 - return -EINVAL; 839 - } 840 - 841 - static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk, 842 - const struct mptcp_rm_list *rm_list, 843 - enum linux_mptcp_mib_field rm_type) 844 - { 845 - struct mptcp_subflow_context *subflow, *tmp; 846 - struct sock *sk = (struct sock *)msk; 847 - u8 i; 848 - 849 - pr_debug("%s rm_list_nr %d\n", 850 - rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow", rm_list->nr); 851 - 852 - msk_owned_by_me(msk); 853 - 854 - if (sk->sk_state == TCP_LISTEN) 855 - return; 856 - 857 - if (!rm_list->nr) 858 - return; 859 - 860 - if (list_empty(&msk->conn_list)) 861 - return; 862 - 863 - for (i = 0; i < rm_list->nr; i++) { 864 - u8 rm_id = rm_list->ids[i]; 865 - bool removed = false; 866 - 867 - mptcp_for_each_subflow_safe(msk, subflow, tmp) { 868 - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); 869 - u8 remote_id = READ_ONCE(subflow->remote_id); 870 - int how = RCV_SHUTDOWN | SEND_SHUTDOWN; 871 - u8 id = subflow_get_local_id(subflow); 872 - 873 - if ((1 << inet_sk_state_load(ssk)) & 874 - (TCPF_FIN_WAIT1 | TCPF_FIN_WAIT2 | TCPF_CLOSING | TCPF_CLOSE)) 875 - continue; 876 - if (rm_type == MPTCP_MIB_RMADDR && remote_id != rm_id) 877 - continue; 878 - if (rm_type == MPTCP_MIB_RMSUBFLOW && id != rm_id) 879 - continue; 880 - 881 - pr_debug(" -> %s rm_list_ids[%d]=%u local_id=%u remote_id=%u mpc_id=%u\n", 882 - rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow", 883 - i, rm_id, id, remote_id, msk->mpc_endpoint_id); 884 - spin_unlock_bh(&msk->pm.lock); 885 - mptcp_subflow_shutdown(sk, ssk, how); 886 - removed |= subflow->request_join; 887 - 888 - /* the following takes care of updating the subflows counter */ 889 - mptcp_close_ssk(sk, ssk, subflow); 890 - spin_lock_bh(&msk->pm.lock); 891 - 892 - if (rm_type == MPTCP_MIB_RMSUBFLOW) 893 - __MPTCP_INC_STATS(sock_net(sk), rm_type); 894 - } 895 - 896 - if (rm_type == MPTCP_MIB_RMADDR) 897 - __MPTCP_INC_STATS(sock_net(sk), rm_type); 898 - 899 - if (!removed) 900 - continue; 901 - 902 - if (!mptcp_pm_is_kernel(msk)) 903 - continue; 904 - 905 - if (rm_type == MPTCP_MIB_RMADDR && rm_id && 906 - !WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) { 907 - /* Note: if the subflow has been closed before, this 908 - * add_addr_accepted counter will not be decremented. 909 - */ 910 - if (--msk->pm.add_addr_accepted < mptcp_pm_get_add_addr_accept_max(msk)) 911 - WRITE_ONCE(msk->pm.accept_addr, true); 912 - } 913 - } 914 - } 915 - 916 - static void mptcp_pm_nl_rm_addr_received(struct mptcp_sock *msk) 917 - { 918 - mptcp_pm_nl_rm_addr_or_subflow(msk, &msk->pm.rm_list_rx, MPTCP_MIB_RMADDR); 919 - } 920 - 921 - static void mptcp_pm_nl_rm_subflow_received(struct mptcp_sock *msk, 922 - const struct mptcp_rm_list *rm_list) 923 - { 924 - mptcp_pm_nl_rm_addr_or_subflow(msk, rm_list, MPTCP_MIB_RMSUBFLOW); 925 - } 926 - 927 - void mptcp_pm_nl_work(struct mptcp_sock *msk) 928 - { 929 - struct mptcp_pm_data *pm = &msk->pm; 930 - 931 - msk_owned_by_me(msk); 932 - 933 - if (!(pm->status & MPTCP_PM_WORK_MASK)) 934 - return; 935 - 936 - spin_lock_bh(&msk->pm.lock); 937 - 938 - pr_debug("msk=%p status=%x\n", msk, pm->status); 939 - if (pm->status & BIT(MPTCP_PM_ADD_ADDR_RECEIVED)) { 940 - pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_RECEIVED); 941 - mptcp_pm_nl_add_addr_received(msk); 942 - } 943 - if (pm->status & BIT(MPTCP_PM_ADD_ADDR_SEND_ACK)) { 944 - pm->status &= ~BIT(MPTCP_PM_ADD_ADDR_SEND_ACK); 945 - mptcp_pm_nl_addr_send_ack(msk); 946 - } 947 - if (pm->status & BIT(MPTCP_PM_RM_ADDR_RECEIVED)) { 948 - pm->status &= ~BIT(MPTCP_PM_RM_ADDR_RECEIVED); 949 - mptcp_pm_nl_rm_addr_received(msk); 950 - } 951 - if (pm->status & BIT(MPTCP_PM_ESTABLISHED)) { 952 - pm->status &= ~BIT(MPTCP_PM_ESTABLISHED); 953 - mptcp_pm_nl_fully_established(msk); 954 - } 955 - if (pm->status & BIT(MPTCP_PM_SUBFLOW_ESTABLISHED)) { 956 - pm->status &= ~BIT(MPTCP_PM_SUBFLOW_ESTABLISHED); 957 - mptcp_pm_nl_subflow_established(msk); 958 - } 959 - 960 - spin_unlock_bh(&msk->pm.lock); 961 - } 962 - 963 - static bool address_use_port(struct mptcp_pm_addr_entry *entry) 964 - { 965 - return (entry->flags & 966 - (MPTCP_PM_ADDR_FLAG_SIGNAL | MPTCP_PM_ADDR_FLAG_SUBFLOW)) == 967 - MPTCP_PM_ADDR_FLAG_SIGNAL; 968 - } 969 - 970 - /* caller must ensure the RCU grace period is already elapsed */ 971 - static void __mptcp_pm_release_addr_entry(struct mptcp_pm_addr_entry *entry) 972 - { 973 - if (entry->lsk) 974 - sock_release(entry->lsk); 975 - kfree(entry); 976 - } 977 - 978 - static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet, 979 - struct mptcp_pm_addr_entry *entry, 980 - bool needs_id, bool replace) 981 - { 982 - struct mptcp_pm_addr_entry *cur, *del_entry = NULL; 983 - unsigned int addr_max; 984 - int ret = -EINVAL; 985 - 986 - spin_lock_bh(&pernet->lock); 987 - /* to keep the code simple, don't do IDR-like allocation for address ID, 988 - * just bail when we exceed limits 989 - */ 990 - if (pernet->next_id == MPTCP_PM_MAX_ADDR_ID) 991 - pernet->next_id = 1; 992 - if (pernet->addrs >= MPTCP_PM_ADDR_MAX) { 993 - ret = -ERANGE; 994 - goto out; 995 - } 996 - if (test_bit(entry->addr.id, pernet->id_bitmap)) { 997 - ret = -EBUSY; 998 - goto out; 999 - } 1000 - 1001 - /* do not insert duplicate address, differentiate on port only 1002 - * singled addresses 1003 - */ 1004 - if (!address_use_port(entry)) 1005 - entry->addr.port = 0; 1006 - list_for_each_entry(cur, &pernet->local_addr_list, list) { 1007 - if (mptcp_addresses_equal(&cur->addr, &entry->addr, 1008 - cur->addr.port || entry->addr.port)) { 1009 - /* allow replacing the exiting endpoint only if such 1010 - * endpoint is an implicit one and the user-space 1011 - * did not provide an endpoint id 1012 - */ 1013 - if (!(cur->flags & MPTCP_PM_ADDR_FLAG_IMPLICIT)) { 1014 - ret = -EEXIST; 1015 - goto out; 1016 - } 1017 - if (entry->addr.id) 1018 - goto out; 1019 - 1020 - /* allow callers that only need to look up the local 1021 - * addr's id to skip replacement. This allows them to 1022 - * avoid calling synchronize_rcu in the packet recv 1023 - * path. 1024 - */ 1025 - if (!replace) { 1026 - kfree(entry); 1027 - ret = cur->addr.id; 1028 - goto out; 1029 - } 1030 - 1031 - pernet->addrs--; 1032 - entry->addr.id = cur->addr.id; 1033 - list_del_rcu(&cur->list); 1034 - del_entry = cur; 1035 - break; 1036 - } 1037 - } 1038 - 1039 - if (!entry->addr.id && needs_id) { 1040 - find_next: 1041 - entry->addr.id = find_next_zero_bit(pernet->id_bitmap, 1042 - MPTCP_PM_MAX_ADDR_ID + 1, 1043 - pernet->next_id); 1044 - if (!entry->addr.id && pernet->next_id != 1) { 1045 - pernet->next_id = 1; 1046 - goto find_next; 1047 - } 1048 - } 1049 - 1050 - if (!entry->addr.id && needs_id) 1051 - goto out; 1052 - 1053 - __set_bit(entry->addr.id, pernet->id_bitmap); 1054 - if (entry->addr.id > pernet->next_id) 1055 - pernet->next_id = entry->addr.id; 1056 - 1057 - if (entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL) { 1058 - addr_max = pernet->add_addr_signal_max; 1059 - WRITE_ONCE(pernet->add_addr_signal_max, addr_max + 1); 1060 - } 1061 - if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) { 1062 - addr_max = pernet->local_addr_max; 1063 - WRITE_ONCE(pernet->local_addr_max, addr_max + 1); 1064 - } 1065 - 1066 - pernet->addrs++; 1067 - if (!entry->addr.port) 1068 - list_add_tail_rcu(&entry->list, &pernet->local_addr_list); 1069 - else 1070 - list_add_rcu(&entry->list, &pernet->local_addr_list); 1071 - ret = entry->addr.id; 1072 - 1073 - out: 1074 - spin_unlock_bh(&pernet->lock); 1075 - 1076 - /* just replaced an existing entry, free it */ 1077 - if (del_entry) { 1078 - synchronize_rcu(); 1079 - __mptcp_pm_release_addr_entry(del_entry); 1080 - } 1081 - return ret; 1082 - } 1083 - 1084 - static struct lock_class_key mptcp_slock_keys[2]; 1085 - static struct lock_class_key mptcp_keys[2]; 1086 - 1087 - static int mptcp_pm_nl_create_listen_socket(struct sock *sk, 1088 - struct mptcp_pm_addr_entry *entry) 1089 - { 1090 - bool is_ipv6 = sk->sk_family == AF_INET6; 1091 - int addrlen = sizeof(struct sockaddr_in); 1092 - struct sockaddr_storage addr; 1093 - struct sock *newsk, *ssk; 1094 - int backlog = 1024; 1095 - int err; 1096 - 1097 - err = sock_create_kern(sock_net(sk), entry->addr.family, 1098 - SOCK_STREAM, IPPROTO_MPTCP, &entry->lsk); 1099 - if (err) 1100 - return err; 1101 - 1102 - newsk = entry->lsk->sk; 1103 - if (!newsk) 1104 - return -EINVAL; 1105 - 1106 - /* The subflow socket lock is acquired in a nested to the msk one 1107 - * in several places, even by the TCP stack, and this msk is a kernel 1108 - * socket: lockdep complains. Instead of propagating the _nested 1109 - * modifiers in several places, re-init the lock class for the msk 1110 - * socket to an mptcp specific one. 1111 - */ 1112 - sock_lock_init_class_and_name(newsk, 1113 - is_ipv6 ? "mlock-AF_INET6" : "mlock-AF_INET", 1114 - &mptcp_slock_keys[is_ipv6], 1115 - is_ipv6 ? "msk_lock-AF_INET6" : "msk_lock-AF_INET", 1116 - &mptcp_keys[is_ipv6]); 1117 - 1118 - lock_sock(newsk); 1119 - ssk = __mptcp_nmpc_sk(mptcp_sk(newsk)); 1120 - release_sock(newsk); 1121 - if (IS_ERR(ssk)) 1122 - return PTR_ERR(ssk); 1123 - 1124 - mptcp_info2sockaddr(&entry->addr, &addr, entry->addr.family); 1125 - #if IS_ENABLED(CONFIG_MPTCP_IPV6) 1126 - if (entry->addr.family == AF_INET6) 1127 - addrlen = sizeof(struct sockaddr_in6); 1128 - #endif 1129 - if (ssk->sk_family == AF_INET) 1130 - err = inet_bind_sk(ssk, (struct sockaddr *)&addr, addrlen); 1131 - #if IS_ENABLED(CONFIG_MPTCP_IPV6) 1132 - else if (ssk->sk_family == AF_INET6) 1133 - err = inet6_bind_sk(ssk, (struct sockaddr *)&addr, addrlen); 1134 - #endif 1135 - if (err) 1136 - return err; 1137 - 1138 - /* We don't use mptcp_set_state() here because it needs to be called 1139 - * under the msk socket lock. For the moment, that will not bring 1140 - * anything more than only calling inet_sk_state_store(), because the 1141 - * old status is known (TCP_CLOSE). 1142 - */ 1143 - inet_sk_state_store(newsk, TCP_LISTEN); 1144 - lock_sock(ssk); 1145 - WRITE_ONCE(mptcp_subflow_ctx(ssk)->pm_listener, true); 1146 - err = __inet_listen_sk(ssk, backlog); 1147 - if (!err) 1148 - mptcp_event_pm_listener(ssk, MPTCP_EVENT_LISTENER_CREATED); 1149 - release_sock(ssk); 1150 - return err; 1151 - } 1152 - 1153 - int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc) 1154 - { 1155 - struct mptcp_pm_addr_entry *entry; 1156 - struct pm_nl_pernet *pernet; 1157 - int ret; 1158 - 1159 - pernet = pm_nl_get_pernet_from_msk(msk); 1160 - 1161 - rcu_read_lock(); 1162 - entry = __lookup_addr(pernet, skc); 1163 - ret = entry ? entry->addr.id : -1; 1164 - rcu_read_unlock(); 1165 - if (ret >= 0) 1166 - return ret; 1167 - 1168 - /* address not found, add to local list */ 1169 - entry = kmalloc(sizeof(*entry), GFP_ATOMIC); 1170 - if (!entry) 1171 - return -ENOMEM; 1172 - 1173 - entry->addr = *skc; 1174 - entry->addr.id = 0; 1175 - entry->addr.port = 0; 1176 - entry->ifindex = 0; 1177 - entry->flags = MPTCP_PM_ADDR_FLAG_IMPLICIT; 1178 - entry->lsk = NULL; 1179 - ret = mptcp_pm_nl_append_new_local_addr(pernet, entry, true, false); 1180 - if (ret < 0) 1181 - kfree(entry); 1182 - 1183 - return ret; 1184 - } 1185 - 1186 - bool mptcp_pm_nl_is_backup(struct mptcp_sock *msk, struct mptcp_addr_info *skc) 1187 - { 1188 - struct pm_nl_pernet *pernet = pm_nl_get_pernet_from_msk(msk); 1189 - struct mptcp_pm_addr_entry *entry; 1190 - bool backup; 1191 - 1192 - rcu_read_lock(); 1193 - entry = __lookup_addr(pernet, skc); 1194 - backup = entry && !!(entry->flags & MPTCP_PM_ADDR_FLAG_BACKUP); 1195 - rcu_read_unlock(); 1196 - 1197 - return backup; 1198 - } 1199 11 1200 12 #define MPTCP_PM_CMD_GRP_OFFSET 0 1201 13 #define MPTCP_PM_EV_GRP_OFFSET 1 ··· 18 1206 .flags = GENL_MCAST_CAP_NET_ADMIN, 19 1207 }, 20 1208 }; 21 - 22 - void mptcp_pm_nl_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk) 23 - { 24 - struct mptcp_subflow_context *iter, *subflow = mptcp_subflow_ctx(ssk); 25 - struct sock *sk = (struct sock *)msk; 26 - unsigned int active_max_loss_cnt; 27 - struct net *net = sock_net(sk); 28 - unsigned int stale_loss_cnt; 29 - bool slow; 30 - 31 - stale_loss_cnt = mptcp_stale_loss_cnt(net); 32 - if (subflow->stale || !stale_loss_cnt || subflow->stale_count <= stale_loss_cnt) 33 - return; 34 - 35 - /* look for another available subflow not in loss state */ 36 - active_max_loss_cnt = max_t(int, stale_loss_cnt - 1, 1); 37 - mptcp_for_each_subflow(msk, iter) { 38 - if (iter != subflow && mptcp_subflow_active(iter) && 39 - iter->stale_count < active_max_loss_cnt) { 40 - /* we have some alternatives, try to mark this subflow as idle ...*/ 41 - slow = lock_sock_fast(ssk); 42 - if (!tcp_rtx_and_write_queues_empty(ssk)) { 43 - subflow->stale = 1; 44 - __mptcp_retransmit_pending_data(sk); 45 - MPTCP_INC_STATS(net, MPTCP_MIB_SUBFLOWSTALE); 46 - } 47 - unlock_sock_fast(ssk, slow); 48 - 49 - /* always try to push the pending data regardless of re-injections: 50 - * we can possibly use backup subflows now, and subflow selection 51 - * is cheap under the msk socket lock 52 - */ 53 - __mptcp_push_pending(sk, 0); 54 - return; 55 - } 56 - } 57 - } 58 1209 59 1210 static int mptcp_pm_family_to_addr(int family) 60 1211 { ··· 127 1352 return 0; 128 1353 } 129 1354 130 - static struct pm_nl_pernet *genl_info_pm_nl(struct genl_info *info) 131 - { 132 - return pm_nl_get_pernet(genl_info_net(info)); 133 - } 134 - 135 - static int mptcp_nl_add_subflow_or_signal_addr(struct net *net, 136 - struct mptcp_addr_info *addr) 137 - { 138 - struct mptcp_sock *msk; 139 - long s_slot = 0, s_num = 0; 140 - 141 - while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 142 - struct sock *sk = (struct sock *)msk; 143 - struct mptcp_addr_info mpc_addr; 144 - 145 - if (!READ_ONCE(msk->fully_established) || 146 - mptcp_pm_is_userspace(msk)) 147 - goto next; 148 - 149 - /* if the endp linked to the init sf is re-added with a != ID */ 150 - mptcp_local_address((struct sock_common *)msk, &mpc_addr); 151 - 152 - lock_sock(sk); 153 - spin_lock_bh(&msk->pm.lock); 154 - if (mptcp_addresses_equal(addr, &mpc_addr, addr->port)) 155 - msk->mpc_endpoint_id = addr->id; 156 - mptcp_pm_create_subflow_or_signal_addr(msk); 157 - spin_unlock_bh(&msk->pm.lock); 158 - release_sock(sk); 159 - 160 - next: 161 - sock_put(sk); 162 - cond_resched(); 163 - } 164 - 165 - return 0; 166 - } 167 - 168 - static bool mptcp_pm_has_addr_attr_id(const struct nlattr *attr, 169 - struct genl_info *info) 170 - { 171 - struct nlattr *tb[MPTCP_PM_ADDR_ATTR_MAX + 1]; 172 - 173 - if (!nla_parse_nested_deprecated(tb, MPTCP_PM_ADDR_ATTR_MAX, attr, 174 - mptcp_pm_address_nl_policy, info->extack) && 175 - tb[MPTCP_PM_ADDR_ATTR_ID]) 176 - return true; 177 - return false; 178 - } 179 - 180 - int mptcp_pm_nl_add_addr_doit(struct sk_buff *skb, struct genl_info *info) 181 - { 182 - struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 183 - struct mptcp_pm_addr_entry addr, *entry; 184 - struct nlattr *attr; 185 - int ret; 186 - 187 - if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ENDPOINT_ADDR)) 188 - return -EINVAL; 189 - 190 - attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR]; 191 - ret = mptcp_pm_parse_entry(attr, info, true, &addr); 192 - if (ret < 0) 193 - return ret; 194 - 195 - if (addr.addr.port && !address_use_port(&addr)) { 196 - NL_SET_ERR_MSG_ATTR(info->extack, attr, 197 - "flags must have signal and not subflow when using port"); 198 - return -EINVAL; 199 - } 200 - 201 - if (addr.flags & MPTCP_PM_ADDR_FLAG_SIGNAL && 202 - addr.flags & MPTCP_PM_ADDR_FLAG_FULLMESH) { 203 - NL_SET_ERR_MSG_ATTR(info->extack, attr, 204 - "flags mustn't have both signal and fullmesh"); 205 - return -EINVAL; 206 - } 207 - 208 - if (addr.flags & MPTCP_PM_ADDR_FLAG_IMPLICIT) { 209 - NL_SET_ERR_MSG_ATTR(info->extack, attr, 210 - "can't create IMPLICIT endpoint"); 211 - return -EINVAL; 212 - } 213 - 214 - entry = kzalloc(sizeof(*entry), GFP_KERNEL_ACCOUNT); 215 - if (!entry) { 216 - GENL_SET_ERR_MSG(info, "can't allocate addr"); 217 - return -ENOMEM; 218 - } 219 - 220 - *entry = addr; 221 - if (entry->addr.port) { 222 - ret = mptcp_pm_nl_create_listen_socket(skb->sk, entry); 223 - if (ret) { 224 - GENL_SET_ERR_MSG_FMT(info, "create listen socket error: %d", ret); 225 - goto out_free; 226 - } 227 - } 228 - ret = mptcp_pm_nl_append_new_local_addr(pernet, entry, 229 - !mptcp_pm_has_addr_attr_id(attr, info), 230 - true); 231 - if (ret < 0) { 232 - GENL_SET_ERR_MSG_FMT(info, "too many addresses or duplicate one: %d", ret); 233 - goto out_free; 234 - } 235 - 236 - mptcp_nl_add_subflow_or_signal_addr(sock_net(skb->sk), &entry->addr); 237 - return 0; 238 - 239 - out_free: 240 - __mptcp_pm_release_addr_entry(entry); 241 - return ret; 242 - } 243 - 244 - bool mptcp_remove_anno_list_by_saddr(struct mptcp_sock *msk, 245 - const struct mptcp_addr_info *addr) 246 - { 247 - struct mptcp_pm_add_entry *entry; 248 - 249 - entry = mptcp_pm_del_add_timer(msk, addr, false); 250 - if (entry) { 251 - kfree(entry); 252 - return true; 253 - } 254 - 255 - return false; 256 - } 257 - 258 - static u8 mptcp_endp_get_local_id(struct mptcp_sock *msk, 259 - const struct mptcp_addr_info *addr) 260 - { 261 - return msk->mpc_endpoint_id == addr->id ? 0 : addr->id; 262 - } 263 - 264 - static bool mptcp_pm_remove_anno_addr(struct mptcp_sock *msk, 265 - const struct mptcp_addr_info *addr, 266 - bool force) 267 - { 268 - struct mptcp_rm_list list = { .nr = 0 }; 269 - bool ret; 270 - 271 - list.ids[list.nr++] = mptcp_endp_get_local_id(msk, addr); 272 - 273 - ret = mptcp_remove_anno_list_by_saddr(msk, addr); 274 - if (ret || force) { 275 - spin_lock_bh(&msk->pm.lock); 276 - if (ret) { 277 - __set_bit(addr->id, msk->pm.id_avail_bitmap); 278 - msk->pm.add_addr_signaled--; 279 - } 280 - mptcp_pm_remove_addr(msk, &list); 281 - spin_unlock_bh(&msk->pm.lock); 282 - } 283 - return ret; 284 - } 285 - 286 - static void __mark_subflow_endp_available(struct mptcp_sock *msk, u8 id) 287 - { 288 - /* If it was marked as used, and not ID 0, decrement local_addr_used */ 289 - if (!__test_and_set_bit(id ? : msk->mpc_endpoint_id, msk->pm.id_avail_bitmap) && 290 - id && !WARN_ON_ONCE(msk->pm.local_addr_used == 0)) 291 - msk->pm.local_addr_used--; 292 - } 293 - 294 - static int mptcp_nl_remove_subflow_and_signal_addr(struct net *net, 295 - const struct mptcp_pm_addr_entry *entry) 296 - { 297 - const struct mptcp_addr_info *addr = &entry->addr; 298 - struct mptcp_rm_list list = { .nr = 1 }; 299 - long s_slot = 0, s_num = 0; 300 - struct mptcp_sock *msk; 301 - 302 - pr_debug("remove_id=%d\n", addr->id); 303 - 304 - while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 305 - struct sock *sk = (struct sock *)msk; 306 - bool remove_subflow; 307 - 308 - if (mptcp_pm_is_userspace(msk)) 309 - goto next; 310 - 311 - lock_sock(sk); 312 - remove_subflow = mptcp_lookup_subflow_by_saddr(&msk->conn_list, addr); 313 - mptcp_pm_remove_anno_addr(msk, addr, remove_subflow && 314 - !(entry->flags & MPTCP_PM_ADDR_FLAG_IMPLICIT)); 315 - 316 - list.ids[0] = mptcp_endp_get_local_id(msk, addr); 317 - if (remove_subflow) { 318 - spin_lock_bh(&msk->pm.lock); 319 - mptcp_pm_nl_rm_subflow_received(msk, &list); 320 - spin_unlock_bh(&msk->pm.lock); 321 - } 322 - 323 - if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) { 324 - spin_lock_bh(&msk->pm.lock); 325 - __mark_subflow_endp_available(msk, list.ids[0]); 326 - spin_unlock_bh(&msk->pm.lock); 327 - } 328 - 329 - if (msk->mpc_endpoint_id == entry->addr.id) 330 - msk->mpc_endpoint_id = 0; 331 - release_sock(sk); 332 - 333 - next: 334 - sock_put(sk); 335 - cond_resched(); 336 - } 337 - 338 - return 0; 339 - } 340 - 341 - static int mptcp_nl_remove_id_zero_address(struct net *net, 342 - struct mptcp_addr_info *addr) 343 - { 344 - struct mptcp_rm_list list = { .nr = 0 }; 345 - long s_slot = 0, s_num = 0; 346 - struct mptcp_sock *msk; 347 - 348 - list.ids[list.nr++] = 0; 349 - 350 - while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 351 - struct sock *sk = (struct sock *)msk; 352 - struct mptcp_addr_info msk_local; 353 - 354 - if (list_empty(&msk->conn_list) || mptcp_pm_is_userspace(msk)) 355 - goto next; 356 - 357 - mptcp_local_address((struct sock_common *)msk, &msk_local); 358 - if (!mptcp_addresses_equal(&msk_local, addr, addr->port)) 359 - goto next; 360 - 361 - lock_sock(sk); 362 - spin_lock_bh(&msk->pm.lock); 363 - mptcp_pm_remove_addr(msk, &list); 364 - mptcp_pm_nl_rm_subflow_received(msk, &list); 365 - __mark_subflow_endp_available(msk, 0); 366 - spin_unlock_bh(&msk->pm.lock); 367 - release_sock(sk); 368 - 369 - next: 370 - sock_put(sk); 371 - cond_resched(); 372 - } 373 - 374 - return 0; 375 - } 376 - 377 - int mptcp_pm_nl_del_addr_doit(struct sk_buff *skb, struct genl_info *info) 378 - { 379 - struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 380 - struct mptcp_pm_addr_entry addr, *entry; 381 - unsigned int addr_max; 382 - struct nlattr *attr; 383 - int ret; 384 - 385 - if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ENDPOINT_ADDR)) 386 - return -EINVAL; 387 - 388 - attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR]; 389 - ret = mptcp_pm_parse_entry(attr, info, false, &addr); 390 - if (ret < 0) 391 - return ret; 392 - 393 - /* the zero id address is special: the first address used by the msk 394 - * always gets such an id, so different subflows can have different zero 395 - * id addresses. Additionally zero id is not accounted for in id_bitmap. 396 - * Let's use an 'mptcp_rm_list' instead of the common remove code. 397 - */ 398 - if (addr.addr.id == 0) 399 - return mptcp_nl_remove_id_zero_address(sock_net(skb->sk), &addr.addr); 400 - 401 - spin_lock_bh(&pernet->lock); 402 - entry = __lookup_addr_by_id(pernet, addr.addr.id); 403 - if (!entry) { 404 - NL_SET_ERR_MSG_ATTR(info->extack, attr, "address not found"); 405 - spin_unlock_bh(&pernet->lock); 406 - return -EINVAL; 407 - } 408 - if (entry->flags & MPTCP_PM_ADDR_FLAG_SIGNAL) { 409 - addr_max = pernet->add_addr_signal_max; 410 - WRITE_ONCE(pernet->add_addr_signal_max, addr_max - 1); 411 - } 412 - if (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW) { 413 - addr_max = pernet->local_addr_max; 414 - WRITE_ONCE(pernet->local_addr_max, addr_max - 1); 415 - } 416 - 417 - pernet->addrs--; 418 - list_del_rcu(&entry->list); 419 - __clear_bit(entry->addr.id, pernet->id_bitmap); 420 - spin_unlock_bh(&pernet->lock); 421 - 422 - mptcp_nl_remove_subflow_and_signal_addr(sock_net(skb->sk), entry); 423 - synchronize_rcu(); 424 - __mptcp_pm_release_addr_entry(entry); 425 - 426 - return ret; 427 - } 428 - 429 - static void mptcp_pm_flush_addrs_and_subflows(struct mptcp_sock *msk, 430 - struct list_head *rm_list) 431 - { 432 - struct mptcp_rm_list alist = { .nr = 0 }, slist = { .nr = 0 }; 433 - struct mptcp_pm_addr_entry *entry; 434 - 435 - list_for_each_entry(entry, rm_list, list) { 436 - if (slist.nr < MPTCP_RM_IDS_MAX && 437 - mptcp_lookup_subflow_by_saddr(&msk->conn_list, &entry->addr)) 438 - slist.ids[slist.nr++] = mptcp_endp_get_local_id(msk, &entry->addr); 439 - 440 - if (alist.nr < MPTCP_RM_IDS_MAX && 441 - mptcp_remove_anno_list_by_saddr(msk, &entry->addr)) 442 - alist.ids[alist.nr++] = mptcp_endp_get_local_id(msk, &entry->addr); 443 - } 444 - 445 - spin_lock_bh(&msk->pm.lock); 446 - if (alist.nr) { 447 - msk->pm.add_addr_signaled -= alist.nr; 448 - mptcp_pm_remove_addr(msk, &alist); 449 - } 450 - if (slist.nr) 451 - mptcp_pm_nl_rm_subflow_received(msk, &slist); 452 - /* Reset counters: maybe some subflows have been removed before */ 453 - bitmap_fill(msk->pm.id_avail_bitmap, MPTCP_PM_MAX_ADDR_ID + 1); 454 - msk->pm.local_addr_used = 0; 455 - spin_unlock_bh(&msk->pm.lock); 456 - } 457 - 458 - static void mptcp_nl_flush_addrs_list(struct net *net, 459 - struct list_head *rm_list) 460 - { 461 - long s_slot = 0, s_num = 0; 462 - struct mptcp_sock *msk; 463 - 464 - if (list_empty(rm_list)) 465 - return; 466 - 467 - while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 468 - struct sock *sk = (struct sock *)msk; 469 - 470 - if (!mptcp_pm_is_userspace(msk)) { 471 - lock_sock(sk); 472 - mptcp_pm_flush_addrs_and_subflows(msk, rm_list); 473 - release_sock(sk); 474 - } 475 - 476 - sock_put(sk); 477 - cond_resched(); 478 - } 479 - } 480 - 481 - /* caller must ensure the RCU grace period is already elapsed */ 482 - static void __flush_addrs(struct list_head *list) 483 - { 484 - while (!list_empty(list)) { 485 - struct mptcp_pm_addr_entry *cur; 486 - 487 - cur = list_entry(list->next, 488 - struct mptcp_pm_addr_entry, list); 489 - list_del_rcu(&cur->list); 490 - __mptcp_pm_release_addr_entry(cur); 491 - } 492 - } 493 - 494 - static void __reset_counters(struct pm_nl_pernet *pernet) 495 - { 496 - WRITE_ONCE(pernet->add_addr_signal_max, 0); 497 - WRITE_ONCE(pernet->add_addr_accept_max, 0); 498 - WRITE_ONCE(pernet->local_addr_max, 0); 499 - pernet->addrs = 0; 500 - } 501 - 502 - int mptcp_pm_nl_flush_addrs_doit(struct sk_buff *skb, struct genl_info *info) 503 - { 504 - struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 505 - LIST_HEAD(free_list); 506 - 507 - spin_lock_bh(&pernet->lock); 508 - list_splice_init(&pernet->local_addr_list, &free_list); 509 - __reset_counters(pernet); 510 - pernet->next_id = 1; 511 - bitmap_zero(pernet->id_bitmap, MPTCP_PM_MAX_ADDR_ID + 1); 512 - spin_unlock_bh(&pernet->lock); 513 - mptcp_nl_flush_addrs_list(sock_net(skb->sk), &free_list); 514 - synchronize_rcu(); 515 - __flush_addrs(&free_list); 516 - return 0; 517 - } 518 - 519 - int mptcp_nl_fill_addr(struct sk_buff *skb, 520 - struct mptcp_pm_addr_entry *entry) 1355 + static int mptcp_nl_fill_addr(struct sk_buff *skb, 1356 + struct mptcp_pm_addr_entry *entry) 521 1357 { 522 1358 struct mptcp_addr_info *addr = &entry->addr; 523 1359 struct nlattr *attr; ··· 166 1780 return -EMSGSIZE; 167 1781 } 168 1782 169 - int mptcp_pm_nl_get_addr(u8 id, struct mptcp_pm_addr_entry *addr, 170 - struct genl_info *info) 1783 + static int mptcp_pm_get_addr(u8 id, struct mptcp_pm_addr_entry *addr, 1784 + struct genl_info *info) 171 1785 { 172 - struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 173 - struct mptcp_pm_addr_entry *entry; 174 - int ret = -EINVAL; 175 - 176 - rcu_read_lock(); 177 - entry = __lookup_addr_by_id(pernet, id); 178 - if (entry) { 179 - *addr = *entry; 180 - ret = 0; 181 - } 182 - rcu_read_unlock(); 183 - 184 - return ret; 1786 + if (info->attrs[MPTCP_PM_ATTR_TOKEN]) 1787 + return mptcp_userspace_pm_get_addr(id, addr, info); 1788 + return mptcp_pm_nl_get_addr(id, addr, info); 185 1789 } 186 1790 187 - int mptcp_pm_nl_dump_addr(struct sk_buff *msg, 188 - struct netlink_callback *cb) 1791 + int mptcp_pm_nl_get_addr_doit(struct sk_buff *skb, struct genl_info *info) 189 1792 { 190 - struct net *net = sock_net(msg->sk); 191 - struct mptcp_pm_addr_entry *entry; 192 - struct pm_nl_pernet *pernet; 193 - int id = cb->args[0]; 194 - int i; 195 - 196 - pernet = pm_nl_get_pernet(net); 197 - 198 - rcu_read_lock(); 199 - for (i = id; i < MPTCP_PM_MAX_ADDR_ID + 1; i++) { 200 - if (test_bit(i, pernet->id_bitmap)) { 201 - entry = __lookup_addr_by_id(pernet, i); 202 - if (!entry) 203 - break; 204 - 205 - if (entry->addr.id <= id) 206 - continue; 207 - 208 - if (mptcp_pm_genl_fill_addr(msg, cb, entry) < 0) 209 - break; 210 - 211 - id = entry->addr.id; 212 - } 213 - } 214 - rcu_read_unlock(); 215 - 216 - cb->args[0] = id; 217 - return msg->len; 218 - } 219 - 220 - static int parse_limit(struct genl_info *info, int id, unsigned int *limit) 221 - { 222 - struct nlattr *attr = info->attrs[id]; 223 - 224 - if (!attr) 225 - return 0; 226 - 227 - *limit = nla_get_u32(attr); 228 - if (*limit > MPTCP_PM_ADDR_MAX) { 229 - NL_SET_ERR_MSG_ATTR_FMT(info->extack, attr, 230 - "limit greater than maximum (%u)", 231 - MPTCP_PM_ADDR_MAX); 232 - return -EINVAL; 233 - } 234 - return 0; 235 - } 236 - 237 - int mptcp_pm_nl_set_limits_doit(struct sk_buff *skb, struct genl_info *info) 238 - { 239 - struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 240 - unsigned int rcv_addrs, subflows; 241 - int ret; 242 - 243 - spin_lock_bh(&pernet->lock); 244 - rcv_addrs = pernet->add_addr_accept_max; 245 - ret = parse_limit(info, MPTCP_PM_ATTR_RCV_ADD_ADDRS, &rcv_addrs); 246 - if (ret) 247 - goto unlock; 248 - 249 - subflows = pernet->subflows_max; 250 - ret = parse_limit(info, MPTCP_PM_ATTR_SUBFLOWS, &subflows); 251 - if (ret) 252 - goto unlock; 253 - 254 - WRITE_ONCE(pernet->add_addr_accept_max, rcv_addrs); 255 - WRITE_ONCE(pernet->subflows_max, subflows); 256 - 257 - unlock: 258 - spin_unlock_bh(&pernet->lock); 259 - return ret; 260 - } 261 - 262 - int mptcp_pm_nl_get_limits_doit(struct sk_buff *skb, struct genl_info *info) 263 - { 264 - struct pm_nl_pernet *pernet = genl_info_pm_nl(info); 1793 + struct mptcp_pm_addr_entry addr; 1794 + struct nlattr *attr; 265 1795 struct sk_buff *msg; 266 1796 void *reply; 1797 + int ret; 1798 + 1799 + if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ENDPOINT_ADDR)) 1800 + return -EINVAL; 1801 + 1802 + attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR]; 1803 + ret = mptcp_pm_parse_entry(attr, info, false, &addr); 1804 + if (ret < 0) 1805 + return ret; 267 1806 268 1807 msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); 269 1808 if (!msg) 270 1809 return -ENOMEM; 271 1810 272 1811 reply = genlmsg_put_reply(msg, info, &mptcp_genl_family, 0, 273 - MPTCP_PM_CMD_GET_LIMITS); 274 - if (!reply) 1812 + info->genlhdr->cmd); 1813 + if (!reply) { 1814 + GENL_SET_ERR_MSG(info, "not enough space in Netlink message"); 1815 + ret = -EMSGSIZE; 275 1816 goto fail; 1817 + } 276 1818 277 - if (nla_put_u32(msg, MPTCP_PM_ATTR_RCV_ADD_ADDRS, 278 - READ_ONCE(pernet->add_addr_accept_max))) 1819 + ret = mptcp_pm_get_addr(addr.addr.id, &addr, info); 1820 + if (ret) { 1821 + NL_SET_ERR_MSG_ATTR(info->extack, attr, "address not found"); 279 1822 goto fail; 1823 + } 280 1824 281 - if (nla_put_u32(msg, MPTCP_PM_ATTR_SUBFLOWS, 282 - READ_ONCE(pernet->subflows_max))) 1825 + ret = mptcp_nl_fill_addr(msg, &addr); 1826 + if (ret) 283 1827 goto fail; 284 1828 285 1829 genlmsg_end(msg, reply); 286 - return genlmsg_reply(msg, info); 1830 + ret = genlmsg_reply(msg, info); 1831 + return ret; 287 1832 288 1833 fail: 289 - GENL_SET_ERR_MSG(info, "not enough space in Netlink message"); 290 1834 nlmsg_free(msg); 291 - return -EMSGSIZE; 1835 + return ret; 292 1836 } 293 1837 294 - static void mptcp_pm_nl_fullmesh(struct mptcp_sock *msk, 295 - struct mptcp_addr_info *addr) 1838 + int mptcp_pm_genl_fill_addr(struct sk_buff *msg, 1839 + struct netlink_callback *cb, 1840 + struct mptcp_pm_addr_entry *entry) 296 1841 { 297 - struct mptcp_rm_list list = { .nr = 0 }; 1842 + void *hdr; 298 1843 299 - list.ids[list.nr++] = mptcp_endp_get_local_id(msk, addr); 300 - 301 - spin_lock_bh(&msk->pm.lock); 302 - mptcp_pm_nl_rm_subflow_received(msk, &list); 303 - __mark_subflow_endp_available(msk, list.ids[0]); 304 - mptcp_pm_create_subflow_or_signal_addr(msk); 305 - spin_unlock_bh(&msk->pm.lock); 306 - } 307 - 308 - static void mptcp_nl_set_flags(struct net *net, 309 - struct mptcp_pm_addr_entry *local, 310 - u8 changed) 311 - { 312 - u8 is_subflow = !!(local->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW); 313 - u8 bkup = !!(local->flags & MPTCP_PM_ADDR_FLAG_BACKUP); 314 - long s_slot = 0, s_num = 0; 315 - struct mptcp_sock *msk; 316 - 317 - if (changed == MPTCP_PM_ADDR_FLAG_FULLMESH && !is_subflow) 318 - return; 319 - 320 - while ((msk = mptcp_token_iter_next(net, &s_slot, &s_num)) != NULL) { 321 - struct sock *sk = (struct sock *)msk; 322 - 323 - if (list_empty(&msk->conn_list) || mptcp_pm_is_userspace(msk)) 324 - goto next; 325 - 326 - lock_sock(sk); 327 - if (changed & MPTCP_PM_ADDR_FLAG_BACKUP) 328 - mptcp_pm_nl_mp_prio_send_ack(msk, &local->addr, NULL, bkup); 329 - /* Subflows will only be recreated if the SUBFLOW flag is set */ 330 - if (is_subflow && (changed & MPTCP_PM_ADDR_FLAG_FULLMESH)) 331 - mptcp_pm_nl_fullmesh(msk, &local->addr); 332 - release_sock(sk); 333 - 334 - next: 335 - sock_put(sk); 336 - cond_resched(); 337 - } 338 - 339 - return; 340 - } 341 - 342 - int mptcp_pm_nl_set_flags(struct mptcp_pm_addr_entry *local, 343 - struct genl_info *info) 344 - { 345 - struct nlattr *attr = info->attrs[MPTCP_PM_ATTR_ADDR]; 346 - u8 changed, mask = MPTCP_PM_ADDR_FLAG_BACKUP | 347 - MPTCP_PM_ADDR_FLAG_FULLMESH; 348 - struct net *net = genl_info_net(info); 349 - struct mptcp_pm_addr_entry *entry; 350 - struct pm_nl_pernet *pernet; 351 - u8 lookup_by_id = 0; 352 - 353 - pernet = pm_nl_get_pernet(net); 354 - 355 - if (local->addr.family == AF_UNSPEC) { 356 - lookup_by_id = 1; 357 - if (!local->addr.id) { 358 - NL_SET_ERR_MSG_ATTR(info->extack, attr, 359 - "missing address ID"); 360 - return -EOPNOTSUPP; 361 - } 362 - } 363 - 364 - spin_lock_bh(&pernet->lock); 365 - entry = lookup_by_id ? __lookup_addr_by_id(pernet, local->addr.id) : 366 - __lookup_addr(pernet, &local->addr); 367 - if (!entry) { 368 - spin_unlock_bh(&pernet->lock); 369 - NL_SET_ERR_MSG_ATTR(info->extack, attr, "address not found"); 1844 + hdr = genlmsg_put(msg, NETLINK_CB(cb->skb).portid, 1845 + cb->nlh->nlmsg_seq, &mptcp_genl_family, 1846 + NLM_F_MULTI, MPTCP_PM_CMD_GET_ADDR); 1847 + if (!hdr) 370 1848 return -EINVAL; 371 - } 372 - if ((local->flags & MPTCP_PM_ADDR_FLAG_FULLMESH) && 373 - (entry->flags & (MPTCP_PM_ADDR_FLAG_SIGNAL | 374 - MPTCP_PM_ADDR_FLAG_IMPLICIT))) { 375 - spin_unlock_bh(&pernet->lock); 376 - NL_SET_ERR_MSG_ATTR(info->extack, attr, "invalid addr flags"); 1849 + 1850 + if (mptcp_nl_fill_addr(msg, entry) < 0) { 1851 + genlmsg_cancel(msg, hdr); 377 1852 return -EINVAL; 378 1853 } 379 1854 380 - changed = (local->flags ^ entry->flags) & mask; 381 - entry->flags = (entry->flags & ~mask) | (local->flags & mask); 382 - *local = *entry; 383 - spin_unlock_bh(&pernet->lock); 384 - 385 - mptcp_nl_set_flags(net, local, changed); 1855 + genlmsg_end(msg, hdr); 386 1856 return 0; 1857 + } 1858 + 1859 + static int mptcp_pm_dump_addr(struct sk_buff *msg, struct netlink_callback *cb) 1860 + { 1861 + const struct genl_info *info = genl_info_dump(cb); 1862 + 1863 + if (info->attrs[MPTCP_PM_ATTR_TOKEN]) 1864 + return mptcp_userspace_pm_dump_addr(msg, cb); 1865 + return mptcp_pm_nl_dump_addr(msg, cb); 1866 + } 1867 + 1868 + int mptcp_pm_nl_get_addr_dumpit(struct sk_buff *msg, 1869 + struct netlink_callback *cb) 1870 + { 1871 + return mptcp_pm_dump_addr(msg, cb); 1872 + } 1873 + 1874 + static int mptcp_pm_set_flags(struct genl_info *info) 1875 + { 1876 + struct mptcp_pm_addr_entry loc = { .addr = { .family = AF_UNSPEC }, }; 1877 + struct nlattr *attr_loc; 1878 + int ret = -EINVAL; 1879 + 1880 + if (GENL_REQ_ATTR_CHECK(info, MPTCP_PM_ATTR_ADDR)) 1881 + return ret; 1882 + 1883 + attr_loc = info->attrs[MPTCP_PM_ATTR_ADDR]; 1884 + ret = mptcp_pm_parse_entry(attr_loc, info, false, &loc); 1885 + if (ret < 0) 1886 + return ret; 1887 + 1888 + if (info->attrs[MPTCP_PM_ATTR_TOKEN]) 1889 + return mptcp_userspace_pm_set_flags(&loc, info); 1890 + return mptcp_pm_nl_set_flags(&loc, info); 1891 + } 1892 + 1893 + int mptcp_pm_nl_set_flags_doit(struct sk_buff *skb, struct genl_info *info) 1894 + { 1895 + return mptcp_pm_set_flags(info); 387 1896 } 388 1897 389 1898 static void mptcp_nl_mcast_send(struct net *net, struct sk_buff *nlskb, gfp_t gfp) ··· 625 2344 .mcgrps = mptcp_pm_mcgrps, 626 2345 .n_mcgrps = ARRAY_SIZE(mptcp_pm_mcgrps), 627 2346 }; 628 - 629 - static int __net_init pm_nl_init_net(struct net *net) 630 - { 631 - struct pm_nl_pernet *pernet = pm_nl_get_pernet(net); 632 - 633 - INIT_LIST_HEAD_RCU(&pernet->local_addr_list); 634 - 635 - /* Cit. 2 subflows ought to be enough for anybody. */ 636 - pernet->subflows_max = 2; 637 - pernet->next_id = 1; 638 - pernet->stale_loss_cnt = 4; 639 - spin_lock_init(&pernet->lock); 640 - 641 - /* No need to initialize other pernet fields, the struct is zeroed at 642 - * allocation time. 643 - */ 644 - 645 - return 0; 646 - } 647 - 648 - static void __net_exit pm_nl_exit_net(struct list_head *net_list) 649 - { 650 - struct net *net; 651 - 652 - list_for_each_entry(net, net_list, exit_list) { 653 - struct pm_nl_pernet *pernet = pm_nl_get_pernet(net); 654 - 655 - /* net is removed from namespace list, can't race with 656 - * other modifiers, also netns core already waited for a 657 - * RCU grace period. 658 - */ 659 - __flush_addrs(&pernet->local_addr_list); 660 - } 661 - } 662 - 663 - static struct pernet_operations mptcp_pm_pernet_ops = { 664 - .init = pm_nl_init_net, 665 - .exit_batch = pm_nl_exit_net, 666 - .id = &pm_nl_pernet_id, 667 - .size = sizeof(struct pm_nl_pernet), 668 - }; 669 - 670 - void __init mptcp_pm_nl_init(void) 671 - { 672 - if (register_pernet_subsys(&mptcp_pm_pernet_ops) < 0) 673 - panic("Failed to register MPTCP PM pernet subsystem.\n"); 674 - 675 - if (genl_register_family(&mptcp_genl_family)) 676 - panic("Failed to register MPTCP PM netlink family\n"); 677 - }
+10 -18
net/mptcp/pm_userspace.c
··· 12 12 list_for_each_entry(__entry, \ 13 13 &((__msk)->pm.userspace_pm_local_addr_list), list) 14 14 15 - void mptcp_free_local_addr_list(struct mptcp_sock *msk) 15 + void mptcp_userspace_pm_free_local_addr_list(struct mptcp_sock *msk) 16 16 { 17 17 struct mptcp_pm_addr_entry *entry, *tmp; 18 18 struct sock *sk = (struct sock *)msk; 19 19 LIST_HEAD(free_list); 20 - 21 - if (!mptcp_pm_is_userspace(msk)) 22 - return; 23 20 24 21 spin_lock_bh(&msk->pm.lock); 25 22 list_splice_init(&msk->pm.userspace_pm_local_addr_list, &free_list); ··· 127 130 } 128 131 129 132 int mptcp_userspace_pm_get_local_id(struct mptcp_sock *msk, 130 - struct mptcp_addr_info *skc) 133 + struct mptcp_pm_addr_entry *skc) 131 134 { 132 - struct mptcp_pm_addr_entry *entry = NULL, new_entry; 133 135 __be16 msk_sport = ((struct inet_sock *) 134 136 inet_sk((struct sock *)msk))->inet_sport; 137 + struct mptcp_pm_addr_entry *entry; 135 138 136 139 spin_lock_bh(&msk->pm.lock); 137 - entry = mptcp_userspace_pm_lookup_addr(msk, skc); 140 + entry = mptcp_userspace_pm_lookup_addr(msk, &skc->addr); 138 141 spin_unlock_bh(&msk->pm.lock); 139 142 if (entry) 140 143 return entry->addr.id; 141 144 142 - memset(&new_entry, 0, sizeof(struct mptcp_pm_addr_entry)); 143 - new_entry.addr = *skc; 144 - new_entry.addr.id = 0; 145 - new_entry.flags = MPTCP_PM_ADDR_FLAG_IMPLICIT; 145 + if (skc->addr.port == msk_sport) 146 + skc->addr.port = 0; 146 147 147 - if (new_entry.addr.port == msk_sport) 148 - new_entry.addr.port = 0; 149 - 150 - return mptcp_userspace_pm_append_new_local_addr(msk, &new_entry, true); 148 + return mptcp_userspace_pm_append_new_local_addr(msk, skc, true); 151 149 } 152 150 153 151 bool mptcp_userspace_pm_is_backup(struct mptcp_sock *msk, ··· 231 239 if (mptcp_pm_alloc_anno_list(msk, &addr_val.addr)) { 232 240 msk->pm.add_addr_signaled++; 233 241 mptcp_pm_announce_addr(msk, &addr_val.addr, false); 234 - mptcp_pm_nl_addr_send_ack(msk); 242 + mptcp_pm_addr_send_ack(msk); 235 243 } 236 244 237 245 spin_unlock_bh(&msk->pm.lock); ··· 602 610 spin_unlock_bh(&msk->pm.lock); 603 611 604 612 lock_sock(sk); 605 - ret = mptcp_pm_nl_mp_prio_send_ack(msk, &local->addr, &rem, bkup); 613 + ret = mptcp_pm_mp_prio_send_ack(msk, &local->addr, &rem, bkup); 606 614 release_sock(sk); 607 615 608 - /* mptcp_pm_nl_mp_prio_send_ack() only fails in one case */ 616 + /* mptcp_pm_mp_prio_send_ack() only fails in one case */ 609 617 if (ret < 0) 610 618 GENL_SET_ERR_MSG(info, "subflow not found"); 611 619
+2 -3
net/mptcp/protocol.c
··· 2681 2681 2682 2682 mptcp_check_fastclose(msk); 2683 2683 2684 - mptcp_pm_nl_work(msk); 2684 + mptcp_pm_worker(msk); 2685 2685 2686 2686 mptcp_check_send_data_fin(sk); 2687 2687 mptcp_check_data_fin_ack(sk); ··· 3302 3302 * inet_sock_destruct() will dispose it 3303 3303 */ 3304 3304 mptcp_token_destroy(msk); 3305 - mptcp_pm_free_anno_list(msk); 3306 - mptcp_free_local_addr_list(msk); 3305 + mptcp_pm_destroy(msk); 3307 3306 } 3308 3307 3309 3308 static void mptcp_destroy(struct sock *sk)
+23 -19
net/mptcp/protocol.h
··· 724 724 725 725 bool mptcp_addresses_equal(const struct mptcp_addr_info *a, 726 726 const struct mptcp_addr_info *b, bool use_port); 727 - void mptcp_local_address(const struct sock_common *skc, struct mptcp_addr_info *addr); 727 + void mptcp_local_address(const struct sock_common *skc, 728 + struct mptcp_addr_info *addr); 729 + void mptcp_remote_address(const struct sock_common *skc, 730 + struct mptcp_addr_info *addr); 728 731 729 732 /* called with sk socket lock held */ 730 733 int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_pm_local *local, ··· 986 983 void __init mptcp_pm_init(void); 987 984 void mptcp_pm_data_init(struct mptcp_sock *msk); 988 985 void mptcp_pm_data_reset(struct mptcp_sock *msk); 986 + void mptcp_pm_destroy(struct mptcp_sock *msk); 989 987 int mptcp_pm_parse_addr(struct nlattr *attr, struct genl_info *info, 990 988 struct mptcp_addr_info *addr); 991 989 int mptcp_pm_parse_entry(struct nlattr *attr, struct genl_info *info, ··· 996 992 const struct mptcp_addr_info *loc, 997 993 const struct mptcp_addr_info *rem); 998 994 void mptcp_pm_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk); 999 - void mptcp_pm_nl_subflow_chk_stale(const struct mptcp_sock *msk, struct sock *ssk); 1000 995 void mptcp_pm_new_connection(struct mptcp_sock *msk, const struct sock *ssk, int server_side); 1001 996 void mptcp_pm_fully_established(struct mptcp_sock *msk, const struct sock *ssk); 1002 997 bool mptcp_pm_allow_new_subflow(struct mptcp_sock *msk); ··· 1009 1006 void mptcp_pm_add_addr_echoed(struct mptcp_sock *msk, 1010 1007 const struct mptcp_addr_info *addr); 1011 1008 void mptcp_pm_add_addr_send_ack(struct mptcp_sock *msk); 1012 - bool mptcp_pm_nl_is_init_remote_addr(struct mptcp_sock *msk, 1013 - const struct mptcp_addr_info *remote); 1014 - void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk); 1009 + void mptcp_pm_send_ack(struct mptcp_sock *msk, 1010 + struct mptcp_subflow_context *subflow, 1011 + bool prio, bool backup); 1012 + void mptcp_pm_addr_send_ack(struct mptcp_sock *msk); 1013 + void mptcp_pm_nl_rm_addr(struct mptcp_sock *msk, u8 rm_id); 1014 + void mptcp_pm_rm_subflow(struct mptcp_sock *msk, 1015 + const struct mptcp_rm_list *rm_list); 1015 1016 void mptcp_pm_rm_addr_received(struct mptcp_sock *msk, 1016 1017 const struct mptcp_rm_list *rm_list); 1017 1018 void mptcp_pm_mp_prio_received(struct sock *sk, u8 bkup); 1018 1019 void mptcp_pm_mp_fail_received(struct sock *sk, u64 fail_seq); 1019 - int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk, 1020 - struct mptcp_addr_info *addr, 1021 - struct mptcp_addr_info *rem, 1022 - u8 bkup); 1020 + int mptcp_pm_mp_prio_send_ack(struct mptcp_sock *msk, 1021 + struct mptcp_addr_info *addr, 1022 + struct mptcp_addr_info *rem, 1023 + u8 bkup); 1023 1024 bool mptcp_pm_alloc_anno_list(struct mptcp_sock *msk, 1024 1025 const struct mptcp_addr_info *addr); 1025 - void mptcp_pm_free_anno_list(struct mptcp_sock *msk); 1026 1026 bool mptcp_pm_sport_in_anno_list(struct mptcp_sock *msk, const struct sock *sk); 1027 1027 struct mptcp_pm_add_entry * 1028 1028 mptcp_pm_del_add_timer(struct mptcp_sock *msk, 1029 1029 const struct mptcp_addr_info *addr, bool check_id); 1030 - struct mptcp_pm_add_entry * 1031 - mptcp_lookup_anno_list_by_saddr(const struct mptcp_sock *msk, 1032 - const struct mptcp_addr_info *addr); 1033 1030 bool mptcp_lookup_subflow_by_saddr(const struct list_head *list, 1034 1031 const struct mptcp_addr_info *saddr); 1035 1032 bool mptcp_remove_anno_list_by_saddr(struct mptcp_sock *msk, ··· 1045 1042 void mptcp_pm_remove_addr_entry(struct mptcp_sock *msk, 1046 1043 struct mptcp_pm_addr_entry *entry); 1047 1044 1048 - void mptcp_free_local_addr_list(struct mptcp_sock *msk); 1045 + void mptcp_userspace_pm_free_local_addr_list(struct mptcp_sock *msk); 1049 1046 1050 1047 void mptcp_event(enum mptcp_event_type type, const struct mptcp_sock *msk, 1051 1048 const struct sock *ssk, gfp_t gfp); ··· 1057 1054 1058 1055 void mptcp_fastopen_subflow_synack_set_params(struct mptcp_subflow_context *subflow, 1059 1056 struct request_sock *req); 1060 - int mptcp_nl_fill_addr(struct sk_buff *skb, 1061 - struct mptcp_pm_addr_entry *entry); 1062 1057 int mptcp_pm_genl_fill_addr(struct sk_buff *msg, 1063 1058 struct netlink_callback *cb, 1064 1059 struct mptcp_pm_addr_entry *entry); ··· 1122 1121 bool mptcp_pm_rm_addr_signal(struct mptcp_sock *msk, unsigned int remaining, 1123 1122 struct mptcp_rm_list *rm_list); 1124 1123 int mptcp_pm_get_local_id(struct mptcp_sock *msk, struct sock_common *skc); 1125 - int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc); 1126 - int mptcp_userspace_pm_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc); 1124 + int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, 1125 + struct mptcp_pm_addr_entry *skc); 1126 + int mptcp_userspace_pm_get_local_id(struct mptcp_sock *msk, 1127 + struct mptcp_pm_addr_entry *skc); 1127 1128 bool mptcp_pm_is_backup(struct mptcp_sock *msk, struct sock_common *skc); 1128 1129 bool mptcp_pm_nl_is_backup(struct mptcp_sock *msk, struct mptcp_addr_info *skc); 1129 1130 bool mptcp_userspace_pm_is_backup(struct mptcp_sock *msk, struct mptcp_addr_info *skc); ··· 1148 1145 } 1149 1146 1150 1147 void __init mptcp_pm_nl_init(void); 1151 - void mptcp_pm_nl_work(struct mptcp_sock *msk); 1148 + void mptcp_pm_worker(struct mptcp_sock *msk); 1149 + void __mptcp_pm_kernel_worker(struct mptcp_sock *msk); 1152 1150 unsigned int mptcp_pm_get_add_addr_signal_max(const struct mptcp_sock *msk); 1153 1151 unsigned int mptcp_pm_get_add_addr_accept_max(const struct mptcp_sock *msk); 1154 1152 unsigned int mptcp_pm_get_subflows_max(const struct mptcp_sock *msk);