Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'nf-next-26-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next

Florian Westphal says:

====================
netfilter: updates for net-next

1-3) IPVS updates from Julian Anastasov to enhance visibility into
IPVS internal state by exposing hash size, load factor, etc., and
to allow userspace to tune the load factor used for resizing hash
tables.

4) Reject empty or non-NUL-terminated device names in xt_physdev.
This isn't a bug fix; existing code doesn't require a C string.
But clean this up anyway because conceptually the interface name
definitely should be a C string.

5) Switch nfnetlink to skb_mac_header helpers that didn't exist back
when this code was written. This gives us additional debug checks
but is not intended to change functionality.

6) Let the xt ttl/hoplimit match reject unknown operator modes.
This is a cleanup; the evaluation function simply returns false when
the mode is out of range. From Marino Dzalto.

7) xt_socket match should enable defrag after all other checks. This
bug is harmless: historically, defrag could not be disabled except
by rmmod.

8) remove UDP-Lite conntrack support, from Fernando Fernandez Mancera.

9) Avoid a couple -Wflex-array-member-not-at-end warnings in the old
xtables 32bit compat code, from Gustavo A. R. Silva.

10) nftables fwd expression should drop packets when their ttl/hl has
expired. This is a deferred bug fix; it was not deemed important
enough for -rc8.

11) Add additional checks before assuming the mac header is an ethernet
header, from Zhengchuan Liang.

* tag 'nf-next-26-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: require Ethernet MAC header before using eth_hdr()
netfilter: nft_fwd_netdev: check ttl/hl before forwarding
netfilter: x_tables: Avoid a couple -Wflex-array-member-not-at-end warnings
netfilter: conntrack: remove UDP-Lite conntrack support
netfilter: xt_socket: enable defrag after all other checks
netfilter: xt_HL: add pr_fmt and checkentry validation
netfilter: nfnetlink: prefer skb_mac_header helpers
netfilter: x_physdev: reject empty or not-nul terminated device names
ipvs: add conn_lfactor and svc_lfactor sysctl vars
ipvs: add ip_vs_status info
ipvs: show the current conn_tab size to users
====================

Link: https://patch.msgid.link/20260410112352.23599-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

+399 -231
+37
Documentation/networking/ipvs-sysctl.rst
···
 	If set, disable the director function while the server is
 	in backup mode to avoid packet loops for DR/TUN methods.
 
+conn_lfactor - INTEGER
+	Possible values: -8 (larger table) .. 8 (smaller table)
+
+	Default: -4
+
+	Controls the sizing of the connection hash table based on the
+	load factor (number of connections per table buckets):
+
+		2^conn_lfactor = nodes / buckets
+
+	As result, the table grows if load increases and shrinks when
+	load decreases in the range of 2^8 - 2^conn_tab_bits (module
+	parameter).
+	The value is a shift count where negative values select
+	buckets = (connection hash nodes << -value) while positive
+	values select buckets = (connection hash nodes >> value). The
+	negative values reduce the collisions and reduce the time for
+	lookups but increase the table size. Positive values will
+	tolerate load above 100% when using smaller table is
+	preferred with the cost of more collisions. If using NAT
+	connections consider decreasing the value with one because
+	they add two nodes in the hash table.
+
+	Example:
+	-4: grow if load goes above 6% (buckets = nodes * 16)
+	2: grow if load goes above 400% (buckets = nodes / 4)
+
 conn_reuse_mode - INTEGER
 	1 - default
 
···
 
 	The value definition is the same as that of drop_entry and
 	drop_packet.
+
+svc_lfactor - INTEGER
+	Possible values: -8 (larger table) .. 8 (smaller table)
+
+	Default: -3
+
+	Controls the sizing of the service hash table based on the
+	load factor (number of services per table buckets). The table
+	will grow and shrink in the range of 2^4 - 2^20.
+	See conn_lfactor for explanation.
 
 sync_threshold - vector of 2 INTEGERs: sync_threshold, sync_period
 	default 3 50
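The shift-count rule documented for conn_lfactor can be illustrated with a small user-space sketch. This is not the kernel's resize code: lfactor_buckets() and its min_bits/max_bits parameters are hypothetical names, and the sketch only applies the documented formula (buckets = nodes << -value or nodes >> value) and clamps the result to the documented range, without any rounding the real implementation may perform.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): compute the bucket count
 * the documentation describes for a given number of hash nodes.
 *
 * lfactor  : shift count in the documented range -8 .. 8
 * min_bits : log2 of the smallest allowed table (8 for the conn table)
 * max_bits : log2 of the largest allowed table (assumed <= 32 here)
 */
static uint64_t lfactor_buckets(uint64_t nodes, int lfactor,
				int min_bits, int max_bits)
{
	uint64_t lo = 1ULL << min_bits;
	uint64_t hi = 1ULL << max_bits;
	uint64_t buckets;

	if (lfactor < 0)
		buckets = nodes << -lfactor;	/* more buckets than nodes */
	else
		buckets = nodes >> lfactor;	/* fewer buckets than nodes */

	/* Clamp to the documented table size range. */
	if (buckets < lo)
		buckets = lo;
	if (buckets > hi)
		buckets = hi;
	return buckets;
}
```

With the default of -4, 1000 nodes map to 16000 buckets (roughly 6% load), which matches the first example in the documentation text.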
-3
include/net/netfilter/ipv4/nf_conntrack_ipv4.h
···
 #ifdef CONFIG_NF_CT_PROTO_SCTP
 extern const struct nf_conntrack_l4proto nf_conntrack_l4proto_sctp;
 #endif
-#ifdef CONFIG_NF_CT_PROTO_UDPLITE
-extern const struct nf_conntrack_l4proto nf_conntrack_l4proto_udplite;
-#endif
 #ifdef CONFIG_NF_CT_PROTO_GRE
 extern const struct nf_conntrack_l4proto nf_conntrack_l4proto_gre;
 #endif
-7
include/net/netfilter/nf_conntrack_l4proto.h
···
 				unsigned int dataoff,
 				enum ip_conntrack_info ctinfo,
 				const struct nf_hook_state *state);
-int nf_conntrack_udplite_packet(struct nf_conn *ct,
-				struct sk_buff *skb,
-				unsigned int dataoff,
-				enum ip_conntrack_info ctinfo,
-				const struct nf_hook_state *state);
 int nf_conntrack_tcp_packet(struct nf_conn *ct,
 			    struct sk_buff *skb,
 			    unsigned int dataoff,
···
 
 /* Existing built-in generic protocol */
 extern const struct nf_conntrack_l4proto nf_conntrack_l4proto_generic;
-
-#define MAX_NF_CT_PROTO IPPROTO_UDPLITE
 
 const struct nf_conntrack_l4proto *nf_ct_l4proto_find(u8 l4proto);
 
+5 -2
net/ipv6/netfilter/ip6t_eui64.c
···
 #include <linux/module.h>
 #include <linux/skbuff.h>
 #include <linux/ipv6.h>
+#include <linux/if_arp.h>
 #include <linux/if_ether.h>
 
 #include <linux/netfilter/x_tables.h>
···
 {
 	unsigned char eui64[8];
 
-	if (!(skb_mac_header(skb) >= skb->head &&
-	      skb_mac_header(skb) + ETH_HLEN <= skb->data)) {
+	if (!skb->dev || skb->dev->type != ARPHRD_ETHER)
+		return false;
+
+	if (!skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN) {
 		par->hotdrop = true;
 		return false;
 	}
-11
net/netfilter/Kconfig
···
 
 	  If unsure, say Y.
 
-config NF_CT_PROTO_UDPLITE
-	bool 'UDP-Lite protocol connection tracking support'
-	depends on NETFILTER_ADVANCED
-	default y
-	help
-	  With this option enabled, the layer 3 independent connection
-	  tracking code will be able to do state tracking on UDP-Lite
-	  connections.
-
-	  If unsure, say Y.
-
 config NF_CONNTRACK_AMANDA
 	tristate "Amanda backup protocol support"
 	depends on NETFILTER_ADVANCED
+3 -2
net/netfilter/ipset/ip_set_bitmap_ipmac.c
···
 #include <linux/etherdevice.h>
 #include <linux/skbuff.h>
 #include <linux/errno.h>
+#include <linux/if_arp.h>
 #include <linux/if_ether.h>
 #include <linux/netlink.h>
 #include <linux/jiffies.h>
···
 		return -IPSET_ERR_BITMAP_RANGE;
 
 	/* Backward compatibility: we don't check the second flag */
-	if (skb_mac_header(skb) < skb->head ||
-	    (skb_mac_header(skb) + ETH_HLEN) > skb->data)
+	if (!skb->dev || skb->dev->type != ARPHRD_ETHER ||
+	    !skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN)
 		return -EINVAL;
 
 	e.id = ip_to_id(map, ip);
+5 -4
net/netfilter/ipset/ip_set_hash_ipmac.c
···
 #include <linux/skbuff.h>
 #include <linux/errno.h>
 #include <linux/random.h>
+#include <linux/if_arp.h>
 #include <linux/if_ether.h>
 #include <net/ip.h>
 #include <net/ipv6.h>
···
 	struct hash_ipmac4_elem e = { .ip = 0, { .foo[0] = 0, .foo[1] = 0 } };
 	struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
 
-	if (skb_mac_header(skb) < skb->head ||
-	    (skb_mac_header(skb) + ETH_HLEN) > skb->data)
+	if (!skb->dev || skb->dev->type != ARPHRD_ETHER ||
+	    !skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN)
 		return -EINVAL;
 
 	if (opt->flags & IPSET_DIM_TWO_SRC)
···
 	};
 	struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
 
-	if (skb_mac_header(skb) < skb->head ||
-	    (skb_mac_header(skb) + ETH_HLEN) > skb->data)
+	if (!skb->dev || skb->dev->type != ARPHRD_ETHER ||
+	    !skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN)
 		return -EINVAL;
 
 	if (opt->flags & IPSET_DIM_TWO_SRC)
+3 -2
net/netfilter/ipset/ip_set_hash_mac.c
···
 #include <linux/etherdevice.h>
 #include <linux/skbuff.h>
 #include <linux/errno.h>
+#include <linux/if_arp.h>
 #include <linux/if_ether.h>
 #include <net/netlink.h>
 
···
 	struct hash_mac4_elem e = { { .foo[0] = 0, .foo[1] = 0 } };
 	struct ip_set_ext ext = IP_SET_INIT_KEXT(skb, opt, set);
 
-	if (skb_mac_header(skb) < skb->head ||
-	    (skb_mac_header(skb) + ETH_HLEN) > skb->data)
+	if (!skb->dev || skb->dev->type != ARPHRD_ETHER ||
+	    !skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN)
 		return -EINVAL;
 
 	if (opt->flags & IPSET_DIM_ONE_SRC)
+243 -4
net/netfilter/ipvs/ip_vs_ctl.c
···
 	mutex_unlock(&ipvs->est_mutex);
 }
 
+static int get_conn_tab_size(struct netns_ipvs *ipvs)
+{
+	const struct ip_vs_rht *t;
+	int size = 0;
+
+	rcu_read_lock();
+	t = rcu_dereference(ipvs->conn_tab);
+	if (t)
+		size = t->size;
+	rcu_read_unlock();
+
+	return size;
+}
+
 int
 ip_vs_use_count_inc(void)
 {
···
 	return ret;
 }
 
+static int ipvs_proc_conn_lfactor(const struct ctl_table *table, int write,
+				  void *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct netns_ipvs *ipvs = table->extra2;
+	int *valp = table->data;
+	int val = *valp;
+	int ret;
+
+	struct ctl_table tmp_table = {
+		.data = &val,
+		.maxlen = sizeof(int),
+	};
+
+	ret = proc_dointvec(&tmp_table, write, buffer, lenp, ppos);
+	if (write && ret >= 0) {
+		if (val < -8 || val > 8) {
+			ret = -EINVAL;
+		} else {
+			*valp = val;
+			if (rcu_access_pointer(ipvs->conn_tab))
+				mod_delayed_work(system_unbound_wq,
+						 &ipvs->conn_resize_work, 0);
+		}
+	}
+	return ret;
+}
+
+static int ipvs_proc_svc_lfactor(const struct ctl_table *table, int write,
+				 void *buffer, size_t *lenp, loff_t *ppos)
+{
+	struct netns_ipvs *ipvs = table->extra2;
+	int *valp = table->data;
+	int val = *valp;
+	int ret;
+
+	struct ctl_table tmp_table = {
+		.data = &val,
+		.maxlen = sizeof(int),
+	};
+
+	ret = proc_dointvec(&tmp_table, write, buffer, lenp, ppos);
+	if (write && ret >= 0) {
+		if (val < -8 || val > 8) {
+			ret = -EINVAL;
+		} else {
+			*valp = val;
+			if (rcu_access_pointer(ipvs->svc_table))
+				mod_delayed_work(system_unbound_wq,
+						 &ipvs->svc_resize_work, 0);
+		}
+	}
+	return ret;
+}
+
 /*
  *	IPVS sysctl table (under the /proc/sys/net/ipv4/vs/)
  *	Do not change order or insert new entries without
···
 		.mode		= 0644,
 		.proc_handler	= ipvs_proc_est_nice,
 	},
+	{
+		.procname	= "conn_lfactor",
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= ipvs_proc_conn_lfactor,
+	},
+	{
+		.procname	= "svc_lfactor",
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= ipvs_proc_svc_lfactor,
+	},
 #ifdef CONFIG_IP_VS_DEBUG
 	{
 		.procname	= "debug_level",
···
 
 static int ip_vs_info_seq_show(struct seq_file *seq, void *v)
 {
+	struct net *net = seq_file_net(seq);
+	struct netns_ipvs *ipvs = net_ipvs(net);
+
 	if (v == SEQ_START_TOKEN) {
 		seq_printf(seq,
 			"IP Virtual Server version %d.%d.%d (size=%d)\n",
-			NVERSION(IP_VS_VERSION_CODE), ip_vs_conn_tab_size);
+			NVERSION(IP_VS_VERSION_CODE), get_conn_tab_size(ipvs));
 		seq_puts(seq,
 			 "Prot LocalAddress:Port Scheduler Flags\n");
 		seq_puts(seq,
···
 
 	return 0;
 }
+
+static int ip_vs_status_show(struct seq_file *seq, void *v)
+{
+	struct net *net = seq_file_single_net(seq);
+	struct netns_ipvs *ipvs = net_ipvs(net);
+	unsigned int resched_score = 0;
+	struct ip_vs_conn_hnode *hn;
+	struct hlist_bl_head *head;
+	struct ip_vs_service *svc;
+	struct ip_vs_rht *t, *pt;
+	struct hlist_bl_node *e;
+	int old_gen, new_gen;
+	u32 counts[8];
+	u32 bucket;
+	int count;
+	u32 sum1;
+	u32 sum;
+	int i;
+
+	rcu_read_lock();
+
+	t = rcu_dereference(ipvs->conn_tab);
+
+	seq_printf(seq, "Conns:\t%d\n", atomic_read(&ipvs->conn_count));
+	seq_printf(seq, "Conn buckets:\t%d (%d bits, lfactor %d)\n",
+		   t ? t->size : 0, t ? t->bits : 0, t ? t->lfactor : 0);
+
+	if (!atomic_read(&ipvs->conn_count))
+		goto after_conns;
+	old_gen = atomic_read(&ipvs->conn_tab_changes);
+
+repeat_conn:
+	smp_rmb(); /* ipvs->conn_tab and conn_tab_changes */
+	memset(counts, 0, sizeof(counts));
+	ip_vs_rht_for_each_table_rcu(ipvs->conn_tab, t, pt) {
+		for (bucket = 0; bucket < t->size; bucket++) {
+			DECLARE_IP_VS_RHT_WALK_BUCKET_RCU();
+
+			count = 0;
+			resched_score++;
+			ip_vs_rht_walk_bucket_rcu(t, bucket, head) {
+				count = 0;
+				hlist_bl_for_each_entry_rcu(hn, e, head, node)
+					count++;
+			}
+			resched_score += count;
+			if (resched_score >= 100) {
+				resched_score = 0;
+				cond_resched_rcu();
+				new_gen = atomic_read(&ipvs->conn_tab_changes);
+				/* New table installed ? */
+				if (old_gen != new_gen) {
+					old_gen = new_gen;
+					goto repeat_conn;
+				}
+			}
+			counts[min(count, (int)ARRAY_SIZE(counts) - 1)]++;
+		}
+	}
+	for (sum = 0, i = 0; i < ARRAY_SIZE(counts); i++)
+		sum += counts[i];
+	sum1 = sum - counts[0];
+	seq_printf(seq, "Conn buckets empty:\t%u (%lu%%)\n",
+		   counts[0], (unsigned long)counts[0] * 100 / max(sum, 1U));
+	for (i = 1; i < ARRAY_SIZE(counts); i++) {
+		if (!counts[i])
+			continue;
+		seq_printf(seq, "Conn buckets len-%d:\t%u (%lu%%)\n",
+			   i, counts[i],
+			   (unsigned long)counts[i] * 100 / max(sum1, 1U));
+	}
+
+after_conns:
+	t = rcu_dereference(ipvs->svc_table);
+
+	count = ip_vs_get_num_services(ipvs);
+	seq_printf(seq, "Services:\t%d\n", count);
+	seq_printf(seq, "Service buckets:\t%d (%d bits, lfactor %d)\n",
+		   t ? t->size : 0, t ? t->bits : 0, t ? t->lfactor : 0);
+
+	if (!count)
+		goto after_svc;
+	old_gen = atomic_read(&ipvs->svc_table_changes);
+
+repeat_svc:
+	smp_rmb(); /* ipvs->svc_table and svc_table_changes */
+	memset(counts, 0, sizeof(counts));
+	ip_vs_rht_for_each_table_rcu(ipvs->svc_table, t, pt) {
+		for (bucket = 0; bucket < t->size; bucket++) {
+			DECLARE_IP_VS_RHT_WALK_BUCKET_RCU();
+
+			count = 0;
+			resched_score++;
+			ip_vs_rht_walk_bucket_rcu(t, bucket, head) {
+				count = 0;
+				hlist_bl_for_each_entry_rcu(svc, e, head,
+							    s_list)
+					count++;
+			}
+			resched_score += count;
+			if (resched_score >= 100) {
+				resched_score = 0;
+				cond_resched_rcu();
+				new_gen = atomic_read(&ipvs->svc_table_changes);
+				/* New table installed ? */
+				if (old_gen != new_gen) {
+					old_gen = new_gen;
+					goto repeat_svc;
+				}
+			}
+			counts[min(count, (int)ARRAY_SIZE(counts) - 1)]++;
+		}
+	}
+	for (sum = 0, i = 0; i < ARRAY_SIZE(counts); i++)
+		sum += counts[i];
+	sum1 = sum - counts[0];
+	seq_printf(seq, "Service buckets empty:\t%u (%lu%%)\n",
+		   counts[0], (unsigned long)counts[0] * 100 / max(sum, 1U));
+	for (i = 1; i < ARRAY_SIZE(counts); i++) {
+		if (!counts[i])
+			continue;
+		seq_printf(seq, "Service buckets len-%d:\t%u (%lu%%)\n",
+			   i, counts[i],
+			   (unsigned long)counts[i] * 100 / max(sum1, 1U));
+	}
+
+after_svc:
+	seq_printf(seq, "Stats thread slots:\t%d (max %lu)\n",
+		   ipvs->est_kt_count, ipvs->est_max_threads);
+	seq_printf(seq, "Stats chain max len:\t%d\n", ipvs->est_chain_max);
+	seq_printf(seq, "Stats thread ests:\t%d\n",
+		   ipvs->est_chain_max * IPVS_EST_CHAIN_FACTOR *
+		   IPVS_EST_NTICKS);
+
+	rcu_read_unlock();
+	return 0;
+}
+
 #endif
 
 /*
···
 		char buf[64];
 
 		sprintf(buf, "IP Virtual Server version %d.%d.%d (size=%d)",
-			NVERSION(IP_VS_VERSION_CODE), ip_vs_conn_tab_size);
+			NVERSION(IP_VS_VERSION_CODE), get_conn_tab_size(ipvs));
 		if (copy_to_user(user, buf, strlen(buf)+1) != 0) {
 			ret = -EFAULT;
 			goto out;
···
 	case IP_VS_SO_GET_INFO:
 	{
 		struct ip_vs_getinfo info;
+
 		info.version = IP_VS_VERSION_CODE;
-		info.size = ip_vs_conn_tab_size;
+		info.size = get_conn_tab_size(ipvs);
 		info.num_services =
 			atomic_read(&ipvs->num_services[IP_VS_AF_INET]);
 		if (copy_to_user(user, &info, sizeof(info)) != 0)
···
 		if (nla_put_u32(msg, IPVS_INFO_ATTR_VERSION,
 				IP_VS_VERSION_CODE) ||
 		    nla_put_u32(msg, IPVS_INFO_ATTR_CONN_TAB_SIZE,
-				ip_vs_conn_tab_size))
+				get_conn_tab_size(ipvs)))
 			goto nla_put_failure;
 		break;
 	}
···
 	tbl[idx].extra2 = ipvs;
 	tbl[idx++].data = &ipvs->sysctl_est_nice;
 
+	if (unpriv)
+		tbl[idx].mode = 0444;
+	tbl[idx].extra2 = ipvs;
+	tbl[idx++].data = &ipvs->sysctl_conn_lfactor;
+
+	if (unpriv)
+		tbl[idx].mode = 0444;
+	tbl[idx].extra2 = ipvs;
+	tbl[idx++].data = &ipvs->sysctl_svc_lfactor;
+
 #ifdef CONFIG_IP_VS_DEBUG
 	/* Global sysctls must be ro in non-init netns */
 	if (!net_eq(net, &init_net))
···
 					ipvs->net->proc_net,
 					ip_vs_stats_percpu_show, NULL))
 		goto err_percpu;
+	if (!proc_create_net_single("ip_vs_status", 0, ipvs->net->proc_net,
+				    ip_vs_status_show, NULL))
+		goto err_status;
 #endif
 
 	ret = ip_vs_control_net_init_sysctl(ipvs);
···
 
 err:
 #ifdef CONFIG_PROC_FS
+	remove_proc_entry("ip_vs_status", ipvs->net->proc_net);
+
+err_status:
 	remove_proc_entry("ip_vs_stats_percpu", ipvs->net->proc_net);
 
 err_percpu:
···
 	ip_vs_control_net_cleanup_sysctl(ipvs);
 	cancel_delayed_work_sync(&ipvs->est_reload_work);
 #ifdef CONFIG_PROC_FS
+	remove_proc_entry("ip_vs_status", ipvs->net->proc_net);
 	remove_proc_entry("ip_vs_stats_percpu", ipvs->net->proc_net);
 	remove_proc_entry("ip_vs_stats", ipvs->net->proc_net);
 	remove_proc_entry("ip_vs", ipvs->net->proc_net);
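The bucket statistics printed by ip_vs_status_show() boil down to a chain-length histogram whose last slot absorbs all longer chains, followed by two percentage calculations. A minimal user-space sketch of that arithmetic is given below. The names model_chain_histogram() and model_percent() are hypothetical, and the input is a plain array of chain lengths rather than an RCU-protected hash table.

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_NCOUNTS 8	/* same number of slots as counts[8] in the patch */

/*
 * Hypothetical model: build a histogram of chain lengths. Chains with
 * MODEL_NCOUNTS - 1 or more entries all land in the last slot.
 */
static void model_chain_histogram(const int *chain_len, size_t nbuckets,
				  uint32_t counts[MODEL_NCOUNTS])
{
	size_t i;

	for (i = 0; i < MODEL_NCOUNTS; i++)
		counts[i] = 0;

	for (i = 0; i < nbuckets; i++) {
		int len = chain_len[i];

		if (len < 0)
			len = 0;	/* defensive: lengths cannot be negative */
		if (len > MODEL_NCOUNTS - 1)
			len = MODEL_NCOUNTS - 1;
		counts[len]++;
	}
}

/* Integer percentage with a divide-by-zero guard, like max(sum, 1U). */
static unsigned long model_percent(uint32_t part, uint32_t total)
{
	return (unsigned long)part * 100 / (total ? total : 1);
}
```

As in the patch, the share of empty buckets is taken relative to all buckets, while the share for each non-zero length is taken relative to the non-empty buckets only.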
-8
net/netfilter/nf_conntrack_core.c
···
 #endif
 	case IPPROTO_TCP:
 	case IPPROTO_UDP:
-#ifdef CONFIG_NF_CT_PROTO_UDPLITE
-	case IPPROTO_UDPLITE:
-#endif
 #ifdef CONFIG_NF_CT_PROTO_SCTP
 	case IPPROTO_SCTP:
 #endif
···
 #if IS_ENABLED(CONFIG_IPV6)
 	case IPPROTO_ICMPV6:
 		return nf_conntrack_icmpv6_packet(ct, skb, ctinfo, state);
-#endif
-#ifdef CONFIG_NF_CT_PROTO_UDPLITE
-	case IPPROTO_UDPLITE:
-		return nf_conntrack_udplite_packet(ct, skb, dataoff,
-						   ctinfo, state);
 #endif
 #ifdef CONFIG_NF_CT_PROTO_SCTP
 	case IPPROTO_SCTP:
-3
net/netfilter/nf_conntrack_proto.c
···
 #ifdef CONFIG_NF_CT_PROTO_SCTP
 	case IPPROTO_SCTP: return &nf_conntrack_l4proto_sctp;
 #endif
-#ifdef CONFIG_NF_CT_PROTO_UDPLITE
-	case IPPROTO_UDPLITE: return &nf_conntrack_l4proto_udplite;
-#endif
 #ifdef CONFIG_NF_CT_PROTO_GRE
 	case IPPROTO_GRE: return &nf_conntrack_l4proto_gre;
 #endif
-108
net/netfilter/nf_conntrack_proto_udp.c
···
 	return NF_ACCEPT;
 }
 
-#ifdef CONFIG_NF_CT_PROTO_UDPLITE
-static void udplite_error_log(const struct sk_buff *skb,
-			      const struct nf_hook_state *state,
-			      const char *msg)
-{
-	nf_l4proto_log_invalid(skb, state, IPPROTO_UDPLITE, "%s", msg);
-}
-
-static bool udplite_error(struct sk_buff *skb,
-			  unsigned int dataoff,
-			  const struct nf_hook_state *state)
-{
-	unsigned int udplen = skb->len - dataoff;
-	const struct udphdr *hdr;
-	struct udphdr _hdr;
-	unsigned int cscov;
-
-	/* Header is too small? */
-	hdr = skb_header_pointer(skb, dataoff, sizeof(_hdr), &_hdr);
-	if (!hdr) {
-		udplite_error_log(skb, state, "short packet");
-		return true;
-	}
-
-	cscov = ntohs(hdr->len);
-	if (cscov == 0) {
-		cscov = udplen;
-	} else if (cscov < sizeof(*hdr) || cscov > udplen) {
-		udplite_error_log(skb, state, "invalid checksum coverage");
-		return true;
-	}
-
-	/* UDPLITE mandates checksums */
-	if (!hdr->check) {
-		udplite_error_log(skb, state, "checksum missing");
-		return true;
-	}
-
-	/* Checksum invalid? Ignore. */
-	if (state->hook == NF_INET_PRE_ROUTING &&
-	    state->net->ct.sysctl_checksum &&
-	    nf_checksum_partial(skb, state->hook, dataoff, cscov, IPPROTO_UDP,
-				state->pf)) {
-		udplite_error_log(skb, state, "bad checksum");
-		return true;
-	}
-
-	return false;
-}
-
-/* Returns verdict for packet, and may modify conntracktype */
-int nf_conntrack_udplite_packet(struct nf_conn *ct,
-				struct sk_buff *skb,
-				unsigned int dataoff,
-				enum ip_conntrack_info ctinfo,
-				const struct nf_hook_state *state)
-{
-	unsigned int *timeouts;
-
-	if (udplite_error(skb, dataoff, state))
-		return -NF_ACCEPT;
-
-	timeouts = nf_ct_timeout_lookup(ct);
-	if (!timeouts)
-		timeouts = udp_get_timeouts(nf_ct_net(ct));
-
-	/* If we've seen traffic both ways, this is some kind of UDP
-	   stream. Extend timeout. */
-	if (test_bit(IPS_SEEN_REPLY_BIT, &ct->status)) {
-		nf_ct_refresh_acct(ct, ctinfo, skb,
-				   timeouts[UDP_CT_REPLIED]);
-
-		if (unlikely((ct->status & IPS_NAT_CLASH)))
-			return NF_ACCEPT;
-
-		/* Also, more likely to be important, and not a probe */
-		if (!test_and_set_bit(IPS_ASSURED_BIT, &ct->status))
-			nf_conntrack_event_cache(IPCT_ASSURED, ct);
-	} else {
-		nf_ct_refresh_acct(ct, ctinfo, skb, timeouts[UDP_CT_UNREPLIED]);
-	}
-	return NF_ACCEPT;
-}
-#endif
-
 #ifdef CONFIG_NF_CONNTRACK_TIMEOUT
 
 #include <linux/netfilter/nfnetlink.h>
···
 	},
 #endif /* CONFIG_NF_CONNTRACK_TIMEOUT */
 };
-
-#ifdef CONFIG_NF_CT_PROTO_UDPLITE
-const struct nf_conntrack_l4proto nf_conntrack_l4proto_udplite =
-{
-	.l4proto		= IPPROTO_UDPLITE,
-	.allow_clash		= true,
-#if IS_ENABLED(CONFIG_NF_CT_NETLINK)
-	.tuple_to_nlattr	= nf_ct_port_tuple_to_nlattr,
-	.nlattr_to_tuple	= nf_ct_port_nlattr_to_tuple,
-	.nlattr_tuple_size	= nf_ct_port_nlattr_tuple_size,
-	.nla_policy		= nf_ct_port_nla_policy,
-#endif
-#ifdef CONFIG_NF_CONNTRACK_TIMEOUT
-	.ctnl_timeout		= {
-		.nlattr_to_obj	= udp_timeout_nlattr_to_obj,
-		.obj_to_nlattr	= udp_timeout_obj_to_nlattr,
-		.nlattr_max	= CTA_TIMEOUT_UDP_MAX,
-		.obj_size	= sizeof(unsigned int) * CTA_TIMEOUT_UDP_MAX,
-		.nla_policy	= udp_timeout_nla_policy,
-	},
-#endif /* CONFIG_NF_CONNTRACK_TIMEOUT */
-};
-#endif
-2
net/netfilter/nf_conntrack_standalone.c
···
 			   ntohs(tuple->src.u.tcp.port),
 			   ntohs(tuple->dst.u.tcp.port));
 		break;
-	case IPPROTO_UDPLITE:
 	case IPPROTO_UDP:
 		seq_printf(s, "sport=%hu dport=%hu ",
 			   ntohs(tuple->src.u.udp.port),
···
 	case IPPROTO_UDP: return "udp";
 	case IPPROTO_GRE: return "gre";
 	case IPPROTO_SCTP: return "sctp";
-	case IPPROTO_UDPLITE: return "udplite";
 	case IPPROTO_ICMPV6: return "icmpv6";
 	}
 
+7 -1
net/netfilter/nf_log_syslog.c
···
 	else
 		logflags = NF_LOG_DEFAULT_MASK;
 
-	if (logflags & NF_LOG_MACDECODE) {
+	if ((logflags & NF_LOG_MACDECODE) &&
+	    skb->dev && skb->dev->type == ARPHRD_ETHER &&
+	    skb_mac_header_was_set(skb) &&
+	    skb_mac_header_len(skb) >= ETH_HLEN) {
 		nf_log_buf_add(m, "MACSRC=%pM MACDST=%pM ",
 			       eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest);
 		nf_log_dump_vlan(m, skb);
···
 
 	switch (dev->type) {
 	case ARPHRD_ETHER:
+		if (!skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN)
+			return;
+
 		nf_log_buf_add(m, "MACSRC=%pM MACDST=%pM ",
 			       eth_hdr(skb)->h_source, eth_hdr(skb)->h_dest);
 		nf_log_dump_vlan(m, skb);
-6
net/netfilter/nf_nat_core.c
···
 		fl4->daddr = t->dst.u3.ip;
 		if (t->dst.protonum == IPPROTO_TCP ||
 		    t->dst.protonum == IPPROTO_UDP ||
-		    t->dst.protonum == IPPROTO_UDPLITE ||
 		    t->dst.protonum == IPPROTO_SCTP)
 			fl4->fl4_dport = t->dst.u.all;
 	}
···
 		fl4->saddr = t->src.u3.ip;
 		if (t->dst.protonum == IPPROTO_TCP ||
 		    t->dst.protonum == IPPROTO_UDP ||
-		    t->dst.protonum == IPPROTO_UDPLITE ||
 		    t->dst.protonum == IPPROTO_SCTP)
 			fl4->fl4_sport = t->src.u.all;
 	}
···
 		fl6->daddr = t->dst.u3.in6;
 		if (t->dst.protonum == IPPROTO_TCP ||
 		    t->dst.protonum == IPPROTO_UDP ||
-		    t->dst.protonum == IPPROTO_UDPLITE ||
 		    t->dst.protonum == IPPROTO_SCTP)
 			fl6->fl6_dport = t->dst.u.all;
 	}
···
 		fl6->saddr = t->src.u3.in6;
 		if (t->dst.protonum == IPPROTO_TCP ||
 		    t->dst.protonum == IPPROTO_UDP ||
-		    t->dst.protonum == IPPROTO_UDPLITE ||
 		    t->dst.protonum == IPPROTO_SCTP)
 			fl6->fl6_sport = t->src.u.all;
 	}
···
 	case IPPROTO_GRE: /* all fall though */
 	case IPPROTO_TCP:
 	case IPPROTO_UDP:
-	case IPPROTO_UDPLITE:
 	case IPPROTO_SCTP:
 		if (maniptype == NF_NAT_MANIP_SRC)
 			port = tuple->src.u.all;
···
 		goto find_free_id;
 #endif
 	case IPPROTO_UDP:
-	case IPPROTO_UDPLITE:
 	case IPPROTO_TCP:
 	case IPPROTO_SCTP:
 		if (maniptype == NF_NAT_MANIP_SRC)
-20
net/netfilter/nf_nat_proto.c
···
 	return true;
 }
 
-static bool udplite_manip_pkt(struct sk_buff *skb,
-			      unsigned int iphdroff, unsigned int hdroff,
-			      const struct nf_conntrack_tuple *tuple,
-			      enum nf_nat_manip_type maniptype)
-{
-#ifdef CONFIG_NF_CT_PROTO_UDPLITE
-	struct udphdr *hdr;
-
-	if (skb_ensure_writable(skb, hdroff + sizeof(*hdr)))
-		return false;
-
-	hdr = (struct udphdr *)(skb->data + hdroff);
-	__udp_manip_pkt(skb, iphdroff, hdr, tuple, maniptype, true);
-#endif
-	return true;
-}
-
 static bool
 sctp_manip_pkt(struct sk_buff *skb,
 	       unsigned int iphdroff, unsigned int hdroff,
···
 	case IPPROTO_UDP:
 		return udp_manip_pkt(skb, iphdroff, hdroff,
 				     tuple, maniptype);
-	case IPPROTO_UDPLITE:
-		return udplite_manip_pkt(skb, iphdroff, hdroff,
-					 tuple, maniptype);
 	case IPPROTO_SCTP:
 		return sctp_manip_pkt(skb, iphdroff, hdroff,
 				      tuple, maniptype);
-1
net/netfilter/nft_ct.c
···
 	switch (priv->l4proto) {
 	case IPPROTO_TCP:
 	case IPPROTO_UDP:
-	case IPPROTO_UDPLITE:
 	case IPPROTO_DCCP:
 	case IPPROTO_SCTP:
 		break;
+10
net/netfilter/nft_fwd_netdev.c
···
 			goto out;
 		}
 		iph = ip_hdr(skb);
+		if (iph->ttl <= 1) {
+			verdict = NF_DROP;
+			goto out;
+		}
+
 		ip_decrease_ttl(iph);
 		neigh_table = NEIGH_ARP_TABLE;
 		break;
···
 			goto out;
 		}
 		ip6h = ipv6_hdr(skb);
+		if (ip6h->hop_limit <= 1) {
+			verdict = NF_DROP;
+			goto out;
+		}
+
 		ip6h->hop_limit--;
 		neigh_table = NEIGH_ND_TABLE;
 		break;
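The check added to nft_fwd_netdev follows the usual forwarding rule: a packet whose TTL or hop limit is already 0 or 1 must not be forwarded, and only otherwise is the field decremented. A stripped-down sketch of that decision, using a hypothetical model_forward_ttl() helper on a bare byte instead of a real IP header, could look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the check-then-decrement order in the patch.
 * Returns false when the packet should be dropped (the patch sets
 * verdict = NF_DROP in that case); the field is left untouched then.
 */
static bool model_forward_ttl(uint8_t *ttl)
{
	if (*ttl <= 1)
		return false;	/* expired: do not forward */

	(*ttl)--;		/* still alive: decrement, then forward */
	return true;
}
```

Checking before decrementing matters because the field is unsigned: decrementing a hop limit of 0 first would wrap around to 255.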
+8 -4
net/netfilter/x_tables.c
···
 
 /* non-compat version may have padding after verdict */
 struct compat_xt_standard_target {
-	struct compat_xt_entry_target t;
-	compat_uint_t verdict;
+	/* Must be last as it ends in a flexible-array member. */
+	TRAILING_OVERLAP(struct compat_xt_entry_target, t, data,
+		compat_uint_t verdict;
+	);
 };
 
 struct compat_xt_error_target {
-	struct compat_xt_entry_target t;
-	char errorname[XT_FUNCTION_MAXNAMELEN];
+	/* Must be last as it ends in a flexible-array member. */
+	TRAILING_OVERLAP(struct compat_xt_entry_target, t, data,
+		char errorname[XT_FUNCTION_MAXNAMELEN];
+	);
 };
 
 int xt_compat_check_entry_offsets(const void *base, const char *elems,
+27
net/netfilter/xt_hl.c
···
  * Hop Limit matching module
  * (C) 2001-2002 Maciej Soltysiak <solt@dns.toxicfilms.tv>
  */
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/ip.h>
 #include <linux/ipv6.h>
···
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("ipt_ttl");
 MODULE_ALIAS("ip6t_hl");
+
+static int ttl_mt_check(const struct xt_mtchk_param *par)
+{
+	const struct ipt_ttl_info *info = par->matchinfo;
+
+	if (info->mode > IPT_TTL_GT) {
+		pr_err("Unknown TTL match mode: %d\n", info->mode);
+		return -EINVAL;
+	}
+
+	return 0;
+}
 
 static bool ttl_mt(const struct sk_buff *skb, struct xt_action_param *par)
 {
···
 	}
 
 	return false;
+}
+
+static int hl_mt6_check(const struct xt_mtchk_param *par)
+{
+	const struct ip6t_hl_info *info = par->matchinfo;
+
+	if (info->mode > IP6T_HL_GT) {
+		pr_err("Unknown Hop Limit match mode: %d\n", info->mode);
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 static bool hl_mt6(const struct sk_buff *skb, struct xt_action_param *par)
···
 		.name       = "ttl",
 		.revision   = 0,
 		.family     = NFPROTO_IPV4,
+		.checkentry = ttl_mt_check,
 		.match      = ttl_mt,
 		.matchsize  = sizeof(struct ipt_ttl_info),
 		.me         = THIS_MODULE,
···
 		.name       = "hl",
 		.revision   = 0,
 		.family     = NFPROTO_IPV6,
+		.checkentry = hl_mt6_check,
 		.match      = hl_mt6,
 		.matchsize  = sizeof(struct ip6t_hl_info),
 		.me         = THIS_MODULE,
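The new checkentry hooks reject a rule at load time when its mode field is larger than the last known operator, instead of letting the match function silently return false for every packet. The sketch below models that validation in user space. The model_* enum and struct are stand-ins for the real ipt_ttl_info definitions and assume the operators are numbered consecutively from 0 with "greater than" last.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the TTL match operators (EQ, NE, LT, GT). */
enum model_ttl_mode {
	MODEL_TTL_EQ = 0,
	MODEL_TTL_NE,
	MODEL_TTL_LT,
	MODEL_TTL_GT,	/* last valid mode */
};

/* Hypothetical stand-in for the rule data handed in from userspace. */
struct model_ttl_info {
	uint8_t mode;
	uint8_t ttl;
};

/* Rule-load-time validation: any mode beyond the last one is rejected. */
static int model_ttl_check(const struct model_ttl_info *info)
{
	if (info->mode > MODEL_TTL_GT)
		return -EINVAL;

	return 0;
}
```

Because the mode is an unsigned byte, a single upper-bound comparison is enough to cover every invalid value.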
+1 -3
net/netfilter/xt_mac.c
···
 
 	if (skb->dev == NULL || skb->dev->type != ARPHRD_ETHER)
 		return false;
-	if (skb_mac_header(skb) < skb->head)
-		return false;
-	if (skb_mac_header(skb) + ETH_HLEN > skb->data)
+	if (!skb_mac_header_was_set(skb) || skb_mac_header_len(skb) < ETH_HLEN)
 		return false;
 	ret = ether_addr_equal(eth_hdr(skb)->h_source, info->srcaddr);
 	ret ^= info->invert;
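Several files in this merge now apply the same three-part test before calling eth_hdr(): the device must exist and be of type ARPHRD_ETHER, the MAC header offset must have been set, and the MAC header must be at least ETH_HLEN (14) bytes long. The following user-space model sketches that test. The model_skb and model_dev structures are simplified, hypothetical stand-ins for struct sk_buff and struct net_device, and the sentinel value for an unset offset is an assumption made for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_ETH_HLEN		14	/* Ethernet header length in bytes */
#define MODEL_ARPHRD_ETHER	1	/* ARP hardware type for Ethernet */
#define MODEL_OFFSET_UNSET	0xffffU	/* assumed "not set" sentinel */

/* Hypothetical, heavily simplified stand-ins for kernel structures. */
struct model_dev {
	unsigned short type;
};

struct model_skb {
	const struct model_dev *dev;
	uint16_t mac_header;		/* offset of the link-layer header */
	uint16_t network_header;	/* offset of the network-layer header */
};

static bool model_mac_header_was_set(const struct model_skb *skb)
{
	return skb->mac_header != MODEL_OFFSET_UNSET;
}

/* Length of the link-layer header: distance to the network header. */
static unsigned int model_mac_header_len(const struct model_skb *skb)
{
	return skb->network_header - skb->mac_header;
}

/* True only when it is safe to interpret the MAC header as Ethernet. */
static bool model_has_eth_header(const struct model_skb *skb)
{
	if (!skb->dev || skb->dev->type != MODEL_ARPHRD_ETHER)
		return false;
	if (!model_mac_header_was_set(skb))
		return false;
	return model_mac_header_len(skb) >= MODEL_ETH_HLEN;
}
```

The device type check comes first so that packets from non-Ethernet devices are rejected even when their link-layer header happens to be 14 bytes or longer.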
+22
net/netfilter/xt_physdev.c
···
 		return -EINVAL;
 	}
 
+#define X(memb) strnlen(info->memb, sizeof(info->memb)) >= sizeof(info->memb)
+	if (info->bitmask & XT_PHYSDEV_OP_IN) {
+		if (info->physindev[0] == '\0')
+			return -EINVAL;
+		if (X(physindev))
+			return -ENAMETOOLONG;
+	}
+
+	if (info->bitmask & XT_PHYSDEV_OP_OUT) {
+		if (info->physoutdev[0] == '\0')
+			return -EINVAL;
+
+		if (X(physoutdev))
+			return -ENAMETOOLONG;
+	}
+
+	if (X(in_mask))
+		return -ENAMETOOLONG;
+	if (X(out_mask))
+		return -ENAMETOOLONG;
+#undef X
+
 	if (!brnf_probed) {
 		brnf_probed = true;
 		request_module("br_netfilter");
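The X() macro in the xt_physdev patch relies on a property of strnlen(): when no NUL byte is found within the given bound, it returns the bound itself, so a result equal to the buffer size means the name is not terminated. A small user-space sketch of the same test on a fixed-size name buffer is shown below. MODEL_IFNAMSIZ and model_check_devname() are hypothetical names chosen for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* strnlen() is a POSIX function */
#endif
#include <errno.h>
#include <string.h>

#define MODEL_IFNAMSIZ 16	/* assumed size of the name buffer */

/*
 * Hypothetical model of the device-name validation:
 *   -EINVAL        for an empty name,
 *   -ENAMETOOLONG  when no NUL terminator fits inside the buffer,
 *   0              otherwise.
 */
static int model_check_devname(const char *name)
{
	if (name[0] == '\0')
		return -EINVAL;

	/* strnlen() returns the bound when no NUL byte was found. */
	if (strnlen(name, MODEL_IFNAMSIZ) >= MODEL_IFNAMSIZ)
		return -ENAMETOOLONG;

	return 0;
}
```

A 15-character name followed by a NUL byte is the longest input this check accepts for a 16-byte buffer.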
+6 -17
net/netfilter/xt_socket.c
···
 static int socket_mt_v1_check(const struct xt_mtchk_param *par)
 {
 	const struct xt_socket_mtinfo1 *info = (struct xt_socket_mtinfo1 *) par->matchinfo;
-	int err;
-
-	err = socket_mt_enable_defrag(par->net, par->family);
-	if (err)
-		return err;
 
 	if (info->flags & ~XT_SOCKET_FLAGS_V1) {
 		pr_info_ratelimited("unknown flags 0x%x\n",
 				    info->flags & ~XT_SOCKET_FLAGS_V1);
 		return -EINVAL;
 	}
-	return 0;
+
+	return socket_mt_enable_defrag(par->net, par->family);
 }
 
 static int socket_mt_v2_check(const struct xt_mtchk_param *par)
 {
 	const struct xt_socket_mtinfo2 *info = (struct xt_socket_mtinfo2 *) par->matchinfo;
-	int err;
-
-	err = socket_mt_enable_defrag(par->net, par->family);
-	if (err)
-		return err;
 
 	if (info->flags & ~XT_SOCKET_FLAGS_V2) {
 		pr_info_ratelimited("unknown flags 0x%x\n",
 				    info->flags & ~XT_SOCKET_FLAGS_V2);
 		return -EINVAL;
 	}
-	return 0;
+
+	return socket_mt_enable_defrag(par->net, par->family);
 }
 
 static int socket_mt_v3_check(const struct xt_mtchk_param *par)
 {
 	const struct xt_socket_mtinfo3 *info =
 				    (struct xt_socket_mtinfo3 *)par->matchinfo;
-	int err;
 
-	err = socket_mt_enable_defrag(par->net, par->family);
-	if (err)
-		return err;
 	if (info->flags & ~XT_SOCKET_FLAGS_V3) {
 		pr_info_ratelimited("unknown flags 0x%x\n",
 				    info->flags & ~XT_SOCKET_FLAGS_V3);
 		return -EINVAL;
 	}
-	return 0;
+
+	return socket_mt_enable_defrag(par->net, par->family);
 }
 
 static void socket_mt_destroy(const struct xt_mtdtor_param *par)
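The xt_socket change is purely about ordering: input validation now runs before the side effect (enabling defragmentation), so a rule that is rejected for unknown flags no longer leaves defrag enabled. The sketch below models that ordering in user space. The counter, the flag mask and the model_* function names are hypothetical and only illustrate the pattern.

```c
#include <errno.h>

#define MODEL_KNOWN_FLAGS 0x7u	/* assumed mask of supported flag bits */

/* Counts how often the side effect (enabling defrag) was triggered. */
static int model_defrag_users;

/* Hypothetical stand-in for the enable call; always succeeds here. */
static int model_enable_defrag(void)
{
	model_defrag_users++;
	return 0;
}

/* Validate first, trigger the side effect only for acceptable input. */
static int model_check(unsigned int flags)
{
	if (flags & ~MODEL_KNOWN_FLAGS)
		return -EINVAL;	/* rejected: nothing was enabled */

	return model_enable_defrag();
}
```

Returning the result of the enable call directly, as the patch does, also removes the need for a separate error variable.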