Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:
"Just the usual assortment of small'ish fixes:

1) Conntrack timeout is sometimes not initialized properly, from
Alexander Potapenko.

2) Add a reasonable range limit to tcp_min_rtt_wlen to avoid
undefined behavior. From ZhangXiaoxu.

3) des1 field of descriptor in stmmac driver is initialized with the
wrong variable. From Yue Haibing.

4) Increase mlxsw pci sw reset timeout a little bit more, from Ido
Schimmel.

5) Match IOT2000 stmmac devices more accurately, from Su Bao Cheng.

6) Fallback refcount fix in TLS code, from Jakub Kicinski.

7) Fix max MTU check when using XDP in mlx5, from Maxim Mikityanskiy.

8) Fix recursive locking in team driver, from Hangbin Liu.

9) Fix tls_set_device_offload_rx() deadlock, from Jakub Kicinski.

10) Don't use napi_alloc_frag() outside of softirq context in the
socionext driver, from Ilias Apalodimas.

11) MAC address increment overflow in ncsi, from Tao Ren.

12) Fix a regression in 8K/1M pool switching of RDS, from Zhu Yanjun.

13) ipv4_link_failure has to validate the headers that are actually
there because RAW sockets can pass in arbitrary garbage, from Eric
Dumazet"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits)
ipv4: add sanity checks in ipv4_link_failure()
net/rose: fix unbound loop in rose_loopback_timer()
rxrpc: fix race condition in rxrpc_input_packet()
net: rds: exchange of 8K and 1M pool
net: vrf: Fix operation not supported when set vrf mac
net/ncsi: handle overflow when incrementing mac address
net: socionext: replace napi_alloc_frag with the netdev variant on init
net: atheros: fix spelling mistake "underun" -> "underrun"
spi: ST ST95HF NFC: declare missing of table
spi: Micrel eth switch: declare missing of table
net: stmmac: move stmmac_check_ether_addr() to driver probe
netfilter: fix nf_l4proto_log_invalid to log invalid packets
netfilter: never get/set skb->tstamp
netfilter: ebtables: CONFIG_COMPAT: drop a bogus WARN_ON
Documentation: decnet: remove reference to CONFIG_DECNET_ROUTE_FWMARK
dt-bindings: add an explanation for internal phy-mode
net/tls: don't leak IV and record seq when offload fails
net/tls: avoid potential deadlock in tls_set_device_offload_rx()
selftests/net: correct the return value for run_afpackettests
team: fix possible recursive locking when add slaves
...

Linus Torvalds 7 years ago cd8dead0 11bfe647

+690 -187

61 changed files

Documentation/devicetree/bindings/net/davinci_emac.txt

···
 Optional properties:
 - phy-handle: See ethernet.txt file in the same directory.
               If absent, davinci_emac driver defaults to 100/FULL.
+- nvmem-cells: phandle, reference to an nvmem node for the MAC address
+- nvmem-cell-names: string, should be "mac-address" if nvmem is to be used
 - ti,davinci-rmii-en: 1 byte, 1 means use RMII
 - ti,davinci-no-bd-ram: boolean, does EMAC have BD RAM?

+2 -3

Documentation/devicetree/bindings/net/ethernet.txt

···
   the boot program; should be used in cases where the MAC address assigned to
   the device by the boot program is different from the "local-mac-address"
   property;
-- nvmem-cells: phandle, reference to an nvmem node for the MAC address;
-- nvmem-cell-names: string, should be "mac-address" if nvmem is to be used;
 - max-speed: number, specifies maximum speed in Mbit/s supported by the device;
 - max-frame-size: number, maximum transfer unit (IEEE defined MTU), rather than
   the maximum frame size (there's contradiction in the Devicetree
   Specification).
 - phy-mode: string, operation mode of the PHY interface. This is now a de-facto
   standard property; supported values are:
-  * "internal"
+  * "internal" (Internal means there is not a standard bus between the MAC and
+     the PHY, something proprietary is being used to embed the PHY in the MAC.)
   * "mii"
   * "gmii"
   * "sgmii"

Documentation/devicetree/bindings/net/macb.txt

···
 	Optional elements: 'tsu_clk'
 - clocks: Phandles to input clocks.
 
+Optional properties:
+- nvmem-cells: phandle, reference to an nvmem node for the MAC address
+- nvmem-cell-names: string, should be "mac-address" if nvmem is to be used
+
 Optional properties for PHY child node:
 - reset-gpios : Should specify the gpio for phy reset
 - magic-packet : If present, indicates that the hardware supports waking

-2

Documentation/networking/decnet.txt

···
     CONFIG_DECNET_ROUTER (to be able to add/delete routes)
     CONFIG_NETFILTER (will be required for the DECnet routing daemon)
 
-    CONFIG_DECNET_ROUTE_FWMARK is optional
-
 Don't turn on SIOCGIFCONF support for DECnet unless you are really sure
 that you need it, in general you won't and it can cause ifconfig to
 malfunction.

Documentation/networking/ip-sysctl.txt

···
 	minimum RTT when it is moved to a longer path (e.g., due to traffic
 	engineering). A longer window makes the filter more resistant to RTT
 	inflations such as transient congestion. The unit is seconds.
+	Possible values: 0 - 86400 (1 day)
 	Default: 300
 
 tcp_moderate_rcvbuf - BOOLEAN

+1 -1

drivers/atm/firestream.c

···
 	}
 
 	if (status & ISR_TBRQ_W) {
-		fs_dprintk (FS_DEBUG_IRQ, "Data tramsitted!\n");
+		fs_dprintk (FS_DEBUG_IRQ, "Data transmitted!\n");
 		process_txdone_queue (dev, &dev->tx_relq);
 	}

+2 -2

drivers/net/ethernet/atheros/atlx/atl1.c

···
 	adapter->soft_stats.scc += smb->tx_1_col;
 	adapter->soft_stats.mcc += smb->tx_2_col;
 	adapter->soft_stats.latecol += smb->tx_late_col;
-	adapter->soft_stats.tx_underun += smb->tx_underrun;
+	adapter->soft_stats.tx_underrun += smb->tx_underrun;
 	adapter->soft_stats.tx_trunc += smb->tx_trunc;
 	adapter->soft_stats.tx_pause += smb->tx_pause;
···
 	{"tx_deferred_ok", ATL1_STAT(soft_stats.deffer)},
 	{"tx_single_coll_ok", ATL1_STAT(soft_stats.scc)},
 	{"tx_multi_coll_ok", ATL1_STAT(soft_stats.mcc)},
-	{"tx_underun", ATL1_STAT(soft_stats.tx_underun)},
+	{"tx_underrun", ATL1_STAT(soft_stats.tx_underrun)},
 	{"tx_trunc", ATL1_STAT(soft_stats.tx_trunc)},
 	{"tx_pause", ATL1_STAT(soft_stats.tx_pause)},
 	{"rx_pause", ATL1_STAT(soft_stats.rx_pause)},

+1 -1

drivers/net/ethernet/atheros/atlx/atl1.h

···
 	u64 scc;	/* packets TX after a single collision */
 	u64 mcc;	/* packets TX after multiple collisions */
 	u64 latecol;	/* TX packets w/ late collisions */
-	u64 tx_underun;	/* TX packets aborted due to TX FIFO underrun
+	u64 tx_underrun;	/* TX packets aborted due to TX FIFO underrun
 			 * or TRD FIFO underrun */
 	u64 tx_trunc;	/* TX packets truncated due to size > MTU */
 	u64 rx_pause;	/* num Pause packets received. */

+1 -1

drivers/net/ethernet/atheros/atlx/atl2.c

···
 			netdev->stats.tx_aborted_errors++;
 		if (txs->late_col)
 			netdev->stats.tx_window_errors++;
-		if (txs->underun)
+		if (txs->underrun)
 			netdev->stats.tx_fifo_errors++;
 	} while (1);

+1 -1

drivers/net/ethernet/atheros/atlx/atl2.h

···
 	unsigned multi_col:1;
 	unsigned late_col:1;
 	unsigned abort_col:1;
-	unsigned underun:1;	/* current packet is aborted
+	unsigned underrun:1;	/* current packet is aborted
 				 * due to txram underrun */
 	unsigned:3;	/* reserved */
 	unsigned update:1;	/* always 1'b1 in tx_status_buf */

+22 -2

drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c

···
 #include <linux/bpf_trace.h>
 #include "en/xdp.h"
 
+int mlx5e_xdp_max_mtu(struct mlx5e_params *params)
+{
+	int hr = NET_IP_ALIGN + XDP_PACKET_HEADROOM;
+
+	/* Let S := SKB_DATA_ALIGN(sizeof(struct skb_shared_info)).
+	 * The condition checked in mlx5e_rx_is_linear_skb is:
+	 *   SKB_DATA_ALIGN(sw_mtu + hard_mtu + hr) + S <= PAGE_SIZE         (1)
+	 *   (Note that hw_mtu == sw_mtu + hard_mtu.)
+	 * What is returned from this function is:
+	 *   max_mtu = PAGE_SIZE - S - hr - hard_mtu                         (2)
+	 * After assigning sw_mtu := max_mtu, the left side of (1) turns to
+	 * SKB_DATA_ALIGN(PAGE_SIZE - S) + S, which is equal to PAGE_SIZE,
+	 * because both PAGE_SIZE and S are already aligned. Any number greater
+	 * than max_mtu would make the left side of (1) greater than PAGE_SIZE,
+	 * so max_mtu is the maximum MTU allowed.
+	 */
+
+	return MLX5E_HW2SW_MTU(params, SKB_MAX_HEAD(hr));
+}
+
 static inline bool
 mlx5e_xmit_xdp_buff(struct mlx5e_xdpsq *sq, struct mlx5e_dma_info *di,
 		    struct xdp_buff *xdp)
···
 		mlx5e_xdpi_fifo_pop(xdpi_fifo);
 
 		if (is_redirect) {
-			xdp_return_frame(xdpi.xdpf);
 			dma_unmap_single(sq->pdev, xdpi.dma_addr,
 					 xdpi.xdpf->len, DMA_TO_DEVICE);
+			xdp_return_frame(xdpi.xdpf);
 		} else {
 			/* Recycle RX page */
 			mlx5e_page_release(rq, &xdpi.di, true);
···
 		mlx5e_xdpi_fifo_pop(xdpi_fifo);
 
 		if (is_redirect) {
-			xdp_return_frame(xdpi.xdpf);
 			dma_unmap_single(sq->pdev, xdpi.dma_addr,
 					 xdpi.xdpf->len, DMA_TO_DEVICE);
+			xdp_return_frame(xdpi.xdpf);
 		} else {
 			/* Recycle RX page */
 			mlx5e_page_release(rq, &xdpi.di, false);
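The bound derived in the comment on mlx5e_xdp_max_mtu() can be checked with plain integer arithmetic. The sketch below is a userspace illustration, not driver code: PAGE_SZ, CACHE_LINE and all helper names are hypothetical, and the alignment granularity of 64 bytes is an assumption standing in for whatever SKB_DATA_ALIGN uses on a given platform.

```c
#include <assert.h>

/* Assumed values for illustration only; the kernel takes these from the
 * platform (PAGE_SIZE, SMP_CACHE_BYTES). */
enum { PAGE_SZ = 4096, CACHE_LINE = 64 };

/* Round x up to the next multiple of a. */
#define ALIGN_UP(x, a) (((x) + (a) - 1) / (a) * (a))

/* Stand-in for SKB_DATA_ALIGN(). */
static int data_align(int x)
{
	return ALIGN_UP(x, CACHE_LINE);
}

/* Equation (2): max_mtu = PAGE_SIZE - S - hr - hard_mtu.
 * s is assumed to be already aligned, as the comment requires. */
static int xdp_max_mtu(int s, int hr, int hard_mtu)
{
	assert(s % CACHE_LINE == 0);
	return PAGE_SZ - s - hr - hard_mtu;
}

/* Condition (1): SKB_DATA_ALIGN(sw_mtu + hard_mtu + hr) + S <= PAGE_SIZE. */
static int fits_linear(int sw_mtu, int s, int hr, int hard_mtu)
{
	return data_align(sw_mtu + hard_mtu + hr) + s <= PAGE_SZ;
}
```

With the assumed sample values S = 320, hr = 256 and hard_mtu = 22, xdp_max_mtu() yields 3498; that MTU satisfies condition (1) exactly (3776 + 320 = 4096), while 3499 rounds up to 3840 + 320 = 4160 and fails.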

+1 -2

drivers/net/ethernet/mellanox/mlx5/core/en/xdp.h

···
 #include "en.h"
 
-#define MLX5E_XDP_MAX_MTU ((int)(PAGE_SIZE - \
-				 MLX5_SKB_FRAG_SZ(XDP_PACKET_HEADROOM)))
 #define MLX5E_XDP_MIN_INLINE (ETH_HLEN + VLAN_HLEN)
 #define MLX5E_XDP_TX_EMPTY_DS_COUNT \
 	(sizeof(struct mlx5e_tx_wqe) / MLX5_SEND_WQE_DS)
 #define MLX5E_XDP_TX_DS_COUNT (MLX5E_XDP_TX_EMPTY_DS_COUNT + 1 /* SG DS */)
 
+int mlx5e_xdp_max_mtu(struct mlx5e_params *params);
 bool mlx5e_xdp_handle(struct mlx5e_rq *rq, struct mlx5e_dma_info *di,
 		      void *va, u16 *rx_headroom, u32 *len);
 bool mlx5e_poll_xdpsq_cq(struct mlx5e_cq *cq, struct mlx5e_rq *rq);

+1 -1

drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c

···
 		break;
 	case MLX5_MODULE_ID_SFP:
 		modinfo->type = ETH_MODULE_SFF_8472;
-		modinfo->eeprom_len = ETH_MODULE_SFF_8472_LEN;
+		modinfo->eeprom_len = MLX5_EEPROM_PAGE_LENGTH;
 		break;
 	default:
 		netdev_err(priv->netdev, "%s: cable type not recognized:0x%x\n",

+3 -2

drivers/net/ethernet/mellanox/mlx5/core/en_main.c

···
 	if (params->xdp_prog &&
 	    !mlx5e_rx_is_linear_skb(priv->mdev, &new_channels.params)) {
 		netdev_err(netdev, "MTU(%d) > %d is not allowed while XDP enabled\n",
-			   new_mtu, MLX5E_XDP_MAX_MTU);
+			   new_mtu, mlx5e_xdp_max_mtu(params));
 		err = -EINVAL;
 		goto out;
 	}
···
 	if (!mlx5e_rx_is_linear_skb(priv->mdev, &new_channels.params)) {
 		netdev_warn(netdev, "XDP is not allowed with MTU(%d) > %d\n",
-			    new_channels.params.sw_mtu, MLX5E_XDP_MAX_MTU);
+			    new_channels.params.sw_mtu,
+			    mlx5e_xdp_max_mtu(&new_channels.params));
 		return -EINVAL;
 	}

-4

drivers/net/ethernet/mellanox/mlx5/core/port.c

···
 		size -= offset + size - MLX5_EEPROM_PAGE_LENGTH;
 
 	i2c_addr = MLX5_I2C_ADDR_LOW;
-	if (offset >= MLX5_EEPROM_PAGE_LENGTH) {
-		i2c_addr = MLX5_I2C_ADDR_HIGH;
-		offset -= MLX5_EEPROM_PAGE_LENGTH;
-	}
 
 	MLX5_SET(mcia_reg, in, l, 0);
 	MLX5_SET(mcia_reg, in, module, module_num);

+1 -1

drivers/net/ethernet/mellanox/mlxsw/pci_hw.h

···
 #define MLXSW_PCI_SW_RESET 0xF0010
 #define MLXSW_PCI_SW_RESET_RST_BIT BIT(0)
-#define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS 13000
+#define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS 20000
 #define MLXSW_PCI_SW_RESET_WAIT_MSECS 100
 #define MLXSW_PCI_FW_READY 0xA1844
 #define MLXSW_PCI_FW_READY_MASK 0xFFFF

+3 -3

drivers/net/ethernet/mellanox/mlxsw/spectrum.c

···
 	if (err)
 		return err;
 
+	mlxsw_sp_port->link.autoneg = autoneg;
+
 	if (!netif_running(dev))
 		return 0;
-
-	mlxsw_sp_port->link.autoneg = autoneg;
 
 	mlxsw_sp_port_admin_status_set(mlxsw_sp_port, false);
 	mlxsw_sp_port_admin_status_set(mlxsw_sp_port, true);
···
 		err = mlxsw_sp_port_ets_set(mlxsw_sp_port,
 					    MLXSW_REG_QEEC_HIERARCY_TC,
 					    i + 8, i,
-					    false, 0);
+					    true, 100);
 		if (err)
 			return err;
 	}

+2 -2

drivers/net/ethernet/netronome/nfp/abm/cls.c

···
 	}
 	if (knode->sel->off || knode->sel->offshift || knode->sel->offmask ||
 	    knode->sel->offoff || knode->fshift) {
-		NL_SET_ERR_MSG_MOD(extack, "variable offseting not supported");
+		NL_SET_ERR_MSG_MOD(extack, "variable offsetting not supported");
 		return false;
 	}
 	if (knode->sel->hoff || knode->sel->hmask) {
···
 	k = &knode->sel->keys[0];
 	if (k->offmask) {
-		NL_SET_ERR_MSG_MOD(extack, "offset mask - variable offseting not supported");
+		NL_SET_ERR_MSG_MOD(extack, "offset mask - variable offsetting not supported");
 		return false;
 	}
 	if (k->off) {

+7 -4

drivers/net/ethernet/socionext/netsec.c

···
 }
 
 static void *netsec_alloc_rx_data(struct netsec_priv *priv,
-				  dma_addr_t *dma_handle, u16 *desc_len)
+				  dma_addr_t *dma_handle, u16 *desc_len,
+				  bool napi)
 {
 	size_t total_len = SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 	size_t payload_len = NETSEC_RX_BUF_SZ;
···
 	total_len += SKB_DATA_ALIGN(payload_len + NETSEC_SKB_PAD);
 
-	buf = napi_alloc_frag(total_len);
+	buf = napi ? napi_alloc_frag(total_len) : netdev_alloc_frag(total_len);
 	if (!buf)
 		return NULL;
···
 		/* allocate a fresh buffer and map it to the hardware.
 		 * This will eventually replace the old buffer in the hardware
 		 */
-		buf_addr = netsec_alloc_rx_data(priv, &dma_handle, &desc_len);
+		buf_addr = netsec_alloc_rx_data(priv, &dma_handle, &desc_len,
+						true);
 		if (unlikely(!buf_addr))
 			break;
···
 		void *buf;
 		u16 len;
 
-		buf = netsec_alloc_rx_data(priv, &dma_handle, &len);
+		buf = netsec_alloc_rx_data(priv, &dma_handle, &len,
+					   false);
 		if (!buf) {
 			netsec_uninit_pkt_dring(priv, NETSEC_RING_RX);
 			goto err_out;

+1 -1

drivers/net/ethernet/stmicro/stmmac/norm_desc.c

···
 	p->des0 |= cpu_to_le32(RDES0_OWN);
 
 	bfsize1 = min(bfsize, BUF_SIZE_2KiB - 1);
-	p->des1 |= cpu_to_le32(bfsize & RDES1_BUFFER1_SIZE_MASK);
+	p->des1 |= cpu_to_le32(bfsize1 & RDES1_BUFFER1_SIZE_MASK);
 
 	if (mode == STMMAC_CHAIN_MODE)
 		ndesc_rx_set_on_chain(p, end);
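Why the one-character change above matters can be seen by masking a clamped and an unclamped size side by side. This is a minimal userspace sketch, not driver code: the 11-bit mask value 0x7ff is an assumption about RDES1_BUFFER1_SIZE_MASK, and min_u32(), des1_before() and des1_after() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define BUF1_SIZE_MASK 0x7ffu	/* assumed 11-bit buffer-size field */
#define BUF_SIZE_2KIB  2048u

static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/* Before the fix: the unclamped size is masked, so 2048 wraps to 0. */
static uint32_t des1_before(uint32_t bfsize)
{
	return bfsize & BUF1_SIZE_MASK;
}

/* After the fix: the size is clamped to 2 KiB - 1 first, then masked. */
static uint32_t des1_after(uint32_t bfsize)
{
	uint32_t bfsize1 = min_u32(bfsize, BUF_SIZE_2KIB - 1);

	assert(bfsize1 <= BUF1_SIZE_MASK);
	return bfsize1 & BUF1_SIZE_MASK;
}
```

Under that assumed mask, a 2048-byte buffer would have been programmed as size 0 before the fix and as 2047 after it; sizes below 2 KiB are unaffected.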

+2 -2

drivers/net/ethernet/stmicro/stmmac/stmmac_main.c

···
 	u32 chan;
 	int ret;
 
-	stmmac_check_ether_addr(priv);
-
 	if (priv->hw->pcs != STMMAC_PCS_RGMII &&
 	    priv->hw->pcs != STMMAC_PCS_TBI &&
 	    priv->hw->pcs != STMMAC_PCS_RTBI) {
···
 	ret = stmmac_hw_init(priv);
 	if (ret)
 		goto error_hw_init;
+
+	stmmac_check_ether_addr(priv);
 
 	/* Configure real RX and TX queues */
 	netif_set_real_num_rx_queues(ndev, priv->plat->rx_queues_to_use);

+6 -2

drivers/net/ethernet/stmicro/stmmac/stmmac_pci.c

···
 		},
 		.driver_data = (void *)&galileo_stmmac_dmi_data,
 	},
+	/*
+	 * There are 2 types of SIMATIC IOT2000: IOT20202 and IOT2040.
+	 * The asset tag "6ES7647-0AA00-0YA2" is only for IOT2020 which
+	 * has only one pci network device while other asset tags are
+	 * for IOT2040 which has two.
+	 */
 	{
 		.matches = {
 			DMI_EXACT_MATCH(DMI_BOARD_NAME, "SIMATIC IOT2000"),
···
 	{
 		.matches = {
 			DMI_EXACT_MATCH(DMI_BOARD_NAME, "SIMATIC IOT2000"),
-			DMI_EXACT_MATCH(DMI_BOARD_ASSET_TAG,
-					"6ES7647-0AA00-1YA2"),
 		},
 		.driver_data = (void *)&iot2040_stmmac_dmi_data,
 	},

drivers/net/phy/spi_ks8995.c

···
 };
 MODULE_DEVICE_TABLE(spi, ks8995_id);
 
+static const struct of_device_id ks8895_spi_of_match[] = {
+	{ .compatible = "micrel,ks8995" },
+	{ .compatible = "micrel,ksz8864" },
+	{ .compatible = "micrel,ksz8795" },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, ks8895_spi_of_match);
+
 static inline u8 get_chip_id(u8 val)
 {
 	return (val >> ID1_CHIPID_S) & ID1_CHIPID_M;
···
 static struct spi_driver ks8995_driver = {
 	.driver = {
 		.name = "spi-ks8995",
+		.of_match_table = of_match_ptr(ks8895_spi_of_match),
 	},
 	.probe = ks8995_probe,
 	.remove = ks8995_remove,

drivers/net/team/team.c

···
 		return -EINVAL;
 	}
 
+	if (netdev_has_upper_dev(dev, port_dev)) {
+		NL_SET_ERR_MSG(extack, "Device is already an upper device of the team interface");
+		netdev_err(dev, "Device %s is already an upper device of the team interface\n",
+			   portname);
+		return -EBUSY;
+	}
+
 	if (port_dev->features & NETIF_F_VLAN_CHALLENGED &&
 	    vlan_uses_dev(dev)) {
 		NL_SET_ERR_MSG(extack, "Device is VLAN challenged and team device has VLAN set up");

drivers/net/vrf.c

···
 	.ndo_init = vrf_dev_init,
 	.ndo_uninit = vrf_dev_uninit,
 	.ndo_start_xmit = vrf_xmit,
+	.ndo_set_mac_address = eth_mac_addr,
 	.ndo_get_stats64 = vrf_get_stats64,
 	.ndo_add_slave = vrf_add_slave,
 	.ndo_del_slave = vrf_del_slave,
···
 	/* default to no qdisc; user can add if desired */
 	dev->priv_flags |= IFF_NO_QUEUE;
 	dev->priv_flags |= IFF_NO_RX_HANDLER;
+	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
 
 	/* VRF devices do not care about MTU, but if the MTU is set
 	 * too low then the ipv4 and ipv6 protocols are disabled

drivers/nfc/st95hf/core.c

···
 };
 MODULE_DEVICE_TABLE(spi, st95hf_id);
 
+static const struct of_device_id st95hf_spi_of_match[] = {
+	{ .compatible = "st,st95hf" },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, st95hf_spi_of_match);
+
 static int st95hf_probe(struct spi_device *nfc_spi_dev)
 {
 	int ret;
···
 	.driver = {
 		.name = "st95hf",
 		.owner = THIS_MODULE,
+		.of_match_table = of_match_ptr(st95hf_spi_of_match),
 	},
 	.id_table = st95hf_id,
 	.probe = st95hf_probe,

-1

drivers/of/of_net.c

···
  */
 #include <linux/etherdevice.h>
 #include <linux/kernel.h>
-#include <linux/nvmem-consumer.h>
 #include <linux/of_net.h>
 #include <linux/phy.h>
 #include <linux/export.h>

drivers/s390/net/ctcm_main.c

···
 		if (priv->channel[direction] == NULL) {
 			if (direction == CTCM_WRITE)
 				channel_free(priv->channel[CTCM_READ]);
+			result = -ENODEV;
 			goto out_dev;
 		}
 		priv->channel[direction]->netdev = dev;

+12

include/linux/etherdevice.h

···
 }
 
 /**
+ * eth_addr_inc() - Increment the given MAC address.
+ * @addr: Pointer to a six-byte array containing Ethernet address to increment.
+ */
+static inline void eth_addr_inc(u8 *addr)
+{
+	u64 u = ether_addr_to_u64(addr);
+
+	u++;
+	u64_to_ether_addr(u, addr);
+}
+
+/**
  * is_etherdev_addr - Tell if given Ethernet address belongs to the device.
  * @dev: Pointer to a device structure
  * @addr: Pointer to a six-byte array containing the Ethernet address
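The new eth_addr_inc() helper treats the six address bytes as one 48-bit big-endian number, so a carry out of the last byte propagates into the preceding ones. The following is a userspace sketch of the same idea; addr_to_u64(), u64_to_addr() and addr_inc() are hypothetical stand-ins written for illustration, not the kernel helpers themselves.

```c
#include <assert.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Pack six address bytes into the low 48 bits, most significant byte first. */
static uint64_t addr_to_u64(const uint8_t *addr)
{
	uint64_t u = 0;

	for (int i = 0; i < ETH_ALEN; i++)
		u = u << 8 | addr[i];
	return u;
}

/* Unpack the low 48 bits back into six bytes; higher bits are dropped. */
static void u64_to_addr(uint64_t u, uint8_t *addr)
{
	for (int i = ETH_ALEN - 1; i >= 0; i--) {
		addr[i] = u & 0xff;
		u >>= 8;
	}
}

/* Increment the address as a whole, carrying across byte boundaries. */
static void addr_inc(uint8_t *addr)
{
	assert(addr);
	u64_to_addr(addr_to_u64(addr) + 1, addr);
}
```

For example, 00:11:22:33:44:ff becomes 00:11:22:33:45:00, whereas incrementing only the last byte would have produced 00:11:22:33:44:00. An all-ones address wraps to all zeros, which is one reason a validity check after the increment is useful.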

include/net/netfilter/nf_conntrack.h

···
 				 gfp_t flags);
 void nf_ct_tmpl_free(struct nf_conn *tmpl);
 
+u32 nf_ct_get_id(const struct nf_conn *ct);
+
 static inline void
 nf_ct_set(struct sk_buff *skb, struct nf_conn *ct, enum ip_conntrack_info info)
 {

include/net/netfilter/nf_conntrack_l4proto.h

···
 bool nf_conntrack_invert_icmpv6_tuple(struct nf_conntrack_tuple *tuple,
 				      const struct nf_conntrack_tuple *orig);
 
+int nf_conntrack_inet_error(struct nf_conn *tmpl, struct sk_buff *skb,
+			    unsigned int dataoff,
+			    const struct nf_hook_state *state,
+			    u8 l4proto,
+			    union nf_inet_addr *outer_daddr);
+
 int nf_conntrack_icmpv4_error(struct nf_conn *tmpl,
 			      struct sk_buff *skb,
 			      unsigned int dataoff,

+2 -1

net/bridge/netfilter/ebtables.c

···
 	if (match_kern)
 		match_kern->match_size = ret;
 
-	if (WARN_ON(type == EBT_COMPAT_TARGET && size_left))
+	/* rule should have no remaining data after target */
+	if (type == EBT_COMPAT_TARGET && size_left)
 		return -EINVAL;
 
 	match32 = (struct compat_ebt_entry_mwt *) buf;

+24 -10

net/ipv4/route.c

···
 	return dst;
 }
 
-static void ipv4_link_failure(struct sk_buff *skb)
+static void ipv4_send_dest_unreach(struct sk_buff *skb)
 {
 	struct ip_options opt;
-	struct rtable *rt;
 	int res;
 
 	/* Recompile ip options since IPCB may not be valid anymore.
+	 * Also check we have a reasonable ipv4 header.
 	 */
-	memset(&opt, 0, sizeof(opt));
-	opt.optlen = ip_hdr(skb)->ihl*4 - sizeof(struct iphdr);
-
-	rcu_read_lock();
-	res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
-	rcu_read_unlock();
-
-	if (res)
+	if (!pskb_network_may_pull(skb, sizeof(struct iphdr)) ||
+	    ip_hdr(skb)->version != 4 || ip_hdr(skb)->ihl < 5)
 		return;
 
+	memset(&opt, 0, sizeof(opt));
+	if (ip_hdr(skb)->ihl > 5) {
+		if (!pskb_network_may_pull(skb, ip_hdr(skb)->ihl * 4))
+			return;
+		opt.optlen = ip_hdr(skb)->ihl * 4 - sizeof(struct iphdr);
+
+		rcu_read_lock();
+		res = __ip_options_compile(dev_net(skb->dev), &opt, skb, NULL);
+		rcu_read_unlock();
+
+		if (res)
+			return;
+	}
 	__icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_UNREACH, 0, &opt);
+}
+
+static void ipv4_link_failure(struct sk_buff *skb)
+{
+	struct rtable *rt;
+
+	ipv4_send_dest_unreach(skb);
 
 	rt = skb_rtable(skb);
 	if (rt)
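The order of the checks added to ipv4_send_dest_unreach() can be mirrored in userspace on a raw byte buffer. This is only a sketch under stated assumptions: ipv4_optlen_checked() is a hypothetical function, a flat buffer with an explicit length stands in for the skb, and the length comparisons stand in for pskb_network_may_pull().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IPV4_MIN_HDR 20	/* size of an IPv4 header without options */

/* Return the IPv4 options length in bytes, or -1 if the buffer does not
 * hold a plausible IPv4 header. */
static int ipv4_optlen_checked(const uint8_t *pkt, size_t len)
{
	unsigned int version, ihl;

	assert(pkt != NULL);

	/* The fixed part of the header must be present before it is read. */
	if (len < IPV4_MIN_HDR)
		return -1;

	version = pkt[0] >> 4;
	ihl = pkt[0] & 0x0f;
	if (version != 4 || ihl < 5)
		return -1;

	/* With options, the whole header (ihl * 4 bytes) must be present. */
	if (len < (size_t)ihl * 4)
		return -1;

	return (int)(ihl * 4) - IPV4_MIN_HDR;
}
```

A buffer starting with 0x45 and 20 bytes long yields 0 (no options); one starting with 0x46 needs at least 24 bytes and then yields 4; anything with a version other than 4 or a header length below 5 words is rejected before its options are touched.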

+4 -1

net/ipv4/sysctl_net_ipv4.c

···
 static int ip_ping_group_range_max[] = { GID_T_MAX, GID_T_MAX };
 static int comp_sack_nr_max = 255;
 static u32 u32_max_div_HZ = UINT_MAX / HZ;
+static int one_day_secs = 24 * 3600;
 
 /* obsolete */
 static int sysctl_tcp_low_latency __read_mostly;
···
 		.data = &init_net.ipv4.sysctl_tcp_min_rtt_wlen,
 		.maxlen = sizeof(int),
 		.mode = 0644,
-		.proc_handler = proc_dointvec
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = &zero,
+		.extra2 = &one_day_secs
 	},
 	{
 		.procname = "tcp_autocorking",
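The effect of switching tcp_min_rtt_wlen to a min/max handler is that out-of-range writes are refused instead of stored. A minimal userspace sketch of that behaviour follows; set_min_rtt_wlen(), wlen_min and wlen_max are hypothetical names, and returning -EINVAL on rejection is an assumption about how the handler reports the error.

```c
#include <assert.h>
#include <errno.h>

/* Limits taken from the diff above: 0 .. 24 * 3600 seconds (one day). */
static const int wlen_min = 0;
static const int wlen_max = 24 * 3600;

/* Store new_val in *wlen only if it lies within [wlen_min, wlen_max].
 * Returns 0 on success; on failure returns -EINVAL and leaves *wlen
 * unchanged. */
static int set_min_rtt_wlen(int *wlen, int new_val)
{
	assert(wlen);

	if (new_val < wlen_min || new_val > wlen_max)
		return -EINVAL;

	*wlen = new_val;
	return 0;
}
```

With these limits, 86400 is accepted while 86401 and -1 are rejected, so later arithmetic on the stored value only ever sees a window between zero and one day.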

+1 -1

net/ipv6/addrlabel.c

···
 	}
 
 	if (nlmsg_attrlen(nlh, sizeof(*ifal))) {
-		NL_SET_ERR_MSG_MOD(extack, "Invalid data after header for address label dump requewst");
+		NL_SET_ERR_MSG_MOD(extack, "Invalid data after header for address label dump request");
 		return -EINVAL;
 	}

+5 -1

net/ncsi/ncsi-rsp.c

···
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/netdevice.h>
+#include <linux/etherdevice.h>
 #include <linux/skbuff.h>
 
 #include <net/ncsi.h>
···
 	ndev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
 	memcpy(saddr.sa_data, &rsp->data[BCM_MAC_ADDR_OFFSET], ETH_ALEN);
 	/* Increase mac address by 1 for BMC's address */
-	saddr.sa_data[ETH_ALEN - 1]++;
+	eth_addr_inc((u8 *)saddr.sa_data);
+	if (!is_valid_ether_addr((const u8 *)saddr.sa_data))
+		return -ENXIO;
+
 	ret = ops->ndo_set_mac_address(ndev, &saddr);
 	if (ret < 0)
 		netdev_warn(ndev, "NCSI: 'Writing mac address to device failed\n");

+1 -1

net/netfilter/ipvs/ip_vs_core.c

···
 	if (!cp) {
 		int v;
 
-		if (!sysctl_schedule_icmp(ipvs))
+		if (ipip || !sysctl_schedule_icmp(ipvs))
 			return NF_ACCEPT;
 
 		if (!ip_vs_try_to_schedule(ipvs, AF_INET, skb, pd, &v, &cp, &ciph))

+38 -5

net/netfilter/nf_conntrack_core.c

···
 #include <linux/slab.h>
 #include <linux/random.h>
 #include <linux/jhash.h>
+#include <linux/siphash.h>
 #include <linux/err.h>
 #include <linux/percpu.h>
 #include <linux/moduleparam.h>
···
 	return true;
 }
 EXPORT_SYMBOL_GPL(nf_ct_invert_tuple);
+
+/* Generate a almost-unique pseudo-id for a given conntrack.
+ *
+ * intentionally doesn't re-use any of the seeds used for hash
+ * table location, we assume id gets exposed to userspace.
+ *
+ * Following nf_conn items do not change throughout lifetime
+ * of the nf_conn after it has been committed to main hash table:
+ *
+ * 1. nf_conn address
+ * 2. nf_conn->ext address
+ * 3. nf_conn->master address (normally NULL)
+ * 4. tuple
+ * 5. the associated net namespace
+ */
+u32 nf_ct_get_id(const struct nf_conn *ct)
+{
+	static __read_mostly siphash_key_t ct_id_seed;
+	unsigned long a, b, c, d;
+
+	net_get_random_once(&ct_id_seed, sizeof(ct_id_seed));
+
+	a = (unsigned long)ct;
+	b = (unsigned long)ct->master ^ net_hash_mix(nf_ct_net(ct));
+	c = (unsigned long)ct->ext;
+	d = (unsigned long)siphash(&ct->tuplehash, sizeof(ct->tuplehash),
+				   &ct_id_seed);
+#ifdef CONFIG_64BIT
+	return siphash_4u64((u64)a, (u64)b, (u64)c, (u64)d, &ct_id_seed);
+#else
+	return siphash_4u32((u32)a, (u32)b, (u32)c, (u32)d, &ct_id_seed);
+#endif
+}
+EXPORT_SYMBOL_GPL(nf_ct_get_id);
 
 static void
 clean_from_lists(struct nf_conn *ct)
···
 	/* set conntrack timestamp, if enabled. */
 	tstamp = nf_conn_tstamp_find(ct);
-	if (tstamp) {
-		if (skb->tstamp == 0)
-			__net_timestamp(skb);
+	if (tstamp)
+		tstamp->start = ktime_get_real_ns();
 
-		tstamp->start = ktime_to_ns(skb->tstamp);
-	}
 	/* Since the lookup is lockless, hash insertion must be done after
 	 * starting the timer and setting the CONFIRMED bit. The RCU barriers
 	 * guarantee that no other CPU can find the conntrack before the above
···
 	/* save hash for reusing when confirming */
 	*(unsigned long *)(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev) = hash;
 	ct->status = 0;
+	ct->timeout = 0;
 	write_pnet(&ct->ct_net, net);
 	memset(&ct->__nfct_init_offset[0], 0,
 	       offsetof(struct nf_conn, proto) -

+29 -5

net/netfilter/nf_conntrack_netlink.c

···
 #include <linux/spinlock.h>
 #include <linux/interrupt.h>
 #include <linux/slab.h>
+#include <linux/siphash.h>
 
 #include <linux/netfilter.h>
 #include <net/netlink.h>
···
 static int ctnetlink_dump_id(struct sk_buff *skb, const struct nf_conn *ct)
 {
-	if (nla_put_be32(skb, CTA_ID, htonl((unsigned long)ct)))
+	__be32 id = (__force __be32)nf_ct_get_id(ct);
+
+	if (nla_put_be32(skb, CTA_ID, id))
 		goto nla_put_failure;
 	return 0;
···
 	}
 
 	if (cda[CTA_ID]) {
-		u_int32_t id = ntohl(nla_get_be32(cda[CTA_ID]));
-		if (id != (u32)(unsigned long)ct) {
+		__be32 id = nla_get_be32(cda[CTA_ID]);
+
+		if (id != (__force __be32)nf_ct_get_id(ct)) {
 			nf_ct_put(ct);
 			return -ENOENT;
 		}
···
 static const union nf_inet_addr any_addr;
 
+static __be32 nf_expect_get_id(const struct nf_conntrack_expect *exp)
+{
+	static __read_mostly siphash_key_t exp_id_seed;
+	unsigned long a, b, c, d;
+
+	net_get_random_once(&exp_id_seed, sizeof(exp_id_seed));
+
+	a = (unsigned long)exp;
+	b = (unsigned long)exp->helper;
+	c = (unsigned long)exp->master;
+	d = (unsigned long)siphash(&exp->tuple, sizeof(exp->tuple), &exp_id_seed);
+
+#ifdef CONFIG_64BIT
+	return (__force __be32)siphash_4u64((u64)a, (u64)b, (u64)c, (u64)d, &exp_id_seed);
+#else
+	return (__force __be32)siphash_4u32((u32)a, (u32)b, (u32)c, (u32)d, &exp_id_seed);
+#endif
+}
+
 static int
 ctnetlink_exp_dump_expect(struct sk_buff *skb,
 			  const struct nf_conntrack_expect *exp)
···
 	}
 #endif
 	if (nla_put_be32(skb, CTA_EXPECT_TIMEOUT, htonl(timeout)) ||
-	    nla_put_be32(skb, CTA_EXPECT_ID, htonl((unsigned long)exp)) ||
+	    nla_put_be32(skb, CTA_EXPECT_ID, nf_expect_get_id(exp)) ||
 	    nla_put_be32(skb, CTA_EXPECT_FLAGS, htonl(exp->flags)) ||
 	    nla_put_be32(skb, CTA_EXPECT_CLASS, htonl(exp->class)))
 		goto nla_put_failure;
···
 	if (cda[CTA_EXPECT_ID]) {
 		__be32 id = nla_get_be32(cda[CTA_EXPECT_ID]);
-		if (ntohl(id) != (u32)(unsigned long)exp) {
+
+		if (id != nf_expect_get_id(exp)) {
 			nf_ct_expect_put(exp);
 			return -ENOENT;
 		}

+1 -1

net/netfilter/nf_conntrack_proto.c

···
 	struct va_format vaf;
 	va_list args;
 
-	if (net->ct.sysctl_log_invalid != protonum ||
+	if (net->ct.sysctl_log_invalid != protonum &&
 	    net->ct.sysctl_log_invalid != IPPROTO_RAW)
 		return;
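The switch from || to && in the early-return test is easiest to see by evaluating both forms for a few inputs. The sketch below is illustrative only: skip_old() and skip_new() are hypothetical names for the two versions of the condition, and PROTO_RAW is a local stand-in for IPPROTO_RAW (255).

```c
#include <assert.h>
#include <stdbool.h>

#define PROTO_RAW 255	/* numeric value of IPPROTO_RAW */

/* Old condition: true unless log_invalid equals both protonum and
 * PROTO_RAW, so logging is skipped for every protocol except raw. */
static bool skip_old(int log_invalid, int protonum)
{
	return log_invalid != protonum || log_invalid != PROTO_RAW;
}

/* New condition: skip only when log_invalid selects neither this
 * protocol nor "all protocols" (PROTO_RAW). */
static bool skip_new(int log_invalid, int protonum)
{
	assert(protonum >= 0 && protonum <= PROTO_RAW);
	return log_invalid != protonum && log_invalid != PROTO_RAW;
}
```

With log_invalid set to 6 (TCP) and an invalid TCP packet, the old test skips logging and the new one does not; with log_invalid set to 255, the new test logs packets of any protocol, while an unrelated setting (6 with a UDP packet) is still skipped.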

+74 -23

net/netfilter/nf_conntrack_proto_icmp.c

···
 	return NF_ACCEPT;
 }

-/* Returns conntrack if it dealt with ICMP, and filled in skb fields */
-static int
-icmp_error_message(struct nf_conn *tmpl, struct sk_buff *skb,
-		   const struct nf_hook_state *state)
+/* Check inner header is related to any of the existing connections */
+int nf_conntrack_inet_error(struct nf_conn *tmpl, struct sk_buff *skb,
+			    unsigned int dataoff,
+			    const struct nf_hook_state *state,
+			    u8 l4proto, union nf_inet_addr *outer_daddr)
 {
 	struct nf_conntrack_tuple innertuple, origtuple;
 	const struct nf_conntrack_tuple_hash *h;
 	const struct nf_conntrack_zone *zone;
 	enum ip_conntrack_info ctinfo;
 	struct nf_conntrack_zone tmp;
+	union nf_inet_addr *ct_daddr;
+	enum ip_conntrack_dir dir;
+	struct nf_conn *ct;

 	WARN_ON(skb_nfct(skb));
 	zone = nf_ct_zone_tmpl(tmpl, skb, &tmp);

 	/* Are they talking about one of our connections? */
-	if (!nf_ct_get_tuplepr(skb,
-			       skb_network_offset(skb) + ip_hdrlen(skb)
-						       + sizeof(struct icmphdr),
-			       PF_INET, state->net, &origtuple)) {
-		pr_debug("icmp_error_message: failed to get tuple\n");
+	if (!nf_ct_get_tuplepr(skb, dataoff,
+			       state->pf, state->net, &origtuple))
 		return -NF_ACCEPT;
-	}

 	/* Ordinarily, we'd expect the inverted tupleproto, but it's
 	   been preserved inside the ICMP. */
-	if (!nf_ct_invert_tuple(&innertuple, &origtuple)) {
-		pr_debug("icmp_error_message: no match\n");
+	if (!nf_ct_invert_tuple(&innertuple, &origtuple))
+		return -NF_ACCEPT;
+
+	h = nf_conntrack_find_get(state->net, zone, &innertuple);
+	if (!h)
+		return -NF_ACCEPT;
+
+	/* Consider: A -> T (=This machine) -> B
+	 *   Conntrack entry will look like this:
+	 *      Original:  A->B
+	 *      Reply:     B->T (SNAT case) OR A
+	 *
+	 * When this function runs, we got packet that looks like this:
+	 * iphdr|icmphdr|inner_iphdr|l4header (tcp, udp, ..).
+	 *
+	 * Above nf_conntrack_find_get() makes lookup based on inner_hdr,
+	 * so we should expect that destination of the found connection
+	 * matches outer header destination address.
+	 *
+	 * In above example, we can consider these two cases:
+	 *  1. Error coming in reply direction from B or M (middle box) to
+	 *     T (SNAT case) or A.
+	 *     Inner saddr will be B, dst will be T or A.
+	 *     The found conntrack will be reply tuple (B->T/A).
+	 *  2. Error coming in original direction from A or M to B.
+	 *     Inner saddr will be A, inner daddr will be B.
+	 *     The found conntrack will be original tuple (A->B).
+	 *
+	 * In both cases, conntrack[dir].dst == inner.dst.
+	 *
+	 * A bogus packet could look like this:
+	 *   Inner: B->T
+	 *   Outer: B->X (other machine reachable by T).
+	 *
+	 * In this case, lookup yields connection A->B and will
+	 * set packet from B->X as *RELATED*, even though no connection
+	 * from X was ever seen.
+	 */
+	ct = nf_ct_tuplehash_to_ctrack(h);
+	dir = NF_CT_DIRECTION(h);
+	ct_daddr = &ct->tuplehash[dir].tuple.dst.u3;
+	if (!nf_inet_addr_cmp(outer_daddr, ct_daddr)) {
+		if (state->pf == AF_INET) {
+			nf_l4proto_log_invalid(skb, state->net, state->pf,
+					       l4proto,
+					       "outer daddr %pI4 != inner %pI4",
+					       &outer_daddr->ip, &ct_daddr->ip);
+		} else if (state->pf == AF_INET6) {
+			nf_l4proto_log_invalid(skb, state->net, state->pf,
+					       l4proto,
+					       "outer daddr %pI6 != inner %pI6",
+					       &outer_daddr->ip6, &ct_daddr->ip6);
+		}
+		nf_ct_put(ct);
 		return -NF_ACCEPT;
 	}

 	ctinfo = IP_CT_RELATED;
-
-	h = nf_conntrack_find_get(state->net, zone, &innertuple);
-	if (!h) {
-		pr_debug("icmp_error_message: no match\n");
-		return -NF_ACCEPT;
-	}
-
-	if (NF_CT_DIRECTION(h) == IP_CT_DIR_REPLY)
+	if (dir == IP_CT_DIR_REPLY)
 		ctinfo += IP_CT_IS_REPLY;

 	/* Update skb to refer to this connection */
-	nf_ct_set(skb, nf_ct_tuplehash_to_ctrack(h), ctinfo);
+	nf_ct_set(skb, ct, ctinfo);
 	return NF_ACCEPT;
 }

···
 		     struct sk_buff *skb, unsigned int dataoff,
 		     const struct nf_hook_state *state)
 {
+	union nf_inet_addr outer_daddr;
 	const struct icmphdr *icmph;
 	struct icmphdr _ih;

 	/* Not enough header? */
-	icmph = skb_header_pointer(skb, ip_hdrlen(skb), sizeof(_ih), &_ih);
+	icmph = skb_header_pointer(skb, dataoff, sizeof(_ih), &_ih);
 	if (icmph == NULL) {
 		icmp_error_log(skb, state, "short packet");
 		return -NF_ACCEPT;
···
 	    icmph->type != ICMP_REDIRECT)
 		return NF_ACCEPT;

-	return icmp_error_message(tmpl, skb, state);
+	memset(&outer_daddr, 0, sizeof(outer_daddr));
+	outer_daddr.ip = ip_hdr(skb)->daddr;
+
+	dataoff += sizeof(*icmph);
+	return nf_conntrack_inet_error(tmpl, skb, dataoff, state,
+				       IPPROTO_ICMP, &outer_daddr);
 }

 #if IS_ENABLED(CONFIG_NF_CT_NETLINK)
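The long comment in this hunk describes the new rule: an ICMP error is only marked RELATED if the outer destination address equals the destination of the conntrack tuple found through the embedded header. The following is a simplified userspace sketch of that rule, assuming IPv4 addresses reduced to plain integers. All type and function names are hypothetical stand-ins, not the kernel's structures, and the lookup is reduced to a comparison against a single connection.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for conntrack state (addresses only). */
struct sketch_tuple {
	uint32_t src;
	uint32_t dst;
};

struct sketch_conn {
	struct sketch_tuple original;	/* A -> B */
	struct sketch_tuple reply;	/* B -> T with SNAT, otherwise B -> A */
};

bool sketch_tuple_equal(const struct sketch_tuple *a,
			const struct sketch_tuple *b)
{
	return a->src == b->src && a->dst == b->dst;
}

/* Returns true if the ICMP error may be treated as RELATED to conn. */
bool sketch_icmp_error_related(const struct sketch_conn *conn,
			       struct sketch_tuple embedded,
			       uint32_t outer_daddr)
{
	/* Invert the embedded header, then look the result up. */
	struct sketch_tuple inverted = { embedded.dst, embedded.src };
	const struct sketch_tuple *found;

	if (sketch_tuple_equal(&inverted, &conn->original))
		found = &conn->original;
	else if (sketch_tuple_equal(&inverted, &conn->reply))
		found = &conn->reply;
	else
		return false;	/* embedded header matches no connection */

	/* The added check: outer destination must match the found tuple. */
	return found->dst == outer_daddr;
}
```

With a connection A->B that is SNATed to T, an error carrying the embedded header T->B is accepted when it is addressed to T and rejected when the same payload is addressed to an unrelated host X, which is the bogus-packet case the comment describes.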

+6 -46

net/netfilter/nf_conntrack_proto_icmpv6.c

···
 	return NF_ACCEPT;
 }

-static int
-icmpv6_error_message(struct net *net, struct nf_conn *tmpl,
-		     struct sk_buff *skb,
-		     unsigned int icmp6off)
-{
-	struct nf_conntrack_tuple intuple, origtuple;
-	const struct nf_conntrack_tuple_hash *h;
-	enum ip_conntrack_info ctinfo;
-	struct nf_conntrack_zone tmp;
-
-	WARN_ON(skb_nfct(skb));
-
-	/* Are they talking about one of our connections? */
-	if (!nf_ct_get_tuplepr(skb,
-			       skb_network_offset(skb)
-				+ sizeof(struct ipv6hdr)
-				+ sizeof(struct icmp6hdr),
-			       PF_INET6, net, &origtuple)) {
-		pr_debug("icmpv6_error: Can't get tuple\n");
-		return -NF_ACCEPT;
-	}
-
-	/* Ordinarily, we'd expect the inverted tupleproto, but it's
-	   been preserved inside the ICMP. */
-	if (!nf_ct_invert_tuple(&intuple, &origtuple)) {
-		pr_debug("icmpv6_error: Can't invert tuple\n");
-		return -NF_ACCEPT;
-	}
-
-	ctinfo = IP_CT_RELATED;
-
-	h = nf_conntrack_find_get(net, nf_ct_zone_tmpl(tmpl, skb, &tmp),
-				  &intuple);
-	if (!h) {
-		pr_debug("icmpv6_error: no match\n");
-		return -NF_ACCEPT;
-	} else {
-		if (NF_CT_DIRECTION(h) == IP_CT_DIR_REPLY)
-			ctinfo += IP_CT_IS_REPLY;
-	}
-
-	/* Update skb to refer to this connection */
-	nf_ct_set(skb, nf_ct_tuplehash_to_ctrack(h), ctinfo);
-	return NF_ACCEPT;
-}

 static void icmpv6_error_log(const struct sk_buff *skb,
 			     const struct nf_hook_state *state,
···
 			      unsigned int dataoff,
 			      const struct nf_hook_state *state)
 {
+	union nf_inet_addr outer_daddr;
 	const struct icmp6hdr *icmp6h;
 	struct icmp6hdr _ih;
 	int type;
···
 	if (icmp6h->icmp6_type >= 128)
 		return NF_ACCEPT;

-	return icmpv6_error_message(state->net, tmpl, skb, dataoff);
+	memcpy(&outer_daddr.ip6, &ipv6_hdr(skb)->daddr,
+	       sizeof(outer_daddr.ip6));
+	dataoff += sizeof(*icmp6h);
+	return nf_conntrack_inet_error(tmpl, skb, dataoff, state,
+				       IPPROTO_ICMPV6, &outer_daddr);
 }

 #if IS_ENABLED(CONFIG_NF_CT_NETLINK)

+8 -3

net/netfilter/nf_nat_core.c

···
 	case IPPROTO_ICMPV6:
 		/* id is same for either direction... */
 		keyptr = &tuple->src.u.icmp.id;
-		min = range->min_proto.icmp.id;
-		range_size = ntohs(range->max_proto.icmp.id) -
-			     ntohs(range->min_proto.icmp.id) + 1;
+		if (!(range->flags & NF_NAT_RANGE_PROTO_SPECIFIED)) {
+			min = 0;
+			range_size = 65536;
+		} else {
+			min = ntohs(range->min_proto.icmp.id);
+			range_size = ntohs(range->max_proto.icmp.id) -
+				     ntohs(range->min_proto.icmp.id) + 1;
+		}
 		goto find_free_id;
 #if IS_ENABLED(CONFIG_NF_CT_PROTO_GRE)
 	case IPPROTO_GRE:
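This hunk changes how the ICMP id search range is derived: without an explicit range the whole 16-bit id space is used, and with one the bounds are converted from network byte order before use. The following is a small userspace sketch of that computation. The struct, the flag value and the function name are hypothetical and only mirror the shape of the kernel code.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical flag value standing in for NF_NAT_RANGE_PROTO_SPECIFIED. */
#define SKETCH_RANGE_PROTO_SPECIFIED 0x1u

/* Hypothetical range description; ids are stored in network byte order. */
struct sketch_icmp_range {
	unsigned int flags;
	uint16_t min_id_be;
	uint16_t max_id_be;
};

/* Computes the first candidate id and the number of candidate ids. */
void sketch_icmp_id_range(const struct sketch_icmp_range *range,
			  unsigned int *min, unsigned int *range_size)
{
	if (!(range->flags & SKETCH_RANGE_PROTO_SPECIFIED)) {
		/* No range requested: any 16-bit id is a candidate. */
		*min = 0;
		*range_size = 65536;
	} else {
		/* Convert both bounds to host order before doing arithmetic. */
		*min = ntohs(range->min_id_be);
		*range_size = ntohs(range->max_id_be) -
			      ntohs(range->min_id_be) + 1;
	}
}
```

In this sketch, a range built from `htons(100)` and `htons(199)` with the flag set gives a minimum of 100 and a size of 100, while an unflagged range yields the full 0 to 65535 space.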

+1 -1

net/netfilter/nf_tables_api.c

···
 		if (IS_ERR(type))
 			return PTR_ERR(type);
 	}
-	if (!(type->hook_mask & (1 << hook->num)))
+	if (hook->num > NF_MAX_HOOKS || !(type->hook_mask & (1 << hook->num)))
 		return -EOPNOTSUPP;

 	if (type->type == NFT_CHAIN_T_NAT &&

+1 -1

net/netfilter/nfnetlink_log.c

···
 			goto nla_put_failure;
 	}

-	if (skb->tstamp) {
+	if (hooknum <= NF_INET_FORWARD && skb->tstamp) {
 		struct nfulnl_msg_packet_timestamp ts;
 		struct timespec64 kts = ktime_to_timespec64(skb->tstamp);
 		ts.sec = cpu_to_be64(kts.tv_sec);

+1 -1

net/netfilter/nfnetlink_queue.c

···
 		if (nfqnl_put_bridge(entry, skb) < 0)
 			goto nla_put_failure;

-	if (entskb->tstamp) {
+	if (entry->state.hook <= NF_INET_FORWARD && entskb->tstamp) {
 		struct nfqnl_msg_packet_timestamp ts;
 		struct timespec64 kts = ktime_to_timespec64(entskb->tstamp);

+14 -9

net/netfilter/xt_time.c

···
 	s64 stamp;

 	/*
-	 * We cannot use get_seconds() instead of __net_timestamp() here.
+	 * We need real time here, but we can neither use skb->tstamp
+	 * nor __net_timestamp().
+	 *
+	 * skb->tstamp and skb->skb_mstamp_ns overlap, however, they
+	 * use different clock types (real vs monotonic).
+	 *
 	 * Suppose you have two rules:
-	 * 	1. match before 13:00
-	 * 	2. match after 13:00
+	 *	1. match before 13:00
+	 *	2. match after 13:00
+	 *
 	 * If you match against processing time (get_seconds) it
 	 * may happen that the same packet matches both rules if
-	 * it arrived at the right moment before 13:00.
+	 * it arrived at the right moment before 13:00, so it would be
+	 * better to check skb->tstamp and set it via __net_timestamp()
+	 * if needed.  This however breaks outgoing packets tx timestamp,
+	 * and causes them to get delayed forever by fq packet scheduler.
 	 */
-	if (skb->tstamp == 0)
-		__net_timestamp((struct sk_buff *)skb);
-
-	stamp = ktime_to_ns(skb->tstamp);
-	stamp = div_s64(stamp, NSEC_PER_SEC);
+	stamp = get_seconds();

 	if (info->flags & XT_TIME_LOCAL_TZ)
 		/* Adjust for local timezone */

+11

net/rds/ib_fmr.c

···
 	else
 		pool = rds_ibdev->mr_1m_pool;

+	if (atomic_read(&pool->dirty_count) >= pool->max_items / 10)
+		queue_delayed_work(rds_ib_mr_wq, &pool->flush_worker, 10);
+
+	/* Switch pools if one of the pool is reaching upper limit */
+	if (atomic_read(&pool->dirty_count) >= pool->max_items * 9 / 10) {
+		if (pool->pool_type == RDS_IB_MR_8K_POOL)
+			pool = rds_ibdev->mr_1m_pool;
+		else
+			pool = rds_ibdev->mr_8k_pool;
+	}
+
 	ibmr = rds_ib_try_reuse_ibmr(pool);
 	if (ibmr)
 		return ibmr;
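The two thresholds added in this hunk act at different fill levels: a flush is scheduled once 10% of a pool's items are dirty, and allocation falls over to the other pool once 90% are dirty. The following is a minimal sketch of just that arithmetic, using plain integers instead of atomics. The struct and function names are hypothetical and not part of the RDS code.

```c
#include <stdbool.h>

/* Hypothetical, simplified pool state (the kernel uses atomic counters). */
struct sketch_pool {
	int dirty_count;
	int max_items;
};

/* True once at least a tenth of the pool is dirty: time to queue a flush. */
bool sketch_pool_needs_flush(const struct sketch_pool *pool)
{
	return pool->dirty_count >= pool->max_items / 10;
}

/* Picks the pool to allocate from: the preferred one unless it is ~90% dirty. */
const struct sketch_pool *sketch_pick_pool(const struct sketch_pool *preferred,
					   const struct sketch_pool *other)
{
	if (preferred->dirty_count >= preferred->max_items * 9 / 10)
		return other;
	return preferred;
}
```

For a pool of 100 items in this sketch, the flush condition becomes true at 10 dirty items and the switch to the other pool happens at 90.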

-3

net/rds/ib_rdma.c

···
 	struct rds_ib_mr *ibmr = NULL;
 	int iter = 0;

-	if (atomic_read(&pool->dirty_count) >= pool->max_items_soft / 10)
-		queue_delayed_work(rds_ib_mr_wq, &pool->flush_worker, 10);
-
 	while (1) {
 		ibmr = rds_ib_reuse_mr(pool);
 		if (ibmr)

+16 -11

net/rose/rose_loopback.c

···
 #include <linux/init.h>

 static struct sk_buff_head loopback_queue;
+#define ROSE_LOOPBACK_LIMIT 1000
 static struct timer_list loopback_timer;

 static void rose_set_loopback_timer(void);
···

 int rose_loopback_queue(struct sk_buff *skb, struct rose_neigh *neigh)
 {
-	struct sk_buff *skbn;
+	struct sk_buff *skbn = NULL;

-	skbn = skb_clone(skb, GFP_ATOMIC);
+	if (skb_queue_len(&loopback_queue) < ROSE_LOOPBACK_LIMIT)
+		skbn = skb_clone(skb, GFP_ATOMIC);

-	kfree_skb(skb);
-
-	if (skbn != NULL) {
+	if (skbn) {
+		consume_skb(skb);
 		skb_queue_tail(&loopback_queue, skbn);

 		if (!rose_loopback_running())
 			rose_set_loopback_timer();
+	} else {
+		kfree_skb(skb);
 	}

 	return 1;
 }

-
 static void rose_set_loopback_timer(void)
 {
-	del_timer(&loopback_timer);
-
-	loopback_timer.expires = jiffies + 10;
-	add_timer(&loopback_timer);
+	mod_timer(&loopback_timer, jiffies + 10);
 }

 static void rose_loopback_timer(struct timer_list *unused)
···
 	struct sock *sk;
 	unsigned short frametype;
 	unsigned int lci_i, lci_o;
+	int count;

-	while ((skb = skb_dequeue(&loopback_queue)) != NULL) {
+	for (count = 0; count < ROSE_LOOPBACK_LIMIT; count++) {
+		skb = skb_dequeue(&loopback_queue);
+		if (!skb)
+			return;
 		if (skb->len < ROSE_MIN_LEN) {
 			kfree_skb(skb);
 			continue;
···
 			kfree_skb(skb);
 		}
 	}
+	if (!skb_queue_empty(&loopback_queue))
+		mod_timer(&loopback_timer, jiffies + 1);
 }

 void __exit rose_loopback_clear(void)
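The rose change bounds the work done in one timer run and re-arms the timer if frames remain. The following is a userspace sketch of that budgeted-drain pattern, with the queue reduced to a counter. The names are hypothetical, and the `requeue` parameter is an assumption added to model a frame whose handling puts another frame back on the loopback queue.

```c
#include <stdbool.h>

/* Mirrors ROSE_LOOPBACK_LIMIT from the patch. */
#define SKETCH_LOOPBACK_LIMIT 1000u

/* Hypothetical queue model: only the number of queued frames is tracked. */
struct sketch_queue {
	unsigned int len;
};

/*
 * One timer run: handle at most SKETCH_LOOPBACK_LIMIT frames.
 * If requeue is true, handling a frame enqueues another one.
 * Returns true if frames remain and the timer has to be re-armed.
 */
bool sketch_timer_run(struct sketch_queue *q, bool requeue,
		      unsigned int *handled)
{
	unsigned int count;

	*handled = 0;
	for (count = 0; count < SKETCH_LOOPBACK_LIMIT; count++) {
		if (q->len == 0)
			return false;	/* queue drained, nothing to re-arm */
		q->len--;
		(*handled)++;
		if (requeue)
			q->len++;
	}
	return q->len != 0;
}
```

With `requeue` set, the pre-patch `while` loop would never terminate in this model. The budgeted loop instead stops after 1000 frames and reports that the timer must fire again.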

+8 -4

net/rxrpc/input.c

···
  * handle data received on the local endpoint
  * - may be called in interrupt context
  *
- * The socket is locked by the caller and this prevents the socket from being
- * shut down and the local endpoint from going away, thus sk_user_data will not
- * be cleared until this function returns.
+ * [!] Note that as this is called from the encap_rcv hook, the socket is not
+ * held locked by the caller and nothing prevents sk_user_data on the UDP from
+ * being cleared in the middle of processing this function.
  *
  * Called with the RCU read lock held from the IP layer via UDP.
  */
 int rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 {
+	struct rxrpc_local *local = rcu_dereference_sk_user_data(udp_sk);
 	struct rxrpc_connection *conn;
 	struct rxrpc_channel *chan;
 	struct rxrpc_call *call = NULL;
 	struct rxrpc_skb_priv *sp;
-	struct rxrpc_local *local = udp_sk->sk_user_data;
 	struct rxrpc_peer *peer = NULL;
 	struct rxrpc_sock *rx = NULL;
 	unsigned int channel;
···

 	_enter("%p", udp_sk);

+	if (unlikely(!local)) {
+		kfree_skb(skb);
+		return 0;
+	}
 	if (skb->tstamp == 0)
 		skb->tstamp = ktime_get_real();

+2 -1

net/rxrpc/local_object.c

···
 	ret = -ENOMEM;
 sock_error:
 	mutex_unlock(&rxnet->local_mutex);
-	kfree(local);
+	if (local)
+		call_rcu(&local->rcu, rxrpc_local_rcu);
 	_leave(" = %d", ret);
 	return ERR_PTR(ret);

+2 -2

net/tls/tls_device.c

···
 	goto release_netdev;

 free_sw_resources:
+	up_read(&device_offload_lock);
 	tls_sw_free_resources_rx(sk);
+	down_read(&device_offload_lock);
 release_ctx:
 	ctx->priv_ctx_rx = NULL;
 release_netdev:
···
 	}
 out:
 	up_read(&device_offload_lock);
-	kfree(tls_ctx->rx.rec_seq);
-	kfree(tls_ctx->rx.iv);
 	tls_sw_release_resources_rx(sk);
 }

+10 -3

net/tls/tls_device_fallback.c

···

 static void complete_skb(struct sk_buff *nskb, struct sk_buff *skb, int headln)
 {
+	struct sock *sk = skb->sk;
+	int delta;
+
 	skb_copy_header(nskb, skb);

 	skb_put(nskb, skb->len);
···
 	update_chksum(nskb, headln);

 	nskb->destructor = skb->destructor;
-	nskb->sk = skb->sk;
+	nskb->sk = sk;
 	skb->destructor = NULL;
 	skb->sk = NULL;
-	refcount_add(nskb->truesize - skb->truesize,
-		     &nskb->sk->sk_wmem_alloc);
+
+	delta = nskb->truesize - skb->truesize;
+	if (likely(delta < 0))
+		WARN_ON_ONCE(refcount_sub_and_test(-delta, &sk->sk_wmem_alloc));
+	else if (delta)
+		refcount_add(delta, &sk->sk_wmem_alloc);
 }

 /* This function may be called after the user socket is already
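The refcount fix in this hunk splits a signed size difference into an explicit subtract or add instead of passing a possibly negative value to an add-only helper. The following is a userspace sketch of that idea on a plain unsigned counter. The function name and the error return are hypothetical: the kernel code warns through `WARN_ON_ONCE()` rather than returning an error.

```c
/*
 * Hypothetical helper: move accounting from a buffer of old_truesize bytes
 * to one of new_truesize bytes on an unsigned counter.
 * Returns 0 on success, -1 if the counter would drop to zero or below
 * (the situation the kernel code flags with a warning).
 */
int sketch_adjust_wmem(unsigned int *wmem_alloc,
		       unsigned int new_truesize, unsigned int old_truesize)
{
	int delta = (int)new_truesize - (int)old_truesize;

	if (delta < 0) {
		unsigned int sub = (unsigned int)(-delta);

		/* Subtract explicitly; never feed a negative value to an add. */
		if (sub >= *wmem_alloc)
			return -1;
		*wmem_alloc -= sub;
	} else if (delta) {
		*wmem_alloc += (unsigned int)delta;
	}
	return 0;
}
```

In this sketch, replacing a 500-byte buffer with a 300-byte one lowers a counter of 1000 to 800, where a naive unsigned add of the negative difference would have wrapped around instead.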

+1 -4

net/tls/tls_main.c

···
 #endif
 	}

-	if (ctx->rx_conf == TLS_SW) {
-		kfree(ctx->rx.rec_seq);
-		kfree(ctx->rx.iv);
+	if (ctx->rx_conf == TLS_SW)
 		tls_sw_free_resources_rx(sk);
-	}

 #ifdef CONFIG_TLS_DEVICE
 	if (ctx->rx_conf == TLS_HW)

net/tls/tls_sw.c

···
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
 	struct tls_sw_context_rx *ctx = tls_sw_ctx_rx(tls_ctx);

+	kfree(tls_ctx->rx.rec_seq);
+	kfree(tls_ctx->rx.iv);
+
 	if (ctx->aead_recv) {
 		kfree_skb(ctx->recv_pkt);
 		ctx->recv_pkt = NULL;

tools/testing/selftests/net/run_afpackettests

···
 	exit 0
 fi

+ret=0
 echo "--------------------"
 echo "running psock_fanout test"
 echo "--------------------"
 ./in_netns.sh ./psock_fanout
 if [ $? -ne 0 ]; then
 	echo "[FAIL]"
+	ret=1
 else
 	echo "[PASS]"
 fi
···
 ./in_netns.sh ./psock_tpacket
 if [ $? -ne 0 ]; then
 	echo "[FAIL]"
+	ret=1
 else
 	echo "[PASS]"
 fi
···
 ./in_netns.sh ./txring_overwrite
 if [ $? -ne 0 ]; then
 	echo "[FAIL]"
+	ret=1
 else
 	echo "[PASS]"
 fi
+exit $ret

+1 -1

tools/testing/selftests/net/run_netsocktests

···
 ./socket
 if [ $? -ne 0 ]; then
 	echo "[FAIL]"
+	exit 1
 else
 	echo "[PASS]"
 fi
-

+1 -1

tools/testing/selftests/netfilter/Makefile

···
 # SPDX-License-Identifier: GPL-2.0
 # Makefile for netfilter selftests

-TEST_PROGS := nft_trans_stress.sh nft_nat.sh
+TEST_PROGS := nft_trans_stress.sh nft_nat.sh conntrack_icmp_related.sh

 include ../lib.mk

+283

tools/testing/selftests/netfilter/conntrack_icmp_related.sh

··· 1 + #!/bin/bash 2 + # 3 + # check that ICMP df-needed/pkttoobig icmp are set are set as related 4 + # state 5 + # 6 + # Setup is: 7 + # 8 + # nsclient1 -> nsrouter1 -> nsrouter2 -> nsclient2 9 + # MTU 1500, except for nsrouter2 <-> nsclient2 link (1280). 10 + # ping nsclient2 from nsclient1, checking that conntrack did set RELATED 11 + # 'fragmentation needed' icmp packet. 12 + # 13 + # In addition, nsrouter1 will perform IP masquerading, i.e. also 14 + # check the icmp errors are propagated to the correct host as per 15 + # nat of "established" icmp-echo "connection". 16 + 17 + # Kselftest framework requirement - SKIP code is 4. 18 + ksft_skip=4 19 + ret=0 20 + 21 + nft --version > /dev/null 2>&1 22 + if [ $? -ne 0 ];then 23 + echo "SKIP: Could not run test without nft tool" 24 + exit $ksft_skip 25 + fi 26 + 27 + ip -Version > /dev/null 2>&1 28 + if [ $? -ne 0 ];then 29 + echo "SKIP: Could not run test without ip tool" 30 + exit $ksft_skip 31 + fi 32 + 33 + cleanup() { 34 + for i in 1 2;do ip netns del nsclient$i;done 35 + for i in 1 2;do ip netns del nsrouter$i;done 36 + } 37 + 38 + ipv4() { 39 + echo -n 192.168.$1.2 40 + } 41 + 42 + ipv6 () { 43 + echo -n dead:$1::2 44 + } 45 + 46 + check_counter() 47 + { 48 + ns=$1 49 + name=$2 50 + expect=$3 51 + local lret=0 52 + 53 + cnt=$(ip netns exec $ns nft list counter inet filter "$name" | grep -q "$expect") 54 + if [ $? -ne 0 ]; then 55 + echo "ERROR: counter $name in $ns has unexpected value (expected $expect)" 1>&2 56 + ip netns exec $ns nft list counter inet filter "$name" 1>&2 57 + lret=1 58 + fi 59 + 60 + return $lret 61 + } 62 + 63 + check_unknown() 64 + { 65 + expect="packets 0 bytes 0" 66 + for n in nsclient1 nsclient2 nsrouter1 nsrouter2; do 67 + check_counter $n "unknown" "$expect" 68 + if [ $? 
-ne 0 ] ;then 69 + return 1 70 + fi 71 + done 72 + 73 + return 0 74 + } 75 + 76 + for n in nsclient1 nsclient2 nsrouter1 nsrouter2; do 77 + ip netns add $n 78 + ip -net $n link set lo up 79 + done 80 + 81 + DEV=veth0 82 + ip link add $DEV netns nsclient1 type veth peer name eth1 netns nsrouter1 83 + DEV=veth0 84 + ip link add $DEV netns nsclient2 type veth peer name eth1 netns nsrouter2 85 + 86 + DEV=veth0 87 + ip link add $DEV netns nsrouter1 type veth peer name eth2 netns nsrouter2 88 + 89 + DEV=veth0 90 + for i in 1 2; do 91 + ip -net nsclient$i link set $DEV up 92 + ip -net nsclient$i addr add $(ipv4 $i)/24 dev $DEV 93 + ip -net nsclient$i addr add $(ipv6 $i)/64 dev $DEV 94 + done 95 + 96 + ip -net nsrouter1 link set eth1 up 97 + ip -net nsrouter1 link set veth0 up 98 + 99 + ip -net nsrouter2 link set eth1 up 100 + ip -net nsrouter2 link set eth2 up 101 + 102 + ip -net nsclient1 route add default via 192.168.1.1 103 + ip -net nsclient1 -6 route add default via dead:1::1 104 + 105 + ip -net nsclient2 route add default via 192.168.2.1 106 + ip -net nsclient2 route add default via dead:2::1 107 + 108 + i=3 109 + ip -net nsrouter1 addr add 192.168.1.1/24 dev eth1 110 + ip -net nsrouter1 addr add 192.168.3.1/24 dev veth0 111 + ip -net nsrouter1 addr add dead:1::1/64 dev eth1 112 + ip -net nsrouter1 addr add dead:3::1/64 dev veth0 113 + ip -net nsrouter1 route add default via 192.168.3.10 114 + ip -net nsrouter1 -6 route add default via dead:3::10 115 + 116 + ip -net nsrouter2 addr add 192.168.2.1/24 dev eth1 117 + ip -net nsrouter2 addr add 192.168.3.10/24 dev eth2 118 + ip -net nsrouter2 addr add dead:2::1/64 dev eth1 119 + ip -net nsrouter2 addr add dead:3::10/64 dev eth2 120 + ip -net nsrouter2 route add default via 192.168.3.1 121 + ip -net nsrouter2 route add default via dead:3::1 122 + 123 + sleep 2 124 + for i in 4 6; do 125 + ip netns exec nsrouter1 sysctl -q net.ipv$i.conf.all.forwarding=1 126 + ip netns exec nsrouter2 sysctl -q 
net.ipv$i.conf.all.forwarding=1 127 + done 128 + 129 + for netns in nsrouter1 nsrouter2; do 130 + ip netns exec $netns nft -f - <<EOF 131 + table inet filter { 132 + counter unknown { } 133 + counter related { } 134 + chain forward { 135 + type filter hook forward priority 0; policy accept; 136 + meta l4proto icmpv6 icmpv6 type "packet-too-big" ct state "related" counter name "related" accept 137 + meta l4proto icmp icmp type "destination-unreachable" ct state "related" counter name "related" accept 138 + meta l4proto { icmp, icmpv6 } ct state new,established accept 139 + counter name "unknown" drop 140 + } 141 + } 142 + EOF 143 + done 144 + 145 + ip netns exec nsclient1 nft -f - <<EOF 146 + table inet filter { 147 + counter unknown { } 148 + counter related { } 149 + chain input { 150 + type filter hook input priority 0; policy accept; 151 + meta l4proto { icmp, icmpv6 } ct state established,untracked accept 152 + 153 + meta l4proto { icmp, icmpv6 } ct state "related" counter name "related" accept 154 + counter name "unknown" drop 155 + } 156 + } 157 + EOF 158 + 159 + ip netns exec nsclient2 nft -f - <<EOF 160 + table inet filter { 161 + counter unknown { } 162 + counter new { } 163 + counter established { } 164 + 165 + chain input { 166 + type filter hook input priority 0; policy accept; 167 + meta l4proto { icmp, icmpv6 } ct state established,untracked accept 168 + 169 + meta l4proto { icmp, icmpv6 } ct state "new" counter name "new" accept 170 + meta l4proto { icmp, icmpv6 } ct state "established" counter name "established" accept 171 + counter name "unknown" drop 172 + } 173 + chain output { 174 + type filter hook output priority 0; policy accept; 175 + meta l4proto { icmp, icmpv6 } ct state established,untracked accept 176 + 177 + meta l4proto { icmp, icmpv6 } ct state "new" counter name "new" 178 + meta l4proto { icmp, icmpv6 } ct state "established" counter name "established" 179 + counter name "unknown" drop 180 + } 181 + } 182 + EOF 183 + 184 + 185 + # 
make sure NAT core rewrites adress of icmp error if nat is used according to 186 + # conntrack nat information (icmp error will be directed at nsrouter1 address, 187 + # but it needs to be routed to nsclient1 address). 188 + ip netns exec nsrouter1 nft -f - <<EOF 189 + table ip nat { 190 + chain postrouting { 191 + type nat hook postrouting priority 0; policy accept; 192 + ip protocol icmp oifname "veth0" counter masquerade 193 + } 194 + } 195 + table ip6 nat { 196 + chain postrouting { 197 + type nat hook postrouting priority 0; policy accept; 198 + ip6 nexthdr icmpv6 oifname "veth0" counter masquerade 199 + } 200 + } 201 + EOF 202 + 203 + ip netns exec nsrouter2 ip link set eth1 mtu 1280 204 + ip netns exec nsclient2 ip link set veth0 mtu 1280 205 + sleep 1 206 + 207 + ip netns exec nsclient1 ping -c 1 -s 1000 -q -M do 192.168.2.2 >/dev/null 208 + if [ $? -ne 0 ]; then 209 + echo "ERROR: netns ip routing/connectivity broken" 1>&2 210 + cleanup 211 + exit 1 212 + fi 213 + ip netns exec nsclient1 ping6 -q -c 1 -s 1000 dead:2::2 >/dev/null 214 + if [ $? -ne 0 ]; then 215 + echo "ERROR: netns ipv6 routing/connectivity broken" 1>&2 216 + cleanup 217 + exit 1 218 + fi 219 + 220 + check_unknown 221 + if [ $? -ne 0 ]; then 222 + ret=1 223 + fi 224 + 225 + expect="packets 0 bytes 0" 226 + for netns in nsrouter1 nsrouter2 nsclient1;do 227 + check_counter "$netns" "related" "$expect" 228 + if [ $? -ne 0 ]; then 229 + ret=1 230 + fi 231 + done 232 + 233 + expect="packets 2 bytes 2076" 234 + check_counter nsclient2 "new" "$expect" 235 + if [ $? -ne 0 ]; then 236 + ret=1 237 + fi 238 + 239 + ip netns exec nsclient1 ping -q -c 1 -s 1300 -M do 192.168.2.2 > /dev/null 240 + if [ $? -eq 0 ]; then 241 + echo "ERROR: ping should have failed with PMTU too big error" 1>&2 242 + ret=1 243 + fi 244 + 245 + # nsrouter2 should have generated the icmp error, so 246 + # related counter should be 0 (its in forward). 
247 + expect="packets 0 bytes 0" 248 + check_counter "nsrouter2" "related" "$expect" 249 + if [ $? -ne 0 ]; then 250 + ret=1 251 + fi 252 + 253 + # but nsrouter1 should have seen it, same for nsclient1. 254 + expect="packets 1 bytes 576" 255 + for netns in nsrouter1 nsclient1;do 256 + check_counter "$netns" "related" "$expect" 257 + if [ $? -ne 0 ]; then 258 + ret=1 259 + fi 260 + done 261 + 262 + ip netns exec nsclient1 ping6 -c 1 -s 1300 dead:2::2 > /dev/null 263 + if [ $? -eq 0 ]; then 264 + echo "ERROR: ping6 should have failed with PMTU too big error" 1>&2 265 + ret=1 266 + fi 267 + 268 + expect="packets 2 bytes 1856" 269 + for netns in nsrouter1 nsclient1;do 270 + check_counter "$netns" "related" "$expect" 271 + if [ $? -ne 0 ]; then 272 + ret=1 273 + fi 274 + done 275 + 276 + if [ $ret -eq 0 ];then 277 + echo "PASS: icmp mtu error had RELATED state" 278 + else 279 + echo "ERROR: icmp error RELATED state test has failed" 280 + fi 281 + 282 + cleanup 283 + exit $ret

+27 -9

tools/testing/selftests/netfilter/nft_nat.sh

··· 321 321 322 322 test_masquerade6() 323 323 { 324 + local natflags=$1 324 325 local lret=0 325 326 326 327 ip netns exec ns0 sysctl net.ipv6.conf.all.forwarding=1 > /dev/null ··· 355 354 table ip6 nat { 356 355 chain postrouting { 357 356 type nat hook postrouting priority 0; policy accept; 358 - meta oif veth0 masquerade 357 + meta oif veth0 masquerade $natflags 359 358 } 360 359 } 361 360 EOF 362 361 ip netns exec ns2 ping -q -c 1 dead:1::99 > /dev/null # ping ns2->ns1 363 362 if [ $? -ne 0 ] ; then 364 - echo "ERROR: cannot ping ns1 from ns2 with active ipv6 masquerading" 363 + echo "ERROR: cannot ping ns1 from ns2 with active ipv6 masquerade $natflags" 365 364 lret=1 366 365 fi 367 366 ··· 398 397 fi 399 398 done 400 399 400 + ip netns exec ns2 ping -q -c 1 dead:1::99 > /dev/null # ping ns2->ns1 401 + if [ $? -ne 0 ] ; then 402 + echo "ERROR: cannot ping ns1 from ns2 with active ipv6 masquerade $natflags (attempt 2)" 403 + lret=1 404 + fi 405 + 401 406 ip netns exec ns0 nft flush chain ip6 nat postrouting 402 407 if [ $? -ne 0 ]; then 403 408 echo "ERROR: Could not flush ip6 nat postrouting" 1>&2 404 409 lret=1 405 410 fi 406 411 407 - test $lret -eq 0 && echo "PASS: IPv6 masquerade for ns2" 412 + test $lret -eq 0 && echo "PASS: IPv6 masquerade $natflags for ns2" 408 413 409 414 return $lret 410 415 } 411 416 412 417 test_masquerade() 413 418 { 419 + local natflags=$1 414 420 local lret=0 415 421 416 422 ip netns exec ns0 sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null ··· 425 417 426 418 ip netns exec ns2 ping -q -c 1 10.0.1.99 > /dev/null # ping ns2->ns1 427 419 if [ $? 
-ne 0 ] ; then 428 - echo "ERROR: canot ping ns1 from ns2" 420 + echo "ERROR: cannot ping ns1 from ns2 $natflags" 429 421 lret=1 430 422 fi 431 423 ··· 451 443 table ip nat { 452 444 chain postrouting { 453 445 type nat hook postrouting priority 0; policy accept; 454 - meta oif veth0 masquerade 446 + meta oif veth0 masquerade $natflags 455 447 } 456 448 } 457 449 EOF 458 450 ip netns exec ns2 ping -q -c 1 10.0.1.99 > /dev/null # ping ns2->ns1 459 451 if [ $? -ne 0 ] ; then 460 - echo "ERROR: cannot ping ns1 from ns2 with active ip masquerading" 452 + echo "ERROR: cannot ping ns1 from ns2 with active ip masquere $natflags" 461 453 lret=1 462 454 fi 463 455 ··· 493 485 fi 494 486 done 495 487 488 + ip netns exec ns2 ping -q -c 1 10.0.1.99 > /dev/null # ping ns2->ns1 489 + if [ $? -ne 0 ] ; then 490 + echo "ERROR: cannot ping ns1 from ns2 with active ip masquerade $natflags (attempt 2)" 491 + lret=1 492 + fi 493 + 496 494 ip netns exec ns0 nft flush chain ip nat postrouting 497 495 if [ $? -ne 0 ]; then 498 496 echo "ERROR: Could not flush nat postrouting" 1>&2 499 497 lret=1 500 498 fi 501 499 502 - test $lret -eq 0 && echo "PASS: IP masquerade for ns2" 500 + test $lret -eq 0 && echo "PASS: IP masquerade $natflags for ns2" 503 501 504 502 return $lret 505 503 } ··· 764 750 test_local_dnat6 765 751 766 752 reset_counters 767 - test_masquerade 768 - test_masquerade6 753 + test_masquerade "" 754 + test_masquerade6 "" 755 + 756 + reset_counters 757 + test_masquerade "fully-random" 758 + test_masquerade6 "fully-random" 769 759 770 760 reset_counters 771 761 test_redirect
