Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

bpf: Support getting tunnel flags

Existing 'bpf_skb_get_tunnel_key' extracts various tunnel parameters
(id, ttl, tos, local and remote) but does not expose ip_tunnel_info's
tun_flags to the BPF program.

It makes sense to expose tun_flags to the BPF program.

Assume, for example, multiple GRE tunnels maintained on a single GRE
interface in collect_md mode. The program expects origins to initiate
over GRE; however, different origins use different GRE characteristics
(e.g. some prefer to use GRE checksum, some do not; some pass a GRE key,
some do not, etc.).

A BPF program getting tun_flags can therefore remember the relevant
flags (e.g. TUNNEL_CSUM, TUNNEL_SEQ...) for each initiating remote. In
the reply path, the program can use 'bpf_skb_set_tunnel_key' in order
to correctly reply to the remote, using similar characteristics, based
on the stored tunnel flags.
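
The reply-path idea above can be sketched in plain userspace C. This is
only an illustrative model, not BPF or kernel code: the X_-prefixed
constants are local copies assumed to match the UAPI values in
<linux/if_tunnel.h> and <linux/bpf.h>, and reply_flags_for() is a
hypothetical helper name that does not appear in the patch.

```c
#include <arpa/inet.h>	/* htons */
#include <assert.h>
#include <stdint.h>

/* Local stand-ins, assumed to mirror the UAPI definitions.
 * TUNNEL_xxx values are big-endian 16-bit; BPF_F_yyy are host-order. */
#define X_TUNNEL_CSUM		htons(0x01)
#define X_TUNNEL_SEQ		htons(0x08)
#define X_BPF_F_ZERO_CSUM_TX	(1ULL << 1)
#define X_BPF_F_SEQ_NUMBER	(1ULL << 3)

/* Hypothetical helper: given the tun_flags remembered for a remote,
 * pick bpf_skb_set_tunnel_key flags that reply with similar GRE
 * characteristics. */
static uint64_t reply_flags_for(uint16_t tun_flags_be)
{
	uint64_t set_flags = 0;

	/* Remote did not use a GRE checksum: do not send one either. */
	if (!(tun_flags_be & X_TUNNEL_CSUM))
		set_flags |= X_BPF_F_ZERO_CSUM_TX;

	/* Remote used sequence numbers: reply with them too. */
	if (tun_flags_be & X_TUNNEL_SEQ)
		set_flags |= X_BPF_F_SEQ_NUMBER;

	return set_flags;
}

/* Usage example: two remotes with different characteristics. */
void reply_flags_selfcheck(void)
{
	/* Checksumming remote: no special flags needed on reply. */
	assert(reply_flags_for(X_TUNNEL_CSUM) == 0);
	/* Plain remote: suppress the checksum on reply. */
	assert(reply_flags_for(0) == X_BPF_F_ZERO_CSUM_TX);
}
```

In a real BPF program the stored value would come from
'bpf_tunnel_key->tunnel_flags' and would typically live in a map keyed
by the remote address; that part is left out of this sketch.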

Introduce the BPF_F_TUNINFO_FLAGS flag for bpf_skb_get_tunnel_key. If
specified, 'bpf_tunnel_key->tunnel_flags' is set to the tunnel's tun_flags.
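
As a rough userspace model of this behaviour (a sketch based on the
net/core/filter.c hunk in this commit, not the helper itself), the flag
check and the conditional fill might look as follows. The function name
model_fill_tunnel_field() and the X_-prefixed constants are made up for
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Local stand-ins for the helper's flag bits. */
#define X_BPF_F_TUNINFO_IPV6	(1ULL << 0)
#define X_BPF_F_TUNINFO_FLAGS	(1ULL << 4)

/* Hypothetical model: 'field' plays the role of the 16-bit
 * tunnel_ext/tunnel_flags slot in struct bpf_tunnel_key.
 * Returns 0 on success or -EINVAL for unknown flag bits. */
static int model_fill_tunnel_field(uint64_t flags, uint16_t tun_flags,
				   uint16_t *field)
{
	/* Only the two known flag bits are accepted. */
	if (flags & ~(X_BPF_F_TUNINFO_IPV6 | X_BPF_F_TUNINFO_FLAGS))
		return -EINVAL;

	/* With the new flag the raw tun_flags are exposed; without it
	 * the slot is zeroed, as it was before this change. */
	if (flags & X_BPF_F_TUNINFO_FLAGS)
		*field = tun_flags;
	else
		*field = 0;

	return 0;
}

/* Usage example: callers that do not pass the flag still see zero. */
void model_fill_selfcheck(void)
{
	uint16_t field = 0xffff;

	assert(model_fill_tunnel_field(0, 0x0900, &field) == 0);
	assert(field == 0);
}
```

The point of the sketch is that existing callers, which never set the
new bit, keep seeing a zeroed field.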

The existing unused 'tunnel_ext' field is used as the storage for
'tunnel_flags' in order to avoid changing bpf_tunnel_key's layout.
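
The layout claim can be checked with a small userspace sketch. The two
structs below are simplified stand-ins containing only the fields around
'tunnel_ext' (they are not the full struct bpf_tunnel_key), and uint16_t
is used in place of the kernel's __u16/__be16 types.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/* Simplified "before" view of the relevant fields. */
struct key_before {
	uint8_t  tunnel_tos;
	uint8_t  tunnel_ttl;
	uint16_t tunnel_ext;	/* Padding, future use. */
	uint32_t tunnel_label;
};

/* Simplified "after" view: same slot, now an anonymous union. */
struct key_after {
	uint8_t  tunnel_tos;
	uint8_t  tunnel_ttl;
	union {
		uint16_t tunnel_ext;	/* compat */
		uint16_t tunnel_flags;	/* __be16 in the real header */
	};
	uint32_t tunnel_label;
};

/* Compile-time checks: size and field offsets are unchanged. */
static_assert(sizeof(struct key_before) == sizeof(struct key_after),
	      "struct size must not change");
static_assert(offsetof(struct key_before, tunnel_ext) ==
	      offsetof(struct key_after, tunnel_flags),
	      "tunnel_flags must reuse the tunnel_ext slot");
static_assert(offsetof(struct key_before, tunnel_label) ==
	      offsetof(struct key_after, tunnel_label),
	      "following fields must not move");

/* Runtime view of the same property: both names alias one slot. */
int union_members_alias(void)
{
	struct key_after k = { 0 };

	k.tunnel_flags = 0x1234;
	return k.tunnel_ext == 0x1234;
}
```

Because both names alias the same 16 bits, programs built against the
old header that only read or zero 'tunnel_ext' continue to compile and
see the same offsets.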

The following alternatives were also considered during the design:

1. Convert the "interesting" internal TUNNEL_xxx flags back to BPF_F_yyy
and place into the new 'tunnel_flags' field. This has 2 drawbacks:

- The BPF_F_yyy flags are from *set_tunnel_key* enumeration space,
e.g. BPF_F_ZERO_CSUM_TX. It is awkward that it is "returned" into
tunnel_flags from a *get_tunnel_key* call.
- Not all "interesting" TUNNEL_xxx flags can be mapped to existing
BPF_F_yyy flags, and it doesn't make sense to create new BPF_F_yyy
flags just for purposes of the returned tunnel_flags.

2. Place key.tun_flags into 'tunnel_flags' but mask them, keeping only
"interesting" flags. That's ok, but the drawback is that what's
"interesting" for my use case might be limiting for other use cases.

Therefore I decided to expose what's in key.tun_flags *as is*, which seems
most flexible. The BPF user can simply ignore the bits they are not
interested in. The TUNNEL_xxx flags are also UAPI, so there is no harm in
exposing them back in the get_tunnel_key call.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220831144010.174110-1-shmulik.ladkani@gmail.com

Authored by Shmulik Ladkani and committed by Daniel Borkmann
44c51472 dc84dbbc

+24 -4
+9 -1
include/uapi/linux/bpf.h
···
 	BPF_F_SEQ_NUMBER		= (1ULL << 3),
 };
 
+/* BPF_FUNC_skb_get_tunnel_key flags. */
+enum {
+	BPF_F_TUNINFO_FLAGS		= (1ULL << 4),
+};
+
 /* BPF_FUNC_perf_event_output, BPF_FUNC_perf_event_read and
  * BPF_FUNC_perf_event_read_value flags.
  */
···
 	};
 	__u8 tunnel_tos;
 	__u8 tunnel_ttl;
-	__u16 tunnel_ext;	/* Padding, future use. */
+	union {
+		__u16 tunnel_ext;	/* compat */
+		__be16 tunnel_flags;
+	};
 	__u32 tunnel_label;
 	union {
 		__u32 local_ipv4;
+6 -2
net/core/filter.c
···
 	void *to_orig = to;
 	int err;
 
-	if (unlikely(!info || (flags & ~(BPF_F_TUNINFO_IPV6)))) {
+	if (unlikely(!info || (flags & ~(BPF_F_TUNINFO_IPV6 |
+					 BPF_F_TUNINFO_FLAGS)))) {
 		err = -EINVAL;
 		goto err_clear;
 	}
···
 	to->tunnel_id = be64_to_cpu(info->key.tun_id);
 	to->tunnel_tos = info->key.tos;
 	to->tunnel_ttl = info->key.ttl;
-	to->tunnel_ext = 0;
+	if (flags & BPF_F_TUNINFO_FLAGS)
+		to->tunnel_flags = info->key.tun_flags;
+	else
+		to->tunnel_ext = 0;
 
 	if (flags & BPF_F_TUNINFO_IPV6) {
 		memcpy(to->remote_ipv6, &info->key.u.ipv6.src,
+9 -1
tools/include/uapi/linux/bpf.h
···
 	BPF_F_SEQ_NUMBER		= (1ULL << 3),
 };
 
+/* BPF_FUNC_skb_get_tunnel_key flags. */
+enum {
+	BPF_F_TUNINFO_FLAGS		= (1ULL << 4),
+};
+
 /* BPF_FUNC_perf_event_output, BPF_FUNC_perf_event_read and
  * BPF_FUNC_perf_event_read_value flags.
  */
···
 	};
 	__u8 tunnel_tos;
 	__u8 tunnel_ttl;
-	__u16 tunnel_ext;	/* Padding, future use. */
+	union {
+		__u16 tunnel_ext;	/* compat */
+		__be16 tunnel_flags;
+	};
 	__u32 tunnel_label;
 	union {
 		__u32 local_ipv4;