Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'bpf: support input xdp_md context in BPF_PROG_TEST_RUN'

Zvi Effron says:

====================

This patchset adds support for passing an xdp_md via ctx_in/ctx_out in
bpf_attr for BPF_PROG_TEST_RUN of XDP programs.

Patch 1 adds a function to validate XDP meta data lengths.

Patch 2 adds initial support for passing XDP meta data in addition to
packet data.

Patch 3 adds support for also specifying the ingress interface and
rx queue.

Patch 4 adds selftests to ensure functionality is correct.

Changelog:
----------
v7->v8
v7: https://lore.kernel.org/bpf/20210624211304.90807-1-zeffron@riotgames.com/

* Fix too long comment line in patch 3

v6->v7
v6: https://lore.kernel.org/bpf/20210617232904.1899-1-zeffron@riotgames.com/

* Add Yonghong Song's Acked-by to commit message in patch 1
* Add Yonghong Song's Acked-by to commit message in patch 2
* Extracted the post-update of the xdp_md context into a function (again)
* Validate that the rx queue was registered with XDP info
* Decrement the reference count on a found netdevice on failure to find
a valid rx queue
* Decrement the reference count on a found netdevice after the XDP
program is run
* Drop Yonghong Song's Acked-by for patch 3 because of patch changes
* Improve a comment in the selftests
* Drop Yonghong Song's Acked-by for patch 4 because of patch changes

v5->v6
v5: https://lore.kernel.org/bpf/20210616224712.3243-1-zeffron@riotgames.com/

* Correct commit messages in patches 1 and 3
* Add Acked-by to commit message in patch 4
* Use gotos instead of returns to correctly free resources in
bpf_prog_test_run_xdp
* Rename xdp_metalen_valid to xdp_metalen_invalid
* Improve the function signature for xdp_metalen_invalid
* Merged declaration of ingress_ifindex and rx_queue_index into one line

v4->v5
v4: https://lore.kernel.org/bpf/20210604220235.6758-1-zeffron@riotgames.com/

* Add new patch to introduce xdp_metalen_valid inline function to avoid
duplicated code from net/core/filter.c
* Correct size of bad_ctx in selftests
* Make all declarations reverse Christmas tree
* Move data check from xdp_convert_md_to_buff to bpf_prog_test_run_xdp
* Merge xdp_convert_buff_to_md into bpf_prog_test_run_xdp
* Fix line too long
* Extracted common checks in selftests to a helper function
* Removed redundant assignment in selftests
* Reordered test cases in selftests
* Check data against 0 instead of data_meta in selftests
* Made selftests use EINVAL instead of hardcoded 22
* Dropped "_" from XDP function name
* Changed casts in XDP program from unsigned long to long
* Added a comment explaining the use of the loopback interface in selftests
* Change parameter order in xdp_convert_md_to_buff to be input first
* Assigned xdp->ingress_ifindex and xdp->rx_queue_index to local variables in
xdp_convert_md_to_buff
* Made use of "meta data" versus "metadata" consistent in comments and commit
messages

v3->v4
v3: https://lore.kernel.org/bpf/20210602190815.8096-1-zeffron@riotgames.com/

* Clean up nits
* Validate xdp_md->data_end in bpf_prog_test_run_xdp
* Remove intermediate metalen variables

v2->v3
v2: https://lore.kernel.org/bpf/20210527201341.7128-1-zeffron@riotgames.com/

* Check errno first in selftests
* Use DECLARE_LIBBPF_OPTS
* Rename tattr to opts in selftests
* Remove extra new line
* Rename convert_xdpmd_to_xdpb to xdp_convert_md_to_buff
* Rename convert_xdpb_to_xdpmd to xdp_convert_buff_to_md
* Move declaration of device and rxqueue in xdp_convert_md_to_buff to
patch 2
* Reorder the kfree calls in bpf_prog_test_run_xdp

v1->v2
v1: https://lore.kernel.org/bpf/20210524220555.251473-1-zeffron@riotgames.com

* Fix null pointer dereference with no context
* Use the BPF skeleton and replace CHECK with ASSERT macros
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

+233 -13
+5
include/net/xdp.h
···
 	return unlikely(xdp->data_meta > xdp->data);
 }
 
+static inline bool xdp_metalen_invalid(unsigned long metalen)
+{
+	return (metalen & (sizeof(__u32) - 1)) || (metalen > 32);
+}
+
 struct xdp_attachment_info {
 	struct bpf_prog *prog;
 	u32 flags;
-3
include/uapi/linux/bpf.h
···
  *		**BPF_PROG_TYPE_SK_LOOKUP**
  *			*data_in* and *data_out* must be NULL.
  *
- *		**BPF_PROG_TYPE_XDP**
- *			*ctx_in* and *ctx_out* must be NULL.
- *
  *		**BPF_PROG_TYPE_RAW_TRACEPOINT**,
  *		**BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE**
  *
+101 -8
net/bpf/test_run.c
···
 #include <linux/error-injection.h>
 #include <linux/smp.h>
 #include <linux/sock_diag.h>
+#include <net/xdp.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/bpf_test_run.h>
···
 	return ret;
 }
 
+static int xdp_convert_md_to_buff(struct xdp_md *xdp_md, struct xdp_buff *xdp)
+{
+	unsigned int ingress_ifindex, rx_queue_index;
+	struct netdev_rx_queue *rxqueue;
+	struct net_device *device;
+
+	if (!xdp_md)
+		return 0;
+
+	if (xdp_md->egress_ifindex != 0)
+		return -EINVAL;
+
+	ingress_ifindex = xdp_md->ingress_ifindex;
+	rx_queue_index = xdp_md->rx_queue_index;
+
+	if (!ingress_ifindex && rx_queue_index)
+		return -EINVAL;
+
+	if (ingress_ifindex) {
+		device = dev_get_by_index(current->nsproxy->net_ns,
+					  ingress_ifindex);
+		if (!device)
+			return -ENODEV;
+
+		if (rx_queue_index >= device->real_num_rx_queues)
+			goto free_dev;
+
+		rxqueue = __netif_get_rx_queue(device, rx_queue_index);
+
+		if (!xdp_rxq_info_is_reg(&rxqueue->xdp_rxq))
+			goto free_dev;
+
+		xdp->rxq = &rxqueue->xdp_rxq;
+		/* The device is now tracked in the xdp->rxq for later
+		 * dev_put()
+		 */
+	}
+
+	xdp->data = xdp->data_meta + xdp_md->data;
+	return 0;
+
+free_dev:
+	dev_put(device);
+	return -EINVAL;
+}
+
+static void xdp_convert_buff_to_md(struct xdp_buff *xdp, struct xdp_md *xdp_md)
+{
+	if (!xdp_md)
+		return;
+
+	xdp_md->data = xdp->data - xdp->data_meta;
+	xdp_md->data_end = xdp->data_end - xdp->data_meta;
+
+	if (xdp_md->ingress_ifindex)
+		dev_put(xdp->rxq->dev);
+}
+
 int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr,
 			  union bpf_attr __user *uattr)
 {
···
 	struct netdev_rx_queue *rxqueue;
 	struct xdp_buff xdp = {};
 	u32 retval, duration;
+	struct xdp_md *ctx;
 	u32 max_data_sz;
 	void *data;
-	int ret;
+	int ret = -EINVAL;
 
-	if (kattr->test.ctx_in || kattr->test.ctx_out)
-		return -EINVAL;
+	ctx = bpf_ctx_init(kattr, sizeof(struct xdp_md));
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	if (ctx) {
+		/* There can't be user provided data before the meta data */
+		if (ctx->data_meta || ctx->data_end != size ||
+		    ctx->data > ctx->data_end ||
+		    unlikely(xdp_metalen_invalid(ctx->data)))
+			goto free_ctx;
+		/* Meta data is allocated from the headroom */
+		headroom -= ctx->data;
+	}
 
 	/* XDP have extra tailroom as (most) drivers use full page */
 	max_data_sz = 4096 - headroom - tailroom;
 
 	data = bpf_test_init(kattr, max_data_sz, headroom, tailroom);
-	if (IS_ERR(data))
-		return PTR_ERR(data);
+	if (IS_ERR(data)) {
+		ret = PTR_ERR(data);
+		goto free_ctx;
+	}
 
 	rxqueue = __netif_get_rx_queue(current->nsproxy->net_ns->loopback_dev, 0);
 	xdp_init_buff(&xdp, headroom + max_data_sz + tailroom,
 		      &rxqueue->xdp_rxq);
 	xdp_prepare_buff(&xdp, data, headroom, size, true);
 
+	ret = xdp_convert_md_to_buff(ctx, &xdp);
+	if (ret)
+		goto free_data;
+
 	bpf_prog_change_xdp(NULL, prog);
 	ret = bpf_test_run(prog, &xdp, repeat, &retval, &duration, true);
+	/* We convert the xdp_buff back to an xdp_md before checking the return
+	 * code so the reference count of any held netdevice will be decremented
+	 * even if the test run failed.
+	 */
+	xdp_convert_buff_to_md(&xdp, ctx);
 	if (ret)
 		goto out;
-	if (xdp.data != data + headroom || xdp.data_end != xdp.data + size)
-		size = xdp.data_end - xdp.data;
-	ret = bpf_test_finish(kattr, uattr, xdp.data, size, retval, duration);
+
+	if (xdp.data_meta != data + headroom ||
+	    xdp.data_end != xdp.data_meta + size)
+		size = xdp.data_end - xdp.data_meta;
+
+	ret = bpf_test_finish(kattr, uattr, xdp.data_meta, size, retval,
+			      duration);
+	if (!ret)
+		ret = bpf_ctx_finish(kattr, uattr, ctx,
+				     sizeof(struct xdp_md));
+
 out:
 	bpf_prog_change_xdp(prog, NULL);
+free_data:
 	kfree(data);
+free_ctx:
+	kfree(ctx);
 	return ret;
 }
+2 -2
net/core/filter.c
···
 #include <net/transp_v6.h>
 #include <linux/btf_ids.h>
 #include <net/tls.h>
+#include <net/xdp.h>
 
 static const struct bpf_func_proto *
 bpf_sk_base_func_proto(enum bpf_func_id func_id);
···
 	if (unlikely(meta < xdp_frame_end ||
 		     meta > xdp->data))
 		return -EINVAL;
-	if (unlikely((metalen & (sizeof(__u32) - 1)) ||
-		     (metalen > 32)))
+	if (unlikely(xdp_metalen_invalid(metalen)))
 		return -EACCES;
 
 	xdp->data_meta = meta;
+105
tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c
···
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <network_helpers.h>
+#include "test_xdp_context_test_run.skel.h"
+
+void test_xdp_context_error(int prog_fd, struct bpf_test_run_opts opts,
+			    __u32 data_meta, __u32 data, __u32 data_end,
+			    __u32 ingress_ifindex, __u32 rx_queue_index,
+			    __u32 egress_ifindex)
+{
+	struct xdp_md ctx = {
+		.data = data,
+		.data_end = data_end,
+		.data_meta = data_meta,
+		.ingress_ifindex = ingress_ifindex,
+		.rx_queue_index = rx_queue_index,
+		.egress_ifindex = egress_ifindex,
+	};
+	int err;
+
+	opts.ctx_in = &ctx;
+	opts.ctx_size_in = sizeof(ctx);
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, EINVAL, "errno-EINVAL");
+	ASSERT_ERR(err, "bpf_prog_test_run");
+}
+
+void test_xdp_context_test_run(void)
+{
+	struct test_xdp_context_test_run *skel = NULL;
+	char data[sizeof(pkt_v4) + sizeof(__u32)];
+	char bad_ctx[sizeof(struct xdp_md) + 1];
+	struct xdp_md ctx_in, ctx_out;
+	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
+			    .data_in = &data,
+			    .data_size_in = sizeof(data),
+			    .ctx_out = &ctx_out,
+			    .ctx_size_out = sizeof(ctx_out),
+			    .repeat = 1,
+		);
+	int err, prog_fd;
+
+	skel = test_xdp_context_test_run__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel"))
+		return;
+	prog_fd = bpf_program__fd(skel->progs.xdp_context);
+
+	/* Data past the end of the kernel's struct xdp_md must be 0 */
+	bad_ctx[sizeof(bad_ctx) - 1] = 1;
+	opts.ctx_in = bad_ctx;
+	opts.ctx_size_in = sizeof(bad_ctx);
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_EQ(errno, E2BIG, "extradata-errno");
+	ASSERT_ERR(err, "bpf_prog_test_run(extradata)");
+
+	*(__u32 *)data = XDP_PASS;
+	*(struct ipv4_packet *)(data + sizeof(__u32)) = pkt_v4;
+	opts.ctx_in = &ctx_in;
+	opts.ctx_size_in = sizeof(ctx_in);
+	memset(&ctx_in, 0, sizeof(ctx_in));
+	ctx_in.data_meta = 0;
+	ctx_in.data = sizeof(__u32);
+	ctx_in.data_end = ctx_in.data + sizeof(pkt_v4);
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	ASSERT_OK(err, "bpf_prog_test_run(valid)");
+	ASSERT_EQ(opts.retval, XDP_PASS, "valid-retval");
+	ASSERT_EQ(opts.data_size_out, sizeof(pkt_v4), "valid-datasize");
+	ASSERT_EQ(opts.ctx_size_out, opts.ctx_size_in, "valid-ctxsize");
+	ASSERT_EQ(ctx_out.data_meta, 0, "valid-datameta");
+	ASSERT_EQ(ctx_out.data, 0, "valid-data");
+	ASSERT_EQ(ctx_out.data_end, sizeof(pkt_v4), "valid-dataend");
+
+	/* Meta data's size must be a multiple of 4 */
+	test_xdp_context_error(prog_fd, opts, 0, 1, sizeof(data), 0, 0, 0);
+
+	/* data_meta must reference the start of data */
+	test_xdp_context_error(prog_fd, opts, 4, sizeof(__u32), sizeof(data),
+			       0, 0, 0);
+
+	/* Meta data must be 32 bytes or smaller */
+	test_xdp_context_error(prog_fd, opts, 0, 36, sizeof(data), 0, 0, 0);
+
+	/* Total size of data must match data_end - data_meta */
+	test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
+			       sizeof(data) - 1, 0, 0, 0);
+	test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32),
+			       sizeof(data) + 1, 0, 0, 0);
+
+	/* RX queue cannot be specified without specifying an ingress */
+	test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32), sizeof(data),
+			       0, 1, 0);
+
+	/* Interface 1 is always the loopback interface which always has only
+	 * one RX queue (index 0). This makes index 1 an invalid rx queue index
+	 * for interface 1.
+	 */
+	test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32), sizeof(data),
+			       1, 1, 0);
+
+	/* The egress cannot be specified */
+	test_xdp_context_error(prog_fd, opts, 0, sizeof(__u32), sizeof(data),
+			       0, 0, 1);
+
+	test_xdp_context_test_run__destroy(skel);
+}
+20
tools/testing/selftests/bpf/progs/test_xdp_context_test_run.c
···
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+SEC("xdp")
+int xdp_context(struct xdp_md *xdp)
+{
+	void *data = (void *)(long)xdp->data;
+	__u32 *metadata = (void *)(long)xdp->data_meta;
+	__u32 ret;
+
+	if (metadata + 1 > data)
+		return XDP_ABORTED;
+	ret = *metadata;
+	if (bpf_xdp_adjust_meta(xdp, 4))
+		return XDP_ABORTED;
+	return ret;
+}
+
+char _license[] SEC("license") = "GPL";