Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge branch 'libbpf-fix-btf-dedup-to-support-recursive-typedef'

Paul Houssel says:

====================
libbpf: fix BTF dedup to support recursive typedef

Pahole fails to encode BTF for some Go projects (e.g. Kubernetes and
Podman) due to recursive type definitions that create reference loops
not representable in C. These recursive typedefs trigger a failure in
the BTF deduplication algorithm.

This patch extends btf_dedup_struct_types() to properly handle potential
recursion for BTF_KIND_TYPEDEF, similar to how recursion is already
handled for BTF_KIND_STRUCT. This allows pahole to successfully
generate BTF for Go binaries using recursive types without impacting
existing C-based workflows.

Changes in v4: fix typo found by Claude-based CI

Changes in v3:
1. Patch 1: Adjusted the comment of btf_dedup_ref_type() to refer to
typedef as well.
2. Patch 2: Update of the "dedup: recursive typedef" test to include a
duplicated version of the types to make sure deduplication still happens
in this case.

Changes in v2:
1. Patch 1: Refactored code to prevent copying existing logic. Instead of
adding a new function we modify the existing btf_dedup_struct_type()
function to handle the BTF_KIND_TYPEDEF case. Calls to btf_hash_struct()
and btf_shallow_equal_struct() are replaced with calls to functions that
select btf_hash_struct() / btf_hash_typedef() based on the type.
2. Patch 2: Added tests

v3: https://lore.kernel.org/lkml/cover.1763024337.git.paul.houssel@orange.com/

v2: https://lore.kernel.org/lkml/cover.1762956564.git.paul.houssel@orange.com/

v1: https://lore.kernel.org/lkml/20251107153408.159342-1-paulhoussel2@gmail.com/
====================

Link: https://patch.msgid.link/cover.1763037045.git.paul.houssel@orange.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

+120 -16
+55 -16
tools/lib/bpf/btf.c
···
 	return err;
 }
 
+/*
+ * Calculate type signature hash of TYPEDEF, ignoring referenced type IDs,
+ * as referenced type IDs equivalence is established separately during type
+ * graph equivalence check algorithm.
+ */
+static long btf_hash_typedef(struct btf_type *t)
+{
+	long h;
+
+	h = hash_combine(0, t->name_off);
+	h = hash_combine(h, t->info);
+	return h;
+}
+
 static long btf_hash_common(struct btf_type *t)
 {
 	long h;
···
 	return t1->name_off == t2->name_off &&
 	       t1->info == t2->info &&
 	       t1->size == t2->size;
+}
+
+/* Check structural compatibility of two TYPEDEF. */
+static bool btf_equal_typedef(struct btf_type *t1, struct btf_type *t2)
+{
+	return t1->name_off == t2->name_off &&
+	       t1->info == t2->info;
 }
 
 /* Calculate type signature hash of INT or TAG. */
···
 	}
 }
 
+static inline long btf_hash_by_kind(struct btf_type *t, __u16 kind)
+{
+	if (kind == BTF_KIND_TYPEDEF)
+		return btf_hash_typedef(t);
+	else
+		return btf_hash_struct(t);
+}
+
+static inline bool btf_equal_by_kind(struct btf_type *t1, struct btf_type *t2, __u16 kind)
+{
+	if (kind == BTF_KIND_TYPEDEF)
+		return btf_equal_typedef(t1, t2);
+	else
+		return btf_shallow_equal_struct(t1, t2);
+}
+
 /*
- * Deduplicate struct/union types.
+ * Deduplicate struct/union and typedef types.
  *
  * For each struct/union type its type signature hash is calculated, taking
  * into account type's name, size, number, order and names of fields, but
  * ignoring type ID's referenced from fields, because they might not be deduped
- * completely until after reference types deduplication phase. This type hash
+ * completely until after reference types deduplication phase. For each typedef
+ * type, the hash is computed based on the type’s name and size. This type hash
  * is used to iterate over all potential canonical types, sharing same hash.
  * For each canonical candidate we check whether type graphs that they form
  * (through referenced types in fields and so on) are equivalent using algorithm
···
 	t = btf_type_by_id(d->btf, type_id);
 	kind = btf_kind(t);
 
-	if (kind != BTF_KIND_STRUCT && kind != BTF_KIND_UNION)
+	if (kind != BTF_KIND_STRUCT &&
+	    kind != BTF_KIND_UNION &&
+	    kind != BTF_KIND_TYPEDEF)
 		return 0;
 
-	h = btf_hash_struct(t);
+	h = btf_hash_by_kind(t, kind);
 	for_each_dedup_cand(d, hash_entry, h) {
 		__u32 cand_id = hash_entry->value;
 		int eq;
 
 		/*
 		 * Even though btf_dedup_is_equiv() checks for
-		 * btf_shallow_equal_struct() internally when checking two
-		 * structs (unions) for equivalence, we need to guard here
+		 * btf_equal_by_kind() internally when checking two
+		 * structs (unions) or typedefs for equivalence, we need to guard here
 		 * from picking matching FWD type as a dedup candidate.
 		 * This can happen due to hash collision. In such case just
 		 * relying on btf_dedup_is_equiv() would lead to potentially
···
 		 * FWD and compatible STRUCT/UNION are considered equivalent.
 		 */
 		cand_type = btf_type_by_id(d->btf, cand_id);
-		if (!btf_shallow_equal_struct(t, cand_type))
+		if (!btf_equal_by_kind(t, cand_type, kind))
 			continue;
 
 		btf_dedup_clear_hypot_map(d);
···
 /*
  * Deduplicate reference type.
  *
- * Once all primitive and struct/union types got deduplicated, we can easily
+ * Once all primitive, struct/union and typedef types got deduplicated, we can easily
  * deduplicate all other (reference) BTF types. This is done in two steps:
  *
  * 1. Resolve all referenced type IDs into their canonical type IDs. This
- * resolution can be done either immediately for primitive or struct/union types
- * (because they were deduped in previous two phases) or recursively for
+ * resolution can be done either immediately for primitive, struct/union, and typedef
+ * types (because they were deduped in previous two phases) or recursively for
  * reference types. Recursion will always terminate at either primitive or
- * struct/union type, at which point we can "unwind" chain of reference types
- * one by one. There is no danger of encountering cycles because in C type
- * system the only way to form type cycle is through struct/union, so any chain
- * of reference types, even those taking part in a type cycle, will inevitably
- * reach struct/union at some point.
+ * struct/union and typedef types, at which point we can "unwind" chain of reference
+ * types one by one. There is no danger of encountering cycles in C, as the only way to
+ * form a type cycle is through struct or union types. Go can form such cycles through
+ * typedef. Thus, any chain of reference types, even those taking part in a type cycle,
+ * will inevitably reach a struct/union or typedef type at some point.
  *
  * 2. Once all referenced type IDs are resolved into canonical ones, BTF type
  * becomes "stable", in the sense that no further deduplication will cause
···
 	case BTF_KIND_VOLATILE:
 	case BTF_KIND_RESTRICT:
 	case BTF_KIND_PTR:
-	case BTF_KIND_TYPEDEF:
 	case BTF_KIND_FUNC:
 	case BTF_KIND_TYPE_TAG:
 		ref_type_id = btf_dedup_ref_type(d, t->type);
+65
tools/testing/selftests/bpf/prog_tests/btf.c
···
 	},
 },
 {
+	.descr = "dedup: recursive typedef",
+	/*
+	 * This test simulates a recursive typedef, which in GO is defined as such:
+	 *
+	 *   type Foo func() Foo
+	 *
+	 * In BTF terms, this is represented as a TYPEDEF referencing
+	 * a FUNC_PROTO that returns the same TYPEDEF.
+	 */
+	.input = {
+		.raw_types = {
+			/*
+			 * [1] typedef Foo -> func() Foo
+			 * [2] func_proto() -> Foo
+			 * [3] typedef Foo -> func() Foo
+			 * [4] func_proto() -> Foo
+			 */
+			BTF_TYPEDEF_ENC(NAME_NTH(1), 2),	/* [1] */
+			BTF_FUNC_PROTO_ENC(1, 0),		/* [2] */
+			BTF_TYPEDEF_ENC(NAME_NTH(1), 4),	/* [3] */
+			BTF_FUNC_PROTO_ENC(3, 0),		/* [4] */
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0Foo"),
+	},
+	.expect = {
+		.raw_types = {
+			BTF_TYPEDEF_ENC(NAME_NTH(1), 2),	/* [1] */
+			BTF_FUNC_PROTO_ENC(1, 0),		/* [2] */
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0Foo"),
+	},
+},
+{
+	.descr = "dedup: typedef",
+	/*
+	 * // CU 1:
+	 * typedef int foo;
+	 *
+	 * // CU 2:
+	 * typedef int foo;
+	 */
+	.input = {
+		.raw_types = {
+			/* CU 1 */
+			BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+			BTF_TYPEDEF_ENC(NAME_NTH(1), 1),		/* [2] */
+			/* CU 2 */
+			BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [3] */
+			BTF_TYPEDEF_ENC(NAME_NTH(1), 3),		/* [4] */
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0foo"),
+	},
+	.expect = {
+		.raw_types = {
+			BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 32, 4),	/* [1] */
+			BTF_TYPEDEF_ENC(NAME_NTH(1), 1),		/* [2] */
+			BTF_END_RAW,
+		},
+		BTF_STR_SEC("\0foo"),
+	},
+},
+{
 	.descr = "dedup: typedef tags",
 	.input = {
 		.raw_types = {