Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
btrfs: add tracepoint for search slot restart tracking

Add a btrfs_search_slot_restart tracepoint that fires at each restart
site in btrfs_search_slot(), recording the root, tree level, and
reason for the restart. This enables tracking search slot restarts
which contribute to COW amplification under memory pressure.

The four restart reasons are:
- write_lock:  insufficient write lock level, need to restart with a
               higher lock level
- setup_nodes: node setup returned -EAGAIN
- slot_zero:   insertion at slot 0 requires a higher write lock level
- read_block:  read_block_for_search returned -EAGAIN (block not
               cached or lock contention)

COW counts are already tracked by the existing trace_btrfs_cow_block()
tracepoint. The per-restart-site tracepoint avoids counter overhead
in the critical path when tracepoints are disabled, and provides
richer per-event information that bpftrace scripts can aggregate into
counts, histograms, and per-root breakdowns.
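
As a rough illustration of the aggregation described above, the following userspace C sketch tallies restart events by reason from already-rendered trace lines. It assumes each line contains the text produced by the format string in this patch ("root=%llu(%s) level=%d reason=%s") and skips whatever prefix precedes "root=". The helper names (parse_restart_line, tally, count_for) and the struct layout are hypothetical and not part of the patch; a bpftrace script would normally do this in-kernel instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_REASONS 8
#define REASON_LEN 32

/* One bucket per distinct restart reason (hypothetical layout). */
struct reason_count {
	char reason[REASON_LEN];
	unsigned long count;
};

struct restart_stats {
	struct reason_count entries[MAX_REASONS];
	int nr;
};

/*
 * Parse one rendered event line. Anything before "root=" (task, timestamp,
 * filesystem id) is skipped. Returns 0 on success, -1 if the line does not
 * look like a btrfs_search_slot_restart event.
 */
static int parse_restart_line(const char *line, unsigned long long *root,
			      int *level, char *reason, size_t reason_len)
{
	const char *p = strstr(line, "root=");
	char type[32];
	char buf[REASON_LEN];

	if (!p)
		return -1;
	if (sscanf(p, "root=%llu(%31[^)]) level=%d reason=%31s",
		   root, type, level, buf) != 4)
		return -1;
	snprintf(reason, reason_len, "%s", buf);
	return 0;
}

/* Increment the bucket for this reason, creating it on first use. */
static void tally(struct restart_stats *st, const char *reason)
{
	assert(st != NULL);
	for (int i = 0; i < st->nr; i++) {
		if (strcmp(st->entries[i].reason, reason) == 0) {
			st->entries[i].count++;
			return;
		}
	}
	if (st->nr < MAX_REASONS) {
		snprintf(st->entries[st->nr].reason, REASON_LEN, "%s", reason);
		st->entries[st->nr].count = 1;
		st->nr++;
	}
}

/* Look up the count for a reason; 0 if it was never seen. */
static unsigned long count_for(const struct restart_stats *st,
			       const char *reason)
{
	for (int i = 0; i < st->nr; i++) {
		if (strcmp(st->entries[i].reason, reason) == 0)
			return st->entries[i].count;
	}
	return 0;
}
```

Feeding each line read from the trace output through parse_restart_line() and then tally() yields per-reason counts; keying the buckets on the root id as well would give the per-root breakdown.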

Reviewed-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Leo Martins <loemra.dev@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>

Authored by Leo Martins and committed by David Sterba
cc970d21 f9a48549

+32 -2
+8 -2
fs/btrfs/ctree.c
···
 			    p->nodes[level + 1])) {
 				write_lock_level = level + 1;
 				btrfs_release_path(p);
+				trace_btrfs_search_slot_restart(root, level, "write_lock");
 				goto again;
 			}
 
···
 		p->slots[level] = slot;
 		ret2 = setup_nodes_for_search(trans, root, p, b, level, ins_len,
 					      &write_lock_level);
-		if (ret2 == -EAGAIN)
+		if (ret2 == -EAGAIN) {
+			trace_btrfs_search_slot_restart(root, level, "setup_nodes");
 			goto again;
+		}
 		if (ret2) {
 			ret = ret2;
 			goto done;
···
 		if (slot == 0 && ins_len && write_lock_level < level + 1) {
 			write_lock_level = level + 1;
 			btrfs_release_path(p);
+			trace_btrfs_search_slot_restart(root, level, "slot_zero");
 			goto again;
 		}
 
···
 		}
 
 		ret2 = read_block_for_search(root, p, &b, slot, key);
-		if (ret2 == -EAGAIN && !p->nowait)
+		if (ret2 == -EAGAIN && !p->nowait) {
+			trace_btrfs_search_slot_restart(root, level, "read_block");
 			goto again;
+		}
 		if (ret2) {
 			ret = ret2;
 			goto done;
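
The restart pattern instrumented in the hunks above can be modelled outside the kernel with a toy descent loop. This is only a sketch under stated assumptions: toy_search(), record_restart() and struct restart_log are hypothetical stand-ins, the real btrfs_search_slot() does far more work per level, and only two of the four restart reasons are modelled. It shows how each "goto again" site reports its level and reason before the search starts over from the top.

```c
#include <stdio.h>
#include <string.h>

#define MAX_EVENTS 16

/* One recorded restart: the level it happened at and why. */
struct restart_event {
	int level;
	const char *reason;
};

struct restart_log {
	struct restart_event events[MAX_EVENTS];
	int nr;
};

/* Stand-in for the tracepoint call: records the event instead of tracing. */
static void record_restart(struct restart_log *log, int level,
			   const char *reason)
{
	if (log->nr < MAX_EVENTS) {
		log->events[log->nr].level = level;
		log->events[log->nr].reason = reason;
		log->nr++;
	}
}

/*
 * Toy descent from top_level down to level 0.
 *
 * lock_needed_at: level whose modification needs a write lock on its
 *                 parent, i.e. write_lock_level >= level + 1 (-1: none).
 * eagain_at:      level at which the first block read asks for a retry
 *                 (-1: none); the retry succeeds.
 *
 * Returns the write lock level the search ended up with.
 */
static int toy_search(int top_level, int lock_needed_at, int eagain_at,
		      struct restart_log *log)
{
	int write_lock_level = 0;
	int read_retried = 0;
	int level;

again:
	for (level = top_level; level >= 0; level--) {
		if (level == lock_needed_at && write_lock_level < level + 1) {
			/* Raise the lock level, report, and start over. */
			write_lock_level = level + 1;
			record_restart(log, level, "write_lock");
			goto again;
		}
		if (level == eagain_at && !read_retried) {
			/* The block was not ready; report and start over. */
			read_retried = 1;
			record_restart(log, level, "read_block");
			goto again;
		}
	}
	return write_lock_level;
}
```

In this model every restart repeats the whole descent, which is why the commit message treats restart counts as a cost worth measuring.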
+24
include/trace/events/btrfs.h
···
 		  __entry->cow_level)
 );
 
+TRACE_EVENT(btrfs_search_slot_restart,
+
+	TP_PROTO(const struct btrfs_root *root, int level,
+		 const char *reason),
+
+	TP_ARGS(root, level, reason),
+
+	TP_STRUCT__entry_btrfs(
+		__field(	u64,	root_objectid	)
+		__field(	int,	level		)
+		__string(	reason,	reason		)
+	),
+
+	TP_fast_assign_btrfs(root->fs_info,
+		__entry->root_objectid	= btrfs_root_id(root);
+		__entry->level		= level;
+		__assign_str(reason);
+	),
+
+	TP_printk_btrfs("root=%llu(%s) level=%d reason=%s",
+		  show_root_type(__entry->root_objectid),
+		  __entry->level, __get_str(reason))
+);
+
 TRACE_EVENT(btrfs_space_reservation,
 
 	TP_PROTO(const struct btrfs_fs_info *fs_info, const char *type, u64 val,