Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
maple_tree: use maple copy node for mas_wr_rebalance() operation

Stop using the maple big node for rebalance operations by aligning the
implementation more closely with spanning store. The rebalance operation
needs its own data calculation, which is done in rebalance_data().

In the event of too much data, the rebalance tries to push the data using
push_data_sib(). If there is insufficient data, the rebalance operation
will rebalance against a sibling (found with rebalance_sib()).
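
The choice described above can be sketched as a small standalone model. This
is only an illustration, not the kernel code: the names sketch_choice and
sketch_rebalance_choice() are hypothetical, the capacity and threshold are
passed in as plain parameters standing in for mt_slots[] and mt_min_slots[],
and the "can a sibling absorb the surplus" check is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes; the patch itself signals these through sib->end. */
enum sketch_choice {
	SKETCH_NO_SIBLING,		/* write the data without using a sibling */
	SKETCH_PUSH_TO_SIBLING,		/* too much data and a sibling has room */
	SKETCH_PULL_FROM_SIBLING,	/* too little data: rebalance against a sibling */
};

/*
 * data:       number of entries to write (a size, not an index)
 * slots:      node capacity, standing in for mt_slots[type]
 * min_slots:  sufficiency threshold, standing in for mt_min_slots[type]
 * push_fits:  whether some sibling can absorb the surplus (assumed known)
 * whole_tree: whether the node spans the entire range, so no sibling exists
 */
enum sketch_choice sketch_rebalance_choice(unsigned int data,
					   unsigned int slots,
					   unsigned int min_slots,
					   bool push_fits, bool whole_tree)
{
	assert(min_slots < slots);

	/* Too much data: only use a sibling if the push actually fits. */
	if (data > slots)
		return push_fits ? SKETCH_PUSH_TO_SIBLING : SKETCH_NO_SIBLING;

	/* Insufficient data: a sibling is needed unless this is the only node. */
	if (data <= min_slots && !whole_tree)
		return SKETCH_PULL_FROM_SIBLING;

	return SKETCH_NO_SIBLING;
}
```

With the illustrative values slots = 16 and min_slots = 6, a write producing
20 entries tries to push, while one producing 5 entries pulls from a sibling
unless the node covers the whole tree.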

The rebalance starts at the leaf and works its way upward in the tree
using rebalance_ascend(). Most of the code is shared with spanning store,
such as handling a copy node that becomes a new root, but rebalance is
fundamentally different in that the data must come from a sibling.

A parent maple state is used to track the parent location to avoid
multiple mas_ascend() calls. The maple state tree location is copied from
the parent to the mas (child) in the ascend step. Ascending itself is
done in the main loop.
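
The parent-tracking idea can be sketched with a toy tree. Everything here is
an assumption made for illustration (the sketch_node and sketch_state types,
the global call counter, and a loop that always walks to the root); the real
loop stops as soon as rebalance_ascend() reports that no further level needs
work.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node: only the parent link matters for this sketch. */
struct sketch_node {
	struct sketch_node *parent;	/* NULL for the root */
};

/* Toy stand-in for a maple state: just the current node. */
struct sketch_state {
	struct sketch_node *node;
};

/* Counts how many real upward walks were performed. */
unsigned int sketch_ascend_calls;

void sketch_ascend(struct sketch_state *s)
{
	s->node = s->node->parent;
	sketch_ascend_calls++;
}

/* Process every level from @leaf to the root; returns the level count. */
unsigned int sketch_walk(struct sketch_node *leaf)
{
	struct sketch_state mas = { leaf };
	struct sketch_state parent = mas;
	unsigned int levels = 0;

	assert(leaf != NULL);

	for (;;) {
		/* Ascend once per level, in the main loop. */
		if (parent.node->parent)
			sketch_ascend(&parent);

		/* ... work on mas, using parent to locate siblings ... */
		levels++;

		if (mas.node == parent.node)	/* mas is the root: done */
			break;

		/* The "ascend step": copy the location, no second walk. */
		mas = parent;
	}

	return levels;
}
```

In this model a three-level tree is processed with two ascend calls, because
the child state is moved up by copying the already-ascended parent state.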

Link: https://lkml.kernel.org/r/20260130205935.2559335-23-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrew Ballance <andrewjballance@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Kujau <lists@nerdbynature.de>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Liam R. Howlett and committed by Andrew Morton
971f0db1 b00a1804

+206 -7
lib/maple_tree.c
···
 	*split = mid_split;
 }
 
+static inline void rebalance_sib(struct ma_state *parent, struct ma_state *sib)
+{
+	*sib = *parent;
+	/* Prioritize move right to pull data left */
+	if (sib->offset < sib->end)
+		sib->offset++;
+	else
+		sib->offset--;
+
+	mas_descend(sib);
+	sib->end = mas_data_end(sib);
+}
+
 static inline
 void spanning_sib(struct ma_wr_state *l_wr_mas,
 		struct ma_wr_state *r_wr_mas, struct ma_state *nneighbour)
···
 	cp->data += cp->end + 1;
 	/* Data from right (offset + 1 to end), +1 for zero */
 	cp->data += r_wr_mas->mas->end - r_wr_mas->offset_end;
+}
+
+static bool data_fits(struct ma_state *sib, struct ma_state *mas,
+		struct maple_copy *cp)
+{
+	unsigned char new_data;
+	enum maple_type type;
+	unsigned char space;
+	unsigned char end;
+
+	type = mte_node_type(mas->node);
+	space = 2 * mt_slots[type];
+	end = sib->end;
+
+	new_data = end + 1 + cp->data;
+	if (new_data > space)
+		return false;
+
+	/*
+	 * This is off by one by design.  The extra space is left to reduce
+	 * jitter in operations that add then remove two entries.
+	 *
+	 * end is an index while new space and data are both sizes.  Adding one
+	 * to end to convert the index to a size means that the below
+	 * calculation should be <=, but we want to keep an extra space in nodes
+	 * to reduce jitter.
+	 *
+	 * Note that it is still possible to get a full node on the left by the
+	 * NULL landing exactly on the split.  The NULL ending of a node happens
+	 * in the dst_setup() function, where we will either increase the split
+	 * by one or decrease it by one, if possible.  In the case of split
+	 * (this case), it is always possible to shift the spilt by one - again
+	 * because there is at least one slot free by the below checking.
+	 */
+	if (new_data < space)
+		return true;
+
+	return false;
+}
+
+static inline void push_data_sib(struct maple_copy *cp, struct ma_state *mas,
+		struct ma_state *sib, struct ma_state *parent)
+{
+
+	if (mte_is_root(mas->node))
+		goto no_push;
+
+
+	*sib = *parent;
+	if (sib->offset) {
+		sib->offset--;
+		mas_descend(sib);
+		sib->end = mas_data_end(sib);
+		if (data_fits(sib, mas, cp)) /* Push left */
+			return;
+
+		*sib = *parent;
+	}
+
+	if (sib->offset >= sib->end)
+		goto no_push;
+
+	sib->offset++;
+	mas_descend(sib);
+	sib->end = mas_data_end(sib);
+	if (data_fits(sib, mas, cp)) /* Push right*/
+		return;
+
+no_push:
+	sib->end = 0;
+}
+
+/*
+ * rebalance_data() - Calculate the @cp data, populate @sib if insufficient or
+ * if the data can be pushed into a sibling.
+ * @cp: The maple copy node
+ * @wr_mas: The left write maple state
+ * @sib: The maple state of the sibling.
+ *
+ * Note: @cp->data is a size and not indexed by 0. @sib->end may be set to 0 to
+ * indicate it will not be used.
+ *
+ */
+static inline void rebalance_data(struct maple_copy *cp,
+		struct ma_wr_state *wr_mas, struct ma_state *sib,
+		struct ma_state *parent)
+{
+	cp_data_calc(cp, wr_mas, wr_mas);
+	sib->end = 0;
+	if (cp->data > mt_slots[wr_mas->type]) {
+		push_data_sib(cp, wr_mas->mas, sib, parent);
+		if (sib->end)
+			goto use_sib;
+	} else if (cp->data <= mt_min_slots[wr_mas->type]) {
+		if ((wr_mas->mas->min != 0) ||
+		    (wr_mas->mas->max != ULONG_MAX)) {
+			rebalance_sib(parent, sib);
+			goto use_sib;
+		}
+	}
+
+	return;
+
+use_sib:
+
+	cp->data += sib->end + 1;
 }
 
 /*
···
 	cp->height++;
 	wr_mas_ascend(l_wr_mas);
 	wr_mas_ascend(r_wr_mas);
+	return true;
+}
+
+/*
+ * rebalance_ascend() - Ascend the tree and set up for the next loop - if
+ * necessary
+ *
+ * Return: True if there another rebalancing operation on the next level is
+ * needed, false otherwise.
+ */
+static inline bool rebalance_ascend(struct maple_copy *cp,
+		struct ma_wr_state *wr_mas, struct ma_state *sib,
+		struct ma_state *parent)
+{
+	struct ma_state *mas;
+	unsigned long min, max;
+
+	mas = wr_mas->mas;
+	if (!sib->end) {
+		min = mas->min;
+		max = mas->max;
+	} else if (sib->min > mas->max) { /* Move right succeeded */
+		min = mas->min;
+		max = sib->max;
+		wr_mas->offset_end = parent->offset + 1;
+	} else {
+		min = sib->min;
+		max = mas->max;
+		wr_mas->offset_end = parent->offset;
+		parent->offset--;
+	}
+
+	cp_dst_to_slots(cp, min, max, mas);
+	if (cp_is_new_root(cp, mas))
+		return false;
+
+	if (cp->d_count == 1 && !sib->end) {
+		cp->dst[0].node->parent = ma_parent_ptr(mas_mn(mas)->parent);
+		return false;
+	}
+
+	cp->height++;
+	mas->node = parent->node;
+	mas->offset = parent->offset;
+	mas->min = parent->min;
+	mas->max = parent->max;
+	mas->end = parent->end;
+	mas->depth = parent->depth;
+	wr_mas_setup(wr_mas, mas);
 	return true;
 }
 
···
  * mas_wr_rebalance() - Insufficient data in one node needs to either get data
  * from a sibling or absorb a sibling all together.
  * @wr_mas: The write maple state
+ *
+ * Rebalance is different than a spanning store in that the write state is
+ * already at the leaf node that's being altered.
  */
-static noinline_for_kasan void mas_wr_rebalance(struct ma_wr_state *wr_mas)
+static void mas_wr_rebalance(struct ma_wr_state *wr_mas)
 {
-	struct maple_big_node b_node;
+	struct maple_enode *old_enode;
+	struct ma_state parent;
+	struct ma_state *mas;
+	struct maple_copy cp;
+	struct ma_state sib;
 
-	trace_ma_write(__func__, wr_mas->mas, 0, wr_mas->entry);
-	memset(&b_node, 0, sizeof(struct maple_big_node));
-	mas_store_b_node(wr_mas, &b_node, wr_mas->offset_end);
-	WARN_ON_ONCE(wr_mas->mas->store_type != wr_rebalance);
-	return mas_rebalance(wr_mas->mas, &b_node);
+	/*
+	 * Rebalancing occurs if a node is insufficient.  Data is rebalanced
+	 * against the node to the right if it exists, otherwise the node to the
+	 * left of this node is rebalanced against this node.  If rebalancing
+	 * causes just one node to be produced instead of two, then the parent
+	 * is also examined and rebalanced if it is insufficient.  Every level
+	 * tries to combine the data in the same way.  If one node contains the
+	 * entire range of the tree, then that node is used as a new root node.
+	 */
+
+	mas = wr_mas->mas;
+	trace_ma_op(TP_FCT, mas);
+	parent = *mas;
+	cp_leaf_init(&cp, mas, wr_mas, wr_mas);
+	do {
+		if (!mte_is_root(parent.node)) {
+			mas_ascend(&parent);
+			parent.end = mas_data_end(&parent);
+		}
+		rebalance_data(&cp, wr_mas, &sib, &parent);
+		multi_src_setup(&cp, wr_mas, wr_mas, &sib);
+		dst_setup(&cp, mas, wr_mas->type);
+		cp_data_write(&cp, mas);
+	} while (rebalance_ascend(&cp, wr_mas, &sib, &parent));
+
+	old_enode = mas->node;
+	mas->node = mt_slot_locked(mas->tree, cp.slot, 0);
+	mas_wmb_replace(mas, old_enode, cp.height);
+	mtree_range_walk(mas);
 }
 
 /*
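
The deliberate off-by-one in data_fits() can be checked with a standalone
sketch of the same arithmetic. The function name sketch_data_fits() and the
plain integer parameters are assumptions for illustration; only the
comparison itself mirrors the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * sib_end: index of the last used slot in the sibling (so size is end + 1)
 * cp_data: number of entries to be written (already a size)
 * slots:   capacity of one node, standing in for mt_slots[type]
 *
 * Two nodes provide 2 * slots of space.  The check is strictly "less than",
 * so a result that would exactly fill both nodes is rejected and at least
 * one slot stays free.
 */
bool sketch_data_fits(unsigned int sib_end, unsigned int cp_data,
		      unsigned int slots)
{
	unsigned int space = 2 * slots;
	unsigned int new_data = sib_end + 1 + cp_data;

	assert(slots > 0);

	return new_data < space;
}
```

For example, with 16 slots per node a sibling whose last index is 9 (ten
entries) accepts 21 entries of new data (31 in total) but not 22, which would
fill both nodes exactly.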