docs: RCU: Convert rculist_nulls.txt to ReST

+1

Documentation/RCU/index.rst

@@ -17,6 +17,7 @@
    rcu_dereference
    whatisRCU
    rcu
+   rculist_nulls
    listRCU
    NMI-RCU
    UP

+194

Documentation/RCU/rculist_nulls.rst

@@ -0,0 +1,194 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================================================
+Using RCU hlist_nulls to protect list and objects
+=================================================
+
+This section describes how to use hlist_nulls to
+protect read-mostly linked lists and
+objects using SLAB_TYPESAFE_BY_RCU allocations.
+
+Please read the basics in Documentation/RCU/listRCU.rst
+
+Using special makers (called 'nulls') is a convenient way
+to solve following problem :
+
+A typical RCU linked list managing objects which are
+allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can
+use following algos :
+
+1) Lookup algo
+--------------
+
+::
+
+  rcu_read_lock()
+  begin:
+  obj = lockless_lookup(key);
+  if (obj) {
+    if (!try_get_ref(obj)) // might fail for free objects
+      goto begin;
+    /*
+     * Because a writer could delete object, and a writer could
+     * reuse these object before the RCU grace period, we
+     * must check key after getting the reference on object
+     */
+    if (obj->key != key) { // not the object we expected
+      put_ref(obj);
+      goto begin;
+    }
+  }
+  rcu_read_unlock();
+
+Beware that lockless_lookup(key) cannot use traditional hlist_for_each_entry_rcu()
+but a version with an additional memory barrier (smp_rmb())
+
+::
+
+  lockless_lookup(key)
+  {
+    struct hlist_node *node, *next;
+    for (pos = rcu_dereference((head)->first);
+         pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
+         ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
+         pos = rcu_dereference(next))
+      if (obj->key == key)
+        return obj;
+    return NULL;
+  }
+
+And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb()::
+
+  struct hlist_node *node;
+  for (pos = rcu_dereference((head)->first);
+       pos && ({ prefetch(pos->next); 1; }) &&
+       ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
+       pos = rcu_dereference(pos->next))
+    if (obj->key == key)
+      return obj;
+  return NULL;
+
+Quoting Corey Minyard::
+
+  "If the object is moved from one list to another list in-between the
+  time the hash is calculated and the next field is accessed, and the
+  object has moved to the end of a new list, the traversal will not
+  complete properly on the list it should have, since the object will
+  be on the end of the new list and there's not a way to tell it's on a
+  new list and restart the list traversal. I think that this can be
+  solved by pre-fetching the "next" field (with proper barriers) before
+  checking the key."
+
+2) Insert algo
+--------------
+
+We need to make sure a reader cannot read the new 'obj->obj_next' value
+and previous value of 'obj->key'. Or else, an item could be deleted
+from a chain, and inserted into another chain. If new chain was empty
+before the move, 'next' pointer is NULL, and lockless reader can
+not detect it missed following items in original chain.
+
+::
+
+  /*
+   * Please note that new inserts are done at the head of list,
+   * not in the middle or end.
+   */
+  obj = kmem_cache_alloc(...);
+  lock_chain(); // typically a spin_lock()
+  obj->key = key;
+  /*
+   * we need to make sure obj->key is updated before obj->next
+   * or obj->refcnt
+   */
+  smp_wmb();
+  atomic_set(&obj->refcnt, 1);
+  hlist_add_head_rcu(&obj->obj_node, list);
+  unlock_chain(); // typically a spin_unlock()
+
+
+3) Remove algo
+--------------
+Nothing special here, we can use a standard RCU hlist deletion.
+But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused
+very very fast (before the end of RCU grace period)
+
+::
+
+  if (put_last_reference_on(obj) {
+    lock_chain(); // typically a spin_lock()
+    hlist_del_init_rcu(&obj->obj_node);
+    unlock_chain(); // typically a spin_unlock()
+    kmem_cache_free(cachep, obj);
+  }
+
+
+
+--------------------------------------------------------------------------
+
+With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup()
+and extra smp_wmb() in insert function.
+
+For example, if we choose to store the slot number as the 'nulls'
+end-of-list marker for each slot of the hash table, we can detect
+a race (some writer did a delete and/or a move of an object
+to another chain) checking the final 'nulls' value if
+the lookup met the end of chain. If final 'nulls' value
+is not the slot number, then we must restart the lookup at
+the beginning. If the object was moved to the same chain,
+then the reader doesn't care : It might eventually
+scan the list again without harm.
+
+
+1) lookup algo
+--------------
+
+::
+
+  head = &table[slot];
+  rcu_read_lock();
+  begin:
+  hlist_nulls_for_each_entry_rcu(obj, node, head, member) {
+    if (obj->key == key) {
+      if (!try_get_ref(obj)) // might fail for free objects
+        goto begin;
+      if (obj->key != key) { // not the object we expected
+        put_ref(obj);
+        goto begin;
+      }
+      goto out;
+    }
+  /*
+   * if the nulls value we got at the end of this lookup is
+   * not the expected one, we must restart lookup.
+   * We probably met an item that was moved to another chain.
+   */
+  if (get_nulls_value(node) != slot)
+    goto begin;
+  obj = NULL;
+
+  out:
+  rcu_read_unlock();
+
+2) Insert function
+------------------
+
+::
+
+  /*
+   * Please note that new inserts are done at the head of list,
+   * not in the middle or end.
+   */
+  obj = kmem_cache_alloc(cachep);
+  lock_chain(); // typically a spin_lock()
+  obj->key = key;
+  /*
+   * changes to obj->key must be visible before refcnt one
+   */
+  smp_wmb();
+  atomic_set(&obj->refcnt, 1);
+  /*
+   * insert obj in RCU way (readers might be traversing chain)
+   */
+  hlist_nulls_add_head_rcu(&obj->obj_node, list);
+  unlock_chain(); // typically a spin_unlock()
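To make the 'nulls' check in the new document's lookup concrete, here is a minimal userspace sketch. It is single-threaded and uses no RCU, and it is not the kernel implementation. `is_a_nulls()` and `get_nulls_value()` borrow the names of the kernel helpers but are simplified re-implementations; `struct nulls_node`, `make_nulls()`, `walk_chain()` and `demo_detects_move()` are hypothetical names made up for this illustration. The sketch assumes the slot number is stored above a set low bit in the terminating pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace sketch only: no RCU, no concurrency, simplified helpers. */
struct nulls_node {
	struct nulls_node *next;	/* real pointer, or a tagged 'nulls' value */
};

struct obj {
	struct nulls_node node;		/* first member: node pointer == obj pointer */
	int key;
};

/* Encode a slot number as an end-of-list marker: low bit set, value above it. */
static struct nulls_node *make_nulls(unsigned long slot)
{
	return (struct nulls_node *)(uintptr_t)((slot << 1) | 1UL);
}

static int is_a_nulls(const struct nulls_node *p)
{
	return (int)((uintptr_t)p & 1UL);
}

static unsigned long get_nulls_value(const struct nulls_node *p)
{
	return (unsigned long)((uintptr_t)p >> 1);
}

/*
 * Walk a chain starting at 'pos'.
 * Returns  1 if 'key' was found,
 *          0 if the walk ended on the marker of 'slot' (key is absent),
 *         -1 if the walk ended on another slot's marker, meaning the
 *            reader drifted onto a different chain and must restart.
 */
static int walk_chain(const struct nulls_node *pos, unsigned long slot, int key)
{
	for (; !is_a_nulls(pos); pos = pos->next) {
		const struct obj *o = (const struct obj *)pos;

		if (o->key == key)
			return 1;
	}
	return get_nulls_value(pos) == slot ? 0 : -1;
}

/* Simulate a writer moving 'a' to an empty chain 7 while a reader sits on it. */
static int demo_detects_move(void)
{
	struct obj a = { { NULL }, 10 }, b = { { NULL }, 20 };
	const struct nulls_node *reader_pos;

	/* chain 3: a -> b -> nulls(3) */
	b.node.next = make_nulls(3);
	a.node.next = &b.node;

	reader_pos = &a.node;		/* reader is about to visit 'a' */
	a.node.next = make_nulls(7);	/* 'a' now terminates chain 7 */

	/* The reader never reaches 'b'; the foreign marker tells it so. */
	return walk_chain(reader_pos, 3, 20);
}
```

Calling `demo_detects_move()` yields -1: the walk ends on the marker for slot 7 rather than slot 3, which is the condition under which the documented lookup does `goto begin`. With a plain NULL terminator the same walk would be indistinguishable from "key not present".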

-172

Documentation/RCU/rculist_nulls.txt

@@ -1,172 +0,0 @@
-Using hlist_nulls to protect read-mostly linked lists and
-objects using SLAB_TYPESAFE_BY_RCU allocations.
-
-Please read the basics in Documentation/RCU/listRCU.rst
-
-Using special makers (called 'nulls') is a convenient way
-to solve following problem :
-
-A typical RCU linked list managing objects which are
-allocated with SLAB_TYPESAFE_BY_RCU kmem_cache can
-use following algos :
-
-1) Lookup algo
---------------
-rcu_read_lock()
-begin:
-obj = lockless_lookup(key);
-if (obj) {
-  if (!try_get_ref(obj)) // might fail for free objects
-    goto begin;
-  /*
-   * Because a writer could delete object, and a writer could
-   * reuse these object before the RCU grace period, we
-   * must check key after getting the reference on object
-   */
-  if (obj->key != key) { // not the object we expected
-     put_ref(obj);
-     goto begin;
-   }
-}
-rcu_read_unlock();
-
-Beware that lockless_lookup(key) cannot use traditional hlist_for_each_entry_rcu()
-but a version with an additional memory barrier (smp_rmb())
-
-lockless_lookup(key)
-{
-   struct hlist_node *node, *next;
-   for (pos = rcu_dereference((head)->first);
-          pos && ({ next = pos->next; smp_rmb(); prefetch(next); 1; }) &&
-          ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
-          pos = rcu_dereference(next))
-      if (obj->key == key)
-         return obj;
-   return NULL;
-
-And note the traditional hlist_for_each_entry_rcu() misses this smp_rmb() :
-
-   struct hlist_node *node;
-   for (pos = rcu_dereference((head)->first);
-		pos && ({ prefetch(pos->next); 1; }) &&
-		({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; });
-		pos = rcu_dereference(pos->next))
-      if (obj->key == key)
-         return obj;
-   return NULL;
-}
-
-Quoting Corey Minyard :
-
-"If the object is moved from one list to another list in-between the
- time the hash is calculated and the next field is accessed, and the
- object has moved to the end of a new list, the traversal will not
- complete properly on the list it should have, since the object will
- be on the end of the new list and there's not a way to tell it's on a
- new list and restart the list traversal. I think that this can be
- solved by pre-fetching the "next" field (with proper barriers) before
- checking the key."
-
-2) Insert algo :
-----------------
-
-We need to make sure a reader cannot read the new 'obj->obj_next' value
-and previous value of 'obj->key'. Or else, an item could be deleted
-from a chain, and inserted into another chain. If new chain was empty
-before the move, 'next' pointer is NULL, and lockless reader can
-not detect it missed following items in original chain.
-
-/*
- * Please note that new inserts are done at the head of list,
- * not in the middle or end.
- */
-obj = kmem_cache_alloc(...);
-lock_chain(); // typically a spin_lock()
-obj->key = key;
-/*
- * we need to make sure obj->key is updated before obj->next
- * or obj->refcnt
- */
-smp_wmb();
-atomic_set(&obj->refcnt, 1);
-hlist_add_head_rcu(&obj->obj_node, list);
-unlock_chain(); // typically a spin_unlock()
-
-
-3) Remove algo
---------------
-Nothing special here, we can use a standard RCU hlist deletion.
-But thanks to SLAB_TYPESAFE_BY_RCU, beware a deleted object can be reused
-very very fast (before the end of RCU grace period)
-
-if (put_last_reference_on(obj) {
-   lock_chain(); // typically a spin_lock()
-   hlist_del_init_rcu(&obj->obj_node);
-   unlock_chain(); // typically a spin_unlock()
-   kmem_cache_free(cachep, obj);
-}
-
-
-
---------------------------------------------------------------------------
-With hlist_nulls we can avoid extra smp_rmb() in lockless_lookup()
-and extra smp_wmb() in insert function.
-
-For example, if we choose to store the slot number as the 'nulls'
-end-of-list marker for each slot of the hash table, we can detect
-a race (some writer did a delete and/or a move of an object
-to another chain) checking the final 'nulls' value if
-the lookup met the end of chain. If final 'nulls' value
-is not the slot number, then we must restart the lookup at
-the beginning. If the object was moved to the same chain,
-then the reader doesn't care : It might eventually
-scan the list again without harm.
-
-
-1) lookup algo
-
- head = &table[slot];
- rcu_read_lock();
-begin:
- hlist_nulls_for_each_entry_rcu(obj, node, head, member) {
-   if (obj->key == key) {
-      if (!try_get_ref(obj)) // might fail for free objects
-         goto begin;
-      if (obj->key != key) { // not the object we expected
-         put_ref(obj);
-         goto begin;
-      }
-  goto out;
- }
-/*
- * if the nulls value we got at the end of this lookup is
- * not the expected one, we must restart lookup.
- * We probably met an item that was moved to another chain.
- */
- if (get_nulls_value(node) != slot)
-   goto begin;
- obj = NULL;
-
-out:
- rcu_read_unlock();
-
-2) Insert function :
---------------------
-
-/*
- * Please note that new inserts are done at the head of list,
- * not in the middle or end.
- */
-obj = kmem_cache_alloc(cachep);
-lock_chain(); // typically a spin_lock()
-obj->key = key;
-/*
- * changes to obj->key must be visible before refcnt one
- */
-smp_wmb();
-atomic_set(&obj->refcnt, 1);
-/*
- * insert obj in RCU way (readers might be traversing chain)
- */
-hlist_nulls_add_head_rcu(&obj->obj_node, list);
-unlock_chain(); // typically a spin_unlock()

+1 -1

include/linux/rculist_nulls.h

@@ -162,7 +162,7 @@
  * The barrier() is needed to make sure compiler doesn't cache first element [1],
  * as this loop can be restarted [2]
  * [1] Documentation/core-api/atomic_ops.rst around line 114
- * [2] Documentation/RCU/rculist_nulls.txt around line 146
+ * [2] Documentation/RCU/rculist_nulls.rst around line 146
  */
 #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member)			\
 	for (({barrier();}),							\

+2 -2

net/core/sock.c

@@ -1973,7 +1973,7 @@
 
 		/*
 		 * Before updating sk_refcnt, we must commit prior changes to memory
-		 * (Documentation/RCU/rculist_nulls.txt for details)
+		 * (Documentation/RCU/rculist_nulls.rst for details)
 		 */
 		smp_wmb();
 		refcount_set(&newsk->sk_refcnt, 2);
@@ -3035,7 +3035,7 @@
 	sk_rx_queue_clear(sk);
 	/*
 	 * Before updating sk_refcnt, we must commit prior changes to memory
-	 * (Documentation/RCU/rculist_nulls.txt for details)
+	 * (Documentation/RCU/rculist_nulls.rst for details)
 	 */
 	smp_wmb();
 	refcount_set(&sk->sk_refcnt, 1);
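The comments touched in net/core/sock.c describe the same ordering as the document's insert algorithm: fields such as the key must be committed before the reference count becomes nonzero, because a reader only trusts an object after taking a reference. The following is a rough userspace sketch of that pattern using C11 atomics. It is not kernel code; every name prefixed `sketch_` is hypothetical. The release store stands in for the `smp_wmb()` plus `refcount_set()` pair, which is an approximation assumed for illustration rather than an exact equivalent.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical object: refcnt == 0 means "free, possibly being recycled". */
struct sketch_obj {
	int key;
	atomic_int refcnt;
};

/* Writer side: fill in the fields first, then publish via the refcount. */
static void sketch_init(struct sketch_obj *o, int key)
{
	o->key = key;
	/* Release ordering: the key store is visible before refcnt != 0. */
	atomic_store_explicit(&o->refcnt, 1, memory_order_release);
}

/* Reader side: take a reference only if the object is live. */
static bool sketch_try_get_ref(struct sketch_obj *o)
{
	int old = atomic_load_explicit(&o->refcnt, memory_order_relaxed);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop retries. */
		if (atomic_compare_exchange_weak_explicit(&o->refcnt, &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
	return false;	/* free object: caller restarts its lookup */
}

static void sketch_put_ref(struct sketch_obj *o)
{
	atomic_fetch_sub_explicit(&o->refcnt, 1, memory_order_release);
}

/* True only if a reference was taken AND the key still matches. */
static bool sketch_get_if_match(struct sketch_obj *o, int key)
{
	if (!sketch_try_get_ref(o))
		return false;
	if (o->key != key) {	/* recycled under a different key */
		sketch_put_ref(o);
		return false;
	}
	return true;
}

/* Single-threaded walk-through; returns 1 when every step behaves as expected. */
static int sketch_demo(void)
{
	struct sketch_obj o;
	bool hit, miss;

	o.key = 0;
	atomic_init(&o.refcnt, 0);
	if (sketch_get_if_match(&o, 42))	/* free object must be refused */
		return -1;

	sketch_init(&o, 42);
	hit = sketch_get_if_match(&o, 42);	/* refcnt goes 1 -> 2 */
	miss = sketch_get_if_match(&o, 7);	/* 2 -> 3 -> 2, key mismatch */

	return hit && !miss && atomic_load(&o.refcnt) == 2;
}
```

The key is rechecked after the reference is taken for the reason given in the document: with SLAB_TYPESAFE_BY_RCU an object can be freed and reused before the grace period ends, so a successful reference alone does not prove it is the object the reader was looking for.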
