Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: limit the scope of vma_start_read()

Limit the scope of vma_start_read(): it is used only as a helper for the
higher-level locking functions implemented in mmap_lock.c, and we are
about to introduce more complex RCU rules for it. Move it out of
include/linux/mmap_lock.h and into mm/mmap_lock.c. This is pure code
movement with no functional change.

Link: https://lkml.kernel.org/r/20250804233349.1278678-1-surenb@google.com
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Suren Baghdasaryan and committed by Andrew Morton
cc483b32 35edbaa0

+85 -85
include/linux/mmap_lock.h (-85)
···
 }

 /*
- * Try to read-lock a vma. The function is allowed to occasionally yield false
- * locked result to avoid performance overhead, in which case we fall back to
- * using mmap_lock. The function should never yield false unlocked result.
- * False locked result is possible if mm_lock_seq overflows or if vma gets
- * reused and attached to a different mm before we lock it.
- * Returns the vma on success, NULL on failure to lock and EAGAIN if vma got
- * detached.
- *
- * WARNING! The vma passed to this function cannot be used if the function
- * fails to lock it because in certain cases RCU lock is dropped and then
- * reacquired. Once RCU lock is dropped the vma can be concurently freed.
- */
-static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm,
-						    struct vm_area_struct *vma)
-{
-	int oldcnt;
-
-	/*
-	 * Check before locking. A race might cause false locked result.
-	 * We can use READ_ONCE() for the mm_lock_seq here, and don't need
-	 * ACQUIRE semantics, because this is just a lockless check whose result
-	 * we don't rely on for anything - the mm_lock_seq read against which we
-	 * need ordering is below.
-	 */
-	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(mm->mm_lock_seq.sequence))
-		return NULL;
-
-	/*
-	 * If VMA_LOCK_OFFSET is set, __refcount_inc_not_zero_limited_acquire()
-	 * will fail because VMA_REF_LIMIT is less than VMA_LOCK_OFFSET.
-	 * Acquire fence is required here to avoid reordering against later
-	 * vm_lock_seq check and checks inside lock_vma_under_rcu().
-	 */
-	if (unlikely(!__refcount_inc_not_zero_limited_acquire(&vma->vm_refcnt, &oldcnt,
-							      VMA_REF_LIMIT))) {
-		/* return EAGAIN if vma got detached from under us */
-		return oldcnt ? NULL : ERR_PTR(-EAGAIN);
-	}
-
-	rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
-
-	/*
-	 * If vma got attached to another mm from under us, that mm is not
-	 * stable and can be freed in the narrow window after vma->vm_refcnt
-	 * is dropped and before rcuwait_wake_up(mm) is called. Grab it before
-	 * releasing vma->vm_refcnt.
-	 */
-	if (unlikely(vma->vm_mm != mm)) {
-		/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
-		struct mm_struct *other_mm = vma->vm_mm;
-
-		/*
-		 * __mmdrop() is a heavy operation and we don't need RCU
-		 * protection here. Release RCU lock during these operations.
-		 * We reinstate the RCU read lock as the caller expects it to
-		 * be held when this function returns even on error.
-		 */
-		rcu_read_unlock();
-		mmgrab(other_mm);
-		vma_refcount_put(vma);
-		mmdrop(other_mm);
-		rcu_read_lock();
-		return NULL;
-	}
-
-	/*
-	 * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
-	 * False unlocked result is impossible because we modify and check
-	 * vma->vm_lock_seq under vma->vm_refcnt protection and mm->mm_lock_seq
-	 * modification invalidates all existing locks.
-	 *
-	 * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
-	 * racing with vma_end_write_all(), we only start reading from the VMA
-	 * after it has been unlocked.
-	 * This pairs with RELEASE semantics in vma_end_write_all().
-	 */
-	if (unlikely(vma->vm_lock_seq == raw_read_seqcount(&mm->mm_lock_seq))) {
-		vma_refcount_put(vma);
-		return NULL;
-	}
-
-	return vma;
-}
-
-/*
  * Use only while holding mmap read lock which guarantees that locking will not
  * fail (nobody can concurrently write-lock the vma). vma_start_read() should
  * not be used in such cases because it might fail due to mm_lock_seq overflow.
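
The function comment describes a three-way return: the vma on success, NULL on failure to lock, and EAGAIN (encoded via ERR_PTR()) if the vma got detached. The following is a minimal userspace sketch of how such a pointer-encoded result can be told apart. All `sketch_*` and `SKETCH_*` names are hypothetical stand-ins written for illustration; they imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers but are not the kernel's code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: error codes occupy the top 4095 values of the address space. */
#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* True if the pointer value falls in the reserved error range. */
static inline int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Recover the errno value from an error pointer. */
static inline long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

enum sketch_outcome {
	SKETCH_OUT_LOCKED,	/* valid pointer: the object was locked */
	SKETCH_OUT_NOT_LOCKED,	/* NULL (or another error): locking failed */
	SKETCH_OUT_DETACHED,	/* ERR_PTR(-EAGAIN): object was detached */
};

/* Classify a return value that follows the convention described above. */
static enum sketch_outcome sketch_classify(const void *ret)
{
	if (sketch_is_err(ret))
		return sketch_ptr_err(ret) == -EAGAIN ? SKETCH_OUT_DETACHED
						      : SKETCH_OUT_NOT_LOCKED;
	if (ret == NULL)
		return SKETCH_OUT_NOT_LOCKED;
	return SKETCH_OUT_LOCKED;
}
```

A caller would pass the returned pointer to `sketch_classify()` and only dereference it in the `SKETCH_OUT_LOCKED` case, which matches the WARNING in the comment that the vma cannot be used after a failed lock attempt.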
mm/mmap_lock.c (+85)
···
 }

 /*
+ * Try to read-lock a vma. The function is allowed to occasionally yield false
+ * locked result to avoid performance overhead, in which case we fall back to
+ * using mmap_lock. The function should never yield false unlocked result.
+ * False locked result is possible if mm_lock_seq overflows or if vma gets
+ * reused and attached to a different mm before we lock it.
+ * Returns the vma on success, NULL on failure to lock and EAGAIN if vma got
+ * detached.
+ *
+ * WARNING! The vma passed to this function cannot be used if the function
+ * fails to lock it because in certain cases RCU lock is dropped and then
+ * reacquired. Once RCU lock is dropped the vma can be concurently freed.
+ */
+static inline struct vm_area_struct *vma_start_read(struct mm_struct *mm,
+						    struct vm_area_struct *vma)
+{
+	int oldcnt;
+
+	/*
+	 * Check before locking. A race might cause false locked result.
+	 * We can use READ_ONCE() for the mm_lock_seq here, and don't need
+	 * ACQUIRE semantics, because this is just a lockless check whose result
+	 * we don't rely on for anything - the mm_lock_seq read against which we
+	 * need ordering is below.
+	 */
+	if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(mm->mm_lock_seq.sequence))
+		return NULL;
+
+	/*
+	 * If VMA_LOCK_OFFSET is set, __refcount_inc_not_zero_limited_acquire()
+	 * will fail because VMA_REF_LIMIT is less than VMA_LOCK_OFFSET.
+	 * Acquire fence is required here to avoid reordering against later
+	 * vm_lock_seq check and checks inside lock_vma_under_rcu().
+	 */
+	if (unlikely(!__refcount_inc_not_zero_limited_acquire(&vma->vm_refcnt, &oldcnt,
+							      VMA_REF_LIMIT))) {
+		/* return EAGAIN if vma got detached from under us */
+		return oldcnt ? NULL : ERR_PTR(-EAGAIN);
+	}
+
+	rwsem_acquire_read(&vma->vmlock_dep_map, 0, 1, _RET_IP_);
+
+	/*
+	 * If vma got attached to another mm from under us, that mm is not
+	 * stable and can be freed in the narrow window after vma->vm_refcnt
+	 * is dropped and before rcuwait_wake_up(mm) is called. Grab it before
+	 * releasing vma->vm_refcnt.
+	 */
+	if (unlikely(vma->vm_mm != mm)) {
+		/* Use a copy of vm_mm in case vma is freed after we drop vm_refcnt */
+		struct mm_struct *other_mm = vma->vm_mm;
+
+		/*
+		 * __mmdrop() is a heavy operation and we don't need RCU
+		 * protection here. Release RCU lock during these operations.
+		 * We reinstate the RCU read lock as the caller expects it to
+		 * be held when this function returns even on error.
+		 */
+		rcu_read_unlock();
+		mmgrab(other_mm);
+		vma_refcount_put(vma);
+		mmdrop(other_mm);
+		rcu_read_lock();
+		return NULL;
+	}
+
+	/*
+	 * Overflow of vm_lock_seq/mm_lock_seq might produce false locked result.
+	 * False unlocked result is impossible because we modify and check
+	 * vma->vm_lock_seq under vma->vm_refcnt protection and mm->mm_lock_seq
+	 * modification invalidates all existing locks.
+	 *
+	 * We must use ACQUIRE semantics for the mm_lock_seq so that if we are
+	 * racing with vma_end_write_all(), we only start reading from the VMA
+	 * after it has been unlocked.
+	 * This pairs with RELEASE semantics in vma_end_write_all().
+	 */
+	if (unlikely(vma->vm_lock_seq == raw_read_seqcount(&mm->mm_lock_seq))) {
+		vma_refcount_put(vma);
+		return NULL;
+	}
+
+	return vma;
+}
+
+/*
  * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
  * stable and not isolated. If the VMA is not found or is being modified the
  * function returns NULL.
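
The moved function combines three steps: an unordered sequence pre-check, a bounded "increment unless zero" on the reference count, and an ordered sequence re-check once the reference is held. The following is a simplified userspace model of that pattern using C11 atomics. It is a sketch under stated assumptions, not the kernel implementation: every `sketch_*` and `SKETCH_*` name is hypothetical, the limit value is arbitrary, and the "vma attached to another mm" path, lockdep annotation, and RCU handling are deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit; plays the role that VMA_REF_LIMIT plays in the diff. */
#define SKETCH_REF_LIMIT 1000

struct sketch_mm {
	atomic_uint lock_seq;	/* bumped to invalidate all write locks */
};

struct sketch_vma {
	atomic_int refcnt;	/* 0 means detached */
	atomic_uint lock_seq;	/* equals mm->lock_seq while write-locked */
};

enum sketch_read {
	SKETCH_READ_LOCKED,
	SKETCH_READ_BUSY,	/* corresponds to the NULL return */
	SKETCH_READ_DETACHED,	/* corresponds to ERR_PTR(-EAGAIN) */
};

/* Increment *ref unless it is zero or at the limit; acquire on success. */
static bool sketch_ref_inc_limited(atomic_int *ref, int *oldp, int limit)
{
	int old = atomic_load_explicit(ref, memory_order_relaxed);

	do {
		if (old == 0 || old >= limit) {
			*oldp = old;
			return false;
		}
	} while (!atomic_compare_exchange_weak_explicit(ref, &old, old + 1,
							memory_order_acquire,
							memory_order_relaxed));
	*oldp = old;
	return true;
}

static enum sketch_read sketch_start_read(struct sketch_mm *mm,
					  struct sketch_vma *vma)
{
	int old;

	/* Cheap pre-check with no ordering; its result is only a hint. */
	if (atomic_load_explicit(&vma->lock_seq, memory_order_relaxed) ==
	    atomic_load_explicit(&mm->lock_seq, memory_order_relaxed))
		return SKETCH_READ_BUSY;

	/* Take a reference; a zero count means the object was detached. */
	if (!sketch_ref_inc_limited(&vma->refcnt, &old, SKETCH_REF_LIMIT))
		return old ? SKETCH_READ_BUSY : SKETCH_READ_DETACHED;

	/* Re-check with acquire ordering now that the reference is held. */
	if (atomic_load_explicit(&vma->lock_seq, memory_order_relaxed) ==
	    atomic_load_explicit(&mm->lock_seq, memory_order_acquire)) {
		atomic_fetch_sub_explicit(&vma->refcnt, 1, memory_order_release);
		return SKETCH_READ_BUSY;
	}

	return SKETCH_READ_LOCKED;
}
```

In this model a failed attempt never leaves the reference count changed, so a caller can fall back to a heavier lock without cleanup. The pre-check is kept separate from the ordered re-check for the reason given in the original comment: its result is not relied on, so it does not need ACQUIRE semantics.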