Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/vma: document possible vma->vm_refcnt values and reference comment

The possible vma->vm_refcnt values are confusing and vague, so explain in
detail what these can be in a comment describing the vma->vm_refcnt field,
and reference this comment in the various places that read/write this field.

No functional change intended.

[akpm@linux-foundation.org: fix typo, per Suren]
Link: https://lkml.kernel.org/r/d462e7678c6cc7461f94e5b26c776547d80a67e8.1769198904.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Waiman Long <longman@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Lorenzo Stoakes and committed by Andrew Morton
ef4c0cea 25faccd6

+53 -2
+40 -2
include/linux/mm_types.h
···
  * set the VM_REFCNT_EXCLUDE_READERS_FLAG in vma->vm_refcnt to indiciate to
  * vma_start_read() that the reference count should be left alone.
  *
- * Once the operation is complete, this value is subtracted from vma->vm_refcnt.
+ * See the comment describing vm_refcnt in vm_area_struct for details as to
+ * which values the VMA reference count can be.
  */
 #define VM_REFCNT_EXCLUDE_READERS_BIT (30)
 #define VM_REFCNT_EXCLUDE_READERS_FLAG (1U << VM_REFCNT_EXCLUDE_READERS_BIT)
···
 	struct vma_numab_state *numab_state; /* NUMA Balancing state */
 #endif
 #ifdef CONFIG_PER_VMA_LOCK
-	/* Unstable RCU readers are allowed to read this. */
+	/*
+	 * Used to keep track of firstly, whether the VMA is attached, secondly,
+	 * if attached, how many read locks are taken, and thirdly, if the
+	 * VM_REFCNT_EXCLUDE_READERS_FLAG is set, whether any read locks held
+	 * are currently in the process of being excluded.
+	 *
+	 * This value can be equal to:
+	 *
+	 * 0 - Detached. IMPORTANT: when the refcnt is zero, readers cannot
+	 * increment it.
+	 *
+	 * 1 - Attached and either unlocked or write-locked. Write locks are
+	 * identified via __is_vma_write_locked() which checks for equality of
+	 * vma->vm_lock_seq and mm->mm_lock_seq.
+	 *
+	 * >1, < VM_REFCNT_EXCLUDE_READERS_FLAG - Read-locked or (unlikely)
+	 * write-locked with other threads having temporarily incremented the
+	 * reference count prior to determining it is write-locked and
+	 * decrementing it again.
+	 *
+	 * VM_REFCNT_EXCLUDE_READERS_FLAG - Detached, pending
+	 * __vma_exit_locked() completion which will decrement the reference
+	 * count to zero. IMPORTANT - at this stage no further readers can
+	 * increment the reference count. It can only be reduced.
+	 *
+	 * VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - A thread is either write-locking
+	 * an attached VMA and has yet to invoke __vma_exit_locked(), OR a
+	 * thread is detaching a VMA and is waiting on a single spurious reader
+	 * in order to decrement the reference count. IMPORTANT - as above, no
+	 * further readers can increment the reference count.
+	 *
+	 * > VM_REFCNT_EXCLUDE_READERS_FLAG + 1 - A thread is either
+	 * write-locking or detaching a VMA is waiting on readers to
+	 * exit. IMPORTANT - as above, no further readers can increment the
+	 * reference count.
+	 *
+	 * NOTE: Unstable RCU readers are allowed to read this.
+	 */
 	refcount_t vm_refcnt ____cacheline_aligned_in_smp;
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	struct lockdep_map vmlock_dep_map;
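As a reading aid, the six value ranges listed in the new vm_refcnt comment can be sketched as a small userspace classifier. This is only an illustration under stated assumptions: the two macros mirror the definitions shown in the diff, but the enum and the function classify_vm_refcnt() are hypothetical names that do not exist in the kernel, and the sketch ignores refcount_t saturation handling.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the definitions in include/linux/mm_types.h shown in the diff. */
#define VM_REFCNT_EXCLUDE_READERS_BIT  (30)
#define VM_REFCNT_EXCLUDE_READERS_FLAG (1U << VM_REFCNT_EXCLUDE_READERS_BIT)

static_assert(VM_REFCNT_EXCLUDE_READERS_FLAG == 0x40000000u,
	      "flag occupies bit 30");

/* Hypothetical names, for illustration only; not kernel identifiers. */
enum vma_refcnt_state {
	VMA_STATE_DETACHED,             /* 0 */
	VMA_STATE_ATTACHED_NO_READERS,  /* 1: unlocked or write-locked */
	VMA_STATE_READ_LOCKED,          /* >1, < FLAG (or transient readers) */
	VMA_STATE_EXCLUDING_DETACHED,   /* FLAG: pending __vma_exit_locked() */
	VMA_STATE_EXCLUDING_ONE_REF,    /* FLAG + 1 */
	VMA_STATE_EXCLUDING_WAITING,    /* > FLAG + 1: waiting on readers */
};

/* Map a raw reference count value onto the ranges listed in the comment. */
static enum vma_refcnt_state classify_vm_refcnt(uint32_t refcnt)
{
	if (refcnt == 0)
		return VMA_STATE_DETACHED;
	if (refcnt == 1)
		return VMA_STATE_ATTACHED_NO_READERS;
	if (refcnt < VM_REFCNT_EXCLUDE_READERS_FLAG)
		return VMA_STATE_READ_LOCKED;
	if (refcnt == VM_REFCNT_EXCLUDE_READERS_FLAG)
		return VMA_STATE_EXCLUDING_DETACHED;
	if (refcnt == VM_REFCNT_EXCLUDE_READERS_FLAG + 1)
		return VMA_STATE_EXCLUDING_ONE_REF;
	return VMA_STATE_EXCLUDING_WAITING;
}
```

For example, classify_vm_refcnt(VM_REFCNT_EXCLUDE_READERS_FLAG + 3) falls into the last range, which the comment describes as a writer or detacher waiting on readers to exit.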
+7
include/linux/mmap_lock.h
···
 	 * attached. Waiting on a detached vma happens only in
 	 * vma_mark_detached() and is a rare case, therefore most of the time
 	 * there will be no unnecessary wakeup.
+	 *
+	 * See the comment describing the vm_area_struct->vm_refcnt field for
+	 * details of possible refcnt values.
 	 */
 	return (refcnt & VM_REFCNT_EXCLUDE_READERS_FLAG) &&
 		refcnt <= VM_REFCNT_EXCLUDE_READERS_FLAG + 1;
···
 {
 	unsigned int mm_lock_seq;
 
+	/*
+	 * See the comment describing the vm_area_struct->vm_refcnt field for
+	 * details of possible refcnt values.
+	 */
 	VM_BUG_ON_VMA(refcount_read(&vma->vm_refcnt) <= 1 &&
 		      !__is_vma_write_locked(vma, &mm_lock_seq), vma);
 }
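The return expression in the first mmap_lock.h hunk can be tried in isolation to see which refcnt values it accepts. The sketch below is a userspace illustration only: the diff shows just the return statement, so the function name sketch_flag_set_at_most_one_ref() is hypothetical, and the macros are copied from the mm_types.h hunk.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the definitions in include/linux/mm_types.h shown in the diff. */
#define VM_REFCNT_EXCLUDE_READERS_BIT  (30)
#define VM_REFCNT_EXCLUDE_READERS_FLAG (1U << VM_REFCNT_EXCLUDE_READERS_BIT)

static_assert(VM_REFCNT_EXCLUDE_READERS_FLAG + 1 > VM_REFCNT_EXCLUDE_READERS_FLAG,
	      "FLAG + 1 must not wrap");

/*
 * Hypothetical name: the diff does not show the enclosing function.
 * True when the exclude-readers flag is set and at most one other
 * reference remains on top of it.
 */
static bool sketch_flag_set_at_most_one_ref(unsigned int refcnt)
{
	return (refcnt & VM_REFCNT_EXCLUDE_READERS_FLAG) &&
	       refcnt <= VM_REFCNT_EXCLUDE_READERS_FLAG + 1;
}
```

Under these assumptions the predicate holds only for VM_REFCNT_EXCLUDE_READERS_FLAG and VM_REFCNT_EXCLUDE_READERS_FLAG + 1, the two flagged values in the vm_refcnt comment that do not involve waiting on multiple readers.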
+6
mm/mmap_lock.c
···
 	/*
 	 * If vma is detached then only vma_mark_attached() can raise the
 	 * vm_refcnt. mmap_write_lock prevents racing with vma_mark_attached().
+	 *
+	 * See the comment describing the vm_area_struct->vm_refcnt field for
+	 * details of possible refcnt values.
 	 */
 	if (!refcount_add_not_zero(VM_REFCNT_EXCLUDE_READERS_FLAG, &vma->vm_refcnt))
 		return 0;
···
 	 * before they check vm_lock_seq, realize the vma is locked and drop
 	 * back the vm_refcnt. That is a narrow window for observing a raised
 	 * vm_refcnt.
+	 *
+	 * See the comment describing the vm_area_struct->vm_refcnt field for
+	 * details of possible refcnt values.
 	 */
 	if (unlikely(!refcount_dec_and_test(&vma->vm_refcnt))) {
 		/* Wait until vma is detached with no readers. */
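The two refcount operations referenced in the mm/mmap_lock.c hunks can be approximated in userspace with C11 atomics to show why a detached VMA (refcnt 0) never gains the flag and why the final decrement is the one that reports zero. This is a rough sketch, not the kernel implementation: sketch_add_not_zero() and sketch_dec_and_test() are hypothetical stand-ins for refcount_add_not_zero() and refcount_dec_and_test(), and the saturation and overflow checks that refcount_t performs are deliberately omitted.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Mirrors the definitions in include/linux/mm_types.h shown in the diff. */
#define VM_REFCNT_EXCLUDE_READERS_BIT  (30)
#define VM_REFCNT_EXCLUDE_READERS_FLAG (1U << VM_REFCNT_EXCLUDE_READERS_BIT)

static_assert(VM_REFCNT_EXCLUDE_READERS_BIT < 31, "flag must fit in 32 bits");

/*
 * Hypothetical stand-in for refcount_add_not_zero(): add 'add' to *ref
 * unless *ref is zero. Returns false (and leaves *ref untouched) when
 * the count is zero, which models a detached VMA.
 */
static bool sketch_add_not_zero(unsigned int add, atomic_uint *ref)
{
	unsigned int old = atomic_load(ref);

	do {
		if (old == 0)
			return false;
		/* On CAS failure 'old' is reloaded and re-checked. */
	} while (!atomic_compare_exchange_weak(ref, &old, old + add));

	return true;
}

/*
 * Hypothetical stand-in for refcount_dec_and_test(): decrement *ref and
 * return true only if the new value is zero.
 */
static bool sketch_dec_and_test(atomic_uint *ref)
{
	return atomic_fetch_sub(ref, 1) == 1;
}
```

With these stand-ins, adding the flag to a count of 0 fails, while adding it to a count of 1 yields VM_REFCNT_EXCLUDE_READERS_FLAG + 1, matching the "write-locking an attached VMA" value in the vm_refcnt comment.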