Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/mremap: check map count under mmap write lock and abstract

We are checking the mmap count in check_mremap_params(), prior to
obtaining an mmap write lock, which means that accesses to
current->mm->map_count might race with this field being updated.

Resolve this by only checking this field after the mmap write lock is held.

Additionally, abstract this check into a helper function with extensive
ASCII documentation of what's going on.

Link: https://lkml.kernel.org/r/18be0b48eaa8e8804eb745974ee729c3ade0c687.1773249037.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Reported-by: Jianzhou Zhao <luckd0g@163.com>
Closes: https://lore.kernel.org/all/1a7d4c26.6b46.19cdbe7eaf0.Coremail.luckd0g@163.com/
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
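
The ordering problem described in the message can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: `struct toy_mm`, `toy_add_mappings_racy()` and `toy_add_mappings_locked()` are hypothetical names, and a pthread mutex stands in for the mmap write lock. It contrasts a limit check made before the lock is taken with one made while the lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct mm_struct; not a kernel type. */
struct toy_mm {
	pthread_mutex_t lock;	/* plays the role of the mmap write lock */
	int map_count;		/* only stable while lock is held */
};

/*
 * Racy ordering: map_count is read before the lock is taken, so another
 * thread may change it between the check and the update.
 */
static bool toy_add_mappings_racy(struct toy_mm *mm, int extra, int limit)
{
	if (mm->map_count + extra > limit)	/* unlocked read */
		return false;

	pthread_mutex_lock(&mm->lock);
	mm->map_count += extra;		/* may exceed limit by now */
	pthread_mutex_unlock(&mm->lock);
	return true;
}

/* Fixed ordering: take the lock first, then check, then update. */
static bool toy_add_mappings_locked(struct toy_mm *mm, int extra, int limit)
{
	bool ok;

	pthread_mutex_lock(&mm->lock);
	ok = mm->map_count + extra <= limit;
	if (ok)
		mm->map_count += extra;
	pthread_mutex_unlock(&mm->lock);
	return ok;
}
```

In single-threaded use both variants behave the same; the difference only shows when a second thread modifies `map_count` between the unlocked read and the locked update, which is the window the patch closes by moving the check after `mmap_write_lock_killable()`.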

Authored by Lorenzo Stoakes (Oracle) and committed by Andrew Morton
0289955f 2d1e54aa

+75 -13
mm/mremap.c
···
 		mm->locked_vm += pages;
 }
 
+static bool __check_map_count_against_split(struct mm_struct *mm,
+					    bool before_unmaps)
+{
+	const int sys_map_count = get_sysctl_max_map_count();
+	int map_count = mm->map_count;
+
+	mmap_assert_write_locked(mm);
+
+	/*
+	 * At the point of shrinking the VMA, if new_len < old_len, we unmap
+	 * thusly in the worst case:
+	 *
+	 *              old_addr+old_len                    old_addr+old_len
+	 * |---------------.----.---------|    |---------------|    |---------|
+	 * |               .    .         | -> |      +1       | -1 |   +1    |
+	 * |---------------.----.---------|    |---------------|    |---------|
+	 *         old_addr+new_len                    old_addr+new_len
+	 *
+	 * At the point of removing the portion of an existing VMA to make space
+	 * for the moved VMA if MREMAP_FIXED, we unmap thusly in the worst case:
+	 *
+	 *   new_addr   new_addr+new_len         new_addr   new_addr+new_len
+	 * |----.---------------.---------|    |----|               |---------|
+	 * |    .               .         | -> | +1 |      -1       |   +1    |
+	 * |----.---------------.---------|    |----|               |---------|
+	 *
+	 * Therefore, before we consider the move anything, we have to account
+	 * for 2 additional VMAs possibly being created upon these unmappings.
+	 */
+	if (before_unmaps)
+		map_count += 2;
+
+	/*
+	 * At the point of MOVING the VMA:
+	 *
+	 * We start by copying a VMA, which creates an additional VMA if no
+	 * merge occurs, then if not MREMAP_DONTUNMAP, we unmap the source VMA.
+	 * In the worst case we might then observe:
+	 *
+	 *   new_addr   new_addr+new_len         new_addr   new_addr+new_len
+	 * |----|               |---------|    |----|---------------|---------|
+	 * |    |               |         | -> |    |      +1       |         |
+	 * |----|               |---------|    |----|---------------|---------|
+	 *
+	 *   old_addr   old_addr+old_len         old_addr   old_addr+old_len
+	 * |----.---------------.---------|    |----|               |---------|
+	 * |    .               .         | -> | +1 |      -1       |   +1    |
+	 * |----.---------------.---------|    |----|               |---------|
+	 *
+	 * Therefore we must check to ensure we have headroom of 2 additional
+	 * VMAs.
+	 */
+	return map_count + 2 <= sys_map_count;
+}
+
+/* Do we violate the map count limit if we split VMAs when moving the VMA? */
+static bool check_map_count_against_split(void)
+{
+	return __check_map_count_against_split(current->mm,
+					       /*before_unmaps=*/false);
+}
+
+/* Do we violate the map count limit if we split VMAs prior to early unmaps? */
+static bool check_map_count_against_split_early(void)
+{
+	return __check_map_count_against_split(current->mm,
+					       /*before_unmaps=*/true);
+}
+
 /*
  * Perform checks before attempting to write a VMA prior to it being
  * moved.
···
 	 * which may not merge, then (if MREMAP_DONTUNMAP is not set) unmap the
 	 * source, which may split, causing a net increase of 2 mappings.
 	 */
-	if (current->mm->map_count + 2 > get_sysctl_max_map_count())
+	if (!check_map_count_against_split())
 		return -ENOMEM;
 
 	if (vma->vm_ops && vma->vm_ops->may_split) {
···
 	if (vrm_overlaps(vrm))
 		return -EINVAL;
 
-	/*
-	 * We may unmap twice before invoking move_vma(), that is if new_len <
-	 * old_len (shrinking), and in the MREMAP_FIXED case, unmapping part of
-	 * a VMA located at the destination.
-	 *
-	 * In the worst case, both unmappings will cause splits, resulting in a
-	 * net increased map count of 2. In move_vma() we check for headroom of
-	 * 2 additional mappings, so check early to avoid bailing out then.
-	 */
-	if (current->mm->map_count + 4 > get_sysctl_max_map_count())
-		return -ENOMEM;
-
 	return 0;
 }
 
···
 	if (mmap_write_lock_killable(mm))
 		return -EINTR;
 	vrm->mmap_locked = true;
+
+	if (!check_map_count_against_split_early()) {
+		mmap_write_unlock(mm);
+		return -ENOMEM;
+	}
 
 	if (vrm_move_only(vrm)) {
 		res = remap_move(vrm);
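
The headroom arithmetic in `__check_map_count_against_split()` can be checked in isolation. Below is a small userspace model, offered as a sketch only: `model_check_map_count()` and its explicit `max_map_count` parameter are illustrative names, whereas the kernel helper reads the limit via `get_sysctl_max_map_count()` and asserts that the mmap write lock is held.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of the headroom arithmetic in the patch. The function name
 * and the max_map_count parameter are assumptions made for illustration.
 */
static bool model_check_map_count(int map_count, int max_map_count,
				  bool before_unmaps)
{
	/*
	 * Early check: the shrink unmap and the MREMAP_FIXED unmap may each
	 * split a VMA (+1 -1 +1), a net increase of 1 apiece.
	 */
	if (before_unmaps)
		map_count += 2;

	/*
	 * Move: copying adds a VMA if no merge occurs (+1), and unmapping
	 * the source may split it (net +1).
	 */
	return map_count + 2 <= max_map_count;
}
```

With a hypothetical limit of 100, the early variant passes at a map count of 96 and fails at 97, matching the removed `map_count + 4 > limit` test, while the move-time variant passes at 98 and fails at 99, matching the removed `map_count + 2 > limit` test.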