Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: fix swapops.h:131 bug if remap_file_pages raced migration

Add remove_linear_migration_ptes_from_nonlinear(), to fix an interesting
little include/linux/swapops.h:131 BUG_ON(!PageLocked) found by trinity:
indicating that remove_migration_ptes() failed to find one of the
migration entries that was temporarily inserted.

The problem comes from remap_file_pages()'s switch from vma_interval_tree
(good for inserting the migration entry) to i_mmap_nonlinear list (no good
for locating it again); but can only be a problem if the remap_file_pages()
range does not cover the whole of the vma (zap_pte() clears the range).

remove_migration_ptes() needs a file_nonlinear method to go down the
i_mmap_nonlinear list, applying linear location to look for migration
entries in those vmas too, just in case there was this race.

The file_nonlinear method does need rmap_walk_control.arg to do this;
but it never needed vma passed in - vma comes from its own iteration.

Reported-and-tested-by: Dave Jones <davej@redhat.com>
Reported-and-tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Hugh Dickins and committed by Linus Torvalds
7e09e738 3fb725c4

+36 -4
+1 -2
include/linux/rmap.h
···
 	int (*rmap_one)(struct page *page, struct vm_area_struct *vma,
 					unsigned long addr, void *arg);
 	int (*done)(struct page *page);
-	int (*file_nonlinear)(struct page *, struct address_space *,
-					struct vm_area_struct *vma);
+	int (*file_nonlinear)(struct page *, struct address_space *, void *arg);
 	struct anon_vma *(*anon_lock)(struct page *page);
 	bool (*invalid_vma)(struct vm_area_struct *vma, void *arg);
 };
+32
mm/migrate.c
···
 }

 /*
+ * Congratulations to trinity for discovering this bug.
+ * mm/fremap.c's remap_file_pages() accepts any range within a single vma to
+ * convert that vma to VM_NONLINEAR; and generic_file_remap_pages() will then
+ * replace the specified range by file ptes throughout (maybe populated after).
+ * If page migration finds a page within that range, while it's still located
+ * by vma_interval_tree rather than lost to i_mmap_nonlinear list, no problem:
+ * zap_pte() clears the temporary migration entry before mmap_sem is dropped.
+ * But if the migrating page is in a part of the vma outside the range to be
+ * remapped, then it will not be cleared, and remove_migration_ptes() needs to
+ * deal with it. Fortunately, this part of the vma is of course still linear,
+ * so we just need to use linear location on the nonlinear list.
+ */
+static int remove_linear_migration_ptes_from_nonlinear(struct page *page,
+				struct address_space *mapping, void *arg)
+{
+	struct vm_area_struct *vma;
+	/* hugetlbfs does not support remap_pages, so no huge pgoff worries */
+	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	unsigned long addr;
+
+	list_for_each_entry(vma,
+		&mapping->i_mmap_nonlinear, shared.nonlinear) {
+
+		addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+		if (addr >= vma->vm_start && addr < vma->vm_end)
+			remove_migration_pte(page, vma, addr, arg);
+	}
+	return SWAP_AGAIN;
+}
+
+/*
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
  */
···
 	struct rmap_walk_control rwc = {
 		.rmap_one = remove_migration_pte,
 		.arg = old,
+		.file_nonlinear = remove_linear_migration_ptes_from_nonlinear,
 	};

 	rmap_walk(new, &rwc);
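
Reviewer's note: the "linear location" computed in remove_linear_migration_ptes_from_nonlinear() can be tried out in user space. The following is only a sketch, not kernel code: `struct sketch_vma`, `sketch_linear_addr()` and `SKETCH_PAGE_SHIFT` are hypothetical names, and a 4 KiB page size is assumed. It shows that the range check also rejects pages whose offset lies before `vm_pgoff`, since the unsigned subtraction then yields an address below `vm_start` for typical values.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the three vm_area_struct fields used here. */
struct sketch_vma {
	unsigned long vm_start;	/* first byte of the mapping */
	unsigned long vm_end;	/* one past the last byte of the mapping */
	unsigned long vm_pgoff;	/* file page offset mapped at vm_start */
};

/*
 * Compute where file page 'pgoff' would sit in 'vma' if the vma were
 * still linear. Returns true and stores the address in '*addr' only if
 * that address falls inside [vm_start, vm_end).
 */
static bool sketch_linear_addr(const struct sketch_vma *vma,
			       unsigned long pgoff, unsigned long *addr)
{
	unsigned long a = vma->vm_start +
			  ((pgoff - vma->vm_pgoff) << SKETCH_PAGE_SHIFT);

	if (a < vma->vm_start || a >= vma->vm_end)
		return false;
	*addr = a;
	return true;
}
```

With a vma covering 0x10000..0x14000 at file page offset 4, page 5 maps to 0x11000, while pages 3 and 8 fall outside the vma and are skipped, which mirrors the `addr >= vma->vm_start && addr < vma->vm_end` test in the patch.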
+3 -2
mm/rmap.c
···
 }

 static int try_to_unmap_nonlinear(struct page *page,
-		struct address_space *mapping, struct vm_area_struct *vma)
+		struct address_space *mapping, void *arg)
 {
+	struct vm_area_struct *vma;
 	int ret = SWAP_AGAIN;
 	unsigned long cursor;
 	unsigned long max_nl_cursor = 0;
···
 	if (list_empty(&mapping->i_mmap_nonlinear))
 		goto done;

-	ret = rwc->file_nonlinear(page, mapping, vma);
+	ret = rwc->file_nonlinear(page, mapping, rwc->arg);

 done:
 	mutex_unlock(&mapping->i_mmap_mutex);
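
Reviewer's note: the signature change to file_nonlinear follows the usual "callback plus opaque argument" pattern. The sketch below is a user-space illustration under stated assumptions, not the kernel's implementation: every `sketch_*` name is hypothetical, the mapping is reduced to a counter of nonlinear vmas, and the NULL check on the callback is an assumption of this sketch. It shows the walker forwarding `rwc->arg` untouched, so that each callback decides what the argument means.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-ins for the kernel structures. */
struct sketch_page { int id; };
struct sketch_mapping { int nr_nonlinear; };	/* number of nonlinear vmas */

struct sketch_walk_control {
	void *arg;	/* opaque, interpreted only by the callbacks */
	int (*file_nonlinear)(struct sketch_page *page,
			      struct sketch_mapping *mapping, void *arg);
};

/* Example callback: treats 'arg' as an int counter of vmas visited. */
static int sketch_count_nonlinear(struct sketch_page *page,
				  struct sketch_mapping *mapping, void *arg)
{
	int *visited = arg;

	(void)page;	/* unused in this sketch */
	*visited += mapping->nr_nonlinear;
	return 0;
}

/* Walker: hands rwc->arg to the callback instead of a vma pointer. */
static int sketch_walk_file(struct sketch_page *page,
			    struct sketch_mapping *mapping,
			    struct sketch_walk_control *rwc)
{
	if (mapping->nr_nonlinear == 0 || rwc->file_nonlinear == NULL)
		return 0;
	return rwc->file_nonlinear(page, mapping, rwc->arg);
}
```

A callback that needs no argument, as try_to_unmap_nonlinear() does after this patch, can simply ignore `arg` and iterate with its own local vma.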