Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
mm: fix assertion mapping->nrpages == 0 in end_writeback()

Under heavy memory and filesystem load, users observe the assertion
mapping->nrpages == 0 in end_writeback() trigger. This can be caused by
page reclaim reclaiming the last page from a mapping in the following
race:

       CPU0                              CPU1
  ...
  shrink_page_list()
    __remove_mapping()
      __delete_from_page_cache()
        radix_tree_delete()
                                        evict_inode()
                                          truncate_inode_pages()
                                            truncate_inode_pages_range()
                                              pagevec_lookup() - finds nothing
                                          end_writeback()
                                            mapping->nrpages != 0 -> BUG
        page->mapping = NULL
        mapping->nrpages--

Fix the problem by doing a reliable check of mapping->nrpages under
mapping->tree_lock in end_writeback().

Analyzed by Jay <jinshan.xiong@whamcloud.com>, lost in LKML, and dug out
by Miklos Szeredi <mszeredi@suse.de>.

Cc: Jay <jinshan.xiong@whamcloud.com>
Cc: Miklos Szeredi <mszeredi@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Jan Kara and committed by Linus Torvalds
08142579 9b679320

+13
fs/inode.c (+7)

@@ -423,7 +423,14 @@
 void end_writeback(struct inode *inode)
 {
 	might_sleep();
+	/*
+	 * We have to cycle tree_lock here because reclaim can be still in the
+	 * process of removing the last page (in __delete_from_page_cache())
+	 * and we must not free mapping under it.
+	 */
+	spin_lock_irq(&inode->i_data.tree_lock);
 	BUG_ON(inode->i_data.nrpages);
+	spin_unlock_irq(&inode->i_data.tree_lock);
 	BUG_ON(!list_empty(&inode->i_data.private_list));
 	BUG_ON(!(inode->i_state & I_FREEING));
 	BUG_ON(inode->i_state & I_CLEAR);

include/linux/fs.h (+1)

@@ -639,6 +639,7 @@
 	struct prio_tree_root	i_mmap;		/* tree of private and shared mappings */
 	struct list_head	i_mmap_nonlinear;/*list VM_NONLINEAR mappings */
 	struct mutex		i_mmap_mutex;	/* protect tree, count, list */
+	/* Protected by tree_lock together with the radix tree */
 	unsigned long		nrpages;	/* number of total pages */
 	pgoff_t			writeback_index;/* writeback starts here */
 	const struct address_space_operations *a_ops;	/* methods */

mm/truncate.c (+5)

@@ -304,6 +304,11 @@
  * @lstart: offset from which to truncate
  *
  * Called under (and serialised by) inode->i_mutex.
+ *
+ * Note: When this function returns, there can be a page in the process of
+ * deletion (inside __delete_from_page_cache()) in the specified range. Thus
+ * mapping->nrpages can be non-zero when this function returns even after
+ * truncation of the whole mapping.
  */
 void truncate_inode_pages(struct address_space *mapping, loff_t lstart)
 {
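
The locking pattern in the fs/inode.c hunk can be illustrated outside the kernel. What follows is only a userspace sketch, not kernel code: `struct mapping_model`, `reclaim_last_page()`, `nrpages_locked()` and `run_race_model()` are hypothetical names, a pthread mutex stands in for `tree_lock`, and an atomic flag stands in for the radix tree slot. It assumes, as the commit message describes, that reclaim removes the slot and decrements `nrpages` while holding the lock. Under that assumption, a reader that takes the lock before reading the counter cannot observe the intermediate state in which the lookup finds nothing but `nrpages` is still 1.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct address_space. */
struct mapping_model {
	pthread_mutex_t tree_lock;  /* models mapping->tree_lock */
	atomic_bool slot_present;   /* models the radix tree slot of the last page */
	unsigned long nrpages;      /* only touched with tree_lock held */
};

/* Models reclaim in __delete_from_page_cache(): both steps under tree_lock. */
static void *reclaim_last_page(void *arg)
{
	struct mapping_model *m = arg;

	pthread_mutex_lock(&m->tree_lock);
	atomic_store(&m->slot_present, false);  /* radix_tree_delete() */
	for (int i = 0; i < 1000; i++)
		sched_yield();                  /* widen the race window */
	m->nrpages--;                           /* mapping->nrpages-- */
	pthread_mutex_unlock(&m->tree_lock);
	return NULL;
}

/* Models the patched check: take tree_lock, read nrpages, drop the lock. */
static unsigned long nrpages_locked(struct mapping_model *m)
{
	unsigned long n;

	pthread_mutex_lock(&m->tree_lock);
	n = m->nrpages;
	pthread_mutex_unlock(&m->tree_lock);
	return n;
}

/* Runs one reclaim/evict interleaving; returns nrpages seen by the evictor. */
static unsigned long run_race_model(void)
{
	struct mapping_model m = { .nrpages = 1 };
	pthread_t reclaim;
	unsigned long seen;
	int rc;

	pthread_mutex_init(&m.tree_lock, NULL);
	atomic_init(&m.slot_present, true);
	rc = pthread_create(&reclaim, NULL, reclaim_last_page, &m);
	assert(rc == 0);
	(void)rc;

	/* Lockless lookup (like pagevec_lookup()) repeated until it finds nothing. */
	while (atomic_load(&m.slot_present))
		sched_yield();

	seen = nrpages_locked(&m);  /* blocks until reclaim drops tree_lock */
	pthread_join(reclaim, NULL);
	pthread_mutex_destroy(&m.tree_lock);
	return seen;
}
```

In this model `run_race_model()` returns 0 on every run, because the locked read cannot proceed until the reclaim thread has finished its decrement. Reading `nrpages` without the lock at the same point could still see 1, which corresponds to the BUG in the race diagram.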