Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

f2fs: fix IS_CHECKPOINTED flag inconsistency issue caused by concurrent atomic commit and checkpoint writes

During SPO (sudden power-off) tests, when mounting F2FS, an -EINVAL error
was returned from f2fs_recover_inode_page. The issue occurred under the
following scenario:

Thread A                                     Thread B
f2fs_ioc_commit_atomic_write
 - f2fs_do_sync_file // atomic = true
  - f2fs_fsync_node_pages
    : last_folio = inode folio
    : schedule before folio_lock(last_folio)
                                             f2fs_write_checkpoint
                                              - block_operations // writeback last_folio
                                              - schedule before f2fs_flush_nat_entries
    : set_fsync_mark(last_folio, 1)
    : set_dentry_mark(last_folio, 1)
    : folio_mark_dirty(last_folio)
    - __write_node_folio(last_folio)
      : f2fs_down_read(&sbi->node_write) // block
                                              - f2fs_flush_nat_entries
                                                : {struct nat_entry}->flag |= BIT(IS_CHECKPOINTED)
                                              - unblock_operations
                                                : f2fs_up_write(&sbi->node_write)
                                             f2fs_write_checkpoint // return
      : f2fs_do_write_node_page()
f2fs_ioc_commit_atomic_write // return
SPO

When Thread A calls f2fs_need_dentry_mark(sbi, ino), last_folio has
already been written once. However, {struct nat_entry}->flag does not
yet have IS_CHECKPOINTED set, so Thread A calls
set_dentry_mark(last_folio, 1) and writes last_folio again after
Thread B finishes f2fs_write_checkpoint.
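
The stale check described above can be replayed in a single thread. The following is only a minimal sketch: `struct nat_model`, `struct folio_model`, `need_dentry_mark_model` and `replay_buggy_interleaving` are hypothetical names, and the check is reduced to "mark needed while not checkpointed", which is an assumed simplification of f2fs_need_dentry_mark.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the inode's NAT entry state. */
struct nat_model {
	bool checkpointed;	/* models IS_CHECKPOINTED */
};

/* Hypothetical stand-in for the inode folio: footer mark and write count. */
struct folio_model {
	bool dentry_mark;
	int writes;		/* how many times the folio was written out */
};

/* Assumed simplification: a dentry mark is needed only while the
 * entry has not been checkpointed. */
static bool need_dentry_mark_model(const struct nat_model *ne)
{
	return !ne->checkpointed;
}

/* Replays the interleaving from the commit message step by step and
 * returns the dentry mark carried by the second write of the folio. */
static bool replay_buggy_interleaving(void)
{
	struct nat_model ne = { .checkpointed = false };
	struct folio_model last = { .dentry_mark = false, .writes = 0 };

	/* Thread B: block_operations writes back last_folio. */
	last.writes++;

	/* Thread A: decides the mark before the NAT entries are flushed. */
	last.dentry_mark = need_dentry_mark_model(&ne);

	/* Thread B: f2fs_flush_nat_entries, then unblock_operations. */
	ne.checkpointed = true;

	/* Thread A: node_write is free again, the folio is written a
	 * second time and still carries the mark decided earlier. */
	last.writes++;
	assert(last.writes == 2);

	return last.dentry_mark;
}
```

In this model the second write carries a dentry mark even though the entry is already checkpointed by the time the write is issued, which is the inconsistency the commit message describes.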

After SPO and reboot, it was detected that {struct node_info}->blk_addr
was not NULL_ADDR because Thread B successfully wrote the checkpoint.
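
As a rough illustration of why a non-NULL block address leads to the mount failure, the check can be modelled as below. This is a sketch under assumptions: `recover_inode_model` and `NULL_ADDR_MODEL` are hypothetical names, and the value 0 for the null address is an assumption of this model, not something stated in the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed stand-in for NULL_ADDR; the value is an assumption of this model. */
#define NULL_ADDR_MODEL ((uint32_t)0)

/* Hypothetical model: recovering an inode folio that carries a dentry
 * mark expects that the checkpoint recorded no block address for it. */
static int recover_inode_model(uint32_t checkpointed_blk_addr)
{
	if (checkpointed_blk_addr != NULL_ADDR_MODEL)
		return -EINVAL;	/* inode already known to the checkpoint */
	return 0;
}
```

Calling `recover_inode_model` with a non-zero address reproduces the -EINVAL outcome reported for the SPO test, while a zero address lets recovery proceed.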

This issue only occurs in atomic write scenarios. For regular file
fsync operations, the folio must be dirty. If
block_operations->f2fs_sync_node_pages successfully submits the folio
write, this path will not be executed. Otherwise,
f2fs_write_checkpoint will need to wait for the folio write submission
to complete, as sbi->nr_pages[F2FS_DIRTY_NODES] > 0. Therefore, the
situation where f2fs_need_dentry_mark sees {struct nat_entry}->flag
without the IS_CHECKPOINTED flag, but the folio write has already been
submitted, will not occur.

Therefore, for atomic file fsync, sbi->node_write should be acquired
through __write_node_folio to ensure that the IS_CHECKPOINTED flag
correctly indicates that the checkpoint write has been completed.
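
The effect of deciding the dentry mark only while sbi->node_write is held can be sketched with the same kind of single-threaded replay. All names here (`struct sbi_model`, `struct node_folio_model`, `write_node_folio_model`, `replay_fixed_interleaving`) are hypothetical, and the lock is modelled as a plain boolean rather than a real rw-semaphore.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the superblock state relevant to the race. */
struct sbi_model {
	bool node_write_locked;	/* node_write held for write by checkpoint */
	bool checkpointed;	/* IS_CHECKPOINTED on the inode's NAT entry */
};

/* Hypothetical model of the inode folio. */
struct node_folio_model {
	bool dentry_mark;
	bool written;
};

/* Models the patched atomic write path: the mark is decided only after
 * the read side of node_write is available. Returns false where the
 * real code would block in f2fs_down_read(). */
static bool write_node_folio_model(struct sbi_model *sbi,
				   struct node_folio_model *folio)
{
	if (sbi->node_write_locked)
		return false;

	/* The checkpoint cannot flush NAT entries concurrently here. */
	folio->dentry_mark = !sbi->checkpointed;
	folio->written = true;
	return true;
}

/* Replays the same interleaving with the patched ordering and returns
 * the dentry mark carried by the atomic commit's write. */
static bool replay_fixed_interleaving(void)
{
	struct sbi_model sbi = { false, false };
	struct node_folio_model last = { false, false };
	bool ok;

	sbi.node_write_locked = true;			/* Thread B: block_operations */
	ok = write_node_folio_model(&sbi, &last);	/* Thread A: would block */
	assert(!ok && !last.written);

	sbi.checkpointed = true;			/* Thread B: f2fs_flush_nat_entries */
	sbi.node_write_locked = false;			/* Thread B: unblock_operations */

	ok = write_node_folio_model(&sbi, &last);	/* Thread A: proceeds */
	assert(ok && last.written);

	return last.dentry_mark;
}
```

With the decision moved under the lock, the replay ends with no dentry mark on the folio, matching the already-checkpointed state of the entry.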

Fixes: 608514deba38 ("f2fs: set fsync mark only for the last dnode")
Cc: stable@kernel.org
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
Signed-off-by: Jinbao Liu <liujinbao1@xiaomi.com>
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

Authored by Yongpeng Yang and committed by Jaegeuk Kim
7633a738 071e50d6

+10 -4
fs/f2fs/node.c
···
 		goto redirty_out;
 	}
 
-	if (atomic && !test_opt(sbi, NOBARRIER))
-		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
+	if (atomic) {
+		if (!test_opt(sbi, NOBARRIER))
+			fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
+		if (IS_INODE(folio))
+			set_dentry_mark(folio,
+				f2fs_need_dentry_mark(sbi, ino_of_node(folio)));
+	}
 
 	/* should add to global list before clearing PAGECACHE status */
 	if (f2fs_in_warm_node_list(sbi, folio)) {
···
 					if (is_inode_flag_set(inode,
 								FI_DIRTY_INODE))
 						f2fs_update_inode(inode, folio);
-					set_dentry_mark(folio,
-						f2fs_need_dentry_mark(sbi, ino));
+					if (!atomic)
+						set_dentry_mark(folio,
+							f2fs_need_dentry_mark(sbi, ino));
 				}
 				/* may be written by other thread */
 				if (!folio_test_dirty(folio))