Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

f2fs: fix data loss caused by incorrect use of nat_entry flag

Data loss can occur when fsync is performed on a newly created file
(before any checkpoint has been written) concurrently with a checkpoint
operation. The scenario is as follows:

create & write & fsync 'file A' write checkpoint
- f2fs_do_sync_file // inline inode
- f2fs_write_inode // inode folio is dirty
- f2fs_write_checkpoint
- f2fs_flush_merged_writes
- f2fs_sync_node_pages
- f2fs_flush_nat_entries
- f2fs_fsync_node_pages // no dirty node
- f2fs_need_inode_block_update // return false
SPO and lost 'file A'

f2fs_flush_nat_entries() sets the IS_CHECKPOINTED and HAS_LAST_FSYNC
flags for the nat_entry, but this does not mean that the checkpoint has
actually completed successfully. However, f2fs_need_inode_block_update()
checks these flags and incorrectly assumes that the checkpoint has
finished.

The root cause is that the semantics of IS_CHECKPOINTED and
HAS_LAST_FSYNC are only guaranteed after the checkpoint write fully
completes.

This patch modifies f2fs_need_inode_block_update() to acquire the
sbi->node_write lock before reading the nat_entry flags, ensuring that
once IS_CHECKPOINTED and HAS_LAST_FSYNC are observed to be set, the
checkpoint operation has already completed.

Fixes: e05df3b115e7 ("f2fs: add node operations")
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

authored by

Yongpeng Yang and committed by
Jaegeuk Kim
238e14eb dccd324f

+3
+3
fs/f2fs/node.c
··· 427 427 struct f2fs_nm_info *nm_i = NM_I(sbi); 428 428 struct nat_entry *e; 429 429 bool need_update = true; 430 + struct f2fs_lock_context lc; 430 431 432 + f2fs_down_read_trace(&sbi->node_write, &lc); 431 433 f2fs_down_read(&nm_i->nat_tree_lock); 432 434 e = __lookup_nat_cache(nm_i, ino, false); 433 435 if (e && get_nat_flag(e, HAS_LAST_FSYNC) && ··· 437 435 get_nat_flag(e, HAS_FSYNCED_INODE))) 438 436 need_update = false; 439 437 f2fs_up_read(&nm_i->nat_tree_lock); 438 + f2fs_up_read_trace(&sbi->node_write, &lc); 440 439 return need_update; 441 440 } 442 441