Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

jbd2: ensure that all ongoing I/O complete before freeing blocks

When releasing file system metadata blocks in jbd2_journal_forget(), if
the buffer has not yet been checkpointed, it may already have been
written back, may currently be under write-back, or may not have been
written back yet. jbd2_journal_forget() calls
jbd2_journal_try_remove_checkpoint() to check the buffer's status and
adds it to the current transaction if it has not been written back. The
buffer can only be reallocated after that transaction is committed.

jbd2_journal_try_remove_checkpoint() attempts to lock the buffer and
check its dirty status while holding the buffer lock. If the buffer has
already been written back, everything proceeds normally. However, there
are two issues. First, the function returns immediately if the buffer is
locked by the write-back process. It does not wait for the write-back to
complete. Consequently, there is no guarantee that the I/O has finished
by the time the current transaction is committed and the block is
reallocated. This means that ongoing I/O could write stale metadata to
the newly allocated block, potentially corrupting data.
Second, the function
unlocks the buffer as soon as it detects that the buffer is still dirty.
If a concurrent write-back occurs immediately after this unlocking and
before clear_buffer_dirty() is called in jbd2_journal_forget(), data
corruption can theoretically still occur.
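
The two early returns described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct toy_buffer and the toy_* helpers are hypothetical stand-ins invented for illustration, not kernel APIs, and they reduce the buffer to a lock bit and a dirty bit. What it shows is that the caller receives the same -EBUSY whether write-back is in flight or has not started, so the caller cannot tell whether I/O may still hit the block.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer head: only the two bits that matter here. */
struct toy_buffer {
	bool locked;	/* held by write-back while I/O is in flight */
	bool dirty;	/* modified in memory, not yet submitted for I/O */
};

/* Non-blocking lock attempt, modeled on a trylock. */
static bool toy_trylock(struct toy_buffer *b)
{
	if (b->locked)
		return false;
	b->locked = true;
	return true;
}

static void toy_unlock(struct toy_buffer *b)
{
	b->locked = false;
}

/* Returns 0 if the buffer is clean and idle, -EBUSY otherwise. */
static int toy_try_remove_checkpoint(struct toy_buffer *b)
{
	if (!toy_trylock(b))
		return -EBUSY;	/* first issue: I/O in flight, but nobody waits */
	if (b->dirty) {
		toy_unlock(b);	/* second issue: lock dropped while still dirty */
		return -EBUSY;
	}
	toy_unlock(b);
	return 0;
}
```

In this model a buffer with locked set and a buffer with dirty set both yield -EBUSY, and the dirty buffer is left unlocked afterwards, which is the window in which a concurrent write-back could start.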

Although these two issues are unlikely to occur in practice, since
in-flight metadata write-back I/O does not take that long to complete,
it is better to explicitly ensure that all ongoing I/O operations have
completed.

Fixes: 597599268e3b ("jbd2: discard dirty data when forgetting an un-journalled buffer")
Cc: stable@kernel.org
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Message-ID: <20250916093337.3161016-2-yi.zhang@huaweicloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

authored by Zhang Yi and committed by Theodore Ts'o
3c652c3a acf943e9

+9 -4
fs/jbd2/transaction.c
···
 	int drop_reserve = 0;
 	int err = 0;
 	int was_modified = 0;
+	int wait_for_writeback = 0;
 
 	if (is_handle_aborted(handle))
 		return -EROFS;
···
 		}
 
 		/*
-		 * The buffer is still not written to disk, we should
-		 * attach this buffer to current transaction so that the
-		 * buffer can be checkpointed only after the current
-		 * transaction commits.
+		 * The buffer has not yet been written to disk. We should
+		 * either clear the buffer or ensure that the ongoing I/O
+		 * is completed, and attach this buffer to current
+		 * transaction so that the buffer can be checkpointed only
+		 * after the current transaction commits.
 		 */
 		clear_buffer_dirty(bh);
+		wait_for_writeback = 1;
 		__jbd2_journal_file_buffer(jh, transaction, BJ_Forget);
 		spin_unlock(&journal->j_list_lock);
 	}
 drop:
 	__brelse(bh);
 	spin_unlock(&jh->b_state_lock);
+	if (wait_for_writeback)
+		wait_on_buffer(bh);
 	jbd2_journal_put_journal_head(jh);
 	if (drop_reserve) {
 		/* no need to reserve log space for this block -bzzz */
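
The ordering this change introduces (record the need to wait while the locks are held, then wait only after they are dropped) can be sketched in a single-threaded userspace model. Everything here is an assumption made for illustration: struct sim_buffer, sim_wait_on_buffer() and sim_forget() are hypothetical names, in-flight I/O is reduced to a tick counter, and only the branch that files the buffer on the forget list is modeled.

```c
#include <stdbool.h>

/* Hypothetical model: I/O in flight is represented by a positive tick count. */
struct sim_buffer {
	bool dirty;
	int io_ticks_left;	/* > 0 while a simulated write-back is in flight */
};

/* Stand-in for a blocking wait: returns once no I/O is in flight. */
static void sim_wait_on_buffer(struct sim_buffer *b)
{
	while (b->io_ticks_left > 0)
		b->io_ticks_left--;	/* models sleeping until I/O completion */
}

/*
 * Models the "buffer not yet written to disk" branch of the forget path.
 * Returns true if stale I/O could still be pending when the function
 * returns, i.e. when the block may later be handed out again.
 */
static bool sim_forget(struct sim_buffer *b, bool apply_fix)
{
	bool wait_for_writeback = false;

	/* Region modeled as "spinlocks held": only non-blocking state changes. */
	b->dirty = false;
	wait_for_writeback = true;

	/* Region modeled as "spinlocks dropped": blocking is allowed here. */
	if (apply_fix && wait_for_writeback)
		sim_wait_on_buffer(b);

	return b->io_ticks_left > 0;
}
```

With io_ticks_left set to 3, sim_forget(&buf, false) reports that I/O is still pending on return, while sim_forget(&buf, true) does not. The flag mirrors wait_for_writeback in the diff: wait_on_buffer() can sleep, so the wait is deferred until after spin_unlock(&jh->b_state_lock) rather than being issued where the flag is set.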