Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
kernel os linux
1
fork

Configure Feed

Select the types of activity you want to include in your feed.

ext4: fix off-by-one errors in fast-commit block filling

Due to several different off-by-one errors, or perhaps due to a late
change in design that wasn't fully reflected in the code that was
actually merged, there are several very strange constraints on how
fast-commit blocks are filled with tlv entries:

- tlvs must start at least 10 bytes before the end of the block, even
though the minimum tlv length is 8. Otherwise, the replay code will
ignore them. (BUG: ext4_fc_reserve_space() could violate this
requirement if called with a len of blocksize - 9 or blocksize - 8.
Fortunately, this doesn't seem to happen currently.)

- tlvs must end at least 1 byte before the end of the block. Otherwise
the replay code will consider them to be invalid. This quirk
contributed to a bug (fixed by an earlier commit) where uninitialized
memory was being leaked to disk in the last byte of blocks.

Also, strangely these constraints don't apply to the replay code in
e2fsprogs, which will accept any tlvs in the blocks (with no bounds
checks at all, but that is a separate issue...).

Given that this all seems to be a bug, let's fix it by just filling
blocks with tlv entries in the natural way.

Note that old kernels will be unable to replay fast-commit journals
created by kernels that have this commit.

Fixes: aa75f4d3daae ("ext4: main fast-commit commit path")
Cc: <stable@vger.kernel.org> # v5.10+
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20221106224841.279231-7-ebiggers@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>

authored by

Eric Biggers and committed by
Theodore Ts'o
48a6a66d 8415ce07

+34 -34
+34 -34
fs/ext4/fast_commit.c
··· 714 714 struct buffer_head *bh; 715 715 int bsize = sbi->s_journal->j_blocksize; 716 716 int ret, off = sbi->s_fc_bytes % bsize; 717 - int pad_len; 717 + int remaining; 718 718 u8 *dst; 719 719 720 720 /* 721 - * After allocating len, we should have space at least for a 0 byte 722 - * padding. 721 + * If 'len' is too long to fit in any block alongside a PAD tlv, then we 722 + * cannot fulfill the request. 723 723 */ 724 - if (len + EXT4_FC_TAG_BASE_LEN > bsize) 724 + if (len > bsize - EXT4_FC_TAG_BASE_LEN) 725 725 return NULL; 726 726 727 - if (bsize - off - 1 > len + EXT4_FC_TAG_BASE_LEN) { 728 - /* 729 - * Only allocate from current buffer if we have enough space for 730 - * this request AND we have space to add a zero byte padding. 731 - */ 732 - if (!sbi->s_fc_bh) { 733 - ret = jbd2_fc_get_buf(EXT4_SB(sb)->s_journal, &bh); 734 - if (ret) 735 - return NULL; 736 - sbi->s_fc_bh = bh; 737 - } 738 - sbi->s_fc_bytes += len; 739 - return sbi->s_fc_bh->b_data + off; 727 + if (!sbi->s_fc_bh) { 728 + ret = jbd2_fc_get_buf(EXT4_SB(sb)->s_journal, &bh); 729 + if (ret) 730 + return NULL; 731 + sbi->s_fc_bh = bh; 740 732 } 741 - /* Need to add PAD tag */ 742 733 dst = sbi->s_fc_bh->b_data + off; 743 - tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_PAD); 744 - pad_len = bsize - off - 1 - EXT4_FC_TAG_BASE_LEN; 745 - tl.fc_len = cpu_to_le16(pad_len); 746 - ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc); 747 - dst += EXT4_FC_TAG_BASE_LEN; 748 - if (pad_len > 0) { 749 - ext4_fc_memzero(sb, dst, pad_len, crc); 750 - dst += pad_len; 734 + 735 + /* 736 + * Allocate the bytes in the current block if we can do so while still 737 + * leaving enough space for a PAD tlv. 738 + */ 739 + remaining = bsize - EXT4_FC_TAG_BASE_LEN - off; 740 + if (len <= remaining) { 741 + sbi->s_fc_bytes += len; 742 + return dst; 751 743 } 752 - /* Don't leak uninitialized memory in the unused last byte. */ 753 - *dst = 0; 744 + 745 + /* 746 + * Else, terminate the current block with a PAD tlv, then allocate a new 747 + * block and allocate the bytes at the start of that new block. 748 + */ 749 + 750 + tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_PAD); 751 + tl.fc_len = cpu_to_le16(remaining); 752 + ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc); 753 + ext4_fc_memzero(sb, dst + EXT4_FC_TAG_BASE_LEN, remaining, crc); 754 754 755 755 ext4_fc_submit_bh(sb, false); 756 756 ··· 758 758 if (ret) 759 759 return NULL; 760 760 sbi->s_fc_bh = bh; 761 - sbi->s_fc_bytes = (sbi->s_fc_bytes / bsize + 1) * bsize + len; 761 + sbi->s_fc_bytes += bsize - off + len; 762 762 return sbi->s_fc_bh->b_data; 763 763 } 764 764 ··· 789 789 off = sbi->s_fc_bytes % bsize; 790 790 791 791 tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_TAIL); 792 - tl.fc_len = cpu_to_le16(bsize - off - 1 + sizeof(struct ext4_fc_tail)); 792 + tl.fc_len = cpu_to_le16(bsize - off + sizeof(struct ext4_fc_tail)); 793 793 sbi->s_fc_bytes = round_up(sbi->s_fc_bytes, bsize); 794 794 795 795 ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, &crc); ··· 2056 2056 state = &sbi->s_fc_replay_state; 2057 2057 2058 2058 start = (u8 *)bh->b_data; 2059 - end = (__u8 *)bh->b_data + journal->j_blocksize - 1; 2059 + end = start + journal->j_blocksize; 2060 2060 2061 2061 if (state->fc_replay_expected_off == 0) { 2062 2062 state->fc_cur_tag = 0; ··· 2077 2077 } 2078 2078 2079 2079 state->fc_replay_expected_off++; 2080 - for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN; 2080 + for (cur = start; cur <= end - EXT4_FC_TAG_BASE_LEN; 2081 2081 cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { 2082 2082 ext4_fc_get_tl(&tl, cur); 2083 2083 val = cur + EXT4_FC_TAG_BASE_LEN; ··· 2195 2195 #endif 2196 2196 2197 2197 start = (u8 *)bh->b_data; 2198 - end = (__u8 *)bh->b_data + journal->j_blocksize - 1; 2198 + end = start + journal->j_blocksize; 2199 2199 2200 - for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN; 2200 + for (cur = start; cur <= end - EXT4_FC_TAG_BASE_LEN; 2201 2201 cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) { 2202 2202 ext4_fc_get_tl(&tl, cur); 2203 2203 val = cur + EXT4_FC_TAG_BASE_LEN;