Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'ext4_for_linus-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 fixes from Ted Ts'o:

- Update the MAINTAINERS file to add reviewers for the ext4 file system

- Add a test to issue an ext4 warning (not a WARN_ON) if there are still
dirty pages attached to an evicted inode.

- Fix a number of Syzkaller issues

- Fix memory leaks on error paths

- Replace some BUG and WARN with EFSCORRUPTED reporting

- Fix a potential crash when disabling discard via remount followed by
an immediate unmount. (Found by Sashiko)

- Fix a corner case which could lead to allocating blocks for an
indirect-mapped inode with block numbers > 2**32

- Fix a race when reallocating a freed inode that could result in a
deadlock

- Fix a use-after-free in update_super_work when racing with umount

- Fix build issues when trying to build ext4's kunit tests as a module

- Fix a bug where ext4_split_extent_zeroout() could fail to pass back
an error from ext4_ext_dirty()

- Avoid allocating blocks from a corrupted block group in
ext4_mb_find_by_goal()

- Fix a percpu_counters list corruption BUG triggered by an ext4
extents kunit

- Fix a potential crash caused by the fast commit flush path potentially
accessing the jinode structure before it is fully initialized

- Fix fsync(2) in no-journal mode to make sure the dirtied inode is
written to storage

- Fix a bug in no-journal mode where, when ext4 tries to avoid reusing
recently deleted inodes and lazy itable initialization is enabled,
an uninitialized inode could be skipped, triggering an e2fsck
complaint

- Fix journal credit calculation when setting an xattr when both the
encryption and ea_inode features are enabled

- Fix corner cases which could result in stale xarray tags after
writeback

- Fix generic/475 failures caused by ENOSPC errors while creating a
symlink when the system crashes, resulting in a file system
inconsistency when replaying the fast commit journal

* tag 'ext4_for_linus-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (27 commits)
ext4: always drain queued discard work in ext4_mb_release()
ext4: handle wraparound when searching for blocks for indirect mapped blocks
ext4: skip split extent recovery on corruption
ext4: fix iloc.bh leak in ext4_fc_replay_inode() error paths
ext4: fix deadlock on inode reallocation
ext4: fix use-after-free in update_super_work when racing with umount
ext4: fix the might_sleep() warnings in kvfree()
ext4: reject mount if bigalloc with s_first_data_block != 0
ext4: fix extents-test.c is not compiled when EXT4_KUNIT_TESTS=M
ext4: fix mballoc-test.c is not compiled when EXT4_KUNIT_TESTS=M
ext4: introduce EXPORT_SYMBOL_FOR_EXT4_TEST() helper
jbd2: gracefully abort on checkpointing state corruptions
ext4: avoid infinite loops caused by residual data
ext4: validate p_idx bounds in ext4_ext_correct_indexes
ext4: test if inode's all dirty pages are submitted to disk
ext4: minor fix for ext4_split_extent_zeroout()
ext4: avoid allocate block from corrupted group in ext4_mb_find_by_goal()
ext4: kunit: extents-test: lix percpu_counters list corruption
ext4: publish jinode after initialization
ext4: replace BUG_ON with proper error handling in ext4_read_inline_folio
...

+455 -115
+6 -1
MAINTAINERS
···
 EXT4 FILE SYSTEM
 M:	"Theodore Ts'o" <tytso@mit.edu>
-M:	Andreas Dilger <adilger.kernel@dilger.ca>
+R:	Andreas Dilger <adilger.kernel@dilger.ca>
+R:	Baokun Li <libaokun@linux.alibaba.com>
+R:	Jan Kara <jack@suse.cz>
+R:	Ojaswin Mujoo <ojaswin@linux.ibm.com>
+R:	Ritesh Harjani (IBM) <ritesh.list@gmail.com>
+R:	Zhang Yi <yi.zhang@huawei.com>
 L:	linux-ext4@vger.kernel.org
 S:	Maintained
 W:	http://ext4.wiki.kernel.org
+3 -2
fs/ext4/Makefile
···
 ext4-$(CONFIG_EXT4_FS_POSIX_ACL) += acl.o
 ext4-$(CONFIG_EXT4_FS_SECURITY) += xattr_security.o
-ext4-inode-test-objs += inode-test.o
-obj-$(CONFIG_EXT4_KUNIT_TESTS) += ext4-inode-test.o
+ext4-test-objs += inode-test.o mballoc-test.o \
+		  extents-test.o
+obj-$(CONFIG_EXT4_KUNIT_TESTS) += ext4-test.o
 ext4-$(CONFIG_FS_VERITY) += verity.o
 ext4-$(CONFIG_FS_ENCRYPTION) += crypto.o
+8 -1
fs/ext4/crypto.c
···
 	 */

 	if (handle) {
+		/*
+		 * Since the inode is new it is ok to pass the
+		 * XATTR_CREATE flag. This is necessary to match the
+		 * remaining journal credits check in the set_handle
+		 * function with the credits allocated for the new
+		 * inode.
+		 */
 		res = ext4_xattr_set_handle(handle, inode,
 					    EXT4_XATTR_INDEX_ENCRYPTION,
 					    EXT4_XATTR_NAME_ENCRYPTION_CONTEXT,
-					    ctx, len, 0);
+					    ctx, len, XATTR_CREATE);
 		if (!res) {
 			ext4_set_inode_flag(inode, EXT4_INODE_ENCRYPT);
 			ext4_clear_inode_state(inode,
+6
fs/ext4/ext4.h
···
 	struct proc_dir_entry *s_proc;
 	struct kobject s_kobj;
 	struct completion s_kobj_unregister;
+	struct mutex s_error_notify_mutex; /* protects sysfs_notify vs kobject_del */
 	struct super_block *s_sb;
 	struct buffer_head *s_mmp_bh;
···
 extern int ext4_block_write_begin(handle_t *handle, struct folio *folio,
 				  loff_t pos, unsigned len,
 				  get_block_t *get_block);
+
+#if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS)
+#define EXPORT_SYMBOL_FOR_EXT4_TEST(sym) \
+	EXPORT_SYMBOL_FOR_MODULES(sym, "ext4-test")
+#endif
 #endif /* __KERNEL__ */

 #endif /* _EXT4_H */
+12
fs/ext4/ext4_extents.h
···
 				0xffff);
 }

+extern int __ext4_ext_dirty(const char *where, unsigned int line,
+			    handle_t *handle, struct inode *inode,
+			    struct ext4_ext_path *path);
+extern int ext4_ext_zeroout(struct inode *inode, struct ext4_extent *ex);
+#if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS)
+extern int ext4_ext_space_root_idx_test(struct inode *inode, int check);
+extern struct ext4_ext_path *ext4_split_convert_extents_test(
+				handle_t *handle, struct inode *inode,
+				struct ext4_map_blocks *map,
+				struct ext4_ext_path *path,
+				int flags, unsigned int *allocated);
+#endif
 #endif /* _EXT4_EXTENTS */
+7 -5
fs/ext4/extents-test.c
···
 static void extents_kunit_exit(struct kunit *test)
 {
-	struct ext4_sb_info *sbi = k_ctx.k_ei->vfs_inode.i_sb->s_fs_info;
+	struct super_block *sb = k_ctx.k_ei->vfs_inode.i_sb;
+	struct ext4_sb_info *sbi = sb->s_fs_info;

+	ext4_es_unregister_shrinker(sbi);
 	kfree(sbi);
 	kfree(k_ctx.k_ei);
 	kfree(k_ctx.k_data);
···
 	eh->eh_depth = 0;
 	eh->eh_entries = cpu_to_le16(1);
 	eh->eh_magic = EXT4_EXT_MAGIC;
-	eh->eh_max =
-		cpu_to_le16(ext4_ext_space_root_idx(&k_ctx.k_ei->vfs_inode, 0));
+	eh->eh_max = cpu_to_le16(ext4_ext_space_root_idx_test(
+					&k_ctx.k_ei->vfs_inode, 0));
 	eh->eh_generation = 0;

 	/*
···
 	switch (param->type) {
 	case TEST_SPLIT_CONVERT:
-		path = ext4_split_convert_extents(NULL, inode, &map, path,
-						  param->split_flags, NULL);
+		path = ext4_split_convert_extents_test(NULL, inode, &map,
+					path, param->split_flags, NULL);
 		break;
 	case TEST_CREATE_BLOCKS:
 		ext4_map_create_blocks_helper(test, inode, &map, param->split_flags);
+68 -12
fs/ext4/extents.c
···
  * - ENOMEM
  * - EIO
  */
-static int __ext4_ext_dirty(const char *where, unsigned int line,
-			    handle_t *handle, struct inode *inode,
-			    struct ext4_ext_path *path)
+int __ext4_ext_dirty(const char *where, unsigned int line,
+		     handle_t *handle, struct inode *inode,
+		     struct ext4_ext_path *path)
 {
 	int err;
···
 	err = ext4_ext_get_access(handle, inode, path + k);
 	if (err)
 		return err;
+	if (unlikely(path[k].p_idx > EXT_LAST_INDEX(path[k].p_hdr))) {
+		EXT4_ERROR_INODE(inode,
+				 "path[%d].p_idx %p > EXT_LAST_INDEX %p",
+				 k, path[k].p_idx,
+				 EXT_LAST_INDEX(path[k].p_hdr));
+		return -EFSCORRUPTED;
+	}
 	path[k].p_idx->ei_block = border;
 	err = ext4_ext_dirty(handle, inode, path + k);
 	if (err)
···
 		err = ext4_ext_get_access(handle, inode, path + k);
 		if (err)
 			goto clean;
+		if (unlikely(path[k].p_idx > EXT_LAST_INDEX(path[k].p_hdr))) {
+			EXT4_ERROR_INODE(inode,
+					 "path[%d].p_idx %p > EXT_LAST_INDEX %p",
+					 k, path[k].p_idx,
+					 EXT_LAST_INDEX(path[k].p_hdr));
+			err = -EFSCORRUPTED;
+			goto clean;
+		}
 		path[k].p_idx->ei_block = border;
 		err = ext4_ext_dirty(handle, inode, path + k);
 		if (err)
···
 }

 /* FIXME!! we need to try to merge to left or right after zero-out */
-static int ext4_ext_zeroout(struct inode *inode, struct ext4_extent *ex)
+int ext4_ext_zeroout(struct inode *inode, struct ext4_extent *ex)
 {
 	ext4_fsblk_t ee_pblock;
 	unsigned int ee_len;
···
 	insert_err = PTR_ERR(path);
 	err = 0;
+	if (insert_err != -ENOSPC && insert_err != -EDQUOT &&
+	    insert_err != -ENOMEM)
+		goto out_path;

 	/*
 	 * Get a new path to try to zeroout or fix the extent length.
···
 		goto out_path;
 	}

+	depth = ext_depth(inode);
+	ex = path[depth].p_ext;
+	if (!ex) {
+		EXT4_ERROR_INODE(inode,
+				 "bad extent address lblock: %lu, depth: %d pblock %llu",
+				 (unsigned long)ee_block, depth, path[depth].p_block);
+		err = -EFSCORRUPTED;
+		goto out;
+	}
+
 	err = ext4_ext_get_access(handle, inode, path + depth);
 	if (err)
 		goto out;
-
-	depth = ext_depth(inode);
-	ex = path[depth].p_ext;

 fix_extent_len:
 	ex->ee_len = orig_ex.ee_len;
···
 	ext4_ext_mark_initialized(ex);

-	ext4_ext_dirty(handle, inode, path + depth);
+	err = ext4_ext_dirty(handle, inode, path + depth);
 	if (err)
 		return err;
···
 	path = ext4_ext_insert_extent(handle, inode, path, &newex, flags);
 	if (IS_ERR(path)) {
 		err = PTR_ERR(path);
-		if (allocated_clusters) {
+		/*
+		 * Gracefully handle out of space conditions. If the filesystem
+		 * is inconsistent, we'll just leak allocated blocks to avoid
+		 * causing even more damage.
+		 */
+		if (allocated_clusters && (err == -EDQUOT || err == -ENOSPC)) {
 			int fb_flags = 0;
-
 			/*
 			 * free data blocks we just allocated.
 			 * not a good idea to call discard here directly,
···
 	return 0;
 }

-#ifdef CONFIG_EXT4_KUNIT_TESTS
-#include "extents-test.c"
+#if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS)
+int ext4_ext_space_root_idx_test(struct inode *inode, int check)
+{
+	return ext4_ext_space_root_idx(inode, check);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_ext_space_root_idx_test);
+
+struct ext4_ext_path *ext4_split_convert_extents_test(handle_t *handle,
+			struct inode *inode, struct ext4_map_blocks *map,
+			struct ext4_ext_path *path, int flags,
+			unsigned int *allocated)
+{
+	return ext4_split_convert_extents(handle, inode, map, path,
+					  flags, allocated);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_split_convert_extents_test);
+
+EXPORT_SYMBOL_FOR_EXT4_TEST(__ext4_ext_dirty);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_ext_zeroout);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_register_shrinker);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_unregister_shrinker);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_map_create_blocks);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_init_tree);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_lookup_extent);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_es_insert_extent);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_ext_insert_extent);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_find_extent);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_issue_zeroout);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_map_query_blocks);
 #endif
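The p_idx bounds check added to fs/ext4/extents.c can be illustrated outside the kernel. The following is a minimal userspace sketch, not the ext4 code: `struct demo_node`, `struct demo_idx`, `demo_set_border()` and `DEMO_EFSCORRUPTED` are hypothetical names invented for the illustration, and the value 117 is assumed to mirror Linux's EUCLEAN, which the kernel aliases as EFSCORRUPTED. The idea is the same as in the hunk: validate that an index pointer lies within the valid entries of its node before writing through it, and report corruption instead of scribbling on memory.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 117 mirrors Linux EUCLEAN (kernel alias: EFSCORRUPTED). */
#define DEMO_EFSCORRUPTED 117
#define DEMO_MAX_IDX 4

/* Hypothetical stand-in for an extent index entry. */
struct demo_idx {
	unsigned int block;
};

/* Hypothetical stand-in for an index node with a header count. */
struct demo_node {
	unsigned short entries;            /* number of valid index entries */
	struct demo_idx idx[DEMO_MAX_IDX];
};

/*
 * Write "border" through p_idx only if p_idx points at a valid entry
 * of "node"; otherwise return a corruption error and leave memory alone.
 */
int demo_set_border(struct demo_node *node, struct demo_idx *p_idx,
		    unsigned int border)
{
	struct demo_idx *last;

	assert(node != NULL && p_idx != NULL);
	if (node->entries == 0 || node->entries > DEMO_MAX_IDX)
		return -DEMO_EFSCORRUPTED;

	last = &node->idx[node->entries - 1];
	if (p_idx < node->idx || p_idx > last)
		return -DEMO_EFSCORRUPTED;

	p_idx->block = border;
	return 0;
}
```

With `entries` set to 2, a pointer to `idx[1]` is accepted, while a pointer to `idx[3]` is rejected and the entry is left unmodified.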
+10 -7
fs/ext4/fast_commit.c
···
 	int ret = 0;

 	list_for_each_entry(ei, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
-		ret = jbd2_submit_inode_data(journal, ei->jinode);
+		ret = jbd2_submit_inode_data(journal, READ_ONCE(ei->jinode));
 		if (ret)
 			return ret;
 	}

 	list_for_each_entry(ei, &sbi->s_fc_q[FC_Q_MAIN], i_fc_list) {
-		ret = jbd2_wait_inode_data(journal, ei->jinode);
+		ret = jbd2_wait_inode_data(journal, READ_ONCE(ei->jinode));
 		if (ret)
 			return ret;
 	}
···
 	/* Immediately update the inode on disk. */
 	ret = ext4_handle_dirty_metadata(NULL, NULL, iloc.bh);
 	if (ret)
-		goto out;
+		goto out_brelse;
 	ret = sync_dirty_buffer(iloc.bh);
 	if (ret)
-		goto out;
+		goto out_brelse;
 	ret = ext4_mark_inode_used(sb, ino);
 	if (ret)
-		goto out;
+		goto out_brelse;

 	/* Given that we just wrote the inode on disk, this SHOULD succeed. */
 	inode = ext4_iget(sb, ino, EXT4_IGET_NORMAL);
 	if (IS_ERR(inode)) {
 		ext4_debug("Inode not found.");
-		return -EFSCORRUPTED;
+		inode = NULL;
+		ret = -EFSCORRUPTED;
+		goto out_brelse;
 	}

 	/*
···
 	ext4_inode_csum_set(inode, ext4_raw_inode(&iloc), EXT4_I(inode));
 	ret = ext4_handle_dirty_metadata(NULL, NULL, iloc.bh);
 	sync_dirty_buffer(iloc.bh);
+out_brelse:
 	brelse(iloc.bh);
 out:
 	iput(inode);
 	if (!ret)
 		blkdev_issue_flush(sb->s_bdev);

-	return 0;
+	return ret;
 }

 /*
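The second part of the fs/ext4/fast_commit.c change is an instance of the usual "single exit label" cleanup pattern: every failure after a reference is taken jumps to a label that drops it, and the function returns the recorded error rather than a hard-coded 0. A minimal userspace sketch of that pattern follows. It is not the ext4 code; `struct demo_buf`, `demo_get()`, `demo_put()` and `demo_replay()` are hypothetical names, and the `fail_step` parameter exists only to simulate errors at different points.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical reference-counted buffer, standing in for a buffer_head. */
struct demo_buf {
	int refcount;
};

static void demo_get(struct demo_buf *b) { b->refcount++; }
static void demo_put(struct demo_buf *b) { b->refcount--; }

/*
 * Take a reference, run two steps that may fail, and always drop the
 * reference on the way out. fail_step selects which step "fails"
 * (0 means no failure); it is purely a test hook for this sketch.
 */
int demo_replay(struct demo_buf *b, int fail_step)
{
	int ret = 0;

	assert(b != NULL);
	demo_get(b);

	if (fail_step == 1) {
		ret = -EIO;
		goto out_put;        /* not a bare return: the reference must be dropped */
	}
	if (fail_step == 2) {
		ret = -EINVAL;
		goto out_put;
	}

out_put:
	demo_put(b);
	return ret;                  /* propagate the error instead of returning 0 */
}
```

Whichever step fails, the reference count returns to its starting value and the caller sees the specific error code, which are the two properties the hunk restores.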
+14 -2
fs/ext4/fsync.c
···
 				  int datasync, bool *needs_barrier)
 {
 	struct inode *inode = file->f_inode;
+	struct writeback_control wbc = {
+		.sync_mode = WB_SYNC_ALL,
+		.nr_to_write = 0,
+	};
 	int ret;

 	ret = generic_buffers_fsync_noflush(file, start, end, datasync);
-	if (!ret)
-		ret = ext4_sync_parent(inode);
+	if (ret)
+		return ret;
+
+	/* Force writeout of inode table buffer to disk */
+	ret = ext4_write_inode(inode, &wbc);
+	if (ret)
+		return ret;
+
+	ret = ext4_sync_parent(inode);
+
 	if (test_opt(inode->i_sb, BARRIER))
 		*needs_barrier = true;
+6
fs/ext4/ialloc.c
···
 	if (unlikely(!gdp))
 		return 0;

+	/* Inode was never used in this filesystem? */
+	if (ext4_has_group_desc_csum(sb) &&
+	    (gdp->bg_flags & cpu_to_le16(EXT4_BG_INODE_UNINIT) ||
+	     ino >= EXT4_INODES_PER_GROUP(sb) - ext4_itable_unused_count(sb, gdp)))
+		return 0;
+
 	bh = sb_find_get_block(sb, ext4_inode_table(sb, gdp) +
 			       (ino / inodes_per_block));
 	if (!bh || !buffer_uptodate(bh))
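The arithmetic in the new fs/ext4/ialloc.c check is easy to misread, so here is a small standalone sketch of just the predicate. It is an illustration under assumptions, not the kernel function: `demo_inode_never_used()` and its parameters are hypothetical names, `ino_in_group` is assumed to be the zero-based index of the inode within its group, and the `ext4_has_group_desc_csum()` precondition from the hunk is left out. An inode counts as never used if the whole group's inode table is uninitialized, or if its index falls in the unused tail of the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if the inode at zero-based index ino_in_group was never
 * used: either the group's inode table is flagged uninitialized, or
 * the index lies within the last itable_unused entries of the table.
 */
bool demo_inode_never_used(bool group_uninit, uint32_t inodes_per_group,
			   uint32_t itable_unused, uint32_t ino_in_group)
{
	/* The unused count can never exceed the table size. */
	assert(itable_unused <= inodes_per_group);

	if (group_uninit)
		return true;
	return ino_in_group >= inodes_per_group - itable_unused;
}
```

For example, with 8192 inodes per group and 8000 of them unused, indexes 0 through 191 are treated as possibly used and index 192 onward as never used.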
+9 -1
fs/ext4/inline.c
···
 		goto out;

 	len = min_t(size_t, ext4_get_inline_size(inode), i_size_read(inode));
-	BUG_ON(len > PAGE_SIZE);
+
+	if (len > PAGE_SIZE) {
+		ext4_error_inode(inode, __func__, __LINE__, 0,
+				 "inline size %zu exceeds PAGE_SIZE", len);
+		ret = -EFSCORRUPTED;
+		brelse(iloc.bh);
+		goto out;
+	}
+
 	kaddr = kmap_local_folio(folio, 0);
 	ret = ext4_read_inline_data(inode, kaddr, len, &iloc);
 	kaddr = folio_zero_tail(folio, len, kaddr + len);
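The fs/ext4/inline.c hunk swaps a BUG_ON, which halts the kernel, for an error return that the caller can handle. A rough userspace analogue of that shift is sketched below. The names `demo_read_inline()`, `DEMO_PAGE_SIZE` and `DEMO_EFSCORRUPTED` are hypothetical, the 4096-byte page size is an assumption, and 117 is assumed to mirror Linux's EUCLEAN; the sketch only shows the control flow, not the folio handling.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u       /* assumption: 4 KiB pages */
#define DEMO_EFSCORRUPTED 117      /* assumption: mirrors Linux EUCLEAN */

/*
 * Compute how many bytes of inline data to read. A length larger than
 * one page can only come from corrupted on-disk metadata, so it is
 * reported as an error to the caller instead of aborting.
 */
int demo_read_inline(size_t inline_size, size_t i_size, size_t *out_len)
{
	size_t len = inline_size < i_size ? inline_size : i_size;

	/* A NULL output pointer is a programming error, not bad input. */
	assert(out_len != NULL);

	if (len > DEMO_PAGE_SIZE)
		return -DEMO_EFSCORRUPTED;   /* untrusted input: fail gracefully */

	*out_len = len;
	return 0;
}
```

The sketch keeps an assertion for a genuine caller bug (a NULL output pointer) and uses an error code for a condition that depends on on-disk data, which is the distinction the hunk draws.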
+60 -15
fs/ext4/inode.c
···
 static inline int ext4_begin_ordered_truncate(struct inode *inode,
 					      loff_t new_size)
 {
+	struct jbd2_inode *jinode = READ_ONCE(EXT4_I(inode)->jinode);
+
 	trace_ext4_begin_ordered_truncate(inode, new_size);
 	/*
 	 * If jinode is zero, then we never opened the file for
···
 	 * jbd2_journal_begin_ordered_truncate() since there's no
 	 * outstanding writes we need to flush.
 	 */
-	if (!EXT4_I(inode)->jinode)
+	if (!jinode)
 		return 0;
 	return jbd2_journal_begin_ordered_truncate(EXT4_JOURNAL(inode),
-						   EXT4_I(inode)->jinode,
+						   jinode,
 						   new_size);
 }
···
 	if (EXT4_I(inode)->i_flags & EXT4_EA_INODE_FL)
 		ext4_evict_ea_inode(inode);
 	if (inode->i_nlink) {
+		/*
+		 * If there's dirty page will lead to data loss, user
+		 * could see stale data.
+		 */
+		if (unlikely(!ext4_emergency_state(inode->i_sb) &&
+		    mapping_tagged(&inode->i_data, PAGECACHE_TAG_DIRTY)))
+			ext4_warning_inode(inode, "data will be lost");
+
 		truncate_inode_pages_final(&inode->i_data);

 		goto no_delete;
···
 			spin_unlock(&inode->i_lock);
 			return -ENOMEM;
 		}
-		ei->jinode = jinode;
-		jbd2_journal_init_jbd_inode(ei->jinode, inode);
+		jbd2_journal_init_jbd_inode(jinode, inode);
+		/*
+		 * Publish ->jinode only after it is fully initialized so that
+		 * readers never observe a partially initialized jbd2_inode.
+		 */
+		smp_wmb();
+		WRITE_ONCE(ei->jinode, jinode);
 		jinode = NULL;
 	}
 	spin_unlock(&inode->i_lock);
···
 			inode->i_op = &ext4_encrypted_symlink_inode_operations;
 		} else if (ext4_inode_is_fast_symlink(inode)) {
 			inode->i_op = &ext4_fast_symlink_inode_operations;
-			if (inode->i_size == 0 ||
-			    inode->i_size >= sizeof(ei->i_data) ||
-			    strnlen((char *)ei->i_data, inode->i_size + 1) !=
-					inode->i_size) {
-				ext4_error_inode(inode, function, line, 0,
-					"invalid fast symlink length %llu",
-					(unsigned long long)inode->i_size);
-				ret = -EFSCORRUPTED;
-				goto bad_inode;
+
+			/*
+			 * Orphan cleanup can see inodes with i_size == 0
+			 * and i_data uninitialized. Skip size checks in
+			 * that case. This is safe because the first thing
+			 * ext4_evict_inode() does for fast symlinks is
+			 * clearing of i_data and i_size.
+			 */
+			if ((EXT4_SB(sb)->s_mount_state & EXT4_ORPHAN_FS)) {
+				if (inode->i_nlink != 0) {
+					ext4_error_inode(inode, function, line, 0,
+						"invalid orphan symlink nlink %d",
+						inode->i_nlink);
+					ret = -EFSCORRUPTED;
+					goto bad_inode;
+				}
+			} else {
+				if (inode->i_size == 0 ||
+				    inode->i_size >= sizeof(ei->i_data) ||
+				    strnlen((char *)ei->i_data, inode->i_size + 1) !=
+						inode->i_size) {
+					ext4_error_inode(inode, function, line, 0,
+						"invalid fast symlink length %llu",
+						(unsigned long long)inode->i_size);
+					ret = -EFSCORRUPTED;
+					goto bad_inode;
+				}
+				inode_set_cached_link(inode, (char *)ei->i_data,
+						      inode->i_size);
 			}
-			inode_set_cached_link(inode, (char *)ei->i_data,
-					      inode->i_size);
 		} else {
 			inode->i_op = &ext4_symlink_inode_operations;
 		}
···
 		if (attr->ia_size == inode->i_size)
 			inc_ivers = false;
+
+		/*
+		 * If file has inline data but new size exceeds inline capacity,
+		 * convert to extent-based storage first to prevent inconsistent
+		 * state (inline flag set but size exceeds inline capacity).
+		 */
+		if (ext4_has_inline_data(inode) &&
+		    attr->ia_size > EXT4_I(inode)->i_inline_size) {
+			error = ext4_convert_inline_data(inode);
+			if (error)
+				goto err_out;
+		}

 		if (shrink) {
 			if (ext4_should_order_data(inode)) {
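The jinode change in fs/ext4/inode.c follows the "initialize first, publish last" rule for pointers that lock-free readers may load. The kernel expresses it with smp_wmb(), WRITE_ONCE() and READ_ONCE(); a rough userspace analogue, assuming C11 atomics with release/acquire ordering as a stand-in for those primitives, is sketched below. `struct demo_jinode`, `struct demo_inode`, `demo_attach_jinode()` and `demo_reader_sees_initialized()` are hypothetical names and do not correspond to the jbd2 structures.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct jbd2_inode. */
struct demo_jinode {
	int initialized;
	long dirty_start;
};

/* Hypothetical stand-in for the in-memory inode holding the pointer. */
struct demo_inode {
	_Atomic(struct demo_jinode *) jinode;
};

/*
 * Writer: fill in every field first, then publish the pointer with
 * release ordering so the initializing stores cannot be reordered
 * after the store that makes the object visible.
 */
int demo_attach_jinode(struct demo_inode *ei)
{
	struct demo_jinode *j;

	assert(ei != NULL);
	j = malloc(sizeof(*j));
	if (!j)
		return -1;

	j->initialized = 1;
	j->dirty_start = 0;
	atomic_store_explicit(&ei->jinode, j, memory_order_release);
	return 0;
}

/*
 * Reader: load the pointer once with acquire ordering and use only
 * that snapshot. Returns 0 if nothing is published yet, otherwise
 * the "initialized" field of the published object.
 */
int demo_reader_sees_initialized(struct demo_inode *ei)
{
	struct demo_jinode *j;

	assert(ei != NULL);
	j = atomic_load_explicit(&ei->jinode, memory_order_acquire);
	if (!j)
		return 0;
	return j->initialized;
}
```

The reader loads the pointer a single time into a local variable, which matches the way the hunk caches READ_ONCE(EXT4_I(inode)->jinode) in `jinode` instead of dereferencing the field twice.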
+41 -40
fs/ext4/mballoc-test.c
···
 #include <linux/random.h>

 #include "ext4.h"
+#include "mballoc.h"

 struct mbt_grp_ctx {
 	struct buffer_head bitmap_bh;
···
 	if (state)
 		mb_set_bits(bitmap_bh->b_data, blkoff, len);
 	else
-		mb_clear_bits(bitmap_bh->b_data, blkoff, len);
+		mb_clear_bits_test(bitmap_bh->b_data, blkoff, len);

 	return 0;
 }
···
 	/* get block at goal */
 	ar.goal = ext4_group_first_block_no(sb, goal_group);
-	found = ext4_mb_new_blocks_simple(&ar, &err);
+	found = ext4_mb_new_blocks_simple_test(&ar, &err);
 	KUNIT_ASSERT_EQ_MSG(test, ar.goal, found,
 		"failed to alloc block at goal, expected %llu found %llu",
 		ar.goal, found);

 	/* get block after goal in goal group */
 	ar.goal = ext4_group_first_block_no(sb, goal_group);
-	found = ext4_mb_new_blocks_simple(&ar, &err);
+	found = ext4_mb_new_blocks_simple_test(&ar, &err);
 	KUNIT_ASSERT_EQ_MSG(test, ar.goal + EXT4_C2B(sbi, 1), found,
 		"failed to alloc block after goal in goal group, expected %llu found %llu",
 		ar.goal + 1, found);
···
 	/* get block after goal group */
 	mbt_ctx_mark_used(sb, goal_group, 0, EXT4_CLUSTERS_PER_GROUP(sb));
 	ar.goal = ext4_group_first_block_no(sb, goal_group);
-	found = ext4_mb_new_blocks_simple(&ar, &err);
+	found = ext4_mb_new_blocks_simple_test(&ar, &err);
 	KUNIT_ASSERT_EQ_MSG(test,
 		ext4_group_first_block_no(sb, goal_group + 1), found,
 		"failed to alloc block after goal group, expected %llu found %llu",
···
 	for (i = goal_group; i < ext4_get_groups_count(sb); i++)
 		mbt_ctx_mark_used(sb, i, 0, EXT4_CLUSTERS_PER_GROUP(sb));
 	ar.goal = ext4_group_first_block_no(sb, goal_group);
-	found = ext4_mb_new_blocks_simple(&ar, &err);
+	found = ext4_mb_new_blocks_simple_test(&ar, &err);
 	KUNIT_ASSERT_EQ_MSG(test,
 		ext4_group_first_block_no(sb, 0) + EXT4_C2B(sbi, 1), found,
 		"failed to alloc block before goal group, expected %llu found %llu",
···
 	for (i = 0; i < ext4_get_groups_count(sb); i++)
 		mbt_ctx_mark_used(sb, i, 0, EXT4_CLUSTERS_PER_GROUP(sb));
 	ar.goal = ext4_group_first_block_no(sb, goal_group);
-	found = ext4_mb_new_blocks_simple(&ar, &err);
+	found = ext4_mb_new_blocks_simple_test(&ar, &err);
 	KUNIT_ASSERT_NE_MSG(test, err, 0,
 		"unexpectedly get block when no block is available");
 }
···
 			continue;

 		bitmap = mbt_ctx_bitmap(sb, i);
-		bit = mb_find_next_zero_bit(bitmap, max, 0);
+		bit = mb_find_next_zero_bit_test(bitmap, max, 0);
 		KUNIT_ASSERT_EQ_MSG(test, bit, max,
 				    "free block on unexpected group %d", i);
 	}

 	bitmap = mbt_ctx_bitmap(sb, goal_group);
-	bit = mb_find_next_zero_bit(bitmap, max, 0);
+	bit = mb_find_next_zero_bit_test(bitmap, max, 0);
 	KUNIT_ASSERT_EQ(test, bit, start);

-	bit = mb_find_next_bit(bitmap, max, bit + 1);
+	bit = mb_find_next_bit_test(bitmap, max, bit + 1);
 	KUNIT_ASSERT_EQ(test, bit, start + len);
 }
···
 	block = ext4_group_first_block_no(sb, goal_group) +
 		EXT4_C2B(sbi, start);
-	ext4_free_blocks_simple(inode, block, len);
+	ext4_free_blocks_simple_test(inode, block, len);
 	validate_free_blocks_simple(test, sb, goal_group, start, len);
 	mbt_ctx_mark_used(sb, goal_group, 0, EXT4_CLUSTERS_PER_GROUP(sb));
 }
···
 	bitmap = mbt_ctx_bitmap(sb, TEST_GOAL_GROUP);
 	memset(bitmap, 0, sb->s_blocksize);
-	ret = ext4_mb_mark_diskspace_used(ac, NULL);
+	ret = ext4_mb_mark_diskspace_used_test(ac, NULL);
 	KUNIT_ASSERT_EQ(test, ret, 0);

 	max = EXT4_CLUSTERS_PER_GROUP(sb);
-	i = mb_find_next_bit(bitmap, max, 0);
+	i = mb_find_next_bit_test(bitmap, max, 0);
 	KUNIT_ASSERT_EQ(test, i, start);
-	i = mb_find_next_zero_bit(bitmap, max, i + 1);
+	i = mb_find_next_zero_bit_test(bitmap, max, i + 1);
 	KUNIT_ASSERT_EQ(test, i, start + len);
-	i = mb_find_next_bit(bitmap, max, i + 1);
+	i = mb_find_next_bit_test(bitmap, max, i + 1);
 	KUNIT_ASSERT_EQ(test, max, i);
 }
···
 	max = EXT4_CLUSTERS_PER_GROUP(sb);
 	bb_h = buddy + sbi->s_mb_offsets[1];

-	off = mb_find_next_zero_bit(bb, max, 0);
+	off = mb_find_next_zero_bit_test(bb, max, 0);
 	grp->bb_first_free = off;
 	while (off < max) {
 		grp->bb_counters[0]++;
 		grp->bb_free++;

-		if (!(off & 1) && !mb_test_bit(off + 1, bb)) {
+		if (!(off & 1) && !mb_test_bit_test(off + 1, bb)) {
 			grp->bb_free++;
 			grp->bb_counters[0]--;
-			mb_clear_bit(off >> 1, bb_h);
+			mb_clear_bit_test(off >> 1, bb_h);
 			grp->bb_counters[1]++;
 			grp->bb_largest_free_order = 1;
 			off++;
 		}

-		off = mb_find_next_zero_bit(bb, max, off + 1);
+		off = mb_find_next_zero_bit_test(bb, max, off + 1);
 	}

 	for (order = 1; order < MB_NUM_ORDERS(sb) - 1; order++) {
 		bb = buddy + sbi->s_mb_offsets[order];
 		bb_h = buddy + sbi->s_mb_offsets[order + 1];
 		max = max >> 1;
-		off = mb_find_next_zero_bit(bb, max, 0);
+		off = mb_find_next_zero_bit_test(bb, max, 0);

 		while (off < max) {
-			if (!(off & 1) && !mb_test_bit(off + 1, bb)) {
+			if (!(off & 1) && !mb_test_bit_test(off + 1, bb)) {
 				mb_set_bits(bb, off, 2);
 				grp->bb_counters[order] -= 2;
-				mb_clear_bit(off >> 1, bb_h);
+				mb_clear_bit_test(off >> 1, bb_h);
 				grp->bb_counters[order + 1]++;
 				grp->bb_largest_free_order = order + 1;
 				off++;
 			}

-			off = mb_find_next_zero_bit(bb, max, off + 1);
+			off = mb_find_next_zero_bit_test(bb, max, off + 1);
 		}
 	}

 	max = EXT4_CLUSTERS_PER_GROUP(sb);
-	off = mb_find_next_zero_bit(bitmap, max, 0);
+	off = mb_find_next_zero_bit_test(bitmap, max, 0);
 	while (off < max) {
 		grp->bb_fragments++;

-		off = mb_find_next_bit(bitmap, max, off + 1);
+		off = mb_find_next_bit_test(bitmap, max, off + 1);
 		if (off + 1 >= max)
 			break;

-		off = mb_find_next_zero_bit(bitmap, max, off + 1);
+		off = mb_find_next_zero_bit_test(bitmap, max, off + 1);
 	}
 }
···
 	/* needed by validation in ext4_mb_generate_buddy */
 	ext4_grp->bb_free = mbt_grp->bb_free;
 	memset(ext4_buddy, 0xff, sb->s_blocksize);
-	ext4_mb_generate_buddy(sb, ext4_buddy, bitmap, TEST_GOAL_GROUP,
+	ext4_mb_generate_buddy_test(sb, ext4_buddy, bitmap, TEST_GOAL_GROUP,
 			       ext4_grp);

 	KUNIT_ASSERT_EQ(test, memcmp(mbt_buddy, ext4_buddy, sb->s_blocksize),
···
 	ex.fe_group = TEST_GOAL_GROUP;

 	ext4_lock_group(sb, TEST_GOAL_GROUP);
-	mb_mark_used(e4b, &ex);
+	mb_mark_used_test(e4b, &ex);
 	ext4_unlock_group(sb, TEST_GOAL_GROUP);

 	mb_set_bits(bitmap, start, len);
···
 	memset(buddy, 0xff, sb->s_blocksize);
 	for (i = 0; i < MB_NUM_ORDERS(sb); i++)
 		grp->bb_counters[i] = 0;
-	ext4_mb_generate_buddy(sb, buddy, bitmap, 0, grp);
+	ext4_mb_generate_buddy_test(sb, buddy, bitmap, 0, grp);

 	KUNIT_ASSERT_EQ(test, memcmp(buddy, e4b->bd_buddy, sb->s_blocksize),
 			0);
···
 				bb_counters[MB_NUM_ORDERS(sb)]), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, grp);

-	ret = ext4_mb_load_buddy(sb, TEST_GOAL_GROUP, &e4b);
+	ret = ext4_mb_load_buddy_test(sb, TEST_GOAL_GROUP, &e4b);
 	KUNIT_ASSERT_EQ(test, ret, 0);

 	grp->bb_free = EXT4_CLUSTERS_PER_GROUP(sb);
···
 		test_mb_mark_used_range(test, &e4b, ranges[i].start,
 					ranges[i].len, bitmap, buddy, grp);

-	ext4_mb_unload_buddy(&e4b);
+	ext4_mb_unload_buddy_test(&e4b);
 }

 static void
···
 		return;

 	ext4_lock_group(sb, e4b->bd_group);
-	mb_free_blocks(NULL, e4b, start, len);
+	mb_free_blocks_test(NULL, e4b, start, len);
 	ext4_unlock_group(sb, e4b->bd_group);

-	mb_clear_bits(bitmap, start, len);
+	mb_clear_bits_test(bitmap, start, len);
 	/* bypass bb_free validatoin in ext4_mb_generate_buddy */
 	grp->bb_free += len;
 	memset(buddy, 0xff, sb->s_blocksize);
 	for (i = 0; i < MB_NUM_ORDERS(sb); i++)
 		grp->bb_counters[i] = 0;
-	ext4_mb_generate_buddy(sb, buddy, bitmap, 0, grp);
+	ext4_mb_generate_buddy_test(sb, buddy, bitmap, 0, grp);

 	KUNIT_ASSERT_EQ(test, memcmp(buddy, e4b->bd_buddy, sb->s_blocksize),
 			0);
···
 				bb_counters[MB_NUM_ORDERS(sb)]), GFP_KERNEL);
 	KUNIT_ASSERT_NOT_ERR_OR_NULL(test, grp);

-	ret = ext4_mb_load_buddy(sb, TEST_GOAL_GROUP, &e4b);
+	ret = ext4_mb_load_buddy_test(sb, TEST_GOAL_GROUP, &e4b);
 	KUNIT_ASSERT_EQ(test, ret, 0);

 	ex.fe_start = 0;
···
 	ex.fe_group = TEST_GOAL_GROUP;

 	ext4_lock_group(sb, TEST_GOAL_GROUP);
-	mb_mark_used(&e4b, &ex);
+	mb_mark_used_test(&e4b, &ex);
 	ext4_unlock_group(sb, TEST_GOAL_GROUP);

 	grp->bb_free = 0;
···
 		test_mb_free_blocks_range(test, &e4b, ranges[i].start,
 					  ranges[i].len, bitmap, buddy, grp);

-	ext4_mb_unload_buddy(&e4b);
+	ext4_mb_unload_buddy_test(&e4b);
 }

 #define COUNT_FOR_ESTIMATE 100000
···
 	if (sb->s_blocksize > PAGE_SIZE)
 		kunit_skip(test, "blocksize exceeds pagesize");

-	ret = ext4_mb_load_buddy(sb, TEST_GOAL_GROUP, &e4b);
+	ret = ext4_mb_load_buddy_test(sb, TEST_GOAL_GROUP, &e4b);
 	KUNIT_ASSERT_EQ(test, ret, 0);

 	ex.fe_group = TEST_GOAL_GROUP;
···
 			ex.fe_start = ranges[i].start;
 			ex.fe_len = ranges[i].len;
 			ext4_lock_group(sb, TEST_GOAL_GROUP);
-			mb_mark_used(&e4b, &ex);
+			mb_mark_used_test(&e4b, &ex);
 			ext4_unlock_group(sb, TEST_GOAL_GROUP);
 		}
 		end = jiffies;
···
 				continue;

 			ext4_lock_group(sb, TEST_GOAL_GROUP);
-			mb_free_blocks(NULL, &e4b, ranges[i].start,
+			mb_free_blocks_test(NULL, &e4b, ranges[i].start,
 				       ranges[i].len);
 			ext4_unlock_group(sb, TEST_GOAL_GROUP);
 		}
 	}

 	kunit_info(test, "costed jiffies %lu\n", all);
-	ext4_mb_unload_buddy(&e4b);
+	ext4_mb_unload_buddy_test(&e4b);
 }

 static const struct mbt_ext4_block_layout mbt_test_layouts[] = {
+114 -18
fs/ext4/mballoc.c
···

 	/* searching for the right group start from the goal value specified */
 	start = ac->ac_g_ex.fe_group;
+	if (start >= ngroups)
+		start = 0;
 	ac->ac_prefetch_grp = start;
 	ac->ac_prefetch_nr = 0;

···
 		return 0;

 	err = ext4_mb_load_buddy(ac->ac_sb, group, e4b);
-	if (err)
+	if (err) {
+		if (EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info) &&
+		    !(ac->ac_flags & EXT4_MB_HINT_GOAL_ONLY))
+			return 0;
 		return err;
+	}

 	ext4_lock_group(ac->ac_sb, group);
 	if (unlikely(EXT4_MB_GRP_BBITMAP_CORRUPT(e4b->bd_info)))
···
 	rcu_read_unlock();
 	iput(sbi->s_buddy_cache);
 err_freesgi:
-	rcu_read_lock();
-	kvfree(rcu_dereference(sbi->s_group_info));
-	rcu_read_unlock();
+	kvfree(rcu_access_pointer(sbi->s_group_info));
 	return -ENOMEM;
 }

···
 	struct kmem_cache *cachep = get_groupinfo_cache(sb->s_blocksize_bits);
 	int count;

-	if (test_opt(sb, DISCARD)) {
-		/*
-		 * wait the discard work to drain all of ext4_free_data
-		 */
-		flush_work(&sbi->s_discard_work);
-		WARN_ON_ONCE(!list_empty(&sbi->s_discard_list));
-	}
+	/*
+	 * wait the discard work to drain all of ext4_free_data
+	 */
+	flush_work(&sbi->s_discard_work);
+	WARN_ON_ONCE(!list_empty(&sbi->s_discard_list));

-	if (sbi->s_group_info) {
+	group_info = rcu_access_pointer(sbi->s_group_info);
+	if (group_info) {
 		for (i = 0; i < ngroups; i++) {
 			cond_resched();
 			grinfo = ext4_get_group_info(sb, i);
···
 		num_meta_group_infos = (ngroups +
 				EXT4_DESC_PER_BLOCK(sb) - 1) >>
 			EXT4_DESC_PER_BLOCK_BITS(sb);
-		rcu_read_lock();
-		group_info = rcu_dereference(sbi->s_group_info);
 		for (i = 0; i < num_meta_group_infos; i++)
 			kfree(group_info[i]);
 		kvfree(group_info);
-		rcu_read_unlock();
 	}
 	ext4_mb_avg_fragment_size_destroy(sbi);
 	ext4_mb_largest_free_orders_destroy(sbi);
···

 #define EXT4_MB_BITMAP_MARKED_CHECK 0x0001
 #define EXT4_MB_SYNC_UPDATE 0x0002
-static int
+int
 ext4_mb_mark_context(handle_t *handle, struct super_block *sb, bool state,
 		     ext4_group_t group, ext4_grpblk_t blkoff,
 		     ext4_grpblk_t len, int flags, ext4_grpblk_t *ret_changed)
···
 	return error;
 }

-#ifdef CONFIG_EXT4_KUNIT_TESTS
-#include "mballoc-test.c"
+#if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS)
+void mb_clear_bits_test(void *bm, int cur, int len)
+{
+	mb_clear_bits(bm, cur, len);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_clear_bits_test);
+
+ext4_fsblk_t
+ext4_mb_new_blocks_simple_test(struct ext4_allocation_request *ar,
+			       int *errp)
+{
+	return ext4_mb_new_blocks_simple(ar, errp);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_new_blocks_simple_test);
+
+int mb_find_next_zero_bit_test(void *addr, int max, int start)
+{
+	return mb_find_next_zero_bit(addr, max, start);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_find_next_zero_bit_test);
+
+int mb_find_next_bit_test(void *addr, int max, int start)
+{
+	return mb_find_next_bit(addr, max, start);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_find_next_bit_test);
+
+void mb_clear_bit_test(int bit, void *addr)
+{
+	mb_clear_bit(bit, addr);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_clear_bit_test);
+
+int mb_test_bit_test(int bit, void *addr)
+{
+	return mb_test_bit(bit, addr);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_test_bit_test);
+
+int ext4_mb_mark_diskspace_used_test(struct ext4_allocation_context *ac,
+				     handle_t *handle)
+{
+	return ext4_mb_mark_diskspace_used(ac, handle);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_mark_diskspace_used_test);
+
+int mb_mark_used_test(struct ext4_buddy *e4b, struct ext4_free_extent *ex)
+{
+	return mb_mark_used(e4b, ex);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_mark_used_test);
+
+void ext4_mb_generate_buddy_test(struct super_block *sb, void *buddy,
+				 void *bitmap, ext4_group_t group,
+				 struct ext4_group_info *grp)
+{
+	ext4_mb_generate_buddy(sb, buddy, bitmap, group, grp);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_generate_buddy_test);
+
+int ext4_mb_load_buddy_test(struct super_block *sb, ext4_group_t group,
+			    struct ext4_buddy *e4b)
+{
+	return ext4_mb_load_buddy(sb, group, e4b);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_load_buddy_test);
+
+void ext4_mb_unload_buddy_test(struct ext4_buddy *e4b)
+{
+	ext4_mb_unload_buddy(e4b);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_unload_buddy_test);
+
+void mb_free_blocks_test(struct inode *inode, struct ext4_buddy *e4b,
+			 int first, int count)
+{
+	mb_free_blocks(inode, e4b, first, count);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_free_blocks_test);
+
+void ext4_free_blocks_simple_test(struct inode *inode, ext4_fsblk_t block,
+				  unsigned long count)
+{
+	return ext4_free_blocks_simple(inode, block, count);
+}
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_free_blocks_simple_test);
+
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_wait_block_bitmap);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_init);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_get_group_desc);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_count_free_clusters);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_get_group_info);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_free_group_clusters_set);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_release);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_read_block_bitmap_nowait);
+EXPORT_SYMBOL_FOR_EXT4_TEST(mb_set_bits);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_fc_init_inode);
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4_mb_mark_context);
 #endif
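Reviewer's note on the first mballoc.c hunk: the new check resets an out-of-range goal group to 0 before it is used as the prefetch start. A minimal userspace sketch of that clamp follows. `group_t` and `clamp_start_group` are hypothetical names introduced only for illustration and are not kernel symbols; how the goal ends up out of range is not shown in this diff.

```c
#include <assert.h>

/* Hypothetical stand-in for ext4_group_t; not the kernel typedef. */
typedef unsigned int group_t;

/*
 * Return the group at which a scan should start. A goal at or beyond
 * ngroups falls back to group 0, mirroring the
 * "if (start >= ngroups) start = 0;" lines added in the hunk.
 */
group_t clamp_start_group(group_t goal, group_t ngroups)
{
	assert(ngroups > 0);	/* the sketch assumes at least one group */
	return goal >= ngroups ? 0 : goal;
}
```

With 8 groups, a goal of 3 is kept as-is, while a goal of 8 or 100 restarts the scan at group 0.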
+30
fs/ext4/mballoc.h
···
 		ext4_mballoc_query_range_fn formatter,
 		void *priv);

+extern int ext4_mb_mark_context(handle_t *handle,
+		struct super_block *sb, bool state,
+		ext4_group_t group, ext4_grpblk_t blkoff,
+		ext4_grpblk_t len, int flags,
+		ext4_grpblk_t *ret_changed);
+#if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS)
+extern void mb_clear_bits_test(void *bm, int cur, int len);
+extern ext4_fsblk_t
+ext4_mb_new_blocks_simple_test(struct ext4_allocation_request *ar,
+			       int *errp);
+extern int mb_find_next_zero_bit_test(void *addr, int max, int start);
+extern int mb_find_next_bit_test(void *addr, int max, int start);
+extern void mb_clear_bit_test(int bit, void *addr);
+extern int mb_test_bit_test(int bit, void *addr);
+extern int
+ext4_mb_mark_diskspace_used_test(struct ext4_allocation_context *ac,
+				 handle_t *handle);
+extern int mb_mark_used_test(struct ext4_buddy *e4b,
+		struct ext4_free_extent *ex);
+extern void ext4_mb_generate_buddy_test(struct super_block *sb,
+		void *buddy, void *bitmap, ext4_group_t group,
+		struct ext4_group_info *grp);
+extern int ext4_mb_load_buddy_test(struct super_block *sb,
+		ext4_group_t group, struct ext4_buddy *e4b);
+extern void ext4_mb_unload_buddy_test(struct ext4_buddy *e4b);
+extern void mb_free_blocks_test(struct inode *inode,
+		struct ext4_buddy *e4b, int first, int count);
+extern void ext4_free_blocks_simple_test(struct inode *inode,
+		ext4_fsblk_t block, unsigned long count);
+#endif
 #endif
+8 -2
fs/ext4/page-io.c
···
 		nr_to_submit++;
 	} while ((bh = bh->b_this_page) != head);

-	/* Nothing to submit? Just unlock the folio... */
-	if (!nr_to_submit)
+	if (!nr_to_submit) {
+		/*
+		 * We have nothing to submit. Just cycle the folio through
+		 * writeback state to properly update xarray tags.
+		 */
+		__folio_start_writeback(folio, keep_towrite);
+		folio_end_writeback(folio);
 		return 0;
+	}

 	bh = head = folio_buffers(folio);

+31 -6
fs/ext4/super.c
···
 	struct buffer_head **group_desc;
 	int i;

-	rcu_read_lock();
-	group_desc = rcu_dereference(sbi->s_group_desc);
+	group_desc = rcu_access_pointer(sbi->s_group_desc);
 	for (i = 0; i < sbi->s_gdb_count; i++)
 		brelse(group_desc[i]);
 	kvfree(group_desc);
-	rcu_read_unlock();
 }

 static void ext4_flex_groups_free(struct ext4_sb_info *sbi)
···
 	struct flex_groups **flex_groups;
 	int i;

-	rcu_read_lock();
-	flex_groups = rcu_dereference(sbi->s_flex_groups);
+	flex_groups = rcu_access_pointer(sbi->s_flex_groups);
 	if (flex_groups) {
 		for (i = 0; i < sbi->s_flex_groups_allocated; i++)
 			kvfree(flex_groups[i]);
 		kvfree(flex_groups);
 	}
-	rcu_read_unlock();
 }

 static void ext4_put_super(struct super_block *sb)
···
 	invalidate_inode_buffers(inode);
 	clear_inode(inode);
 	ext4_discard_preallocations(inode);
+	/*
+	 * We must remove the inode from the hash before ext4_free_inode()
+	 * clears the bit in inode bitmap as otherwise another process reusing
+	 * the inode will block in insert_inode_hash() waiting for inode
+	 * eviction to complete while holding transaction handle open, but
+	 * ext4_evict_inode() still running for that inode could block waiting
+	 * for transaction commit if the inode is marked as IS_SYNC => deadlock.
+	 *
+	 * Removing the inode from the hash here is safe. There are two cases
+	 * to consider:
+	 * 1) The inode still has references to it (i_nlink > 0). In that case
+	 * we are keeping the inode and once we remove the inode from the hash,
+	 * iget() can create the new inode structure for the same inode number
+	 * and we are fine with that as all IO on behalf of the inode is
+	 * finished.
+	 * 2) We are deleting the inode (i_nlink == 0). In that case inode
+	 * number cannot be reused until ext4_free_inode() clears the bit in
+	 * the inode bitmap, at which point all IO is done and reuse is fine
+	 * again.
+	 */
+	remove_inode_hash(inode);
 	ext4_es_remove_extent(inode, 0, EXT_MAX_BLOCKS);
 	dquot_drop(inode);
 	if (EXT4_I(inode)->jinode) {
···
 			 "extents feature\n");
 		return 0;
 	}
+	if (ext4_has_feature_bigalloc(sb) &&
+	    le32_to_cpu(EXT4_SB(sb)->s_es->s_first_data_block)) {
+		ext4_msg(sb, KERN_WARNING,
+			 "bad geometry: bigalloc file system with non-zero "
+			 "first_data_block\n");
+		return 0;
+	}

 #if !IS_ENABLED(CONFIG_QUOTA) || !IS_ENABLED(CONFIG_QFMT_V2)
 	if (!readonly && (ext4_has_feature_quota(sb) ||
···

 	timer_setup(&sbi->s_err_report, print_daily_error_info, 0);
 	spin_lock_init(&sbi->s_error_lock);
+	mutex_init(&sbi->s_error_notify_mutex);
 	INIT_WORK(&sbi->s_sb_upd_work, update_super_work);

 	err = ext4_group_desc_init(sb, es, logical_sb_block, &first_not_zeroed);
+9 -1
fs/ext4/sysfs.c
···

 void ext4_notify_error_sysfs(struct ext4_sb_info *sbi)
 {
-	sysfs_notify(&sbi->s_kobj, NULL, "errors_count");
+	mutex_lock(&sbi->s_error_notify_mutex);
+	if (sbi->s_kobj.state_in_sysfs)
+		sysfs_notify(&sbi->s_kobj, NULL, "errors_count");
+	mutex_unlock(&sbi->s_error_notify_mutex);
 }

 static struct kobject *ext4_root;
···
 	int err;

 	init_completion(&sbi->s_kobj_unregister);
+	mutex_lock(&sbi->s_error_notify_mutex);
 	err = kobject_init_and_add(&sbi->s_kobj, &ext4_sb_ktype, ext4_root,
 				   "%s", sb->s_id);
+	mutex_unlock(&sbi->s_error_notify_mutex);
 	if (err) {
 		kobject_put(&sbi->s_kobj);
 		wait_for_completion(&sbi->s_kobj_unregister);
···

 	if (sbi->s_proc)
 		remove_proc_subtree(sb->s_id, ext4_proc_root);
+
+	mutex_lock(&sbi->s_error_notify_mutex);
 	kobject_del(&sbi->s_kobj);
+	mutex_unlock(&sbi->s_error_notify_mutex);
 }

 int __init ext4_init_sysfs(void)
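Reviewer's note on the sysfs.c hunks: registration, removal, and notification all take the same mutex, and the notifier checks a "still registered" flag before touching the object. The following is a minimal userspace sketch of that pattern using POSIX threads. `struct notifier` and the `notifier_*` functions are hypothetical names for illustration only; `registered` stands in for `s_kobj.state_in_sysfs` and the counter stands in for the `sysfs_notify()` call.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the sysfs-visible object; not a kernel type. */
struct notifier {
	pthread_mutex_t lock;	/* plays the role of s_error_notify_mutex */
	bool registered;	/* plays the role of s_kobj.state_in_sysfs */
	int notifications;	/* counts notifications actually delivered */
};

void notifier_init(struct notifier *n)
{
	int rc = pthread_mutex_init(&n->lock, NULL);

	assert(rc == 0);
	n->registered = false;
	n->notifications = 0;
}

/* Registration flips the flag while holding the lock. */
void notifier_register(struct notifier *n)
{
	pthread_mutex_lock(&n->lock);
	n->registered = true;
	pthread_mutex_unlock(&n->lock);
}

/* Removal holds the same lock, so it cannot overlap a notification. */
void notifier_unregister(struct notifier *n)
{
	pthread_mutex_lock(&n->lock);
	n->registered = false;
	pthread_mutex_unlock(&n->lock);
}

/* Deliver a notification only while registered; returns true if sent. */
bool notifier_notify(struct notifier *n)
{
	bool delivered = false;

	pthread_mutex_lock(&n->lock);
	if (n->registered) {
		n->notifications++;
		delivered = true;
	}
	pthread_mutex_unlock(&n->lock);
	return delivered;
}
```

Because the flag is tested and acted on under the lock, a notification that arrives after removal is skipped rather than reaching an object that is no longer registered.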
+13 -2
fs/jbd2/checkpoint.c
···
 		 */
 		BUFFER_TRACE(bh, "queue");
 		get_bh(bh);
-		J_ASSERT_BH(bh, !buffer_jwrite(bh));
+		if (WARN_ON_ONCE(buffer_jwrite(bh))) {
+			put_bh(bh); /* drop the ref we just took */
+			spin_unlock(&journal->j_list_lock);
+			/* Clean up any previously batched buffers */
+			if (batch_count)
+				__flush_batch(journal, &batch_count);
+			jbd2_journal_abort(journal, -EFSCORRUPTED);
+			return -EFSCORRUPTED;
+		}
 		journal->j_chkpt_bhs[batch_count++] = bh;
 		transaction->t_chp_stats.cs_written++;
 		transaction->t_checkpoint_list = jh->b_cpnext;
···

 	if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr))
 		return 1;
-	J_ASSERT(blocknr != 0);
+	if (WARN_ON_ONCE(blocknr == 0)) {
+		jbd2_journal_abort(journal, -EFSCORRUPTED);
+		return -EFSCORRUPTED;
+	}

 	/*
 	 * We need to make sure that any blocks that were recently written out
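Reviewer's note on the checkpoint.c hunks: both replace a hard assertion with a warning, a journal abort, and an error return. A minimal userspace sketch of that shape follows. `struct sketch_journal`, `sketch_abort`, `sketch_check_tail`, and `SKETCH_EFSCORRUPTED` are hypothetical names and values chosen for illustration; the kernel code uses `jbd2_journal_abort()` and `-EFSCORRUPTED`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical error value for the sketch; the kernel returns -EFSCORRUPTED. */
#define SKETCH_EFSCORRUPTED 117

/* Hypothetical stand-in for journal_t, tracking only the aborted state. */
struct sketch_journal {
	bool aborted;
	int abort_errno;
};

/* Record the first abort reason; later aborts keep the original one. */
void sketch_abort(struct sketch_journal *j, int err)
{
	assert(err < 0);	/* the sketch expects a negative error code */
	if (!j->aborted) {
		j->aborted = true;
		j->abort_errno = err;
	}
}

/*
 * Validate the log tail block number. Returns 0 when it is usable. On
 * the "impossible" value 0, abort the journal and hand the error back
 * to the caller instead of halting the whole process.
 */
int sketch_check_tail(struct sketch_journal *j, unsigned long blocknr)
{
	if (blocknr == 0) {
		sketch_abort(j, -SKETCH_EFSCORRUPTED);
		return -SKETCH_EFSCORRUPTED;
	}
	return 0;
}
```

The caller sees a negative return and the journal is marked aborted, so further work can be refused without crashing on the inconsistent value.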