Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'f2fs-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull f2fs updates from Jaegeuk Kim:
"In this round, the changes primarily focus on resolving race
conditions and memory safety issues (use-after-free), and on
improving the robustness of garbage collection (GC) and folio
management.

Enhancements:
- add page-order information for large folio reads in iostat
- add defrag_blocks sysfs node

Bug fixes:
- fix uninitialized kobject put in f2fs_init_sysfs()
- disallow setting an extension to both cold and hot
- fix node_cnt race between extent node destroy and writeback
- preserve previous reserve_{blocks,node} value when remount
- freeze GC and discard threads quickly
- fix false alarm of lockdep on cp_global_sem lock
- fix data loss caused by incorrect use of nat_entry flag
- skip empty sections in f2fs_get_victim
- fix inline data not being written to disk in writeback path
- fix fsck inconsistency caused by FGGC of node block
- fix fsck inconsistency caused by incorrect nat_entry flag usage
- call f2fs_handle_critical_error() to set cp_error flag
- fix fiemap boundary handling when read extent cache is incomplete
- fix use-after-free of sbi in f2fs_compress_write_end_io()
- fix UAF caused by decrementing sbi->nr_pages[] in f2fs_write_end_io()
- fix incorrect file address mapping when inline inode is unwritten
- fix incomplete search range in f2fs_get_victim when f2fs_need_rand_seg is enabled
- avoid memory leak in f2fs_rename()"

* tag 'f2fs-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (35 commits)
f2fs: add page-order information for large folio reads in iostat
f2fs: do not support mmap write for large folio
f2fs: fix uninitialized kobject put in f2fs_init_sysfs()
f2fs: protect extension_list reading with sb_lock in f2fs_sbi_show()
f2fs: disallow setting an extension to both cold and hot
f2fs: fix node_cnt race between extent node destroy and writeback
f2fs: allow empty mount string for Opt_usr|grp|projjquota
f2fs: fix to preserve previous reserve_{blocks,node} value when remount
f2fs: invalidate block device page cache on umount
f2fs: fix to freeze GC and discard threads quickly
f2fs: fix to avoid uninit-value access in f2fs_sanity_check_node_footer
f2fs: fix false alarm of lockdep on cp_global_sem lock
f2fs: fix data loss caused by incorrect use of nat_entry flag
f2fs: fix to skip empty sections in f2fs_get_victim
f2fs: fix inline data not being written to disk in writeback path
f2fs: fix fsck inconsistency caused by FGGC of node block
f2fs: fix fsck inconsistency caused by incorrect nat_entry flag usage
f2fs: fix to do sanity check on dcc->discard_cmd_cnt conditionally
f2fs: refactor node footer flag setting related code
f2fs: refactor f2fs_move_node_folio function
...

+387 -152
+6
Documentation/ABI/testing/sysfs-fs-f2fs
···
 Description:	Average number of valid blocks.
		Available when CONFIG_F2FS_STAT_FS=y.

+What:		/sys/fs/f2fs/<disk>/defrag_blocks
+Date:		February 2026
+Contact:	"Jinbao Liu" <liujinbao1@xiaomi.com>
+Description:	Number of blocks moved by defragment.
+		Available when CONFIG_F2FS_STAT_FS=y.
+
 What:		/sys/fs/f2fs/<disk>/mounted_time_sec
 Date:		February 2020
 Contact:	"Jaegeuk Kim" <jaegeuk@kernel.org>
-9
fs/f2fs/checkpoint.c
···
 static struct kmem_cache *ino_entry_slab;
 struct kmem_cache *f2fs_inode_entry_slab;

-void f2fs_stop_checkpoint(struct f2fs_sb_info *sbi, bool end_io,
-						unsigned char reason)
-{
-	f2fs_build_fault_attr(sbi, 0, 0, FAULT_ALL);
-	if (!end_io)
-		f2fs_flush_merged_writes(sbi);
-	f2fs_handle_critical_error(sbi, reason);
-}
-
 /*
  * We guarantee no failure on the returned page.
  */
+11 -3
fs/f2fs/compress.c
···
 	f2fs_compress_free_page(page);

-	dec_page_count(sbi, type);
-
-	if (atomic_dec_return(&cic->pending_pages))
+	if (atomic_dec_return(&cic->pending_pages)) {
+		dec_page_count(sbi, type);
 		return;
+	}

 	for (i = 0; i < cic->nr_rpages; i++) {
 		WARN_ON(!cic->rpages[i]);
···
 	page_array_free(sbi, cic->rpages, cic->nr_rpages);
 	kmem_cache_free(cic_entry_slab, cic);
+
+	/*
+	 * Make sure dec_page_count() is the last access to sbi.
+	 * Once it drops the F2FS_WB_CP_DATA counter to zero, the
+	 * unmount thread can proceed to destroy sbi and
+	 * sbi->page_array_slab.
+	 */
+	dec_page_count(sbi, type);
 }

 static int f2fs_write_raw_pages(struct compress_ctx *cc,
+34 -19
fs/f2fs/data.c
···
 	while (nr_pages--)
 		dec_page_count(F2FS_F_SB(folio), __read_io_type(folio));

-	if (F2FS_F_SB(folio)->node_inode && is_node_folio(folio) &&
+	if (bio->bi_status == BLK_STS_OK &&
+	    F2FS_F_SB(folio)->node_inode && is_node_folio(folio) &&
 	    f2fs_sanity_check_node_footer(F2FS_F_SB(folio),
 			folio, folio->index, NODE_TYPE_REGULAR, true))
 		bio->bi_status = BLK_STS_IOERR;
···
 			folio->index, NODE_TYPE_REGULAR, true);
 		f2fs_bug_on(sbi, folio->index != nid_of_node(folio));
 	}
+	if (f2fs_in_warm_node_list(folio))
+		f2fs_del_fsync_node_entry(sbi, folio);

 	dec_page_count(sbi, type);
···
 	    wq_has_sleeper(&sbi->cp_wait))
 		wake_up(&sbi->cp_wait);

-	if (f2fs_in_warm_node_list(sbi, folio))
-		f2fs_del_fsync_node_entry(sbi, folio);
 	folio_clear_f2fs_gcing(folio);
 	folio_end_writeback(folio);
 }
···
 		f2fs_wait_on_block_writeback_range(inode,
 					map->m_pblk, map->m_len);

-	if (f2fs_allow_multi_device_dio(sbi, flag)) {
+	map->m_multidev_dio = f2fs_allow_multi_device_dio(sbi, flag);
+	if (map->m_multidev_dio) {
 		int bidx = f2fs_target_device_index(sbi, map->m_pblk);
 		struct f2fs_dev_info *dev = &sbi->devs[bidx];
···
 	lfs_dio_write = (flag == F2FS_GET_BLOCK_DIO && f2fs_lfs_mode(sbi) &&
 					map->m_may_create);

-	if (!map->m_may_create && f2fs_map_blocks_cached(inode, map, flag))
-		goto out;
+	if (!map->m_may_create && f2fs_map_blocks_cached(inode, map, flag)) {
+		struct extent_info ei;
+
+		/*
+		 * 1. If map->m_multidev_dio is true, map->m_pblk cannot be
+		 * waited on by f2fs_wait_on_block_writeback_range() and is
+		 * not mergeable.
+		 * 2. If pgofs hits the read extent cache, it means the mapping
+		 * is already cached in the extent cache, but it is not
+		 * mergeable, and there is no need to query the mapping again
+		 * via f2fs_get_dnode_of_data().
+		 */
+		pgofs = (pgoff_t)map->m_lblk + map->m_len;
+		if (map->m_len == maxblocks ||
+		    map->m_multidev_dio ||
+		    f2fs_lookup_read_extent_cache(inode, pgofs, &ei))
+			goto out;
+		ofs = map->m_len;
+		goto map_more;
+	}

 	map->m_bdev = inode->i_sb->s_bdev;
 	map->m_multidev_dio =
···
 	/* it only supports block size == page size */
 	pgofs = (pgoff_t)map->m_lblk;
-	end = pgofs + maxblocks;
+map_more:
+	end = (pgoff_t)map->m_lblk + maxblocks;

 	if (flag == F2FS_GET_BLOCK_PRECACHE)
 		mode = LOOKUP_NODE_RA;
···
 	if (!folio)
 		goto out;

+	f2fs_update_read_folio_count(F2FS_I_SB(inode), folio);
+
 	folio_in_bio = false;
 	index = folio->index;
 	offset = 0;
···
 		prefetchw(&folio->flags);
 	}

+	f2fs_update_read_folio_count(F2FS_I_SB(inode), folio);
+
 #ifdef CONFIG_F2FS_FS_COMPRESSION
 	index = folio->index;
···
 	struct inode *inode = fio_inode(fio);
 	struct folio *mfolio;
 	struct page *page;
-	gfp_t gfp_flags = GFP_NOFS;

 	if (!f2fs_encrypted_file(inode))
 		return 0;
···
 	if (fscrypt_inode_uses_inline_crypto(inode))
 		return 0;

-retry_encrypt:
 	fio->encrypted_page = fscrypt_encrypt_pagecache_blocks(page_folio(page),
-					PAGE_SIZE, 0, gfp_flags);
-	if (IS_ERR(fio->encrypted_page)) {
-		/* flush pending IOs and wait for a while in the ENOMEM case */
-		if (PTR_ERR(fio->encrypted_page) == -ENOMEM) {
-			f2fs_flush_merged_writes(fio->sbi);
-			memalloc_retry_wait(GFP_NOFS);
-			gfp_flags |= __GFP_NOFAIL;
-			goto retry_encrypt;
-		}
+					PAGE_SIZE, 0, GFP_NOFS);
+	if (IS_ERR(fio->encrypted_page))
 		return PTR_ERR(fio->encrypted_page);
-	}

 	mfolio = filemap_lock_folio(META_MAPPING(fio->sbi), fio->old_blkaddr);
 	if (!IS_ERR(mfolio)) {
+1
fs/f2fs/debug.c
···
 			si->bg_node_blks);
 	seq_printf(s, "BG skip : IO: %u, Other: %u\n",
 			si->io_skip_bggc, si->other_skip_bggc);
+	seq_printf(s, "defrag blocks : %u\n", si->defrag_blks);
 	seq_puts(s, "\nExtent Cache (Read):\n");
 	seq_printf(s, "  - Hit Count: L1-1:%llu L1-2:%llu L2:%llu\n",
 			si->hit_largest, si->hit_cached[EX_READ],
+10 -7
fs/f2fs/extent_cache.c
···
 	if (!__init_may_extent_tree(inode, type))
 		return false;

+	if (is_inode_flag_set(inode, FI_NO_EXTENT))
+		return false;
+
 	if (type == EX_READ) {
-		if (is_inode_flag_set(inode, FI_NO_EXTENT))
-			return false;
 		if (is_inode_flag_set(inode, FI_COMPRESSED_FILE) &&
 		    !f2fs_sb_has_readonly(F2FS_I_SB(inode)))
 			return false;
···
 	while (atomic_read(&et->node_cnt)) {
 		write_lock(&et->lock);
+		if (!is_inode_flag_set(inode, FI_NO_EXTENT))
+			set_inode_flag(inode, FI_NO_EXTENT);
 		node_cnt += __free_extent_tree(sbi, et, nr_shrink);
 		write_unlock(&et->lock);
 	}
···
 	write_lock(&et->lock);

-	if (type == EX_READ) {
-		if (is_inode_flag_set(inode, FI_NO_EXTENT)) {
-			write_unlock(&et->lock);
-			return;
-		}
+	if (is_inode_flag_set(inode, FI_NO_EXTENT)) {
+		write_unlock(&et->lock);
+		return;
+	}

+	if (type == EX_READ) {
 		prev = et->largest;
 		dei.len = 0;
+35 -6
fs/f2fs/f2fs.h
···
 #include <linux/uio.h>
 #include <linux/types.h>
+#include <linux/mmzone.h>
 #include <linux/page-flags.h>
 #include <linux/slab.h>
 #include <linux/crc32.h>
···
 	unsigned long long iostat_count[NR_IO_TYPE];
 	unsigned long long iostat_bytes[NR_IO_TYPE];
 	unsigned long long prev_iostat_bytes[NR_IO_TYPE];
+	unsigned long long iostat_read_folio_count[NR_PAGE_ORDERS];
+	unsigned long long prev_iostat_read_folio_count[NR_PAGE_ORDERS];
 	bool iostat_enable;
 	unsigned long iostat_next_period;
 	unsigned int iostat_period_ms;
···
 	/* For io latency related statistics info in one iostat period */
 	spinlock_t iostat_lat_lock;
 	struct iostat_lat_info *iostat_io_lat;
+#endif
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	struct lock_class_key cp_global_sem_key;
 #endif
 };
···
 loff_t max_file_blocks(struct inode *inode);
 void f2fs_quota_off_umount(struct super_block *sb);
 void f2fs_save_errors(struct f2fs_sb_info *sbi, unsigned char flag);
-void f2fs_handle_critical_error(struct f2fs_sb_info *sbi, unsigned char reason);
 void f2fs_handle_error(struct f2fs_sb_info *sbi, unsigned char error);
 int f2fs_commit_super(struct f2fs_sb_info *sbi, bool recover);
 int f2fs_sync_fs(struct super_block *sb, int sync);
···
 int f2fs_check_nid_range(struct f2fs_sb_info *sbi, nid_t nid);
 bool f2fs_available_free_memory(struct f2fs_sb_info *sbi, int type);
-bool f2fs_in_warm_node_list(struct f2fs_sb_info *sbi, struct folio *folio);
+bool f2fs_in_warm_node_list(struct folio *folio);
 void f2fs_init_fsync_node_info(struct f2fs_sb_info *sbi);
 void f2fs_del_fsync_node_entry(struct f2fs_sb_info *sbi, struct folio *folio);
 void f2fs_reset_fsync_node_info(struct f2fs_sb_info *sbi);
-int f2fs_need_dentry_mark(struct f2fs_sb_info *sbi, nid_t nid);
+bool f2fs_need_dentry_mark(struct f2fs_sb_info *sbi, nid_t nid);
 bool f2fs_is_checkpointed_node(struct f2fs_sb_info *sbi, nid_t nid);
 bool f2fs_need_inode_block_update(struct f2fs_sb_info *sbi, nid_t ino);
 int f2fs_get_node_info(struct f2fs_sb_info *sbi, nid_t nid,
···
 		enum node_type ntype, bool in_irq);
 struct folio *f2fs_get_inode_folio(struct f2fs_sb_info *sbi, pgoff_t ino);
 struct folio *f2fs_get_xnode_folio(struct f2fs_sb_info *sbi, pgoff_t xnid);
+int f2fs_write_single_node_folio(struct folio *node_folio, int sync_mode,
+		bool mark_dirty, enum iostat_type io_type);
 int f2fs_move_node_folio(struct folio *node_folio, int gc_type);
 void f2fs_flush_inline_data(struct f2fs_sb_info *sbi);
 int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
···
 int f2fs_start_discard_thread(struct f2fs_sb_info *sbi);
 void f2fs_drop_discard_cmd(struct f2fs_sb_info *sbi);
 void f2fs_stop_discard_thread(struct f2fs_sb_info *sbi);
-bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi);
+bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi, bool need_check);
 void f2fs_clear_prefree_segments(struct f2fs_sb_info *sbi,
 		struct cp_control *cpc);
 void f2fs_dirty_to_prefree(struct f2fs_sb_info *sbi);
···
 	int gc_secs[2][2];
 	int tot_blks, data_blks, node_blks;
 	int bg_data_blks, bg_node_blks;
+	unsigned int defrag_blks;
 	int blkoff[NR_CURSEG_TYPE];
 	int curseg[NR_CURSEG_TYPE];
 	int cursec[NR_CURSEG_TYPE];
···
 		si->bg_node_blks += ((gc_type) == BG_GC) ? (blks) : 0;	\
 	} while (0)

+#define stat_inc_defrag_blk_count(sbi, blks)				\
+	(F2FS_STAT(sbi)->defrag_blks += (blks))
+
 int f2fs_build_stats(struct f2fs_sb_info *sbi);
 void f2fs_destroy_stats(struct f2fs_sb_info *sbi);
 void __init f2fs_create_root_stats(void);
···
 #define stat_inc_tot_blk_count(si, blks)		do { } while (0)
 #define stat_inc_data_blk_count(sbi, blks, gc_type)	do { } while (0)
 #define stat_inc_node_blk_count(sbi, blks, gc_type)	do { } while (0)
+#define stat_inc_defrag_blk_count(sbi, blks)		do { } while (0)

 static inline int f2fs_build_stats(struct f2fs_sb_info *sbi) { return 0; }
 static inline void f2fs_destroy_stats(struct f2fs_sb_info *sbi) { }
···
 		return;

 	if (ofs == sbi->page_eio_ofs[type]) {
-		if (sbi->page_eio_cnt[type]++ == MAX_RETRY_PAGE_EIO)
-			set_ckpt_flags(sbi, CP_ERROR_FLAG);
+		if (sbi->page_eio_cnt[type]++ == MAX_RETRY_PAGE_EIO) {
+			enum stop_cp_reason stop_reason;
+
+			switch (type) {
+			case META:
+				stop_reason = STOP_CP_REASON_READ_META;
+				break;
+			case NODE:
+				stop_reason = STOP_CP_REASON_READ_NODE;
+				break;
+			case DATA:
+				stop_reason = STOP_CP_REASON_READ_DATA;
+				break;
+			default:
+				f2fs_bug_on(sbi, 1);
+				return;
+			}
+			f2fs_stop_checkpoint(sbi, false, stop_reason);
+		}
 	} else {
 		sbi->page_eio_ofs[type] = ofs;
 		sbi->page_eio_cnt[type] = 0;
+13 -2
fs/f2fs/file.c
···
 	int err = 0;
 	vm_fault_t ret;

-	if (unlikely(IS_IMMUTABLE(inode)))
+	/*
+	 * We only support large folio on the read case.
+	 * Don't make any dirty pages.
+	 */
+	if (unlikely(IS_IMMUTABLE(inode)) ||
+	    mapping_large_folio_support(inode->i_mapping)) {
+		f2fs_err(sbi, "Not expected: immutable: %d large_folio: %d",
+			 IS_IMMUTABLE(inode),
+			 mapping_large_folio_support(inode->i_mapping));
 		return VM_FAULT_SIGBUS;
+	}

 	if (is_inode_flag_set(inode, FI_COMPRESS_RELEASED)) {
 		err = -EIO;
···
 	clear_inode_flag(inode, FI_OPU_WRITE);
 unlock_out:
 	inode_unlock(inode);
-	if (!err)
+	if (!err) {
 		range->len = (u64)total << PAGE_SHIFT;
+		stat_inc_defrag_blk_count(sbi, total);
+	}
 	return err;
 }
+20 -3
fs/f2fs/gc.c
···
 	p->max_search = sbi->max_victim_search;

 	/* let's select beginning hot/small space first. */
-	if (f2fs_need_rand_seg(sbi))
+	if (f2fs_need_rand_seg(sbi)) {
 		p->offset = get_random_u32_below(MAIN_SECS(sbi) *
 						SEGS_PER_SEC(sbi));
-	else if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
+		SIT_I(sbi)->last_victim[p->gc_mode] = p->offset;
+	} else if (type == CURSEG_HOT_DATA || IS_NODESEG(type))
 		p->offset = 0;
 	else
 		p->offset = SIT_I(sbi)->last_victim[p->gc_mode];
···
 			if (!f2fs_segment_has_free_slot(sbi, segno))
 				goto next;
 		}
+
+		if (!get_valid_blocks(sbi, segno, true))
+			goto next;
 	}

 	if (gc_type == BG_GC && test_bit(secno, dirty_i->victim_secmap))
···
 		.encrypted_page = NULL,
 		.in_list = 0,
 	};
-	int err;
+	int err = 0;

 	folio = f2fs_grab_cache_folio(mapping, index, true);
 	if (IS_ERR(folio))
···
 	}

 	fio.encrypted_page = &efolio->page;
+
+	if (folio_test_uptodate(efolio))
+		goto put_encrypted_page;

 	err = f2fs_submit_page_bio(&fio);
 	if (err)
···
 			sbi->next_victim_seg[gc_type] =
 				(cur_segno + 1 < sec_end_segno) ?
 					cur_segno + 1 : NULL_SEGNO;
+
+			if (unlikely(freezing(current))) {
+				folio_put_refs(sum_folio, 2);
+				goto stop;
+			}
 		}
 next_block:
 		folio_put_refs(sum_folio, 2);
 		segno = block_end_segno;
 	}

+stop:
 	if (submitted)
 		f2fs_submit_merged_write(sbi, data_type);
···
 		goto stop;
 	}
 retry:
+	if (unlikely(freezing(current))) {
+		ret = 0;
+		goto stop;
+	}
 	ret = __get_victim(sbi, &segno, gc_type, gc_control->one_time);
 	if (ret) {
 		/* allow to search victim from sections has pinned data */
+18 -4
fs/f2fs/inline.c
···
 int f2fs_inline_data_fiemap(struct inode *inode,
 		struct fiemap_extent_info *fieinfo, __u64 start, __u64 len)
 {
-	__u64 byteaddr, ilen;
+	__u64 byteaddr = 0, ilen;
 	__u32 flags = FIEMAP_EXTENT_DATA_INLINE | FIEMAP_EXTENT_NOT_ALIGNED |
 		FIEMAP_EXTENT_LAST;
 	struct node_info ni;
···
 		goto out;
 	}

+	if (fieinfo->fi_flags & FIEMAP_FLAG_SYNC) {
+		err = f2fs_write_single_node_folio(ifolio, true, false, FS_NODE_IO);
+		if (err)
+			return err;
+		ifolio = f2fs_get_inode_folio(F2FS_I_SB(inode), inode->i_ino);
+		if (IS_ERR(ifolio))
+			return PTR_ERR(ifolio);
+		f2fs_folio_wait_writeback(ifolio, NODE, true, true);
+	}
 	ilen = min_t(size_t, MAX_INLINE_DATA(inode), i_size_read(inode));
 	if (start >= ilen)
 		goto out;
···
 	if (err)
 		goto out;

-	byteaddr = (__u64)ni.blk_addr << inode->i_sb->s_blocksize_bits;
-	byteaddr += (char *)inline_data_addr(inode, ifolio) -
-					(char *)F2FS_INODE(ifolio);
+	if (__is_valid_data_blkaddr(ni.blk_addr)) {
+		byteaddr = (__u64)ni.blk_addr << inode->i_sb->s_blocksize_bits;
+		byteaddr += (char *)inline_data_addr(inode, ifolio) -
+						(char *)F2FS_INODE(ifolio);
+	} else {
+		f2fs_bug_on(F2FS_I_SB(inode), ni.blk_addr != NEW_ADDR);
+		flags |= FIEMAP_EXTENT_DELALLOC | FIEMAP_EXTENT_UNKNOWN;
+	}
 	err = fiemap_fill_next_extent(fieinfo, start, byteaddr, ilen, flags);
 	trace_f2fs_fiemap(inode, start, byteaddr, ilen, flags, err);
 out:
+1 -1
fs/f2fs/inode.c
···
 	ri->i_uid = cpu_to_le32(i_uid_read(inode));
 	ri->i_gid = cpu_to_le32(i_gid_read(inode));
 	ri->i_links = cpu_to_le32(inode->i_nlink);
-	ri->i_blocks = cpu_to_le64(SECTOR_TO_BLOCK(inode->i_blocks) + 1);
+	ri->i_blocks = cpu_to_le64(SECTOR_TO_BLOCK(READ_ONCE(inode->i_blocks)) + 1);

 	if (!f2fs_is_atomic_file(inode) ||
 	    is_inode_flag_set(inode, FI_ATOMIC_COMMITTED))
+37 -1
fs/f2fs/iostat.c
···
 {
 	struct super_block *sb = seq->private;
 	struct f2fs_sb_info *sbi = F2FS_SB(sb);
+	int i;

 	if (!sbi->iostat_enable)
 		return 0;
···
 	IOSTAT_INFO_SHOW("fs node", FS_NODE_READ_IO);
 	IOSTAT_INFO_SHOW("fs meta", FS_META_READ_IO);

+	/* print read folio order stats */
+	seq_printf(seq, "%-23s", "fs read folio order:");
+	for (i = 0; i < NR_PAGE_ORDERS; i++)
+		seq_printf(seq, " %llu", sbi->iostat_read_folio_count[i]);
+	seq_putc(seq, '\n');
+
 	/* print other IOs */
 	seq_puts(seq, "[OTHER]\n");
 	IOSTAT_INFO_SHOW("fs discard", FS_DISCARD_IO);
···
 static inline void f2fs_record_iostat(struct f2fs_sb_info *sbi)
 {
 	unsigned long long iostat_diff[NR_IO_TYPE];
+	unsigned long long read_folio_count_diff[NR_PAGE_ORDERS];
 	int i;
 	unsigned long flags;
···
 					sbi->prev_iostat_bytes[i];
 		sbi->prev_iostat_bytes[i] = sbi->iostat_bytes[i];
 	}
+
+	for (i = 0; i < NR_PAGE_ORDERS; i++) {
+		read_folio_count_diff[i] = sbi->iostat_read_folio_count[i] -
+					sbi->prev_iostat_read_folio_count[i];
+		sbi->prev_iostat_read_folio_count[i] = sbi->iostat_read_folio_count[i];
+	}
 	spin_unlock_irqrestore(&sbi->iostat_lock, flags);

-	trace_f2fs_iostat(sbi, iostat_diff);
+	trace_f2fs_iostat(sbi, iostat_diff, read_folio_count_diff);

 	__record_iostat_latency(sbi);
 }
···
 		sbi->iostat_bytes[i] = 0;
 		sbi->prev_iostat_bytes[i] = 0;
 	}
+	for (i = 0; i < NR_PAGE_ORDERS; i++) {
+		sbi->iostat_read_folio_count[i] = 0;
+		sbi->prev_iostat_read_folio_count[i] = 0;
+	}
 	spin_unlock_irq(&sbi->iostat_lock);

 	spin_lock_irq(&sbi->iostat_lat_lock);
···
 {
 	sbi->iostat_bytes[type] += io_bytes;
 	sbi->iostat_count[type]++;
+}
+
+void f2fs_update_read_folio_count(struct f2fs_sb_info *sbi, struct folio *folio)
+{
+	unsigned int order = folio_order(folio);
+	unsigned long flags;
+
+	if (!sbi->iostat_enable)
+		return;
+
+	if (order >= NR_PAGE_ORDERS)
+		order = NR_PAGE_ORDERS - 1;
+
+	spin_lock_irqsave(&sbi->iostat_lock, flags);
+	sbi->iostat_read_folio_count[order]++;
+	spin_unlock_irqrestore(&sbi->iostat_lock, flags);
+
+	f2fs_record_iostat(sbi);
 }

 void f2fs_update_iostat(struct f2fs_sb_info *sbi, struct inode *inode,
+4
fs/f2fs/iostat.h
···
 extern void f2fs_reset_iostat(struct f2fs_sb_info *sbi);
 extern void f2fs_update_iostat(struct f2fs_sb_info *sbi, struct inode *inode,
 			enum iostat_type type, unsigned long long io_bytes);
+extern void f2fs_update_read_folio_count(struct f2fs_sb_info *sbi,
+			struct folio *folio);

 struct bio_iostat_ctx {
 	struct f2fs_sb_info *sbi;
···
 #else
 static inline void f2fs_update_iostat(struct f2fs_sb_info *sbi, struct inode *inode,
 		enum iostat_type type, unsigned long long io_bytes) {}
+static inline void f2fs_update_read_folio_count(struct f2fs_sb_info *sbi,
+		struct folio *folio) {}
 static inline void iostat_update_and_unbind_ctx(struct bio *bio) {}
 static inline void iostat_alloc_and_bind_ctx(struct f2fs_sb_info *sbi,
 		struct bio *bio, struct bio_post_read_ctx *ctx) {}
+16
fs/f2fs/namei.c
···
 	if (set) {
 		if (total_count == F2FS_MAX_EXTENSION)
 			return -EINVAL;
+
+		if (hot) {
+			start = 0;
+			count = cold_count;
+		} else {
+			start = cold_count;
+			count = total_count;
+		}
+		for (i = start; i < count; i++) {
+			if (!strcmp(name, extlist[i])) {
+				f2fs_warn(sbi, "extension '%s' already exists in %s list",
+						name, hot ? "cold" : "hot");
+				return -EINVAL;
+			}
+		}
 	} else {
 		if (!hot && !cold_count)
 			return -EINVAL;
···
 			return err;

 		err = f2fs_create_whiteout(idmap, old_dir, &whiteout, &fname);
+		f2fs_free_filename(&fname);
 		if (err)
 			return err;
 	}
+60 -54
fs/f2fs/node.c
···
 				start, nr);
 }

-bool f2fs_in_warm_node_list(struct f2fs_sb_info *sbi, struct folio *folio)
+bool f2fs_in_warm_node_list(struct folio *folio)
 {
 	return is_node_folio(folio) && IS_DNODE(folio) && is_cold_node(folio);
 }
···
 	spin_unlock_irqrestore(&sbi->fsync_node_lock, flags);
 }

-int f2fs_need_dentry_mark(struct f2fs_sb_info *sbi, nid_t nid)
+bool f2fs_need_dentry_mark(struct f2fs_sb_info *sbi, nid_t nid)
 {
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	struct nat_entry *e;
···
 	struct f2fs_nm_info *nm_i = NM_I(sbi);
 	struct nat_entry *e;
 	bool need_update = true;
+	struct f2fs_lock_context lc;

+	f2fs_down_read_trace(&sbi->node_write, &lc);
 	f2fs_down_read(&nm_i->nat_tree_lock);
 	e = __lookup_nat_cache(nm_i, ino, false);
 	if (e && get_nat_flag(e, HAS_LAST_FSYNC) &&
···
 	    get_nat_flag(e, HAS_FSYNCED_INODE)))
 		need_update = false;
 	f2fs_up_read(&nm_i->nat_tree_lock);
+	f2fs_up_read_trace(&sbi->node_write, &lc);
 	return need_update;
 }
···
 }

 static int truncate_partial_nodes(struct dnode_of_data *dn,
-			struct f2fs_inode *ri, int *offset, int depth)
+			int *offset, int depth)
 {
 	struct folio *folios[2];
 	nid_t nid[3];
···
 	int err = 0, cont = 1;
 	int level, offset[4], noffset[4];
 	unsigned int nofs = 0;
-	struct f2fs_inode *ri;
 	struct dnode_of_data dn;
 	struct folio *folio;
···
 	set_new_dnode(&dn, inode, folio, NULL, 0);
 	folio_unlock(folio);

-	ri = F2FS_INODE(folio);
 	switch (level) {
 	case 0:
 	case 1:
···
 		nofs = noffset[1];
 		if (!offset[level - 1])
 			goto skip_partial;
-		err = truncate_partial_nodes(&dn, ri, offset, level);
+		err = truncate_partial_nodes(&dn, offset, level);
 		if (err < 0 && err != -ENOENT)
 			goto fail;
 		nofs += 1 + NIDS_PER_BLOCK;
···
 		nofs = 5 + 2 * NIDS_PER_BLOCK;
 		if (!offset[level - 1])
 			goto skip_partial;
-		err = truncate_partial_nodes(&dn, ri, offset, level);
+		err = truncate_partial_nodes(&dn, offset, level);
 		if (err < 0 && err != -ENOENT)
 			goto fail;
 		break;
···
 	return last_folio;
 }

-static bool __write_node_folio(struct folio *folio, bool atomic, bool *submitted,
-				struct writeback_control *wbc, bool do_balance,
-				enum iostat_type io_type, unsigned int *seq_id)
+static bool __write_node_folio(struct folio *folio, bool atomic, bool do_fsync,
+				bool *submitted, struct writeback_control *wbc,
+				bool do_balance, enum iostat_type io_type,
+				unsigned int *seq_id)
 {
 	struct f2fs_sb_info *sbi = F2FS_F_SB(folio);
 	nid_t nid;
···
 	if (f2fs_sanity_check_node_footer(sbi, folio, nid,
 				NODE_TYPE_REGULAR, false)) {
-		f2fs_handle_critical_error(sbi, STOP_CP_REASON_CORRUPTED_NID);
+		f2fs_stop_checkpoint(sbi, false, STOP_CP_REASON_CORRUPTED_NID);
 		goto redirty_out;
 	}
···
 		goto redirty_out;
 	}

-	if (atomic) {
-		if (!test_opt(sbi, NOBARRIER))
-			fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
-		if (IS_INODE(folio))
-			set_dentry_mark(folio,
+	if (atomic && !test_opt(sbi, NOBARRIER))
+		fio.op_flags |= REQ_PREFLUSH | REQ_FUA;
+
+	set_dentry_mark(folio, false);
+	set_fsync_mark(folio, do_fsync);
+	if (IS_INODE(folio) && (atomic || is_fsync_dnode(folio)))
+		set_dentry_mark(folio,
 				f2fs_need_dentry_mark(sbi, ino_of_node(folio)));
-	}

 	/* should add to global list before clearing PAGECACHE status */
-	if (f2fs_in_warm_node_list(sbi, folio)) {
+	if (f2fs_in_warm_node_list(folio)) {
 		seq = f2fs_add_fsync_node_entry(sbi, folio);
 		if (seq_id)
 			*seq_id = seq;
···
 	return false;
 }

-int f2fs_move_node_folio(struct folio *node_folio, int gc_type)
+int f2fs_write_single_node_folio(struct folio *node_folio, int sync_mode,
+		bool mark_dirty, enum iostat_type io_type)
 {
 	int err = 0;
+	struct writeback_control wbc = {
+		.sync_mode = WB_SYNC_ALL,
+		.nr_to_write = 1,
+	};

-	if (gc_type == FG_GC) {
-		struct writeback_control wbc = {
-			.sync_mode = WB_SYNC_ALL,
-			.nr_to_write = 1,
-		};
-
-		f2fs_folio_wait_writeback(node_folio, NODE, true, true);
-
-		folio_mark_dirty(node_folio);
-
-		if (!folio_clear_dirty_for_io(node_folio)) {
-			err = -EAGAIN;
-			goto out_page;
-		}
-
-		if (!__write_node_folio(node_folio, false, NULL,
-					&wbc, false, FS_GC_NODE_IO, NULL))
-			err = -EAGAIN;
-		goto release_page;
-	} else {
+	if (!sync_mode) {
 		/* set page dirty and write it */
 		if (!folio_test_writeback(node_folio))
 			folio_mark_dirty(node_folio);
+		goto out_folio;
 	}
-out_page:
+
+	f2fs_folio_wait_writeback(node_folio, NODE, true, true);
+
+	if (mark_dirty)
+		folio_mark_dirty(node_folio);
+	else if (!folio_test_dirty(node_folio))
+		goto out_folio;
+
+	if (!folio_clear_dirty_for_io(node_folio)) {
+		err = -EAGAIN;
+		goto out_folio;
+	}
+
+	if (!__write_node_folio(node_folio, false, false, NULL,
+				&wbc, false, FS_GC_NODE_IO, NULL))
+		err = -EAGAIN;
+	goto release_folio;
+out_folio:
 	folio_unlock(node_folio);
-release_page:
+release_folio:
 	f2fs_folio_put(node_folio, false);
 	return err;
+}
+
+int f2fs_move_node_folio(struct folio *node_folio, int gc_type)
+{
+	return f2fs_write_single_node_folio(node_folio, gc_type == FG_GC,
+			true, FS_GC_NODE_IO);
 }

 int f2fs_fsync_node_pages(struct f2fs_sb_info *sbi, struct inode *inode,
···
 	for (i = 0; i < nr_folios; i++) {
 		struct folio *folio = fbatch.folios[i];
 		bool submitted = false;
+		bool do_fsync = false;

 		if (unlikely(f2fs_cp_error(sbi))) {
 			f2fs_folio_put(last_folio, false);
···
 		f2fs_folio_wait_writeback(folio, NODE, true, true);

-		set_fsync_mark(folio, 0);
-		set_dentry_mark(folio, 0);
-
 		if (!atomic || folio == last_folio) {
-			set_fsync_mark(folio, 1);
+			do_fsync = true;
 			percpu_counter_inc(&sbi->rf_node_block_count);
 			if (IS_INODE(folio)) {
 				if (is_inode_flag_set(inode,
							FI_DIRTY_INODE))
 					f2fs_update_inode(inode, folio);
-				if (!atomic)
-					set_dentry_mark(folio,
-						f2fs_need_dentry_mark(sbi, ino));
 			}
 			/* may be written by other thread */
 			if (!folio_test_dirty(folio))
···
 		if (!__write_node_folio(folio, atomic &&
					folio == last_folio,
-					&submitted, wbc, true,
-					FS_NODE_IO, seq_id)) {
+					do_fsync, &submitted,
+					wbc, true, FS_NODE_IO,
+					seq_id)) {
 			f2fs_folio_put(last_folio, false);
 			folio_batch_release(&fbatch);
 			ret = -EIO;
···
 		if (!folio_clear_dirty_for_io(folio))
 			goto continue_unlock;

-		set_fsync_mark(folio, 0);
-		set_dentry_mark(folio, 0);
-
-		if (!__write_node_folio(folio, false, &submitted,
+		if (!__write_node_folio(folio, false, false, &submitted,
 				wbc, do_balance, io_type, NULL)) {
 			folio_batch_release(&fbatch);
 			ret = -EIO;
fs/f2fs/node.h (+11 -12)

···
 #define is_fsync_dnode(folio)	is_node(folio, FSYNC_BIT_SHIFT)
 #define is_dent_dnode(folio)	is_node(folio, DENT_BIT_SHIFT)
 
-static inline void set_cold_node(const struct folio *folio, bool is_dir)
+static inline void __set_mark(const struct folio *folio, bool mark, int type)
 {
 	struct f2fs_node *rn = F2FS_NODE(folio);
 	unsigned int flag = le32_to_cpu(rn->footer.flag);
 
-	if (is_dir)
-		flag &= ~BIT(COLD_BIT_SHIFT);
-	else
-		flag |= BIT(COLD_BIT_SHIFT);
-	rn->footer.flag = cpu_to_le32(flag);
-}
-
-static inline void set_mark(struct folio *folio, int mark, int type)
-{
-	struct f2fs_node *rn = F2FS_NODE(folio);
-	unsigned int flag = le32_to_cpu(rn->footer.flag);
 	if (mark)
 		flag |= BIT(type);
 	else
 		flag &= ~BIT(type);
 	rn->footer.flag = cpu_to_le32(flag);
+}
+
+static inline void set_cold_node(const struct folio *folio, bool is_dir)
+{
+	__set_mark(folio, !is_dir, COLD_BIT_SHIFT);
+}
+
+static inline void set_mark(struct folio *folio, bool mark, int type)
+{
+	__set_mark(folio, mark, type);
 
 #ifdef CONFIG_F2FS_CHECK_FS
 	f2fs_inode_chksum_set(F2FS_F_SB(folio), folio);
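The node.h refactor folds two copies of the read-modify-write of footer.flag into a single __set_mark() helper; set_cold_node() becomes a one-liner because "cold" is simply the inverse of "is a directory". A minimal userspace sketch of that pattern, using a plain uint32_t instead of the on-disk little-endian footer and hypothetical bit positions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(nr) (1u << (nr))

/* hypothetical bit positions, standing in for f2fs's *_BIT_SHIFT values */
enum { COLD_BIT = 0, FSYNC_BIT = 1, DENT_BIT = 2 };

/* one read-modify-write helper instead of duplicated if/else ladders */
static void __set_mark(uint32_t *flag, bool mark, int bit)
{
	if (mark)
		*flag |= BIT(bit);
	else
		*flag &= ~BIT(bit);
}

/* "cold" is just the inverse of "directory", as in the f2fs refactor */
static void set_cold_node(uint32_t *flag, bool is_dir)
{
	__set_mark(flag, !is_dir, COLD_BIT);
}
```

Besides removing duplication, routing every flag update through one helper also gives a single place to hook checksum updates or endianness conversion.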
fs/f2fs/segment.c (+15 -5)

···
 		if (dc->state != D_PREP)
 			goto next;
 
+		if (*issued > 0 && unlikely(freezing(current)))
+			break;
+
 		if (dpolicy->io_aware && !is_idle(sbi, DISCARD_TIME)) {
 			io_interrupted = true;
 			break;
···
 	struct blk_plug plug;
 	int i, issued;
 	bool io_interrupted = false;
+	bool suspended = false;
 
 	if (dpolicy->timeout)
 		f2fs_update_time(sbi, UMOUNT_DISCARD_TIMEOUT);
···
 		list_for_each_entry_safe(dc, tmp, pend_list, list) {
 			f2fs_bug_on(sbi, dc->state != D_PREP);
 
+			if (issued > 0 && unlikely(freezing(current))) {
+				suspended = true;
+				break;
+			}
+
 			if (dpolicy->timeout &&
					f2fs_time_over(sbi, UMOUNT_DISCARD_TIMEOUT))
				break;
···
 next:
 		mutex_unlock(&dcc->cmd_lock);
 
-		if (issued >= dpolicy->max_requests || io_interrupted)
+		if (issued >= dpolicy->max_requests || io_interrupted ||
+				suspended)
 			break;
 	}
···
 *
 * Return true if issued all discard cmd or no discard cmd need issue, otherwise return false.
 */
-bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi)
+bool f2fs_issue_discard_timeout(struct f2fs_sb_info *sbi, bool need_check)
 {
 	struct discard_cmd_control *dcc = SM_I(sbi)->dcc_info;
 	struct discard_policy dpolicy;
···
 	/* just to make sure there is no pending discard commands */
 	__wait_all_discard_cmd(sbi, NULL);
 
-	f2fs_bug_on(sbi, atomic_read(&dcc->discard_cmd_cnt));
+	f2fs_bug_on(sbi, need_check && atomic_read(&dcc->discard_cmd_cnt));
 	return !dropped;
 }
···
	 * Recovery can cache discard commands, so in error path of
	 * fill_super(), it needs to give a chance to handle them.
	 */
-	f2fs_issue_discard_timeout(sbi);
+	f2fs_issue_discard_timeout(sbi, true);
 
 	kfree(dcc);
 	SM_I(sbi)->dcc_info = NULL;
···
 	if (fscrypt_inode_uses_fs_layer_crypto(folio->mapping->host))
 		fscrypt_finalize_bounce_page(&fio->encrypted_page);
 	folio_end_writeback(folio);
-	if (f2fs_in_warm_node_list(fio->sbi, folio))
+	if (f2fs_in_warm_node_list(folio))
 		f2fs_del_fsync_node_entry(fio->sbi, folio);
 	f2fs_bug_on(fio->sbi, !is_set_ckpt_flags(fio->sbi,
 			CP_ERROR_FLAG));
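The segment.c hunks above make the discard-issuing loops bail out between commands when the thread is being frozen (freezing(current)), instead of finishing the whole batch first; the `issued > 0` guard still guarantees at least one command of forward progress per pass. A userspace analog of that "check a stop signal between units of work, but only after the first one" loop (all names hypothetical; `freezing()` here is a plain flag, not the kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* stand-in for the kernel's freezing(current) check */
static bool freeze_requested;

static bool freezing(void)
{
	return freeze_requested;
}

/*
 * Issue up to max_requests pending commands, but break out early if a
 * freeze is requested -- only after at least one command was issued,
 * so each pass still makes forward progress, as in the f2fs change.
 */
static int issue_pending(int pending, int max_requests, bool *suspended)
{
	int issued = 0;

	*suspended = false;
	while (pending-- > 0 && issued < max_requests) {
		if (issued > 0 && freezing()) {
			*suspended = true;
			break;
		}
		issued++;	/* submit one discard command */
	}
	return issued;
}
```

Checking only after the first unit avoids a livelock where a thread that is repeatedly woken and refrozen never issues anything at all.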
fs/f2fs/super.c (+54 -16)

···
 	fsparam_flag("usrquota", Opt_usrquota),
 	fsparam_flag("grpquota", Opt_grpquota),
 	fsparam_flag("prjquota", Opt_prjquota),
-	fsparam_string_empty("usrjquota", Opt_usrjquota),
-	fsparam_string_empty("grpjquota", Opt_grpjquota),
-	fsparam_string_empty("prjjquota", Opt_prjjquota),
+	fsparam_string("usrjquota", Opt_usrjquota),
+	fsparam_flag("usrjquota", Opt_usrjquota),
+	fsparam_string("grpjquota", Opt_grpjquota),
+	fsparam_flag("grpjquota", Opt_grpjquota),
+	fsparam_string("prjjquota", Opt_prjjquota),
+	fsparam_flag("prjjquota", Opt_prjjquota),
 	fsparam_flag("nat_bits", Opt_nat_bits),
 	fsparam_enum("jqfmt", Opt_jqfmt, f2fs_param_jqfmt),
 	fsparam_enum("alloc_mode", Opt_alloc, f2fs_param_alloc_mode),
···
 		ctx_set_opt(ctx, F2FS_MOUNT_PRJQUOTA);
 		break;
 	case Opt_usrjquota:
-		if (!*param->string)
-			ret = f2fs_unnote_qf_name(fc, USRQUOTA);
-		else
+		if (param->type == fs_value_is_string && *param->string)
 			ret = f2fs_note_qf_name(fc, USRQUOTA, param);
+		else
+			ret = f2fs_unnote_qf_name(fc, USRQUOTA);
 		if (ret)
 			return ret;
 		break;
 	case Opt_grpjquota:
-		if (!*param->string)
-			ret = f2fs_unnote_qf_name(fc, GRPQUOTA);
-		else
+		if (param->type == fs_value_is_string && *param->string)
 			ret = f2fs_note_qf_name(fc, GRPQUOTA, param);
+		else
+			ret = f2fs_unnote_qf_name(fc, GRPQUOTA);
 		if (ret)
 			return ret;
 		break;
 	case Opt_prjjquota:
-		if (!*param->string)
-			ret = f2fs_unnote_qf_name(fc, PRJQUOTA);
-		else
+		if (param->type == fs_value_is_string && *param->string)
 			ret = f2fs_note_qf_name(fc, PRJQUOTA, param);
+		else
+			ret = f2fs_unnote_qf_name(fc, PRJQUOTA);
 		if (ret)
 			return ret;
 		break;
···
 				F2FS_OPTION(sbi).root_reserved_blocks);
 		ctx_clear_opt(ctx, F2FS_MOUNT_RESERVE_ROOT);
 		ctx->opt_mask &= ~BIT(F2FS_MOUNT_RESERVE_ROOT);
+		ctx->spec_mask &= ~F2FS_SPEC_reserve_root;
 	}
 	if (test_opt(sbi, RESERVE_NODE) &&
 			(ctx->opt_mask & BIT(F2FS_MOUNT_RESERVE_NODE)) &&
···
 				F2FS_OPTION(sbi).root_reserved_nodes);
 		ctx_clear_opt(ctx, F2FS_MOUNT_RESERVE_NODE);
 		ctx->opt_mask &= ~BIT(F2FS_MOUNT_RESERVE_NODE);
+		ctx->spec_mask &= ~F2FS_SPEC_reserve_node;
 	}
 
 	err = f2fs_check_test_dummy_encryption(fc, sb);
···
 	}
 
 	/* be sure to wait for any on-going discard commands */
-	done = f2fs_issue_discard_timeout(sbi);
+	done = f2fs_issue_discard_timeout(sbi, true);
 	if (f2fs_realtime_discard_enable(sbi) && !sbi->discard_blks && done) {
 		struct cp_control cpc = {
 			.reason = CP_UMOUNT | CP_TRIMMED,
···
 #if IS_ENABLED(CONFIG_UNICODE)
 	utf8_unload(sb->s_encoding);
 #endif
+	sync_blockdev(sb->s_bdev);
+	invalidate_bdev(sb->s_bdev);
+	for (i = 1; i < sbi->s_ndevs; i++) {
+		sync_blockdev(FDEV(i).bdev);
+		invalidate_bdev(FDEV(i).bdev);
+	}
 }
 
 int f2fs_sync_fs(struct super_block *sb, int sync)
···
	 * will recover after removal of snapshot.
	 */
 	if (test_opt(sbi, DISCARD) && !f2fs_hw_support_discard(sbi))
-		f2fs_issue_discard_timeout(sbi);
+		f2fs_issue_discard_timeout(sbi, true);
 
 	clear_sbi_flag(F2FS_SB(sb), SBI_IS_FREEZING);
 	return 0;
···
 			need_stop_discard = true;
 		} else {
 			f2fs_stop_discard_thread(sbi);
-			f2fs_issue_discard_timeout(sbi);
+			/*
+			 * f2fs_ioc_fitrim() won't race w/ "remount ro"
+			 * so it's safe to check discard_cmd_cnt in
+			 * f2fs_issue_discard_timeout().
+			 */
+			f2fs_issue_discard_timeout(sbi, flags & SB_RDONLY);
 			need_restart_discard = true;
 		}
 	}
···
 		|| system_state == SYSTEM_RESTART;
 }
 
-void f2fs_handle_critical_error(struct f2fs_sb_info *sbi, unsigned char reason)
+static void f2fs_handle_critical_error(struct f2fs_sb_info *sbi,
+		unsigned char reason)
 {
 	struct super_block *sb = sbi->sb;
 	bool shutdown = reason == STOP_CP_REASON_SHUTDOWN;
···
	 * freeze_super() which will lead to deadlocks and other problems.
	 */
 }
+
+void f2fs_stop_checkpoint(struct f2fs_sb_info *sbi, bool end_io,
+		unsigned char reason)
+{
+	f2fs_build_fault_attr(sbi, 0, 0, FAULT_ALL);
+	if (!end_io)
+		f2fs_flush_merged_writes(sbi);
+	f2fs_handle_critical_error(sbi, reason);
+}
+
 
 static void f2fs_record_error_work(struct work_struct *work)
 {
···
 	init_f2fs_rwsem_trace(&sbi->gc_lock, sbi, LOCK_NAME_GC_LOCK);
 	mutex_init(&sbi->writepages);
 	init_f2fs_rwsem_trace(&sbi->cp_global_sem, sbi, LOCK_NAME_CP_GLOBAL);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	lockdep_register_key(&sbi->cp_global_sem_key);
+	lockdep_set_class(&sbi->cp_global_sem.internal_rwsem,
+			&sbi->cp_global_sem_key);
+#endif
 	init_f2fs_rwsem_trace(&sbi->node_write, sbi, LOCK_NAME_NODE_WRITE);
 	init_f2fs_rwsem_trace(&sbi->node_change, sbi, LOCK_NAME_NODE_CHANGE);
 	spin_lock_init(&sbi->stat_lock);
···
 free_sb_buf:
 	kfree(raw_super);
 free_sbi:
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+	lockdep_unregister_key(&sbi->cp_global_sem_key);
+#endif
 	kfree(sbi);
 	sb->s_fs_info = NULL;
···
 	/* Release block devices last, after fscrypt_destroy_keyring(). */
 	if (sbi) {
 		destroy_device_list(sbi);
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+		lockdep_unregister_key(&sbi->cp_global_sem_key);
+#endif
 		kfree(sbi);
 		sb->s_fs_info = NULL;
 	}
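In the super.c option-table change above, each journalled-quota option is now registered in both a string form and a flag form, and the handler dispatches on how the option arrived: `usrjquota=path` records the quota file, while bare `usrjquota` or `usrjquota=` clears it. A userspace sketch of a parser accepting both forms (names like `handle_usrjquota` and the `qf_name` state are hypothetical analogs, not the kernel API):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* toy option state: non-empty means "quota file set", as with usrjquota= */
static char qf_name[64];

/*
 * Accept both forms, mirroring the super.c change:
 *   "usrjquota=path"           -> remember the quota file
 *   "usrjquota" / "usrjquota=" -> clear it
 * `value` may be NULL (flag form) or empty (string form, empty value).
 */
static int handle_usrjquota(const char *value)
{
	if (value && *value) {
		if (strlen(value) >= sizeof(qf_name))
			return -1;		/* name too long */
		strcpy(qf_name, value);		/* note_qf_name analogue */
	} else {
		qf_name[0] = '\0';		/* unnote_qf_name analogue */
	}
	return 0;
}
```

Registering both spellings is what lets the flag form reach the handler at all; with only a string spec, a bare `usrjquota` would be rejected before the handler runs.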
fs/f2fs/sysfs.c (+21 -6)

···
 	f2fs_update_sit_info(sbi);
 	return sysfs_emit(buf, "%llu\n", (unsigned long long)(si->avg_vblocks));
 }
+
+static ssize_t defrag_blocks_show(struct f2fs_attr *a,
+		struct f2fs_sb_info *sbi, char *buf)
+{
+	struct f2fs_stat_info *si = F2FS_STAT(sbi);
+
+	return sysfs_emit(buf, "%llu\n", (unsigned long long)(si->defrag_blks));
+}
 #endif
 
 static ssize_t main_blkaddr_show(struct f2fs_attr *a,
···
 	if (!strcmp(a->attr.name, "extension_list")) {
 		__u8 (*extlist)[F2FS_EXTENSION_LEN] =
 			sbi->raw_super->extension_list;
-		int cold_count = le32_to_cpu(sbi->raw_super->extension_count);
-		int hot_count = sbi->raw_super->hot_ext_count;
+		int cold_count, hot_count;
 		int len = 0, i;
 
+		f2fs_down_read(&sbi->sb_lock);
+		cold_count = le32_to_cpu(sbi->raw_super->extension_count);
+		hot_count = sbi->raw_super->hot_ext_count;
 		len += sysfs_emit_at(buf, len, "cold file extension:\n");
 		for (i = 0; i < cold_count; i++)
 			len += sysfs_emit_at(buf, len, "%s\n", extlist[i]);
···
 		len += sysfs_emit_at(buf, len, "hot file extension:\n");
 		for (i = cold_count; i < cold_count + hot_count; i++)
 			len += sysfs_emit_at(buf, len, "%s\n", extlist[i]);
+		f2fs_up_read(&sbi->sb_lock);
 
 		return len;
 	}
···
 F2FS_GENERAL_RO_ATTR(moved_blocks_background);
 F2FS_GENERAL_RO_ATTR(moved_blocks_foreground);
 F2FS_GENERAL_RO_ATTR(avg_vblocks);
+F2FS_GENERAL_RO_ATTR(defrag_blocks);
 #endif
 
 #ifdef CONFIG_FS_ENCRYPTION
···
 	ATTR_LIST(moved_blocks_foreground),
 	ATTR_LIST(moved_blocks_background),
 	ATTR_LIST(avg_vblocks),
+	ATTR_LIST(defrag_blocks),
 #endif
 #ifdef CONFIG_BLK_DEV_ZONED
 	ATTR_LIST(unusable_blocks_per_sec),
···
 	ret = kobject_init_and_add(&f2fs_feat, &f2fs_feat_ktype,
				NULL, "features");
 	if (ret)
-		goto put_kobject;
+		goto unregister_kset;
 
 	ret = kobject_init_and_add(&f2fs_tune, &f2fs_tune_ktype,
				NULL, "tuning");
 	if (ret)
-		goto put_kobject;
+		goto put_feat;
 
 	f2fs_proc_root = proc_mkdir("fs/f2fs", NULL);
 	if (!f2fs_proc_root) {
 		ret = -ENOMEM;
-		goto put_kobject;
+		goto put_tune;
 	}
 
 	return 0;
 
-put_kobject:
+put_tune:
 	kobject_put(&f2fs_tune);
+put_feat:
 	kobject_put(&f2fs_feat);
+unregister_kset:
 	kset_unregister(&f2fs_kset);
 	return ret;
 }
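The f2fs_init_sysfs() fix above replaces a single catch-all `put_kobject` label with a staged ladder, so each failure point unwinds only the objects that were actually initialized (the old code could kobject_put() a kobject that was never set up). A minimal userspace sketch of that goto-ladder idiom (the `init_*`/`teardown_*` resources are hypothetical stand-ins):

```c
#include <assert.h>

static int a_up, b_up, c_up;

static int init_a(void) { a_up = 1; return 0; }
static int init_b(int fail) { if (fail) return -1; b_up = 1; return 0; }
static int init_c(int fail) { if (fail) return -1; c_up = 1; return 0; }

static void teardown_a(void) { a_up = 0; }
static void teardown_b(void) { b_up = 0; }

/*
 * Staged unwinding: each label undoes exactly the resources set up
 * before the failing step, and never one that was never initialized --
 * the same shape as the fixed f2fs_init_sysfs() error path.
 */
static int init_all(int fail_b, int fail_c)
{
	int ret;

	ret = init_a();
	if (ret)
		return ret;
	ret = init_b(fail_b);
	if (ret)
		goto undo_a;
	ret = init_c(fail_c);
	if (ret)
		goto undo_b;
	return 0;

undo_b:
	teardown_b();
undo_a:
	teardown_a();
	return ret;
}
```

Because the labels run in reverse setup order and control falls through from one to the next, every failure point jumps to exactly the cleanup it needs.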
include/linux/f2fs_fs.h (+3)

···
 	STOP_CP_REASON_NO_SEGMENT,
 	STOP_CP_REASON_CORRUPTED_FREE_BITMAP,
 	STOP_CP_REASON_CORRUPTED_NID,
+	STOP_CP_REASON_READ_META,
+	STOP_CP_REASON_READ_NODE,
+	STOP_CP_REASON_READ_DATA,
 	STOP_CP_REASON_MAX,
 };
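The new STOP_CP_REASON_READ_* codes are inserted just before the STOP_CP_REASON_MAX sentinel, the usual way to extend an externally visible enum without renumbering existing values. A small illustration of the sentinel pattern (the enum below uses hypothetical names and values, not the real f2fs codes):

```c
#include <assert.h>

/*
 * A sentinel-terminated reason enum (hypothetical): new codes are
 * appended just before REASON_MAX, so existing numeric values -- which
 * may already be recorded on disk or in logs -- never change, and the
 * sentinel automatically tracks the element count.
 */
enum stop_reason {
	REASON_SHUTDOWN,
	REASON_FAULT_INJECT,
	/* new entries go here, before the sentinel */
	REASON_READ_META,
	REASON_READ_NODE,
	REASON_READ_DATA,
	REASON_MAX,	/* sentinel: also sizes per-reason counters */
};

/* per-reason counters sized by the sentinel */
static unsigned int stop_count[REASON_MAX];
```

Any array or loop bounded by the sentinel picks up new reasons with no further edits.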
include/trace/events/f2fs.h (+17 -4)

···
 #ifdef CONFIG_F2FS_IOSTAT
 TRACE_EVENT(f2fs_iostat,
 
-	TP_PROTO(struct f2fs_sb_info *sbi, unsigned long long *iostat),
+	TP_PROTO(struct f2fs_sb_info *sbi, unsigned long long *iostat,
+		unsigned long long *read_folio_count),
 
-	TP_ARGS(sbi, iostat),
+	TP_ARGS(sbi, iostat, read_folio_count),
 
 	TP_STRUCT__entry(
 		__field(dev_t,	dev)
···
 		__field(unsigned long long,	fs_mrio)
 		__field(unsigned long long,	fs_discard)
 		__field(unsigned long long,	fs_reset_zone)
+		__array(unsigned long long,	read_folio_count, 11)
 	),
 
 	TP_fast_assign(
···
 		__entry->fs_mrio	= iostat[FS_META_READ_IO];
 		__entry->fs_discard	= iostat[FS_DISCARD_IO];
 		__entry->fs_reset_zone	= iostat[FS_ZONE_RESET_IO];
+		memset(__entry->read_folio_count, 0, sizeof(__entry->read_folio_count));
+		memcpy(__entry->read_folio_count, read_folio_count,
+			sizeof(unsigned long long) * min_t(int, NR_PAGE_ORDERS, 11));
 	),
 
 	TP_printk("dev = (%d,%d), "
···
 		"app [read=%llu (direct=%llu, buffered=%llu), mapped=%llu], "
 		"compr(buffered=%llu, mapped=%llu)], "
 		"fs [data=%llu, (gc_data=%llu, cdata=%llu), "
-		"node=%llu, meta=%llu]",
+		"node=%llu, meta=%llu], "
+		"read_folio_count [0=%llu, 1=%llu, 2=%llu, 3=%llu, 4=%llu, "
+		"5=%llu, 6=%llu, 7=%llu, 8=%llu, 9=%llu, 10=%llu]",
 		show_dev(__entry->dev), __entry->app_wio, __entry->app_dio,
 		__entry->app_bio, __entry->app_mio, __entry->app_bcdio,
 		__entry->app_mcdio, __entry->fs_dio, __entry->fs_cdio,
···
 		__entry->app_rio, __entry->app_drio, __entry->app_brio,
 		__entry->app_mrio, __entry->app_bcrio, __entry->app_mcrio,
 		__entry->fs_drio, __entry->fs_gdrio,
-		__entry->fs_cdrio, __entry->fs_nrio, __entry->fs_mrio)
+		__entry->fs_cdrio, __entry->fs_nrio, __entry->fs_mrio,
+		__entry->read_folio_count[0], __entry->read_folio_count[1],
+		__entry->read_folio_count[2], __entry->read_folio_count[3],
+		__entry->read_folio_count[4], __entry->read_folio_count[5],
+		__entry->read_folio_count[6], __entry->read_folio_count[7],
+		__entry->read_folio_count[8], __entry->read_folio_count[9],
+		__entry->read_folio_count[10])
 );
 
 #ifndef __F2FS_IOSTAT_LATENCY_TYPE
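The tracepoint above records into a fixed 11-slot array while the source is sized by NR_PAGE_ORDERS, which varies by kernel configuration; the assign step therefore zeroes the whole destination and copies only min(NR_PAGE_ORDERS, 11) elements, which is safe whichever side is smaller. A userspace sketch of that memset-plus-bounded-memcpy pattern (SRC_SLOTS stands in for NR_PAGE_ORDERS; the value 9 is an arbitrary example):

```c
#include <assert.h>
#include <string.h>

#define DEST_SLOTS 11

/* stand-in for NR_PAGE_ORDERS, which varies by kernel config */
#define SRC_SLOTS 9

static unsigned long long dest[DEST_SLOTS];

/*
 * Zero the whole fixed-size destination, then copy only as many
 * elements as both sides can hold -- the memset + bounded memcpy
 * pattern used in the f2fs_iostat TP_fast_assign step.
 */
static void copy_counts(const unsigned long long *src, int nr_src)
{
	int n = nr_src < DEST_SLOTS ? nr_src : DEST_SLOTS;

	memset(dest, 0, sizeof(dest));
	memcpy(dest, src, sizeof(unsigned long long) * n);
}
```

Zeroing first means a shorter source leaves deterministic zeros in the trailing slots rather than stale data, so the trace output stays well defined for every configuration.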