Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

Merge tag 'ovl-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull overlayfs fixes from Amir Goldstein:

- Fix regression in 'xino' feature detection

I clumsily introduced this regression myself when working on another
subsystem (fsnotify). Both the regression and the fix have almost no
visible impact on users except for some kmsg prints.

- Fix to performance regression in v6.12.

This regression was reported by Google COS developers.

It is not uncommon these days for the year-old mature LTS to get
adopted by distros and get exposed to many new workloads. We made a
sub-smart move of making a behavior change in v6.12 which could
impact performance, without making it opt-in. Fixing this mistake
retroactively, to be picked by LTS.

* tag 'ovl-fixes-7.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: make fsync after metadata copy-up opt-in mount option
  ovl: fix wrong detection of 32bit inode numbers

+108 -16
+50
Documentation/filesystems/overlayfs.rst
···
 mounted with "uuid=on".


+Durability and copy up
+----------------------
+
+The fsync(2) system call ensures that the data and metadata of a file
+are safely written to the backing storage, which is expected to
+guarantee the existence of the information post system crash.
+
+Without an fsync(2) call, there is no guarantee that the observed
+data after a system crash will be either the old or the new data, but
+in practice, the observed data after crash is often the old or new data
+or a mix of both.
+
+When an overlayfs file is modified for the first time, copy up will
+create a copy of the lower file and its parent directories in the upper
+layer. Since the Linux filesystem API does not enforce any particular
+ordering on storing changes without explicit fsync(2) calls, in case
+of a system crash, the upper file could end up with no data at all
+(i.e. zeros), which would be an unusual outcome. To avoid this
+experience, overlayfs calls fsync(2) on the upper file before completing
+data copy up with rename(2) or link(2) to make the copy up "atomic".
+
+By default, overlayfs does not explicitly call fsync(2) on copied up
+directories or on metadata-only copy up, so it provides no guarantee to
+persist the user's modification unless the user calls fsync(2).
+The fsync during copy up only guarantees that if a copy up is observed
+after a crash, the observed data is not zeroes or intermediate values
+from the copy up staging area.
+
+On traditional local filesystems with a single journal (e.g. ext4, xfs),
+fsync on a file also persists the parent directory changes, because they
+are usually modified in the same transaction, so metadata durability during
+data copy up effectively comes for free. Overlayfs further limits risk by
+disallowing network filesystems as upper layer.
+
+Overlayfs can be tuned to prefer performance or durability when storing
+to the underlying upper layer. This is controlled by the "fsync" mount
+option, which supports these values:
+
+- "auto": (default)
+    Call fsync(2) on upper file before completion of data copy up.
+    No explicit fsync(2) on directory or metadata-only copy up.
+- "strict":
+    Call fsync(2) on upper file and directories before completion of any
+    copy up.
+- "volatile": [*]
+    Prefer performance over durability (see `Volatile mount`_)
+
+[*] The mount option "volatile" is an alias to "fsync=volatile".
+
+
 Volatile mount
 --------------

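The "fsync, then rename(2)" ordering that the new documentation describes for data copy up is the same pattern userspace programs use to replace a file safely. The following is a minimal userspace sketch of that pattern, not overlayfs code: the helper name `atomic_replace` and its parameters are hypothetical, and it deliberately skips syncing the parent directory, which is roughly the gap that "fsync=strict" is documented to close for copy up.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): stage `data` in `tmp_path`,
 * fsync it, then rename it over `path`. Returns 0 on success, -1 on error.
 *
 * The fsync before rename means that if the new name is observed after a
 * crash, it should refer to fully written data rather than an empty or
 * partially written file. The rename itself is not made durable here,
 * because the parent directory is never synced.
 */
int atomic_replace(const char *path, const char *tmp_path,
                   const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Persist the staged data before it becomes visible under `path`. */
    if (fsync(fd) < 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) < 0) {
        unlink(tmp_path);
        return -1;
    }

    /* rename(2) switches the name to the staged file in one step. */
    if (rename(tmp_path, path) < 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

A caller that also needs the rename to survive a crash would open the parent directory and fsync it after the rename, at the cost of an extra flush per update.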
+3 -3
fs/overlayfs/copy_up.c
···
 		return -EOVERFLOW;

 	/*
-	 * With metacopy disabled, we fsync after final metadata copyup, for
+	 * With "fsync=strict", we fsync after final metadata copyup, for
 	 * both regular files and directories to get atomic copyup semantics
 	 * on filesystems that do not use strict metadata ordering (e.g. ubifs).
 	 *
-	 * With metacopy enabled we want to avoid fsync on all meta copyup
+	 * By default, we want to avoid fsync on all meta copyup, because
 	 * that will hurt performance of workloads such as chown -R, so we
 	 * only fsync on data copyup as legacy behavior.
 	 */
-	ctx.metadata_fsync = !OVL_FS(dentry->d_sb)->config.metacopy &&
+	ctx.metadata_fsync = ovl_should_sync_metadata(OVL_FS(dentry->d_sb)) &&
 			     (S_ISREG(ctx.stat.mode) || S_ISDIR(ctx.stat.mode));
 	ctx.metacopy = ovl_need_meta_copy_up(dentry, ctx.stat.mode, flags);

+21
fs/overlayfs/overlayfs.h
···
 	OVL_VERITY_REQUIRE,
 };

+enum {
+	OVL_FSYNC_VOLATILE,
+	OVL_FSYNC_AUTO,
+	OVL_FSYNC_STRICT,
+};
+
 /*
  * The tuple (fh,uuid) is a universal unique identifier for a copy up origin,
  * where:
···
 static inline bool ovl_xino_warn(struct ovl_fs *ofs)
 {
 	return ofs->config.xino == OVL_XINO_ON;
+}
+
+static inline bool ovl_should_sync(struct ovl_fs *ofs)
+{
+	return ofs->config.fsync_mode != OVL_FSYNC_VOLATILE;
+}
+
+static inline bool ovl_should_sync_metadata(struct ovl_fs *ofs)
+{
+	return ofs->config.fsync_mode == OVL_FSYNC_STRICT;
+}
+
+static inline bool ovl_is_volatile(struct ovl_config *config)
+{
+	return config->fsync_mode == OVL_FSYNC_VOLATILE;
 }

 /*
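To see how the three modes map onto the two sync decisions, the helpers can be modelled outside the kernel. This is a standalone sketch, assuming only what the hunks show: `struct demo_config`, `enum demo_ftype`, and the `demo_*` functions are hypothetical stand-ins for `struct ovl_config`, the `S_ISREG()`/`S_ISDIR()` checks, and the `ovl_*` helpers.

```c
#include <stdbool.h>

/* Same names and order as the enum added to fs/overlayfs/overlayfs.h. */
enum {
    OVL_FSYNC_VOLATILE,
    OVL_FSYNC_AUTO,
    OVL_FSYNC_STRICT,
};

/* Hypothetical stand-in for the fsync_mode field of struct ovl_config. */
struct demo_config {
    int fsync_mode;
};

/* Hypothetical stand-in for the file type tests on ctx.stat.mode. */
enum demo_ftype { DEMO_REG, DEMO_DIR, DEMO_SYMLINK };

/* Models ovl_should_sync(): every mode except "volatile" syncs. */
static bool demo_should_sync(const struct demo_config *c)
{
    return c->fsync_mode != OVL_FSYNC_VOLATILE;
}

/* Models ovl_should_sync_metadata(): only "strict" opts in. */
static bool demo_should_sync_metadata(const struct demo_config *c)
{
    return c->fsync_mode == OVL_FSYNC_STRICT;
}

/* Models the ctx.metadata_fsync assignment in copy_up.c. */
static bool demo_metadata_fsync(const struct demo_config *c,
                                enum demo_ftype type)
{
    return demo_should_sync_metadata(c) &&
           (type == DEMO_REG || type == DEMO_DIR);
}
```

Under this model, "auto" keeps the data copy up sync but never sets the metadata flag, "strict" sets it for regular files and directories only, and "volatile" disables both.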
+1 -6
fs/overlayfs/ovl_entry.h
···
 	int xino;
 	bool metacopy;
 	bool userxattr;
-	bool ovl_volatile;
+	int fsync_mode;
 };

 struct ovl_sb {
···
 	WARN_ON_ONCE(sb->s_type != &ovl_fs_type);

 	return (struct ovl_fs *)sb->s_fs_info;
-}
-
-static inline bool ovl_should_sync(struct ovl_fs *ofs)
-{
-	return !ofs->config.ovl_volatile;
 }

 static inline unsigned int ovl_numlower(struct ovl_entry *oe)
+28 -5
fs/overlayfs/params.c
···
 	Opt_xino,
 	Opt_metacopy,
 	Opt_verity,
+	Opt_fsync,
 	Opt_volatile,
 	Opt_override_creds,
 };
···
 	return OVL_VERITY_OFF;
 }

+static const struct constant_table ovl_parameter_fsync[] = {
+	{ "volatile", OVL_FSYNC_VOLATILE },
+	{ "auto", OVL_FSYNC_AUTO },
+	{ "strict", OVL_FSYNC_STRICT },
+	{}
+};
+
+static const char *ovl_fsync_mode(struct ovl_config *config)
+{
+	return ovl_parameter_fsync[config->fsync_mode].name;
+}
+
+static int ovl_fsync_mode_def(void)
+{
+	return OVL_FSYNC_AUTO;
+}
+
 const struct fs_parameter_spec ovl_parameter_spec[] = {
 	fsparam_string_empty("lowerdir", Opt_lowerdir),
 	fsparam_file_or_string("lowerdir+", Opt_lowerdir_add),
···
 	fsparam_enum("xino", Opt_xino, ovl_parameter_xino),
 	fsparam_enum("metacopy", Opt_metacopy, ovl_parameter_bool),
 	fsparam_enum("verity", Opt_verity, ovl_parameter_verity),
+	fsparam_enum("fsync", Opt_fsync, ovl_parameter_fsync),
 	fsparam_flag("volatile", Opt_volatile),
 	fsparam_flag_no("override_creds", Opt_override_creds),
 	{}
···
 	case Opt_verity:
 		config->verity_mode = result.uint_32;
 		break;
+	case Opt_fsync:
+		config->fsync_mode = result.uint_32;
+		break;
 	case Opt_volatile:
-		config->ovl_volatile = true;
+		config->fsync_mode = OVL_FSYNC_VOLATILE;
 		break;
 	case Opt_userxattr:
 		config->userxattr = true;
···
 	ofs->config.nfs_export = ovl_nfs_export_def;
 	ofs->config.xino = ovl_xino_def();
 	ofs->config.metacopy = ovl_metacopy_def;
+	ofs->config.fsync_mode = ovl_fsync_mode_def();

 	fc->s_fs_info = ofs;
 	fc->fs_private = ctx;
···
 		config->index = false;
 	}

-	if (!config->upperdir && config->ovl_volatile) {
+	if (!config->upperdir && ovl_is_volatile(config)) {
 		pr_info("option \"volatile\" is meaningless in a non-upper mount, ignoring it.\n");
-		config->ovl_volatile = false;
+		config->fsync_mode = ovl_fsync_mode_def();
 	}

 	if (!config->upperdir && config->uuid == OVL_UUID_ON) {
···
 		seq_printf(m, ",xino=%s", ovl_xino_mode(&ofs->config));
 	if (ofs->config.metacopy != ovl_metacopy_def)
 		seq_printf(m, ",metacopy=%s", str_on_off(ofs->config.metacopy));
-	if (ofs->config.ovl_volatile)
-		seq_puts(m, ",volatile");
+	if (ofs->config.fsync_mode != ovl_fsync_mode_def())
+		seq_printf(m, ",fsync=%s", ovl_fsync_mode(&ofs->config));
 	if (ofs->config.userxattr)
 		seq_puts(m, ",userxattr");
 	if (ofs->config.verity_mode != ovl_verity_mode_def())
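The option table in this file is used in both directions: to parse "fsync=<name>" and, through `ovl_fsync_mode()`, to print the name back by indexing the table with the stored mode. The standalone model below sketches that round trip. `struct demo_constant` and the `demo_*` functions are hypothetical simplifications, not the kernel's `struct constant_table` or `fsparam_enum()` machinery.

```c
#include <stddef.h>
#include <string.h>

enum {
    OVL_FSYNC_VOLATILE,
    OVL_FSYNC_AUTO,
    OVL_FSYNC_STRICT,
};

/* Simplified, hypothetical stand-in for struct constant_table. */
struct demo_constant {
    const char *name;
    int value;
};

/*
 * Entry i must hold value i, because demo_fsync_name() indexes the table
 * by mode, as ovl_fsync_mode() does in the patch.
 */
static const struct demo_constant demo_fsync_table[] = {
    { "volatile", OVL_FSYNC_VOLATILE },
    { "auto", OVL_FSYNC_AUTO },
    { "strict", OVL_FSYNC_STRICT },
    { NULL, 0 },
};

/* Map an "fsync=<name>" value to its mode, or -1 if the name is unknown. */
static int demo_fsync_parse(const char *name)
{
    for (const struct demo_constant *c = demo_fsync_table; c->name; c++)
        if (strcmp(c->name, name) == 0)
            return c->value;
    return -1;
}

/* Reverse mapping used when printing mount options. */
static const char *demo_fsync_name(int mode)
{
    return demo_fsync_table[mode].name;
}

/* Name to print after ",fsync=", or NULL when the default is in effect. */
static const char *demo_show_fsync(int mode)
{
    return mode != OVL_FSYNC_AUTO ? demo_fsync_name(mode) : NULL;
}
```

One consequence visible in the last hunk above: because the old `",volatile"` output is replaced by the generic `",fsync=%s"` print, a mount created with the `volatile` flag would, as far as this diff shows, be reported as `fsync=volatile`.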
+1 -1
fs/overlayfs/super.c
···
 	 * For volatile mount, create a incompat/volatile/dirty file to keep
 	 * track of it.
 	 */
-	if (ofs->config.ovl_volatile) {
+	if (ovl_is_volatile(&ofs->config)) {
 		err = ovl_create_volatile_dirty(ofs);
 		if (err < 0) {
 			pr_err("Failed to create volatile/dirty file.\n");
+4 -1
fs/overlayfs/util.c
···
 	if (!exportfs_can_decode_fh(sb->s_export_op))
 		return 0;

-	return sb->s_export_op->encode_fh ? -1 : FILEID_INO32_GEN;
+	if (sb->s_export_op->encode_fh == generic_encode_ino32_fh)
+		return FILEID_INO32_GEN;
+
+	return -1;
 }

 struct dentry *ovl_indexdir(struct super_block *sb)
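The behavioural difference in this hunk is easiest to see side by side. The sketch below models only the final return logic, under stated assumptions: the types, the two encoder functions, and `DEMO_FILEID_INO32_GEN` are hypothetical placeholders, and the real `encode_fh` signature is different. It presumes the regression concerns filesystems that set `encode_fh` to the generic 32-bit encoder explicitly, which is an inference from the diff and not something the commit message states.

```c
#include <stddef.h>

/* Placeholder for the kernel's FILEID_INO32_GEN constant. */
#define DEMO_FILEID_INO32_GEN 1

/* Hypothetical, simplified encoder type; not the kernel signature. */
typedef int (*demo_encode_fh_t)(void);

struct demo_export_ops {
    demo_encode_fh_t encode_fh;
};

/* Stand-in for generic_encode_ino32_fh. */
static int demo_generic_encode_ino32_fh(void)
{
    return DEMO_FILEID_INO32_GEN;
}

/* Stand-in for a filesystem-specific encoder. */
static int demo_custom_encode_fh(void)
{
    return 0x81;
}

/* Old logic: any non-NULL encode_fh yields -1. */
static int demo_fh_type_old(const struct demo_export_ops *ops)
{
    return ops->encode_fh ? -1 : DEMO_FILEID_INO32_GEN;
}

/* New logic: only the generic 32-bit encoder yields FILEID_INO32_GEN. */
static int demo_fh_type_new(const struct demo_export_ops *ops)
{
    if (ops->encode_fh == demo_generic_encode_ino32_fh)
        return DEMO_FILEID_INO32_GEN;
    return -1;
}
```

In this model the two versions disagree in exactly two cases: an explicit generic encoder (old `-1`, new `FILEID_INO32_GEN`) and a NULL `encode_fh` (old `FILEID_INO32_GEN`, new `-1`). A custom encoder yields `-1` in both.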