Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

vfs: elide smp_mb in iversion handling in the common case

According to bpftrace on these routines, most calls result in a cmpxchg,
which already provides the same ordering guarantee as smp_mb.

In inode_maybe_inc_iversion the elision is possible because even if a
stale value was read due to the now-missing smp_mb fence, the issue
corrects itself after the cmpxchg. If it appears that cmpxchg won't be
issued, the fence and reload are still there, restoring the previous
behavior.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://lore.kernel.org/r/20240815083310.3865-1-mjguzik@gmail.com
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>

Authored by Mateusz Guzik and committed by Christian Brauner
b381fbbc 433f9d76

+18 -10
fs/libfs.c
···
 	 * information, but the legacy inode_inc_iversion code used a spinlock
 	 * to serialize increments.
 	 *
-	 * Here, we add full memory barriers to ensure that any de-facto
-	 * ordering with other info is preserved.
+	 * We add a full memory barrier to ensure that any de facto ordering
+	 * with other state is preserved (either implicitly coming from cmpxchg
+	 * or explicitly from smp_mb if we don't know upfront if we will execute
+	 * the former).
 	 *
-	 * This barrier pairs with the barrier in inode_query_iversion()
+	 * These barriers pair with inode_query_iversion().
 	 */
-	smp_mb();
 	cur = inode_peek_iversion_raw(inode);
+	if (!force && !(cur & I_VERSION_QUERIED)) {
+		smp_mb();
+		cur = inode_peek_iversion_raw(inode);
+	}
+
 	do {
 		/* If flag is clear then we needn't do anything */
 		if (!force && !(cur & I_VERSION_QUERIED))
···
 u64 inode_query_iversion(struct inode *inode)
 {
 	u64 cur, new;
+	bool fenced = false;
 
+	/*
+	 * Memory barriers (implicit in cmpxchg, explicit in smp_mb) pair with
+	 * inode_maybe_inc_iversion(), see that routine for more details.
+	 */
 	cur = inode_peek_iversion_raw(inode);
 	do {
 		/* If flag is already set, then no need to swap */
 		if (cur & I_VERSION_QUERIED) {
-			/*
-			 * This barrier (and the implicit barrier in the
-			 * cmpxchg below) pairs with the barrier in
-			 * inode_maybe_inc_iversion().
-			 */
-			smp_mb();
+			if (!fenced)
+				smp_mb();
 			break;
 		}
 
+		fenced = true;
 		new = cur | I_VERSION_QUERIED;
 	} while (!atomic64_try_cmpxchg(&inode->i_version, &cur, new));
 	return cur >> I_VERSION_QUERIED_SHIFT;
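
The branch structure of the two patched routines can be exercised outside the kernel. What follows is a minimal userspace sketch, assuming C11 atomics as stand-ins for the kernel primitives (atomic_thread_fence for smp_mb, atomic_compare_exchange_strong for atomic64_try_cmpxchg). It is not the kernel code: struct model_inode, peek_raw(), the model_* functions, model_selfcheck(), and the flag layout (bit 0 as the QUERIED flag) are hypothetical and exist only for this illustration. C11 seq_cst operations also do not map one-to-one onto the kernel's ordering rules, so the sketch shows when the explicit fence is taken or skipped and does not prove the ordering argument.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag layout for the model: bit 0 is "queried", counter above it. */
#define I_VERSION_QUERIED_SHIFT 1
#define I_VERSION_QUERIED       (1ULL << (I_VERSION_QUERIED_SHIFT - 1))
#define I_VERSION_INCREMENT     (1ULL << I_VERSION_QUERIED_SHIFT)

/* Hypothetical stand-in for struct inode; only the version word is modeled. */
struct model_inode {
	_Atomic uint64_t i_version;
};

/* Plain read of the raw word, standing in for inode_peek_iversion_raw(). */
static uint64_t peek_raw(struct model_inode *inode)
{
	return atomic_load_explicit(&inode->i_version, memory_order_relaxed);
}

static bool model_maybe_inc_iversion(struct model_inode *inode, bool force)
{
	uint64_t cur, new;

	cur = peek_raw(inode);
	if (!force && !(cur & I_VERSION_QUERIED)) {
		/* Looks like no cmpxchg will be issued: fence and reload. */
		atomic_thread_fence(memory_order_seq_cst);
		cur = peek_raw(inode);
	}

	do {
		/* Flag clear and not forced: nothing to do. */
		if (!force && !(cur & I_VERSION_QUERIED))
			return false;
		/* Clear the flag and bump the counter. */
		new = (cur & ~I_VERSION_QUERIED) + I_VERSION_INCREMENT;
	} while (!atomic_compare_exchange_strong(&inode->i_version, &cur, new));
	return true;
}

static uint64_t model_query_iversion(struct model_inode *inode)
{
	uint64_t cur, new;
	bool fenced = false;

	cur = peek_raw(inode);
	do {
		if (cur & I_VERSION_QUERIED) {
			/* Fence explicitly only if no cmpxchg was attempted. */
			if (!fenced)
				atomic_thread_fence(memory_order_seq_cst);
			break;
		}
		fenced = true;
		new = cur | I_VERSION_QUERIED;
	} while (!atomic_compare_exchange_strong(&inode->i_version, &cur, new));
	return cur >> I_VERSION_QUERIED_SHIFT;
}

/* Single-threaded walk through the slow and fast paths; call from main(). */
static void model_selfcheck(void)
{
	struct model_inode inode = { .i_version = 0 };

	assert(!model_maybe_inc_iversion(&inode, false)); /* unqueried: skipped */
	assert(model_query_iversion(&inode) == 0);        /* sets the flag */
	assert(model_maybe_inc_iversion(&inode, false));  /* queried: cmpxchg path */
	assert(model_query_iversion(&inode) == 1);        /* flag was cleared, set again */
	assert(model_query_iversion(&inode) == 1);        /* flag set: explicit fence path */
}
```

In this model the explicit fence runs only on the two paths where no compare-and-swap is attempted: a non-forced increment that finds the flag clear, and a query that finds the flag already set on its first read.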