Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

vfs: avoid non-forwarding large load after small store in path lookup

The performance regression that Josef Bacik reported in the pathname
lookup (see commit 99d263d4c5b2 "vfs: fix bad hashing of dentries") made
me look at performance stability of the dcache code, just to verify that
the problem was actually fixed. That turned up a few other problems in
this area.

There are a few cases where we exit RCU lookup mode and fall back to the
slow serializing case when we shouldn't; Al has fixed those, and they'll
come in with the next VFS pull.

But my performance verification also shows that link_path_walk() turns
out to have a very unfortunate 32-bit store of the length and hash of
the name we look up, followed by a 64-bit read of the combined hash_len
field. That screws up the processor's store-to-load forwarding, causing
an unnecessary hiccup in this critical routine.

It's caused by the ugly calling convention for the "hash_name()"
function, and easily fixed by just making hash_name() fill in the whole
'struct qstr' rather than passing it a pointer to just the hash value.

With that, the profile for this function looks much smoother.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

2 files changed, 11 insertions(+), 9 deletions(-)
fs/namei.c (+10 -9)

···
 
 /*
  * Calculate the length and hash of the path component, and
- * return the length of the component;
+ * fill in the qstr. return the "len" as the result.
  */
-static inline unsigned long hash_name(const char *name, unsigned int *hashp)
+static inline unsigned long hash_name(const char *name, struct qstr *res)
 {
 	unsigned long a, b, adata, bdata, mask, hash, len;
 	const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
 
+	res->name = name;
 	hash = a = 0;
 	len = -sizeof(unsigned long);
 	do {
···
 	mask = create_zero_mask(adata | bdata);
 
 	hash += a & zero_bytemask(mask);
-	*hashp = fold_hash(hash);
+	len += find_zero(mask);
+	res->hash_len = hashlen_create(fold_hash(hash), len);
 
-	return len + find_zero(mask);
+	return len;
 }
 
 #else
···
  * We know there's a real path component here of at least
  * one character.
  */
-static inline unsigned long hash_name(const char *name, unsigned int *hashp)
+static inline long hash_name(const char *name, struct qstr *res)
 {
 	unsigned long hash = init_name_hash();
 	unsigned long len = 0, c;
 
+	res->name = name;
 	c = (unsigned char)*name;
 	do {
 		len++;
 		hash = partial_name_hash(c, hash);
 		c = (unsigned char)name[len];
 	} while (c && c != '/');
-	*hashp = end_name_hash(hash);
+	res->hash_len = hashlen_create(end_name_hash(hash), len);
 	return len;
 }
···
 		if (err)
 			break;
 
-		len = hash_name(name, &this.hash);
-		this.name = name;
-		this.len = len;
+		len = hash_name(name, &this);
 
 		type = LAST_NORM;
 		if (name[0] == '.') switch (len) {
···
include/linux/dcache.h (+1)

···
 #define QSTR_INIT(n,l) { { { .len = l } }, .name = n }
 #define hashlen_hash(hashlen) ((u32) (hashlen))
 #define hashlen_len(hashlen) ((u32)((hashlen) >> 32))
+#define hashlen_create(hash,len) (((u64)(len)<<32)|(u32)(hash))
 
 struct dentry_stat_t {
 	long nr_dentry;
···