Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

xfs: do not tightly pack-write large files

When using a zoned realtime device, tightly packing data blocks
belonging to multiple closed files into the same realtime group (RTG)
is very effective at improving write performance. This is especially
true with SMR HDDs, as this can reduce, and even suppress, disk head
seeks.

However, such tight packing does not make sense for large files that
require at least a full RTG. If tight packing placement is applied to
such files, the VM writeback thread switching between inodes results in
the large files being fragmented, thus increasing the garbage collection
penalty later when the RTG needs to be reclaimed.

This problem can be avoided with a simple heuristic: if the size of the
inode being written back is at least equal to the RTG size, do not use
tight-packing. Modify xfs_zoned_pack_tight() to always return false in
this case.

With this change, a multi-writer workload writing files of 256 MB on a
file system backed by an SMR HDD with 256 MB zone size as a realtime
device sees all files occupying exactly one RTG (i.e. one device zone),
thus completely removing the heavy fragmentation observed without this
change.

Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Carlos Maiolino <cem@kernel.org>

Authored by Damien Le Moal and committed by Carlos Maiolino
b00bcb19 914f3770

+15 -4
fs/xfs/xfs_zone_alloc.c
@@ -614,14 +614,25 @@
 }
 
 /*
- * Try to pack inodes that are written back after they were closed tight instead
- * of trying to open new zones for them or spread them to the least recently
- * used zone. This optimizes the data layout for workloads that untar or copy
- * a lot of small files. Right now this does not separate multiple such
+ * Try to tightly pack small files that are written back after they were closed
+ * instead of trying to open new zones for them or spread them to the least
+ * recently used zone. This optimizes the data layout for workloads that untar
+ * or copy a lot of small files. Right now this does not separate multiple such
  * streams.
  */
 static inline bool xfs_zoned_pack_tight(struct xfs_inode *ip)
 {
+	struct xfs_mount	*mp = ip->i_mount;
+	size_t			zone_capacity =
+		XFS_FSB_TO_B(mp, mp->m_groups[XG_TYPE_RTG].blocks);
+
+	/*
+	 * Do not pack write files that are already using a full zone to avoid
+	 * fragmentation.
+	 */
+	if (i_size_read(VFS_I(ip)) >= zone_capacity)
+		return false;
+
 	return !inode_is_open_for_write(VFS_I(ip)) &&
 		!(ip->i_diflags & XFS_DIFLAG_APPEND);
 }
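
The decision made by xfs_zoned_pack_tight() after this change can be
modelled in user space as a rough sketch. This is not kernel code:
struct model_inode, model_pack_tight() and model_self_check() are
hypothetical names invented for illustration. The struct fields stand in
for i_size_read(), inode_is_open_for_write() and the XFS_DIFLAG_APPEND
flag, and the zone capacity is passed in as a plain byte count rather
than derived from the mount structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the inode state the heuristic looks at. */
struct model_inode {
	uint64_t size_bytes;     /* models i_size_read(VFS_I(ip)) */
	bool     open_for_write; /* models inode_is_open_for_write() */
	bool     append_only;    /* models ip->i_diflags & XFS_DIFLAG_APPEND */
};

/*
 * Return true if the file is a candidate for tight packing: smaller than
 * one zone (RTG), already closed for writing, and not append-only.
 */
static inline bool model_pack_tight(const struct model_inode *ino,
				    uint64_t zone_capacity_bytes)
{
	/* Files that fill at least one zone are never packed tightly. */
	if (ino->size_bytes >= zone_capacity_bytes)
		return false;

	return !ino->open_for_write && !ino->append_only;
}

/* Exercise the model with the 256 MB zone size from the commit message. */
static inline void model_self_check(void)
{
	const uint64_t zone = 256ULL << 20; /* 256 MB in bytes */
	struct model_inode small_closed = { 4096, false, false };
	struct model_inode small_open   = { 4096, true,  false };
	struct model_inode zone_sized   = { zone, false, false };

	assert(model_pack_tight(&small_closed, zone));  /* packed */
	assert(!model_pack_tight(&small_open, zone));   /* still being written */
	assert(!model_pack_tight(&zone_sized, zone));   /* needs a full zone */
}
```

Calling model_self_check() from a main() exercises the three cases. The
zone-sized case corresponds to the 256 MB files in the workload described
in the commit message: because the size check uses >=, a file exactly one
zone long is already excluded from tight packing.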