Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

blk-wbt: Fix detection of dirty-throttled tasks

The detection of dirty-throttled tasks in blk-wbt has been subtly broken
since its beginning in 2016. Namely if we are doing cgroup writeback and
the throttled task is not in the root cgroup, balance_dirty_pages() will
set dirty_sleep for the non-root bdi_writeback structure. However
blk-wbt checks dirty_sleep only in the root cgroup bdi_writeback
structure. Thus detection of recently throttled tasks is not working in
this case (we noticed this when we switched to cgroup v2 and suddenly
writeback was slow).

Since blk-wbt has no easy way to get to proper bdi_writeback and
furthermore its intention has always been to work on the whole device
rather than on individual cgroups, just move the dirty_sleep timestamp
from bdi_writeback to backing_dev_info. That fixes the checking for
recently throttled tasks and saves memory for everybody as a bonus.

CC: stable@vger.kernel.org
Fixes: b57d74aff9ab ("writeback: track if we're sleeping on progress in balance_dirty_pages()")
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20240123175826.21452-1-jack@suse.cz
[axboe: fixup indentation errors]
Signed-off-by: Jens Axboe <axboe@kernel.dk>

Authored by Jan Kara and committed by Jens Axboe
f814bdda f3c89983
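
The scenario in the commit message can be sketched as a small userspace model. This is only an illustration, not kernel code: the struct and function names (wb_model, bdi_model, recent_wait_old, recent_wait_new, throttle_cgroup_task, demo) and the HZ value are assumptions made for the example. Only the field names dirty_sleep and last_bdp_sleep and the one-second (HZ) window come from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ 100UL	/* assumed tick rate for this model only */

/* Wraparound-safe "a is earlier than b", in the spirit of time_before(). */
static bool time_before_model(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

struct wb_model {
	unsigned long dirty_sleep;	/* old location: one stamp per wb */
};

struct bdi_model {
	struct wb_model root_wb;	/* the wb the old check consulted */
	struct wb_model cgroup_wb;	/* the wb a non-root task stamped */
	unsigned long last_bdp_sleep;	/* new location: one stamp per device */
};

/* Old check: only the root wb's stamp is consulted. */
static bool recent_wait_old(const struct bdi_model *bdi, unsigned long now)
{
	return time_before_model(now, bdi->root_wb.dirty_sleep + HZ);
}

/* New check: the device-wide stamp is consulted. */
static bool recent_wait_new(const struct bdi_model *bdi, unsigned long now)
{
	return time_before_model(now, bdi->last_bdp_sleep + HZ);
}

/* A task in a non-root cgroup is dirty-throttled at time `now`. */
static void throttle_cgroup_task(struct bdi_model *bdi, unsigned long now)
{
	bdi->cgroup_wb.dirty_sleep = now;	/* what the old code recorded */
	bdi->last_bdp_sleep = now;		/* what the patched code records */
}

/* Walks through the scenario from the commit message. */
static void demo(void)
{
	struct bdi_model bdi = { { 0 }, { 0 }, 0 };
	unsigned long now = 10 * HZ;

	throttle_cgroup_task(&bdi, now);
	assert(!recent_wait_old(&bdi, now + 1));	/* throttling is missed */
	assert(recent_wait_new(&bdi, now + 1));		/* throttling is seen */
	assert(!recent_wait_new(&bdi, now + HZ));	/* stamp has expired */
}
```

Calling demo() from a main() runs the scenario. The old check misses the throttled task because the timestamp was written to a different structure than the one being read, while the device-wide timestamp is seen regardless of which cgroup the task belongs to.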

+9 -6
block/blk-wbt.c (+2 -2)
···
  */
 static bool wb_recent_wait(struct rq_wb *rwb)
 {
-	struct bdi_writeback *wb = &rwb->rqos.disk->bdi->wb;
+	struct backing_dev_info *bdi = rwb->rqos.disk->bdi;
 
-	return time_before(jiffies, wb->dirty_sleep + HZ);
+	return time_before(jiffies, bdi->last_bdp_sleep + HZ);
 }
 
 static inline struct rq_wait *get_rq_wait(struct rq_wb *rwb,

include/linux/backing-dev-defs.h (+5 -2)
···
 	struct delayed_work dwork;	/* work item used for writeback */
 	struct delayed_work bw_dwork;	/* work item used for bandwidth estimate */
 
-	unsigned long dirty_sleep;	/* last wait */
-
 	struct list_head bdi_node;	/* anchored at bdi->wb_list */
 
 #ifdef CONFIG_CGROUP_WRITEBACK
···
 	 * any dirty wbs, which is depended upon by bdi_has_dirty().
 	 */
 	atomic_long_t tot_write_bandwidth;
+	/*
+	 * Jiffies when last process was dirty throttled on this bdi. Used by
+	 * blk-wbt.
+	 */
+	unsigned long last_bdp_sleep;
 
 	struct bdi_writeback wb;  /* the root writeback info for this bdi */
 	struct list_head wb_list; /* list of all wbs */

mm/backing-dev.c (+1 -1)
···
 	INIT_LIST_HEAD(&wb->work_list);
 	INIT_DELAYED_WORK(&wb->dwork, wb_workfn);
 	INIT_DELAYED_WORK(&wb->bw_dwork, wb_update_bandwidth_workfn);
-	wb->dirty_sleep = jiffies;
 
 	err = fprop_local_init_percpu(&wb->completions, gfp);
 	if (err)
···
 	INIT_LIST_HEAD(&bdi->bdi_list);
 	INIT_LIST_HEAD(&bdi->wb_list);
 	init_waitqueue_head(&bdi->wb_waitq);
+	bdi->last_bdp_sleep = jiffies;
 
 	return cgwb_bdi_init(bdi);
 }

mm/page-writeback.c (+1 -1)
···
 			break;
 		}
 		__set_current_state(TASK_KILLABLE);
-		wb->dirty_sleep = now;
+		bdi->last_bdp_sleep = jiffies;
 		io_schedule_timeout(pause);
 
 		current->dirty_paused_when = now + pause;