Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm/damon/core: avoid use of half-online-committed context

One major usage of damon_call() is online DAMON parameters update. It is
done by calling damon_commit_ctx() inside the damon_call() callback
function. damon_commit_ctx() can fail for two reasons: 1) invalid
parameters and 2) internal memory allocation failures. On failure, the
damon_ctx being updated (the commit destination) can be left partially
updated (or, from one perspective, corrupted), and therefore shouldn't be
used anymore. The function only ensures that the damon_ctx object can be
safely deallocated using damon_destroy_ctx().

The API callers, however, call damon_commit_ctx() only after asserting
that the parameters are valid, so that damon_commit_ctx() does not fail
due to invalid input parameters. But it can still theoretically fail if
the internal memory allocation fails. In that case, DAMON may run with the
partially updated damon_ctx. This can result in unexpected behaviors,
including even a NULL pointer dereference in case of a
damos_commit_dests() failure [1]. Such allocations are arguably too small
to fail, so real-world impact should be rare. But, given the bad
consequence, this needs to be fixed.
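
To make the failure mode above concrete, here is a minimal user-space sketch of a multi-step commit that leaves its destination half-updated when an allocation fails. It is not the kernel code: struct toy_ctx, toy_alloc(), toy_commit() and the toy_fail_alloc switch are all hypothetical names invented for this illustration, and the field layout is only assumed to resemble a context with a plain value plus a heap array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context: one plain value and one heap-allocated array. */
struct toy_ctx {
	unsigned long interval;
	int *dests;
	size_t nr_dests;
};

/* Fault-injection switch (illustration only): force toy_alloc() to fail. */
static bool toy_fail_alloc;

static void *toy_alloc(size_t size)
{
	return toy_fail_alloc ? NULL : malloc(size);
}

/* Copy src into dst in two steps. Returns 0 on success, -1 on failure. */
static int toy_commit(struct toy_ctx *dst, const struct toy_ctx *src)
{
	int *new_dests;

	/* Step 1: plain value, cannot fail. */
	dst->interval = src->interval;

	/* Step 2: drop the old array, then try to build the new one. */
	free(dst->dests);
	dst->dests = NULL;
	dst->nr_dests = src->nr_dests;
	new_dests = toy_alloc(src->nr_dests * sizeof(*new_dests));
	if (!new_dests)
		return -1;	/* dst: new interval, new count, NULL array */
	memcpy(new_dests, src->dests, src->nr_dests * sizeof(*new_dests));
	dst->dests = new_dests;
	return 0;
}

/* Shows the half-updated state left behind by a failed commit. */
static void toy_demo(void)
{
	int src_dests[3] = { 10, 20, 30 };
	struct toy_ctx src = { .interval = 2, .dests = src_dests, .nr_dests = 3 };
	struct toy_ctx dst = { .interval = 1, .dests = NULL, .nr_dests = 0 };

	toy_fail_alloc = true;
	assert(toy_commit(&dst, &src) == -1);
	assert(dst.interval == 2);			/* step 1 was applied */
	assert(dst.nr_dests == 3 && dst.dests == NULL);	/* step 2 was not */
	toy_fail_alloc = false;
}
```

After the failed call, any code that trusts nr_dests and indexes dests would dereference NULL. The object can still be freed safely, but it should not be used for anything else.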

Avoid such partially-committed (maybe-corrupted) damon_ctx use by saving
the damon_commit_ctx() failure on the damon_ctx object. For this,
introduce the damon_ctx->maybe_corrupted field. damon_commit_ctx() sets it
when it fails. kdamond_call() checks if the field is set after each
damon_call_control->fn() is executed. If it is set, it ignores the
remaining callback requests and returns. All kdamond_call() callers,
including kdamond_fn(), also check the maybe_corrupted field right after
kdamond_call() invocations. If the field is set, they break the
kdamond_fn() main loop so that DAMON still doesn't use the context that
might be corrupted.
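
The flag protocol described above can be sketched in user space as follows. This is an illustrative model under stated assumptions, not the kernel implementation: struct sketch_ctx, sketch_commit(), sketch_call(), sketch_main_loop() and the two callbacks are hypothetical names. Only the maybe_corrupted field name and the "set on entry, clear on success, check after each callback" shape are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical context carrying the flag introduced by the patch. */
struct sketch_ctx {
	int param;
	bool maybe_corrupted;
};

typedef int (*sketch_fn)(struct sketch_ctx *ctx);

/* Mark the context suspect on entry; clear the mark only on full success. */
static int sketch_commit(struct sketch_ctx *dst, int new_param, bool inject_failure)
{
	dst->maybe_corrupted = true;
	dst->param = new_param;		/* stands in for a partial update */
	if (inject_failure)
		return -1;		/* flag stays set */
	dst->maybe_corrupted = false;
	return 0;
}

/* Run callbacks in order; unless cancelling, stop once the flag is set. */
static size_t sketch_call(struct sketch_ctx *ctx, sketch_fn *fns, size_t nr, bool cancel)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		fns[i](ctx);
		if (!cancel && ctx->maybe_corrupted)
			return i + 1;	/* remaining callbacks are ignored */
	}
	return nr;
}

/* Caller side: re-check the flag after every sketch_call() and bail out. */
static int sketch_main_loop(struct sketch_ctx *ctx, sketch_fn *fns, size_t nr, int max_iters)
{
	int iters = 0;

	while (iters < max_iters) {
		iters++;
		sketch_call(ctx, fns, nr, false);
		if (ctx->maybe_corrupted)
			break;
		/* normal work that reads ctx would go here */
	}
	return iters;
}

static int good_cb(struct sketch_ctx *ctx) { return sketch_commit(ctx, 1, false); }
static int bad_cb(struct sketch_ctx *ctx) { return sketch_commit(ctx, 2, true); }

static void sketch_demo(void)
{
	struct sketch_ctx ctx = { .param = 0, .maybe_corrupted = false };
	sketch_fn fns[3] = { good_cb, bad_cb, good_cb };

	assert(sketch_call(&ctx, fns, 3, false) == 2);	/* third callback skipped */
	assert(ctx.maybe_corrupted);
}
```

In this sketch the early return in sketch_call() matters for more than speed: a later successful commit would clear the flag and hide the earlier failure from the caller. Setting the flag before doing any work, rather than on each error path, keeps every failing return covered without touching them individually.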

[sj@kernel.org: let kdamond_call() with cancel regardless of maybe_corrupted]
Link: https://lkml.kernel.org/r/20260320031553.2479-1-sj@kernel.org
Link: https://sashiko.dev/#/patchset/20260319145218.86197-1-sj%40kernel.org
Link: https://lkml.kernel.org/r/20260319145218.86197-1-sj@kernel.org
Link: https://lore.kernel.org/20260319043309.97966-1-sj@kernel.org [1]
Fixes: 3301f1861d34 ("mm/damon/sysfs: handle commit command using damon_call()")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> [6.15+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by SeongJae Park and committed by Andrew Morton
26f775a0 3a206a86

+14
+6
include/linux/damon.h
···
 	struct damos_walk_control *walk_control;
 	struct mutex walk_control_lock;
 
+	/*
+	 * indicate if this may be corrupted. Currentonly this is set only for
+	 * damon_commit_ctx() failure.
+	 */
+	bool maybe_corrupted;
+
 	/* Working thread of the given DAMON context */
 	struct task_struct *kdamond;
 	/* Protects @kdamond field access */
+8
mm/damon/core.c
···
 {
 	int err;
 
+	dst->maybe_corrupted = true;
 	if (!is_power_of_2(src->min_region_sz))
 		return -EINVAL;
 
···
 	dst->addr_unit = src->addr_unit;
 	dst->min_region_sz = src->min_region_sz;
 
+	dst->maybe_corrupted = false;
 	return 0;
 }
 
···
 			complete(&control->completion);
 		else if (control->canceled && control->dealloc_on_cancel)
 			kfree(control);
+		if (!cancel && ctx->maybe_corrupted)
+			break;
 	}
 
 	mutex_lock(&ctx->call_controls_lock);
···
 		kdamond_usleep(min_wait_time);
 
 		kdamond_call(ctx, false);
+		if (ctx->maybe_corrupted)
+			return -EINVAL;
 		damos_walk_cancel(ctx);
 	}
 	return -EBUSY;
···
 		 * kdamond_merge_regions() if possible, to reduce overhead
 		 */
 		kdamond_call(ctx, false);
+		if (ctx->maybe_corrupted)
+			break;
 		if (!list_empty(&ctx->schemes))
 			kdamond_apply_schemes(ctx);
 		else