Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: move pgscan, pgsteal, pgrefill to node stats

Reclaim can kick in even on a system that has free memory overall. One
possible cause is a NUMA imbalance, where one or more nodes are under
pressure while others have headroom. Being able to easily identify the
affected nodes would help in diagnosing such cases.

Move the pgscan, pgsteal, and pgrefill counters from vm_event_item to
node_stat_item to provide per-node reclaim visibility. With these
counters as node stats, the values are now displayed in the per-node
section of /proc/zoneinfo, which allows for quick identification of the
affected nodes.

/proc/vmstat continues to report the same counters, aggregated across all
nodes. But the ordering of these items within the readout changes as they
move from the vm events section to the node stats section.

Memcg accounting of these counters is preserved. The relocated counters
remain visible in memory.stat alongside the existing aggregate pgscan and
pgsteal counters.

However, this change affects how the global counters are accumulated.
Previously, the global event count update was gated on !cgroup_reclaim(),
excluding memcg-based reclaim from /proc/vmstat. Now that
mod_lruvec_state() is being used to update the counters, the global
counters will include all reclaim. This is consistent with how pgdemote
counters are already tracked.

Finally, the virtio_balloon driver is updated to use
global_node_page_state() to fetch the counters, as they are no longer
accessible through the vm_events array.

Link: https://lkml.kernel.org/r/20260219235846.161910-1-jp.kobryn@linux.dev
Signed-off-by: JP Kobryn <jp.kobryn@linux.dev>
Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Eugenio Pérez <eperezma@redhat.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mathew Brost <matthew.brost@intel.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Wei Xu <weixugc@google.com>
Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by JP Kobryn (Meta) and committed by Andrew Morton (e4f4fc7a 54218f10)

82 insertions(+), 73 deletions(-)

drivers/virtio/virtio_balloon.c (+4 -4)

···
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_ALLOC_STALL, stall);

 	update_stat(vb, idx++, VIRTIO_BALLOON_S_ASYNC_SCAN,
-		    pages_to_bytes(events[PGSCAN_KSWAPD]));
+		    pages_to_bytes(global_node_page_state(PGSCAN_KSWAPD)));
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRECT_SCAN,
-		    pages_to_bytes(events[PGSCAN_DIRECT]));
+		    pages_to_bytes(global_node_page_state(PGSCAN_DIRECT)));
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_ASYNC_RECLAIM,
-		    pages_to_bytes(events[PGSTEAL_KSWAPD]));
+		    pages_to_bytes(global_node_page_state(PGSTEAL_KSWAPD)));
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRECT_RECLAIM,
-		    pages_to_bytes(events[PGSTEAL_DIRECT]));
+		    pages_to_bytes(global_node_page_state(PGSTEAL_DIRECT)));

 #ifdef CONFIG_HUGETLB_PAGE
 	update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGALLOC,
include/linux/mmzone.h (+13)

···
 	PGDEMOTE_DIRECT,
 	PGDEMOTE_KHUGEPAGED,
 	PGDEMOTE_PROACTIVE,
+	PGSTEAL_KSWAPD,
+	PGSTEAL_DIRECT,
+	PGSTEAL_KHUGEPAGED,
+	PGSTEAL_PROACTIVE,
+	PGSTEAL_ANON,
+	PGSTEAL_FILE,
+	PGSCAN_KSWAPD,
+	PGSCAN_DIRECT,
+	PGSCAN_KHUGEPAGED,
+	PGSCAN_PROACTIVE,
+	PGSCAN_ANON,
+	PGSCAN_FILE,
+	PGREFILL,
 #ifdef CONFIG_HUGETLB_PAGE
 	NR_HUGETLB,
 #endif
include/linux/vm_event_item.h (-13)

···
 	PGFREE, PGACTIVATE, PGDEACTIVATE, PGLAZYFREE,
 	PGFAULT, PGMAJFAULT,
 	PGLAZYFREED,
-	PGREFILL,
 	PGREUSE,
-	PGSTEAL_KSWAPD,
-	PGSTEAL_DIRECT,
-	PGSTEAL_KHUGEPAGED,
-	PGSTEAL_PROACTIVE,
-	PGSCAN_KSWAPD,
-	PGSCAN_DIRECT,
-	PGSCAN_KHUGEPAGED,
-	PGSCAN_PROACTIVE,
 	PGSCAN_DIRECT_THROTTLE,
-	PGSCAN_ANON,
-	PGSCAN_FILE,
-	PGSTEAL_ANON,
-	PGSTEAL_FILE,
 #ifdef CONFIG_NUMA
 	PGSCAN_ZONE_RECLAIM_SUCCESS,
 	PGSCAN_ZONE_RECLAIM_FAILED,
mm/memcontrol.c (+39 -17)

···
 	PGDEMOTE_DIRECT,
 	PGDEMOTE_KHUGEPAGED,
 	PGDEMOTE_PROACTIVE,
+	PGSTEAL_KSWAPD,
+	PGSTEAL_DIRECT,
+	PGSTEAL_KHUGEPAGED,
+	PGSTEAL_PROACTIVE,
+	PGSTEAL_ANON,
+	PGSTEAL_FILE,
+	PGSCAN_KSWAPD,
+	PGSCAN_DIRECT,
+	PGSCAN_KHUGEPAGED,
+	PGSCAN_PROACTIVE,
+	PGSCAN_ANON,
+	PGSCAN_FILE,
+	PGREFILL,
 #ifdef CONFIG_HUGETLB_PAGE
 	NR_HUGETLB,
 #endif
···
 #endif
 	PSWPIN,
 	PSWPOUT,
-	PGSCAN_KSWAPD,
-	PGSCAN_DIRECT,
-	PGSCAN_KHUGEPAGED,
-	PGSCAN_PROACTIVE,
-	PGSTEAL_KSWAPD,
-	PGSTEAL_DIRECT,
-	PGSTEAL_KHUGEPAGED,
-	PGSTEAL_PROACTIVE,
 	PGFAULT,
 	PGMAJFAULT,
-	PGREFILL,
 	PGACTIVATE,
 	PGDEACTIVATE,
 	PGLAZYFREE,
···
 	{ "pgdemote_direct", PGDEMOTE_DIRECT },
 	{ "pgdemote_khugepaged", PGDEMOTE_KHUGEPAGED },
 	{ "pgdemote_proactive", PGDEMOTE_PROACTIVE },
+	{ "pgsteal_kswapd", PGSTEAL_KSWAPD },
+	{ "pgsteal_direct", PGSTEAL_DIRECT },
+	{ "pgsteal_khugepaged", PGSTEAL_KHUGEPAGED },
+	{ "pgsteal_proactive", PGSTEAL_PROACTIVE },
+	{ "pgscan_kswapd", PGSCAN_KSWAPD },
+	{ "pgscan_direct", PGSCAN_DIRECT },
+	{ "pgscan_khugepaged", PGSCAN_KHUGEPAGED },
+	{ "pgscan_proactive", PGSCAN_PROACTIVE },
+	{ "pgrefill", PGREFILL },
 #ifdef CONFIG_NUMA_BALANCING
 	{ "pgpromote_success", PGPROMOTE_SUCCESS },
 #endif
···
 	case PGDEMOTE_DIRECT:
 	case PGDEMOTE_KHUGEPAGED:
 	case PGDEMOTE_PROACTIVE:
+	case PGSTEAL_KSWAPD:
+	case PGSTEAL_DIRECT:
+	case PGSTEAL_KHUGEPAGED:
+	case PGSTEAL_PROACTIVE:
+	case PGSCAN_KSWAPD:
+	case PGSCAN_DIRECT:
+	case PGSCAN_KHUGEPAGED:
+	case PGSCAN_PROACTIVE:
+	case PGREFILL:
 #ifdef CONFIG_NUMA_BALANCING
 	case PGPROMOTE_SUCCESS:
 #endif
···
 	/* Accumulated memory events */

 	seq_buf_printf(s, "pgscan %lu\n",
-		       memcg_events(memcg, PGSCAN_KSWAPD) +
-		       memcg_events(memcg, PGSCAN_DIRECT) +
-		       memcg_events(memcg, PGSCAN_PROACTIVE) +
-		       memcg_events(memcg, PGSCAN_KHUGEPAGED));
+		       memcg_page_state(memcg, PGSCAN_KSWAPD) +
+		       memcg_page_state(memcg, PGSCAN_DIRECT) +
+		       memcg_page_state(memcg, PGSCAN_PROACTIVE) +
+		       memcg_page_state(memcg, PGSCAN_KHUGEPAGED));
 	seq_buf_printf(s, "pgsteal %lu\n",
-		       memcg_events(memcg, PGSTEAL_KSWAPD) +
-		       memcg_events(memcg, PGSTEAL_DIRECT) +
-		       memcg_events(memcg, PGSTEAL_PROACTIVE) +
-		       memcg_events(memcg, PGSTEAL_KHUGEPAGED));
+		       memcg_page_state(memcg, PGSTEAL_KSWAPD) +
+		       memcg_page_state(memcg, PGSTEAL_DIRECT) +
+		       memcg_page_state(memcg, PGSTEAL_PROACTIVE) +
+		       memcg_page_state(memcg, PGSTEAL_KHUGEPAGED));

 	for (i = 0; i < ARRAY_SIZE(memcg_vm_event_stat); i++) {
 #ifdef CONFIG_MEMCG_V1
mm/vmscan.c (+13 -26)

···
 	unsigned long nr_taken;
 	struct reclaim_stat stat;
 	bool file = is_file_lru(lru);
-	enum vm_event_item item;
+	enum node_stat_item item;
 	struct pglist_data *pgdat = lruvec_pgdat(lruvec);
 	bool stalled = false;
···
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);
 	item = PGSCAN_KSWAPD + reclaimer_offset(sc);
-	if (!cgroup_reclaim(sc))
-		__count_vm_events(item, nr_scanned);
-	count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned);
-	__count_vm_events(PGSCAN_ANON + file, nr_scanned);
+	mod_lruvec_state(lruvec, item, nr_scanned);
+	mod_lruvec_state(lruvec, PGSCAN_ANON + file, nr_scanned);

 	spin_unlock_irq(&lruvec->lru_lock);
···
 		stat.nr_demoted);
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	item = PGSTEAL_KSWAPD + reclaimer_offset(sc);
-	if (!cgroup_reclaim(sc))
-		__count_vm_events(item, nr_reclaimed);
-	count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed);
-	__count_vm_events(PGSTEAL_ANON + file, nr_reclaimed);
+	mod_lruvec_state(lruvec, item, nr_reclaimed);
+	mod_lruvec_state(lruvec, PGSTEAL_ANON + file, nr_reclaimed);

 	lru_note_cost_unlock_irq(lruvec, file, stat.nr_pageout,
 				 nr_scanned - nr_reclaimed);
···
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);

-	if (!cgroup_reclaim(sc))
-		__count_vm_events(PGREFILL, nr_scanned);
-	count_memcg_events(lruvec_memcg(lruvec), PGREFILL, nr_scanned);
+	mod_lruvec_state(lruvec, PGREFILL, nr_scanned);

 	spin_unlock_irq(&lruvec->lru_lock);
···
 {
 	int i;
 	int gen;
-	enum vm_event_item item;
+	enum node_stat_item item;
 	int sorted = 0;
 	int scanned = 0;
 	int isolated = 0;
···
 	int scan_batch = min(nr_to_scan, MAX_LRU_BATCH);
 	int remaining = scan_batch;
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
-	struct mem_cgroup *memcg = lruvec_memcg(lruvec);

 	VM_WARN_ON_ONCE(!list_empty(list));
···
 	}

 	item = PGSCAN_KSWAPD + reclaimer_offset(sc);
-	if (!cgroup_reclaim(sc)) {
-		__count_vm_events(item, isolated);
-		__count_vm_events(PGREFILL, sorted);
-	}
-	count_memcg_events(memcg, item, isolated);
-	count_memcg_events(memcg, PGREFILL, sorted);
-	__count_vm_events(PGSCAN_ANON + type, isolated);
+	mod_lruvec_state(lruvec, item, isolated);
+	mod_lruvec_state(lruvec, PGREFILL, sorted);
+	mod_lruvec_state(lruvec, PGSCAN_ANON + type, isolated);
 	trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, scan_batch,
 				    scanned, skipped, isolated,
 				    type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
···
 	LIST_HEAD(clean);
 	struct folio *folio;
 	struct folio *next;
-	enum vm_event_item item;
+	enum node_stat_item item;
 	struct reclaim_stat stat;
 	struct lru_gen_mm_walk *walk;
 	bool skip_retry = false;
···
 		stat.nr_demoted);

 	item = PGSTEAL_KSWAPD + reclaimer_offset(sc);
-	if (!cgroup_reclaim(sc))
-		__count_vm_events(item, reclaimed);
-	count_memcg_events(memcg, item, reclaimed);
-	__count_vm_events(PGSTEAL_ANON + type, reclaimed);
+	mod_lruvec_state(lruvec, item, reclaimed);
+	mod_lruvec_state(lruvec, PGSTEAL_ANON + type, reclaimed);

 	spin_unlock_irq(&lruvec->lru_lock);
mm/vmstat.c (+13 -13)

···
 	[I(PGDEMOTE_DIRECT)] = "pgdemote_direct",
 	[I(PGDEMOTE_KHUGEPAGED)] = "pgdemote_khugepaged",
 	[I(PGDEMOTE_PROACTIVE)] = "pgdemote_proactive",
+	[I(PGSTEAL_KSWAPD)] = "pgsteal_kswapd",
+	[I(PGSTEAL_DIRECT)] = "pgsteal_direct",
+	[I(PGSTEAL_KHUGEPAGED)] = "pgsteal_khugepaged",
+	[I(PGSTEAL_PROACTIVE)] = "pgsteal_proactive",
+	[I(PGSTEAL_ANON)] = "pgsteal_anon",
+	[I(PGSTEAL_FILE)] = "pgsteal_file",
+	[I(PGSCAN_KSWAPD)] = "pgscan_kswapd",
+	[I(PGSCAN_DIRECT)] = "pgscan_direct",
+	[I(PGSCAN_KHUGEPAGED)] = "pgscan_khugepaged",
+	[I(PGSCAN_PROACTIVE)] = "pgscan_proactive",
+	[I(PGSCAN_ANON)] = "pgscan_anon",
+	[I(PGSCAN_FILE)] = "pgscan_file",
+	[I(PGREFILL)] = "pgrefill",
 #ifdef CONFIG_HUGETLB_PAGE
 	[I(NR_HUGETLB)] = "nr_hugetlb",
 #endif
···
 	[I(PGMAJFAULT)] = "pgmajfault",
 	[I(PGLAZYFREED)] = "pglazyfreed",

-	[I(PGREFILL)] = "pgrefill",
 	[I(PGREUSE)] = "pgreuse",
-	[I(PGSTEAL_KSWAPD)] = "pgsteal_kswapd",
-	[I(PGSTEAL_DIRECT)] = "pgsteal_direct",
-	[I(PGSTEAL_KHUGEPAGED)] = "pgsteal_khugepaged",
-	[I(PGSTEAL_PROACTIVE)] = "pgsteal_proactive",
-	[I(PGSCAN_KSWAPD)] = "pgscan_kswapd",
-	[I(PGSCAN_DIRECT)] = "pgscan_direct",
-	[I(PGSCAN_KHUGEPAGED)] = "pgscan_khugepaged",
-	[I(PGSCAN_PROACTIVE)] = "pgscan_proactive",
 	[I(PGSCAN_DIRECT_THROTTLE)] = "pgscan_direct_throttle",
-	[I(PGSCAN_ANON)] = "pgscan_anon",
-	[I(PGSCAN_FILE)] = "pgscan_file",
-	[I(PGSTEAL_ANON)] = "pgsteal_anon",
-	[I(PGSTEAL_FILE)] = "pgsteal_file",

 #ifdef CONFIG_NUMA
 	[I(PGSCAN_ZONE_RECLAIM_SUCCESS)] = "zone_reclaim_success",