Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: vmscan: prepare for reparenting traditional LRU folios

To resolve the dying memcg issue, we need to reparent the LRU folios of a
child memcg to its parent memcg. For the traditional LRU, each lruvec of
every memcg comprises four LRU lists. Because every lruvec has the same
set of lists, each LRU list of a memcg can be transferred to the
corresponding list of its parent memcg during the reparenting process.

This commit implements the function that performs this transfer,
lru_reparent_memcg(), which will be used during the reparenting process.
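
The transfer described above can be illustrated with a small userspace sketch. It is not the kernel code: `toy_lruvec`, `toy_reparent()`, `splice_tail_init()` and the constants `NR_LISTS` and `NR_ZONES` are hypothetical stand-ins, assuming only that each list is a circular doubly linked list and that per-zone sizes are plain counters. It shows why the symmetry matters: list `l` of the child is spliced onto list `l` of the parent in constant time, and the child's per-zone sizes are added to the parent's.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for a circular doubly linked list (hypothetical). */
struct list_head {
	struct list_head *prev, *next;
};

static void list_init(struct list_head *h) { h->prev = h->next = h; }
static int list_is_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail_node(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

/* Move every node of @src to the tail of @dst in O(1), then reset @src. */
static void splice_tail_init(struct list_head *src, struct list_head *dst)
{
	if (list_is_empty(src))
		return;
	src->next->prev = dst->prev;
	dst->prev->next = src->next;
	src->prev->next = dst;
	dst->prev = src->prev;
	list_init(src);
}

#define NR_LISTS 4	/* assumed: one slot per LRU list */
#define NR_ZONES 2	/* assumed: toy zone count */

struct toy_lruvec {
	struct list_head lists[NR_LISTS];
	unsigned long size[NR_ZONES][NR_LISTS];
};

static void toy_lruvec_init(struct toy_lruvec *v)
{
	for (int l = 0; l < NR_LISTS; l++) {
		list_init(&v->lists[l]);
		for (int z = 0; z < NR_ZONES; z++)
			v->size[z][l] = 0;
	}
}

/*
 * Splice each child list onto the matching parent list and add the
 * child's per-zone sizes to the parent. The child's counters are left
 * untouched in this sketch.
 */
static void toy_reparent(struct toy_lruvec *child, struct toy_lruvec *parent)
{
	for (int l = 0; l < NR_LISTS; l++) {
		splice_tail_init(&child->lists[l], &parent->lists[l]);
		for (int z = 0; z < NR_ZONES; z++)
			parent->size[z][l] += child->size[z][l];
	}
}

/* Count the nodes on a list by walking it once. */
static unsigned long list_count(const struct list_head *h)
{
	unsigned long n = 0;

	for (const struct list_head *p = h->next; p != h; p = p->next)
		n++;
	return n;
}
```

Splicing at the tail keeps the parent's existing entries ahead of the child's within each list, and the cost does not depend on how many entries the child holds.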

Link: https://lore.kernel.org/a92d217a9fc82bd0c401210204a095caaf615b1c.1772711148.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Muchun Song <muchun.song@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Allen Pais <apais@linux.microsoft.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Chengming Zhou <chengming.zhou@linux.dev>
Cc: Chen Ridong <chenridong@huawei.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Imran Khan <imran.f.khan@oracle.com>
Cc: Kamalesh Babulal <kamalesh.babulal@oracle.com>
Cc: Lance Yang <lance.yang@linux.dev>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: Wei Xu <weixugc@google.com>
Cc: Yosry Ahmed <yosry@kernel.org>
Cc: Yuanchu Xie <yuanchu@google.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

Authored by Qi Zheng and committed by Andrew Morton
07a6e9a2 31b54a5e

+54 -19
include/linux/swap.h (+21)

···
 
 	return READ_ONCE(memcg->swappiness);
 }
+
+void lru_reparent_memcg(struct mem_cgroup *memcg, struct mem_cgroup *parent, int nid);
 #else
 static inline int mem_cgroup_swappiness(struct mem_cgroup *mem)
 {
···
 	return vm_swap_full();
 }
 #endif
+
+/* for_each_managed_zone_pgdat - helper macro to iterate over all managed zones in a pgdat up to
+ * and including the specified highidx
+ * @zone: The current zone in the iterator
+ * @pgdat: The pgdat which node_zones are being iterated
+ * @idx: The index variable
+ * @highidx: The index of the highest zone to return
+ *
+ * This macro iterates through all managed zones up to and including the specified highidx.
+ * The zone iterator enters an invalid state after macro call and must be reinitialized
+ * before it can be used again.
+ */
+#define for_each_managed_zone_pgdat(zone, pgdat, idx, highidx)	\
+	for ((idx) = 0, (zone) = (pgdat)->node_zones;		\
+	     (idx) <= (highidx);				\
+	     (idx)++, (zone)++)					\
+		if (!managed_zone(zone))			\
+			continue;				\
+		else
 
 #endif /* __KERNEL__*/
 #endif /* _LINUX_SWAP_H */
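
The `for ... if (...) continue; else` shape of for_each_managed_zone_pgdat can be tried outside the kernel with a minimal sketch. The names `toy_zone`, `toy_pgdat`, `toy_managed_zone()` and `for_each_toy_managed_zone` are hypothetical, and a zone is assumed to count as managed when its page counter is non-zero. The trailing `else` lets the caller's statement or block attach to the macro like an ordinary loop body, while unmanaged zones are skipped.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_ZONES 4	/* assumed: toy zone count */

struct toy_zone {
	unsigned long managed_pages;
};

struct toy_pgdat {
	struct toy_zone node_zones[TOY_MAX_ZONES];
};

/* Assumption for this sketch: a zone is "managed" if it has any pages. */
static int toy_managed_zone(const struct toy_zone *z)
{
	return z->managed_pages != 0;
}

/*
 * Visit every managed zone with index 0..highidx inclusive. The caller's
 * loop body becomes the `else` branch, so unmanaged zones are skipped.
 */
#define for_each_toy_managed_zone(zone, pgdat, idx, highidx)	\
	for ((idx) = 0, (zone) = (pgdat)->node_zones;		\
	     (idx) <= (highidx);				\
	     (idx)++, (zone)++)					\
		if (!toy_managed_zone(zone))			\
			continue;				\
		else

/* Sum the pages of all managed zones up to and including @highidx. */
static unsigned long toy_sum_managed(struct toy_pgdat *pgdat, int highidx)
{
	struct toy_zone *zone;
	unsigned long sum = 0;
	int idx;

	for_each_toy_managed_zone(zone, pgdat, idx, highidx)
		sum += zone->managed_pages;
	return sum;
}

/* Count how many zones the iterator actually visits. */
static int toy_count_managed(struct toy_pgdat *pgdat, int highidx)
{
	struct toy_zone *zone;
	int idx, visited = 0;

	for_each_toy_managed_zone(zone, pgdat, idx, highidx) {
		visited++;
	}
	return visited;
}
```

After the loop ends, `zone` points one element past the zone at `highidx`, which is why the macro's comment asks callers to reinitialize the iterator before reusing it.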
mm/swap.c (+33)

···
 	fbatch->nr = j;
 }
 
+#ifdef CONFIG_MEMCG
+static void lruvec_reparent_lru(struct lruvec *child_lruvec,
+				struct lruvec *parent_lruvec,
+				enum lru_list lru, int nid)
+{
+	int zid;
+	struct zone *zone;
+
+	if (lru != LRU_UNEVICTABLE)
+		list_splice_tail_init(&child_lruvec->lists[lru], &parent_lruvec->lists[lru]);
+
+	for_each_managed_zone_pgdat(zone, NODE_DATA(nid), zid, MAX_NR_ZONES - 1) {
+		unsigned long size = mem_cgroup_get_zone_lru_size(child_lruvec, lru, zid);
+
+		mem_cgroup_update_lru_size(parent_lruvec, lru, zid, size);
+	}
+}
+
+void lru_reparent_memcg(struct mem_cgroup *memcg, struct mem_cgroup *parent, int nid)
+{
+	enum lru_list lru;
+	struct lruvec *child_lruvec, *parent_lruvec;
+
+	child_lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid));
+	parent_lruvec = mem_cgroup_lruvec(parent, NODE_DATA(nid));
+	parent_lruvec->anon_cost += child_lruvec->anon_cost;
+	parent_lruvec->file_cost += child_lruvec->file_cost;
+
+	for_each_lru(lru)
+		lruvec_reparent_lru(child_lruvec, parent_lruvec, lru, nid);
+}
+#endif
+
 static const struct ctl_table swap_sysctl_table[] = {
 	{
 		.procname	= "page-cluster",
mm/vmscan.c (-19)

···
 }
 #endif
 
-/* for_each_managed_zone_pgdat - helper macro to iterate over all managed zones in a pgdat up to
- * and including the specified highidx
- * @zone: The current zone in the iterator
- * @pgdat: The pgdat which node_zones are being iterated
- * @idx: The index variable
- * @highidx: The index of the highest zone to return
- *
- * This macro iterates through all managed zones up to and including the specified highidx.
- * The zone iterator enters an invalid state after macro call and must be reinitialized
- * before it can be used again.
- */
-#define for_each_managed_zone_pgdat(zone, pgdat, idx, highidx)	\
-	for ((idx) = 0, (zone) = (pgdat)->node_zones;		\
-	     (idx) <= (highidx);				\
-	     (idx)++, (zone)++)					\
-		if (!managed_zone(zone))			\
-			continue;				\
-		else
-
 static void set_task_reclaim_state(struct task_struct *task,
 				   struct reclaim_state *rs)
 {