Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

mm: use correct numa policy node for transparent hugepages

Pass down the correct node for a transparent hugepage allocation. Most
callers continue to use the current node, but the khugepaged daemon
now uses the node that the first page to be collapsed already resides on.
This ensures that khugepaged does not disturb memory locality for an
existing process which uses a local policy.

The choice of node is somewhat primitive currently: it just uses the
node of the first page in the pmd range. An alternative would be to
look at multiple pages and use the most popular node. I used the
simplest variant for now which should work well enough for the case of
all pages being on the same node.
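
As a rough userspace illustration of the two strategies described above (not
kernel code, and not part of this patch), the sketch below compares "node of
the first page" with "most popular node" over an array of per-page node ids.
The names first_page_node, most_popular_node, MAX_NODES and NO_NODE are
hypothetical; in the kernel the per-page node would come from page_to_nid().

```c
#include <stddef.h>

#define MAX_NODES 8	/* hypothetical upper bound on node ids for this sketch */
#define NO_NODE   (-1)	/* same "not chosen yet" sentinel the patch uses */

/* Variant used by the patch: take the node of the first present page. */
static int first_page_node(const int *page_nid, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (page_nid[i] != NO_NODE)
			return page_nid[i];
	return NO_NODE;
}

/*
 * Alternative mentioned in the commit message: count pages per node and
 * return the node holding the most pages. Ties go to the lowest node id.
 */
static int most_popular_node(const int *page_nid, size_t n)
{
	size_t count[MAX_NODES] = { 0 };
	int best = NO_NODE;

	for (size_t i = 0; i < n; i++) {
		int nid = page_nid[i];

		if (nid < 0 || nid >= MAX_NODES)
			continue;	/* skip holes and out-of-range ids */
		count[nid]++;
	}
	for (int nid = 0; nid < MAX_NODES; nid++)
		if (count[nid] > 0 &&
		    (best == NO_NODE || count[nid] > count[best]))
			best = nid;
	return best;
}
```

When every page sits on the same node the two functions agree, which is the
case the commit message targets. They differ only for mixed ranges, for
example one page on node 1 followed by three pages on node 0.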

[akpm@linux-foundation.org: coding-style fixes]
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Authored by Andi Kleen and committed by Linus Torvalds
5c4b4be3 19ee151e

+19 -8
+17 -7
mm/huge_memory.c
···
 
 static inline struct page *alloc_hugepage_vma(int defrag,
 					      struct vm_area_struct *vma,
-					      unsigned long haddr)
+					      unsigned long haddr, int nd)
 {
 	return alloc_pages_vma(alloc_hugepage_gfpmask(defrag),
-			       HPAGE_PMD_ORDER, vma, haddr, numa_node_id());
+			       HPAGE_PMD_ORDER, vma, haddr, nd);
 }
 
 #ifndef CONFIG_NUMA
···
 		if (unlikely(khugepaged_enter(vma)))
 			return VM_FAULT_OOM;
 		page = alloc_hugepage_vma(transparent_hugepage_defrag(vma),
-					  vma, haddr);
+					  vma, haddr, numa_node_id());
 		if (unlikely(!page))
 			goto out;
 		if (unlikely(mem_cgroup_newpage_charge(page, mm, GFP_KERNEL))) {
···
 	if (transparent_hugepage_enabled(vma) &&
 	    !transparent_hugepage_debug_cow())
 		new_page = alloc_hugepage_vma(transparent_hugepage_defrag(vma),
-					      vma, haddr);
+					      vma, haddr, numa_node_id());
 	else
 		new_page = NULL;
 
···
 static void collapse_huge_page(struct mm_struct *mm,
 			       unsigned long address,
 			       struct page **hpage,
-			       struct vm_area_struct *vma)
+			       struct vm_area_struct *vma,
+			       int node)
 {
 	pgd_t *pgd;
 	pud_t *pud;
···
 	 * mmap_sem in read mode is good idea also to allow greater
 	 * scalability.
 	 */
-	new_page = alloc_hugepage_vma(khugepaged_defrag(), vma, address);
+	new_page = alloc_hugepage_vma(khugepaged_defrag(), vma, address,
+				      node);
 	if (unlikely(!new_page)) {
 		up_read(&mm->mmap_sem);
 		*hpage = ERR_PTR(-ENOMEM);
···
 	struct page *page;
 	unsigned long _address;
 	spinlock_t *ptl;
+	int node = -1;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
···
 		page = vm_normal_page(vma, _address, pteval);
 		if (unlikely(!page))
 			goto out_unmap;
+		/*
+		 * Chose the node of the first page. This could
+		 * be more sophisticated and look at more pages,
+		 * but isn't for now.
+		 */
+		if (node == -1)
+			node = page_to_nid(page);
 		VM_BUG_ON(PageCompound(page));
 		if (!PageLRU(page) || PageLocked(page) || !PageAnon(page))
 			goto out_unmap;
···
 	pte_unmap_unlock(pte, ptl);
 	if (ret)
 		/* collapse_huge_page will return with the mmap_sem released */
-		collapse_huge_page(mm, address, hpage, vma);
+		collapse_huge_page(mm, address, hpage, vma, node);
 out:
 	return ret;
 }
+2 -1
mm/mempolicy.c
···
 		page = alloc_page_interleave(gfp, order, interleave_nodes(pol));
 	else
 		page = __alloc_pages_nodemask(gfp, order,
-			policy_zonelist(gfp, pol), policy_nodemask(gfp, pol));
+			policy_zonelist(gfp, pol, numa_node_id()),
+			policy_nodemask(gfp, pol));
 	put_mems_allowed();
 	return page;
 }