Linux kernel mirror (for testing) git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

arm64/mm: Enable batched TLB flush in unmap_hotplug_range()

During a memory hot remove operation, both the linear and vmemmap mappings for
the memory range being removed get unmapped via unmap_hotplug_range(), but
mapped pages get freed only for the vmemmap mapping. This is a sequential
operation: each table entry is cleared, followed by a leaf-specific TLB flush,
followed by a memory free operation when applicable.

This approach was simple and uniform for both vmemmap and linear mappings.
But the linear mapping might contain CONT-marked block memory, where the
architecture requires that all entries in the range be cleared before any TLB
flush. Hence, batch all TLB flushes during the table tear-down walk and
perform a single flush at the end in unmap_hotplug_range().

Prior to this fix, it was hypothetically possible for a speculative access
to a higher address in the contiguous block to fill the TLB with shattered
entries for the entire contiguous range after a lower address had already
been cleared and invalidated. Due to the table entries being shattered, the
subsequent TLB invalidation for the higher address would not then clear the
TLB entries for the lower address, meaning stale TLB entries could persist.

Besides correctness, batching also improves performance via TLBI range
operations and fewer synchronization instructions. The time spent executing
unmap_hotplug_range() improved by 97%, measured over a 2GB memory hot removal
in a KVM guest.

This scheme is not applicable during vmemmap mapping tear down, where memory
needs to be freed and hence a TLB flush is required after clearing each page
table entry, before the mapped page is released.

Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Closes: https://lore.kernel.org/all/aWZYXhrT6D2M-7-N@willie-the-truck/
Fixes: bbd6ec605c0f ("arm64/mm: Enable memory hot remove")
Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Authored by Anshuman Khandual, committed by Catalin Marinas
48478b9f 1f318b96

+20 -16
arch/arm64/mm/mmu.c
@@ -1458,9 +1458,13 @@
 		WARN_ON(!pte_present(pte));
 		__pte_clear(&init_mm, addr, ptep);
-		flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
-		if (free_mapped)
+		if (free_mapped) {
+			/* CONT blocks are not supported in the vmemmap */
+			WARN_ON(pte_cont(pte));
+			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
 			free_hotplug_page_range(pte_page(pte),
 						PAGE_SIZE, altmap);
+		}
+		/* unmap_hotplug_range() flushes TLB for !free_mapped */
 	} while (addr += PAGE_SIZE, addr < end);
 }
@@ -1486,15 +1482,14 @@
 		WARN_ON(!pmd_present(pmd));
 		if (pmd_sect(pmd)) {
 			pmd_clear(pmdp);
-
-			/*
-			 * One TLBI should be sufficient here as the PMD_SIZE
-			 * range is mapped with a single block entry.
-			 */
-			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
-			if (free_mapped)
+			if (free_mapped) {
+				/* CONT blocks are not supported in the vmemmap */
+				WARN_ON(pmd_cont(pmd));
+				flush_tlb_kernel_range(addr, addr + PMD_SIZE);
 				free_hotplug_page_range(pmd_page(pmd),
 							PMD_SIZE, altmap);
+			}
+			/* unmap_hotplug_range() flushes TLB for !free_mapped */
 			continue;
 		}
 		WARN_ON(!pmd_table(pmd));
@@ -1518,15 +1515,12 @@
 		WARN_ON(!pud_present(pud));
 		if (pud_sect(pud)) {
 			pud_clear(pudp);
-
-			/*
-			 * One TLBI should be sufficient here as the PUD_SIZE
-			 * range is mapped with a single block entry.
-			 */
-			flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
-			if (free_mapped)
+			if (free_mapped) {
+				flush_tlb_kernel_range(addr, addr + PUD_SIZE);
 				free_hotplug_page_range(pud_page(pud),
 							PUD_SIZE, altmap);
+			}
+			/* unmap_hotplug_range() flushes TLB for !free_mapped */
 			continue;
 		}
 		WARN_ON(!pud_table(pud));
@@ -1553,5 +1553,6 @@
 static void unmap_hotplug_range(unsigned long addr, unsigned long end,
 				bool free_mapped, struct vmem_altmap *altmap)
 {
+	unsigned long start = addr;
 	unsigned long next;
 	pgd_t *pgdp, pgd;
@@ -1575,6 +1574,9 @@
 		WARN_ON(!pgd_present(pgd));
 		unmap_hotplug_p4d_range(pgdp, addr, next, free_mapped, altmap);
 	} while (addr = next, addr < end);
+
+	if (!free_mapped)
+		flush_tlb_kernel_range(start, end);
 }
 
 static void free_empty_pte_table(pmd_t *pmdp, unsigned long addr,