[PATCH V3 0/4] arm64/mm: Enable memory hot remove

From: Anshuman Khandual
Date: Tue May 14 2019 - 05:01:46 EST


This series enables memory hot remove on arm64 after fixing a memblock
removal ordering problem in generic __remove_memory() and kernel page
table race conditions on arm64. This is based on the following arm64
working tree.

git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-next/core

David had pointed out that the following patch is already in next/master
(58b11e136dcc14358) and will conflict with the last patch here. Will fix
the conflict once this series gets reviewed and agreed upon.

Author: David Hildenbrand <david@xxxxxxxxxx>
Date: Wed Apr 10 11:02:27 2019 +1000

mm/memory_hotplug: make __remove_pages() and arch_remove_memory() never fail

All callers of arch_remove_memory() ignore errors. And we should really
try to remove any errors from the memory removal path. No more errors are
reported from __remove_pages(). BUG() in s390x code in case
arch_remove_memory() is triggered. We may implement that properly later.
WARN in case powerpc code failed to remove the section mapping, which is
better than ignoring the error completely right now.

Testing:

Tested memory hot remove on arm64 with 4K, 16K and 64K page size configs and
all possible CONFIG_ARM64_VA_BITS and CONFIG_PGTABLE_LEVELS combinations. Only
build tested on non-arm64 platforms.

Changes in V3:

- Implemented most of the suggestions from Mark Rutland for remove_pagetable()
- Fixed applicable PGTABLE_LEVEL wrappers around pgtable page freeing functions
- Replaced 'direct' with 'sparse_vmap' in remove_pagetable() with inverted polarity
- Changed pointer names ('p' at end) and removed tmp from iterations
- Perform intermediate TLB invalidation while clearing pgtable entries
- Dropped flush_tlb_kernel_range() in remove_pagetable()
- Added flush_tlb_kernel_range() in remove_pte_table() instead
- Renamed page freeing functions for pgtable page and mapped pages
- Used page range size instead of order while freeing mapped or pgtable pages
- Removed all PageReserved() handling while freeing mapped or pgtable pages
- Replaced XXX_index() with XXX_offset() while walking the kernel page table
- Used READ_ONCE() while fetching individual pgtable entries
- Took init_mm.page_table_lock for the whole operation instead of only while changing an entry
- Dropped previously added [pmd|pud]_index() which are not required anymore

- Added a new patch to protect against a kernel page table race condition with ptdump
- Added a new patch from Mark Rutland to prevent huge-vmap with ptdump
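
As a rough illustration of two of the V3 items above (fetching each entry
once and holding a single lock across the whole walk rather than per entry),
here is a minimal user-space sketch. It is not code from the series: the
table, the lock, clear_range() and READ_ONCE_SKETCH() are hypothetical
stand-ins for the kernel page table, init_mm.page_table_lock,
remove_pagetable() and READ_ONCE(), and it assumes a GCC/Clang compiler for
__typeof__.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for the kernel's READ_ONCE():
 * the volatile access forces exactly one load of the entry. */
#define READ_ONCE_SKETCH(x) (*(const volatile __typeof__(x) *)&(x))

#define NR_ENTRIES  8
#define ENTRY_VALID 0x1u

/* Mock "page table" and the single lock that guards the whole walk. */
static uint64_t table[NR_ENTRIES];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Clear every valid entry in [start, end); return how many were cleared. */
static size_t clear_range(size_t start, size_t end)
{
    size_t cleared = 0;

    /* Lock once for the entire walk, not around each individual update. */
    pthread_mutex_lock(&table_lock);
    for (size_t i = start; i < end && i < NR_ENTRIES; i++) {
        /* Snapshot the entry once, then test only the snapshot. */
        uint64_t entry = READ_ONCE_SKETCH(table[i]);

        if (!(entry & ENTRY_VALID))
            continue;

        table[i] = 0;
        cleared++;
    }
    pthread_mutex_unlock(&table_lock);

    return cleared;
}
```

Testing the snapshot rather than re-reading table[i] keeps the validity check
and the later use of the entry consistent with each other; the actual patches
additionally deal with TLB invalidation and page freeing, which this sketch
does not model.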

Changes in V2: (https://lkml.org/lkml/2019/4/14/5)

- Added all received review and ack tags
- Split the series from ZONE_DEVICE enablement for better review
- Moved memblock re-order patch to the front as per Robin Murphy
- Updated commit message on memblock re-order patch per Michal Hocko
- Dropped [pmd|pud]_large() definitions
- Used existing [pmd|pud]_sect() instead of earlier [pmd|pud]_large()
- Removed __meminit and __ref tags as per Oscar Salvador
- Dropped unnecessary 'ret' init in arch_add_memory() per Robin Murphy
- Skipped calling into pgtable_page_dtor() for linear mapping page table
  pages and updated all relevant functions

Changes in V1: (https://lkml.org/lkml/2019/4/3/28)

Anshuman Khandual (3):
mm/hotplug: Reorder arch_remove_memory() call in __remove_memory()
arm64/mm: Hold memory hotplug lock while walking for kernel page table dump
arm64/mm: Enable memory hot remove

Mark Rutland (1):
arm64/mm: Inhibit huge-vmap with ptdump

 arch/arm64/Kconfig             |   3 +
 arch/arm64/mm/mmu.c            | 215 ++++++++++++++++++++++++++++++++++++++++-
 arch/arm64/mm/ptdump_debugfs.c |   3 +
 mm/memory_hotplug.c            |   3 +-
 4 files changed, 217 insertions(+), 7 deletions(-)

--
2.7.4