From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
Clearing a 2MB huge page will typically blow away several levels of CPU
caches. To avoid this, clear only the 4K area around the fault address
through the cache and use cache-avoiding clears for the rest of the 2MB
area.
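
As a rough illustration of the idea (a userspace sketch, not the code from
the patches), the clear loop could look something like the following. The
names clear_huge_page_sketch(), clear_subpage_nocache() and
clear_subpage_cached() are hypothetical, and the "nocache" helper is a plain
memset() stand-in; a real implementation would presumably use non-temporal
stores there.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SUBPAGE_SIZE 4096UL                     /* 4K base page */
#define HPAGE_SIZE   (2UL * 1024 * 1024)        /* 2MB huge page */
#define HPAGE_NR     (HPAGE_SIZE / SUBPAGE_SIZE) /* 512 subpages */

/*
 * Hypothetical stand-in for a cache-avoiding clear. This sketch just
 * calls memset(); it does NOT actually bypass the cache.
 */
static void clear_subpage_nocache(unsigned char *page)
{
	memset(page, 0, SUBPAGE_SIZE);
}

/* Ordinary clear that goes through the cache. */
static void clear_subpage_cached(unsigned char *page)
{
	memset(page, 0, SUBPAGE_SIZE);
}

/*
 * Clear a whole huge page. fault_offset is the (possibly unaligned)
 * offset of the faulting address inside the huge page. Returns the
 * index of the subpage that was cleared through the cache.
 */
static size_t clear_huge_page_sketch(unsigned char *hpage, size_t fault_offset)
{
	size_t target, i;

	assert(fault_offset < HPAGE_SIZE);
	target = fault_offset / SUBPAGE_SIZE;

	/* Everything except the faulting subpage: avoid polluting caches. */
	for (i = 0; i < HPAGE_NR; i++) {
		if (i != target)
			clear_subpage_nocache(hpage + i * SUBPAGE_SIZE);
	}

	/* Faulting subpage last, so it is cache-hot for the faulting task. */
	clear_subpage_cached(hpage + target * SUBPAGE_SIZE);
	return target;
}
```

Clearing the faulting subpage last is a choice made for this sketch: it
leaves the lines the task is about to touch as the most recently written
ones.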
It would be nice to test the patchset with more workloads, especially if
you see a performance regression with THP.
Any feedback is appreciated.
Andi Kleen (6):
  THP: Use real address for NUMA policy
  mm: make clear_huge_page tolerate non aligned address
  THP: Pass real, not rounded, address to clear_huge_page
  x86: Add clear_page_nocache
  mm: make clear_huge_page cache clear only around the fault address
  x86: switch the 64bit uncached page clear to SSE/AVX v2