Re: [PATCH v4 00/13] x86/mm: Add multi-page clearing

From: Ankur Arora
Date: Mon Jun 16 2025 - 14:26:08 EST



Dave Hansen <dave.hansen@xxxxxxxxx> writes:

> On 6/15/25 22:22, Ankur Arora wrote:
>> This series adds multi-page clearing for hugepages, improving on the
>> current page-at-a-time approach in two ways:
>>
>> - amortizes the per-page setup cost over a larger extent
>> - when using string instructions, exposes the real region size to the
>> processor. A processor could use that as a hint to optimize based
>> on the full extent size. AMD Zen uarchs, as an example, elide
>> allocation of cachelines for regions larger than L3-size.
>
> Have you happened to do any testing outside of 'perf bench'?

Yeah. My original tests were with qemu creating a pinned guest (where it
would go and touch pages after allocation.)

I think perf bench is a reasonably good test because a lot of demand
faulting just boils down to the same kind of loop. And of course
MAP_POPULATE is essentially equivalent to the clearing loop in the kernel.
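
To make the comparison concrete, here is a rough userspace sketch of the
two paths I mean. This is not the perf bench code; the helper names
(map_anon, touch_pages) are made up for illustration, and whether the
region ends up backed by hugepages depends on the THP configuration,
which the sketch does not control.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map len bytes of private anonymous memory. extra_flags is either 0
 * (pages are faulted in lazily) or MAP_POPULATE (the kernel prefaults
 * the whole range inside mmap()). Returns NULL on failure.
 */
void *map_anon(size_t len, int extra_flags)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS | extra_flags, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Demand-fault path: write one byte at each base-page offset. The first
 * write to an unpopulated page faults, and the kernel hands back a
 * zeroed page. Returns the number of offsets written.
 */
size_t touch_pages(void *buf, size_t len)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile char *p = buf;
	size_t n = 0;

	assert(page > 0);
	for (size_t off = 0; off < len; off += (size_t)page, n++)
		p[off] = 1;

	return n;
}
```

Calling map_anon(len, 0) followed by touch_pages() exercises the fault
path, while map_anon(len, MAP_POPULATE) pushes all of the clearing into
the mmap() call itself. Either way the time is dominated by the kernel
zeroing pages.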

I'm happy to try other tests if you have some in mind.

And, thanks for the quick comments!

--
ankur