Re: [PATCH] perf: optimize clear page in Intel specified model with movq instruction

From: Borislav Petkov
Date: Thu Sep 09 2021 - 05:39:57 EST


On Thu, Sep 09, 2021 at 04:45:51PM +0800, Jinhua Wu wrote:
> Clearing a page is the most time-consuming procedure in page fault handling.
> The kernel uses the fast-string instruction to clear a page. We found that on
> specific Intel models such as CPX and ICX, the movq instruction performs much
> better than the fast-string instruction when the corresponding page is not in
> cache. But when the page is in cache, fast-string performs better. We show the
> test results in the following:

What you should do is show the extensive tests you've run with
real-world benchmarks where you really can show 40% performance
improvement.

Also, the static branch "approach" you're using ain't gonna happen. If
anything, another X86_FEATURE_* bit.
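
To make the feature-bit idea concrete, here is a rough userspace sketch,
not kernel code. Every name in it is hypothetical: a plain bool stands in
for an X86_FEATURE_* style bit, a loop of 64-bit stores stands in for the
movq variant, and memset() stands in for the fast-string variant. As far
as I recall, the kernel picks among its clear_page variants by patching
alternatives keyed on such bits, so treat the runtime "if" below purely
as an illustration of selecting one variant from a flag set at startup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical stand-in for an X86_FEATURE_* style bit, set once at startup. */
static bool sketch_prefer_movq_clear;

/* Plain 64-bit stores: a portable stand-in for an unrolled movq loop.
 * The page must be at least 8-byte aligned. volatile keeps the compiler
 * from turning the loop back into a memset() call. */
static void sketch_clear_page_movq(void *page)
{
	volatile uint64_t *p = page;
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE / sizeof(uint64_t); i++)
		p[i] = 0;
}

/* memset() as a stand-in for the fast-string (rep stos) path. */
static void sketch_clear_page_string(void *page)
{
	memset(page, 0, SKETCH_PAGE_SIZE);
}

/* Single entry point: callers never need to know which variant runs. */
static void sketch_clear_page(void *page)
{
	assert(page != NULL);

	if (sketch_prefer_movq_clear)
		sketch_clear_page_movq(page);
	else
		sketch_clear_page_string(page);
}

/* Returns true if every byte of the page is zero. */
static bool sketch_page_is_zero(const void *page)
{
	const unsigned char *b = page;
	size_t i;

	for (i = 0; i < SKETCH_PAGE_SIZE; i++)
		if (b[i])
			return false;
	return true;
}
```

A caller would set sketch_prefer_movq_clear once, based on the CPU model,
and then call sketch_clear_page() everywhere. Whether that wins anything
still has to be shown with real benchmarks, not with a sketch like this.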

Good luck.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette