RE: [PATCH RFC 2/2] [x86] Optimize copy_page by re-arranginginstruction sequence and saving register
From: Ma, Ling
Date: Mon Oct 15 2012 - 01:07:32 EST
Thanks Boris!
So the patch is helpful and no impact for other/older machines,
I will re-send new version according to comments.
Any further comments are appreciated!
Regards
Ling
> -----Original Message-----
> From: Borislav Petkov [mailto:bp@xxxxxxxxx]
> Sent: Sunday, October 14, 2012 6:58 PM
> To: Ma, Ling
> Cc: Konrad Rzeszutek Wilk; mingo@xxxxxxx; hpa@xxxxxxxxx;
> tglx@xxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; iant@xxxxxxxxxx;
> George Spelvin
> Subject: Re: [PATCH RFC 2/2] [x86] Optimize copy_page by re-arranging
> instruction sequence and saving register
>
> On Fri, Oct 12, 2012 at 08:04:11PM +0200, Borislav Petkov wrote:
> > Right, so benchmark shows around 20% speedup on Bulldozer but this is
> > a microbenchmark and before pursue this further, we need to verify
> > whether this brings any palpable speedup with a real benchmark, I
> > don't know, kernbench, netbench, whatever. Even something as boring
> as
> > kernel build. And probably check for perf regressions on the rest of
> > the uarches.
>
> Ok, so to summarize, on AMD we're using REP MOVSQ which is even faster
> than the unrolled version. I've added the REP MOVSQ version to the
> Âbenchmark. It nicely validates that we're correctly setting
> X86_FEATURE_REP_GOOD on everything >= F10h and some K8s.
>
> So, to answer Konrad's question: those patches don't concern AMD
> machines.
>
> Thanks.
>
> --
> Regards/Gruss,
> Boris.
èº{.nÇ+·®+%Ëlzwm
ébëæìr¸zX§»®w¥{ayºÊÚë,j¢f£¢·hàz¹®w¥¢¸¢·¦j:+v¨wèjØm¶ÿ¾«êçzZ+ùÝj"ú!¶iOæ¬z·vØ^¶m§ÿðÃnÆàþY&