RE: [PATCH] x86/copy_user_generic: Optimize copy_user_generic with CPU erms feature

From: Yu, Fenghua
Date: Thu May 24 2012 - 22:47:20 EST


> From: David Miller [mailto:davem@xxxxxxxxxxxxx]
> Sent: Thursday, May 24, 2012 6:50 PM
> From: "Fenghua Yu" <fenghua.yu@xxxxxxxxx>
> Date: Thu, 24 May 2012 18:19:45 -0700
>
> > According to Intel 64 and IA-32 SDM and Optimization Reference
> > Manual, beginning with Ivybridge, REP string operation using MOVSB
> > and STOSB can provide both flexible and high-performance REP string
> > operations in cases like memory copy. Enhancement availability is
> > indicated by CPUID.7.0.EBX[9] (Enhanced REP MOVSB/STOSB).
>
> How does the cpu do overlap detection?
>
> If the cpu does overlap detection on sub-pagesize bits, performance
> will unnecessarily suffer under such circumstances.

Are you talking about memory overlap between the source and destination? There is no overlap between these two areas in the copy_user case because one area is in user space and the other is in kernel space.

In the overlapping case, it's software that detects the overlap and sets up a backward copy. I don't see any performance degradation for backward rep movsb in my measurements.
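
To illustrate the software-side check, here is a minimal sketch in plain C. It is not the kernel's actual code, and needs_backward_copy and copy_overlap_safe are hypothetical names used only for this example. A forward copy is unsafe only when the destination starts inside the source range, so that is the single case where the copy runs backward.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 when a forward copy would overwrite
 * source bytes before they are read, i.e. dst lies inside [src, src+len). */
static int needs_backward_copy(const void *dst, const void *src, size_t len)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    return d > s && d - s < len;
}

/* Hypothetical memmove-style copy that picks the direction in software. */
static void *copy_overlap_safe(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if (needs_backward_copy(dst, src, len)) {
        /* Backward copy: move the last byte first. */
        while (len--)
            d[len] = s[len];
    } else {
        /* Forward copy: safe when the areas are disjoint or dst < src. */
        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    }
    return dst;
}
```

The byte loops keep the sketch portable. An x86 implementation could instead set the direction flag (std) and use rep movsb for the backward path, then clear the flag again (cld).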

Thanks.

-Fenghua