Re: [PATCH] x86/copy_user_generic: Optimize copy_user_generic with CPU erms feature

From: David Miller
Date: Thu May 24 2012 - 23:21:23 EST


From: "Yu, Fenghua" <fenghua.yu@xxxxxxxxx>
Date: Fri, 25 May 2012 02:47:22 +0000

> Are you talking about memory overlap between source and destination?
> There is no overlap between these two areas in copy_user case
> because one area is in user space and another one is in kernel
> space.
>
> In overlap case, it's software that detects overlap and sets
> backward copy. I don't see backward rep movsb performance
> degradation from my measurement.

We have been told repeatedly in the past that the string instructions,
for compatibility with the defined semantics of the instruction, only
check the lowest bits when determining source and destination overlap.

So even if bits 12 and higher in the virtual address are different, it
is the address bits below bit 12 that determine overlap. And if this
overlap check triggers, the slow path is taken inside of the cpu.

This means that the impossibility of virtual address overlap, which
you mention, is irrelevant, because it is the untranslated address
bits (the offset within the page, below bit 12) which the cpu uses
for overlap detection.
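
To make the claim above concrete, here is a rough sketch in C of the
difference between a check on full addresses and a check on only the
low 12 bits. It is a model of the behaviour as described in this mail,
not documented CPU logic. The function names, the constants and the
exact form of the comparison are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, so bits 0..11 are the page offset. */
#define PAGE_OFFSET_BITS 12u
#define PAGE_OFFSET_MASK ((UINT64_C(1) << PAGE_OFFSET_BITS) - 1)

/*
 * Hypothetical helper: forward-copy hazard on full addresses.
 * True only when dst lies strictly inside (src, src + len).
 * Unsigned wraparound makes dst < src yield a huge delta.
 */
static inline bool full_address_hazard(uint64_t src, uint64_t dst, size_t len)
{
	uint64_t delta = dst - src;

	return delta != 0 && delta < len;
}

/*
 * Hypothetical helper: the same test, but looking only at the
 * distance between the two addresses modulo the page size. This
 * models "only the bits below bit 12 are compared"; real hardware
 * may compare differently.
 */
static inline bool low_bits_hazard(uint64_t src, uint64_t dst, size_t len)
{
	uint64_t delta = (dst - src) & PAGE_OFFSET_MASK;

	return delta != 0 && delta < len;
}
```

With a kernel source at 0xffff880000001000, a user destination at
0x00007f0000001040 and a length of 256, full_address_hazard() is false
(the buffers are nowhere near each other) while low_bits_hazard() is
true, because the page offsets are only 0x40 apart. Under this model
such a copy would take the slow path even though the two areas cannot
overlap.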
