Re: [PATCH RFC] [X86] performance improvement for memcpy_64.S by fast string.

From: Andi Kleen
Date: Mon Nov 09 2009 - 04:26:24 EST


"H. Peter Anvin" <hpa@xxxxxxxxx> writes:
>
> My personal opinion is that if we can show no significant slowdown on
> P4, K8, P-M/Core 1, Core 2, and Nehalem then we can simply use this code

The issue is Core 2.

P4 uses a different path, and Core 1 doesn't use the 64-bit code.

> unconditionally. If one of them is radically worse than baseline, then
> we have to do something conditional, which is a lot more complicated.

I have an older patchkit which did this, and some more optimizations
to this code.

There was still one open issue, which is why I didn't post it. If there's
interest, I can post it.

-Andi
--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.