Re: [PATCH RFC] [x86] Optimize copy-page by reducing impact from HWprefetch

From: Ingo Molnar
Date: Thu Jun 23 2011 - 03:05:05 EST

* Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:

> writes:
> > impact(DCU prefetcher), and simplify original code. The
> > performance is improved about 15% on core2, 36% on snb
> > respectively. (We use our micro-benchmark, and will do further
> > test according to your requirement)
> This doesn't make a lot of sense because neither Core-2 nor SNB uses
> the code path you patched. They both use the rep ; movs path

Ling, mind double checking which one is the faster/better one on SNB,
in cold-cache and hot-cache situations, copy_page or copy_page_c?
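
A rough userspace sketch of such a hot/cold comparison, for
illustration only. Everything in it is an assumption: the two copy
routines are hypothetical stand-ins, not the kernel's copy_page or
copy_page_c; the 64 MB eviction buffer is assumed to exceed the
last-level cache; and clock_gettime() is coarse for a single 4K copy,
so only averages over many rounds mean anything. The compiler may also
turn the open-coded loop into a memcpy call at higher optimization
levels.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define PAGE_BYTES	4096
#define EVICT_SIZE	(64u << 20)	/* assumption: larger than the LLC */

typedef void (*copy_fn)(void *to, const void *from);

/* Hypothetical stand-in #1: whatever copy libc selects. */
static void copy_page_libc(void *to, const void *from)
{
	memcpy(to, from, PAGE_BYTES);
}

/* Hypothetical stand-in #2: open-coded loop, 64 bytes per iteration. */
static void copy_page_unrolled(void *to, const void *from)
{
	uint64_t *d = to;
	const uint64_t *s = from;
	size_t i;

	for (i = 0; i < PAGE_BYTES / sizeof(*d); i += 8) {
		d[i + 0] = s[i + 0];
		d[i + 1] = s[i + 1];
		d[i + 2] = s[i + 2];
		d[i + 3] = s[i + 3];
		d[i + 4] = s[i + 4];
		d[i + 5] = s[i + 5];
		d[i + 6] = s[i + 6];
		d[i + 7] = s[i + 7];
	}
}

static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Touch one byte per 64-byte line of a large buffer to displace cached data. */
static void evict_caches(volatile unsigned char *buf)
{
	size_t i;

	for (i = 0; i < EVICT_SIZE; i += 64)
		buf[i]++;
}

/* Average nanoseconds per page copy; cold != 0 evicts before each round. */
static double time_copy(copy_fn fn, int cold, int rounds)
{
	void *src, *dst;
	unsigned char *evict = calloc(1, EVICT_SIZE);
	uint64_t total = 0;
	int i;

	if (!evict || posix_memalign(&src, PAGE_BYTES, PAGE_BYTES) ||
	    posix_memalign(&dst, PAGE_BYTES, PAGE_BYTES))
		abort();
	memset(src, 0x5a, PAGE_BYTES);
	memset(dst, 0, PAGE_BYTES);

	for (i = 0; i < rounds; i++) {
		uint64_t t0;

		if (cold)
			evict_caches(evict);	/* push src/dst out of cache */
		else
			fn(dst, src);		/* warm both pages */
		t0 = now_ns();
		fn(dst, src);
		total += now_ns() - t0;
		assert(memcmp(dst, src, PAGE_BYTES) == 0);
	}
	free(evict);
	free(src);
	free(dst);
	return rounds > 0 ? (double)total / rounds : 0.0;
}
```

Calling time_copy(fn, 0, N) and time_copy(fn, 1, N) for each routine
gives the four numbers to compare. An in-kernel measurement of the
real routines would of course be the authoritative one.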

Also, while looking at this file, please fix the countless style
problems it has before modifying it:

- non-Linux comment style (and two comments where one would do -
they can go in one comment block):

/* Don't use streaming store because it's better when the target
ends up in cache. */

/* Could vary the prefetch distance based on SMP/UP */

- (there are other non-standard comment blocks in this file as well)

- The copy_page/copy_page_c naming is needlessly obfuscated, it
should be copy_page, copy_page_norep or so - the _c postfix has no
obvious meaning.

- all #include lines should be at the top

- please standardize it on the 'instrn %x, %y' pattern that we
generally use in arch/x86/, not the 'instrn %x,%y' pattern.

Please do this cleanup patch first and the speedup on top of it, and
keep the two in separate patches so that the modification to the
assembly code can be reviewed more easily.
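
To make the operand-spacing point concrete, here is a small throwaway
check, sketched under assumptions: operands_spaced() is a hypothetical
helper, not an existing kernel tool, and it looks at one AT&T-syntax
instruction line at a time. Commas inside parentheses, as in
(%rsi,%rcx,8), are treated as part of a memory operand and left alone;
a trailing comment containing commas would confuse it.

```c
#include <stdbool.h>

/*
 * Return true if every operand-separating comma on the line is
 * followed by whitespace, i.e. the line uses 'instrn %x, %y' rather
 * than 'instrn %x,%y'. Commas nested inside parentheses are skipped.
 */
static bool operands_spaced(const char *line)
{
	int depth = 0;

	for (; *line; line++) {
		if (*line == '(')
			depth++;
		else if (*line == ')' && depth > 0)
			depth--;
		else if (*line == ',' && depth == 0 &&
			 line[1] != ' ' && line[1] != '\t')
			return false;
	}
	return true;
}
```

With this, "movq %rbx, (%rsp)" passes and "movq %rbx,(%rsp)" does not.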

