Re: [PATCH] Athlon/Opteron Prefetch Fix for 2.6.0test5 + numbers

From: Nick Piggin
Date: Tue Sep 16 2003 - 23:55:14 EST

Next message: NIWA Hideyuki: "Re: [RFC] Class-based Kernel Resource Management"
Previous message: David S. Miller: "Re: Incremental update of TCP Checksum"
In reply to: Andi Kleen: "Re: [PATCH] Athlon/Opteron Prefetch Fix for 2.6.0test5 + numbers"
Next in thread: Andrew Morton: "Re: [PATCH] Athlon/Opteron Prefetch Fix for 2.6.0test5 + numbers"

Andrew Morton wrote:

> Andi Kleen <ak@xxxxxxx> wrote:
>
>> This is much more efficient than the previous workaround used in the kernel,
>> which checked for AMD CPUs in every prefetch(). This can be seen in the size
>> of the vmlinux:
>
> That is hardly a serious comparison: the workaround is just to stop the
> oopses while this gets sorted out. It makes no pretense at either
> efficiency or permanence.
>
>> Without patch:
>>    text    data     bss     dec     hex filename
>> 4020232  665956  169092 4855280  4a15f0 vmlinux
>> With patch:
>> 4011578  665973  169092 4846643  49f433
>
> hrm. Why did data grow?
>
>> With prefetch check: 3.7268 microseconds
>> Without prefetch check: 3.65945 microseconds
>
> We don't know how much of this difference is due to removing the branch and
> how much is due to reenabling prefetch.
>
> It would be interesting to see comparative benchmarking between prefetch
> and no prefetch at all, see whether this feature is worth its icache
> footprint.

The test was on a Pentium 4, so it's only removing the extra code.
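
To put the quoted figures side by side, here is a small sketch in C that computes the section size deltas and the relative change in page fault time. The function names are made up for illustration; only the numbers come from this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Relative change from `before` to `after`, in percent (negative = smaller). */
double percent_change(double before, double after)
{
    assert(before != 0.0);
    return (after - before) / before * 100.0;
}

/* Signed byte difference between two section sizes. */
long size_delta(long before, long after)
{
    return after - before;
}

/* Print the deltas for the vmlinux sizes and timings quoted in the thread. */
void print_deltas(void)
{
    printf("text:       %+ld bytes\n", size_delta(4020232, 4011578));
    printf("data:       %+ld bytes\n", size_delta(665956, 665973));
    printf("total:      %+ld bytes\n", size_delta(4855280, 4846643));
    printf("page fault: %+.2f%%\n", percent_change(3.7268, 3.65945));
}
```

By these numbers text shrinks by 8654 bytes while data grows by 17 bytes, and the page fault time differs by a little under 2%.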

I think Andi's patch is required (especially because it fixes
userspace), and under the current cpu selection scheme, it is
implemented correctly (although I am now at a loss as to what the
generic thing is for).

The conditional compilation thing is a separate issue. This patch may
have just broken a few camels' backs.
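
For readers unfamiliar with what conditional compilation would mean here, a rough sketch follows. It assumes the workaround is a check in the page fault path that recognizes a faulting prefetch instruction. The macro and function names are hypothetical, this is not the code from Andi's patch, and the decoder is deliberately simplified (it ignores instruction prefixes).

```c
#include <stdbool.h>

/*
 * HYPOTHETICAL build-time switch, not a real kernel config option.
 * Set to 0 when the kernel is built only for CPUs other than K7/K8.
 */
#ifndef DEMO_CPU_MAY_BE_K7_K8
#define DEMO_CPU_MAY_BE_K7_K8 1
#endif

/*
 * Simplified stand-in for an instruction check: 0x0F 0x0D is the opcode
 * of the 3DNow! PREFETCH/PREFETCHW instructions and 0x0F 0x18 that of
 * the SSE PREFETCHh instructions. Prefix bytes are not handled here.
 */
bool demo_insn_is_prefetch(const unsigned char *insn)
{
    return insn[0] == 0x0F && (insn[1] == 0x0D || insn[1] == 0x18);
}

/*
 * Decide whether a fault may be silently ignored. With the switch off,
 * the check compiles away and costs nothing on other CPUs.
 */
bool demo_fault_can_be_ignored(const unsigned char *insn)
{
#if DEMO_CPU_MAY_BE_K7_K8
    return demo_insn_is_prefetch(insn);
#else
    (void)insn;
    return false;
#endif
}
```

The trade-off under discussion is exactly this: a kernel built with the switch off avoids the extra code in the fault path, but is then wrong on the affected CPUs.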

What is intriguing to me is the "It's only a 2% slowdown of the page
fault for every cpu other than K[78] for this single workaround. There
is no point to conditional compilation" attitude some people have.
Of course, it's only 2% on a pagefault, not anywhere near 2% of kernel
performance as a whole, so maybe that is justified.
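
The gap between "2% on a pagefault" and "2% of kernel performance" can be made concrete with a little arithmetic. The sketch below assumes a workload spends some known fraction of its time in the page fault path; that fraction is a made-up input, not a measurement from this thread.

```c
#include <assert.h>

/*
 * If a fraction `fault_fraction` (0..1) of total run time is spent in the
 * page fault path and that path gets `fault_slowdown_pct` percent slower,
 * total run time grows by fault_fraction * fault_slowdown_pct percent:
 *   new_time = (1 - f) + f * (1 + s) = 1 + f * s
 */
double overall_slowdown_pct(double fault_fraction, double fault_slowdown_pct)
{
    assert(fault_fraction >= 0.0 && fault_fraction <= 1.0);
    return fault_fraction * fault_slowdown_pct;
}
```

For example, with a purely hypothetical 5% of run time spent in page faults, overall_slowdown_pct(0.05, 2.0) gives an overall slowdown of about 0.1%.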

Just repeating though, that is a separate issue and I think Andi's patch
is needed.
