Re: touch_cache() only touches two thirds

From: Andi Kleen
Date: Fri Nov 10 2006 - 19:24:17 EST



> The corrected code in <http://bugzilla.kernel.org/show_bug.cgi?id=7476#c4>
> covers the full cache range. Granted that modern CPUs may be able to track
> multiple simultaneous cache access streams: how many such streams are they
> likely to be able to follow at once? It seems like going from 1 to 2 would
> be a big win, 2 to 3 a small win, beyond that it wouldn't likely make much
> incremental difference. So what do the actual implementations in the field
> support?

I remember reading at some point that a POWER4 could track at least 5 parallel
streams. I don't know how many K8 handles, but it is more than one at least
(forward and backward).

I don't have more data, but at least the newer Intel CPUs also seem to be
very good at prefetching, and when you look at a die photo the load/store
unit in general is quite big. Handling more than 6 streams is certainly a
possibility.

I guess it could be figured out with some clever benchmarking.
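
A rough userspace sketch of such a benchmark, for illustration only. This is
not the kernel's touch_cache() and not the patch from bugzilla; the names
touch_streams() and time_streams() and the 64-byte LINE_SIZE are assumptions
made up for this example. The buffer is split into `ways' contiguous chunks
and one line is read from each chunk per iteration, so the prefetcher sees
`ways' sequential streams at once. The per-chunk length is rounded up so the
tail of the buffer is not skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define LINE_SIZE 64 /* assumed cache line size in bytes; varies by CPU */

/*
 * Read one byte from every cache line of buf, using `ways` interleaved
 * sequential streams. Returns the number of lines actually touched, so
 * the caller can check that the whole buffer was covered.
 */
size_t touch_streams(const volatile unsigned char *buf, size_t size,
                     unsigned ways)
{
    assert(ways > 0);

    size_t lines = size / LINE_SIZE;
    /* Round up so a line count not divisible by `ways` is still covered. */
    size_t per_way = (lines + ways - 1) / ways;
    size_t touched = 0;
    unsigned long sink = 0;

    for (size_t i = 0; i < per_way; i++) {
        for (unsigned w = 0; w < ways; w++) {
            size_t line = (size_t)w * per_way + i;
            if (line < lines) {          /* last chunk may be shorter */
                sink += buf[line * LINE_SIZE]; /* volatile read is kept */
                touched++;
            }
        }
    }
    (void)sink;
    return touched;
}

/* Average CPU seconds per full pass over buf with the given stream count. */
double time_streams(const unsigned char *buf, size_t size, unsigned ways,
                    int reps)
{
    clock_t start = clock();
    for (int r = 0; r < reps; r++)
        touch_streams(buf, size, ways);
    return (double)(clock() - start) / CLOCKS_PER_SEC / reps;
}
```

Calling time_streams() on a buffer several times larger than the last-level
cache, for ways = 1, 2, 3, ... and looking for the point where the time per
pass stops improving, should give a hint of how many streams the prefetcher
follows. clock() is coarse, so many repetitions are needed, and results will
depend on the machine.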
>
> The code (original and corrected) uses 6 simultaneous streams.

My gut feeling is that this is not enough.

> I have a modified version that takes a `ways' parameter to use an arbitrary
> number of streams. I'll post that onto bugzilla.kernel.org.

Post it to the list please.

-Andi
