touch_cache() only touches two thirds

From: Bela Lubkin
Date: Wed Nov 08 2006 - 19:56:54 EST


Submitted as <http://bugzilla.kernel.org/show_bug.cgi?id=7476>:

I noticed this bug while looking at something else. These comments are based
purely on source inspection without any runtime observations. The bug
remains in the latest sources I could find: 2.6.19-rc5 and rc5-mm1.

Ingo Molnar's patch referred to as "scheduler cache-hot-auto-tune" introduced
the function kernel/sched.c:touch_cache(). Here is the 2.6.19-rc5-mm1
version (the "forward / backward" comments are mine):

/*
 * Dirty a big buffer in a hard-to-predict (for the L2 cache) way. This
 * is the operation that is timed, so we try to generate unpredictable
 * cachemisses that still end up filling the L2 cache:
 */
static void touch_cache(void *__cache, unsigned long __size)
{
	unsigned long size = __size / sizeof(long);
	unsigned long chunk1 = size / 3;
	unsigned long chunk2 = 2 * size / 3;
	unsigned long *cache = __cache;
	int i;

	for (i = 0; i < size / 6; i += 8) {
		switch (i % 6) {
			case 0: cache[i]++;		/* 1st third, forward */
			case 1: cache[size-1-i]++;	/* 3rd third, backward */
			case 2: cache[chunk1-i]++;	/* 1st third, backward */
			case 3: cache[chunk1+i]++;	/* 2nd third, forward */
			case 4: cache[chunk2-i]++;	/* 2nd third, backward */
			case 5: cache[chunk2+i]++;	/* 3rd third, forward */
		}
	}
}

Notice that the for() loop increments `i' by 8 (earlier versions of the code
used a stride of 4). Since all visited values of i are even, `i % 6' can
never be odd. The switch cases 1/3/5 will never be hit -- so the 3rd third
of the test buffer isn't being touched. Migration costs are actually being
calculated relative to buffers that are only 2/3 as large as intended.
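
To make the effect concrete, here is a tiny stand-alone program (not kernel
code; the buffer size is an arbitrary stand-in) that walks the same loop
bounds and stride and tallies how often each case label is selected -- cases
1, 3 and 5 come out at zero:

#include <stdio.h>

int main(void)
{
	/* arbitrary illustrative size, in longs */
	unsigned long size = 4096;
	unsigned long hits[6] = { 0, 0, 0, 0, 0, 0 };
	unsigned long i;

	/* same loop bounds and stride as touch_cache() */
	for (i = 0; i < size / 6; i += 8)
		hits[i % 6]++;

	for (i = 0; i < 6; i++)
		printf("case %lu: %lu hits\n", i, hits[i]);
	return 0;
}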

An easy fix is to make the stride relatively prime to the modulus:

--- sched.c.orig	2006-11-08 16:17:37.299500000 -0800
+++ sched.c	2006-11-08 16:28:21.699750000 -0800
@@ -5829,5 +5829,5 @@ static void touch_cache(void *__cache, u
 	int i;
 
-	for (i = 0; i < size / 6; i += 8) {
+	for (i = 0; i < size / 6; i += 7) {
 		switch (i % 6) {
 			case 0: cache[i]++;
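
With the stride changed to 7, the visited indices are 0, 7, 14, 21, 28, 35,
42, ..., whose residues mod 6 cycle through 0, 1, 2, 3, 4, 5 and back to 0,
so every case label -- and hence every third of the buffer -- gets selected
in turn. Re-running the little tally program above with `i += 7' shows the
hits spread roughly evenly across all six cases.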

>Bela<

PS: I suspect this at least partially explains Kenneth W Chen's observations
in "Variation in measure_migration_cost() with
scheduler-cache-hot-autodetect.patch in -mm",
<http://lkml.org/lkml/2005/6/21/473>:

> I'm consistently getting a smaller than expected cache migration cost
> as measured by Ingo's scheduler-cache-hot-autodetect.patch currently
> in -mm tree.