Re: 2.6.0-test4 shocking (HT) benchmarking (wrong logic./phys. HT CPU distinction?)

From: bill davidsen
Date: Tue Aug 26 2003 - 14:29:50 EST


In article <200308261552.44541.max@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
<max@xxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:

| The next strange thing is that using FFTW (in a single program) with two or
| four simultaneous FFT-threads on 2.6.0-test4 is significantly *slower* than
| in 2.4, where the hyperthreading/SMP is said to be inferior to 2.6. The
| simulation took almost 50% longer on 2.6, all other things being equal.
| Unfortunately, these results made me go back to 2.4 for the time being.

You may want to run w/o HT at all for your particular problem, or limit
threads to the number of physical CPUs. Having multiple threads trying
to do FFTs on the same physical CPU is likely to thrash the cache, and
will result in contention for the (single) FPU that its logical CPUs
share.
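
As a minimal sketch of "limit threads to the number of physical CPUs":
the helper below counts distinct physical CPUs from /proc/cpuinfo-style
text. The function name is hypothetical, and it assumes the kernel
reports a "physical id" field (and optionally "core id") per logical
CPU; when that field is absent it falls back to the logical CPU count.

```python
def count_physical_cpus(cpuinfo_text):
    """Count distinct (physical id, core id) pairs in cpuinfo-style text."""
    cores = set()
    logical = 0
    # /proc/cpuinfo separates logical CPUs with a blank line.
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" not in fields:
            continue
        logical += 1
        if "physical id" in fields:
            # HT siblings share the same physical id (and core id).
            cores.add((fields["physical id"], fields.get("core id", "0")))
    # Assumption: with no topology fields, treat every logical CPU as physical.
    return len(cores) if cores else logical


# Usage (Linux only): cap the FFT thread count at the physical CPU count.
#   with open("/proc/cpuinfo") as f:
#       nthreads = count_physical_cpus(f.read())
```

The result would then be passed to whatever thread-count setting the
application or FFT library exposes.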

The scheduler would seem to lack the information to make a determination
of the magnitude of the FPU contention, and by making better use of HT
it makes the contention worse.

Perhaps there should be a "don't hyperthread" attribute one could set to
hint that running multiple threads on a single physical CPU is unlikely
to work well. Since there isn't, I don't know a way to avoid the problem
unless you can set maxthreads to Ncpu (the number of physical CPUs).
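
One way to approximate such a hint from user space, offered only as a
sketch: restrict the process to one logical CPU per physical CPU with a
CPU affinity mask. This assumes the sched_setaffinity interface is
available (Python exposes it as os.sched_setaffinity on Linux) and that
/proc/cpuinfo reports "physical id"; neither is mentioned in the
message above, and both function names are hypothetical.

```python
import os


def first_sibling_per_core(cpuinfo_text):
    """Return one logical CPU number for each distinct physical CPU."""
    chosen = {}
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields and "physical id" in fields:
            core = (fields["physical id"], fields.get("core id", "0"))
            # Keep only the first logical CPU seen for each physical CPU.
            chosen.setdefault(core, int(fields["processor"]))
    return sorted(chosen.values())


def pin_to_physical_cpus(cpuinfo_path="/proc/cpuinfo"):
    """Restrict the calling thread to one logical CPU per physical CPU."""
    with open(cpuinfo_path) as f:
        cpus = first_sibling_per_core(f.read())
    # Skip silently when topology is unknown or affinity is unsupported.
    if cpus and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)
    return cpus
```

Calling pin_to_physical_cpus() before the worker threads are created
lets them inherit the mask, so no two of them can land on HT siblings
of the same physical CPU.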
--
bill davidsen <davidsen@xxxxxxx>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.