From: Christoph Lameter
Date: Fri Aug 28 2009 - 15:08:58 EST

On Fri, 28 Aug 2009, Thomas Gleixner wrote:

> That makes sense and should not be rocket science to implement.

I like it and such a thing would do a lot for reducing noise.

However, look at a typical task (from the HPC world) that would be
running on an isolated processor. It would

1. Spin on some memory location waiting for an event.

2. Process data passed to it, prepare output data and then go back to 1.
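
A minimal sketch of that two-step loop, assuming C11 atomics and a
hypothetical shared "mailbox" layout. The names mailbox, post_event,
wait_for_event and process are made up for illustration and do not come
from any existing code; the doubling in process() is only a placeholder
for real work.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mailbox shared between a producer and the isolated task. */
struct mailbox {
	_Atomic uint32_t seq;	/* bumped by the producer when data is ready */
	uint64_t payload;	/* data handed to the isolated task */
};

/* Producer side: publish the data, then signal with a release increment. */
static void post_event(struct mailbox *mb, uint64_t data)
{
	assert(mb != NULL);
	mb->payload = data;
	atomic_fetch_add_explicit(&mb->seq, 1, memory_order_release);
}

/* Step 1: spin on the memory location until seq moves past last_seen.
 * No system call is made, so the OS is not involved in the wakeup. */
static uint32_t wait_for_event(struct mailbox *mb, uint32_t last_seen)
{
	uint32_t cur;

	assert(mb != NULL);
	while ((cur = atomic_load_explicit(&mb->seq,
					   memory_order_acquire)) == last_seen)
		;	/* busy-wait */
	return cur;
}

/* Step 2: process the data (placeholder: double the payload). */
static uint64_t process(const struct mailbox *mb)
{
	return mb->payload * 2;
}
```

The isolated task would call wait_for_event() and process() in an endless
loop, remembering the last sequence number it saw. The acquire load pairs
with the release increment so the payload is visible once the new
sequence number is observed.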

The enticing thing about doing 1 with shared memory and/or InfiniBand is
that it can be done in a few hundred nanoseconds instead of 10-20
microseconds. This allows much faster IPC if we bypass the OS.

For many uses deterministic responses are desired. If the handler that
runs is never disturbed by extraneous processing (IPI, faults, irqs etc)
then we can say that we run at the maximum speed that the machine can run
at. That is what many sites expect.

In an HPC environment synchronization points are essential and the
frequency of synchronization points (where we spin on a cacheline) is
important for the ability to scale the accuracy and the performance of
the algorithm. If we can make N processors operate in a deterministic
fashion on e.g. an array of floating point numbers then the rendezvous
occurs with minimal wait time in each of the N processes. Getting rid
of all sources of interruptions gets us the best performance possible.
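
One way such a rendezvous could look is a sense-reversing spin barrier.
This is a sketch under assumptions, not a description of any particular
HPC runtime: the struct and function names are hypothetical, and it uses
plain C11 atomics rather than a kernel or MPI primitive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spin barrier; ideally on a cacheline of its own. */
struct spin_barrier {
	unsigned nthreads;		/* number of participants (N) */
	_Atomic unsigned waiting;	/* arrivals in the current round */
	_Atomic bool sense;		/* flipped once per completed round */
};

static void spin_barrier_init(struct spin_barrier *b, unsigned n)
{
	assert(n > 0);
	b->nthreads = n;
	atomic_init(&b->waiting, 0);
	atomic_init(&b->sense, false);
}

/* Each participant keeps its own local_sense flag, initially false. */
static void spin_barrier_wait(struct spin_barrier *b, bool *local_sense)
{
	bool my_sense = !*local_sense;

	*local_sense = my_sense;
	if (atomic_fetch_add_explicit(&b->waiting, 1,
				      memory_order_acq_rel) + 1 == b->nthreads) {
		/* Last arrival: reset the count, then release everyone. */
		atomic_store_explicit(&b->waiting, 0, memory_order_relaxed);
		atomic_store_explicit(&b->sense, my_sense,
				      memory_order_release);
	} else {
		/* Everyone else spins until the last arrival flips sense. */
		while (atomic_load_explicit(&b->sense,
					    memory_order_acquire) != my_sense)
			;	/* busy-wait */
	}
}
```

With this scheme the time a process spends in the spin loop is exactly
the time by which it finished ahead of the slowest participant, so any
interruption on one processor shows up as wait time on all the others.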

Right now strong variability often makes it necessary to have long
processing periods and to deal with long wait times because one of the
N processes has not finished yet.
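
To make that cost concrete: if every process waits at the rendezvous for
the slowest one, the time lost per period is the sum of (slowest - own)
over all N processes. A small sketch, with the hypothetical helper name
total_wait and finish times in arbitrary units:

```c
#include <assert.h>
#include <stddef.h>

/* Total time spent spinning at one rendezvous, given the time at which
 * each of the n processes finished its processing period. */
static double total_wait(const double *finish, size_t n)
{
	double slowest, wasted = 0.0;
	size_t i;

	assert(finish != NULL && n > 0);

	/* The rendezvous completes when the slowest process arrives. */
	slowest = finish[0];
	for (i = 1; i < n; i++)
		if (finish[i] > slowest)
			slowest = finish[i];

	/* Every other process spins for the difference. */
	for (i = 0; i < n; i++)
		wasted += slowest - finish[i];

	return wasted;
}
```

A single delayed process therefore costs roughly (N - 1) times its delay
in aggregate, which is why one stray interruption hurts more as N grows.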
