Re: Better lmbench 2.2.14pre4 vs. 2.3.28

Larry McVoy (lm@bitmover.com)
Fri, 19 Nov 1999 18:42:38 -0800

On Sat, Nov 20, 1999 at 02:27:47AM +0000, Linus Torvalds wrote:
> In article <Pine.LNX.4.10.9911161509270.4535-100000@chiara.csoma.elte.hu>,
> Ingo Molnar <mingo@chiara.csoma.elte.hu> wrote:
> >
> >i'm quite sure these are due to the scheduling heuristics removed in 2.3 -
> >we are running two lmbench processes on two CPUs, which causes excessive
> >cross-CPU (and main memory) traffic. Will think about how to solve this
> >cleanly.
>
> I know how to solve it right, but "cleanly"?
>
> The problem is that certain events are really of the type "wake up
> somebody else, and then I will go to sleep".
>
> This is true of the pipe case, for example: we've filled the write
> buffer, and we want to wake up the reader and go to sleep (or we're the
> reader and we've emptied the buffer and want to go to sleep).
>
> In that case, all of our current clever "select a good CPU to schedule
> for the wakeup" is just so much crap: we should not waste precious CPU
> cycles doing that, considering that the current thread is "synchronous"
> wrt the event in question anyway, and we're probably a lot better off
> just trying to do a nice clean single-CPU re-schedule from the producer
> to the consumer (or back).
>
> In short, what we really want is a "synchronous wakeup": a wakeup that
> does not imply the "search for a good CPU" heuristic.
>
> The problem is that unlike the "WAKEONE" thing, where it is the
> _sleeper_ that wants to say that it's a special kind of sleeper, in this
> case it's the person who does the wakeup() that wants to tell the world
> that it's a special kind of waker that wants to wake people up in order
> for them to do some more work for it. And we don't have a very nice way
> of doing that right now, I'm afraid.

I'm not sure that you can. My experiences with all the special scheduling
hacks in IRIX convinced me that less is more. It is probably far better to
do this:

- put processes on the same CPU where they are forked into existence
- move a process only when a CPU is very busy for more than just one
timeslice
- perhaps move entire process groups
- provide a "runon CPU#" interface which can bind a process to a CPU -
both syscall level and command level are useful

rather than this:

- try to intuit the best thing at each scheduling event.

SGI did the hard work at almost every scheduling event, and their simple
context switch cases are somewhere between 10x and 100x slower than Linux's.
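
To make the simpler policy listed above concrete, here is a minimal sketch
in C. It is a user-space toy model, not kernel code, and every name in it
(struct task, fork_task, runon, account_slice, pick_cpu,
BUSY_SLICES_BEFORE_MOVE) is hypothetical. It assumes that "very busy for
more than just one timeslice" means two or more consecutive overloaded
timeslices, and that a forked child inherits its parent's binding. Moving
whole process groups is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; an illustration, not kernel code. */
#define NO_BINDING (-1)
#define BUSY_SLICES_BEFORE_MOVE 2   /* assumed meaning of "more than one timeslice" */

struct cpu_state {
    int busy_slices;   /* consecutive timeslices this CPU was overloaded */
};

struct task {
    int cpu;           /* CPU the task currently runs on */
    int bound_cpu;     /* NO_BINDING, or a CPU chosen through runon() */
};

/* A child starts on the CPU where it was forked (binding inherited by assumption). */
struct task fork_task(const struct task *parent)
{
    struct task child = { .cpu = parent->cpu, .bound_cpu = parent->bound_cpu };
    return child;
}

/* "runon CPU#" style binding: pin the task and place it on that CPU. */
void runon(struct task *t, int cpu, int ncpus)
{
    assert(cpu >= 0 && cpu < ncpus);
    t->bound_cpu = cpu;
    t->cpu = cpu;
}

/* Record one elapsed timeslice; any non-overloaded slice resets the streak. */
void account_slice(struct cpu_state *c, bool overloaded)
{
    c->busy_slices = overloaded ? c->busy_slices + 1 : 0;
}

/* Choose the CPU for the task's next run and remember the choice. */
int pick_cpu(struct task *t, const struct cpu_state *cpus, int ncpus)
{
    assert(ncpus > 0 && t->cpu >= 0 && t->cpu < ncpus);

    if (t->bound_cpu != NO_BINDING)
        return t->cpu = t->bound_cpu;          /* a bound task never migrates */

    if (cpus[t->cpu].busy_slices < BUSY_SLICES_BEFORE_MOVE)
        return t->cpu;                         /* stay put, keep the cache warm */

    /* Sustained overload: move to the CPU with the shortest busy streak. */
    int best = t->cpu;
    for (int i = 0; i < ncpus; i++)
        if (cpus[i].busy_slices < cpus[best].busy_slices)
            best = i;
    t->cpu = best;
    return best;
}
```

The point of the sketch is that pick_cpu() does almost nothing in the common
case: a task stays where it is unless its CPU has been overloaded for several
slices in a row, so the per-event cost is a couple of comparisons rather than
a search for the "best" CPU.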

--- Larry McVoy lm@bitmover.com http://www.bitmover.com/lm
