Re: CFS: some bad numbers with Java/database threading [FIXED]

From: Peter Zijlstra
Date: Wed Sep 19 2007 - 18:00:09 EST


On Wed, 19 Sep 2007 23:41:05 +0200 Ingo Molnar <mingo@xxxxxxx> wrote:

>
> * Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> > Btw, the "enqueue at the end" could easily be a statistical thing, not
> > an absolute thing. So it's possible that we could perhaps implement
> > the CFS "yield()" using the same logic as we have now, except *not*
> > calling the "update_stats()" stuff:
> >
> > __dequeue_entity(..);
> > __enqueue_entity(..);
> >
> > and then just force the "fair_key" of the to something that
> > *statistically* means that it should be at the end of its nice queue.
> >
> > I dunno.
>
> i thought a bit about the statistical approach, and it's good in
> principle, but it has an implementational problem/complication: if there
> are only yielding tasks in the system, then the "queue rightwards in the
> tree, statistically" approach cycles through the key-space artificially
> fast. That can cause various problems. (this means that the
> workload-flag patch that uses yield_granularity is buggy as well. The
> queue-rightmost patch did not have this problem.)
>
> So right now there are only two viable options i think: either do the
> current weak thing, or do the rightmost thing. The statistical method
> might work too, but it needs more thought and more testing - i'm not
> sure we can get that ready for 2.6.23.
>
> So what we have as working code right now is the two extremes, and apps
> will really mostly prefer either the first (if they dont truly want to
> use yield but somehow it got into their code) or the second (if they
> want some massive delay). So while it does not have a good QoI, how
> about doing a compat_yield sysctl that allows the turning on of the
> "queue rightmost" logic? Find tested patch below.
>
> Peter, what do you think?

I have to agree that for .23 we can't do much more than this. And tasks
moving to the right without actually doing work and advancing
fair_clock do scare me a little.

Also, while I agree with Linus' definition of sched_yield, I'm afraid
it will cause 'regressions' for all the interactivity people out there.
Somehow this yield thing has made it into all sorts of unfortunate
places like video drivers, so a heavy penalizing yield will hurt them.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/