Re: CFS: some bad numbers with Java/database threading [FIXED]

From: Ingo Molnar
Date: Wed Sep 19 2007 - 15:57:20 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> > Linus, what do you think? I have no strong feelings, I think the
> > patch cannot hurt (it does not change anything by default) - but we
> > should not turn the workaround flag on by default.
>
> I disagree. I think CFS made "sched_yield()" worse, and what you call
> "bug workaround" is likely the *better* behaviour.
>
> The fact is, sched_yield() is not - and should not be - about
> "recalculating the position in the scheduler queue" like you do now in
> CFS.
>
> It very much is about moving the thread *dead last* within its
> priority group.
[...]
> and quite frankly, the current CFS behaviour simply looks buggy. It
> should simply not move it to the "right place" in the rbtree. It
> should move it *last*.

ok, we can do that.

the O(1) implementation of yield() was pretty arbitrary: it did not move
it last on the same priority level - it only did it within the active
array. So expired tasks (such as CPU hogs) would come _after_ a
yield()-ing task.

so the yield() implementation was so much tied to the data structures of
the O(1) scheduler that it was impossible to fully emulate it in CFS.

in CFS we dont have a per-nice-level rbtree, so we cannot move it dead
last within the same priority group - but we can move it dead last in
the whole tree. (then they'd be put even after nice +19 tasks.) People
might complain about _that_.

another practical problem is that this will break certain desktop apps
that do calls to yield() [some firefox plugins do that, some 3D apps do
that, etc.] but they dont expect to be moved 'very late' into the queue
- they expect the O(1) scheduler's behavior of being delayed "a bit".
(That's why i added the yield-granularity tunable.)

we can make yield super-agressive, that is pretty much the only sane
(because well-defined) thing to do (besides turning yield into a NOP),
but there will be lots of regression reports about lost interactivity
during load. sched_yield() is a mortally broken API. "fix the app" would
be the answer, but still there will be lots of complaints.

Ingo
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/