Re: sched_yield: delete sysctl_sched_compat_yield

From: Zhang, Yanmin
Date: Thu Nov 29 2007 - 22:16:59 EST


On Fri, 2007-11-30 at 13:46 +1100, Nick Piggin wrote:
> On Wednesday 28 November 2007 09:57, Arjan van de Ven wrote:
> > On Tue, 27 Nov 2007 17:33:05 +0800
> > "Zhang, Yanmin" <yanmin_zhang@xxxxxxxxxxxxxxx> wrote:
> > > If I echo "1" > /proc/sys/kernel/sched_compat_yield before starting
> > > volanoMark testing, the result is very good with kernel 2.6.24-rc3 on
> > > my 16-core tigerton.
> > >
> > > 1) With /proc/sys/kernel/sched_compat_yield=1, compared with 2.6.22,
> > > 2.6.24-rc3 shows more than 70% improvement;
> > > 2) With /proc/sys/kernel/sched_compat_yield=0, compared with 2.6.22,
> > > 2.6.24-rc3 shows more than 80% regression.
> > >
> > > On other machines, the volanoMark result also improves considerably if
> > > /proc/sys/kernel/sched_compat_yield=1.
> > >
> > > Would you like to change function yield_task_fair to delete the code
> > > around sysctl_sched_compat_yield, or just initialize it to 1?
> >
> > sounds like a bad idea; volanomark (well, technically the jvm behind
> > it) is abusing sched_yield() by assuming it does something it really
> > doesn't do, and as it happens some of the earlier 2.6 schedulers
> > accidentally happened to behave in a way that was nice for this
> > benchmark.
>
> OK, why is this still happening? Haven't we been asking JVMs to use
> futexes or posix locking for years and years now? Are there any sane
> jvms that _don't_ use yield?
I think it's an issue with volanomark (a Java application) rather than the JVM.
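
For illustration only, the POSIX locking alternative to a yield-based wait
might look like the sketch below. This is not volanomark or JVM code; the
names ready, wait_ready and set_ready are hypothetical, and it assumes a
single flag that one thread sets while others wait for it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared flag, protected by a mutex and condition variable. */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool ready = false;

/* Block until some thread calls set_ready(); no sched_yield() polling. */
static void wait_ready(void)
{
    pthread_mutex_lock(&lock);
    while (!ready)                      /* loop guards against spurious wakeups */
        pthread_cond_wait(&cond, &lock);
    assert(ready);                      /* post-condition, checked under the lock */
    pthread_mutex_unlock(&lock);
}

/* Publish the flag and wake every waiter. */
static void set_ready(void)
{
    pthread_mutex_lock(&lock);
    ready = true;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&lock);
}
```

A waiter sleeps inside pthread_cond_wait() until it is signalled, so its
behavior does not depend on what sched_yield() happens to do.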

>
>
> > Today's kernel has a somewhat different behavior (and before people
> > scream "regression"; sched_yield() behavior isn't really specified and
> > doesn't make any sense at all, whatever you get is what you get....
> > it's pretty much an insane de facto behavior that is incredibly tied to
> > which decisions the scheduler makes how, and no app can depend on that
>
> It is a performance regression. Is there any reason *not* to use the
> "compat" yield by default?
No, there isn't, so I suggest setting sched_compat_yield=1 by default.
With sched_compat_yield=0, the kernel does almost nothing and just returns.
With sched_compat_yield=1, the behavior is closer to what the sched_yield man
page describes.
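
To check which setting a test machine is running with, a small helper along
these lines could be used. It is a minimal sketch: the path is the one quoted
above, the function names are hypothetical, and on a kernel that does not
expose this knob the helper simply reports -1.

```c
#include <assert.h>
#include <stdio.h>

/* Path taken from this thread; only present on kernels that expose the knob. */
#define COMPAT_YIELD_PATH "/proc/sys/kernel/sched_compat_yield"

/* Read one decimal integer from a file. Returns 0 on success, -1 on failure. */
static int read_int_file(const char *path, int *value)
{
    FILE *f;
    int ok;

    assert(path != NULL && value != NULL);
    f = fopen(path, "r");
    if (!f)
        return -1;
    ok = (fscanf(f, "%d", value) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Returns 1 or 0 for the current setting, or -1 if it cannot be read. */
static int compat_yield_setting(void)
{
    int v;

    if (read_int_file(COMPAT_YIELD_PATH, &v) != 0)
        return -1;
    return v != 0;
}
```

Recording the value returned by compat_yield_setting() next to each benchmark
run makes it clear which of the two yield behaviors a result belongs to.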

> As you say, for SCHED_OTHER tasks, yield
> can do almost anything. We may as well do something that isn't a
> regression...
I just found SCHED_OTHER in man sched_setscheduler. Is it the same as
SCHED_NORMAL in the latest kernel?

>
>
> > in any way. In fact, I've proposed to make sched_yield() just do an
> > msleep(1)... that'd be closer to what sched_yield is supposed to do
> > standard wise than any of the current behaviors .... ;_
>
> What makes you say that? IIRC of all the things that sched_yield can
> do, it is not allowed to block. So this is about the only thing that
> will break the standard...