Re: [PATCH 6/7] sched: Clean up preempt_enable_no_resched() abuse

From: Eliezer Tamir
Date: Thu Nov 21 2013 - 08:26:27 EST


On 21/11/2013 12:10, Peter Zijlstra wrote:
> On Wed, Nov 20, 2013 at 08:02:54PM +0200, Eliezer Tamir wrote:
>> IMHO This has been reviewed thoroughly.
>>
>> When Ben Hutchings voiced concerns I rewrote the code to use time_after,
>> so even if you do get switched over to a CPU where the time is random
>> you will at most poll another full interval.
>>
>> Linus asked me to remove this since it makes us use two time values
>> instead of one. see https://lkml.org/lkml/2013/7/8/345.
>
> I'm not sure I see how this would be true.
>
> So the do_select() code basically does:
>
> 	for (;;) {
>
> 		/* actual poll loop */
>
> 		if (!need_resched()) {
> 			if (!busy_end) {
> 				busy_end = now() + busypoll;
> 				continue;
> 			}
> 			if (!((long)(busy_end - now()) < 0))
> 				continue;
> 		}
>
> 		/* go sleep */
>
> 	}
>
> So imagine our CPU0 timebase is 1 minute ahead of CPU1 (60e9 vs 0), and we start by:
>
> busy_end = now() + busypoll; /* CPU0: 60e9 + d */
>
> but then we migrate to CPU1 and do:
>
> busy_end - now() /* CPU1: 60e9 + d' */
>
> and find we're still a minute out; and in fact we'll keep spinning for
> that entire minute barring a need_resched().

Not exactly: poll will return if there are any events to report or if
a signal is pending.
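
For reference, the arithmetic in the scenario quoted above can be
reproduced in user space. This is only a rough sketch: the helper names
and the 50 us budget are made up for illustration, and the two "CPU
clocks" are plain constants rather than anything read from hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only, all in nanoseconds. */
#define CPU0_BASE_NS 60000000000ULL /* CPU0 timebase: 60e9 */
#define CPU1_BASE_NS 0ULL           /* CPU1 timebase: 0 */
#define BUSYPOLL_NS  50000ULL       /* hypothetical 50 us busy-poll budget */

/* Hypothetical helper: compute the deadline from the current clock. */
static uint64_t arm_busy_end(uint64_t now)
{
	return now + BUSYPOLL_NS;
}

/*
 * Hypothetical helper: true once `now` is past `busy_end`, using the
 * same signed-difference test as the quoted loop.
 */
static bool busy_loop_timed_out(uint64_t busy_end, uint64_t now)
{
	return (int64_t)(busy_end - now) < 0;
}
```

With busy_end = arm_busy_end(CPU0_BASE_NS), the check returns true at
CPU0_BASE_NS + 60000 (same clock, 60 us later) but false at
CPU1_BASE_NS + 60000 (after the migration), where the deadline still
looks about a minute away. That only matters if nothing else ends the
loop first.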

> Surely that's not intended and desired?

This limit is an extra safety net: because busy polling is expensive,
we limit the time we are willing to do it.

We don't override any limit the user has put on the system call.
A signal or having events to report will also stop the looping.
So we are mostly capping the resources an _idle_ system will waste
on busy polling.
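
To spell out the order of the checks described above, here is a small
user-space model. It is a sketch, not the do_select() code: the struct,
the enum and next_action() are hypothetical names, and it leaves out the
step that arms busy_end on the first pass.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of what one pass of the poll loop observes. */
struct poll_state {
	bool have_events;    /* something to report to the caller */
	bool signal_pending; /* a signal is waiting for this task */
	bool timed_out;      /* the caller's own timeout has expired */
	bool need_resched;   /* the scheduler wants the CPU back */
	uint64_t now;        /* current clock reading, in ns */
	uint64_t busy_end;   /* end of the busy-poll budget, in ns */
};

enum poll_action { POLL_RETURN, POLL_SPIN, POLL_SLEEP };

/* Decide what the loop does next, given one snapshot. */
static enum poll_action next_action(const struct poll_state *s)
{
	/* Events, signals and the user's timeout always win. */
	if (s->have_events || s->signal_pending || s->timed_out)
		return POLL_RETURN;

	/* Keep spinning only while idle and inside the busy-poll budget. */
	if (!s->need_resched && (int64_t)(s->busy_end - s->now) >= 0)
		return POLL_SPIN;

	/* Otherwise fall back to a normal sleep. */
	return POLL_SLEEP;
}
```

In this model a skewed clock can only stretch the POLL_SPIN case, and
only while there are no events, no signal, no expired user timeout and
no need_resched.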

We want to globally cap the amount of time the system busy polls, on
average. Nothing catastrophic will happen on the extremely rare occasion
that we miss.

The alternative is to use one more int on every poll/select all the
time; this seems like a bigger cost.