Re: [RFC patch 1/2] sched: dynamically adapt granularity with nr_running

From: Mathieu Desnoyers
Date: Sat Sep 11 2010 - 15:57:30 EST


* Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:
> On Sat, 2010-09-11 at 13:37 -0400, Mathieu Desnoyers wrote:
>
> It's not at all clear what you're doing, or why.
>
> What we used to have is:
>
> period -- time in which each task gets scheduled once
>
> This period was adaptive in that we had an ideal period
> (sysctl_sched_latency), but keeping to it means that each task gets
> latency/nr_running time. This is undesirable because busy systems will
> over-schedule due to tiny slices. Hence we also had a minimum slice
> (sysctl_sched_min_granularity).
>
> This yields:
>
> period := max(sched_latency, nr_running * sched_min_granularity)
>
> [ where we introduce the intermediate:
> nr_latency := sched_latency / sched_min_granularity
> in order to avoid the multiplication where possible ]
>
> Now you introduce a separate preemption measure, sched_gran as:
>
>                 sched_std_granularity;  nr_running <= 8
>   sched_gran := {
>                 max(sched_min_granularity, sched_latency / nr_running);  otherwise
>
> Which doesn't make any sense at all, because it will either be larger or
> as large as the current sched_min_granularity.
>
> And you break the above definition of period by replacing nr_latency by
> 8.
>
> Not at all charmed; this looks like random changes without conceptual
> integrity.

Err.. I think the preemption measure you are describing does not match my code,
so let's try to figure this one out. Here is what I am doing:

nr_latency is still 3.
I introduce nr_latency_max (8).

sched_min_granularity is now sched_latency / nr_latency_max
sched_std_granularity is sched_latency / nr_latency

sched_std_granularity is the granularity in effect when 3 or fewer tasks are
running. This is exactly the same behavior as the current kernel.

For more than 8 tasks, the behavior is the same as in the current kernel (we
increase the scheduling period, and therefore the latency), but using the new
"sched_min_granularity" (which is now sched_latency / 8 rather than
sched_latency / 3).
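
A rough sketch of the period computation for this case, not taken from the
patch: sched_period() and its parameters are hypothetical names, and the 6 ms
latency in the note below is only an assumed example value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch only, not the patch: scheduling period as described above.
 * latency_ns and nr_latency_max are passed in so nothing here depends
 * on real kernel tunables; sched_period() is a hypothetical name.
 */
static uint64_t sched_period(uint64_t latency_ns,
			     unsigned int nr_latency_max,
			     unsigned int nr_running)
{
	uint64_t min_gran_ns;

	assert(nr_latency_max > 0);	/* guards the division below */

	/* sched_min_granularity = sched_latency / nr_latency_max */
	min_gran_ns = latency_ns / nr_latency_max;

	/* Beyond nr_latency_max tasks the period grows linearly. */
	if (nr_running > nr_latency_max)
		return (uint64_t)nr_running * min_gran_ns;

	return latency_ns;
}
```

Assuming a 6 ms latency, 10 running tasks give 10 * 0.75 ms = 7.5 ms with the
threshold at 8, versus 10 * 2 ms = 20 ms with the threshold left at 3.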

The interesting part is in the range from 4 to 8 tasks. There, I decrease the
scheduler granularity as the number of tasks increases, rather than increasing
the latency. This leads to more scheduler preemptions than usual, but only in
the range of 4 to 8 running tasks.
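
The three ranges described above can be sketched as follows. This is an
illustration of the description, not the code from the patch: struct
gran_params and sched_gran() are hypothetical names, and the values are plain
nanosecond counts rather than the real sysctl variables.

```c
#include <assert.h>
#include <stdint.h>

/* Sketch of the three ranges described above; names are hypothetical. */
struct gran_params {
	uint64_t latency_ns;		/* sched_latency */
	unsigned int nr_latency;	/* 3 in the text above */
	unsigned int nr_latency_max;	/* 8 in the text above */
};

static uint64_t sched_gran(const struct gran_params *p,
			   unsigned int nr_running)
{
	/* Validate the tunables before dividing by them. */
	assert(p->nr_latency > 0 && p->nr_latency_max >= p->nr_latency);

	/* Up to nr_latency tasks: sched_std_granularity, unchanged. */
	if (nr_running <= p->nr_latency)
		return p->latency_ns / p->nr_latency;

	/* Above nr_latency_max tasks: floor at sched_min_granularity. */
	if (nr_running > p->nr_latency_max)
		return p->latency_ns / p->nr_latency_max;

	/* In between: shrink the granularity so the latency stays fixed. */
	return p->latency_ns / nr_running;
}
```

With an assumed latency of 6 ms, this gives 2 ms for up to 3 tasks, 1.5 ms for
4 tasks, 1 ms for 6 tasks, and 0.75 ms for 8 or more tasks.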

We could possibly fine-tune nr_latency_max to a value that would keep an
appropriate sched_min_granularity (that would not cause an insane rate of
scheduler events).

The main benefit of the approach I propose (rather than just increasing
nr_latency and decreasing sched_min_granularity) is that I don't have to change
the scheduler granularity when only a few tasks are running. So the extra
scheduler overhead is only incurred when we are running more tasks.

I hope my explanation clarifies things a bit,

Thanks,

Mathieu


--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com