Re: [RFC] perf_events: ctx_flexible_sched_in() not maximizing PMU utilization

From: Stephane Eranian
Date: Mon May 10 2010 - 05:41:22 EST


On Fri, May 7, 2010 at 1:15 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Fri, 2010-05-07 at 12:49 +0200, Stephane Eranian wrote:
>> You'd have to insert all events into the tree, read leftmost.
>> I believe you need more than just basic integer arithmetic
>> to compare s_i to avg. Or you need to tweak the values
>> to get more precision. But you may already be doing that
>> elsewhere in the kernel.
>
> I've got a modification to CFS which implements EEVDF which needs
> similar eligibility checks. So yeah, I've got code to do this.
>
> The trick to computable avg is to keep a monotonic min_s around and use
> ds_i = s_i - min_s. These ds_i will be 'small', in the order of the max
> lag.
>
> We can thus keep a sum of ds_i up-to-date when inserting/removing events
> from the tree without fear of overflowing our integer.
>
> When we update min_s, we must also update our relative sum
> proportionally and in the opposite direction.
>
> Comparing for eligibility can be done by:
>
> s_i < 1/n \Sum s_i, or s_i - min_s < 1/n \Sum (s_i - min_s), which we can
> write as: n*ds_i < \Sum ds_i
>
> Again, this can be done without fear of overflows because ds_i is small.
>
Ok, that's what I thought. You had that for the scheduler.

Looks like a good solution, at least better than what is there right now.
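
For reference, here is a minimal sketch of the bookkeeping described above,
as I understand it. All names (struct svc_avg, svc_add, svc_del,
svc_advance_min, svc_eligible) are hypothetical and not taken from the
kernel or from the EEVDF patches. It assumes every tracked s_i is >= min_s
and that n*ds_i fits in 64 bits. The strict '<' is taken from the formula
above; note that when all s_i are equal nothing is strictly below the
average, so a real implementation would probably want '<=' or a fallback.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for a running average of service values. */
struct svc_avg {
	uint64_t min_s;   /* monotonic floor, <= every tracked s_i */
	int64_t  sum_ds;  /* \Sum (s_i - min_s) over tracked events */
	int64_t  n;       /* number of tracked events */
};

/* Account for an event with service s entering the tree. */
static void svc_add(struct svc_avg *a, uint64_t s)
{
	a->sum_ds += (int64_t)(s - a->min_s);
	a->n++;
}

/* Account for an event with service s leaving the tree. */
static void svc_del(struct svc_avg *a, uint64_t s)
{
	a->sum_ds -= (int64_t)(s - a->min_s);
	a->n--;
}

/*
 * Move min_s forward. Every ds_i shrinks by delta, so the relative
 * sum shrinks by n * delta (the "opposite direction" update).
 */
static void svc_advance_min(struct svc_avg *a, uint64_t new_min)
{
	int64_t delta;

	if (new_min <= a->min_s)
		return;		/* min_s only moves forward */

	delta = (int64_t)(new_min - a->min_s);
	a->sum_ds -= a->n * delta;
	a->min_s = new_min;
}

/* Eligible when s is below the average: n * ds < \Sum ds_i. */
static int svc_eligible(const struct svc_avg *a, uint64_t s)
{
	int64_t ds = (int64_t)(s - a->min_s);

	return a->n * ds < a->sum_ds;
}
```

With s = {1, 0, 0} as in the s(1) row of the example below, sum_ds is 1
and n is 3, so the events at 0 are eligible and the one at 1 is not.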

>> Yes. Not clear how you could avoid this without having a global
>> view of the dependencies between events (which are really event
>> groups, BTW). Here you'd need to know that if you have
>> evt   A  B  C
>> s(0)  0  0  0 -> avg = 0/3=0.00, sort = A, B, C, schedule A, fail on B
>> s(1)  1  0  0 -> avg = 1/3=0.33, sort = B, C, A, schedule B, fail on C
>>
>> You'd have two options:
>>    s(2)  1  1  0 -> avg = 2/3=0.66, sort = C, A, B, schedule A, C
>> or
>>    s(2)  1  1  0 -> avg = 2/3=0.66, sort = C, B, A, schedule C
>>
>> The first is more attractive because it shortens the blind spots on C.
>> Both are equally fair, though. Looks like you'd need to add a 2nd
>> parameter to the sort when s_i are equal. That would have to be
>> related to the number of constraints. You could sort in reverse order
>> to the number of constraints, assuming you can express the constraint
>> as a number. For simple constraints, such as counter restrictions, you
>> could simply compare the number of possible counters between events.
>> But there are PMUs where there are no counter constraints but events are
>> incompatible. What values do you compare in this case?
>
> Not sure, but yeah, using constraint data to tie break is indeed an
> interesting option.
>
> I wonder how much tie breaking we'll really need in practice though, if
> we use event->total_time_running as our s_i we've got ns resolution
> timestamps, and with sub-jiffy preemption, as is common, I doubt we'll
> actually see a lot of equal service numbers.
>
I suspect you are right. I would not worry about this now. This can be
fixed later if it turns out to be problematic in corner cases.
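
Should it ever be needed, the two-key sort could look roughly like the
sketch below. This is only an illustration: struct flex_evt and
flex_evt_cmp are hypothetical names, and it assumes the constraint can be
expressed as the number of counters an event may use, with the more
constrained event (fewer counters) going first on a tie. It does not
address PMUs where events are mutually incompatible without any counter
constraint.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-event sort key. */
struct flex_evt {
	uint64_t s;            /* service received so far */
	unsigned nr_counters;  /* counters usable; fewer = more constrained */
};

/* qsort() comparator: least service first, then most constrained first. */
static int flex_evt_cmp(const void *pa, const void *pb)
{
	const struct flex_evt *a = pa;
	const struct flex_evt *b = pb;

	if (a->s != b->s)
		return a->s < b->s ? -1 : 1;
	if (a->nr_counters != b->nr_counters)
		return a->nr_counters < b->nr_counters ? -1 : 1;
	return 0;
}
```

It would be used as qsort(evts, n, sizeof(evts[0]), flex_evt_cmp) on an
array of such keys.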