Re: [RFC] perf: a different approach to perf_rotate_context()

From: Song Liu
Date: Mon Mar 12 2018 - 20:40:00 EST

> On Mar 3, 2018, at 7:26 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Mar 01, 2018 at 11:53:21AM -0800, Song Liu wrote:
>
>> Second, flexible_groups in cpuctx->ctx and cpuctx->task_ctx now have
>> exact same priority and equal chance to run. I am not sure whether this
>> will change the behavior in some use cases.
>>
>> Please kindly let me know whether this approach makes sense.
>
> What you've not said is, and what is not at all clear, is if your scheme
> preserved fairness.
>
>
> In any case, there's a ton of conflict against the patches here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=perf/testing
>
> And with those the idea was to move to a virtual time based scheduler
> (basically schedule those flexible events that have the biggest lag --
> that also solves 1).

While looking at these patches, I found that they might not solve issue #1
(starvation of cpuctx->task_ctx->flexible_groups). Here is an example on an
Intel CPU (where ref-cycles can only use one hardware counter):

First, in one console, start:
perf stat -e ref-cycles -I 10000

Second, in another console, run:
perf stat -e ref-cycles -- benchmark

The second event will not run because the first event occupies the counter
all the time.

Maybe we can solve this by combining the two flexible_groups lists
(cpuctx->ctx and cpuctx->task_ctx) and rotating them together? If this sounds
reasonable, I will draft an RFC for it.
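
To illustrate the idea, below is a toy user-space model, not kernel code. It
assumes a simplified scheduler in which, on every tick, events from the CPU
context list are placed on counters before events from the task context list.
The function names, the event labels and the tick-based accounting are all
hypothetical and only serve to show the starvation and the effect of rotating
one combined list.

```python
from collections import deque


def simulate_separate(cpu_events, task_events, counters, ticks):
    """Toy model: each list rotates on its own, CPU list is scheduled first.

    Assumption: event names are unique across both lists.
    """
    cpu, task = deque(cpu_events), deque(task_events)
    runtime = {e: 0 for e in list(cpu) + list(task)}
    for _ in range(ticks):
        # CPU-context events claim counters first; task-context events
        # only get whatever counters are left over.
        for e in (list(cpu) + list(task))[:counters]:
            runtime[e] += 1
        cpu.rotate(-1)
        task.rotate(-1)
    return runtime


def simulate_combined(cpu_events, task_events, counters, ticks):
    """Toy model: both lists are merged and rotated as a single list."""
    combined = deque(list(cpu_events) + list(task_events))
    runtime = {e: 0 for e in combined}
    for _ in range(ticks):
        # The first `counters` events of the merged list run this tick.
        for e in list(combined)[:counters]:
            runtime[e] += 1
        # Move the head of the merged list to the tail for the next tick.
        combined.rotate(-1)
    return runtime


if __name__ == "__main__":
    # One counter, one event per context, mirroring the ref-cycles example.
    print(simulate_separate(["cpu:ref-cycles"], ["task:ref-cycles"], 1, 10))
    print(simulate_combined(["cpu:ref-cycles"], ["task:ref-cycles"], 1, 10))
```

In this model, with a single counter and one event in each list, the
separately rotated case never runs the task event, while the combined list
alternates between the two events.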

Thanks,
Song