Re: [RFC][PATCH] perf: Rewrite core context handling

From: Song Liu
Date: Wed Oct 17 2018 - 12:45:09 EST

> On Oct 17, 2018, at 4:06 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Oct 16, 2018 at 06:28:10PM +0000, Song Liu wrote:
>>> How about this:
>>>
>>> 1. Keep multiple perf_cpu_context per CPU, just like before this patch.
>>>
>>> 2. For perf_event_context, add PMU as an order for the RB tree.
>>>
>>> 3. (hw) pmu->perf_cpu_context->ctx only has events for this PMU (and sw
>>> events moved to this context).
>>>
>>> 4. task->perf_event_ctxp has events for all PMUs.
>>>
>>> With this path, we keep the existing perf_cpu_context/perf_event_context
>>> logic as-is, which I think is simpler than the new logic (with extra
>>> *_pmu_context). And it should also solve the problem.
>>>
>>> Does this make sense? If this doesn't look too broken, I am happy to
>>> draft RFC for it.
>>>
>>
>> I am not sure whether you missed this one, or found it totally insane.
>> Could you please share your comments on it? My gut feeling is that this
>> would be a simpler patch to solve the problem (two hw PMUs). (It might
>> be less efficient though).
>
> Ah, sorry, somehow this email got lost.
>
> That makes task and cpu contexts wildly different, which will complicate
> matters I feel.
>

I think we only need different logic when adding events to the task and
CPU contexts. ctx_sched_in() and ctx_sched_out() will need some extra
logic to skip events that are not being scheduled (for example, don't
schedule events on PMU-a when rotating PMU-b). That logic will be the
same for task and CPU contexts. The only difference is that a CPU context
will never contain such events, because we never add events from another
PMU to it.
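
A minimal user-space sketch of that filtering, to make the idea concrete.
This is not kernel code: sketch_pmu, sketch_event, sketch_ctx and the two
functions are hypothetical stand-ins, and the real ctx_sched_in() and
ctx_sched_out() take different arguments and do much more work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_pmu {
	const char *name;
};

struct sketch_event {
	const struct sketch_pmu *pmu;	/* PMU this event belongs to */
	int active;			/* 1 if currently scheduled in */
};

struct sketch_ctx {
	struct sketch_event *events;
	size_t nr_events;
};

/*
 * Schedule in the events of 'ctx' that belong to 'filter'.
 * A NULL filter means "all PMUs". Returns the number of events
 * that were newly scheduled in.
 */
int sketch_ctx_sched_in(struct sketch_ctx *ctx, const struct sketch_pmu *filter)
{
	int scheduled = 0;

	assert(ctx != NULL);
	for (size_t i = 0; i < ctx->nr_events; i++) {
		struct sketch_event *e = &ctx->events[i];

		/* Skip events on other PMUs, e.g. PMU-a while rotating PMU-b. */
		if (filter && e->pmu != filter)
			continue;
		if (!e->active) {
			e->active = 1;
			scheduled++;
		}
	}
	return scheduled;
}

/* Mirror of the above: schedule out only the events of 'filter'. */
int sketch_ctx_sched_out(struct sketch_ctx *ctx, const struct sketch_pmu *filter)
{
	int descheduled = 0;

	assert(ctx != NULL);
	for (size_t i = 0; i < ctx->nr_events; i++) {
		struct sketch_event *e = &ctx->events[i];

		if (filter && e->pmu != filter)
			continue;
		if (e->active) {
			e->active = 0;
			descheduled++;
		}
	}
	return descheduled;
}
```

The same two functions serve both kinds of context in this model. A task
context holds events for several PMUs, so the filter actually skips some
of them; a CPU context holds events for a single PMU, so the filter check
never triggers there.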

Does this make sense? I could try drafting an RFC to see how difficult it is.

Thanks,
Song