RE: [RFC 0/6] optimize ctx switch with rb-tree

From: Budankov, Alexey
Date: Wed Apr 26 2017 - 06:35:01 EST


Hi David,

I would like to take over development of the patches, relying on your help with reviews.

Could you provide me with the cumulative patch set to expedite the ramp-up?

Thanks,
Alexey

-----Original Message-----
From: David Carrillo-Cisneros [mailto:davidcc@xxxxxxxxxx]
Sent: Tuesday, April 25, 2017 9:55 PM
To: Budankov, Alexey <alexey.budankov@xxxxxxxxx>
Cc: Liang, Kan <kan.liang@xxxxxxxxx>; linux-kernel@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx; Ingo Molnar <mingo@xxxxxxxxxx>; Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Andi Kleen <ak@xxxxxxxxxxxxxxx>; Peter Zijlstra <peterz@xxxxxxxxxxxxx>; Borislav Petkov <bp@xxxxxxx>; Srinivas Pandruvada <srinivas.pandruvada@xxxxxxxxxxxxxxx>; Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>; Vikas Shivappa <vikas.shivappa@xxxxxxxxxxxxxxx>; Mark Rutland <mark.rutland@xxxxxxx>; Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>; Vince Weaver <vince@xxxxxxxxxx>; Paul Turner <pjt@xxxxxxxxxx>; Stephane Eranian <eranian@xxxxxxxxxx>
Subject: Re: [RFC 0/6] optimize ctx switch with rb-tree

>
> If I disable traversing in the per-process case then the overhead disappears.
>
> For the system-wide case the ctx->pinned_groups and ctx->flexible_groups lists are part of the per-cpu perf_cpu_context object and the iteration count is small (#events == 29).


Yes, seems like it would benefit from the rb-tree optimization.

One thing that is wrong in my RFC (as Mark notes in the "enjoyment" section of https://lkml.org/lkml/2017/1/12/254) is that care must be taken to disable the right PMU when dealing with contexts that have events from more than one PMU. One way to do it is to make the pmu part of the rb-tree key (as Peter initially suggested) and use that to iterate over events of the same PMU together.
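
A rough userspace sketch of what "pmu as part of the key" could look like, for illustration only. The struct, its field names and both helpers are hypothetical and not taken from the RFC patches; a sorted array stands in for the in-order walk of the rb-tree. The point is just that comparing on the PMU first makes all events of one PMU contiguous in the iteration order:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical key layout; field names are illustrative, not from the RFC. */
struct sched_key {
	uintptr_t pmu_id;   /* identifies the owning PMU (e.g. its address) */
	int       cpu;      /* CPU the event is bound to, -1 for any */
	uint64_t  seq;      /* insertion order, keeps keys unique */
};

/* Three-way compare: PMU first, so one PMU's events are contiguous in order. */
static int sched_key_cmp(const struct sched_key *a, const struct sched_key *b)
{
	if (a->pmu_id != b->pmu_id)
		return a->pmu_id < b->pmu_id ? -1 : 1;
	if (a->cpu != b->cpu)
		return a->cpu < b->cpu ? -1 : 1;
	if (a->seq != b->seq)
		return a->seq < b->seq ? -1 : 1;
	return 0;
}

/* Adapter with the signature qsort() expects. */
static int sched_key_qsort_cmp(const void *a, const void *b)
{
	return sched_key_cmp(a, b);
}

/* Stand-in for an in-order tree walk: sort, then count runs of equal PMU. */
static size_t count_pmu_runs(struct sched_key *keys, size_t n)
{
	size_t i, runs = 0;

	assert(keys != NULL || n == 0);
	if (n == 0)
		return 0;
	qsort(keys, n, sizeof(*keys), sched_key_qsort_cmp);
	for (i = 0; i < n; i++)
		if (i == 0 || keys[i].pmu_id != keys[i - 1].pmu_id)
			runs++;   /* a new PMU starts here: one disable/enable pair */
	return runs;
}
```

With such an ordering, each run of equal pmu_id could be wrapped in a single disable/enable of that PMU, which is the "disable the right pmu" concern above.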

There's still the open question of what to do when pmu->add fails.
Currently, the scheduler stops scheduling events at that point, but that's not right when dealing with events in "software context" that are not software events (I am looking at you, CQM) and in hardware contexts with more than one PMU (ARM big.LITTLE). Ideally a change in the event scheduler should address that, but it requires more work. Here is a discussion with Peter about that (https://lkml.org/lkml/2017/1/25/365).
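
To make the problem concrete, here is a toy userspace model of one possible direction. It is an assumption for illustration, not what the kernel does and not a conclusion of the linked discussion. All names (toy_pmu, toy_event, toy_pmu_add, toy_sched_in, MAX_PMUS) are made up. On an add failure the sketch gives up on that PMU only and keeps scheduling events that belong to other PMUs:

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PMUS 4   /* arbitrary bound for the sketch */

/* Toy model: each PMU has a fixed number of free counters. */
struct toy_pmu {
	int free_counters;
};

struct toy_event {
	int  pmu;        /* index into the pmus[] array */
	bool scheduled;
};

/* Toy stand-in for pmu->add(): fails once the PMU has no counters left. */
static bool toy_pmu_add(struct toy_pmu *pmu)
{
	if (pmu->free_counters <= 0)
		return false;
	pmu->free_counters--;
	return true;
}

/*
 * Schedule events; on an add failure, give up on that PMU only and
 * keep going with events that belong to other PMUs.
 * pmus[] must have MAX_PMUS entries. Returns the number scheduled.
 */
static size_t toy_sched_in(struct toy_pmu *pmus, struct toy_event *ev, size_t n)
{
	bool full[MAX_PMUS] = { false };
	size_t i, done = 0;

	for (i = 0; i < n; i++) {
		ev[i].scheduled = false;
		if (ev[i].pmu < 0 || ev[i].pmu >= MAX_PMUS || full[ev[i].pmu])
			continue;
		if (toy_pmu_add(&pmus[ev[i].pmu])) {
			ev[i].scheduled = true;
			done++;
		} else {
			full[ev[i].pmu] = true;   /* stop for this PMU, not for all */
		}
	}
	return done;
}
```

With a single global "stop on first failure" flag, the events of the second PMU in such a context would never be tried once the first PMU filled up; the per-PMU flag avoids that in this toy model.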

If you guys want to work on it, I'll be happy to help review.
Otherwise, I'll get to it as soon as I have a chance (1-2 months).

Thanks,
David
