Re: [PATCH] perf cgroups: Don't rotate events for cgroups unnecessarily

From: Ian Rogers
Date: Thu Jun 27 2019 - 17:47:28 EST


group_index On Mon, Jun 24, 2019 at 12:55 AM Peter Zijlstra
<peterz@xxxxxxxxxxxxx> wrote:
>
> On Fri, Jun 21, 2019 at 11:01:29AM -0700, Ian Rogers wrote:
> > On Fri, Jun 21, 2019 at 1:24 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Sat, Jun 01, 2019 at 01:27:22AM -0700, Ian Rogers wrote:
> > > > @@ -3325,6 +3331,15 @@ static int flexible_sched_in(struct perf_event *event, void *data)
> > > > sid->can_add_hw = 0;
> > > > }
> > > >
> > > > + /*
> > > > + * If the group wasn't scheduled then set that multiplexing is necessary
> > > > + * for the context. Note, this won't be set if the event wasn't
> > > > + * scheduled due to event_filter_match failing due to the earlier
> > > > + * return.
> > > > + */
> > > > + if (event->state == PERF_EVENT_STATE_INACTIVE)
> > > > + sid->ctx->rotate_necessary = 1;
> > > > +
> > > > return 0;
> > > > }
> > >
> > > That looked odd; which had me look harder at this function which
> > > resulted in the below. Should we not terminate the context interation
> > > the moment one flexible thingy fails to schedule?
> >
> > If we knew all the events were hardware events then this would be
> > true, as there may be software events that always schedule then the
> > continued iteration is necessary.
>
> But this is the 'old' code, where this is guaranteed by the context.
> That is, if this is a hardware context; there wil only be software
> events due to them being in a group with hardware events.
>
> If this is a software group, then we'll never fail to schedule and we'll
> not get in this branch to begin with.
>
> Or am I now confused for having been staring at two different code-bases
> at the same time?

I believe you're right and I'd overlooked this. I think there is a
more efficient version of this code possible that I can follow up
with. There are 3 perf_event_contexts, potentially a sw and hw context
within the task_struct and one per-CPU in perf_cpu_context. With this
change I'm focussed on improving rotation of cgroup events that appear
system wide within the per-CPU context. Without cgroup events the
system wide events don't need to be iterated over during scheduling
in. The branch that can set rotate_necessary will only be necessary
for the task hardware events. For system wide events, considered with
cgroup mode scheduling in, the branch is necessary as rotation may be
necessary. It'd be possible to have variants of flexible_sched_in that
behave differently in the task software event context, and the system
wide and task hardware contexts.

I have a series of patches related to Kan Liang's cgroup
perf_event_groups improvements. I'll mail these out and see if I can
avoid the branch in the task software event context case.

Thanks,
Ian