Re: [PATCH 4/5] perf: Introduce task, softirq and hardirq contextsexclusion

From: Frederic Weisbecker
Date: Thu Jun 10 2010 - 12:24:39 EST


On Thu, Jun 10, 2010 at 01:01:27PM +0200, Peter Zijlstra wrote:
> On Thu, 2010-06-10 at 05:49 +0200, Frederic Weisbecker wrote:
> > @@ -642,17 +643,31 @@ event_sched_in(struct perf_event *event,
> > if (event->state <= PERF_EVENT_STATE_OFF)
> > return 0;
> >
> > - event->state = PERF_EVENT_STATE_ACTIVE;
> > + if (event->attr.exclude_task)
> > + event->state = PERF_EVENT_STATE_PAUSED;
> > + else
> > + event->state = PERF_EVENT_STATE_ACTIVE;
> > +
> > event->oncpu = smp_processor_id();
> > +
>
> Aah, so that is why you added the PAUSE state knowledge to the arch
> code, you want to be able to call ->enable() on a PAUSEd event.
>
> That means you need to audit/touch all implementations anyway, isn't
> there a better interface we can use, like maybe extend ->enable() with a
> flags argument?




The problem is more on hw_perf_enable/disable() as this is the place
where we do the true activations.

Moreover we also need to keep track of this paused state from the
generic code.

If I add a flag in enable(), this is going to add the same check
everywhere, and in fact hardware event would need to maintain
an internal paused state while this is something we have in the
generic code already.




> > /*
> > * The new state must be visible before we turn it on in the hardware:
> > */
> > smp_wmb();
> >
> > - if (event->pmu->enable(event)) {
> > - event->state = PERF_EVENT_STATE_INACTIVE;
> > - event->oncpu = -1;
> > - return -EAGAIN;
> > + /*
> > + * If we exclude the tasks, we only need to schedule hardware
> > + * events that need to settle themselves, even in a pause mode.
> > + * Software events can simply be scheduled anytime.
> > + * If we want more granularity in all that, we can still provide
> > + * later a pmu->reserve callback.
> > + */
> > + if (!event->attr.exclude_task || !is_software_event(event)) {
> > + if (event->pmu->enable(event)) {
> > + event->state = PERF_EVENT_STATE_INACTIVE;
> > + event->oncpu = -1;
> > + return -EAGAIN;
> > + }
> > }
> >
> > event->tstamp_running += ctx->time - event->tstamp_stopped;
>
> Remove is_software_event(), not add more.



I don't like this either, that's why I talked about the potential need
of a pmu:reserve() callback in the comments.

ie, pmu->reserve() would mean that the pmu knows how to deal with the
PAUSED state. If a pmu doesn't have it, then we would need to call
->enable() without setting the PAUSED state, so that exclude_* things
don't work with them (as we haven't audited the code).

What do you think? That would solve the audit everywhere problem
plus the is_software_event() problem.

pmu->reserve() would be the same than pmu->enable on x86 pmu, and
it would be a stub on software events. On x86 we would just need
to use the generic PAUSED state to make the difference, as it
would be a waste to maintain a parallel internal state.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/