Re: perf_counters issue with enable_on_exec

From: stephane eranian
Date: Mon Aug 24 2009 - 12:03:36 EST

Next message: Paul E. McKenney: "Re: [PATCH -tip] v3 Consolidate sparse and lockdep declarations in include/linux/rcupdate.h"
Previous message: Nicolas Pitre: "git pull request: small Orion/Kirkwood fixes for 2.6.31-rc"
In reply to: stephane eranian: "Re: perf_counters issue with enable_on_exec"
Next in thread: Peter Zijlstra: "Re: perf_counters issue with enable_on_exec"

On Mon, Aug 24, 2009 at 5:44 PM, stephane eranian<eranian@xxxxxxxxxxxxxx> wrote:
> On Mon, Aug 24, 2009 at 3:46 PM, Peter Zijlstra<a.p.zijlstra@xxxxxxxxx> wrote:
>> On Thu, 2009-08-20 at 15:49 +0200, stephane eranian wrote:
>>> Hi,
>>>
>>> I am running into an issue trying to use enable_on_exec
>>> in per-thread mode with an event group.
>>>
>>> My understanding is that enable_on_exec allows activation
>>> of an event on first exec. This is useful for tools monitoring
>>> other tasks and which you invoke as: tool my_program. In
>>> other words, the tool forks+execs my_program. This option
>>> allows developers to set up the events after the fork (to get
>>> the pid) but before the exec(). Only execution after the exec
>>> is monitored. This alleviates the need to use the
>>> ptrace(PTRACE_TRACEME) call.
>>>
>>> My understanding is that an event group is scheduled only
>>> if all events in the group are active (disabled=0). Thus, one
>>> trick to activate a group with a single ioctl(PERF_IOC_ENABLE)
>>> is to enable all events in the group except the leader. This works
>>> well. But once you add enable_on_exec on the events,
>>> things go wrong. The non-leader events start counting before
>>> the exec. If the non-leader events are created in disabled state,
>>> then they never activate on exec.
>>>
>>> The attached test program demonstrates the problem.
>>> Simply invoke it with a program that runs for a few seconds.
>>
>> OK, lots of issues here
>>
>>  1) your code is broken ;-)
>
> That's true. I knew about the missing synchro. But I think
> the problem existed nonetheless.
>
>>  2) enable_on_exec on !leader counters is undefined
>
> then fail it.
>
>>  3) there is something fishy nonetheless
>>
> True.
>
>>
>> 1. you fork() then create a counter group in both the parent and the
>> child without sync, then read the parent group. This obviously doesn't
>> do what is expected. See attached proglet for a better version.
>>
> I have modified the program based on your changes. See new version attached.
>
>> 2. enable_on_exec only works on leaders, Paul, was that intended?
>>
> All events in a group are scheduled together. If one event is not enabled
> in a group, then the group is not dispatched. Setting enable_on_exec
> just on the leader makes sense. Then to enable the group on exec, you
> enable all events but the leader. The enable_on_exec will enable
> the leader on exec and the group will be ready for dispatch. That's
> how it should work in my mind.
>
>
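The scheme described above can be sketched as a toy model. This is
only an illustration in Python, not the kernel interface: the Event
and Group classes, the on_exec() hook, and the event names are all
hypothetical. Only the two attribute bits (disabled, enable_on_exec)
and the "all events enabled" rule come from the discussion, and the
one-shot clearing of enable_on_exec is an assumption of the model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """Toy stand-in for one counter; only the two attr bits discussed here."""
    name: str
    disabled: bool = False
    enable_on_exec: bool = False


@dataclass
class Group:
    """A leader plus its siblings; the group is dispatched as a unit."""
    leader: Event
    siblings: List[Event] = field(default_factory=list)

    def events(self) -> List[Event]:
        return [self.leader] + self.siblings

    def runnable(self) -> bool:
        # The group can be dispatched only if every event is enabled.
        return all(not ev.disabled for ev in self.events())

    def on_exec(self) -> None:
        # Model of the intended semantics: exec enables the events that
        # asked for it. Treating the flag as one-shot is an assumption.
        for ev in self.events():
            if ev.enable_on_exec and ev.disabled:
                ev.disabled = False
                ev.enable_on_exec = False


# Leader starts disabled with enable_on_exec; siblings start enabled.
# Event names are hypothetical placeholders.
group = Group(
    leader=Event("cycles", disabled=True, enable_on_exec=True),
    siblings=[Event("instructions"), Event("branches")],
)

runnable_before_exec = group.runnable()
group.on_exec()
runnable_after_exec = group.runnable()
print(runnable_before_exec, runnable_after_exec)
```

In this model the group is held back before the exec only by the
disabled leader, and becomes dispatchable as soon as the exec flips it.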
> As you indicated the issue is with the timing information and I think
> it is not related to enable_on_exec. It is more related to the fact
> that to enable a group with a single ioctl() you enable ALL BUT the
> leader. But that means that the time_enabled for the !leader is
> ticking. Thus scaling won't be as expected yet it is correct
> given what happens internally.
>
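A small arithmetic sketch of why the scaling looks off. It assumes
the usual tool-side estimate of count * time_enabled / time_running
(time_running is not named in this message), and every number below
is made up for illustration: siblings enabled at t=0, exec at t=2,
counters read at t=5, group on the PMU for the whole 3 seconds after
the exec.

```python
def scale(raw_count: int, time_enabled: int, time_running: int) -> int:
    """Extrapolate a raw count over the whole enabled window."""
    if time_running == 0:
        return 0  # never ran, nothing to extrapolate from
    return raw_count * time_enabled // time_running


# Hypothetical raw count observed by each event while the group ran.
RAW_COUNT = 3000

# Leader: enabled by the exec at t=2, so enabled for 3s and ran for 3s.
leader_estimate = scale(RAW_COUNT, time_enabled=3, time_running=3)

# Sibling: enabled since t=0, so time_enabled has been ticking for 5s,
# yet it could only run for the same 3s as the leader.
sibling_estimate = scale(RAW_COUNT, time_enabled=5, time_running=3)

print(leader_estimate, sibling_estimate)
```

The sibling's estimate is inflated by the 2 seconds during which it
was enabled but its group could not be dispatched, even though both
events counted over exactly the same interval.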
> I think there needs to be a distinction between 'enabled immediately
> but cannot run because the group is not totally enabled' and 'cannot run
> because the group has been multiplexed out, yet it could be dispatched
> because all events are enabled'. In the former, it seems you don't
> want time_enabled to tick, while in the latter you do. In other words,
> time_enabled ticks for each event if the group is 'dispatch-able' (or
> runnable in your terminology) otherwise it does not. time_enabled reflects
> the fact that the group could run but did not have access to the PMU
> resource because of contention with other groups.
>
In other words, I think time_enabled is measuring the wrong thing.
It should instead be called time_runnable and it should measure the
time during which the event is runnable, i.e., its group is runnable. That
means the event (group) could be dispatched if the PMU were "free".
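
To make the proposed distinction concrete, here is a toy accounting
model in Python. It is a sketch of the idea only, not kernel code:
the Interval record, the account() helper, and the timeline values
are hypothetical, and time_runnable is the proposed name, not an
existing field.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    duration: int         # arbitrary time units
    event_enabled: bool   # this event has disabled=0
    group_runnable: bool  # every event in the group is enabled
    on_pmu: bool          # the group is actually scheduled on the PMU


def account(timeline: List[Interval]) -> Dict[str, int]:
    """Accumulate the three clocks for one event over a timeline."""
    totals = {"time_enabled": 0, "time_runnable": 0, "time_running": 0}
    for iv in timeline:
        if iv.event_enabled:
            totals["time_enabled"] += iv.duration
        if iv.group_runnable:
            totals["time_runnable"] += iv.duration
        if iv.on_pmu:
            totals["time_running"] += iv.duration
    return totals


# Hypothetical timeline for a non-leader event.
timeline = [
    Interval(2, True, False, False),  # enabled, but leader still disabled
    Interval(3, True, True, True),    # after exec: group is on the PMU
    Interval(1, True, True, False),   # multiplexed out by another group
]

totals = account(timeline)
print(totals)
```

Here time_enabled also counts the first interval, when the group
could not have run at all, while time_runnable only counts the time
lost to PMU contention plus the time actually spent running.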