Re: [PATCH] perf lock: Drop "-a" option from set of default arguments to cmd_record()

From: Hitoshi Mitake
Date: Sun May 09 2010 - 10:55:16 EST


On 05/09/10 01:14, Frederic Weisbecker wrote:
> On Sat, May 08, 2010 at 05:10:29PM +0900, Hitoshi Mitake wrote:
>> This patch drops "-a" from record_args, which is passed to cmd_record().
>>
>> Even if the user wants to record all lock events while a process runs,
>> perf lock record -a <program> <argument> ...
>> is enough for this purpose.
>>
>> This can reduce the size of perf.data.
>>
>> % sudo ./perf lock record whoami
>> root
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.439 MB perf.data (~19170 samples) ]
>> % sudo ./perf lock record -a whoami # with -a option
>> root
>> [ perf record: Woken up 0 times to write data ]
>> [ perf record: Captured and wrote 48.962 MB perf.data (~2139197 samples) ]
>>
>> This patch was made against perf/test of random-tracing.git.
>> Could you queue it, Frederic?
>>
>> Cc: Ingo Molnar <mingo@xxxxxxx>
>> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
>> Cc: Paul Mackerras <paulus@xxxxxxxxx>
>> Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>> Cc: Jens Axboe <jens.axboe@xxxxxxxxxx>
>> Cc: Jason Baron <jbaron@xxxxxxxxxx>
>> Cc: Xiao Guangrong <xiaoguangrong@xxxxxxxxxxxxxx>
>> Signed-off-by: Hitoshi Mitake <mitake@xxxxxxxxxxxxxxxxxxxxx>
>
>
> Thanks, will test it and if it's fine I'll queue.
>
> I did a lot of tests over the last few days to understand what was going on
> with perf lock, I mean the fact that we see various bad locking scenarios.
>
> So far, the state machine looks rather good. In fact, the real problem
> is that we don't get every event. We lose a _lot_ of them and that's
> because the frequency of lock events is too high and perf record
> can't keep up.

Really? I hadn't thought about events being lost :(

>
> I think I'm going to unearth the injection code to reduce the size
> of these events.
>
>

Yeah, injection will be really helpful.

And I have a rough idea for reducing event frequency.

Many lock event sequences take one of these forms:
* acquire -> acquired -> release
* acquire -> contended -> acquired -> release
I think that generating 3 or 4 events for each lock sequence
is a waste of CPU time and memory space.

If each thread stores the time of each event
and generates only 1 event at the time of release,
we will be able to save a lot of time and space.

For example, the ID of each lock instance is 8 bytes on x86_64.
With this scheme, the 8 * 4 bytes spent on IDs become only 8 bytes.
I think this optimization is worth considering because of
the high frequency of lock events.
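
The scheme above could be sketched roughly as follows. This is only an
illustration: every struct, field, and function name here is hypothetical
and is not taken from the perf or lockdep sources, and the record layouts
are assumed for the sake of the arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of today's style: one record per event, each with the ID. */
struct lock_event_single {
	uint64_t lock_id;	/* 8 bytes on x86_64 */
	uint64_t timestamp;
};

/* Hypothetical combined record, generated once at release time. */
struct lock_event_combined {
	uint64_t lock_id;	/* stored once instead of 3 or 4 times */
	uint64_t t_acquire;
	uint64_t t_contended;	/* 0 when the lock was never contended */
	uint64_t t_acquired;
	uint64_t t_release;
};

/* Hypothetical per-thread scratch space for a sequence in progress. */
struct pending_lock {
	uint64_t lock_id;
	uint64_t t_acquire;
	uint64_t t_contended;
	uint64_t t_acquired;
};

/* acquire: start a new sequence, only remember the time. */
void on_acquire(struct pending_lock *p, uint64_t lock_id, uint64_t now)
{
	p->lock_id = lock_id;
	p->t_acquire = now;
	p->t_contended = 0;
	p->t_acquired = 0;
}

/* contended and acquired: store the time, generate no event. */
void on_contended(struct pending_lock *p, uint64_t now)
{
	p->t_contended = now;
}

void on_acquired(struct pending_lock *p, uint64_t now)
{
	p->t_acquired = now;
}

/* release: the only point where an event is actually generated. */
struct lock_event_combined on_release(const struct pending_lock *p,
				      uint64_t lock_id, uint64_t now)
{
	struct lock_event_combined ev;

	assert(p->lock_id == lock_id);	/* must match the pending sequence */
	ev.lock_id = p->lock_id;
	ev.t_acquire = p->t_acquire;
	ev.t_contended = p->t_contended;
	ev.t_acquired = p->t_acquired;
	ev.t_release = now;
	return ev;
}

/* Bytes spent on lock IDs when every event carries its own copy. */
size_t id_bytes_separate(size_t nr_events)
{
	return nr_events * sizeof(uint64_t);
}

/* Bytes spent on lock IDs with one combined event per sequence. */
size_t id_bytes_combined(void)
{
	return sizeof(uint64_t);
}
```

For a contended sequence (4 events), id_bytes_separate(4) is 8 * 4 bytes
while id_bytes_combined() is 8 bytes. The timestamps still have to be
recorded either way. Note that this sketch tracks only one lock per thread;
a real version would need one pending slot per held lock, since locks nest.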

What do you think?