Re: [RFC] perf: perf record sets inherit by default

From: Stephane Eranian
Date: Mon May 17 2010 - 10:25:38 EST


On Tue, May 11, 2010 at 4:48 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Tue, 2010-05-11 at 16:04 +0200, Stephane Eranian wrote:
>> Hi,
>>
>>
>> I am confused by the inheritance cmd line option of perf record:
>>
>> $ perf record -h
>>  usage: perf record [<options>] [<command>]
>>     or: perf record [<options>] -- <command> [<options>]
>>
>>     -e, --event <event>   event selector. use 'perf list' to list available events
>>         --filter <filter>
>>                           event filter
>>     -p, --pid <n>         record events on existing process id
>>     -t, --tid <n>         record events on existing thread id
>>     -r, --realtime <n>    collect data with this RT SCHED_FIFO priority
>>     -R, --raw-samples     collect raw sample records from all opened counters
>>     -a, --all-cpus        system-wide collection from all CPUs
>>     -A, --append          append to the output file to do incremental profiling
>>     -C, --profile_cpu <n>
>>                           CPU to profile on
>>     -f, --force           overwrite existing data file (deprecated)
>>     -c, --count           event period to sample
>>     -o, --output <file>   output file name
>>     -i, --inherit         child tasks inherit counters
>>
>> This leads one to believe that, by default, inheritance by child tasks is off.
>>
>> However, builtin-record.c says:
>>
>> static bool inherit = true;
>>
>> If that's the case, what's the point of the -i option?
>
> Right, I think we should invert that, does --no-inherit work?
>
>> Another side effect of inheritance is that in per-thread mode,
>> perf creates as many "sessions" as you have CPUs. So
>> on a 16-way processor, sampling on cycles, perf creates
>> 16 events and 16 x 2-page sampling buffers. That's a lot of
>> resources consumed if I am just interested in monitoring
>> a single-threaded workload.
>
> Right, but I think the default of inherit is right, and once you do that
> you basically have to do the per-task-per-cpu thing, otherwise your
> fancy 16-way will start spending most of its time in cacheline bounces.
>
In that case, don't you think you should also ensure that each buffer is
allocated on the NUMA node of the CPU that its per-thread-per-cpu event is
bound to? I don't think that is the case today.
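
To put numbers on the per-thread case quoted above, here is a minimal
sketch of the arithmetic. The helper names are hypothetical and not part
of perf, and the 4096-byte page size is an assumption. It counts only the
sampling pages mentioned in the thread, not any extra control page that
may be mapped alongside them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a perf function): with inheritance on,
 * a single monitored thread gets one event per CPU. */
static size_t inherit_event_count(size_t n_cpus)
{
    return n_cpus;
}

/* Total bytes of sampling buffers: one buffer per per-CPU event. */
static size_t inherit_buffer_bytes(size_t n_cpus,
                                   size_t pages_per_buffer,
                                   size_t page_size)
{
    return inherit_event_count(n_cpus) * pages_per_buffer * page_size;
}

/* Self-check for the 16-way example from the thread,
 * assuming 4096-byte pages. */
static void check_16_way_example(void)
{
    assert(inherit_event_count(16) == 16);
    assert(inherit_buffer_bytes(16, 2, 4096) == 131072); /* 128 KiB */
}
```

Under those assumptions, the 16-way machine ends up with 16 events and
128 KiB of sampling buffers for one single-threaded workload, versus one
event and 8 KiB if only a single buffer were needed.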
