Re: [PATCHv6 00/25] perf stat: Add scripting support

From: Jiri Olsa
Date: Wed Dec 02 2015 - 08:59:32 EST


On Wed, Dec 02, 2015 at 01:51:13PM +0000, Liang, Kan wrote:
> Hi Arnaldo and Jirka,
>
> Any update about the status of this patchset?

there's first batch waiting in Arnaldo's perf/stat
it'll get in soon hopefully ;-)

I'll rebase the rest on top of it, once it's in
and resend..

jirka

>
> Thanks,
> Kan
>
> > hi,
> > sending another version of stat scripting.
> >
> > v6 changes:
> > - several patches from v4 already taken
> > - perf stat record can now place 'record' keyword
> > anywhere within stat options
> > - placed STAT feature checking earlier into record
> > patches so commands processing perf.data recognize
> > stat data and skip sample_type checking
> > - rebased on Arnaldo's perf/stat
> > - added Tested-by: Kan Liang <kan.liang@xxxxxxxxx>
> >
> > v5 changes:
> > - several patches from v4 already taken
> > - using u16 for cpu number in cpu_map_event
> > - renamed PERF_RECORD_HEADER_ATTR_UPDATE to
> > PERF_RECORD_EVENT_UPDATE
> > - moved low-hanging fruit patches to the start of the patchset
> > - patchset tested by Kan Liang, thanks!
> >
> > v4 changes:
> > - added attr update event for event's cpumask
> > - forbid aggregation on task workloads
> > - some minor reorders and changelog fixes
> >
> > v3 changes:
> > - added attr update event to handle unit,scale,name for event
> > it fixed the uncore_imc_1/cas_count_read/ record/report
> > - perf report -D now displays stat related events
> > - some minor and changelog fixes
> >
> > v2 changes:
> > - rebased to latest Arnaldo's perf/core
> > - patches 1 to 11 already merged in
> > - added --per-core/--per-socket/-A options for perf stat report
> > command to allow custom aggregation in stat report, please
> > check new examples below
> > - a couple of changelog changes
> >
> > The initial attempt defined its own formula language and allowed triggering
> > a user's script at the end of the stat command:
> > http://marc.info/?l=linux-kernel&m=136742146322273&w=2
> >
> > This patchset abandons the idea of a new formula language and instead adds
> > support to:
> > - store stat data into perf.data file
> > - add python support to process stat events
> >
> > Basically it allows storing stat data in perf.data and post-processing it
> > with python scripts, in a similar way to what we do for sampling data.
> >
> > The stat data are stored in new stat, stat-round, stat-config user events.
> > stat - stored for each read syscall of the counter
> > stat round - stored for each interval or end of the command invocation
> > stat config - stores all the config information needed to process data
> > so report tool could restore the same output as record
> >
> > The python script can now define 'stat__<eventname>_<modifier>'
> > functions to get stat events data and 'stat__interval' to get stat-round data.
> >
> > See CPI script example in scripts/python/stat-cpi.py.
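
For illustration only, a minimal sketch of what such a script could look like, written so that it also runs outside perf. The handler names follow the 'stat__<eventname>_<modifier>' convention described above; the argument order (cpu, thread, time, val, ena, run), the `counts` dictionary and the `cpi_line` helper are assumptions made for this sketch, so the shipped stat-cpi.py may well differ.

```python
# Sketch of a CPI (cycles per instruction) stat script.
# Assumption: perf calls each stat handler as (cpu, thread, time, val, ena, run)
# with time in nanoseconds. Check the shipped stat-cpi.py for the real interface.

counts = {}  # (time, event, cpu, thread) -> counter value


def stat__cycles_u(cpu, thread, time, val, ena, run):
    # Handler for the 'cycles:u' event.
    counts[(time, "cycles", cpu, thread)] = val


def stat__instructions_u(cpu, thread, time, val, ena, run):
    # Handler for the 'instructions:u' event.
    counts[(time, "instructions", cpu, thread)] = val


def cpi_line(time, cpu, thread):
    """Hypothetical helper: format one CPI line for a (cpu, thread) pair."""
    cyc = counts.get((time, "cycles", cpu, thread), 0)
    ins = counts.get((time, "instructions", cpu, thread), 0)
    cpi = cyc / float(ins) if ins else 0.0
    return "%f: cpu %d, thread %d -> cpi %f (%d/%d)" % (
        time / 1e9, cpu, thread, cpi, cyc, ins)


def stat__interval(time):
    # Called once per stat-round event: report every (cpu, thread) seen for it.
    pairs = sorted({(c, t) for (tm, _, c, t) in counts if tm == time})
    for cpu, thread in pairs:
        print(cpi_line(time, cpu, thread))


if __name__ == "__main__":
    # Outside perf: replay the first interval of the system-wide example.
    # The ena/run values are placeholders (0), they are not used for CPI.
    stat__cycles_u(-1, -1, 1000265471, 462311482, 0, 0)
    stat__instructions_u(-1, -1, 1000265471, 590037440, 0, 0)
    stat__interval(1000265471)
```

Keying the counts by (time, event, cpu, thread) keeps the cycles and instructions values of one interval together, so the interval handler only has to look up the two values and divide.
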
> >
> > Also available in:
> > git://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
> > perf/stat_script
> >
> > thanks,
> > jirka
> >
> > Examples:
> >
> > - To record data for command stat workload:
> >
> > $ perf stat record kill
> > ...
> >
> > Performance counter stats for 'kill':
> >
> > 0.372007 task-clock (msec) # 0.613 CPUs utilized
> > 3 context-switches # 0.008 M/sec
> > 0 cpu-migrations # 0.000 K/sec
> > 62 page-faults # 0.167 M/sec
> > 1,129,973 cycles # 3.038 GHz
> > <not supported> stalled-cycles-frontend
> > <not supported> stalled-cycles-backend
> > 813,313 instructions # 0.72 insns per cycle
> > 166,161 branches # 446.661 M/sec
> > 8,747 branch-misses # 5.26% of all branches
> >
> > 0.000607287 seconds time elapsed
> >
> > - To report perf stat data:
> >
> > $ perf stat report
> >
> > Performance counter stats for '/home/jolsa/bin/perf stat record kill':
> >
> > 0.372007 task-clock (msec) # inf CPUs utilized
> > 3 context-switches # 0.008 M/sec
> > 0 cpu-migrations # 0.000 K/sec
> > 62 page-faults # 0.167 M/sec
> > 1,129,973 cycles # 3.038 GHz
> > <not supported> stalled-cycles-frontend
> > <not supported> stalled-cycles-backend
> > 813,313 instructions # 0.72 insns per cycle
> > 166,161 branches # 446.661 M/sec
> > 8,747 branch-misses # 5.26% of all branches
> >
> > 0.000000000 seconds time elapsed
> >
> > - To store system-wide period stat data:
> >
> > $ perf stat -e cycles:u,instructions:u -a -I 1000 record
> > # time counts unit events
> > 1.000265471 462,311,482 cycles:u (100.00%)
> > 1.000265471 590,037,440 instructions:u
> > 2.000483453 722,532,336 cycles:u (100.00%)
> > 2.000483453 848,678,197 instructions:u
> > 3.000759876 75,990,880 cycles:u (100.00%)
> > 3.000759876 86,187,813 instructions:u
> > ^C 3.213960893 85,329,533 cycles:u (100.00%)
> > 3.213960893 135,954,296 instructions:u
> >
> > - To report perf stat data:
> >
> > $ perf stat report
> > # time counts unit events
> > 1.000265471 462,311,482 cycles:u (100.00%)
> > 1.000265471 590,037,440 instructions:u
> > 2.000483453 722,532,336 cycles:u (100.00%)
> > 2.000483453 848,678,197 instructions:u
> > 3.000759876 75,990,880 cycles:u (100.00%)
> > 3.000759876 86,187,813 instructions:u
> > 3.213960893 85,329,533 cycles:u (100.00%)
> > 3.213960893 135,954,296 instructions:u
> >
> > - To run stat-cpi.py script over perf.data:
> >
> > $ perf script -s scripts/python/stat-cpi.py
> > 1.000265: cpu -1, thread -1 -> cpi 0.783529 (462311482/590037440)
> > 2.000483: cpu -1, thread -1 -> cpi 0.851362 (722532336/848678197)
> > 3.000760: cpu -1, thread -1 -> cpi 0.881689 (75990880/86187813)
> > 3.213961: cpu -1, thread -1 -> cpi 0.627634 (85329533/135954296)
> >
> > - To pipe data from stat to stat-cpi script:
> >
> > $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf script -s scripts/python/stat-cpi.py
> > 1.000192: cpu 0, thread -1 -> cpi 0.739535 (23921908/32347236)
> > 2.000376: cpu 0, thread -1 -> cpi 1.663482 (2519340/1514498)
> > 3.000621: cpu 0, thread -1 -> cpi 1.396308 (16162767/11575362)
> > 4.000700: cpu 0, thread -1 -> cpi 1.092246 (20077258/18381624)
> > 5.000867: cpu 0, thread -1 -> cpi 0.473816 (45157586/95306156)
> > 6.001034: cpu 0, thread -1 -> cpi 0.532792 (43701668/82023818)
> > 7.001195: cpu 0, thread -1 -> cpi 1.122059 (29890042/26638561)
> >
> > - Raw script stat data output:
> >
> > $ perf stat -e cycles:u,instructions:u -A -C 0 -I 1000 record | perf --no-pager script
> > CPU THREAD VAL ENA RUN TIME EVENT
> > 0 -1 12302059 1000811347 1000810712 1000198821 cycles:u
> > 0 -1 2565362 1000823218 1000823218 1000198821 instructions:u
> > 0 -1 14453353 1000812704 1000812704 2000382283 cycles:u
> > 0 -1 4600932 1000799342 1000799342 2000382283 instructions:u
> > 0 -1 15245106 1000774425 1000774425 3000538255 cycles:u
> > 0 -1 2624324 1000769310 1000769310 3000538255 instructions:u
> >
> > - To display different aggregation in report:
> >
> > $ perf stat -e cycles -a -I 1000 record sleep 3
> > # time counts unit events
> > 1.000223609 703,427,617 cycles
> > 2.000443651 609,975,307 cycles
> > 3.000569616 668,479,597 cycles
> > 3.000735323 1,155,816 cycles
> >
> > $ perf stat report
> > # time counts unit events
> > 1.000223609 703,427,617 cycles
> > 2.000443651 609,975,307 cycles
> > 3.000569616 668,479,597 cycles
> > 3.000735323 1,155,816 cycles
> >
> > $ perf stat report --per-core
> > # time core cpus counts unit events
> > 1.000223609 S0-C0 2 327,612,412 cycles
> > 1.000223609 S0-C1 2 375,815,205 cycles
> > 2.000443651 S0-C0 2 287,462,177 cycles
> > 2.000443651 S0-C1 2 322,513,130 cycles
> > 3.000569616 S0-C0 2 271,571,908 cycles
> > 3.000569616 S0-C1 2 396,907,689 cycles
> > 3.000735323 S0-C0 2 694,977 cycles
> > 3.000735323 S0-C1 2 460,839 cycles
> >
> > $ perf stat report --per-socket
> > # time socket cpus counts unit events
> > 1.000223609 S0 4 703,427,617 cycles
> > 2.000443651 S0 4 609,975,307 cycles
> > 3.000569616 S0 4 668,479,597 cycles
> > 3.000735323 S0 4 1,155,816 cycles
> >
> > $ perf stat report -A
> > # time CPU counts unit events
> > 1.000223609 CPU0 205,431,505 cycles
> > 1.000223609 CPU1 122,180,907 cycles
> > 1.000223609 CPU2 176,649,682 cycles
> > 1.000223609 CPU3 199,165,523 cycles
> > 2.000443651 CPU0 148,447,922 cycles
> > 2.000443651 CPU1 139,014,255 cycles
> > 2.000443651 CPU2 204,436,559 cycles
> > 2.000443651 CPU3 118,076,571 cycles
> > 3.000569616 CPU0 149,788,954 cycles
> > 3.000569616 CPU1 121,782,954 cycles
> > 3.000569616 CPU2 247,277,700 cycles
> > 3.000569616 CPU3 149,629,989 cycles
> > 3.000735323 CPU0 269,675 cycles
> > 3.000735323 CPU1 425,302 cycles
> > 3.000735323 CPU2 364,169 cycles
> > 3.000735323 CPU3 96,670 cycles
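
As a worked check of how the three aggregation modes relate, here is a small sketch that sums the per-CPU counts of the first interval into per-core and per-socket totals. The CPU to core mapping (CPU0/CPU1 on S0-C0, CPU2/CPU3 on S0-C1) is an assumption inferred from the numbers in the example output, and the `aggregate` helper is hypothetical; it only illustrates the arithmetic and is not how perf itself builds its aggr_map.

```python
from collections import defaultdict

# Per-CPU cycle counts for the first interval (1.000223609),
# copied from the 'perf stat report -A' output.
per_cpu = {0: 205431505, 1: 122180907, 2: 176649682, 3: 199165523}

# Assumed topology: cpu -> (socket, core). Inferred from the example sums,
# not read from perf.data.
topology = {0: (0, 0), 1: (0, 0), 2: (0, 1), 3: (0, 1)}


def aggregate(counts, label):
    """Hypothetical helper: sum per-CPU counts grouped by label(socket, core)."""
    totals = defaultdict(int)
    for cpu, val in counts.items():
        socket, core = topology[cpu]
        totals[label(socket, core)] += val
    return dict(totals)


per_core = aggregate(per_cpu, lambda s, c: "S%d-C%d" % (s, c))
per_socket = aggregate(per_cpu, lambda s, c: "S%d" % s)

print(per_core)
print(per_socket)
```

With that assumed mapping the totals match the --per-core and --per-socket lines for the same timestamp: 327,612,412 and 375,815,205 per core, and 703,427,617 for the socket.
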
> >
> >
> > Cc: Andi Kleen <andi@xxxxxxxxxxxxxx>
> > Cc: Ulrich Drepper <drepper@xxxxxxxxx>
> > Cc: Will Deacon <will.deacon@xxxxxxx>
> > Cc: Stephane Eranian <eranian@xxxxxxxxxx>
> > Cc: Don Zickus <dzickus@xxxxxxxxxx>
> > Tested-by: Kan Liang <kan.liang@xxxxxxxxx>
> > ---
> > Jiri Olsa (25):
> > perf stat: Make stat options global
> > perf stat record: Add record command
> > perf stat record: Initialize record features
> > perf stat record: Synthesize stat record data
> > perf stat record: Store events IDs in perf data file
> > perf stat record: Add pipe support for record command
> > perf stat record: Write stat events on record
> > perf stat record: Write stat round events on record
> > perf stat record: Do not allow record with multiple runs mode
> > perf stat record: Synthesize event update events
> > perf stat report: Add report command
> > perf stat report: Process cpu/threads maps
> > perf stat report: Process stat config event
> > perf stat report: Add support to initialize aggr_map from file
> > perf stat report: Process stat and stat round events
> > perf stat report: Process event update events
> > perf stat report: Move csv_sep initialization before report command
> > perf stat report: Allow to override aggr_mode
> > perf script: Process cpu/threads maps
> > perf script: Process stat config event
> > perf script: Add process_stat/process_stat_interval scripting interface
> > perf script: Add stat default handlers
> > perf script: Display stat events by default
> > perf script: Add python support for stat events
> > perf script: Add stat-cpi.py script
> >
> > tools/perf/Documentation/perf-stat.txt | 34 ++++
> > tools/perf/builtin-script.c | 139 +++++++++++++++
> > tools/perf/builtin-stat.c | 742 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------
> > tools/perf/scripts/python/stat-cpi.py | 74 ++++++++
> > tools/perf/util/evlist.c | 6 +-
> > tools/perf/util/evlist.h | 3 +
> > tools/perf/util/scripting-engines/trace-event-python.c | 114 +++++++++++-
> > tools/perf/util/session.c | 3 +
> > tools/perf/util/trace-event.h | 4 +
> > 9 files changed, 1021 insertions(+), 98 deletions(-)
> > create mode 100644 tools/perf/scripts/python/stat-cpi.py