Re: [PATCH v6 0/4] perf: add support for profiling jitted code

From: Brendan Gregg
Date: Tue Mar 31 2015 - 17:32:13 EST


On Tue, Mar 31, 2015 at 12:33 AM, Brendan Gregg
<brendan.d.gregg@xxxxxxxxx> wrote:
> G'Day Stephane,
>
> On Mon, Mar 30, 2015 at 3:19 PM, Stephane Eranian <eranian@xxxxxxxxxx> wrote:
> [...]
>> The current support only works when the runtime is monitored from
>> start to finish: perf record java --agentpath:libpfmjvmti.so my_class.
>>
>> Once the run is completed, the jitdump file needs to be injected into
>> the perf.data file. This is accomplished by using the perf inject command.
>> This will also generate an ELF image for each jitted function. The
>> inject MMAP records will point to those ELF images. The reasoning
>> behind using ELF images is that it makes processing for perf report
>> and annotate automatic and transparent. It also makes it easier to
>> package and analyze on a remote machine.
> [...]
>
> This is really impressive work. Do we have an idea of the overhead for
> running the java agent?
>
> Today, I'm using perf-map-agent, loaded dynamically, to dump a
> /tmp/perf*.map file as needed. My company has tens of thousands of
> Linux instances running Java, but very few need profiling, and we
> don't know which beforehand. So a snapshot-on-demand approach is
> ideal. An always-on approach, well, we'd have to know the overhead (I
> can build the agent and test...).

I built the agent and tested it with an application framework
micro-benchmark. The performance overhead (measured as the reduction
in maximum req/sec given fixed CPU capacity) dropped from about 13%
right after startup to 1.1% after a minute, and then to 0.13% (which
is really just noise) after several minutes of high load.
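
To make the metric explicit: overhead here is the relative drop in peak
throughput with the agent loaded. A minimal sketch of that arithmetic
follows. The function name and the req/sec values are hypothetical,
picked only to illustrate the calculation; they are not the measured
figures from this test.

```python
def overhead_pct(baseline_rps, agent_rps):
    """Percent reduction in peak throughput attributed to the agent."""
    if baseline_rps <= 0:
        raise ValueError("baseline throughput must be positive")
    return (baseline_rps - agent_rps) / baseline_rps * 100.0


# Hypothetical numbers, chosen only to illustrate the arithmetic.
baseline = 10000.0    # max req/sec without the agent (assumed)
with_agent = 8700.0   # max req/sec with the agent loaded (assumed)

print(f"overhead: {overhead_pct(baseline, with_agent):.1f}%")
```

Both runs need the same fixed CPU capacity for the comparison to mean
anything, since the metric assumes the workload is CPU bound.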

So the overhead is basically zero after (minutes of) warmup, at least
for my test. My jit.dump file reached 8 Mbytes, and was growing by a
tiny amount every 30 seconds or so (hence the near-zero overhead). I'm
much less concerned about overheads now.

I'll test with a production workload if I can... But I'm still curious
about why we're even doing this, instead of the previous method of
taking symbol snapshots. Is there a backstory? If it involves a case
of high symbol churn, then constantly logging that churn should also
mean non-zero overhead.
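
For context on the snapshot method: perf reads a /tmp/perf-<pid>.map
text file in which each line is a hex start address, a hex size, and a
symbol name (the rest of the line). A rough sketch of resolving a
sampled address against such a snapshot is below. The helper names and
the sample entries are made up for illustration, and this is not how
perf itself implements the lookup.

```python
import bisect


def parse_perf_map(text):
    """Parse perf map text into a sorted list of (start, size, name)."""
    entries = []
    for line in text.splitlines():
        # The symbol name is the rest of the line and may contain spaces.
        parts = line.strip().split(" ", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, size, name = parts
        entries.append((int(start, 16), int(size, 16), name))
    entries.sort()
    return entries


def resolve(entries, addr):
    """Return the symbol whose [start, start + size) range covers addr."""
    starts = [e[0] for e in entries]
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        start, size, name = entries[i]
        if addr < start + size:
            return name
    return None  # address not covered by this snapshot


# Made-up snapshot contents, for illustration only.
sample = (
    "7f1a2c000000 40 Interpreter\n"
    "7f1a2c000100 1a0 Ljava/lang/String;::hashCode\n"
)
entries = parse_perf_map(sample)
print(resolve(entries, 0x7F1A2C000110))
```

A snapshot only describes the symbols present when it was taken, so
code that is compiled or moved afterwards resolves to nothing (or to a
stale name), which is where symbol churn would matter.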

Brendan