Re: [PATCH v2 1/3] perf/core: Add a tracepoint for perf sampling

From: Peter Zijlstra
Date: Mon Aug 08 2016 - 05:57:27 EST


On Fri, Aug 05, 2016 at 10:22:08AM -0700, Brendan Gregg wrote:

> (Normally I'd use I$ miss overflow, but none of our Linux systems have
> PMCs: cloud.)

I think I had better not comment on that ;-)

> >> > The perf:perf_hrtimer probe point is also reading state mid-way
> >> > through a function, so it's not quite as simple as wrapping the
> >> > function pointer. I do like that idea, though, but for things like
> >> > struct file_operations.
> >
> > So what additional state do you need?
>
> I was pulling in regs after get_irq_regs(), struct perf_event *event
> after it's populated. Not that hard to duplicate. Just noting it
> didn't map directly to the function entry.

Right, both of which are available to the overflow handler.

> I wanted perf_event just for event->ctx->task->pid, so that a BPF
> program can differentiate between its samples and other concurrent
> sessions.
>
> (I was thinking of changing my patch to expose pid_t instead of
> perf_event, since I was noticing it didn't add many instructions.)

Slightly confused, event->ctx->task == current, no? We flip that pointer
when we flip the contexts.

At which point, it should be the same as SAMPLE_TID.

!?

> [...]
> >> instead of adding a tracepoint to perf_swevent_hrtimer we can replace
> >> overflow_handler for that particular event with some form of bpf wrapper.
> >> (probably new bpf program type). Then not only periodic events
> >> will be triggering bpf prog, but pmu events as well.
> >
> > Exactly.
>
> Although the timer use case is a bit different, and is via
> hwc->hrtimer.function = perf_swevent_hrtimer.

Still not entirely sure why you could not hook into
event->overflow_handler.
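
To make the suggestion concrete, here is a rough userspace model of what hooking event->overflow_handler could look like. It is a sketch under assumptions, not kernel code: the structs are cut-down stand-ins, and orig_handler, prog, attach_prog(), wrapped_overflow() and the two model_*() helpers are hypothetical names invented for illustration. The only point it tries to show is that if the timer path and the PMU path both end up calling through the same pointer, one wrapper covers both.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins for illustration; NOT the kernel's definitions. */
struct pt_regs { unsigned long ip; };
struct perf_sample_data { unsigned long period; };
struct perf_event;

typedef void (*overflow_handler_t)(struct perf_event *event,
                                   struct perf_sample_data *data,
                                   struct pt_regs *regs);
typedef int (*sample_prog_t)(struct perf_event *event, struct pt_regs *regs);

struct perf_event {
    overflow_handler_t overflow_handler; /* called on every overflow */
    overflow_handler_t orig_handler;     /* hypothetical: saved original */
    sample_prog_t prog;                  /* hypothetical: BPF-like program */
    unsigned long samples;               /* bumped by the default handler */
    unsigned long prog_runs;             /* bumped by the attached program */
};

/* Stand-in for the normal output path: just count the sample. */
static void default_overflow(struct perf_event *event,
                             struct perf_sample_data *data,
                             struct pt_regs *regs)
{
    (void)data; (void)regs;
    event->samples++;
}

/* Hypothetical program: sees the event and regs, as the handler does. */
static int count_prog(struct perf_event *event, struct pt_regs *regs)
{
    (void)regs;
    event->prog_runs++;
    return 0;
}

/* Wrapper: run the attached program, then chain to the original handler. */
static void wrapped_overflow(struct perf_event *event,
                             struct perf_sample_data *data,
                             struct pt_regs *regs)
{
    if (event->prog)
        event->prog(event, regs);
    event->orig_handler(event, data, regs);
}

/* Swap the handler pointer once, remembering the original. */
static void attach_prog(struct perf_event *event, sample_prog_t prog)
{
    assert(event->overflow_handler != NULL);
    assert(event->overflow_handler != wrapped_overflow); /* no double wrap */
    event->orig_handler = event->overflow_handler;
    event->prog = prog;
    event->overflow_handler = wrapped_overflow;
}

/* Model of the timer path: a tick ends in the overflow handler. */
static void model_hrtimer_tick(struct perf_event *event, struct pt_regs *regs)
{
    struct perf_sample_data data = { .period = 1 };
    event->overflow_handler(event, &data, regs);
}

/* Model of the PMU path: an interrupt ends in the same handler. */
static void model_pmu_interrupt(struct perf_event *event, struct pt_regs *regs)
{
    struct perf_sample_data data = { .period = 1 };
    event->overflow_handler(event, &data, regs);
}
```

In this model, calling attach_prog(&ev, count_prog) on an event whose overflow_handler is default_overflow makes every later model_hrtimer_tick() or model_pmu_interrupt() run the program first and then the original handler, with no separate probe point in the timer function.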