Re: [RFC 2/2] perf: Marker software event and ioctl

From: Arnaldo Carvalho de Melo
Date: Mon Sep 15 2014 - 14:31:14 EST


On Mon, Sep 15, 2014 at 06:27:14PM +0100, Pawel Moll wrote:
> On Fri, 2014-09-12 at 17:19 +0100, Arnaldo Carvalho de Melo wrote:
> > On Fri, Sep 12, 2014 at 02:58:55PM +0100, Pawel Moll wrote:
> > > On Fri, 2014-09-12 at 14:49 +0100, Arnaldo Carvalho de Melo wrote:
> > > > Perhaps both? I.e. a u64 followed by a string; if the u64 is zero,
> > > > then there is a string right after it?
> >
> > > What would this look like in userspace? Something like this?
> >
> > > 8<----
> > > struct perf_event_marker {
> > > 	uint64_t value;
> > > 	char *string;
> > > } arg;
> >
> > > arg.value = 0x1234;
> >
> > > /* or */
> >
> > > arg.value = 0;
> > > arg.string = "abcd";
> >
> > > ioctl(fd, PERF_EVENT_IOC_MARKER, &arg);
> > > 8<----
> >
> > > If so, maybe it would be simpler just to go for the classic size/data
> > > structure?
> >
> > > 8<-----
> > > struct perf_event_marker {
> > > 	uint32_t size;
> > > 	void *data;
> > > };
> > > 8<-----
> >
> > > This would directly map into struct perf_raw_record...
> >
> > I can see the usefulness of having it all, i.e. if we do just:
> >
> > perf trace --pid `pidof some-tool-in-debug-mode-using-this-interface`
>
> Hm. I haven't thought about a situation where a 3rd party wants to inject
> something into "my" data stream... I guess it could be implemented (a

I was thinking about intercepting calls that pass some logging data, as
strings, and 'tee' them to the 'perf trace' 'data stream'.

> "pid" member of the struct perf_event_marker with default 0 meaning

Hmm, isn't PERF_SAMPLE_TID enough for that?

> "myself"?), but will definitely complicate the patch. Should I have a
> look at it now or maybe leave it till we get a general agreement about
> the marker ioctl existence?
>
> > Then 'perf trace' doesn't know about any binary format a tool may have,
> > getting strings there (hey, LD_PRELOADing some logging library to hook
> > into this comes to mind) and having it merged with other events
> > (syscalls, pagefaults, etc) looks useful.
>
> But do you still mean a "magic" u64 before the rest? Injecting a string
> would just mean:
>
> marker.size = strlen(s) + 1;
> marker.data = s;
>
> > As well as some specialized version of 'perf trace' that knows about
> > some binary protocol that would get app specific stats or lock status,
> > etc, perhaps even plugins for 'perf trace' that would be selected by
> > that first u64? Also seems useful.
> >
> > I.e. having a way to provide just strings and another that would allow
> > passing perf_raw_record.
>
> Sounds interesting. But then maybe this stuff shouldn't go into "raw"
> then? It could be something like this in the sample:
>
> { u64  type;        /* 0 means zero-terminated string in data */
>   u32  size;
>   char data[size]; } && PERF_SAMPLE_MARKER

Yes, this is how I think it should be.
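
To make the layout above concrete, here is a rough userspace sketch of how
such a record could be packed and unpacked. It is only an illustration under
assumptions: PERF_SAMPLE_MARKER does not exist yet, the names marker_encode(),
marker_encode_string(), marker_decode() and struct marker_view are made up for
this example, and any padding or alignment the real sample format would need
is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: type 0 means "data holds a zero-terminated string". */
#define MARKER_TYPE_STRING 0ULL

/* Fixed part of the proposed record: { u64 type; u32 size; char data[size]; } */
#define MARKER_HDR_LEN (sizeof(uint64_t) + sizeof(uint32_t))

/* Decoded view of a record; data points into the caller's buffer. */
struct marker_view {
	uint64_t type;
	uint32_t size;
	const char *data;
};

/* Pack a record into buf. Returns bytes written, or 0 if it does not fit. */
static size_t marker_encode(void *buf, size_t buflen, uint64_t type,
			    const void *data, uint32_t size)
{
	unsigned char *p = buf;

	assert(buf != NULL);
	if (buflen < MARKER_HDR_LEN || size > buflen - MARKER_HDR_LEN)
		return 0;

	memcpy(p, &type, sizeof(type));
	memcpy(p + sizeof(type), &size, sizeof(size));
	if (size)
		memcpy(p + MARKER_HDR_LEN, data, size);

	return MARKER_HDR_LEN + size;
}

/* String case: type 0, size = strlen(s) + 1 so the terminator is included. */
static size_t marker_encode_string(void *buf, size_t buflen, const char *s)
{
	size_t n = strlen(s) + 1;

	if (n > UINT32_MAX)
		return 0;
	return marker_encode(buf, buflen, MARKER_TYPE_STRING, s, (uint32_t)n);
}

/* Unpack a record. Returns 0 on success, -1 if truncated or malformed. */
static int marker_decode(const void *buf, size_t buflen, struct marker_view *out)
{
	const unsigned char *p = buf;

	if (buflen < MARKER_HDR_LEN)
		return -1;

	memcpy(&out->type, p, sizeof(out->type));
	memcpy(&out->size, p + sizeof(out->type), sizeof(out->size));
	if (out->size > buflen - MARKER_HDR_LEN)
		return -1;
	out->data = (const char *)(p + MARKER_HDR_LEN);

	/* A type 0 payload must really be a terminated string. */
	if (out->type == MARKER_TYPE_STRING &&
	    (out->size == 0 || out->data[out->size - 1] != '\0'))
		return -1;

	return 0;
}
```

A generic consumer such as 'perf trace' would only need to handle the type 0
case and could print data as-is, while a nonzero type would let a specialized
tool or plugin pick its own decoder for the binary payload.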

> Pawel
>