Re: Perf and ftrace [was Re: PyTimechart]

From: Mathieu Desnoyers
Date: Wed May 12 2010 - 14:51:17 EST


* Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:
> On Wed, 2010-05-12 at 14:37 -0400, Mathieu Desnoyers wrote:
> > * Peter Zijlstra (peterz@xxxxxxxxxxxxx) wrote:
> > > On Wed, 2010-05-12 at 14:04 -0400, Mathieu Desnoyers wrote:
> > > > Can't we keep multiple references to each page ? (shared page) so it's still in
> > > > the buffer, also accessed by mmap(), and in addition accessed by splice.
> > >
> > > I'm not sure, the problem seems to be that a splice-consumer might want
> > > to inject the page into a whole different address-space, over-writing
> > > page->mapping/->index etc.
> >
> > OK, I see. In LTTng, I dropped the mmap() support when I integrated splice(). In
> > both cases, I can share the pages between the "output" (mmap or splice) and the
> > ring buffer because my ring buffer does not care about
> > page->mapping/->index/etc, so I never have to swap them.
> >
> > However, doing mmap() and splice() at the same time on the same pages seems
> > problematic for the reason you point out here (and not very useful anyway).
> > But I think restrictions could be done more transparently than what you propose,
> > e.g.:
> >
> > 1) create buffer -> return fd
> > (perform pfn alignment for the architecture worst-case, e.g. support mmap()
> > on sparc)
> >
> > 2a) mmap(fd)
> > return -EBUSY if any of the pages has non-NULL mapping.
> > 3a) munmap(fd)
> >
> > 2b) splice(fd)
> > return -EBUSY if any of the pages has non-NULL mapping.
> >
> > 2c) read(fd)
> > Could probably be done concurrently with splice() or mmap().
> >
> > This way we would ensure that only mmap or splice is used on the buffer at a
> > given time without crippling the API.
> >
> > Thoughts ?
>
> Right, so the problem is that we now use mmap() to size the buffer. I
> guess we could go adding a size attribute to perf_event_attr, but I
> think it makes more sense to separate the actual event and the output
> buffer objects.

It makes it hard to use splice() or read() if you don't specify the buffer size
at creation time. That alone seems like a pretty good argument for fixing the
size before the mmap() call.
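
To make the scheme quoted above concrete, here is a rough userspace model of
it. This is only a sketch under stated assumptions, not kernel code: the names
(struct ring_buf, rb_create, rb_mmap, rb_splice, rb_read, rb_release) are
hypothetical, and a single "owner" field stands in for the "any page has a
non-NULL mapping" check.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: which output path currently holds the buffer pages.
 * Stands in for checking page->mapping on every page. */
enum rb_owner { RB_FREE, RB_MMAP, RB_SPLICE };

struct ring_buf {
	size_t size;		/* fixed at creation time (step 1) */
	enum rb_owner owner;	/* who holds the pages right now */
};

/* Step 1: the size is given up front, so splice() and read() can be used
 * without a prior mmap() to size the buffer. */
int rb_create(struct ring_buf *rb, size_t size)
{
	if (size == 0)
		return -EINVAL;
	rb->size = size;
	rb->owner = RB_FREE;
	return 0;
}

/* Step 2a: refuse to map while another output path holds the pages. */
int rb_mmap(struct ring_buf *rb)
{
	if (rb->owner != RB_FREE)
		return -EBUSY;
	rb->owner = RB_MMAP;
	return 0;
}

/* Step 2b: refuse to splice while the pages are mapped. */
int rb_splice(struct ring_buf *rb)
{
	if (rb->owner == RB_MMAP)
		return -EBUSY;
	rb->owner = RB_SPLICE;
	return 0;
}

/* Step 2c: read() copies the data, so it never takes ownership of pages
 * and is allowed in every state. */
int rb_read(const struct ring_buf *rb)
{
	(void)rb;
	return 0;
}

/* Step 3a (munmap), or the spliced pages being handed back. */
void rb_release(struct ring_buf *rb)
{
	rb->owner = RB_FREE;
}
```

With this model, rb_splice() after a successful rb_mmap() returns -EBUSY until
rb_release() is called, and the reverse holds as well, while rb_read() succeeds
in either state.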

Thanks,

Mathieu


--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com