Re: [PATCH 1/2] ring-buffer: Introducing ring-buffer mapping functions

From: Steven Rostedt
Date: Tue Mar 21 2023 - 11:41:12 EST


On Tue, 21 Mar 2023 15:17:15 +0000
Vincent Donnefort <vdonnefort@xxxxxxxxxx> wrote:

> On Mon, Mar 20, 2023 at 09:45:16PM -0400, Steven Rostedt wrote:
> > On Fri, 17 Mar 2023 14:33:09 +0000
> > Vincent Donnefort <vdonnefort@xxxxxxxxxx> wrote:
> >
> > > Also, the meta-page being... a single page, this limits at the moment the
> > > number of pages in the ring-buffer that can be mapped: ~3MB on a 4K pages
> > > system.
> >
> > I hate this limitation, so I fixed it ;-)
>
> Thanks a lot for having a look. Do you mind if I fold this in my patch for a V2?

Hold off, I found some bugs that I'm fixing ;-)

>
> >
> > I added a meta_page_size field to the meta page, and user space can do:
> >
> > 	meta = mmap(NULL, page_size, PROT_READ, MAP_SHARED, fd, 0);
> > 	if (meta == MAP_FAILED)
> > 		pdie("mmap");
> >
> > 	map = meta;
> > 	meta_len = map->meta_page_size;
> >
> > 	if (meta_len > page_size) {
> > 		munmap(meta, page_size);
> > 		meta = mmap(NULL, meta_len, PROT_READ, MAP_SHARED, fd, 0);
> > 		if (meta == MAP_FAILED)
> > 			pdie("mmap");
> > 		map = meta;
> > 	}
> >
> > This appears to work (but I'm still testing it).
> >
> > -- Steve
> >
> > diff --git a/include/uapi/linux/trace_mmap.h b/include/uapi/linux/trace_mmap.h
> > index 24bcec754a35..12f3f7ee33d9 100644
> > --- a/include/uapi/linux/trace_mmap.h
> > +++ b/include/uapi/linux/trace_mmap.h
> > @@ -18,6 +18,7 @@ struct ring_buffer_meta_page {
> > __u32 reader_page;
> > __u32 nr_data_pages; /* doesn't take into account the reader_page */
> > __u32 data_page_head; /* index of data_pages[] */
> > + __u32 meta_page_size; /* size of the meta page */
>
> Do we want a specific field here? That could be deduced from nr_data_pages
> quite easily?

I'd rather not have too much knowledge of implementation details in user
space. Dropping the field only saves a single entry, and keeping it makes
user space simpler. In fact, I'm thinking we should not include
"__u32 data_pages[]" but instead add a "__u32 data_start", where user space
does:

__u32 *data_pages = (__u32 *)meta_page + meta_page->data_start;

That way we could extend the data provided by the meta_page in the future.
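
As a rough sketch of what the user-space side could look like: the struct
below is a hypothetical layout, not the one from the patch, and it assumes
data_start is counted in __u32 units (which is what the pointer arithmetic
above implies). The helper name meta_data_pages() is made up for
illustration. It also bounds-checks the offset against the mapped length,
since user space would no longer know the layout at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical meta page header; field order and data_start are assumptions. */
struct meta_sketch {
	uint32_t reader_page;
	uint32_t nr_data_pages;
	uint32_t data_page_head;
	uint32_t meta_page_size;
	uint32_t data_start;	/* assumed: offset of the ID array, in __u32 units */
};

/*
 * Locate the data page ID array through data_start instead of a fixed
 * flexible array member. Returns NULL if the array would overlap the
 * header or run past the end of the mapping.
 */
static const uint32_t *meta_data_pages(const void *meta, size_t meta_len)
{
	const struct meta_sketch *m = meta;
	size_t start, len;

	/* The caller must have mapped at least the fixed header. */
	assert(meta_len >= sizeof(*m));

	start = (size_t)m->data_start * sizeof(uint32_t);
	len = (size_t)m->nr_data_pages * sizeof(uint32_t);

	if (start < sizeof(*m) || start > meta_len || len > meta_len - start)
		return NULL;

	return (const uint32_t *)meta + m->data_start;
}
```

With something like this, fields added to the header later only move
data_start; a reader built against the older header still finds the array.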

-- Steve


>
>
> > __u32 data_pages[];
> > };
> >
> > diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
> > index 10a17e78cfe6..77c92e4a7adc 100644
> > --- a/kernel/trace/ring_buffer.c
> > +++ b/kernel/trace/ring_buffer.c
> > @@ -526,6 +526,7 @@ struct ring_buffer_per_cpu {
> > u64 read_stamp;
> >
> > int mapped;
> > + int meta_order;
> > struct mutex mapping_lock;
> > unsigned long *page_ids; /* ID to addr */
> > struct ring_buffer_meta_page *meta_page;
> > @@ -5898,7 +5899,7 @@ int ring_buffer_read_page(struct trace_buffer *buffer,
> > EXPORT_SYMBOL_GPL(ring_buffer_read_page);
> >
> > #define META_PAGE_MAX_PAGES \
> > - ((PAGE_SIZE - (offsetof(struct ring_buffer_meta_page, data_page_head))) >> 2)
> > + ((PAGE_SIZE - (offsetof(struct ring_buffer_meta_page, data_pages))) >> 2)
> >
>
> [...]