Re: block: make blktrace use per-cpu buffers for message notes

From: Jens Axboe
Date: Thu May 29 2008 - 02:45:32 EST


On Wed, May 28 2008, Andrew Morton wrote:
> On Thu, 29 May 2008 08:22:15 +0200 Jens Axboe <jens.axboe@xxxxxxxxxx> wrote:
>
> > On Wed, May 28 2008, Andrew Morton wrote:
> > > On Wed, 28 May 2008 15:59:07 GMT Linux Kernel Mailing List <linux-kernel@xxxxxxxxxxxxxxx> wrote:
> > >
> > > > Gitweb: http://git.kernel.org/git/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=64565911cdb57c2f512a9715b985b5617402cc67
> > > > Commit: 64565911cdb57c2f512a9715b985b5617402cc67
> > > > Parent: 4722dc52a891ab6cb2d637ddb87233e0ce277827
> > > > Author: Jens Axboe <jens.axboe@xxxxxxxxxx>
> > > > AuthorDate: Wed May 28 14:45:33 2008 +0200
> > > > Committer: Jens Axboe <jens.axboe@xxxxxxxxxx>
> > > > CommitDate: Wed May 28 14:49:27 2008 +0200
> > >
> > > please try to avoid merging unreviewed changes.
> >
> > Just because you didn't review it doesn't mean it's unreviewed :-)
> >
> > It's not unreviewed, it was posted on lkml and a few versions were
> > bounced back and forth.
>
> OK. The Subject: swizzling confounded me.
>
> > > > if (unlikely(bt)) \
> > > > __trace_note_message(bt, fmt, ##__VA_ARGS__); \
> > > > } while (0)
> > > > -#define BLK_TN_MAX_MSG 1024
> > > > +#define BLK_TN_MAX_MSG 128
> > >
> > > It seems a bit strange to do this right when we've taken this _off_ the
> > > stack. But I suppose nothing will break.
> >
> > It was never on the stack, it was a global static char array. We are
> > still allocating memory for this, per-cpu. So I think it still makes
> > sense to shrink the size. It's really meant for small trace messages,
> > 128 bytes is plenty. It's an in-kernel property, the userland app
> > doesn't care. So we could easily grow this in the future, should the
> > need arise.
>
> yup.
>
> It's a bit sad to stage the data in a local per-cpu buffer and then
> copy it into relay's per-cpu buffer. I guess this is because the
> length of the output isn't known beforehand. Could be fixed by doing
> what kvasprintf() does, but that might well be slower.

I agree, this is what we debated. My reasoning is that it's better
to minimize usage of the relay buffer, so the stage-and-copy doesn't
matter a whole lot.
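
For illustration, here is a rough userspace sketch of the stage-and-copy
scheme discussed above. It is not the blktrace code: struct toy_cpu,
toy_note_message() and RELAY_SIZE are made-up names, the relay buffer is
modelled as a flat array, and only MSG_MAX mirrors the BLK_TN_MAX_MSG value
from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MSG_MAX    128   /* mirrors BLK_TN_MAX_MSG from the patch */
#define RELAY_SIZE 4096  /* hypothetical size of the destination buffer */

/* Hypothetical stand-in for one CPU's state: staging area plus relay buffer. */
struct toy_cpu {
	char stage[MSG_MAX];     /* per-cpu staging buffer */
	char relay[RELAY_SIZE];  /* models relay's per-cpu buffer */
	size_t used;             /* bytes consumed in relay[] */
};

/*
 * Format into the staging buffer first, since the length is not known
 * beforehand, then copy only the bytes actually produced.
 * Returns the number of bytes stored, or -1 on error or lack of space.
 */
static int toy_note_message(struct toy_cpu *cpu, const char *fmt, ...)
{
	va_list args;
	int n;

	assert(cpu != NULL);

	va_start(args, fmt);
	n = vsnprintf(cpu->stage, sizeof(cpu->stage), fmt, args);
	va_end(args);

	if (n < 0)
		return -1;
	if ((size_t)n >= sizeof(cpu->stage))
		n = (int)sizeof(cpu->stage) - 1;  /* output was truncated */

	if ((size_t)n > sizeof(cpu->relay) - cpu->used)
		return -1;

	memcpy(cpu->relay + cpu->used, cpu->stage, (size_t)n);
	cpu->used += (size_t)n;
	return n;
}
```

The relay buffer only ever sees the formatted length, at the cost of one
extra memcpy() of at most MSG_MAX - 1 bytes per message.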

I seem to recall a relay_unreserve() patch from Tom back in the day.
If we had something like that, we could do the optimal approach of:

	buf = relay_reserve(max_size);
	n = vscnprintf(buf, max_size, ...);
	if (max_size - n)
		relay_unreserve(max_size - n);

and get the best of both worlds. But, again, it's not really a big deal
I think.
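
To make the reserve/unreserve idea concrete, here is a rough userspace
sketch. Everything in it is an assumption for illustration: toy_reserve()
and toy_unreserve() are made-up helpers over a flat array, not the relay
API, and there is no locking or preemption handling of the kind a real
in-kernel version would need.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define MSG_MAX  128   /* worst-case message size to reserve */
#define BUF_SIZE 4096  /* hypothetical size of the destination buffer */

/* Hypothetical stand-in for a per-cpu relay buffer. */
struct toy_chan {
	char data[BUF_SIZE];
	size_t used;
};

/* Claim len bytes at the tail of the buffer, or return NULL if full. */
static char *toy_reserve(struct toy_chan *ch, size_t len)
{
	char *p;

	if (len > sizeof(ch->data) - ch->used)
		return NULL;
	p = ch->data + ch->used;
	ch->used += len;
	return p;
}

/* Give back the last len bytes of the most recent reservation. */
static void toy_unreserve(struct toy_chan *ch, size_t len)
{
	assert(len <= ch->used);
	ch->used -= len;
}

/*
 * Reserve the worst case, format in place, then hand back the unused tail.
 * Returns the number of bytes kept, or -1 on error or lack of space.
 */
static int toy_note_message(struct toy_chan *ch, const char *fmt, ...)
{
	va_list args;
	char *buf;
	int n;

	buf = toy_reserve(ch, MSG_MAX);
	if (!buf)
		return -1;

	va_start(args, fmt);
	n = vsnprintf(buf, MSG_MAX, fmt, args);
	va_end(args);

	if (n < 0) {
		toy_unreserve(ch, MSG_MAX);
		return -1;
	}
	if (n >= MSG_MAX)
		n = MSG_MAX - 1;  /* output was truncated */

	toy_unreserve(ch, MSG_MAX - (size_t)n);
	return n;
}
```

This drops the staging buffer and the copy, but it needs MSG_MAX bytes free
in the destination even for a short message.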

My main interest in this is adding cfq trace messages, so we have
direct ways of comparing queue+dispatch with what cfq is deciding to
do.

--
Jens Axboe
