Re: [PATCH, RFC 0/3] Improvements to the tracing documentation

From: Ingo Molnar
Date: Mon Apr 13 2009 - 18:56:15 EST



* Theodore Tso <tytso@xxxxxxx> wrote:

> On Mon, Apr 13, 2009 at 11:31:24PM +0200, Ingo Molnar wrote:
> > Cool. [ And i guess you'll like the per tracepoint filter
> > expressions too :-) ]
>
> I haven't played with them yet, but I was looking over the source
> code at them (since they aren't documented yet :-). It looks like
> at the moment only integer matches are allowed, right? That's a
> bit of an issue for me, since one of the things I'd really like to
> be able to do is filter based on devname (i.e., sda2). (Most of
> the time we only want to collect information for a particular
> block device or filesystem.)

You can already do:

aldebaran:/debug/tracing/events/sched/sched_process_wait> echo "comm == Xorg" > filter
aldebaran:/debug/tracing/events/sched/sched_process_wait> cat filter
comm == Xorg

But whether a string value is accepted depends on the type of the
format field - so you cannot do string matches on integer fields.
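
To illustrate that type rule, here is a rough user-space sketch, not the
actual filter code. Every sketch_* name is made up for this mail, and the
two-entry field table is only an assumption standing in for whatever the
event's format file really exports.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical field types; the real filter code has its own. */
enum sketch_field_type { SKETCH_FIELD_INT, SKETCH_FIELD_STRING };

struct sketch_field {
	const char *name;
	enum sketch_field_type type;
};

/* Illustrative stand-in for the fields an event would export. */
static const struct sketch_field sketch_fields[] = {
	{ "comm", SKETCH_FIELD_STRING },
	{ "pid",  SKETCH_FIELD_INT },
};

/* Returns 1 if the whole string parses as an integer, 0 otherwise. */
static int sketch_is_integer(const char *value)
{
	char *end;

	strtol(value, &end, 0);
	return end != value && *end == '\0';
}

/*
 * Returns 1 if the value fits the field's type, 0 on a type
 * mismatch, and -1 if the field name is unknown.
 */
static int sketch_filter_value_ok(const char *field, const char *value)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_fields) / sizeof(sketch_fields[0]); i++) {
		if (strcmp(sketch_fields[i].name, field) != 0)
			continue;
		if (sketch_fields[i].type == SKETCH_FIELD_INT)
			return sketch_is_integer(value);
		return 1;	/* any value can be compared as a string */
	}
	return -1;
}
```

With that, sketch_filter_value_ok("comm", "Xorg") is accepted while
sketch_filter_value_ok("pid", "Xorg") is rejected as a type mismatch.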

For kdev_t matches i think we'll need native support for that type -
in addition to the integer/string types. It will come up in other
places as well and user-space knows about devices as well.

> Actually, the fact that I'm having to drop some 32 bytes for each
> jbd2 and ext4 trace log for the bdevname in the ring buffer is
> really for the birds. What I really want to do is just to drop in
> the dev_t, and then for the tracing infrastructure to have an
> efficient (cached) way of taking the dev_t and turning that back
> into struct block_device at TP_printk time so we can print the
> bdevname when it's needed. We definitely don't want to be calling
> bdget() in fs/block_dev.c each time we print a line in the tracing
> buffer! I'm guessing that's something the blktrace tracer would
> find handy as well.

Yeah.

It could be worked around right now by converting it to an integer
but i think what we want is native support for kdev_t, together with
all the usual convenience forms of specifying it: sda1 should work
the same way as 8:1 or 0801. Even /dev/sda1 should be recognized in
a filter expression.
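
A minimal sketch of the numeric forms, as a user-space approximation
only: sketch_parse_dev() and the SKETCH_* macros are hypothetical names,
the macro mirrors the kernel's 12-bit major / 20-bit minor kdev_t layout,
and the hex form is assumed to be the old 16-bit major/minor encoding.
Name forms such as sda1 or /dev/sda1 need a device lookup, which this
sketch does not attempt, so they are reported as unparsed.

```c
#include <stdio.h>
#include <stdlib.h>

/* Mirrors the in-kernel kdev_t layout: 12-bit major, 20-bit minor. */
#define SKETCH_MINORBITS	20
#define SKETCH_MKDEV(ma, mi) \
	(((unsigned int)(ma) << SKETCH_MINORBITS) | (unsigned int)(mi))

/*
 * Hypothetical helper: parse "8:1" (decimal major:minor) or "0801"
 * (assumed old-style hex, major in the high byte) into a device number.
 * Returns 0 on success, -1 if the spec is not one of those two forms.
 */
static int sketch_parse_dev(const char *spec, unsigned int *dev)
{
	unsigned int major, minor;
	unsigned long v;
	char tail;
	char *end;

	/* Exactly two numbers separated by ':' and nothing after them. */
	if (sscanf(spec, "%u:%u%c", &major, &minor, &tail) == 2) {
		if (major >= (1U << 12) || minor >= (1U << SKETCH_MINORBITS))
			return -1;
		*dev = SKETCH_MKDEV(major, minor);
		return 0;
	}

	/* Otherwise the whole string must be a 16-bit hex value. */
	v = strtoul(spec, &end, 16);
	if (end == spec || *end != '\0' || v > 0xffff)
		return -1;
	*dev = SKETCH_MKDEV(v >> 8, v & 0xff);
	return 0;
}
```

Both "8:1" and "0801" come out as the same device number here, so a
filter could compare the result against the dev_t stored in the trace
entry regardless of which form the user typed.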

> Of course having more kernel code play with dev_t's directly isn't
> considered politically correct in some circles, but tough. :-) We
> can't exactly drop a pointer to a struct block_device in the trace
> buffer, since there's no guarantee it will still be valid when we
> read it out. Dropping in a dev_t is exactly what we want. It
> would be nice though if there was a way to specify a major/minor
> number as the filter predicate for the dev_t, and not to have the
> user generate the MAJOR/MINOR encoding. So some way of parsing
> "MKDEV(8, 4)" as the input to the filter predicate would probably
> be a really good thing to do.

Yeah, exactly. We already have smarts in init/* to recognize certain
device string patterns (for rootdev specification) - that could be
factored out (it already is to a large degree) and reused. We don't
need full udev enumeration really - we just need the most common
variants.

Regardless of whether it's considered politically correct or not ;-)
It's clearly useful.

Ingo