Re: [RFC][PATCH 1/5] [PATCH 1/5] events: Add EVENT_FS the eventfilesystem

From: Steven Rostedt
Date: Wed Nov 17 2010 - 07:26:00 EST


On Wed, 2010-11-17 at 11:39 +0100, Ingo Molnar wrote:
> * Greg KH <gregkh@xxxxxxx> wrote:
>
> > On Tue, Nov 16, 2010 at 07:53:58PM -0500, Steven Rostedt wrote:
> > > From: Steven Rostedt <srostedt@xxxxxxxxxx>
> > >
> > > Copied mostly from debugfs, the eventfs is the filesystem that will include
> > > stable tracepoints. Currently nothing enables this filesystem as of this patch.
> >
> > What? Wait, I wrote tracefs a long time ago just for this, why not take that code
> > and use it instead?
>
> Yeah, and i know that i suggested 'eventfs' to Steve and others in a prior thread a
> few months ago - and i suspect Steve was following up on that suggestion with this
> patch? So i guess it's partly my fault ;-)

And we brought this up at Kernel Summit.

>
> [ Also, i think our _real_ problems with tracing lie entirely elsewhere, but i've
> explained that numerous times. Maintaining instrumentation bits is the ultimate
> cat herding experience ;-) ]
>
> I also explained in that eventfs suggestion thread that eventfs (or, indeed,
> tracefs) is IMO only a second tier approach compared to the real thing: proper
> enumeration of events in sysfs.
>
> [ Beyond the obvious compatibility detail that we are _NOT_ getting rid of
> /debug/tracing/events/, as existing tooling depends on it. So unless eventfs or
> sysfs integration brings some real tangible benefits over what we have already we
> don't want to force tooling to migrate to yet another API. ]

One benefit is that we have a way to distinguish between
in-field-debugging tracepoints and tracepoints that are only for
analysis tools.

>
> Lin Ming and PeterZ are working on sysfs integration and they have posted several
> iterations of that work which extends event details to sysfs. That work is not
> complete yet and they need help. (I've Cc:-ed them.)

Actually, I suck at adding anything to the sysfs/kobject code. I always
screw it up. I only got the /sys/kernel/events working because I copied
it directly from Greg. I doubt I'd be much help.


>
> The sysfs approach has numerous upsides:
>
> - Design: sysfs is a mature, multi-year project with tons of meaningful hardware
> and software hierarchies already well established. Attaching events to these
> existing nodes optionally is an obvious advantage and avoids duplication and
> forces people to think about structure.
>
> - Concentration of structure: subsystem and driver authors/maintainers already care
> about their sysfs layout - and when they define new tracepoints for subsystem or
> driver instrumentation it would be very natural for those events to go somewhere
> nearby, in the existing sysfs hierarchy.
>
> - Practicalities: sysfs is already mounted on all distros so tooling could rely on
> it universally. It's the ultimate 'describe system structure' store.
>
> - Long term maintenance: we want to be strict with events, i.e. keep the
> descriptors read only and single-line structured. You sysfs folks are enforcing
> that pretty well - with eventfs we'd always have the nasty lure to apply API
> hacks to eventfs components when we really shouldn't ...


Are these events now going to be labeled as stable? Must every tracepoint
we have keep the same data? Linus specifically said at Kernel
Summit that he wants absolutely NO modules to have a stable tracepoint.

Also, if we just blindly label a tracepoint as "stable" then we must
keep all its contents. For example, the sched_switch will contain the
priority. As Peter has stated several times, that may go away. We also
do not want to lose getting that information, as a lot of us use it.

>
> Eventfs has a couple of downsides:
>
> - Design: it's slapping events into a separate, partly duplicated, partly unique,
> partly inconsistent set of hierarchies. We can deal with it, but it's not
> particularly intelligent and i'd like us to try harder.
>
> - Practicalities: eventfs has to be mounted on every distro. It's an uphill climb
> in general and the appeal of an approach has to be _strong_ for this to be
> feasible.

Some distros already mount debugfs by default. It's a one-liner in fstab.

>
> So putting it into sysfs looks like a pretty intelligent solution all around and i'd
> prefer it.

Another downside is that you need to scan hundreds of directories to
find tracepoints. And again, are they all now stable?


>
> Steve, would you be interested in helping out Lin Ming and PeterZ with the sysfs
> work - or at least help them come to the conclusion that we want eventfs?

I don't think I would be much help with the former, and I'm thinking I'm
losing the latter.

Hmm, it seems that every decision that we agreed on at Kernel
Summit has been declined in practice. Makes me think that Kernel Summit
is pointless, and was a waste of my time. :-(

-- Steve

>
> Thanks,
>
> Ingo

