Re: The case for a standard kernel debugger

From: Karim Yaghmour (karym@opersys.com)
Date: Mon Oct 09 2000 - 04:10:16 EST


richardj_moore@uk.ibm.com wrote:
>
> Completely agree - co-operation+integration is the order of the day.

Cool :)

> They
> other thing I didn't mention was that the GKHI was substantially coded
> before we discovered your hook capability. Part of the GKHI is also to
> allow hooks to be dynamicaly defined i.e. to allow kernel modules to
> declare hooks themselves, though I haven't yet implemented that - the
> design is done.

Same here, as I said it was next on my list.

> The MP complient remark refers to the fact that the hook
> mechanism must work under MP. Our hooks are implemented by modifying code
> dynamically - that has certain serialisation requrements: you need to
> ensure that other processors see a consistent view of memory, which means
> that you need to stop them while you're change code dynamically and also
> flush their I-fetch caches. There are some very odd (H/W) behavious that
> can occur with self-modifying code if the appropriate measures are not
> taken.

Well, this explains why I didn't understand what you meant by "MP compliant".
LTT has no self-modifying code. Hence, there is no need to make special
cases for that.

>
> Our uniprocessor version of GKHI is being tested as I write. We hope to
> release it in the next couple of weeks. As soon as we are happy with it I
> will send you a copy so you can evaluate it against you hook methodology
> and we can see what ecconomies can be established.

That's fine by my book.

>
> And talking further of co-operation. I'd like to make DProbes drive your
> trace facility.

See below.

> Did you see the announcement post I sent to LTT?

Yes I did get the announcement, though I was far too busy at the time to respond
anything meaningful more than: "Cool stuff" :) So I said to myself that
I'd take the time later to write something more intelligible ... I never got
to the later part ...

> The idea
> is that DProbes is an enabler for other RAS facilities. We can dynamically
> insert a probe anywhere into memory (user and kernel) without the need for
> re-compilation of the source. From the RPN program that's driven by the
> probe event handler we can initiate other facilities such as entering SGI's
> kernel debugger or invoking Crash Dump or forcing a core dump. Now, DProbes
> came from OS/2 and was called dynamic trace. Its original purpose was to
> implement tracepoints on the fly. We can still do that with DProbes,
> provided we have a tracing mechanism we can feed into. That's where you
> come in. Can you provide an interface we can call from kernel space to log
> a trace event with a variable length buffer?

(I begin with a paragraph on the "philosophy" behind LTT and related trace
projects to make sure everyone is aware of those issues).

OK, I think there's a design issue that ought to be addressed right away.
LTT wasn't designed to help anyone delve into or debug kernel internals.
That was never a design goal and will not be for any foreseeable future. The
main purpose of LTT is to provide it's user with an accurate reconstruction
of the dynamic behavior of the system being observed. That said, LTT __can
also__ be used to understand kernel internals and track internal operations.
The generic hooking mechanism provided, the capability of creating new
trace points and the event-driven state machine engine are all part of
this "internals" functionality. Even though these are not an essential part
of LTT (the essential part being as discussed above), they do form a very
interesting development avenue and this is where I believe the projects I am
putting forward and Dprobes cross the same paths. In each, there's a desire
to empower __kernel developers__ or __kernel power users__ (if such a thing
exists) with the tools necessary to track and act upon the occurrence of
key events. Where the current LTT implementation of things only regards a
fixed set of key kernel events, Dprobes leaves definitions open to the
programer. I don't think that either is better than the other, I think
both serve their purpose. By providing a pre-defined set of key events,
LTT "defines" which events are interesting for system behavior reconstruction
and which aren't (notice that I speak of "system behavior" not "kernel behavior",
the former being the combined behavior of the kernel and the applications).
By providing a way to insert arbitrary probes, Dprobes enables system
developers to rapidly isolate different sort of bugs __without__ having
to modify the original source code.

So, my point is that although Dprobes can make a world of a difference
to help developers isolate bugs and fast, it cannot, by itself, serve as
the basis for the system behavior reconstruction infrastructure. For that,
and many other purposes, there must be a pre-defined set of key events.
An example of "other purposes" is using the pre-defined trace points to
implement C2 security in the kernel (this was suggested by Alan).

Hence, yes I can provide an interface from the kernel to log a trace event
with a variable length buffer, but I don't think that taking away the statically
defined trace points is the right thing to do. (I might have gotten this
completely wrong, though ... My presumption about your suggestion of using
Dprobes to "drive" LTT, is that you mean that all events should come from
Dprobes and Drpobes alone. I could be wrong).

So here's what I suggest:
There's already two event types within the events recognized by LTT which
had been planned for this type of usage. They are: "New event" and "Custom
event". The first is used to declare a new event type and the second is used
to log all such events. To declare a new event, the caller would call upon
an event ID creation function providing it with an event size. The function
would use the "New event" type to declare a new event in the log and would
return a unique event ID. Thereafter, the normal tracing function, already
available through the LTT kernel patch, could be used to log the new events.

This could be used by Dprobes to enable dynamically inserted probe points to
be logged within a normal trace and, thereafter, be part of trace analysis.

Does this fit your needs?

>
> Richard Moore - RAS Project Lead - Linux Technology Centre (PISC).
>
> http://oss.software.ibm.com/developerworks/opensource/linux
> Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
> IBM UK Ltd, MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK
>
> Karim Yaghmour <karym@opersys.com> on 06/10/2000 09:16:12
>
> Please respond to Karim Yaghmour <karym@opersys.com>
>
> To: Richard J Moore/UK/IBM@IBMGB
> cc: linux-kernel@vger.kernel.org
> Subject: Re: The case for a standard kernel debugger
>
> Hello Richard,
>
> Part of your analysis is correct. The hooks were designed to take care of
> static tracepoints only. That said, dynamic allocation of event IDs was
> next on my list and the hooking mechanism would have been modified
> consequently.
>
> As for "multiple exits registered per hook", if you mean that you can have
> more
> than one function called back for each event, then this is already
> possible.
> The other items you mention such as atomicity and prioritization seem
> interesting
> indeed, although I am not sure what you mean by MP compliant as the only
> thing that stops the current generalized hooking mechanism to be MP
> compliant
> is the insertion of correct locks during callback registration.
>
> Please understand that the purpose wasn't to discredit your work, but
> rather
> to stop duplication of work as efforts could be deployed elsewhere. I think
> that your work and the work already done on LTT can be brought together in
> a way that would profit all. This is what I was hinting to towards the end
> of the posting. It was an invitation more than anything else.
>
> Apart from the hooking mechanism, there were other items which I mentioned
> that merit discussion, such as the ability to enable dynamic probes to log
> events in normal LTT traces or the event-driven state machine engine.
> Hence,
> if you are interested in joining forces to further enhance probing and
> tracing
> capabilities in Linux, I think this would be a good opportunity.
>
> Best regards
>
> Karim
>
> richardj_moore@uk.ibm.com wrote:
> >
> > Yes, we looked at that and it didn't seem to provide the generality we
> > needed - multipe exits registered per hook, ability to arm a set of hooks
> > atomically, ability to prioritise dispatching order of a hook exit, MP
> > complient. I may be wrong but the Linux Trace Toolkit hooks like like
> they
> > were specifically designed to cater for inserting static tracepoints into
> > the kernel.
> >
> > Richard Moore - RAS Project Lead - Linux Technology Centre (PISC).
> >
> > http://oss.software.ibm.com/developerworks/opensource/linux
> > Office: (+44) (0)1962-817072, Mobile: (+44) (0)7768-298183
> > IBM UK Ltd, MP135 Galileo Centre, Hursley Park, Winchester, SO21 2JN, UK
>
> --
> ===================================================
> Karim Yaghmour
> karym@opersys.com
> Operating System Consultant
> (Linux kernel, real-time and distributed systems)
> ===================================================
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> Please read the FAQ at http://www.tux.org/lkml/

-- 
===================================================
                 Karim Yaghmour
               karym@opersys.com
          Operating System Consultant
 (Linux kernel, real-time and distributed systems)
===================================================
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Oct 15 2000 - 21:00:11 EST