Re: [PATCH 4/4] x86, mce: Have MCE persistent event off by default for now

From: Ingo Molnar
Date: Thu May 05 2011 - 03:34:13 EST



* Borislav Petkov <bp@xxxxxxxxx> wrote:

> On Thu, May 05, 2011 at 02:39:51AM -0400, Ingo Molnar wrote:
> > printk events are a compatibility wrapper to allow RAS functionality to have
> > easy and unified access to all system events that matter. The structure of
> > printk events is obviously the log level plus a free-form ASCII string,
> > something like:
> >
> > 1- the printk timestamp
>
> Yeah, we want here the timestamp when the event happened.
>
> > 2- the log level of the kernel when the message was generated
> > 3- the log level of the message
> > 4- the printk message itself, as a free-form string
> >
> > > [...] a big issue when you have some heavy duty infrastructure trying to
> > > parse and consume these messages. We should really consider such stuff a
> > > user visible ABI, and thus not subject to random breakage - which is a
> > > radical departure from our current attitude to printk().
> >
> > Indeed, turning printk into an ABI clearly won't fly upstream although i'd
> > expect upstream to *care more* about good printk messages if the RAS daemon
> > starts making good use of it. Any printk message that turns out to be useful
> > can be turned into an ABI by defining a proper structured event out of it, via
> > TRACE_EVENT() et al.
>
> Actually let's have the RAS printk's as TRACE_EVENT's from the start
> - it's not like we're going to convert every printk call into a RAS
> printk event. [...]

Fully agreed that printk should be a TRACE_EVENT() from the get go.

What i meant was that it also gives an opportunity for the introduction of new
TRACE_EVENT()s: if RAS tooling sees problems with some important printk
changing its format all the time then such problems can be addressed by
'upgrading' that printk event to a TRACE_EVENT().

> [...] We only want relevant ones from traps.c, maybe some power management
> events, fs, maybe some critical security stuff, etc.

Yeah. And what that 'critical stuff' is will probably be found out gradually.
In the meanwhile we'll have printk events as a starting point.
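
To make the starting point concrete, here is a rough userspace sketch of how
the four printk event fields quoted above could be carried in one record. All
names in it (struct printk_event, printk_event_init, printk_event_is_critical)
are hypothetical and do not come from any kernel header. Mapping "the log
level of the kernel" to a console_loglevel field is also an assumption.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical record for a printk event; illustrative only, not an
 * existing kernel or perf structure.
 */
struct printk_event {
	uint64_t timestamp_ns;     /* 1: the printk timestamp */
	int      console_loglevel; /* 2: kernel log level at generation time (assumed mapping) */
	int      msg_loglevel;     /* 3: log level of the message, 0 (EMERG) .. 7 (DEBUG) */
	char     msg[256];         /* 4: the free-form message string */
};

/* Fill in an event. Returns 0 on success, -1 if msg_level is outside 0..7. */
int printk_event_init(struct printk_event *ev, uint64_t ts_ns,
		      int console_level, int msg_level, const char *text)
{
	if (msg_level < 0 || msg_level > 7)
		return -1;

	ev->timestamp_ns = ts_ns;
	ev->console_loglevel = console_level;
	ev->msg_loglevel = msg_level;
	/* snprintf truncates over-long text and always NUL-terminates */
	snprintf(ev->msg, sizeof(ev->msg), "%s", text);
	return 0;
}

/*
 * A consumer can filter on the structured fields without parsing the
 * string: here, anything at KERN_ERR (3) or more severe.
 */
int printk_event_is_critical(const struct printk_event *ev)
{
	return ev->msg_loglevel <= 3;
}
```

Only the msg field stays free-form in this sketch, so tooling that has to look
inside that string is the part that breaks when a message changes its format.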

The printk event is also useful for a practical reason: if you add it right now
you can test the RAS daemon and provoke a steady stream of events easily by
catching (and generating) printk events.
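
One possible way to generate such a stream from userspace is to write
"<level>text" lines to /dev/kmsg, which inserts them into the kernel log.
The sketch below assumes that route. The helper names (format_kmsg_line,
emit_test_events) and the "ras-test:" message prefix are made up for
illustration, and whether a printk event would fire for messages injected
this way is an assumption, not something established in this thread.

```c
#include <stdio.h>

/*
 * Build a "<level>text\n" line, the syslog-style prefix form accepted by
 * writes to /dev/kmsg. Returns the line length, or -1 if the level is
 * outside 0..7 or the buffer is too small.
 */
int format_kmsg_line(char *buf, size_t size, int level, const char *text)
{
	int n;

	if (level < 0 || level > 7)
		return -1;

	n = snprintf(buf, size, "<%d>%s\n", level, text);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return n;
}

/*
 * Write 'count' synthetic messages to 'path' (normally "/dev/kmsg", which
 * needs root). Returns the number of messages written, or -1 if the path
 * cannot be opened.
 */
int emit_test_events(const char *path, int level, int count)
{
	char text[64];
	char line[128];
	int i, n, written = 0;
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;

	/* Unbuffered, so each line is passed on immediately rather than batched */
	setvbuf(f, NULL, _IONBF, 0);

	for (i = 0; i < count; i++) {
		snprintf(text, sizeof(text), "ras-test: synthetic event %d", i);
		n = format_kmsg_line(line, sizeof(line), level, text);
		if (n < 0 || fwrite(line, 1, (size_t)n, f) != (size_t)n)
			break;
		written++;
	}

	fclose(f);
	return written;
}
```

Taking the path as a parameter means the same helper can be pointed at an
ordinary file for a dry run before it is aimed at the real device.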

( We also want event injection to work, to be able to simulate real MCE
events. )

Thanks,

Ingo