Re: [PATCH 3/3] Stop tracing on a schedule bug

From: Thomas Gleixner
Date: Thu Apr 15 2010 - 19:51:09 EST


On Thu, 15 Apr 2010, Chase Douglas wrote:
> On Thu, Apr 15, 2010 at 4:01 PM, Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> > On Thu, 15 Apr 2010, Chase Douglas wrote:
> >> On Thu, Apr 15, 2010 at 2:03 PM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> >> > On Wed, Apr 14, 2010 at 12:20:16PM -0400, Chase Douglas wrote:
> >> >> This change adds a tracing_off_event() call to stop tracing on schedule
> >> >> bugs unless tracing_off=none was specified on the commandline.
> >> >>
> >> >> Signed-off-by: Chase Douglas <chase.douglas@xxxxxxxxxxxxx>
> >> >> ---
> >> >>  kernel/sched.c |    2 ++
> >> >>  1 files changed, 2 insertions(+), 0 deletions(-)
> >> >>
> >> >> diff --git a/kernel/sched.c b/kernel/sched.c
> >> >> index 6af210a..439f036 100644
> >> >> --- a/kernel/sched.c
> >> >> +++ b/kernel/sched.c
> >> >> @@ -3590,6 +3590,8 @@ static noinline void __schedule_bug(struct task_struct *prev)
> >> >>  {
> >> >>       struct pt_regs *regs = get_irq_regs();
> >> >>
> >> >> +     tracing_off_event(TRACE_EVENT_BUG);
> >> >> +
> >> >>       printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
> >> >>               prev->comm, prev->pid, preempt_count());
> >> >
> >> >
> >> >
> >> > I would rather call that a TRACE_EVENT_WARN as this is what happens: we
> >> > warn but we continue.
> >>
> >> I tend to think of the TRACE_EVENT_* as an indication of severity and
> >> whether we want to stop the trace by default. From a distro
> >> standpoint, the likelihood that we want to continue tracing after a
> >> __schedule_bug is pretty low. It's easiest if we don't have to tell
> >
> > Well, scheduling while atomic is a BUG, but one of the category which
> > allows the kernel to continue. So in fact it's treated like a WARN_ON.
> > So the tracing_off_event() qualifier should be *_WARN.
> >
> > That's independent of the question whether you want to stop tracing in
> > that very case. Though I agree that the tracer should stop here.
>
> We seem to be agreeing on the functionality. The disagreement seems to
> be in the macro name/functionality mapping. However, the name of the
> function itself is *_bug. I don't see how things are clearer or more
> useful by inserting a *_WARN level macro in a *_bug named function.

Care to read what I wrote? Again:

> > Well, scheduling while atomic is a BUG, but one of the category which
> > allows the kernel to continue. So in fact it's treated like a WARN_ON.
> > So the tracing_off_event() qualifier should be *_WARN.

It does not matter at all whether the function name has "bug" in it or
not. What matters is the semantics of the function. It does _NOT_
raise a BUG. It merely warns and tries to continue. So it follows the
WARN() semantics.
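
To illustrate the distinction, here is a minimal user-space sketch, not
kernel code. It assumes tracing_off_event() simply records the qualifier
and stops tracing; the enum values, the two *_model() functions and the
"halted" flag are hypothetical stand-ins, and the real implementation in
this patch series may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical qualifiers; the names follow the thread, the values are made up. */
enum trace_event { TRACE_EVENT_WARN, TRACE_EVENT_BUG };

static bool tracing_on = true;       /* models the tracer state */
static bool halted = false;          /* models "the kernel does not continue" */
static enum trace_event last_event;  /* qualifier that stopped the trace */

/* Assumed behaviour: remember the qualifier and stop tracing. */
static void tracing_off_event(enum trace_event ev)
{
	assert(ev == TRACE_EVENT_WARN || ev == TRACE_EVENT_BUG);
	last_event = ev;
	tracing_on = false;
}

/* WARN() semantics: report, stop the trace, return to the caller. */
static void schedule_bug_model(unsigned int preempt_count)
{
	tracing_off_event(TRACE_EVENT_WARN);
	fprintf(stderr, "BUG: scheduling while atomic: 0x%08x\n", preempt_count);
}

/* BUG() semantics: report, stop the trace, do not continue. */
static void fatal_bug_model(void)
{
	tracing_off_event(TRACE_EVENT_BUG);
	fprintf(stderr, "fatal: cannot continue\n");
	halted = true;
}
```

In this model, calling schedule_bug_model() stops the trace but leaves
"halted" false, which is why the qualifier follows what the function
does rather than what it is called.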

If you feel strongly about that, send a patch to
s/schedule_bug/schedule_warn/ and I'll ack it.

> Essentially, it makes more sense to me for the macro to represent the
> severity of the case, and not be coupled somehow to what the kernel
> decides to do outside of the tracing.

Essentially you are wrong. The semantics of schedule_bug() are clearly
WARN() and not BUG(). So the tracing off qualifier needs to be
WARN. And it does not matter what you consider as the severity. The
severity is given by the semantics of schedule_bug().

If your extra stupid grub hiding logic prevents a user from changing the
trace off level then you need to fix that instead of anything else.

BTW, if interacting with grub is that hard: how does a user start the
tracer at all?

Thanks,

tglx