Re: [PATCH 3/3] Stop tracing on a schedule bug

From: Thomas Gleixner
Date: Thu Apr 15 2010 - 19:02:13 EST


On Thu, 15 Apr 2010, Chase Douglas wrote:
> On Thu, Apr 15, 2010 at 2:03 PM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> > On Wed, Apr 14, 2010 at 12:20:16PM -0400, Chase Douglas wrote:
> >> This change adds a tracing_off_event() call to stop tracing on schedule
> >> bugs unless tracing_off=none was specified on the commandline.
> >>
> >> Signed-off-by: Chase Douglas <chase.douglas@xxxxxxxxxxxxx>
> >> ---
> >>  kernel/sched.c |    2 ++
> >>  1 files changed, 2 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/kernel/sched.c b/kernel/sched.c
> >> index 6af210a..439f036 100644
> >> --- a/kernel/sched.c
> >> +++ b/kernel/sched.c
> >> @@ -3590,6 +3590,8 @@ static noinline void __schedule_bug(struct task_struct *prev)
> >>  {
> >>       struct pt_regs *regs = get_irq_regs();
> >>
> >> +     tracing_off_event(TRACE_EVENT_BUG);
> >> +
> >>       printk(KERN_ERR "BUG: scheduling while atomic: %s/%d/0x%08x\n",
> >>               prev->comm, prev->pid, preempt_count());
> >
> >
> >
> > I would rather call that a TRACE_EVENT_WARN as this is what happens: we
> > warn but we continue.
>
> I tend to think of the TRACE_EVENT_* as an indication of severity and
> whether we want to stop the trace by default. From a distro
> standpoint, the likelihood that we want to continue tracing after a
> __schedule_bug is pretty low. It's easiest if we don't have to tell

Well, scheduling while atomic is a BUG, but of the category which
allows the kernel to continue. So in fact it's treated like a WARN_ON.
So the tracing_off_event() qualifier should be *_WARN.

That's independent of the question of whether you want to stop tracing
in that very case. Though I agree that the tracer should stop here.
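
To illustrate the separation between the qualifier and the stop
decision, here is a rough userspace model, not kernel code. The
tracing_off_event() body, the stop_on[] table, tracing_enabled and
model_schedule_bug() are all hypothetical; the real helper comes from
an earlier patch in this series that is not shown here, and the
default of stopping on WARN is an assumption made for the sketch.

```c
/* Userspace model only. Every name below is illustrative, not kernel API. */
#include <stdbool.h>

/* Qualifier: describes what kind of event happened. */
enum trace_event_level {
	TRACE_EVENT_WARN,	/* reported, but the kernel carries on */
	TRACE_EVENT_BUG,	/* reported, and execution does not carry on */
	TRACE_EVENT_NR
};

/* Policy: which qualifiers stop the tracer (assumed defaults).
 * Clearing every entry models something like tracing_off=none. */
bool stop_on[TRACE_EVENT_NR] = {
	[TRACE_EVENT_WARN] = true,
	[TRACE_EVENT_BUG]  = true,
};

bool tracing_enabled = true;

/* Stop tracing only if the policy says so for this qualifier. */
void tracing_off_event(enum trace_event_level level)
{
	if (stop_on[level])
		tracing_enabled = false;
}

/* Stand-in for __schedule_bug(): qualified as WARN because the
 * kernel continues after reporting it. */
void model_schedule_bug(void)
{
	tracing_off_event(TRACE_EVENT_WARN);
}
```

In this model the call site only states what happened (*_WARN), and
whether the tracer stops is decided in one place by the policy table,
so the two questions can be changed independently.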

> our users to add a kernel command line, especially since grub in
> Ubuntu 10.04 LTS is difficult to interact with for end users.

That's a serious PITA caused by the "let's mimic the other OS" crowd
and no excuse for creating a mess in the kernel.

Thanks,

tglx