Re: deadlock caused by printk

From: Steven Rostedt
Date: Fri Mar 25 2011 - 14:40:30 EST


On Fri, Mar 25, 2011 at 01:26:00PM +0100, Peter Zijlstra wrote:
> On Fri, 2011-03-25 at 13:21 +0100, Peter Zijlstra wrote:
> > On Fri, 2011-03-25 at 20:15 +0800, Ming Lei wrote:
> > > Hi,
> > >
> > > These days I was studying "perf code" and try to dump some debug
> > > info in event_sched_out(): kernel/perf_event.c by printk, but found
> > > system can hang easily.
> > >
> > > After digging into related code for several days, I find the hang is
> > > caused by spinlock deadlock when acquiring rq->lock in task_rq_lock(),
> > > which is called in the path below:
> > >
> > > try_to_wake_up<-wake_up_process<-up<-console_unlock<-printk.

I've brought this up before, and basically was told "don't do that" ;)

> > >
> > > I wonder if there are some usage conventions about printk, which says
> > > explicitly that printk can not by used in some places of kernel. Or does the
> > > above show that it is a bug of kernel?

Just don't use it when the rq lock his held. Which you probably don't
want to anyway since most paths that require rq lock can flood the
console with printk messages.

>
> > One thing you could to is use trace_printk() and use ftrace to debug
> > perf.
>
> That should still work.

Yep, trace_printk() is the perfect thing to use to debug anything. I
tried hard to make it work in all contexts.

You can also set ftrace_dump_on_opps on the command line, or echo 1 into
/proc/sys/kernel/ftrace_dump_on_oops at run time if you want the ftrace
buffer to write to the console if the box crashes.

And finally, early_printk() works in these contexts too, as it doesn't
grab any locks. But because it does not, it can easily get corrupted
buffers.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/