Re: [debug patch] printk: Add a printk killswitch to robustify NMI watchdog messages

From: Ingo Molnar
Date: Mon Jun 06 2011 - 13:12:19 EST



* Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:

> > > but console_sem isn't klogd. We delay klogd and that's
> > > perfectly fine, but afaict we don't delay console_sem.
> >
> > But console_sem is really a similar special case as klogd. See,
> > it's about a *printk*. That's rare by definition.
>
> But it's not rare, it's _the_ lock that serializes the whole console
> layer. Pretty much everything a console does goes through that
> lock.

Please. Think.

If console_sem was so frequently held then why on earth were you
*unable* to trigger the lockup with an artificial printk() storm and
why on earth has almost no-one else but Arne triggered it? :-)

This bug is the very proof that console_sem is seldom contended!

> Ahh, what we could do is something like the below and delay both
> the acquire and release of the console_sem.

Yeah!

> +void printk_tick(void)
> +{
> +	if (!__this_cpu_read(printk_pending))
> +		return;
> +
> +	/*
> +	 * Try to acquire and then immediately release the
> +	 * console semaphore. The release will do all the
> +	 * actual magic (print out buffers, wake up klogd,
> +	 * etc).
> +	 */
> +	if (console_trylock_for_printk(smp_processor_id())) {
> +		console_unlock();
> +		__this_cpu_write(printk_pending, 0);
> +	}
> +}

Arne, does this fix the hang you are seeing?

Now, we probably don't want to do this in 3.0, just to give time for
interactions to be found and complaints to be worded. So we could do
the minimal fix first and queue up the bigger change for 3.1.

Hm?
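
For anyone who wants to poke at the idea outside the kernel, here is a
rough userspace sketch of the same pattern: the print path only buffers
and sets a pending flag, and a later tick tries the lock and flushes.
Every name in it (deferred_print, tick_flush, console_lock, log_buf,
console_out) is made up for illustration and is not a kernel API, and
it is single-threaded where the patch above uses per-CPU state, so
treat it as an assumption-laden toy rather than what printk does.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these are kernel symbols. */
static pthread_mutex_t console_lock = PTHREAD_MUTEX_INITIALIZER;
static char log_buf[256];        /* messages waiting to be flushed */
static size_t log_len;
static bool flush_pending;       /* plays the role of printk_pending */
static char console_out[1024];   /* what has "reached the console" */

/* Print path: buffer the message and mark work pending.
 * It never touches console_lock, so it cannot block or deadlock on it. */
static void deferred_print(const char *msg)
{
	size_t n = strlen(msg);

	if (log_len + n >= sizeof(log_buf))
		return;	/* sketch only: drop the message on overflow */
	memcpy(log_buf + log_len, msg, n);
	log_len += n;
	flush_pending = true;
}

/* Tick path: try the lock; if it is contended, leave the flag set and
 * retry on the next tick. Returns true if the buffer was flushed. */
static bool tick_flush(void)
{
	size_t out_len, room;

	if (!flush_pending)
		return false;
	if (pthread_mutex_trylock(&console_lock) != 0)
		return false;

	out_len = strlen(console_out);
	room = sizeof(console_out) - 1 - out_len;
	if (log_len > room)
		log_len = room;	/* sketch only: truncate if the sink is full */
	memcpy(console_out + out_len, log_buf, log_len);
	console_out[out_len + log_len] = '\0';
	log_len = 0;
	flush_pending = false;

	pthread_mutex_unlock(&console_lock);
	return true;
}
```

Calling deferred_print() and then tick_flush() while something else
holds console_lock leaves the message buffered; the next tick_flush()
after the lock is dropped moves it to console_out.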

Thanks,

Ingo