Re: [RFC][PATCH 3/3] x86/nmi: Perform a safe NMI stack trace on all CPUs

From: Petr Mladek
Date: Wed Nov 19 2014 - 08:03:02 EST


On Wed 2014-11-19 11:41:14, Borislav Petkov wrote:
> On Tue, Nov 18, 2014 at 11:39:20PM -0500, Steven Rostedt wrote:
> > static int
> > arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
> > {
> > @@ -78,12 +157,14 @@ arch_trigger_all_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
> > cpu = smp_processor_id();
> >
> > if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
> > - static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
> > + printk_func_t printk_func_save = this_cpu_read(printk_func);
> >
> > - arch_spin_lock(&lock);
> > + /* Replace printk to write into the NMI seq */
> > + this_cpu_write(printk_func, nmi_vprintk);
> > printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
> > show_regs(regs);
> > - arch_spin_unlock(&lock);
> > + this_cpu_write(printk_func, printk_func_save);
>
> I'm wondering if this could be used in a generic manner throughout code
> where we could say "ok, I'm in an NMI context, so lemme switch printk's
> and do some printing" so that NMI and NMI-like atomic contexts could use
> printk. Lemme do an mce example:
>
> do_machine_check(..)
> {
> printk_func_t printk_func_save = this_cpu_read(printk_func);
>
> ...
>
> /* in #MC handler, switch printks */
> this_cpu_write(printk_func, nmi_vprintk);
>
> printk("This is a hw error, details: ...\n");
>
> /* more bla */
>
> this_cpu_write(printk_func, printk_func_save);
> }
>
> or should we change that in entry.S, before we call the handler?
>
> Because, if we could do something like that, then we finally get to use
> printk in an NMI context which would be a good thing.

Unfortunately, the problem is more complicated. The alternative
printk_func() just stores the string into the seq_buf. Then someone
needs to move the data to the main ring buffer, see print_seq_line()
and the related cycle in this patch.

The copying needs to be done in the normal context because it will
need to take the lock for the main ring buffer. Then it will need to
somehow protect the seq_buf against other accesses from NMI context.

By other words, any serious generic solution would end up with implementing
an alternative ring buffer and lockless operations. It would be
something similar to my older patch set, see
https://lkml.org/lkml/2014/5/9/118. It was not
accepted and it triggered this solution for NMI backtraces.

Note that this patch works because arch_trigger_all_cpu_backtrace() is
called from normal context and any further calls are ignored until the
current call is finished. So, there is a safe place to copy the data.


I have another idea for the generic solution. Linus was against using
printk() in NMI context, see https://lkml.org/lkml/2014/6/10/388.
The backtraces are the only serious usage. Other than
that there are only few warnings. My proposal is to handle the
warnings similar way like recursive printk() warning. The recursive
printk() just sets a flag, the message is dropped, warning about lost
message is printed within the next printk() call.

Best Regards,
Petr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/