Re: [PATCH 3/4] kernel: Avoid softlockups in stop_machine() during heavy printing

From: Jan Kara
Date: Wed Sep 23 2015 - 04:37:48 EST


On Tue 22-09-15 12:55:02, Jan Kara wrote:
> > > +{
> > > + bool retry;
> > > + unsigned long flags;
> > > +
> > > + while (1) {
> > > + raw_spin_lock_irqsave(&logbuf_lock, flags);
> > > + retry = console_seq != log_next_seq;
> > > + raw_spin_unlock_irqrestore(&logbuf_lock, flags);
> >
> > Does this lock/unlock do anything useful?

Forgot to comment on this: console_seq and log_next_seq are updated under
logbuf_lock. Also they are 64-bit so on 32-bit archs their updates are
non-atomic. So although in practice the check will likely work fine without
logbuf_lock, I prefer taking the lock to save reader some pondering and the
code isn't performance sensitive in any way.

> > > + if (!retry || console_suspended)
> > > + break;
> > > + /* Cycle console_sem to wait for outstanding printing */
> > > + console_lock();
> > > + console_unlock();
> > > + }
> > > +}
> > > +
> > > +/*
> > > * Prevent further output on the passed console device so that (for example)
> > > * serial drivers can disable console output before suspending a port, and can
> > > * re-enable output afterwards.
> > > diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> > > index fd643d8c4b42..016d34621d2e 100644
> > > --- a/kernel/stop_machine.c
> > > +++ b/kernel/stop_machine.c
> > > @@ -21,6 +21,7 @@
> > > #include <linux/smpboot.h>
> > > #include <linux/atomic.h>
> > > #include <linux/lglock.h>
> > > +#include <linux/console.h>
> > >
> > > /*
> > > * Structure to determine completion condition and record errors. May
> > > @@ -543,6 +544,14 @@ int __stop_machine(int (*fn)(void *), void *data, const struct cpumask *cpus)
> > > return ret;
> > > }
> > >
> > > + /*
> > > + * If there are lots of outstanding messages, printing them can take a
> > > + * long time and all cpus would be spinning waiting for the printing to
> > > + * finish thus triggering NMI watchdog, RCU lockups etc. Wait for the
> > > + * printing here to avoid these.
> > > + */
> > > + console_flush();
> >
> > This is pretty pointless if num_possible_cpus==1. I'd suggest setting
> > printk_offload_chars=0 in this case, add some early bale-out into
> > console_flush(). Or something along those lines.
> >
> > And make console_flush() go away altogether if CONFIG_SMP=n - it's
> > pointless bloat.
>
> Sure, I'll do that.

I've implemented all this except for early bail out from console_flush()
when num_possible_cpus==1 - that doesn't seem very useful since we'll just
check that the buffer is empty and bail out anyway...

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/