Re: [PATCH 02/10] printk: Try harder to get logbuf_lock on NMI

From: Petr Mladek
Date: Thu May 28 2015 - 09:51:08 EST


On Wed 2015-05-27 16:14:09, Andrew Morton wrote:
> On Mon, 25 May 2015 14:46:25 +0200 Petr Mladek <pmladek@xxxxxxx> wrote:
>
> > If the logbuf_lock is not available immediately, it does not mean
> > that there is a deadlock. We should try harder and wait a bit.
> >
> > On the other hand, we must not forget that we are in NMI and the timeout
> > has to be rather small. It must not cause dangerous stalls.
> >
> > I even got full system freeze when the timeout was 10ms and I printed
> > backtraces from all CPUs. In this case, all CPUs were blocked for
> > too long.
> >
> > ...
> >
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -231,6 +231,8 @@ static DEFINE_RAW_SPINLOCK(logbuf_lock);
> >
> > #ifdef CONFIG_PRINTK
> > DECLARE_WAIT_QUEUE_HEAD(log_wait);
> > +/* cpu currently holding logbuf_lock */
> > +static unsigned int logbuf_cpu = UINT_MAX;
> > /* the next printk record to read by syslog(READ) or /proc/kmsg */
> > static u64 syslog_seq;
> > static u32 syslog_idx;
> > @@ -1610,6 +1612,38 @@ static size_t cont_print_text(char *text, size_t size)
> > return textlen;
> > }
> >
> > +/*
> > + * This value defines the maximum delay that we spend waiting for logbuf_lock
> > + * in NMI context. 100us looks like a good compromise. Note that, for example,
> > + * syslog_print_all() might hold the lock for quite some time. On the other
> > + * hand, waiting 10ms caused system freeze when many backtraces were printed
> > + * in NMI.
> > + */
> > +#define TRY_LOCKBUF_LOCK_MAX_DELAY_NS 100000UL
> > +
> > +/* We must be careful in NMI when we managed to preempt a running printk */
> > +static int try_logbuf_lock_in_nmi(void)
> > +{
> > + u64 start_time, current_time;
> > + int this_cpu = smp_processor_id();
> > +
> > + /* no way if we are already locked on this CPU */
> > + if (logbuf_cpu == this_cpu)
> > + return 0;
> > +
> > + /* try hard to get the lock but do not wait forever */
> > + start_time = cpu_clock(this_cpu);
> > + current_time = start_time;
> > + while (current_time - start_time < TRY_LOCKBUF_LOCK_MAX_DELAY_NS) {
> > + if (raw_spin_trylock(&logbuf_lock))
> > + return 1;
> > + cpu_relax();
> > + current_time = cpu_clock(this_cpu);
> > + }
>
> (Looks at the read_seqcount_retry() in
> kernel/time/sched_clock.c:sched_clock())
>
> Running cpu_clock() in NMI context seems a generally bad idea.

I am sorry but this is too cryptic for me :-)
read_seqcount_retry() looks safe to me under NMI.

> Using an ndelay/udelay wait loop here would be safer?

My intention was to call raw_spin_trylock() as many times as possible
within a given timeframe. I think that my solution gives a better
control over the maximum time. The CPU has to stay in NMI and
thus busy anyway.

Or do you have some other concern, please?


> There are many sites in kernel/printk/printk.c which take logbuf_lock,
> but this patch only sets logbuf_cpu in one of those cases:
> vprintk_emit(). I suggest adding helper functions to take/release
> logbuf_lock. And rename logbuf_lock to something else to ensure that
> nobody accidentally takes the lock directly.

IMHO, vprintk_emit() is special. It is the only location where the
lock is taken in NMI context. The other functions are used to dump
@logbuf and are called in normal context.

try_logbuf_lock_in_nmi() could fail and we need to handle the error
path. We do not need to do this in the other locations.

Note that we do not want to get the console in NMI because
there are even more locks that might cause a deadlock.


In each case, thanks a lot for detailed review. It is really appreciated.

Best Regards,
Petr
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/