Re: linux.git: printk() problem

From: Sergey Senozhatsky
Date: Mon Oct 24 2016 - 21:56:17 EST


Hello,

Cc more people

report: https://marc.info/?l=linux-kernel&m=147721454506634&w=2

patch: https://marc.info/?l=linux-kernel&m=147733173800859


FB is using ext header a lot (afaik), so may be Tejun or Calvin
will also be interested.

On (10/24/16 10:55), Linus Torvalds wrote:
> > Note the "totally untested" part. It compiles for me. But it may do
> > unspeakably stupid things. Caveat applior.
>
> Well, it is hard to apply a patch that I didn't even attach. Blush.

[..]

> -static void cont_flush(void)
> +static bool cont_flush(void)
> {
> - if (cont.flushed)
> - return;
> - if (cont.len == 0)
> + if (!cont.len)
> + return false;
> +
> + log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec,
> + NULL, 0, cont.buf, cont.len);
> + cont.len = 0;
> + return true;
> +}


I think cont_flush() should grab the logbuf_lock lock, because
it does log_store() and touches the cont.len. so something like
this perhaps

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index c7f490f..47f887c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1608,13 +1608,20 @@ static struct cont {

static bool cont_flush(void)
{
+ unsigned long flags;
+ bool flushed = false;
+
+ raw_spin_lock_irqsave(&logbuf_lock, flags);
if (!cont.len)
- return false;
+ goto out;

log_store(cont.facility, cont.level, cont.flags, cont.ts_nsec,
NULL, 0, cont.buf, cont.len);
cont.len = 0;
- return true;
+ flushed = true;
+out:
+ raw_spin_unlock_irqrestore(&logbuf_lock, flags);
+ return flushed;
}


[..]
> @@ -2449,7 +2311,6 @@ void console_unlock(void)
> }
> console_idx = log_next(console_idx);
> console_seq++;
> - console_prev = msg->flags;
> raw_spin_unlock(&logbuf_lock);
>
> stop_critical_timings(); /* don't trace print latency */
> @@ -2483,7 +2344,7 @@ void console_unlock(void)
> if (retry && console_trylock())
> goto again;
>
> - if (wake_klogd)
> + if (wake_klogd || cont.len)
^^^^^^^^^^
this _technically_ can result in additional spurious wakeups - cont.len
check is done outside of console_sem && logbuf_lock - but I don't think
this is a huge problem.

-ss