Re: [RFC][PATCHv2 3/7] printk: introduce per-cpu alt_print seq buffer

From: Sergey Senozhatsky
Date: Fri Sep 30 2016 - 22:24:52 EST


On (10/01/16 00:17), Sergey Senozhatsky wrote:
[..]
> +void alt_printk_enter(void)
> +{
> + unsigned long flags;
-
> + local_irq_save(flags);
> + if (!(this_cpu_read(alt_printk_ctx) & ALT_PRINTK_CONTEXT_MASK))
> + this_cpu_write(alt_printk_irq_flags, flags);
> + this_cpu_inc(alt_printk_ctx);
> +}
[..]
> +void alt_printk_exit(void)
> +{
> + this_cpu_dec(alt_printk_ctx);
> + if (!(this_cpu_read(alt_printk_ctx) & ALT_PRINTK_CONTEXT_MASK))
> + local_irq_restore(this_cpu_read(alt_printk_irq_flags));
> +}

a few words why did I make it so complex here. it may look that nested
alt_printk_enter()-s are not really possible; but it's not exactly the
case. and the reasons being is that printk exports too many symbols.
basically, the only 'safe' printk variant at the moment is

printk()
vprintk_func()

because it 'respects' current printk context... and that's it.
the reset of them will call vprintk_emit() regardless the context.
this patch set fixes (in 0001) vprintk()

vprintk()
vprintk_func()

but we still have

EXPORT_SYMBOL(vprintk_emit)
EXPORT_SYMBOL(printk_emit)
EXPORT_SYMBOL_GPL(vprintk_default)

any of those called from alt_printk_log_store() will deadlock us.

printk()
vprintk_emit()
alt_printk_enter()
raw_spin_lock(&logbuf_lock)
...
printk() << first recursion
vprintk_alt()
alt_printk_log_store()
vprintk_default()
---> vprintk_emit() << second recursion
raw_spin_lock(&logbuf_lock) << deadlock
...
alt_printk_exit()



our vprintk_alt()->alt_printk_log_store() seems to be *more or less*
safe, for the time being. there are many WARN_ON_* on
vprintk_alt()->alt_printk_log_store()->vsnprintf() path but they all
seem to be calling printk():

- WARN_ON_ONCE() from vsnprintf()
- WARN_ONCE() from vsnprintf()->format_decode()
- WARN_ON vsnprintf()->set_field_width()
- WARN_ON from vsnprintf()->set_precision()
- WARN_ON from vsnprintf()->pointer()->flags_string()

a side note, some of these WARNs are... 'funny'. e.g., to deadlock a
system it's enough to just pass an unsupported flag in format string.
vsnprintf() will
WARN_ONCE(1, "Please remove unsupported %%%c in format string\n", *fmt)

but the problem is that we are already in printk():

printk()
raw_spin_lock(&logbuf_lock)
text_len = vscnprintf(text, sizeof(textbuf), fmt, args)
WARN_ONCE(1, "Please remove unsupported ...)
printk()
raw_spin_lock(&logbuf_lock) << deadlock



with alt_printk enabled we are cool now.

[ 175.226161] BUG: recent printk recursion!
[ 175.226170] ------------[ cut here ]------------
[ 175.226173] WARNING: CPU: 3 PID: 361 at lib/vsprintf.c:1900 format_decode+0x22c/0x308
[ 175.226175] Please remove unsupported %{ in format string
[ 175.226182] CPU: 3 PID: 361 Comm: bash
[ 175.226194] Call Trace:
[ 175.226197] [<ffffffff811efbaf>] dump_stack+0x68/0x92
[ 175.226199] [<ffffffff8103d5e7>] __warn+0xb8/0xd3
[ 175.226202] [<ffffffff8103d648>] warn_slowpath_fmt+0x46/0x4e
[ 175.226204] [<ffffffff811f648d>] format_decode+0x22c/0x308
[ 175.226207] [<ffffffff811f7715>] vsnprintf+0x89/0x481
[ 175.226209] [<ffffffff811f7e92>] vscnprintf+0xd/0x26
[ 175.226212] [<ffffffff81080701>] vprintk_emit+0xcc/0x271
[ 175.226214] [<ffffffff81080a08>] vprintk_default+0x18/0x1a
[ 175.226217] [<ffffffff810819b1>] vprintk_func+0x82/0x89
[ 175.226219] [<ffffffff810c82d0>] printk+0x3e/0x46
[..]
[ 175.226244] [<ffffffff814c1d6a>] entry_SYSCALL_64_fastpath+0x18/0xad
[ 175.226249] ---[ end trace ]---



speaking of %p. FORMAT_TYPE_PTR is big and sometimes seem to be platform
dependent. there are tons of things that can go wrong there. for instance:

- vsnprintf()
pointer()
symbol_string()
sprint_backtrace()
__sprint_symbol()
kallsyms_lookup()
get_symbol_pos()
BUG_ON

- vsnprintf()
pointer()
symbol_string()
sprint_backtrace()
__sprint_symbol()
kallsyms_lookup()
module_address_lookup()
__module_address()
BUG_ON

- vsnprintf()
pointer()
symbol_string()
sprint_backtrace()
__sprint_symbol()
kallsyms_lookup()
module_address_lookup()
module_assert_mutex_or_preempt()
WARN_ON_ONCE()


and probably even more.

for example, drivers/base/core.c has its own printk() implementation
and calls vprintk_emit() directly
return vprintk_emit(0, level, hdrlen ? hdr : NULL, hdrlen, fmt, args)


so as a precautions measure I've decided to enable alt_printk_enter()
nesting. but can drop it.

-ss