Re: [PATCH] printk: use console_trylock() in console_cpu_notify()

From: Petr Mladek
Date: Wed Jan 25 2017 - 10:03:16 EST


On Sat 2017-01-21 19:47:29, Sergey Senozhatsky wrote:
> There is no need to always call blocking console_lock() in
> console_cpu_notify(), it's quite possible that console_sem can
> be locked by other CPU on the system, either already printing
> or soon to begin printing the messages. console_lock() in this
> case can simply block CPU hotplug for unknown period of time
> (console_unlock() is time unbound). Not that hotplug is very
> fast, but still, with other CPUs being online and doing
> printk() console_cpu_notify() can stuck.
>
> Use console_trylock() instead and opt-out if console_sem is
> already acquired from another CPU, since that CPU will do
> the printing for us.
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
> ---
> kernel/printk/printk.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 7180088cbb23..772eb16436ce 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2028,15 +2028,16 @@ void resume_console(void)
> * @cpu: unused
> *
> * If printk() is called from a CPU that is not online yet, the messages
> - * will be spooled but will not show up on the console. This function is
> - * called when a new CPU comes online (or fails to come up), and ensures
> - * that any such output gets printed.
> + * will be printed on the console only if there are CON_ANYTIME consoles.
> + * This function is called when a new CPU comes online (or fails to come
> + * up) or goes offline.
> */
> static int console_cpu_notify(unsigned int cpu)
> {
> if (!cpuhp_tasks_frozen) {
> - console_lock();
> - console_unlock();
> + /* If trylock fails, someone else is doing the printing */
> + if (console_trylock())
> + console_unlock();
> }
> return 0;
> }

Rather theoretically, the other owner of console_sem might be
on a CPU that is going online/offline and will refuse to print
the messages.

It is not that big deal when it goes online because then we will
call this notifier also for the other CPU and it will flush
the messages.

The problem is when the CPU goes offline. But it is broken
even now. It does not make much sense to call this notifier
when cpu_online(raw_smp_processor_id() returns false,
see can_use_console().


All in all. The change looks fine to me because it most likely
improves the performance[1] and does not affect the reliability
that much.

But we should revisit this notifier after adding the async printk
patchset. It will allow to defer the console handling to
another CPU and work even for going CPUs.

I add some people working on CPU hotplug into CC to get their
opinion.

[1] https://lkml.kernel.org/r/20170119120744.GB435@xxxxxxxxxxxxxxxxxxx

Best Regards,
Petr