Re: [RFC PATCH v2] printk: console: Allow each console to have its own loglevel

From: Petr Mladek
Date: Thu Jul 07 2022 - 10:38:36 EST


On Fri 2022-05-20 12:06:33, Rik van Riel wrote:
> On Fri, 2022-05-20 at 13:57 +0100, Chris Down wrote:
> > [Once the goals of this patch are generally agreed upon, it can be
> > split
> > out further with more detailed changelogs if desired.]
> >
> > Consoles can have vastly different latencies and throughputs. For
> > example, writing a message to the serial console can take on the
> > order
> > of tens of milliseconds to get the UART to successfully write a
> > message.
> > While this might be fine for a single, one-off message, this can
> > cause
> > significant application-level stalls in situations where the kernel
> > writes large amounts of information to the console.
> >
> It's more than just application-level stalls. I have seen
> some cases of the kernel spending so much time logging things
> to serial console that it thinks it locked up, and panics as
> a result of how slow the serial console is.
>
> Adding insult to injury, because the log level is sytem wide,
> we only see _some_ of the hints of why the kernel started
> spewing like that in the netcons logs.
>
> If we print all the information, we will have more hosts panic
> because we spent too much time in the serial console code.
>
> If we print less information, we won't find out some of the
> other things causing issues on systems.
>
> Having per console log levels will allow us to avoid the
> serial console issues, and gather all the info we need on
> other stuff happening on the system.

The problem is clear. But the big part of the problem is that printk()
tries to show the messages on all consoles immediately.

I wonder how much the per-console loglevel would be needed
when the console handling is offloaded to per-console kthreads, see
https://lore.kernel.org/all/20220421212250.565456-1-john.ogness@xxxxxxxxxxxxx/
It causes that printk() should "never" block and each console might
run on its own speed.

It still might be useful from some reasons:

+ Serial consoles might miss messages because the old messages are
over-written before they reach the console. It might be solved
by big enough buffer.

+ printk() still tries to show the messages immediately in some
critical situations, for example, early boot, watchdog warnings,
suspend, reboot, OOps, panic(). The slow consoles might still
cause stalls and put the system into its knees.

+ People might need to explicitly disable the kthreads, for
example, when debugging a situation when kthreads are not
scheduled.


So, I think that the per-console loglevels might still be useful.
But I wonder if they really will be used in practice. It does
not make sense to add feature that would get obsoleted by
the kthreads.

Note that the per-console kthreads were added into 5.19-rc1.
Unfortunately they were reverted in 5.19-rc4 because there were
some issues that need more work. But we still believe that
they are needed and we could make them working reliably.

Best Regards,
Petr

PS: I am sorry for the late response. I am still snowed under
many tasks. The printk kthreads are complicated and need
a lot of attention. Plus there was a sickness, vacations,
and other tasks.