Re: [PATCH printk v1 07/18] printk: nobkl: Add buffer management

From: Petr Mladek
Date: Thu Mar 23 2023 - 11:25:25 EST


On Thu 2023-03-23 14:44:43, John Ogness wrote:
> On 2023-03-21, Petr Mladek <pmladek@xxxxxxxx> wrote:
> > The per-CPU buffer actually looks dangerous. It might
> > be used by more NOBKL consoles. How is the access synchronized
> > please? By console_list_lock? It is not obvious to me.
>
> Each console has its own set of per-CPU buffers (con->pcpu_data).
>
> > On the contrary, we might need 4 static buffers for the early
> > boot. For example, one atomic console might start printing
> > in the normal context. Second atomic console might use
> > the same static buffer in IRQ context. But the first console
> > will not realize it because it did not loose the per-CPU
> > atomic lock when the CPU handled the interrupt..
> > Or is this handled another way, please?
>
> You are correct! Although I think 3 initdata static buffers should
> suffice. (2 if the system does not support NMI).

I am never completely sure about it. My undestanding is that softirq
might be proceed at the end if irq_exit():

+ irq_exit()
+ __irq_exit_rcu()
+ invoke_softirq()
+ __do_softirq()

And I see local_irq_enable() in __do_softirq() before softirq actions
are proceed. It means that there might be 4 nested contexts:

+ task
+ softirq
+ irq
+ NMI

So we need 4 buffers (3 if the system does not support NMI).


> Your feedback points out that we are allocating a lot of extra memory
> for the rare case of a hostile takeover from another CPU when in
> panic. I suppose it would be enough to have a single dedicated panic
> buffer to use in this case.

Yup.

> With all that in mind, we would have 3 initdata early buffers, a single
> panic buffer, and per-console buffers. So the function would look
> something like this:
>
> static __ref void cons_context_set_pbufs(struct cons_context *ctxt)
> {
> struct console *con = ctxt->console;
>
> if (atomic_read(&panic_cpu) == smp_processor_id())
> ctxt->pbufs = &panic_ctxt_data.pbufs;
> else if (con->pbufs)
> ctxt->pbufs = con->pbufs;
> else
> ctxt->pbufs = &early_cons_ctxt_data[early_nbcon_nested].pbufs;
> }

Looks good.

> It should be enough to increment @early_nbcon_nested in cons_get_wctxt()
> and decrement it in a new cons_put_wctxt() that is called after
> cons_atomic_flush_con().

I still have to understand the logic related to
cons_atomic_flush_con() and early boot.


> Originally in tglx's design, hostile takeovers were allowed at any time,
> which requires the per-CPU data per console. His idea was that the
> policy about hostile takeovers should be implemented outside the nbcons
> framework. However, with this newly proposed change in order to avoid
> per-CPU buffers for every console, we are adding an implicit rule that
> hostile takeovers only occur at panic. Maybe it is ok to hard-code this
> particular policy. It would certainly save significant buffer space and
> I not sure if hostile takeovers make sense outside of a panic context.

I am not sure about the hostile takeovers as well. But they might be
potentially dangerous so I would allow them only in panic for a start.
And I would avoid the per-CPU buffers if we do not need them now.
We could always make it more complicated...

Best Regards,
Petr