Re: two locations: was: Re: [PATCH printk v1 03/13] printk: use percpu flag instead of cpu_online()

From: Petr Mladek
Date: Fri Mar 04 2022 - 11:14:39 EST


On Wed 2022-03-02 15:55:23, John Ogness wrote:
> On 2022-02-16, Petr Mladek <pmladek@xxxxxxxx> wrote:
> >> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> >> index d1b773823d63..b346e60e9e51 100644
> >> --- a/kernel/printk/printk.c
> >> +++ b/kernel/printk/printk.c
> >> @@ -2577,11 +2577,11 @@ static int have_callable_console(void)
> >> *
> >> * Console drivers may assume that per-cpu resources have been allocated. So
> >> * unless they're explicitly marked as being able to cope (CON_ANYTIME) don't
> >> - * call them until this CPU is officially up.
> >> + * call them until per-cpu resources have been allocated.
> >> */
> >> static inline int can_use_console(void)
> >> {
> >> - return cpu_online(raw_smp_processor_id()) || have_callable_console();
> >> + return (printk_percpu_data_ready() || have_callable_console());
> >> }
> >
> > cpu_online(raw_smp_processor_id()) check is used also in
> > call_console_drivers(). The same logic should be used in both
> > locations.
> >
> > I found this when reviewing 6th patch that replaced both checks
> > with a single one.
> >
> > Note that I am still not sure if this change is correct at all.
> > It will allow to always call the console during CPU hotplug
> > and I am not sure if it is safe. IMHO, it might cause problems when
> > a console driver uses, for example, CPU-bound workqueues.
>
> You are correct. We must take hotplug into account for !CON_ANYTIME
> consoles. There may be some hotplug callbacks that make memory
> unavailable for the console.
>
> However, I will add the use of printk_percpu_data_ready() in the
> check. !CON_ANYTIME consoles also should not be called until the per-cpu
> areas are ready. For example, it would be bad if a console queued
> irq_work before per-cpu areas are setup (cpu_online() is true during
> this time).
>
> One of my main concerns was that raw_smp_processor_id() was used for the
> check. It is conceptually wrong to exclude certain consoles based on a
> current CPU when migration is still enabled. I understand that the use
> of can_use_console() is an optimization to avoid doing extra work where
> there are no consoles available. But the task could be preemptible there
> and _conceptually_, could get moved to another CPU before its write()
> callback is called. The cpu_online() check belongs in code where
> preemption is disabled.
>
> If the context is preemptible, I do not think it will ever see
> !cpu_online(). So I think if the cpu_online() check is limited to
> unlocking when console_trylock() was used, it will be correct.

This would require calling console_lock()/console_unlock()
in a hotplug code when cpu_online() already returns false.
Do I get it correctly?

I agree that it should not happen. console_lock() must be called in
a preemptible context. And CPU should not be in a pre-emptible
context when cpu_online() returns false. To be honest, I did
not check the code. It just does not make much sense.

> In the current implementation of printk(), it would be odd to do this
> conditional check (perhaps by passing @do_cond_resched to
> can_use_console()). But my series does significant refactoring and
> actually does need to distinguish between console_lock() and
> console_trylock() due to the kthreads and supporting the handover. So it
> should work well that the cpu_online() check for !CON_ANYTIME is only
> performed when !preemptible.
>
> Regardless, my v2 will keep cpu_online() checks since they are necessary
> for hotplug support.

Yes, I would do it to stay on the safe side.

Best Regards,
Petr