Re: recursion handling: Re: [PATCH next v2 3/3] printk: remove logbuf_lock, add syslog_lock

From: John Ogness
Date: Sun Dec 06 2020 - 16:45:39 EST


On 2020-12-04, Petr Mladek <pmladek@xxxxxxxx> wrote:
> On Tue 2020-12-01 21:59:41, John Ogness wrote:
>> Since the ringbuffer is lockless, there is no need for it to be
>> protected by @logbuf_lock. Remove @logbuf_lock.
>>
>> --- a/kernel/printk/printk.c
>> +++ b/kernel/printk/printk.c
>> @@ -1847,6 +1811,65 @@ static void call_console_drivers(const char *ext_text, size_t ext_len,
>> }
>> }
>
> The recursion-related code needs some explanation or we should do it
> another way. I spent quite some time on it and I am still not sure
> that I understand it.

Sorry. :-/

> Let me describe how I understand it.
>
>> +#ifdef CONFIG_PRINTK_NMI
>> +#define NUM_RECURSION_CTX 2
>> +#else
>> +#define NUM_RECURSION_CTX 1
>> +#endif
>
> OK, the number of context is limited because interrupts are disabled
> inside print_enter()/printk_exit(). It is basically the same reason
> why we have only two printk_safe buffers (NNI + other contexts).

Correct.

> What is the exact reason to disable interrupts around the entire
> vprintk_store(), please? It should get documented.

It simplifies the context tracking. Also, in mainline interrupts are
already disabled for all of vprintk_store(). AFAIK latencies due to
logbuf_lock contention were not an issue.

> One reason is the use of per-cpu variables. Alternative solution would
> be to store printk_context into task_struct. Well, I am not sure if
> "current" task is available during early boot. But it might solve
> problems with per-cpu variables that are not working during early
> boot.
>
> That said, I am not sure if it is worth it.

I really don't want to touch task_struct. IMHO the usefulness of that
struct is limited, considering that printk can be called from scheduling
and interrupting contexts.

>> +
>> +struct printk_recursion {
>> + char count[NUM_RECURSION_CTX];
>> +};
>>
>> +static DEFINE_PER_CPU(struct printk_recursion, percpu_printk_recursion);
>> +static char printk_recursion_count[NUM_RECURSION_CTX];
>
> This is pretty confusing. The array is hidden in a struct when per-cpu
> variables are used. And a naked array is used for early boot.
>
> Is the structure really needed? What about?
>
> static DEFINE_PER_CPU(char [PRINTK_CTX_NUM], printk_count);
> static char printk_count_early[NUM_RECURSION_CTX];

OK.

>> +
>> +static char *get_printk_count(void)
>> +{
>> + struct printk_recursion *rec;
>> + char *count;
>> +
>> + if (!printk_percpu_data_ready()) {
>> + count = &printk_recursion_count[0];
>
> I see why you avoided per-cpu variables for early boot. I am just
> curious how printk_context variable works these days. It is used by
> any printk(), including early code, see vprintk_func().

IMO printk_context is serving a different purpose. With the existance of
logbuf_lock, printk_context exists for the sole purpose of making sure
logbuf_lock is not taken recursively or that the CPU does not spin on it
in NMI context. printk_context is simply gating calls to the safe
buffers.

For the lockless ringbuffer, there is no issue of taking a lock
recursively or dangers from NMI. There is no need for the printk_context
"gate". However, IMHO there is a real danger if a bug in printk (or its
ringbuffer) lead to infinite recursion. This new recursion counter is
offering safety against this scenario. Until now this scenario has been
ignored. So I suppose if we are comfortable with continuing to ignore
the scenario, then we don't need to track the recursion level.

To test, I triggered artificial WARNs in vsnprintf() of printk code. I
found it nice to be able to see the stack trace going into printk and at
the same time I was relieved that such a nested warning was not blasting
the system into infinite recursion.

>> + } else {
>> + rec = this_cpu_ptr(&percpu_printk_recursion);
>> +
>> + count = &rec->count[0];
>> + }
>> +
>> +#ifdef CONFIG_PRINTK_NMI
>> + if (in_nmi())
>> + count++;
>> +#endif
>
> This is extremely confusing. It is far from obvious that
> the pointer and not the value is incremented.
>
> If we really need this to avoid per-cpu variables during early boot
> then a more clear implementation would be:
>
> char *get_printk_counter_by_ctx()
> {
> int ctx = 0;
>
> if (in_nmi)
> ctx = 1;
>
> if (!printk_percpu_data_ready())
> return &printk_count_early[ctx];
>
> return this_cpu_ptr(printk_count[ctx]);
> }

Yes, much cleaner. Thank you.

>> +
>> + return count;
>> +}
>> +
>> +static bool printk_enter(unsigned long *flags)
>> +{
>> + char *count;
>> +
>> + local_irq_save(*flags);
>> + count = get_printk_count();
>> + /* Only 1 level of recursion allowed. */
>
> We should allow at least some level of recursion. Otherwise, we would
> not see warnings printed from vsprintf code.

With 1 level, you will see warnings from vsprintf code. I'm not sure it
makes sense to allow more than 1 level. It causes exponential logging.

>> + if (*count > 1) {
>> + local_irq_restore(*flags);
>> + return false;
>> + }
>> + (*count)++;
>> +
>> + return true;
>> +}
>
> This should be unified with printk_context, printk_nmi_enter(),
> printk_nmi_exit(). It does not make sense to have two separate
> printk context counters.
>
> Or is there any plan to remove printk_safe and printk_context?

Yes, I plan to remove the safe buffers, which also removes printk_safe.c
and the printk_context "gate".

> BTW: I prefer to use the bitmask approach. It allows to check
> the normal context by a single operation (no bit is set).
> There is no need to go through all counters.

OK.

> Note that we need at least one more context for gdb.

Ah yes, thank you.

John Ogness