Re: [RFC PATCH v2 1/2] printk-rb: add a new printk ringbuffer implementation

From: John Ogness
Date: Tue Jun 25 2019 - 08:03:54 EST


On 2019-06-25, Sergey Senozhatsky <sergey.senozhatsky.work@xxxxxxxxx> wrote:
>>>> In vprintk_emit(), are we going to always reserve 1024-byte
>>>> records, since we don't know the size in advance, e.g.
>>>>
>>>> printk("%pS %s\n", regs->ip, current->name)
>>>> prb_reserve(&e, &rb, ????);
>>>>
>>>> or are we going to run vscnprintf() on a NULL buffer first,
>>>> then reserve the exactly required number of bytes and afterwards
>>>> vscnprintf(s) -> prb_commit(&e)?
>>>
>>> (As suggested by Petr) I want to use vscnprintf() on a NULL
>>> buffer. However, a NULL buffer is not sufficient because things like the
>>> loglevel are sometimes added via %s (for example, in /dev/kmsg). So
>>> rather than a NULL buffer, I would use a small buffer on the stack
>>> (large enough to store loglevel/cont information). This way we can use
>>> vscnprintf() to get the exact size _and_ printk_get_level() will see
>>> enough of the formatted string to parse what it needs.
>>
>> vscnprintf() with NULL pointer is perfectly fine. Only the formatted
>> string has variable size.
>
> Yeah, that should work. Probably. Can't think of any issues, except
> for increased CPU usage. Some sprintf() format specifiers are heavier
> than the rest (pS/pF on ia64/ppc/hppa).
>
> OK, very theoretically.
>
> There is a difference.
>
> Doing "sz = vscprintf(NULL, msg); vscnprintf(buf, sz, msg)" for
> msg_print_text() and msg_print_ext_header() was safe, because the
> data - msg - would not change under us, we would work with logbuf
> records, IOW with data which is owned by printk() and printk only.
>
> But doing
> sz = vcsprintf(NULL, "xxx", random_pointer);
> if ((buf = prb_reserve(... sz))) {
> vscnprintf(buf, sz, "xxx", random_pointer);
> prb_commit(...);
> }
>
> might have different outcome sometimes. We probably (!!!) can have
> some race conditions. The problem is that, unlike msg_print_text()
> and msg_print_ext_header(), printk() works with pointers which it
> does not own nor control. IOW within single printk() we will access
> some random kernel pointers, then do prb stuff, then access those
> same pointers, expecting that none of them will ever change their
> state. A very simple example
>
> printk("Comm %s\n", current->comm)
>
> Suppose printk on CPU0 and ia64_mca_modify_comm on CPU1
>
> CPU0 CPU1
> printk(...)
> sz = vscprintf(NULL, "Comm %s\n", current->comm);
> ia64_mca_modify_comm()
> snprintf(comm, sizeof(comm), "%s %d", current->comm, previous_current->pid);
> memcpy(current->comm, comm, sizeof(current->comm));
> if ((buf = prb_reserve(... sz))) {
> vscnprintf(buf, "Comm %s\n", current->comm);
> ^^^^^^^^^^^^^^ ->comm has changed.
> Nothing critical, we
> should not corrupt
> anything, but we will
> truncate ->comm if its
> new size is larger than
> what it used to be when
> we did vscprintf(NULL).
> prb_commit(...);
> }
>
> Probably there can be other examples.

This is a very good point, and quite important. It is not acceptable if
some crash output is cut off because of this effect.

In my v1 rfc series, I avoided this issue by having a separate dedicated
ringbuffer (rb_sprintf) that was used to allocate a temporary max-size
(2KB) buffer for sprinting to. Then _that_ was used for the real
ringbuffer input (strlen, prb_reserve, memcpy, prb_commit). That would
still be the approach of my choice.

John Ogness