Re: [PATCH bpf-next v4 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf

From: Andrii Nakryiko
Date: Thu Apr 15 2021 - 18:38:13 EST


On Thu, Apr 15, 2021 at 2:33 AM Florent Revest <revest@xxxxxxxxxxxx> wrote:
>
> On Thu, Apr 15, 2021 at 2:38 AM Andrii Nakryiko
> <andrii.nakryiko@xxxxxxxxx> wrote:
> > On Wed, Apr 14, 2021 at 11:54 AM Florent Revest <revest@xxxxxxxxxxxx> wrote:
> > > +static int try_get_fmt_tmp_buf(char **tmp_buf)
> > > +{
> > > + struct bpf_printf_buf *bufs;
> > > + int used;
> > > +
> > > + if (*tmp_buf)
> > > + return 0;
> > > +
> > > + preempt_disable();
> > > + used = this_cpu_inc_return(bpf_printf_buf_used);
> > > + if (WARN_ON_ONCE(used > 1)) {
> > > + this_cpu_dec(bpf_printf_buf_used);
> >
> > this makes me uncomfortable. If used > 1, you won't preempt_enable()
> > here, but you'll decrease count. Then later bpf_printf_cleanup() will
> > be called (inside bpf_printf_prepare()) and will further decrease
> > count (which it didn't increase, so it's a mess now).
>
> Awkward, yes. :( This code is untested because it only covers a niche
> preempt_rt usecase that is hard to reproduce but I should have thought
> harder about these corner cases.
>
> > > + i += 2;
> > > + if (!final_args)
> > > + goto fmt_next;
> > > +
> > > + if (try_get_fmt_tmp_buf(&tmp_buf)) {
> > > + err = -EBUSY;
> > > + goto out;
> >
> > this probably should bypass doing bpf_printf_cleanup() and
> > try_get_fmt_tmp_buf() should enable preemption internally on error.
>
> Yes. I'll fix this and spend some more brain cycles thinking about
> what I'm doing. ;)
>
> > > -static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
> > > +BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
> > > + u64, arg2, u64, arg3)
> > > {
> > > + u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
> > > + enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
> > > static char buf[BPF_TRACE_PRINTK_SIZE];
> > > unsigned long flags;
> > > - va_list ap;
> > > int ret;
> > >
> > > - raw_spin_lock_irqsave(&trace_printk_lock, flags);
> > > - va_start(ap, fmt);
> > > - ret = vsnprintf(buf, sizeof(buf), fmt, ap);
> > > - va_end(ap);
> > > - /* vsnprintf() will not append null for zero-length strings */
> > > + ret = bpf_printf_prepare(fmt, fmt_size, args, args, mod,
> > > + MAX_TRACE_PRINTK_VARARGS);
> > > + if (ret < 0)
> > > + return ret;
> > > +
> > > + ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > > + BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
> > > + /* snprintf() will not append null for zero-length strings */
> > > if (ret == 0)
> > > buf[0] = '\0';
> > > +
> > > + raw_spin_lock_irqsave(&trace_printk_lock, flags);
> > > trace_bpf_trace_printk(buf);
> > > raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
> > >
> > > - return ret;
> >
> > see here, no + 1 :(
>
> I wonder if it's a bug or a feature though. The helper documentation
> says the helper returns "the number of bytes written to the buffer". I
> am not familiar with the internals of trace_printk but if the
> terminating \0 is not outputted in the trace_printk buffer, then it
> kind of makes sense.
>
> Also, if anyone uses this return value, I can imagine that the usecase
> would be if (ret == 0) assume_nothing_was_written(). And if we
> suddenly output 1 here, we might break something.
>
> Because the helper is quite old, maybe we should improve the helper
> documentation instead? Your call :)

Yeah, let's make helper's doc a bit more precise, otherwise let's not
touch it. I doubt many users ever check return result of
bpf_trace_printk() at all, tbh.