Re: [PATCH bpf-next v1 4/9] bpf: Introduce sleepable tracepoints

From: Alexei Starovoitov
Date: Thu Mar 03 2022 - 15:04:53 EST


On Thu, Mar 3, 2022 at 12:02 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Mar 3, 2022 at 11:43 AM Hao Luo <haoluo@xxxxxxxxxx> wrote:
> >
> > On Wed, Mar 2, 2022 at 6:29 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, Mar 2, 2022 at 5:09 PM Yonghong Song <yhs@xxxxxx> wrote:
> > > >
> > > >
> > > >
> > > > On 3/2/22 1:30 PM, Alexei Starovoitov wrote:
> > > > > On Wed, Mar 2, 2022 at 1:23 PM Yonghong Song <yhs@xxxxxx> wrote:
> > > > >>
> > > > >>
> > > > >>
> > > > >> On 2/25/22 3:43 PM, Hao Luo wrote:
> > > > > >>> Add a new type of bpf tracepoints: sleepable tracepoints, which allow
> > > > > >>> the handler to make calls that may sleep. With sleepable tracepoints, a
> > > > > >>> set of syscall helpers (which may sleep) may also be called from
> > > > > >>> the handler.
> > > > >>
> > > > >> There are some old discussions on sleepable tracepoints, maybe
> > > > >> worthwhile to take a look.
> > > > >>
> > > > >> https://lore.kernel.org/bpf/20210218222125.46565-5-mjeanson@xxxxxxxxxxxx/T/
> > > > >
> > > > > Right. It's very much related, but obsolete too.
> > > > > > We don't need any of that for sleepable _raw_ tps.
> > > > > I prefer to stay with "sleepable" name as well to
> > > > > match the rest of the bpf sleepable code.
> > > > > In all cases it's faultable.
> > > >
> > > > sounds good to me. Agree that for the bpf use case, Hao's
> > > > implementation should be enough.
> > >
> > > Just remembered that we can also do a trivial noinline __weak
> > > nop function and mark it sleepable on the verifier side.
> > > That's what we were planning to do to trace map update/delete ops
> > > in Joe Burton's series.
> > > Then we don't need to extend tp infra.
> > > I'm fine whichever way. I see pros and cons in both options.
> >
> > Joe is also cc'ed in this patchset, I will sync up with him on the
> > status of trace map work.
> >
> > Alexei, do we have potentially other variants of tp? We can make the
> > current u16 sleepable a flag, so we can reuse this flag later when we
> > have another type of tracepoints.
>
> When we added the ability to attach to kernel functions and mark them
> as allow_error_inject, the usefulness of tracepoints and even
> writable tracepoints was diminished.
> If we do sleepable tracepoint, I suspect, it may be the last extension
> in that area.
> I guess I'm convincing myself that noinline weak nop func
> is better here. Just like it's better for Joe's map tracing.

To add to the above... The only downside of a sleepable nop func
compared to a tp is the lack of static_branch.
So this nop call will always be there.
For map tracing and for cgroup mkdir/rmdir the few nanosecond
overhead of calling an empty function isn't even measurable.
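
For illustration, a rough userspace sketch of the noinline weak nop func
idea, not actual kernel code. The names sleepable_hook_cgroup_mkdir() and
do_cgroup_mkdir() are hypothetical, and the attributes are the GCC/clang
spellings rather than the kernel's noinline/__weak macros. Marking the
function sleepable on the verifier side is not shown.

```c
/*
 * Hypothetical attach point: an empty function that is never inlined, so
 * the call site always exists and a program could attach to its entry.
 * Unlike a tracepoint there is no static_branch guarding the call; the
 * call is always made, even with nothing attached.
 */
__attribute__((noinline, weak))
int sleepable_hook_cgroup_mkdir(const char *path, unsigned int mode)
{
	/* Empty asm keeps the compiler from optimizing calls away. */
	__asm__ volatile("");
	return 0;
}

/* Hypothetical caller standing in for the cgroup mkdir path. */
int do_cgroup_mkdir(const char *path, unsigned int mode)
{
	/* ... the real work would happen here ... */

	/* Unconditional call into the nop hook. */
	return sleepable_hook_cgroup_mkdir(path, mode);
}
```

With nothing attached, the cost is one call and return per mkdir, which
is the few nanosecond overhead mentioned above.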