Re: [PATCH v4] kretprobe: percpu support

From: Masami Hiramatsu
Date: Thu Mar 05 2020 - 09:46:41 EST


On Thu, 5 Mar 2020 12:05:42 +0100
Luigi Rizzo <lrizzo@xxxxxxxxxx> wrote:

> > > As part of this patch, we factor out the code to allocate instances in
> > > get_pcpu_rp_instance() and get_rp_instance().
> > >
> > > At the moment we only allow one pending kretprobe per CPU. This can be
> > > extended to a small constant number of entries, but finding a free entry
> > > would either bring back the lock, or require scanning an array, which can
> > > be expensive (especially if callers block and migrate).
> >
> > I think if you disable irqs while scanning the array (which should be
> > small), you don't need to be afraid of such races (maybe we need
> > a pair of memory barriers).
> >
>
> To be clear, I was not concerned about races (disabling irqs solves that;
> worst case we'll miss an entry being freed by another core while we scan).
> The cost I worried about arises when we have many busy entries which may
> be out of the local cache of cpu X, e.g. because the thread that grabbed
> the entry moved to another cpu Y and is updating the record there. This
> can be partially mitigated by putting the user block in a different cache
> line, so the cache conflict will happen only once, on release.
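
To make the layout under discussion concrete, here is a rough userspace
sketch. It is not code from the patch. The names rp_slot, rp_slot_array,
slot_get() and slot_put(), the 64-byte cache line, and the 4-entry array
are all assumptions for illustration, and a C11 compare-and-swap stands in
for the irq-disabled scan.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHE_LINE 64	/* assumed cache line size */
#define NR_SLOTS   4	/* assumed "small constant" number of entries */

/* Hypothetical entry: the flag and the user block sit on separate lines. */
struct rp_slot {
	/* Read by the owning CPU on every scan. */
	alignas(CACHE_LINE) atomic_int busy;
	/* Written by whichever CPU the holder currently runs on. */
	alignas(CACHE_LINE) char user_data[CACHE_LINE];
};

/* Hypothetical per-CPU array; one instance per CPU is assumed. */
struct rp_slot_array {
	struct rp_slot slot[NR_SLOTS];
};

/* Writes to user_data must not touch the line holding the busy flag. */
static_assert(offsetof(struct rp_slot, user_data) >= CACHE_LINE,
	      "user block shares a cache line with the busy flag");

/* Scan for a free entry and claim it; returns NULL if all are busy. */
static struct rp_slot *slot_get(struct rp_slot_array *a)
{
	for (size_t i = 0; i < NR_SLOTS; i++) {
		int expected = 0;

		if (atomic_compare_exchange_strong_explicit(
				&a->slot[i].busy, &expected, 1,
				memory_order_acquire, memory_order_relaxed))
			return &a->slot[i];
	}
	return NULL;
}

/* Release an entry, possibly from a different CPU than the one that took it. */
static void slot_put(struct rp_slot *s)
{
	atomic_store_explicit(&s->busy, 0, memory_order_release);
}
```

With this layout the scan only reads the busy flags, so updates to
user_data from another CPU do not invalidate the lines being scanned;
only the final store in slot_put() does. Whether that is cheap enough is
the open question.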

But how much is the cost? Do you have an actual probe point and workload
for the use cases you are concerned about? I think we can use perf to
measure the actual cost.

I would like to know the actual benefit of this change.
This is important because, in the future, if someone has another idea to
address your concern, how can I judge it?

Thank you,

--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>