Re: [RESEND RFC PATCH 0/3] Provide fast access to thread specific data

From: Peter Oskolkov
Date: Fri Sep 10 2021 - 12:28:17 EST


On Fri, Sep 10, 2021 at 9:13 AM Prakash Sangappa
<prakash.sangappa@xxxxxxxxxx> wrote:
>
>
>
> > On Sep 10, 2021, at 8:18 AM, Peter Oskolkov <posk@xxxxxxxxxx> wrote:
> >
> > On Wed, Sep 8, 2021 at 5:16 PM Prakash Sangappa
> > <prakash.sangappa@xxxxxxxxxx> wrote:
> >>
> >> Including liunx-kernel..
> >>
> >> Resending RFC. This patchset is not final. I am looking for feedback on
> >> this proposal to share thread specific data for us in latency sensitive
> >> codepath.
> >
> > Hi Prakash,
>
>
> >
> > I'd like to add here that Jann and I have been discussing a similar
> > feature for my UMCG patchset:
> >
> > https://lore.kernel.org/lkml/CAG48ez0mgCXpXnqAUsa0TcFBPjrid-74Gj=xG8HZqj2n+OPoKw@xxxxxxxxxxxxxx/
>
> Hi Peter,
>
> I will take a look.
>
> >
> > In short, due to the need to read/write to the userspace from
> > non-sleepable contexts in the kernel it seems that we need to have some
> > form of per task/thread kernel/userspace shared memory that is pinned,
> > similar to what your sys_task_getshared does.
>
> Exactly. For this reason wanted kernel to allocate the pinned memory.
> Didn’t want to deal with files etc as a large number threads will be using
> the shared structure mechanism.
>
> >
> > Do you think your sys_task_getshared can be tweaked to return an
> > arbitrarily-sized block of memory (subject to overall constraints)
> > rather than a fixed number of "options"?
>
> I suppose it could. How big of a size? We don’t want to hold on to
> arbitrarily large amount of pinned memory. The preference would
> be for the kernel to decide what is going to be shared based on
> what functionality/data sharing is supported. In that sense the size
> is pre defined not something the userspace/application can ask.

There could be a sysctl or some other mechanism that limits the amount
of memory pinned per mm (or per task). Having "options" hardcoded for
such a generally useful feature seems limiting...

>
> I have not looked at your use case.
>
> >
> > On a more general note, we have a kernel extension internally at
> > Google, named "kuchannel", that is similar to what you propose here:
> > per task/thread shared memory with counters and other stat fields that
> > the kernel populates and the userspace reads (and some additional
> > functionality that is not too relevant to the discussion).
>
> We have few other use cases for this we are looking at, which I can
> describe later.
>
> -Prakash
>
> >
> > Thanks,
> > Peter
> >
> > [...]
>