Re: Memory allocation on speculative fastpaths

From: Matthew Wilcox
Date: Tue May 03 2022 - 20:22:21 EST


On Wed, May 04, 2022 at 01:45:11AM +0200, Michal Hocko wrote:
> On Tue 03-05-22 16:15:46, Suren Baghdasaryan wrote:
> > On Tue, May 3, 2022 at 11:28 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> [...]
> > > 	rcu_read_lock();
> > > 	vma = vma_lookup();
> > > 	if (down_read_trylock(&vma->sem)) {
> > > 		rcu_read_unlock();
> > > 	} else {
> > > 		rcu_read_unlock();
> > > 		mmap_read_lock(mm);
> > > 		vma = vma_lookup();
> > > 		down_read(&vma->sem);
> > > 	}
> > >
> > > ... and we then execute the page table allocation under the protection of
> > > the vma->sem.
> > >
> > > At least, that's what I think we agreed to yesterday.
> >
> > Honestly, I don't remember discussing vma->sem at all.
>
> This is the rangelocking approach that is effectively per-VMA. So that
> should help with the most simplistic case where the mmap contention is
> not on the same VMAs which should be the most common case (e.g. faulting
> from several threads while there is mmap happening in the background).
>
> There are cases where this could be too coarse of course and RCU would
> be a long term plan. The above seems easy enough and still probably good
> enough for most cases so a good first step.

It also fixes the low-pri monitoring daemon problem as page faults will
not be blocked by a writer (unless the read_trylock fails).

I see three potential outcomes here from the vma rwsem approach:

 - No particular improvement on any workloads.
   Result: we try something else.
 - Minor gains (5-10%). We benchmark it and discover there's still
   significant contention on the vma_sem.
   Result: we take those wins and keep going towards a full RCU solution.
 - Major gains (20-50%).
   Result: We're done, break out the champagne.