Re: Wait for mutex to become unlocked

From: Matthew Wilcox
Date: Wed May 04 2022 - 20:38:32 EST


On Thu, May 05, 2022 at 02:22:30AM +0200, Thomas Gleixner wrote:
> > So this is Good. For the vast majority of cases, we avoid taking the
> > mmap read lock and the problem will appear much less often. But we can
> > do Better with a new API. You see, for this case, we don't actually
> > want to acquire the mmap_sem; we're happy to spin a bit, but there's no
> > point in spinning waiting for the writer to finish when we can sleep.
> > I'd like to write this code:
> >
> > again:
> > 	rcu_read_lock();
> > 	vma = vma_lookup();
> > 	if (down_read_trylock(&vma->sem)) {
> > 		rcu_read_unlock();
> > 	} else {
> > 		rcu_read_unlock();
> > 		rwsem_wait_read(&mm->mmap_sem);
> > 		goto again;
> > 	}
> >
> > That is, rwsem_wait_read() puts the thread on the rwsem's wait queue,
> > and wakes it up without giving it the lock. Now this thread will never
> > be able to block any thread that tries to acquire mmap_sem for write.
>
> Never?
>
> if (down_read_trylock(&vma->sem)) {
>
> ---> preemption by writer

Ah! This is a different semaphore. Yes, it can be preempted while
holding the VMA rwsem and block a thread which is trying to modify the
VMA, which will then block all threads from faulting _on that VMA_,
but it won't affect page faults on any other VMA. It's only Better,
not Best (the Best approach was proposed on Monday afternoon, and
the other MM developers asked us to only go as far as Better and
see if that was good enough).
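
To make the "wait without acquiring" idea concrete, here is a rough
userspace sketch using pthreads. It is only an illustration: it is not
kernel code, rwsem_wait_read() does not exist as a kernel API, and every
name below (demo_rwsem, demo_wait_read() and friends) is hypothetical.
The point it tries to show is that the waiter sleeps until the writer
is gone and then returns holding nothing, so it can never be the reason
a later writer has to wait.

```c
/* Userspace illustration only; all names here are hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct demo_rwsem {
	pthread_mutex_t lock;     /* protects the fields below */
	pthread_cond_t  released; /* broadcast when a holder leaves */
	int  readers;             /* number of active readers */
	bool writer;              /* true while a writer holds the sem */
};

#define DEMO_RWSEM_INIT \
	{ PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, 0, false }

/* Take the sem for read if no writer holds it; never waits for one. */
bool demo_down_read_trylock(struct demo_rwsem *sem)
{
	bool ok;

	pthread_mutex_lock(&sem->lock);
	ok = !sem->writer;
	if (ok)
		sem->readers++;
	pthread_mutex_unlock(&sem->lock);
	return ok;
}

void demo_up_read(struct demo_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	assert(sem->readers > 0);
	if (--sem->readers == 0)
		pthread_cond_broadcast(&sem->released);
	pthread_mutex_unlock(&sem->lock);
}

void demo_down_write(struct demo_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	while (sem->writer || sem->readers > 0)
		pthread_cond_wait(&sem->released, &sem->lock);
	sem->writer = true;
	pthread_mutex_unlock(&sem->lock);
}

void demo_up_write(struct demo_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	sem->writer = false;
	pthread_cond_broadcast(&sem->released);
	pthread_mutex_unlock(&sem->lock);
}

/* Sleep until no writer holds the sem, then return WITHOUT holding it. */
void demo_wait_read(struct demo_rwsem *sem)
{
	pthread_mutex_lock(&sem->lock);
	while (sem->writer)
		pthread_cond_wait(&sem->released, &sem->lock);
	pthread_mutex_unlock(&sem->lock);
}
```

With this, the retry loop would look roughly like "if the trylock on
the per-VMA sem fails, call demo_wait_read() on the outer sem and look
the VMA up again". Unlike the real rwsem, this sketch does no
spinning and makes no fairness guarantees.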

> The information gathered from /proc/pid/smaps is unreliable at the point
> where the lock is dropped already today. So it does not make a
> difference whether the VMAs have a 'read me if you really think it's
> useful' sideband information which gets updated when the VMA changes and
> allows to do:

Mmm. I'm not sure that we want to maintain the smaps information on
the off chance that somebody wants to query it.

> But looking at the stuff which gets recomputed and reevaluated in that
> proc/smaps code this makes a lot of sense, because most if not all of
> this information is already known at the point where the VMA is modified
> while holding mmap_sem for useful reasons, no?

I suspect the only way to know is to try to implement it, and then
benchmark it.