Re: [GIT pull] locking/urgent for v5.16-rc3

From: Peter Zijlstra
Date: Mon Nov 29 2021 - 04:04:40 EST


On Sun, Nov 28, 2021 at 09:15:10AM -0800, Linus Torvalds wrote:
> On Sun, Nov 28, 2021 at 8:35 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >
> > - down_read_trylock() is suboptimal when the lock is contended and
> > multiple readers trylock concurrently. That's due to the initial value
> > being read non-atomically which results in at least two compare exchange
> > loops. Making the initial readout atomic reduces this significantly.
> > With 40 readers by 11% in a benchmark which enforces contention on
> > mmap_sem.
>
> This was an intentional optimization to avoid unnecessary cache
> protocol cycles for when the lock isn't contended - first getting a
> cacheline for read ownership, only to then get it for write.
>
> But I guess we don't have any good benchmarks for non-contention, so ...
>
> I also hope that maybe modern hardware is smart enough to see "I will
> write to it later" and avoid the "get line for shared access only to
> get it for exclusive access immediately afterwards" issue.

Yes, I raised that same point; otoh those numbers are not showing that.
They did test the lightly contended case, but I suppose not cache-cold.
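
For reference, the two strategies under discussion can be sketched in
userspace C11 roughly as below. This is not the kernel's rwsem code, only
an illustration: struct sketch_rwsem, the READER_BIAS/WRITER_LOCKED/
FAILED_MASK values, both function names and the attempts counter are all
made up for this example. The "guess" variant assumes the lock is unlocked
and goes straight to the compare-exchange, so the uncontended case never
asks for the line in shared state first; the "load" variant reads the
count first, so a lock that already has readers does not burn a
guaranteed-to-fail compare-exchange.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical count layout, chosen for this sketch only. */
#define READER_BIAS   (1L << 8)  /* added once per reader */
#define WRITER_LOCKED (1L << 0)  /* set while a writer holds the lock */
#define FAILED_MASK   (WRITER_LOCKED)

struct sketch_rwsem {
	atomic_long count;
};

/*
 * Variant A: guess that the lock is unlocked (count == 0) and go straight
 * to the compare-exchange. If the guess is wrong, the failed cmpxchg
 * hands back the real value in tmp and the loop retries with it.
 * *attempts counts compare-exchange operations, for illustration.
 */
static bool trylock_read_guess(struct sketch_rwsem *sem, int *attempts)
{
	long tmp = 0;

	do {
		(*attempts)++;
		if (atomic_compare_exchange_strong_explicit(&sem->count, &tmp,
				tmp + READER_BIAS,
				memory_order_acquire, memory_order_relaxed))
			return true;
	} while (!(tmp & FAILED_MASK));

	return false;
}

/*
 * Variant B: read the current count first, then compare-exchange against
 * what was actually observed. No compare-exchange is issued at all when
 * a writer already holds the lock.
 */
static bool trylock_read_load(struct sketch_rwsem *sem, int *attempts)
{
	long tmp = atomic_load_explicit(&sem->count, memory_order_relaxed);

	while (!(tmp & FAILED_MASK)) {
		(*attempts)++;
		if (atomic_compare_exchange_strong_explicit(&sem->count, &tmp,
				tmp + READER_BIAS,
				memory_order_acquire, memory_order_relaxed))
			return true;
	}

	return false;
}
```

With one reader already holding the lock, variant A needs two
compare-exchanges to get in while variant B needs one; with the lock
free, both need one, but variant B has touched the line with a plain
load first. The sketch only counts operations and says nothing about
the cache-cold cost, which is the part the benchmark numbers above do
not cover.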