Re: [GIT pull] locking/urgent for v5.16-rc3

From: Linus Torvalds
Date: Sun Nov 28 2021 - 12:21:00 EST


On Sun, Nov 28, 2021 at 8:35 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> - down_read_trylock() is suboptimal when the lock is contended and
> multiple readers trylock concurrently. That's due to the initial value
> being read non-atomically which results in at least two compare exchange
> loops. Making the initial readout atomic reduces this significantly.
> With 40 readers by 11% in a benchmark which enforces contention on
> mmap_sem.

This was an intentional optimization to avoid unnecessary cache
protocol cycles for when the lock isn't contended - first getting a
cacheline for read ownership, only to then get it for write.
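
To make the trade-off concrete, here is a rough userspace sketch of the
two trylock shapes, written with C11 atomics rather than the kernel's
atomic_long_*() helpers. It is not the rwsem code: the names
(trylock_read_guess, trylock_read_load) and the READER_BIAS and
WRITER_LOCKED constants are made up for illustration, and the real
count word has more state bits than this.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical layout of the count word, for illustration only. */
#define READER_BIAS   0x100L  /* added once per reader */
#define WRITER_LOCKED 0x001L  /* set while a writer holds the lock */

/*
 * Variant A: guess that the lock is free (count == 0) and go straight
 * to the compare-exchange. When the guess is right, the cacheline is
 * touched once, for exclusive ownership. When it is wrong, the failed
 * compare-exchange hands back the real value and we loop again.
 */
static bool trylock_read_guess(atomic_long *count)
{
	long old = 0;	/* assumed value, not loaded from memory */

	do {
		if (atomic_compare_exchange_strong_explicit(
				count, &old, old + READER_BIAS,
				memory_order_acquire, memory_order_relaxed))
			return true;
		/* on failure, 'old' now holds the current value */
	} while (!(old & WRITER_LOCKED));

	return false;
}

/*
 * Variant B: load the current value first, then compare-exchange
 * against it. With readers already present the first attempt has a
 * realistic chance of succeeding, at the cost of a plain read of the
 * cacheline before the write.
 */
static bool trylock_read_load(atomic_long *count)
{
	long old = atomic_load_explicit(count, memory_order_relaxed);

	while (!(old & WRITER_LOCKED)) {
		if (atomic_compare_exchange_strong_explicit(
				count, &old, old + READER_BIAS,
				memory_order_acquire, memory_order_relaxed))
			return true;
	}

	return false;
}
```

With other readers already holding the lock, variant A always burns
one failed compare-exchange before it sees the real value, while
variant B can succeed on its first attempt. With the lock free,
variant A skips the initial shared-mode read entirely.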

But I guess we don't have any good benchmarks for non-contention, so ...

I also hope that maybe modern hardware is smart enough to see "I will
write to it later" and avoid the "get line for shared access only to
get it for exclusive access immediately afterwards" issue.

Linus