Re: [PATCH RFC 1/3] mutex: Make more scalable by doing less atomic operations

From: Linus Torvalds
Date: Wed Apr 10 2013 - 11:46:32 EST


On Wed, Apr 10, 2013 at 7:09 AM, Robin Holt <holt@xxxxxxx> wrote:
> On Mon, Apr 08, 2013 at 07:38:39AM -0700, Linus Torvalds wrote:
>>
>> I forget where we saw the case where we should *not* read the initial
>> value, though. Anybody remember?
>
> I think you might be remembering ia64. Fairly early on, I recall there
> being a change in the spinlocks where we did not check them before just
> trying to acquire.

No, I think I found the one I was thinking of. It was the x86-32
version of atomic64_xchg() and atomic64_add_return(). We used to
actually read the old value in order to make the cmpxchg succeed on
the first try most of the time, but when it was cold in the cache that
actually hurt us. We were better off just picking a random value as
our first one, even if it resulted in the loop triggering, because for
the cold-cache case that avoids the unnecessary "bring it in shared
for the read, just to write to it later".
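
To make the two loop shapes concrete, here is a minimal sketch in
portable C11 atomics. It is not the kernel's x86-32 code (that lives in
assembly around cmpxchg8b); the function names and the choice of 0 as
the initial guess are assumptions made purely for illustration.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Variant A: read the current value first, so the first compare-exchange
 * usually succeeds. On a cold line the plain load may pull the line in
 * shared state, and the write then has to upgrade it to exclusive.
 */
static int64_t xchg_read_first(_Atomic int64_t *v, int64_t new_val)
{
	int64_t old = atomic_load_explicit(v, memory_order_relaxed);

	/* On failure, 'old' is refreshed with the value actually seen. */
	while (!atomic_compare_exchange_weak(v, &old, new_val))
		;
	return old;
}

/*
 * Variant B: skip the initial read and start from an arbitrary guess
 * (0 here, an assumption). The first attempt will often fail, but the
 * first access to the line is already a read-for-ownership.
 */
static int64_t xchg_guess_first(_Atomic int64_t *v, int64_t new_val)
{
	int64_t old = 0; /* arbitrary guess, corrected by the failed CAS */

	while (!atomic_compare_exchange_weak(v, &old, new_val))
		;
	return old;
}

/* Same idea for add_return: the failed CAS supplies the real old value. */
static int64_t add_return_guess_first(_Atomic int64_t *v, int64_t delta)
{
	int64_t old = 0; /* arbitrary guess */

	while (!atomic_compare_exchange_weak(v, &old, old + delta))
		;
	return old + delta;
}
```

Both variants return the same results; the only difference is whether
the first touch of the cache line is a plain load or the
compare-exchange itself. Which one wins depends on how often the line
is cold, so this is something to measure rather than assume.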

Commits 3a8d1788b37435baf6c296f4ea8beb4fa4955f44 and in particular
824975ef190e7dcb77718d1cc2cb53769b16d918.

Of course, a good OoO CPU would see the write happening later to the
same address, and bring things in for exclusive access immediately.
And it's possible that newer CPUs do that, but we did see this as
being an issue.

Linus