Re: [PATCH] rwsem-spinlock: let rwsem write lock stealable

From: Michel Lespinasse
Date: Thu Jan 31 2013 - 06:58:14 EST


On Wed, Jan 30, 2013 at 1:14 AM, Yuanhan Liu
<yuanhan.liu@xxxxxxxxxxxxxxx> wrote:
> We (the Linux Kernel Performance project) found a regression introduced by
> commit 5a50508, which just converts all mutex locks to rwsem write locks.
> The semantics are the same, but the performance difference is quite large
> in some cases. After investigation, we found the root cause: mutex supports
> lock stealing. Here is the link to the detailed regression report:
> https://lkml.org/lkml/2013/1/29/84
>
> Ingo suggested adding write lock stealing to rwsem as well:
> "I think we should allow lock-steal between rwsem writers - that
> will not hurt fairness as most rwsem fairness concerns relate to
> reader vs. writer fairness"
>
> I tried it with rwsem-spinlock first, as I found it much easier to
> implement there than in lib/rwsem.c. I am sending out this patch first
> for comments. I will try lib/rwsem.c later, once the change to
> rwsem-spinlock is OK with you guys.

I noticed that you haven't modified __down_write_trylock(). For
consistency with __down_write(), you should replace

	if (sem->activity == 0 && list_empty(&sem->wait_list)) {

with

	if (sem->activity == 0) {
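
To illustrate the difference between the two checks, here is a minimal
userspace sketch. It is only a model, not the kernel code: struct
model_rwsem, its nr_waiters counter (a stand-in for sem->wait_list), and
both function names are hypothetical, and the sem->wait_lock spinlock that
protects these fields in the kernel is left out.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the rwsem-spinlock state. */
struct model_rwsem {
	int activity;   /* >0: active readers, -1: one active writer, 0: free */
	int nr_waiters; /* stand-in for the kernel's sem->wait_list */
};

/* Old check: a writer backs off whenever anyone is queued. */
static bool model_write_trylock_old(struct model_rwsem *sem)
{
	if (sem->activity == 0 && sem->nr_waiters == 0) {
		sem->activity = -1;
		return true;
	}
	return false;
}

/* Suggested check: a writer may take a free lock even if others are queued. */
static bool model_write_trylock_steal(struct model_rwsem *sem)
{
	if (sem->activity == 0) {
		sem->activity = -1;
		return true;
	}
	return false;
}
```

With a free lock and one queued waiter, the old check fails while the
relaxed check succeeds, which should match what a stealing __down_write()
would do in the same state.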

Other than that, I like the idea. I was originally uncomfortable with
doing lock stealing for the rwsem, but I think doing it for writers
only as you propose should be fine. Readers wait for any queued
writers, and in exchange they are guaranteed to get the lock once
they've blocked. You *still* want to check for regressions that this
change might cause - not with anon_vma as this was a mutex not long
ago, but possibly with mmap_sem - but I'm crossing my fingers and
thinking that it'll most likely turn out fine.
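
The asymmetry described above can be sketched as follows. This is a
simplified userspace model under stated assumptions, not the actual
rwsem-spinlock implementation: struct policy_rwsem, nr_waiters (standing in
for the wait list), and both function names are hypothetical, and locking
and the slow paths are omitted.

```c
#include <stdbool.h>

/* Hypothetical model: activity >0 readers, -1 writer, 0 free. */
struct policy_rwsem {
	int activity;
	int nr_waiters; /* stand-in for the wait list length */
};

/* Reader fast path: never bypasses the queue, so queued writers go first. */
static bool policy_read_fastpath(struct policy_rwsem *sem)
{
	if (sem->activity >= 0 && sem->nr_waiters == 0) {
		sem->activity++;
		return true;
	}
	return false; /* the reader would block and join the queue here */
}

/* Writer fast path with stealing: ignores the queue when the lock is free. */
static bool policy_write_fastpath(struct policy_rwsem *sem)
{
	if (sem->activity == 0) {
		sem->activity = -1;
		return true;
	}
	return false; /* the writer would block and join the queue here */
}
```

In this model only writers can jump ahead of waiters, so reader vs. writer
ordering is unchanged and the stealing is limited to writers competing with
each other.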

I may be able to help with the non-spinlock version of this as I still
remember how this works.

--
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.