Re: [PATCH v7 1/4] qrwlock: A queue read/write lock implementation

From: Waiman Long
Date: Fri Nov 22 2013 - 15:35:27 EST

In reply to: Linus Torvalds: "Re: [PATCH v7 1/4] qrwlock: A queue read/write lock implementation"
Next in thread: Linus Torvalds: "Re: [PATCH v7 1/4] qrwlock: A queue read/write lock implementation"

On 11/22/2013 02:14 PM, Linus Torvalds wrote:

On Fri, Nov 22, 2013 at 11:04 AM, Waiman Long <Waiman.Long@xxxxxx> wrote:
In terms of single-thread performance (no contention), a 256K
lock/unlock loop was run on 2.4GHz and 2.93GHz Westmere x86-64
CPUs. The following table shows the average time (in ns) for a single
lock/unlock sequence (including the looping and timing overhead):

Lock Type           2.4GHz  2.93GHz
---------           ------  -------
Ticket spinlock      14.9    12.3
Read lock            17.0    13.5
Write lock           17.0    13.5
Queue read lock      16.0    13.4
Queue write lock      9.2     7.8
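
For readers who want to reproduce this kind of measurement, here is a minimal user-space sketch of an uncontended lock/unlock timing loop. It is not the benchmark that produced the table: the test-and-set lock (demo_lock_acquire, demo_lock_release) and the helper avg_ns_per_pair are hypothetical stand-ins, and user-space numbers will not match kernel measurements.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime() under strict ISO modes */
#include <stdatomic.h>
#include <time.h>

/* Hypothetical stand-in lock: a trivial test-and-set spinlock,
 * not the kernel's ticket spinlock, rwlock or qrwlock. */
static atomic_flag demo_lock = ATOMIC_FLAG_INIT;

static void demo_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&demo_lock, memory_order_acquire))
        ; /* spin; with a single thread the lock is always free here */
}

static void demo_lock_release(void)
{
    atomic_flag_clear_explicit(&demo_lock, memory_order_release);
}

/* Average time in ns for one lock/unlock pair over `loops` iterations.
 * The result includes the looping and timing overhead. */
static double avg_ns_per_pair(long loops)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < loops; i++) {
        demo_lock_acquire();
        demo_lock_release();
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)loops;
}
```

Calling avg_ns_per_pair(256 * 1024) from a main() and printing the result gives one average figure of the same kind as a table cell, assuming "256K" means 256 * 1024 iterations.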

Can you verify for me that you re-did those numbers? Because it used
to be that the fair queue write lock was slower than the numbers you
now quote..

Was the cost of the fair queue write lock purely in the extra
conditional testing for whether the lock was supposed to be fair or
not, and now that you dropped that, it's fast? If so, then that's an
extra argument for the old conditional fair/unfair being complete
garbage.

Yes, the extra latency of the fair lock in the earlier patch was due to the need to do a second cmpxchg(). That can be avoided by doing a read first, but that is not good for the cache. So I optimized it for the default unfair lock. By supporting only one version, there is no need to do a second cmpxchg() anymore.
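
To illustrate the trade-off between a direct cmpxchg and a read followed by a cmpxchg, here is a minimal user-space sketch using C11 atomics. It is not the code from the patch: the lock word layout, the DEMO_WRITER value and the demo_* function names are hypothetical. The comment about cache behaviour reflects one common reading of "not good for the cache" (the plain read may fetch the cache line in shared state before the cmpxchg needs it exclusive), which is an assumption here.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_WRITER 0xffu  /* hypothetical "writer holds the lock" value */

/* Variant A: go straight to the cmpxchg (0 -> DEMO_WRITER).
 * One atomic operation, one request for exclusive ownership of the line. */
static bool demo_write_trylock(atomic_uint *lock)
{
    unsigned int expected = 0;

    return atomic_compare_exchange_strong_explicit(
        lock, &expected, DEMO_WRITER,
        memory_order_acquire, memory_order_relaxed);
}

/* Variant B: read first and only attempt the cmpxchg if the lock looks free.
 * This skips a doomed cmpxchg, but the read may first pull the cache line
 * in shared state, so a successful acquire can cost two cache transactions. */
static bool demo_write_trylock_read_first(atomic_uint *lock)
{
    if (atomic_load_explicit(lock, memory_order_relaxed) != 0)
        return false;
    return demo_write_trylock(lock);
}

static void demo_write_unlock(atomic_uint *lock)
{
    atomic_store_explicit(lock, 0, memory_order_release);
}
```

Both variants give the same result in the uncontended case; they differ only in how many memory operations touch the lock word on the way to acquiring it.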

Alternatively, maybe you just took the old timings, and the above
numbers are for the old unfair code, and *not* for the actual patch
you sent out?

So please double-check and verify.

Linus

I reran the timing test on the 2.93GHz processor. The timing is practically the same. I reused the old numbers for the 2.4GHz processor.

Regards,
Longman