Re: [RFC] Cancellable MCS spinlock rework

From: Jason Low
Date: Thu Jul 03 2014 - 17:54:57 EST


On Thu, 2014-07-03 at 17:35 -0400, Waiman Long wrote:
> On 07/03/2014 04:51 PM, Jason Low wrote:
> > On Thu, 2014-07-03 at 16:35 -0400, Waiman Long wrote:
> >> On 07/03/2014 02:34 PM, Jason Low wrote:
> >>> On Thu, 2014-07-03 at 10:09 -0700, Davidlohr Bueso wrote:
> >>>> On Thu, 2014-07-03 at 09:31 +0200, Peter Zijlstra wrote:
> >>>>> On Wed, Jul 02, 2014 at 10:30:03AM -0700, Jason Low wrote:
> >>>>>> Would potentially reducing the size of the rw semaphore structure by 32
> >>>>>> bits (for all architectures using optimistic spinning) be a nice
> >>>>>> benefit?
> >>>>> Possibly, although I had a look at the mutex structure and we didn't
> >>>>> have a hole to place it in, unlike what you found with the rwsem.
> >>>> Yeah, and currently struct rw_semaphore is the largest lock we have in
> >>>> the kernel. Shaving off space is definitely welcome.
> >>> Right, especially if it could help things like xfs inode.
> >>>
> >> I do see a point in reducing the size of the rwsem structure. However, I
> >> don't quite understand the point of converting pointers in the
> >> optimistic_spin_queue structure to atomic_t.
> > Converting the pointers in the optimistic_spin_queue to atomic_t would
> > mean we're fully operating on atomic operations instead of using the
> > potentially racy cmpxchg + ACCESS_ONCE stores on the pointers.
>
> Yes, the ACCESS_ONCE macro for data stores does have problems on some
> architectures. However, I would prefer a more holistic solution to this
> problem rather than a workaround that changes the pointers to
> atomic_t's. Even if we make the change, we still cannot be sure that it
> will work for those architectures, as we have no machines to verify it.
> Why not let the champions of those architectures propose changes
> instead of making untested changes now and penalizing commonly used
> architectures like x86?

I initially thought that converting to atomic_t would not reduce
performance on other architectures. However, you do have a point in your
first post: converting the encoded CPU number back to a pointer may add
a little overhead (in the contended cases).

If converting pointers to atomic_t in the optimistic_spin_queue
structure does affect performance for commonly used architectures, then
I agree that we should avoid that and only convert what's stored in
mutex/rwsem.
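
To make the trade-off concrete, here is a rough userspace sketch of the
"only convert what's stored in mutex/rwsem" option. It is not the actual
patch: spin_node, spin_queue, nodes[], NR_CPUS, OSQ_UNLOCKED_VAL,
encode_cpu(), decode_cpu() and queue_tail_swap() are illustrative names,
and a static array stands in for per-CPU data. The lock embeds a 32-bit
encoded CPU number, the queue nodes keep plain pointers, and the decode
step is the extra work paid when a spinner queues up.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 4           /* hypothetical CPU count for this sketch */
#define OSQ_UNLOCKED_VAL 0  /* encoded value meaning "no spinner queued" */

/* Queue node, one per CPU: the links stay plain pointers. */
struct spin_node {
	struct spin_node *next, *prev;
	int locked;
	int cpu;            /* encoded CPU number (cpu_nr + 1) */
};

/* What the mutex/rwsem would embed: an int instead of a pointer. */
struct spin_queue {
	atomic_int tail;    /* encoded CPU number of the last spinner */
};

static struct spin_node nodes[NR_CPUS]; /* stand-in for per-CPU data */

/* CPU numbers start at 0, so shift by one to keep 0 as "unlocked". */
static int encode_cpu(int cpu_nr)
{
	return cpu_nr + 1;
}

/* The extra step: turn the encoded value back into a node pointer. */
static struct spin_node *decode_cpu(int encoded)
{
	if (encoded == OSQ_UNLOCKED_VAL)
		return NULL;
	return &nodes[encoded - 1];
}

/*
 * Publish this CPU as the new tail with one atomic exchange and return
 * the previous tail's node, or NULL if the queue was empty.
 */
static struct spin_node *queue_tail_swap(struct spin_queue *q, int cpu_nr)
{
	struct spin_node *node = &nodes[cpu_nr];
	int old;

	node->cpu = encode_cpu(cpu_nr);
	node->next = NULL;
	node->locked = 0;

	old = atomic_exchange(&q->tail, node->cpu);
	node->prev = decode_cpu(old);   /* decode only on the queued path */
	return node->prev;
}
```

In this sketch the decode happens once per tail exchange, so an
uncontended lock/unlock never pays for it, and the next/prev links
between nodes need no decoding at all.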

