Re: [PATCH v2] rwsem: Support optimistic spinning

From: Peter Zijlstra
Date: Wed Apr 30 2014 - 06:01:16 EST


On Mon, Apr 28, 2014 at 03:09:01PM -0700, Davidlohr Bueso wrote:
> __visible
> struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
> {
> - long count, adjustment = -RWSEM_ACTIVE_WRITE_BIAS;
> + long count;
> struct rwsem_waiter waiter;
> struct task_struct *tsk = current;
> + bool waiting = true;
> +
> + /* undo write bias from down_write operation, stop active locking */
> + count = rwsem_atomic_update(-RWSEM_ACTIVE_WRITE_BIAS, sem);
> +
> + /* do optimistic spinning and steal lock if possible */
> + if (rwsem_optimistic_spin(sem))
> + goto done;

Why goto done, why not just return? AFAICT the task state has not been
changed yet at this point.

>
> /* set up my own style of waitqueue */
> waiter.task = tsk;
> @@ -204,34 +382,29 @@ struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
>
> raw_spin_lock_irq(&sem->wait_lock);
> if (list_empty(&sem->wait_list))
> - adjustment += RWSEM_WAITING_BIAS;
> + waiting = false;
> list_add_tail(&waiter.list, &sem->wait_list);
>
> /* we're now waiting on the lock, but no longer actively locking */
> - count = rwsem_atomic_update(adjustment, sem);
> + if (waiting)
> + count = ACCESS_ONCE(sem->count);
> + else
> + count = rwsem_atomic_update(RWSEM_WAITING_BIAS, sem);
> +

Is there a reason we must delay the count update? Why not do away with
the waiting variable and do the update where we check list_empty()?

If there is a reason, e.g. we must order the list op vs the count op,
then there's a comment missing.
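
To make the suggestion concrete, here is a minimal userspace sketch of
doing the count update in the same branch as the emptiness check. It is
only a model: model_sem, model_atomic_update() and model_queue_writer()
are hypothetical names, nr_waiters stands in for sem->wait_list, the bias
value merely resembles the 32-bit rwsem layout, and the real code would
run all of this under sem->wait_lock. Note that it performs the count
update before the waiter is queued, which is exactly the ordering in
question.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Illustrative value only, chosen to resemble the 32-bit rwsem layout. */
#define MODEL_WAITING_BIAS (-0x10000L)

/* Hypothetical stand-in for struct rw_semaphore. */
struct model_sem {
	atomic_long count;
	int nr_waiters;		/* models sem->wait_list */
};

/* Like rwsem_atomic_update(): add delta and return the new count. */
static long model_atomic_update(long delta, struct model_sem *sem)
{
	return atomic_fetch_add(&sem->count, delta) + delta;
}

/*
 * Queue a writer. The first waiter adds the waiting bias right where
 * the empty list is detected; later waiters only sample the count and
 * decide whether queued readers should be woken. No separate "waiting"
 * flag is carried past this branch.
 */
static long model_queue_writer(struct model_sem *sem, bool *wake_readers)
{
	long count;

	if (sem->nr_waiters == 0) {
		count = model_atomic_update(MODEL_WAITING_BIAS, sem);
		*wake_readers = false;
	} else {
		count = atomic_load(&sem->count);
		*wake_readers = count > MODEL_WAITING_BIAS;
	}
	sem->nr_waiters++;	/* models list_add_tail() */
	return count;
}
```

Whether moving the update ahead of the list add is safe in the real code
is the open question; the sketch only shows that the two decisions can
share one branch.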

> - /* If there were already threads queued before us and there are no
> + /*
> + * If there were already threads queued before us and there are no
> * active writers, the lock must be read owned; so we try to wake
> - * any read locks that were queued ahead of us. */
> - if (count > RWSEM_WAITING_BIAS &&
> - adjustment == -RWSEM_ACTIVE_WRITE_BIAS)
> + * any read locks that were queued ahead of us.
> + */
> + if ((count > RWSEM_WAITING_BIAS) && waiting)
> sem = __rwsem_do_wake(sem, RWSEM_WAKE_READERS);
>
> /* wait until we successfully acquire the lock */
> set_task_state(tsk, TASK_UNINTERRUPTIBLE);

We should really use set_current_state() here; there is no way tsk is
anything other than current, and using set_task_state() implies we're
changing someone else's state.

> while (true) {
> - if (!(count & RWSEM_ACTIVE_MASK)) {
> - /* Try acquiring the write lock. */
> - count = RWSEM_ACTIVE_WRITE_BIAS;
> - if (!list_is_singular(&sem->wait_list))
> - count += RWSEM_WAITING_BIAS;
> -
> - if (sem->count == RWSEM_WAITING_BIAS &&
> - cmpxchg(&sem->count, RWSEM_WAITING_BIAS, count) ==
> - RWSEM_WAITING_BIAS)
> - break;
> - }
> -
> + if (rwsem_try_write_lock(count, sem))
> + break;
> raw_spin_unlock_irq(&sem->wait_lock);
>
> /* Block until there are no active lockers. */
> @@ -245,8 +418,8 @@ struct rw_semaphore __sched *rwsem_down_write_failed(struct rw_semaphore *sem)
>
> list_del(&waiter.list);
> raw_spin_unlock_irq(&sem->wait_lock);
> +done:
> tsk->state = TASK_RUNNING;

This should be __set_current_state(TASK_RUNNING);

Also, I would really expect this to be done right after the wait loop,
not outside of the lock.
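
Putting the state-handling comments together, a rough userspace model of
the control flow I have in mind is below. Everything in it is a
hypothetical stand-in: model_current mimics current, the two macros
approximate set_current_state() (store with a barrier) and
__set_current_state() (plain store) using C11 atomics, and the loop body
only marks where the unlock / schedule() / relock sequence would sit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum { MODEL_TASK_RUNNING = 0, MODEL_TASK_UNINTERRUPTIBLE = 2 };

/* Hypothetical stand-in for the running task (single-threaded model). */
struct model_task {
	atomic_long state;
};

static struct model_task model_current_task;
#define model_current (&model_current_task)

/* Approximates set_current_state(): the store acts as a full barrier. */
#define model_set_current_state(v) \
	atomic_store(&model_current->state, (v))

/* Approximates __set_current_state(): plain store, no ordering. */
#define model_set_current_state_relaxed(v) \
	atomic_store_explicit(&model_current->state, (v), memory_order_relaxed)

/* Returns how many times the model "slept" before taking the lock. */
static int model_down_write_failed(bool spin_succeeds, int sleeps)
{
	int iterations = 0;

	/* Spinning took the lock: state was never touched, so just return. */
	if (spin_succeeds)
		return 0;

	/* wait_lock would be taken here */
	model_set_current_state(MODEL_TASK_UNINTERRUPTIBLE);
	while (sleeps-- > 0) {
		/* real code: unlock, schedule(), relock */
		iterations++;
		model_set_current_state(MODEL_TASK_UNINTERRUPTIBLE);
	}
	/* Right after the wait loop, still under the lock. */
	model_set_current_state_relaxed(MODEL_TASK_RUNNING);
	/* list_del() and the unlock would follow here */
	return iterations;
}
```

With this shape the done label is not needed, and the state is back to
running before the lock is dropped.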

> -
> return sem;
> }

Otherwise this looks ok I suppose.