Re: Performance regression from switching lock to rw-sem for anon-vma tree

From: Ingo Molnar
Date: Fri Jun 28 2013 - 05:38:18 EST



* Tim Chen <tim.c.chen@xxxxxxxxxxxxxxx> wrote:

> I tried some tweaking that checks sem->count for read owned lock. Even
> though it reduces the percentage of acquisitions that need sleeping by
> 8.14% (from 18.6% to 10.46%), it increases the writer acquisition
> blocked count by 11%. This change still doesn't boost throughput and has
> a tiny regression for the workload.
>
>                                                 Opt Spin    Opt Spin
>                                                             (with tweak)
> Writer acquisition blocked count                7359040     8168006
> Blocked by reader                               0.55%       0.52%
> Lock acquired first attempt (lock stealing)     16.92%      19.70%
> Lock acquired second attempt (1 sleep)          17.60%      9.32%
> Lock acquired after more than 1 sleep           1.00%       1.14%
> Lock acquired with optimistic spin              64.48%      69.84%
> Optimistic spin abort 1                         11.77%      1.14%
> Optimistic spin abort 2                         6.81%       9.22%
> Optimistic spin abort 3                         0.02%       0.10%

So lock stealing+spinning now acquires the lock successfully ~90% of the
time. The remaining sleeps are:

> Lock acquired second attempt (1 sleep) ...... 9.32%

And these sleeps are mostly due to:

> Optimistic spin abort 2 ..... 9.22%

Right?
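
For reference, the arithmetic behind the ~90% figure can be checked with a
small userspace sketch. Nothing here is kernel code: the struct and helper
names are made up for illustration, and the values are simply the
percentages from the table above, stored in hundredths of a percent to avoid
floating point. In both columns the three abort percentages happen to sum to
the same total as the two sleep percentages, which is consistent with (but
does not by itself prove) each sleeping acquisition following an aborted
spin.

```c
/* Hypothetical container for one column of the table above.
 * All values are in hundredths of a percent (19.70% -> 1970). */
struct rwsem_pct {
	int stolen;		/* Lock acquired first attempt (lock stealing) */
	int one_sleep;		/* Lock acquired second attempt (1 sleep) */
	int multi_sleep;	/* Lock acquired after more than 1 sleep */
	int spin_ok;		/* Lock acquired with optimistic spin */
	int abort1;		/* Optimistic spin abort 1 */
	int abort2;		/* Optimistic spin abort 2 */
	int abort3;		/* Optimistic spin abort 3 */
};

/* Figures copied from the quoted table. */
static const struct rwsem_pct opt_spin = {
	.stolen = 1692, .one_sleep = 1760, .multi_sleep = 100,
	.spin_ok = 6448, .abort1 = 1177, .abort2 = 681, .abort3 = 2,
};

static const struct rwsem_pct opt_spin_tweak = {
	.stolen = 1970, .one_sleep = 932, .multi_sleep = 114,
	.spin_ok = 6984, .abort1 = 114, .abort2 = 922, .abort3 = 10,
};

/* Acquisitions that never slept: lock stealing plus successful spinning. */
static inline int acquired_without_sleep(const struct rwsem_pct *s)
{
	return s->stolen + s->spin_ok;
}

/* Acquisitions that needed at least one sleep. */
static inline int acquired_after_sleep(const struct rwsem_pct *s)
{
	return s->one_sleep + s->multi_sleep;
}

/* Sum of the three optimistic spin abort points. */
static inline int spin_aborts(const struct rwsem_pct *s)
{
	return s->abort1 + s->abort2 + s->abort3;
}
```

With the tweak this gives 89.54% acquired without sleeping and 10.46%
sleeping, of which abort point 2 alone accounts for 9.22%.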

So this particular #2 abort point is:

|	preempt_disable();
|	for (;;) {
|		owner = ACCESS_ONCE(sem->owner);
|		if (owner && !rwsem_spin_on_owner(sem, owner))
|			break;   <--------------------------- abort (2)

The next step would be to investigate why we decide not to spin there: why
does rwsem_spin_on_owner() fail?

If I got all the patches right, rwsem_spin_on_owner() is this:

+static noinline
+int rwsem_spin_on_owner(struct rw_semaphore *lock, struct task_struct *owner)
+{
+	rcu_read_lock();
+	while (owner_running(lock, owner)) {
+		if (need_resched())
+			break;
+
+		arch_mutex_cpu_relax();
+	}
+	rcu_read_unlock();
+
+	/*
+	 * We break out the loop above on need_resched() and when the
+	 * owner changed, which is a sign for heavy contention. Return
+	 * success only when lock->owner is NULL.
+	 */
+	return lock->owner == NULL;
+}

where owner_running() is similar to the mutex spinning code: in the end it
checks owner->on_cpu, just like the mutex code does.

If my analysis is correct so far then it might be useful to add two more
stats: did rwsem_spin_on_owner() fail because lock->owner == NULL [owner
released the rwsem], or because owner_running() failed [owner went to
sleep]?
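
A rough userspace sketch of how such a breakdown could be bucketed is below.
It is not a patch against the rwsem code: the two model structs, the enum
and the counter array are all hypothetical stand-ins, and it only models the
fields the spin loop looks at (the owner pointer and on_cpu). It also splits
out two further cases that the quoted loop can exit on, the owner changing
to another task and need_resched(). In the kernel the owner would have to be
inspected while still under rcu_read_lock(), which this model ignores.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct: only on_cpu is modelled. */
struct task_model {
	bool on_cpu;
};

/* Hypothetical stand-in for rw_semaphore: only the owner is modelled. */
struct rwsem_model {
	struct task_model *owner;
};

/* Hypothetical stat buckets; not part of any posted patch. */
enum spin_exit {
	SPIN_EXIT_OWNER_RELEASED,	/* lock->owner == NULL */
	SPIN_EXIT_OWNER_CHANGED,	/* a different task owns the lock now */
	SPIN_EXIT_OWNER_SLEEPING,	/* same owner, but it left the CPU */
	SPIN_EXIT_NEED_RESCHED,		/* owner still running, spinner must yield */
	SPIN_EXIT_MAX
};

static unsigned long spin_exit_count[SPIN_EXIT_MAX];

/*
 * Classify the state seen right after the spin loop ends.
 * 'spun_on' is the owner that was passed in when spinning started.
 */
static inline enum spin_exit
classify_spin_exit(const struct rwsem_model *lock,
		   const struct task_model *spun_on)
{
	if (lock->owner == NULL)
		return SPIN_EXIT_OWNER_RELEASED;
	if (lock->owner != spun_on)
		return SPIN_EXIT_OWNER_CHANGED;
	if (!spun_on->on_cpu)
		return SPIN_EXIT_OWNER_SLEEPING;
	/* Same owner, still on a CPU: only need_resched() ends the loop. */
	return SPIN_EXIT_NEED_RESCHED;
}

/* Classify and bump the matching counter. */
static inline enum spin_exit
record_spin_exit(const struct rwsem_model *lock,
		 const struct task_model *spun_on)
{
	enum spin_exit why = classify_spin_exit(lock, spun_on);

	spin_exit_count[why]++;
	return why;
}
```

Only the first bucket corresponds to rwsem_spin_on_owner() returning
success; the other three would all show up as abort (2) in the current
stats.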

Thanks,

Ingo