Re: [PATCH v6 04/46] percpu_rwlock: Implement the core design of Per-CPU Reader-Writer Locks

From: Srivatsa S. Bhat
Date: Mon Feb 25 2013 - 14:28:43 EST


Hi Lai,

On 02/25/2013 09:23 PM, Lai Jiangshan wrote:
> Hi, Srivatsa,
>
> The target of the whole patchset is nice for me.

Cool! Thanks :-)

> A question: How did you find out the such usages of
> "preempt_disable()" and convert them? did all are converted?
>

Well, I scanned through the source tree for usages which implicitly
disabled CPU offline and converted them over. It's not limited to uses
of preempt_disable() alone - even spin_locks, rwlocks, local_irq_disable()
etc. also help disable CPU offline. So I tried to dig out all such uses
and converted them. However, since the merge window is open, a lot of
new code is flowing into the tree. So I'll have to rescan the tree to
see if there are any more places to convert.

> And I think the lock is too complex and reinvent the wheel, why don't
> you reuse the lglock?

lglocks? No way! ;-) See below...

> I wrote an untested draft here.
>
> Thanks,
> Lai
>
> PS: Some HA tools(I'm writing one) which takes checkpoints of
> virtual-machines frequently, I guess this patchset can speedup the
> tools.
>
> From 01db542693a1b7fc6f9ece45d57cb529d9be5b66 Mon Sep 17 00:00:00 2001
> From: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> Date: Mon, 25 Feb 2013 23:14:27 +0800
> Subject: [PATCH] lglock: add read-preference local-global rwlock
>
> locality via lglock(trylock)
> read-preference read-write-lock via fallback rwlock_t
>
> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> ---
> include/linux/lglock.h | 31 +++++++++++++++++++++++++++++++
> kernel/lglock.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 76 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/lglock.h b/include/linux/lglock.h
> index 0d24e93..30fe887 100644
> --- a/include/linux/lglock.h
> +++ b/include/linux/lglock.h
> @@ -67,4 +67,35 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu);
> void lg_global_lock(struct lglock *lg);
> void lg_global_unlock(struct lglock *lg);
>
> +struct lgrwlock {
> + unsigned long __percpu *fallback_reader_refcnt;
> + struct lglock lglock;
> + rwlock_t fallback_rwlock;
> +};
> +
> +#define DEFINE_LGRWLOCK(name) \
> + static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
> + = __ARCH_SPIN_LOCK_UNLOCKED; \
> + static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
> + struct lgrwlock name = { \
> + .fallback_reader_refcnt = &name ## _refcnt, \
> + .lglock = { .lock = &name ## _lock } }
> +
> +#define DEFINE_STATIC_LGRWLOCK(name) \
> + static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \
> + = __ARCH_SPIN_LOCK_UNLOCKED; \
> + static DEFINE_PER_CPU(unsigned long, name ## _refcnt); \
> + static struct lgrwlock name = { \
> + .fallback_reader_refcnt = &name ## _refcnt, \
> + .lglock = { .lock = &name ## _lock } }
> +
> +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name)
> +{
> + lg_lock_init(&lgrw->lglock, name);
> +}
> +
> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw);
> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw);
> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw);
> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw);
> #endif
> diff --git a/kernel/lglock.c b/kernel/lglock.c
> index 6535a66..463543a 100644
> --- a/kernel/lglock.c
> +++ b/kernel/lglock.c
> @@ -87,3 +87,48 @@ void lg_global_unlock(struct lglock *lg)
> preempt_enable();
> }
> EXPORT_SYMBOL(lg_global_unlock);
> +
> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw)
> +{
> + struct lglock *lg = &lgrw->lglock;
> +
> + preempt_disable();
> + if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
> + if (likely(arch_spin_trylock(this_cpu_ptr(lg->lock)))) {
> + rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_);
> + return;
> + }
> + read_lock(&lgrw->fallback_rwlock);
> + }
> +
> + __this_cpu_inc(*lgrw->fallback_reader_refcnt);
> +}
> +EXPORT_SYMBOL(lg_rwlock_local_read_lock);
> +
> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw)
> +{
> + if (likely(!__this_cpu_read(*lgrw->fallback_reader_refcnt))) {
> + lg_local_unlock(&lgrw->lglock);
> + return;
> + }
> +
> + if (!__this_cpu_dec_return(*lgrw->fallback_reader_refcnt))
> + read_unlock(&lgrw->fallback_rwlock);
> +
> + preempt_enable();
> +}
> +EXPORT_SYMBOL(lg_rwlock_local_read_unlock);
> +

If I read the code above correctly, all you are doing is implementing a
recursive reader-side primitive (i.e., allowing the reader to call these
functions recursively, without resulting in a self-deadlock).
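
To illustrate what "recursive" means here, the following is a minimal
single-threaded user-space model, not kernel code. It assumes one counter
standing in for the per-CPU fallback_reader_refcnt and a second counter
standing in for read holds on the underlying lock; all model_* names are
hypothetical. Only the outermost acquisition touches the underlying lock,
so a nested call cannot self-deadlock.

```c
#include <assert.h>

/* Hypothetical stand-in for read holds on the underlying lock. */
static int model_lock_holds;

/* Hypothetical stand-in for this CPU's reader refcount (nesting depth). */
static unsigned long model_reader_depth;

static void model_read_lock(void)
{
    /* Only the outermost reader acquires the underlying lock. */
    if (model_reader_depth == 0)
        model_lock_holds++;
    model_reader_depth++;
}

static void model_read_unlock(void)
{
    assert(model_reader_depth > 0);
    /* Only the outermost reader releases the underlying lock. */
    if (--model_reader_depth == 0)
        model_lock_holds--;
}
```

A nested model_read_lock() only bumps the depth counter, which is the whole
of what recursion buys on the reader side.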

But the thing is, making the reader-side recursive is the least of our
problems! Our main challenge is to make the locking extremely flexible
and also safeguard it against circular locking dependencies and deadlocks.
Please take a look at the changelog of patch 1 - it explains the situation
with an example.

> +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw)
> +{
> + lg_global_lock(&lgrw->lglock);

This does a for-loop on all CPUs and takes their locks one-by-one. That's
exactly what we want to prevent, because that is the _source_ of all our
deadlock woes in this case. In the presence of perfect lock ordering
guarantees, this wouldn't have been a problem (that's why lglocks are
being used successfully elsewhere in the kernel). In the stop_machine()
removal case, the over-flexibility of preempt_disable() forces us to provide
an equally flexible locking alternative. Hence we can't use such per-cpu
locking schemes.
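
The per-CPU loop described above can be sketched as a small user-space
model, not the kernel implementation. It assumes a fixed MODEL_NR_CPUS and
uses one atomic flag per "CPU" in place of the per-CPU arch_spinlock_t; all
model_* names are hypothetical. A trylock variant is used so that the model
reports the CPU at which a real writer would start spinning, instead of
actually spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4  /* hypothetical CPU count for this model */

/* One flag per "CPU": 0 = free, 1 = held. Statics start out zeroed. */
static atomic_int model_cpu_lock[MODEL_NR_CPUS];

static bool model_local_trylock(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    return atomic_exchange(&model_cpu_lock[cpu], 1) == 0;
}

static void model_local_unlock(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    atomic_store(&model_cpu_lock[cpu], 0);
}

/*
 * Walk all CPUs in order and take each lock. Returns MODEL_NR_CPUS when
 * every lock was taken. Otherwise releases what was taken so far and
 * returns the index of the first busy CPU, i.e. the point where a
 * blocking writer would sit and wait.
 */
static int model_global_trylock(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        if (!model_local_trylock(cpu)) {
            for (int held = cpu - 1; held >= 0; held--)
                model_local_unlock(held);
            return cpu;
        }
    }
    return MODEL_NR_CPUS;
}

static void model_global_unlock(void)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_local_unlock(cpu);
}
```

In the blocking version, the writer holds the locks of CPUs 0..n-1 while it
waits for CPU n, which is only safe when every user agrees on the lock
ordering.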

You might note that, for exactly this reason, I haven't actually used any
per-cpu _locks_ in this synchronization scheme, though it is named as
"per-cpu rwlocks". The only per-cpu components here are the refcounts, and
we consciously avoid waiting/spinning on them (because then that would be
equivalent to having per-cpu locks, which are deadlock-prone). We use
global rwlocks to get the deadlock-safety that we need.

> + write_lock(&lgrw->fallback_rwlock);
> +}
> +EXPORT_SYMBOL(lg_rwlock_global_write_lock);
> +
> +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw)
> +{
> + write_unlock(&lgrw->fallback_rwlock);
> + lg_global_unlock(&lgrw->lglock);
> +}
> +EXPORT_SYMBOL(lg_rwlock_global_write_unlock);
>

Regards,
Srivatsa S. Bhat
