Re: [PATCH 2/2] percpu-refcount: implement percpu_tryget() alongwith percpu_ref_kill_and_confirm()

From: Kent Overstreet
Date: Wed Jun 12 2013 - 17:08:33 EST

On Wed, Jun 12, 2013 at 01:46:27PM -0700, Tejun Heo wrote:
> From de3c0749e2c1960afcc433fc5da136b85c8bd896 Mon Sep 17 00:00:00 2001
> From: Tejun Heo <tj@xxxxxxxxxx>
> Date: Wed, 12 Jun 2013 13:37:42 -0700
> Implement percpu_tryget() which succeeds iff the refcount hasn't been
> killed yet. Because the refcnt is per-cpu, different CPUs may have
> different perceptions on when the counter has been killed and tryget()
> may continue to succeed for a while after percpu_ref_kill() returns.

I don't feel very comfortable with saying percpu_ref_tryget() succeeds
"iff the refcount hasn't been killed yet". That's something I would say
about e.g. atomic_inc_not_zero(), but percpu_ref_tryget() doesn't do
that sort of synchronization which is what iff implies to me.

If the user does need some kind of strict ordering between
percpu_ref_kill() and percpu_ref_tryget(), they'd have to insert some
memory barriers - tryget() certainly doesn't have any.

That said, I haven't seen near enough actual uses to know whether this
would be an issue in practice, or what a better description would be. I
mean, tryget() does always get you a valid ref...

Maybe emphasize that tryget() succeeds iff this cpu hasn't seen
percpu_ref_kill() done yet? I dunno.

> For use cases where it's necessary to know when all CPUs start to see
> the refcnt as dead, percpu_ref_kill_and_confirm() is added. The new
> function takes an extra argument @confirm_kill which is invoked when
> the refcnt is guaranteed to be viewed as killed on all CPUs.
> While this isn't the prettiest interface, it doesn't force synchronous
> wait and is much safer than requiring the caller to do its own
> call_rcu().

Yeah, this seems... icky to me. I'm going to withhold judgement until I
see how it's used, maybe there isn't any other way but I'd like to try
and find something prettier.

> /**
> - * percpu_ref_kill - safely drop initial ref
> + * percpu_ref_kill_and_confirm - drop the initial ref and schedule confirmation
> * @ref: percpu_ref to kill
> + * @confirm_kill: optional confirmation callback
> *
> - * Must be used to drop the initial ref on a percpu refcount; must be called
> - * precisely once before shutdown.
> + * Equivalent to percpu_ref_kill() but also schedules kill confirmation if
> + * @confirm_kill is not NULL. @confirm_kill, which may not block, will be
> + * called after @ref is seen as dead from all CPUs - all further
> + * invocations of percpu_ref_tryget() will fail. See percpu_ref_tryget()
> + * for more details.
> *
> - * Puts @ref in non percpu mode, then does a call_rcu() before gathering up the
> - * percpu counters and dropping the initial ref.
> + * It's guaranteed that there will be at least one full RCU grace period
> + * between the invocation of this function and @confirm_kill and the caller
> + * can piggy-back their RCU release on the callback.
> */
> -void percpu_ref_kill(struct percpu_ref *ref)
> +void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
> + percpu_ref_func_t *confirm_kill)

Passing release to percpu_ref_init() and confirm_kill to
percpu_ref_kill() is inconsistent. Can we pass them both to

Also, given that confirm_kill is an optional thing I don't see why
you're renaming percpu_ref_kill() -> percpu_ref_kill_and_confirm(). Most
users (certainly aio, I think the module code too) don't have any use
for confirm kill, I don't want to rename it for an ugly optional thing.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at