Re: [PATCH 23/32] Generic dynamic per cpu refcounting

From: Kent Overstreet
Date: Fri Feb 08 2013 - 16:27:17 EST


On Fri, Feb 08, 2013 at 03:49:02PM +0100, Jens Axboe wrote:
> On Fri, Feb 08 2013, Tejun Heo wrote:
> > (cc'ing Andrew)
> >
> > On Wed, Dec 26, 2012 at 06:00:02PM -0800, Kent Overstreet wrote:
> > > This implements a refcount with similar semantics to
> > > atomic_get()/atomic_dec_and_test(), that starts out as just an atomic_t
> > > but dynamically switches to per cpu refcounting when the rate of
> > > gets/puts becomes too high.
> > >
> > > It also implements two stage shutdown, as we need it to tear down the
> > > percpu counts. Before dropping the initial refcount, you must call
> > > percpu_ref_kill(); this puts the refcount in "shutting down mode" and
> > > switches back to a single atomic refcount with the appropriate barriers
> > > (synchronize_rcu()).
> > >
> > > It's also legal to call percpu_ref_kill() multiple times - it only
> > > returns true once, so callers don't have to reimplement shutdown
> > > synchronization.
> > >
> > > For the sake of simplicity/efficiency, the heuristic is pretty simple -
> > > it just switches to percpu refcounting if there are more than x gets
> > > in one second (completely arbitrarily, 4096).
> > >
> > > It'd be more correct to count the number of cache misses or something
> > > else more profile driven, but doing so would require accessing the
> > > shared ref twice per get - by just counting the number of gets(), we can
> > > stick that counter in the high bits of the refcount and increment both
> > > with a single atomic64_add(). But I expect this'll be good enough in
> > > practice.
> > >
> > > Signed-off-by: Kent Overstreet <koverstreet@xxxxxxxxxx>
> >
> > What's the status of this series? The percpu-refcnt part is still
> > going through review and the merge window is opening up pretty soon.
> > Kent, Andrew?
>
> I'd feel a lot better deferring the whole aio/dio performance series for
> one merge window. There's very little point in rushing it, and I don't
> think it's been reviewed/tested enough yet.

It could probably use more review, but it has been sitting in linux-next
and the issues that showed up there are all fixed. You going to help
review it? :)

I'm not really set on it going in this merge cycle, but testing wise I
do think it's in pretty good shape and I'm not sure where we're going to
get more testing from before it goes in.

And Andrew - apologies for not getting you the benchmarks you asked for,
getting hardware for it has turned out to be more troublesome than I
expected. Still don't know what's going on with that.

(It turned out though that when I readded that second kiocb refcount
that io_submit_one() owns, that was a real performance regression. So
I've got a patch that refactors aio_rw_vect_retry() so it can safely be
dropped, and I'm going to _try_ to refactor cancellation so that I can
get rid of kiocb refcounting entirely. But I'm not really in a big hurry
to pile more stuff into this patch series, I'd prefer to get what I've
got in...)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/