Re: [PATCH 4/4] idr: Percpu ida

From: Kent Overstreet
Date: Tue Jun 18 2013 - 14:26:48 EST


On Tue, Jun 18, 2013 at 02:14:53PM +0000, Christoph Lameter wrote:
> On Mon, 17 Jun 2013, Kent Overstreet wrote:
>
> > +static inline unsigned alloc_local_tag(struct percpu_ida *pool,
> > + struct percpu_ida_cpu *tags)
> > +{
> > + int tag = -ENOSPC;
> > +
> > + spin_lock(&tags->lock);
> > + if (tags->nr_free)
> > + tag = tags->freelist[--tags->nr_free];
> > + spin_unlock(&tags->lock);
> > +
> > + return tag;
> > +}
>
> This could be made much faster by avoiding real atomics (coming with
> spinlocks) and using per cpu atomics instead. Slub f.e. uses a single
> linked per cpu list managed via this_cpu_cmpxchg.

Actually, we do need the atomic ops - they're protecting against a
different cpu grabbing our freelist with steal_tags().

The alternative (which Andrew Morton suggested and Jens implemented in
his fork of an earlier version of this code) would be to use IPIs, but
there are legitimate use cases (a high cpus:nr_tags ratio) where tag
stealing is going to be fairly common, so we don't want it to suck
too much. Plus, steal_tags() needs to try different cpus' freelists
until it finds one with tags, and IPIs would make that loop... problematic.

I actually originally used cmpxchg(), but later in review I realized
I had an ABA problem and couldn't figure out how to solve it, so that's
when I switched to spinlocks (which Andrew Morton preferred anyways).

But I just realized a few minutes ago that I've already seen your
solution to ABA with a stack, in the slub code - double word cmpxchg()
with a transaction id :) I may try that again just to see how the code
looks, but I'm not sure the difference in performance will be big
enough to matter, given that this lock should basically never be
contended.