Re: [PATCH] Percpu tag allocator

From: Tejun Heo
Date: Thu Jun 13 2013 - 15:15:23 EST

Hello, Andrew.

On Thu, Jun 13, 2013 at 12:04:39PM -0700, Andrew Morton wrote:
> > The thing is that id[r|a] guarantee that the lowest available slot is
> > allocated
> That isn't the case for ida_get_new_above() - the caller gets to
> control the starting index.

Hmmm? get_new_above() is the same: it must allocate the first
available ID at or above the given low bound - the bound is used to
exclude reserved IDs.
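To make the guarantee concrete, here is a minimal userspace model of
that semantic - alloc_above() is an invented stand-in for
ida_get_new_above(), using a flat bitmap instead of the real tree, so
it only models the "lowest free ID at or above the bound" contract:

```c
#define NSLOTS 64

/* Hypothetical model of the ida_get_new_above() contract: hand out
 * the lowest free slot >= start, or -1 if the space is exhausted.
 * This is not the kernel API, just the guarantee it makes. */
static unsigned char used[NSLOTS];

static int alloc_above(int start)
{
	for (int id = start; id < NSLOTS; id++) {
		if (!used[id]) {
			used[id] = 1;
			return id;
		}
	}
	return -1;
}

static void release(int id)
{
	used[id] = 0;
}
```

Note that the low bound only excludes IDs below it; within the allowed
range the allocator is still obliged to return the lowest free ID,
which is the property under discussion here.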

> The worst outcome here is that idr.c remains unimproved and we merge a
> new allocator which does basically the same thing.

The lowest-number guarantee makes them different. Maybe tag
allocation can be layered on top as a caching layer, I don't know, but
at any rate we need at least two different operation modes.

> The best outcome is that idr.c gets improved and we don't have to merge
> duplicative code.
> So please, let's put aside the shiny new thing for now and work out how
> we can use the existing tag allocator for these applications. If we
> make a genuine effort to do this and decide that it's fundamentally
> hopeless then this is the time to start looking at new implementations.
> (I can think of at least two ways of making ida_get_new_above() an
> order of magnitude faster for this application and I'm sure you guys
> can as well.)

Oh, I'm sure the current id[r|a] can be improved upon a lot, but I'm
very skeptical it can reach the level of scalability necessary for,
say, PCIe-attached, extremely high-IOPS devices while still keeping
lowest-number allocation - that guarantee can't be upheld without
strong synchronization on each alloc/free.

Maybe we can layer things so that we have a percpu layer on top of
id[r|a] - say, mapping id to pointer is still done by idr, or the
percpu tag allocator uses ida for tag chunk allocations - but it's
still gonna be something extra on top.
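As a rough sketch of that layering idea, here is a hypothetical
single-threaded model: per-cpu caches that pull batches of tags from a
global pool. All names are invented (this is not the proposed kernel
code), and a real SMP version would need a lock or atomics on the
global pool; the point is that a cpu normally touches only its own
cache, so the lowest-number guarantee is deliberately traded away for
locality:

```c
#define NTAGS	16
#define NCPUS	2
#define BATCH	4

/* Global pool as a stack of free tags; needs locking on real SMP. */
static int global_pool[NTAGS];
static int global_top;

struct cpu_cache { int tags[BATCH]; int n; };
static struct cpu_cache cache[NCPUS];

static void pool_init(void)
{
	global_top = 0;
	for (int t = NTAGS - 1; t >= 0; t--)	/* tag 0 ends up on top */
		global_pool[global_top++] = t;
	for (int c = 0; c < NCPUS; c++)
		cache[c].n = 0;
}

static int tag_alloc(int cpu)
{
	struct cpu_cache *cc = &cache[cpu];

	if (!cc->n) {			/* cache empty: refill a batch */
		while (cc->n < BATCH && global_top)
			cc->tags[cc->n++] = global_pool[--global_top];
	}
	return cc->n ? cc->tags[--cc->n] : -1;
}

static void tag_free(int cpu, int tag)
{
	struct cpu_cache *cc = &cache[cpu];

	if (cc->n < BATCH)		/* keep the tag local if possible */
		cc->tags[cc->n++] = tag;
	else
		global_pool[global_top++] = tag;
}
```

Note how the first tag a cpu hands out is the top of its freshly
pulled batch, not the lowest free tag overall, and a freed tag is
reused by the same cpu first - good for the cache, useless for anyone
who needs the idr/ida lowest-number semantics.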

