Re: [PATCH V4 1/1] rcu: introduce kfree_rcu()

From: Paul E. McKenney
Date: Tue Mar 15 2011 - 08:19:55 EST


On Tue, Mar 15, 2011 at 01:02:09PM +0100, Arnd Bergmann wrote:
> On Tuesday 15 March 2011, Paul E. McKenney wrote:
> > > Another alternative might be to encode the difference between a
> > > function pointer and an offset in one of the lower bits of the address.
> >
> > We discussed this some time back, and it turned out that there were
> > CPUs that could legitimately have any combination of low-order bits
> > set -- functions could start at any byte address.
> >
> > If this has changed, I would prefer to use the low-order bits, but
> > if it has not, we can't. :-(
>
> Ok, I see.
>
> I just had another idea, which may or may not have new problems:
>
> static inline void *kzalloc_rcu(size_t len, gfp_t flags)
> {
> 	struct rcu_head *head = kzalloc(len + sizeof (struct rcu_head), flags);
> 	return head + 1;
> }
>
> void __kfree_rcu(struct rcu_head *head)
> {
> 	kfree(head);
> }
>
> static inline void kfree_rcu(void *p)
> {
> 	struct rcu_head *head = p - sizeof (struct rcu_head);
> 	call_rcu(head, __kfree_rcu);
> }
>
> The only disadvantage I can see right now is that it messes
> with the alignment of the structure.

And it makes use of statically allocated structures a bit clunky.
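
To make the alignment and static-allocation concerns a bit more concrete,
here is a rough user-space sketch, not kernel code. It models kzalloc()
with calloc(), and the names zalloc_rcu_model(), rcu_head_of(), struct foo,
struct foo_static, and boot_foo are all made up for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* User-space stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

/* Model of the proposed kzalloc_rcu(): hidden header, then the payload. */
static inline void *zalloc_rcu_model(size_t len)
{
	struct rcu_head *head = calloc(1, sizeof(*head) + len);

	/* Unlike the quoted version, do not compute NULL + 1 on failure. */
	return head ? head + 1 : NULL;
}

/* Recover the hidden header from a payload pointer. */
static inline struct rcu_head *rcu_head_of(void *p)
{
	return (struct rcu_head *)p - 1;
}

/* Hypothetical payload type. */
struct foo {
	int a;
	int b;
};

/*
 * A statically allocated struct foo has no hidden header, so the caller
 * must wrap it by hand.  This only works when the compiler inserts no
 * padding between the two members, that is, when
 * offsetof(struct foo_static, obj) == sizeof(struct rcu_head).  A payload
 * with stricter alignment than struct rcu_head would break that.
 */
struct foo_static {
	struct rcu_head rcu;
	struct foo obj;
};

struct foo_static boot_foo;
```

With this layout, rcu_head_of(&boot_foo.obj) yields &boot_foo.rcu, but only
because struct foo needs no more alignment than struct rcu_head does.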

Another approach I could imagine would be to generate the RCU callback
functions at compile/link time by creating a new section that holds
offsets along with slots for function pointers. A link-time utility
could scan the contents of the section, generate the needed functions,
compile them, and store pointers to them in the corresponding slots.

One disadvantage of this approach (in addition to the changes required
to kbuild) is that it would not allow rcu_barrier() to be removed.

Yet another approach is to use the low-order bit of the rcu_head pointer,
given that the rcu_head structure does have to be aligned. If this bit
is set, then the function pointer could be interpreted as an offset.
This approach might also allow a slab_free_rcu() to be constructed, given
that the full 32 bits of the function pointer would be available.
For example, if the upper 16 bits are zero, the low-order 16 bits are
the offset. If the upper 16 bits are 0x1, then the low-order 16 bits
might be an index that selects the desired slab cache.
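
A rough user-space sketch of how such an encoding might be decoded follows.
It is a model under stated assumptions, not a proposed implementation: the
names rcu_encode(), rcu_untag(), rcu_object_base(), rcu_slab_index(), and
the RCU_KIND_* values are invented here, and it relies on the compiler
allowing integer/function-pointer casts, as gcc does.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's struct rcu_head. */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};

typedef void (*rcu_callback_fn)(struct rcu_head *head);

/* Hypothetical values for the upper 16 bits of the 32-bit word. */
#define RCU_KIND_OFFSET 0x0u	/* low 16 bits: offset of rcu_head in object */
#define RCU_KIND_SLAB   0x1u	/* low 16 bits: index selecting a slab cache */

/*
 * Store an encoded word in ->func and return the rcu_head pointer with
 * its low-order bit set.  The bit is free because rcu_head is aligned.
 */
static inline uintptr_t rcu_encode(struct rcu_head *head, uint32_t kind,
				   uint16_t value)
{
	uint32_t word = (kind << 16) | value;

	head->func = (rcu_callback_fn)(uintptr_t)word;
	return (uintptr_t)head | 1u;
}

/* Low-order bit set means ->func is an encoded word, not a function. */
static inline int rcu_is_encoded(uintptr_t tagged)
{
	return (int)(tagged & 1u);
}

static inline struct rcu_head *rcu_untag(uintptr_t tagged)
{
	return (struct rcu_head *)(tagged & ~(uintptr_t)1);
}

static inline uint32_t rcu_word(uintptr_t tagged)
{
	return (uint32_t)(uintptr_t)rcu_untag(tagged)->func;
}

/* Start of the enclosing object, or NULL if this is not an offset word. */
static inline void *rcu_object_base(uintptr_t tagged)
{
	uint32_t word;

	if (!rcu_is_encoded(tagged))
		return NULL;
	word = rcu_word(tagged);
	if ((word >> 16) != RCU_KIND_OFFSET)
		return NULL;
	return (char *)rcu_untag(tagged) - (word & 0xffffu);
}

/* Slab-cache index, or -1 if this is not a slab word. */
static inline int rcu_slab_index(uintptr_t tagged)
{
	uint32_t word;

	if (!rcu_is_encoded(tagged))
		return -1;
	word = rcu_word(tagged);
	if ((word >> 16) != RCU_KIND_SLAB)
		return -1;
	return (int)(word & 0xffffu);
}
```

The callback-invocation loop would check the low-order bit first, and call
through ->func only when the bit is clear.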

Other possible approaches?

Thanx, Paul