Re: [RFC] Refcounting of objects part of a lockfree collection

From: Dipankar Sarma
Date: Wed Jul 14 2004 - 10:24:00 EST


On Wed, Jul 14, 2004 at 07:26:14AM -0700, Greg KH wrote:
> On Wed, Jul 14, 2004 at 01:56:22PM +0530, Dipankar Sarma wrote:
> > Well, the kref has the same get/put race if used in a lock-free
> > look-up. When you do a kref_get() it is assumed that another
> > cpu will not see a 1-to-0 transition of the reference count.
>
> You mean kref_put(), right?

No, I meant kref_get(). See below.

> > If that indeed happens, ->release() will get invoked more
> > than once for that object which is bad.
>
> As kref_put() uses a atomic_t, how can that transistion happen twice?
>
> What can happen is kref_get() and kref_put() can race if the last
> kref_put() happens at the same time that kref_get(). But that is solved
> by having the caller guarantee that this can not happen (see my 2004 OLS
> paper for more info about this.)

Yes, and how do the callers guarantee that ? Using a lock, right ?
What Kiran's patch does is to allow those callers to use lock-free
algorithms. Let's look at the race -

---------------------------------------------------------------

CPU #0 CPU #1
------ ------

my_put() from a user who did my_lookup() ...

[ ->count is 1 ]

In my_lookup() ... atomic_dec_and_test(&m->count)

[ ->count is 0 ]
m = my_get(my_list[i]);

[ ->count is 1 ] call_rcu(&m->head, free, m);

return m;

[This CPU can now context
switch and allow RCU to
proceed]

free(m);

Somebody dereferences m and
invalid memory reference
---------------------------------------------------------------

This can happen if my_lookup() is lock-free.

> > The other issue is that there are many refcounted data structures
> > like dentry, dst_entry, file etc. that do not use kref.
>
> At this time, sure. But you could always change that :)
> (and yes, to do so, we can always shrink the size of struct kref if
> really needed...)

How are you going to shrink it ? You need the ->release() method
and that is a nice way for drivers to get rid of objects.

>
> > If everybody were to use kref, we could possibly apply Kiran's
> > lock-free extensions to kref itself and be done with it.
>
> Ok, sounds like a plan to me. Having 2 refcount implementations in the
> kernel that work alike, yet a bit different, is not acceptable. Please
> rework struct kref to do this.

And I suspect that Andrew thwak me for trying to increase dentry size :)
Anyway, the summary is this - Kiran is not trying to introduce
a new refcounting API. He is just adding lock-free support from
an existing refcounting mechanism that is used in VFS. If kref users need to do
lock-free lookup, sure we should add it to kref_xxx APIs also.

> > Until then, we need the lock-free refcounting support from non-kref
> > refcounting objects.
>
> We've lived without it until now somehow :)

Actually, we already use lock-free refcounting in route cache, dcache. In those
cases, we work around this race using a different algorithm.

Thanks
Dipankar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/