Re: [PATCH] bsg: call idr_pre_get() outside the lock.

From: James Bottomley
Date: Thu Sep 16 2010 - 22:55:47 EST


On Thu, 2010-09-16 at 16:37 -0700, Andrew Morton wrote:
> On Wed, 01 Sep 2010 23:26:01 +0900
> Naohiro Aota <naota@xxxxxxxxx> wrote:
>
> > The idr_pre_get() kernel-doc says "This function should be called
> > prior to locking and calling the idr_get_new* functions.", but the
> > idr_pre_get() calling in bsg_register_queue() is put inside
> > mutex_lock(). Let's fix it.
> >
>
> The idr_pre_get() kerneldoc is wrong. Or at least, misleading.
>
> The way this all works is that we precharge the tree via idr_pre_get()
> and we do this outside locks so we can use GFP_KERNEL. Then we take
> the lock (a spinlock!) and then try to add an element to the tree,
> which will consume objects from the pool which was prefilled by
> idr_pre_get().
>
> There's an obvious race here where someone else can get in and steal
> objects from the prefilled pool. We can fix that with a retry loop:
>
>
> again:
> if (idr_pre_get(..., GFP_KERNEL) == NULL)
> return -ENOMEM; /* We're really out-of-memory */
> spin_lock(lock);
> if (idr_get_new(...) == -EAGAIN) {
> spin_unlock(lock);
> goto again; /* Someone stole our preallocation! */
> }
> ...
>
> this way we avoid the false -ENOMEM which the race would have caused.
> We only declare -ENOMEM when we're REALLY out of memory.
>
>
> But none of this is needed when a sleeping lock is used (as long as the
> sleeping lock isn't taken on the the VM pageout path, of course). In
> that case we can use the sleeping lock to prevent the above race.
>
> > diff --git a/block/bsg.c b/block/bsg.c
> > index 82d5882..5fd8dd1 100644
> > --- a/block/bsg.c
> > +++ b/block/bsg.c
> > @@ -1010,13 +1010,11 @@ int bsg_register_queue(struct request_queue *q, struct device *parent,
> > bcd = &q->bsg_dev;
> > memset(bcd, 0, sizeof(*bcd));
> >
> > - mutex_lock(&bsg_mutex);
> > -
> > ret = idr_pre_get(&bsg_minor_idr, GFP_KERNEL);
> > - if (!ret) {
> > - ret = -ENOMEM;
> > - goto unlock;
> > - }
> > + if (!ret)
> > + return -ENOMEM;
> > +
> > + mutex_lock(&bsg_mutex);
> >
> > ret = idr_get_new(&bsg_minor_idr, bcd, &minor);
> > if (ret < 0)
>
> So the old code was OK.
>
> The new code, however, is not OK because it is vulnerable to the above
> race wherein another CPU or thread comes in and steals all of this
> thread's preallocation.

Hmm, you're right ... I've dropped the patch.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/