Re: SLUB regression in current Linus

From: Pekka Enberg
Date: Wed May 25 2011 - 01:22:27 EST


On Wed, May 25, 2011 at 2:03 AM, Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, May 24, 2011 at 4:52 AM, James Morris <jmorris@xxxxxxxxx> wrote:
>>
>> Reverting the patch appears to fix the hang for me, although I'm not sure
>> what the actual problem is.
>>
>> This is on a quad-core Opteron (1352). Let me know if you need any further
>> info.
>
> That whole "deactivate_slab()" + "c->page = NULL" that that patch does
> looks bogus.
>
> Look at __slab_alloc: we have:
>
>
>        page = c->page;
>        if (!page)
>                goto new_slab;
>
>        slab_lock(page);
>        if (unlikely(!node_match(c, node)))
>                goto another_slab;
>
> and let's assume we have two users racing on that "c->page". The
> "slab_lock()" is going to work for one of them, right?
>
> Ok, so the one it works for will then hit
>
>        if (kmem_cache_debug(s))
>                goto debug;
>
> and thus get to the new "deactivate_slab(s,c) + c->page = NULL" and
> then unlock the page.
>
> In the meantime, the one that wasn't able to lock the page will now go
> forward, but will not have "node_match()" any more, so it does that
> "goto another_slab".
>
> Which does "deactivate_slab(s,c)" again, and now c->page is NULL, so
> that totally breaks.
>
> What am I missing?
>
> That patch seems to be just broken piece-of-s%^!
>
> Christoph, Pekka, please tell me why I shouldn't immediately revert
> it. What am I missing?

It's safe to revert it, yes. Christoph? AFAICT Linus is correct here.
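
For reference, here is a minimal single-threaded sketch that replays the
interleaving described in the quoted message, step by step. It is a toy
model, not kernel code: struct page and struct kmem_cache_cpu are cut-down
stand-ins, and toy_deactivate_slab() and replay_race() are hypothetical
helpers made up for illustration. It assumes both users read the same
c->page before either of them takes the slab lock, as the quoted analysis
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real SLUB definitions. */
struct page {
	bool locked;
	bool frozen;
};

struct kmem_cache_cpu {
	struct page *page;
};

/*
 * Hypothetical model of deactivate_slab(): it needs a valid c->page.
 * Returns false at the point where this model assumes the real function
 * would dereference a NULL c->page.
 */
static bool toy_deactivate_slab(struct kmem_cache_cpu *c)
{
	if (c->page == NULL)
		return false;
	c->page->frozen = false;
	return true;
}

/*
 * Replays the interleaving in a fixed order. Returns true if the first
 * deactivate succeeded and the second one found c->page == NULL.
 */
static bool replay_race(void)
{
	struct page slab = { .locked = false, .frozen = true };
	struct kmem_cache_cpu c = { .page = &slab };

	/* Both users load c->page before either holds the lock. */
	struct page *page_a = c.page;
	struct page *page_b = c.page;
	assert(page_a == page_b);

	/* User A wins slab_lock() and takes the "goto debug" path. */
	page_a->locked = true;
	bool first_ok = toy_deactivate_slab(&c);
	c.page = NULL;
	page_a->locked = false;		/* slab_unlock() */

	/* User B now locks its stale page pointer and, per the quoted
	 * analysis, falls through to "goto another_slab". */
	page_b->locked = true;
	bool second_ok = toy_deactivate_slab(&c);
	page_b->locked = false;

	return first_ok && !second_ok;
}
```

Calling replay_race() from a small test program returns true in this
model, i.e. the second deactivate is entered with c->page already NULL.
Whether the real code can actually reach that interleaving is exactly
the open question in the quoted mail.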