Re: Double free of struct sk_buff reported by SLAB_CONSISTENCY_CHECKS with init_on_free

From: Thibaut Sautereau
Date: Wed Nov 06 2019 - 04:29:25 EST


On Tue, Nov 05, 2019 at 12:01:15PM -0500, Laura Abbott wrote:
> On 11/5/19 10:02 AM, Vlastimil Babka wrote:
> > On 11/5/19 3:32 PM, Thibaut Sautereau wrote:
> > > On Tue, Nov 05, 2019 at 10:00:39AM +0100, Vlastimil Babka wrote:
> > > > On 11/4/19 6:03 PM, Thibaut Sautereau wrote:
> > > > > The BUG only happens when using `slub_debug=F` on the command-line (to
> > > > > enable SLAB_CONSISTENCY_CHECKS), otherwise the double free is not
> > > > > reported and the system keeps running.
> > > >
> > > > You could change slub_debug parameter to:
> > > > slub_debug=FU,skbuff_head_cache
> > > >
> > > > That will also print out who previously allocated and freed the double
> > > > freed object. And limit all the tracking just to the affected cache.
> > >
> > > Thanks, I did not know about that.
> > >
> > > However, as kind of expected, I get a BUG due to a NULL pointer
> > > dereference in print_track():
> >
> > Ah, I didn't read properly your initial mail, that there's a null
> > pointer deference during the consistency check.
> >
> > ...
> >
> > > > >
> > > > > Bisection points to the following commit: 1b7e816fc80e ("mm: slub: Fix
> > > > > slab walking for init_on_free"), and indeed the BUG is not triggered
> > > > > when init_on_free is disabled.
> > > >
> > > > That could be either buggy SLUB code, or the commit somehow exposed a
> > > > real bug in skbuff users.
> > >
> > > Right. At first I thought about some incompatibility between
> > > init_on_free and SLAB_CONSISTENCY_CHECKS, but in that case why would it
> > > only happen with skbuff_head_cache?
> >
> > That's curious, yeah.
> >
> > > On the other hand, if it's a bug in
> > > skbuff users, why is the on_freelist() check in free_consistency_check()
> > > not detecting anything when init_on_free is disabled?
> >
> > I vaguely suspect the code in the commit 1b7e816fc80e you bisected,
> > where in slab_free_freelist_hook() in the first iteration, we have void
> > *p = NULL; and set_freepointer(s, object, p); will thus write NULL into
> > the freelist. Is is the NULL we are crashing on? The code seems to
> > assume that the freelist is rewritten later in the function, but that
> > part is only active with some CONFIG_ option(s), none of which might be
> > enabled in your case?
> > But I don't really understand what exactly this function is supposed to
> > do. Laura, does my theory make sense?
> >
> > Thanks,
> > Vlastimil
> >
>
> The note about getting re-written is referring to the fact that the trick
> with the bulk free is that we build the detached freelist and then when
> we do the cmpxchg it's getting (correctly) updated there.

Thank you Laura for this clarification.

> But looking at this again, I realize this function still has a more
> fundamental problem: walking the freelist like this means we actually
> end up reversing the list so head and tail are no longer pointing
> to the correct blocks. I was able to reproduce the issue by writing a
> simple kmem_cache_bulk_alloc/kmem_cache_bulk_free function. I'm
> guessing that the test of ping with an unusual size was enough to
> regularly trigger a non-trivial bulk alloc/free.
>
> The fix I gave before fixed part of the problem but not all of it.
> At this point we're basically duplicating the work of the loop
> below so I think we can just combine it. Was there a reason this
> wasn't just combined in the first place?

Good catch about the freelist inversion, I was actually starting to
wonder whether it was really intended. Anyway, I tested your patch (see
one tiny inline comment below) and it seems to run fine for now, without
me being able to reproduce the bug anymore.

> diff --git a/mm/slub.c b/mm/slub.c
> index dac41cf0b94a..1510b86b2e7e 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -1431,13 +1431,17 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> void *next = *head;
> void *old_tail = *tail ? *tail : *head;
> int rsize;
> + next = *head;

`next` is already set a few lines above.

> - if (slab_want_init_on_free(s)) {
> - void *p = NULL;
> + /* Head and tail of the reconstructed freelist */
> + *head = NULL;
> + *tail = NULL;
> - do {
> - object = next;
> - next = get_freepointer(s, object);
> + do {
> + object = next;
> + next = get_freepointer(s, object);
> +
> + if (slab_want_init_on_free(s)) {
> /*
> * Clear the object and the metadata, but don't touch
> * the redzone.
> @@ -1447,29 +1451,8 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> : 0;
> memset((char *)object + s->inuse, 0,
> s->size - s->inuse - rsize);
> - set_freepointer(s, object, p);
> - p = object;
> - } while (object != old_tail);
> - }
> -
> -/*
> - * Compiler cannot detect this function can be removed if slab_free_hook()
> - * evaluates to nothing. Thus, catch all relevant config debug options here.
> - */
> -#if defined(CONFIG_LOCKDEP) || \
> - defined(CONFIG_DEBUG_KMEMLEAK) || \
> - defined(CONFIG_DEBUG_OBJECTS_FREE) || \
> - defined(CONFIG_KASAN)
> - next = *head;
> -
> - /* Head and tail of the reconstructed freelist */
> - *head = NULL;
> - *tail = NULL;
> -
> - do {
> - object = next;
> - next = get_freepointer(s, object);
> + }
> /* If object's reuse doesn't have to be delayed */
> if (!slab_free_hook(s, object)) {
> /* Move object to the new freelist */
> @@ -1484,9 +1467,6 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
> *tail = NULL;
> return *head != NULL;
> -#else
> - return true;
> -#endif
> }
> static void *setup_object(struct kmem_cache *s, struct page *page,
>

--
Thibaut Sautereau
CLIP OS developer