Re: [PATCH v3 1/6] lib/stackdepot: allow requesting early initialization dynamically

From: Mike Rapoport
Date: Wed Apr 06 2022 - 11:21:57 EST


On Tue, Apr 05, 2022 at 02:40:03PM -0700, David Rientjes wrote:
> On Mon, 4 Apr 2022, Vlastimil Babka wrote:
>
> > In a later patch we want to add stackdepot support for object owner
> > tracking in slub caches, which is enabled by slub_debug boot parameter.
> > This creates a bootstrap problem as some caches are created early in
> > boot when slab_is_available() is false and thus stack_depot_init()
> > tries to use memblock. But, as reported by Hyeonggon Yoo [1], we are
> > already beyond memblock_free_all(). Ideally the memblock allocation
> > should fail; instead it succeeds, and the system later crashes, which
> > is a separately handled issue.
> >
> > To resolve this bootstrap issue in a robust way, this patch adds another
> > way to request stack_depot_early_init(), which happens at a well-defined
> > point in time. In addition to build-time CONFIG_STACKDEPOT_ALWAYS_INIT,
> > code that's e.g. processing boot parameters (which happens early enough)
> > can call a new function stack_depot_want_early_init(), which sets a flag
> > that stack_depot_early_init() will check.
> >
> > In this patch we also convert page_owner to this approach. While it
> > doesn't have the same bootstrap issue as slub, it is also functionality
> > enabled by a boot param and can thus request stack_depot_early_init()
> > with memblock allocation instead of later initialization with
> > kvmalloc().
> >
> > As suggested by Mike, make stack_depot_early_init() only attempt
> > memblock allocation and stack_depot_init() only attempt kvmalloc().
> > Also change the latter to kvcalloc(). In both cases we can drop the
> > explicit array zeroing, since the allocations already zero the memory.
> >
> > As suggested by Marco, provide empty implementations of the init
> > functions for !CONFIG_STACKDEPOT builds to simplify the callers.
> >
> > [1] https://lore.kernel.org/all/YhnUcqyeMgCrWZbd@xxxxxxxxxxxxxxxxxxx-northeast-1.compute.internal/
> >
> > Reported-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> > Suggested-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > Suggested-by: Marco Elver <elver@xxxxxxxxxx>
> > Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
> > Reviewed-by: Marco Elver <elver@xxxxxxxxxx>
> > Reviewed-and-tested-by: Hyeonggon Yoo <42.hyeyoo@xxxxxxxxx>
> > Reviewed-by: Mike Rapoport <rppt@xxxxxxxxxxxxx>
> > ---
> > include/linux/stackdepot.h | 26 ++++++++++++---
> > lib/stackdepot.c           | 66 +++++++++++++++++++++++++-------------
> > mm/page_owner.c            |  9 ++++--
> > 3 files changed, 72 insertions(+), 29 deletions(-)
> >
> > diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
> > index 17f992fe6355..bc2797955de9 100644
> > --- a/include/linux/stackdepot.h
> > +++ b/include/linux/stackdepot.h
> > @@ -20,18 +20,36 @@ depot_stack_handle_t __stack_depot_save(unsigned long *entries,
> > gfp_t gfp_flags, bool can_alloc);
> >
> > /*
> > - * Every user of stack depot has to call this during its own init when it's
> > - * decided that it will be calling stack_depot_save() later.
> > + * Every user of stack depot has to call stack_depot_init() during its own init
> > + * when it's decided that it will be calling stack_depot_save() later. This is
> > + * recommended for e.g. modules initialized later in the boot process, when
> > + * slab_is_available() is true.
> > *
> > * The alternative is to select STACKDEPOT_ALWAYS_INIT to have stack depot
> > * enabled as part of mm_init(), for subsystems where it's known at compile time
> > * that stack depot will be used.
> > + *
> > + * Another alternative is to call stack_depot_want_early_init(), when the
> > + * decision to use stack depot is taken e.g. when evaluating kernel boot
> > + * parameters, which precedes the enablement point in mm_init().
> > + *
> > + * stack_depot_init() and stack_depot_want_early_init() can be called regardless
> > + * of CONFIG_STACKDEPOT and are no-op when disabled. The actual save/fetch/print
> > + * functions should only be called from code that makes sure CONFIG_STACKDEPOT
> > + * is enabled.
> > */
> > +#ifdef CONFIG_STACKDEPOT
> > int stack_depot_init(void);
> >
> > -#ifdef CONFIG_STACKDEPOT_ALWAYS_INIT
> > -static inline int stack_depot_early_init(void) { return stack_depot_init(); }
> > +void __init stack_depot_want_early_init(void);
> > +
> > +/* This is supposed to be called only from mm_init() */
> > +int __init stack_depot_early_init(void);
> > #else
> > +static inline int stack_depot_init(void) { return 0; }
> > +
> > +static inline void stack_depot_want_early_init(void) { }
> > +
> > static inline int stack_depot_early_init(void) { return 0; }
> > #endif
> >
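For anyone following along, the intended usage on the caller side is a
boot-parameter handler requesting the early init before mm_init() runs.
A rough sketch (the "my_debug" subsystem and parameter are made up, only
the stack depot calls come from this patch):

#include <linux/stackdepot.h>

static bool my_debug_enabled __initdata;

static int __init parse_my_debug(char *str)
{
	my_debug_enabled = true;
	/*
	 * Only flag the request here; the actual memblock allocation is
	 * done by stack_depot_early_init() when mm_init() calls it.
	 */
	stack_depot_want_early_init();
	return 0;
}
early_param("my_debug", parse_my_debug);
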
> > diff --git a/lib/stackdepot.c b/lib/stackdepot.c
> > index bf5ba9af0500..6c4644c9ed44 100644
> > --- a/lib/stackdepot.c
> > +++ b/lib/stackdepot.c
> > @@ -66,6 +66,9 @@ struct stack_record {
> > unsigned long entries[]; /* Variable-sized array of entries. */
> > };
> >
> > +static bool __stack_depot_want_early_init __initdata = IS_ENABLED(CONFIG_STACKDEPOT_ALWAYS_INIT);
> > +static bool __stack_depot_early_init_passed __initdata;
> > +
> > static void *stack_slabs[STACK_ALLOC_MAX_SLABS];
> >
> > static int depot_index;
> > @@ -162,38 +165,57 @@ static int __init is_stack_depot_disabled(char *str)
> > }
> > early_param("stack_depot_disable", is_stack_depot_disabled);
> >
> > -/*
> > - * __ref because of memblock_alloc(), which will not be actually called after
> > - * the __init code is gone, because at that point slab_is_available() is true
> > - */
> > -__ref int stack_depot_init(void)
> > +void __init stack_depot_want_early_init(void)
> > +{
> > + /* Too late to request early init now */
> > + WARN_ON(__stack_depot_early_init_passed);
> > +
> > + __stack_depot_want_early_init = true;
> > +}
> > +
> > +int __init stack_depot_early_init(void)
> > +{
> > + size_t size;
> > +
> > + /* This is supposed to be called only once, from mm_init() */
> > + if (WARN_ON(__stack_depot_early_init_passed))
> > + return 0;
> > +
> > + __stack_depot_early_init_passed = true;
> > +
> > + if (!__stack_depot_want_early_init || stack_depot_disable)
> > + return 0;
> > +
> > + pr_info("Stack Depot early init allocating hash table with memblock_alloc\n");
> > + size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
>
> I think the kvcalloc() in the main init path is very unlikely to fail, but
> perhaps this memblock_alloc() might? If so, a nit might be to include
> this size as part of the printk.

memblock_alloc() is even less likely to fail than kvcalloc() ;-)
But printing the size won't hurt.
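Something along these lines would do, with the size computed before the
message (illustrative only, the exact wording doesn't matter):

	size = (STACK_HASH_SIZE * sizeof(struct stack_record *));
	pr_info("Stack Depot early init allocating hash table with memblock_alloc, %zu bytes\n",
		size);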

> Either way:
>
> Acked-by: David Rientjes <rientjes@xxxxxxxxxx>

--
Sincerely yours,
Mike.