Re: [PATCH] debugobjects: call debug_objects_mem_init eariler

From: Waiman Long
Date: Fri Nov 23 2018 - 00:04:23 EST


On 11/22/2018 11:31 PM, Qian Cai wrote:
> The current value of the early boot static pool size, 1024 is not big
> enough for systems with large number of CPUs with timer or/and workqueue
> objects selected. As the results, systems have 60+ CPUs with both timer
> and workqueue objects enabled could trigger "ODEBUG: Out of memory.
> ODEBUG disabled".
>
> Some debug objects are allocated during the early boot. Enabling some
> options like timers or workqueue objects may increase the size required
> significantly with large number of CPUs. For example,
>
> CONFIG_DEBUG_OBJECTS_TIMERS:
> No. CPUs x 2 (worker pool) objects:
> start_kernel
> workqueue_init_early
> init_worker_pool
> init_timer_key
> debug_object_init
>
> No. CPUs objects (CONFIG_HIGH_RES_TIMERS):
> sched_init
> hrtick_rq_init
> hrtimer_init
>
> CONFIG_DEBUG_OBJECTS_WORK:
> No. CPUs x 6 (workqueue) objects:
> workqueue_init_early
> alloc_workqueue
> __alloc_workqueue_key
> alloc_and_link_pwqs
> init_pwq
>
> Also, plus No. CPUs objects:
> perf_event_init
> __init_srcu_struct
> init_srcu_struct_fields
> init_srcu_struct_nodes
> __init_work
>
> However, none of the things are actually used or required beofre
> debug_objects_mem_init() is invoked.
>
> According to tglx,
> "the reason why the call is at this place in start_kernel() is
> historical. It's because back in the days when debugobjects were added
> the memory allocator was enabled way later than today. So we can just
> move the debug_objects_mem_init() call right before sched_init()."
>
> Afterwards, when calling debug_objects_mem_init(), interrupts have
> already been disabled and lockdep_init() will only be called later, so
> no need to worry about interrupts in
> debug_objects_replace_static_objects().
>
> Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Qian Cai <cai@xxxxxx>
> ---
> init/main.c | 3 ++-
> lib/debugobjects.c | 8 --------
> 2 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/init/main.c b/init/main.c
> index ee147103ba1b..f2c35dc50851 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -600,6 +600,8 @@ asmlinkage __visible void __init start_kernel(void)
> /* trace_printk can be enabled here */
> early_trace_init();
>
> + debug_objects_mem_init();
> +
> /*
> * Set up the scheduler prior starting any interrupts (such as the
> * timer interrupt). Full topology setup happens at smp_init()
> @@ -697,7 +699,6 @@ asmlinkage __visible void __init start_kernel(void)
> #endif
> page_ext_init();
> kmemleak_init();
> - debug_objects_mem_init();
> setup_per_cpu_pageset();
> numa_policy_init();
> acpi_early_init();
> diff --git a/lib/debugobjects.c b/lib/debugobjects.c
> index 70935ed91125..cc5818ced652 100644
> --- a/lib/debugobjects.c
> +++ b/lib/debugobjects.c
> @@ -1132,13 +1132,6 @@ static int __init debug_objects_replace_static_objects(void)
> hlist_add_head(&obj->node, &objects);
> }
>
> - /*
> - * When debug_objects_mem_init() is called we know that only
> - * one CPU is up, so disabling interrupts is enough
> - * protection. This avoids the lockdep hell of lock ordering.
> - */
> - local_irq_disable();

I think you should have a comment saying that debug_objects_mm_init() is
called early with only one CPU up and interrupt disabled. So it is safe
to replace static objects without any protection.

> -
> /* Remove the statically allocated objects from the pool */
> hlist_for_each_entry_safe(obj, tmp, &obj_pool, node)
> hlist_del(&obj->node);
> @@ -1158,7 +1151,6 @@ static int __init debug_objects_replace_static_objects(void)
> cnt++;
> }
> }
> - local_irq_enable();
>
> pr_debug("%d of %d active objects replaced\n",
> cnt, obj_pool_used);

-Longman