Re: [PATCH v6 1/4] mm/slub: enable debugging memory wasting of kmalloc

From: Feng Tang
Date: Thu Nov 03 2022 - 04:20:33 EST


On Thu, Nov 03, 2022 at 07:45:49AM +0000, John Thomson wrote:
> On Thu, 3 Nov 2022, at 07:18, Feng Tang wrote:
> > On Wed, Nov 02, 2022 at 04:16:06PM +0900, Hyeonggon Yoo wrote:
> >> On Wed, Nov 02, 2022 at 02:08:09PM +0800, Feng Tang wrote:
> > [...]
> >> > > transfer started ......................................... transfer ok, time=2.11s
> >> > > setting up elf image... OK
> >> > > jumping to kernel code
> >> > > zimage at: 80B842A0 810B4BC0
> >> > >
> >> > > Uncompressing Linux at load address 80001000
> >> > >
> >> > > Copy device tree to address 80B80EE0
> >> > >
> >> > > Now, booting the kernel...
> >> > >
> >> > > [ 0.000000] Linux version 6.1.0-rc3+ (john@john) (mipsel-buildroot-linux-gnu-gcc.br_real (Buildroot 2021.11-4428-g6b6741b) 12.2.0, GNU ld (GNU Binutils) 2.39) #73 SMP Wed Nov 2 05:10:01 AEST 2022
> >> > > [ 0.000000] ------------[ cut here ]------------
> >> > > [ 0.000000] WARNING: CPU: 0 PID: 0 at mm/slub.c:3416 kmem_cache_alloc+0x5a4/0x5e8
> >> > > [ 0.000000] Modules linked in:
> >> > > [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc3+ #73
> >> > > [ 0.000000] Stack : 810fff78 80084d98 00000000 00000004 00000000 00000000 80889d04 80c90000
> >> > > [ 0.000000] 80920000 807bd328 8089d368 80923bd3 00000000 00000001 80889cb0 00000000
> >> > > [ 0.000000] 00000000 00000000 807bd328 8084bcb1 00000002 00000002 00000001 6d6f4320
> >> > > [ 0.000000] 00000000 80c97d3d 80c97d68 fffffffc 807bd328 00000000 00000000 00000000
> >> > > [ 0.000000] 00000000 a0000000 80910000 8110a0b4 00000000 00000020 80010000 80010000
> >> > > [ 0.000000] ...
> >> > > [ 0.000000] Call Trace:
> >> > > [ 0.000000] [<80008260>] show_stack+0x28/0xf0
> >> > > [ 0.000000] [<8070c958>] dump_stack_lvl+0x60/0x80
> >> > > [ 0.000000] [<8002e184>] __warn+0xc4/0xf8
> >> > > [ 0.000000] [<8002e210>] warn_slowpath_fmt+0x58/0xa4
> >> > > [ 0.000000] [<801c0fac>] kmem_cache_alloc+0x5a4/0x5e8
> >> > > [ 0.000000] [<8092856c>] prom_soc_init+0x1fc/0x2b4
> >> > > [ 0.000000] [<80928060>] prom_init+0x44/0xf0
> >> > > [ 0.000000] [<80929214>] setup_arch+0x4c/0x6a8
> >> > > [ 0.000000] [<809257e0>] start_kernel+0x88/0x7c0
> >> > > [ 0.000000]
> >> > > [ 0.000000] ---[ end trace 0000000000000000 ]---
> >> > > [ 0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3
> >> > > [ 0.000000] printk: bootconsole [early0] enabled
> >> > >
> >> > > Thank you for working through this with me.
> >> > > I will try to address the root cause in mt7621.c.
> >> > > It looks like other arch/** soc_device_register users use postcore_initcall, device_initcall,
> >> > > or the ARM DT_MACHINE_START .init_machine. A quick hack to use postcore_initcall in mt7621
> >> > > avoided this zero ptr kmem_cache passed to kmem_cache_alloc_lru.
> >> >
> >> > IIUC, prom_soc_init() is only called once in the kernel. Can
> >> > 'soc_dev_attr' just be defined as a global data structure instead
> >> > of calling kzalloc(), since it is small and contains only 7 pointers?
> >>
> >> But soc_device_register() also uses kmalloc. I think calling it
> >> after slab initialization will be the best solution - if that is correct.
> >
> > Yes, you are right, there are other kmalloc() calls down the call chain.
> >
> > Hi John,
> >
> > Could you verify and submit a patch for your proposal of deferring
> > the call to prom_soc_init()? Thanks.
> >
> > - Feng
>
> Hi Feng,
>
> My proposed mt7621.c changes are RFC here:
> https://lore.kernel.org/lkml/20221103050538.1930758-1-git@xxxxxxxxxxxxxxxxxxxxxxxxxxx/

Great!

> That series lets me boot the v6.1-rc3 kernel. I have only tried it with my config (as sent earlier). If there are other suspect config settings that I should test, please let me know.
> I used device_initcall, but postcore_initcall also works fine.

I'm not sure which initcall level is better, as I lack MIPS platform
knowledge.
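
For what it's worth, below is a rough userspace model (not kernel code)
of why either level should avoid the warning: initcalls at both the
postcore (level 2) and device (level 6) levels run only after the slab
allocator is up, whereas prom_init() time is before it. Every model_*
name is made up for illustration, and the allocator is reduced to a
single "ready" flag, so please treat it as a sketch only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: none of the model_* names are kernel APIs. */

static bool model_slab_ready;   /* stands in for "slab allocator is up" */

/* Model of kzalloc(): refuses to allocate before the allocator is ready. */
static void *model_kzalloc(size_t size)
{
	if (!model_slab_ready)
		return NULL;    /* the real kernel triggered a WARN here */
	return calloc(1, size);
}

struct model_soc_attr {
	const char *family;
};

static struct model_soc_attr *model_registered;

/* Model of the SoC registration step, which needs a working allocator. */
static int model_soc_dev_init(void)
{
	struct model_soc_attr *attr = model_kzalloc(sizeof(*attr));

	if (!attr)
		return -1;
	attr->family = "model-family";
	model_registered = attr;
	return 0;
}

/* Two of the kernel's initcall levels, using the kernel's numbering. */
enum model_level { MODEL_POSTCORE = 2, MODEL_DEVICE = 6 };

struct model_initcall {
	enum model_level level;
	int (*fn)(void);
};

#define MODEL_MAX_INITCALLS 8
static struct model_initcall model_initcalls[MODEL_MAX_INITCALLS];
static size_t model_n_initcalls;

static void model_register_initcall(enum model_level level, int (*fn)(void))
{
	assert(model_n_initcalls < MODEL_MAX_INITCALLS);
	model_initcalls[model_n_initcalls].level = level;
	model_initcalls[model_n_initcalls].fn = fn;
	model_n_initcalls++;
}

/*
 * Model of boot ordering: optional early call (prom_init() time), then
 * the allocator comes up, then initcalls run in ascending level order.
 * Returns the number of steps that failed.
 */
static int model_boot(bool call_early)
{
	int failures = 0;

	model_slab_ready = false;
	model_registered = NULL;

	if (call_early && model_soc_dev_init() != 0)
		failures++;

	model_slab_ready = true;

	for (int level = 0; level <= 7; level++)
		for (size_t i = 0; i < model_n_initcalls; i++)
			if ((int)model_initcalls[i].level == level &&
			    model_initcalls[i].fn() != 0)
				failures++;

	return failures;
}
```

In this model, model_boot(true) with nothing registered reports one
failure, while registering model_soc_dev_init at MODEL_DEVICE (or
MODEL_POSTCORE) and calling model_boot(false) succeeds. It says nothing
about which level is preferable on MIPS, only that both are late enough
for the allocator.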

> I rephrased Vlastimil's explanation and used it in patch 3 description.
> I have not referenced a Fixes tag yet (unsure which/if any I should use)

With older versions, the kernel boots fine without soc_dev_init()
actually being called, and I don't know whether they also need it
to be called.

Thanks,
Feng

> Cheers,
> --
> John Thomson
>