Re: [PATCH v7 00/70] Introducing the Maple Tree

From: Liam Howlett
Date: Mon Apr 25 2022 - 15:59:16 EST


* Yu Zhao <yuzhao@xxxxxxxxxx> [220425 14:06]:
> On Wed, Apr 20, 2022 at 7:43 AM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> >
> > * Yu Zhao <yuzhao@xxxxxxxxxx> [220419 19:23]:
> > > On Tue, Apr 19, 2022 at 5:18 PM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> > > >
> > > > * Yu Zhao <yuzhao@xxxxxxxxxx> [220419 17:59]:
> > > > > On Tue, Apr 19, 2022 at 9:51 AM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > * Yu Zhao <yuzhao@xxxxxxxxxx> [220416 15:30]:
> > > > > > > On Sat, Apr 16, 2022 at 9:19 AM Liam Howlett <liam.howlett@xxxxxxxxxx> wrote:
> > > > > > > >
> > > > > > >
> > > > > > > <snipped>
> > > > > > >
> > > > > > > > How did you hit this issue? Just on boot?
> > > > > > >
> > > > > > > I was hoping this is known to you or you have something I can verify for you.
> > > > > >
> > > > > >
> > > > > > Thanks, yes. I believe that both crashes are the same root cause. The
> > > > > > cause is that I was not cleaning up after the kmem bulk allocation
> > > > > > failure on my side. Please test with this patch.
> > > > >
> > > > > Thanks. I applied this patch and hit a LOCKDEP and then a BUG_ON:
> > > > >
> > > > > lib/maple_tree.c:847 suspicious rcu_dereference_protected() usage!
> > > > > Call Trace:
> > > > > <TASK>
> > > > > dump_stack_lvl+0x6c/0x9a
> > > > > dump_stack+0x10/0x12
> > > > > lockdep_rcu_suspicious+0x12c/0x140
> > > > > __mt_destroy+0x96/0xd0
> > > > > exit_mmap+0x2a0/0x360
> > > > > __mmput+0x34/0x100
> > > > > mmput+0x2f/0x40
> > > > > free_bprm+0x64/0xe0
> > > > > kernel_execve+0x129/0x330
> > > > > call_usermodehelper_exec_async+0xd8/0x130
> > > > > ? proc_cap_handler+0x210/0x210
> > > > > ret_from_fork+0x1f/0x30
> > > > > </TASK>
> > > >
> > > > Thanks - I'm not sure how this got through, but this should fix it.
> > > >
> > > > This should be added to 4236a642ad185 to avoid the LOCKDEP issue.
> > > >
> > > > --- a/mm/mmap.c
> > > > +++ b/mm/mmap.c
> > > > @@ -3163,9 +3163,9 @@ void exit_mmap(struct mm_struct *mm)
> > > >
> > > > BUG_ON(count != mm->map_count);
> > > >
> > > > - mmap_write_unlock(mm);
> > > > trace_exit_mmap(mm);
> > > > __mt_destroy(&mm->mm_mt);
> > > > + mmap_write_unlock(mm);
> > > > vm_unacct_memory(nr_accounted);
> > > > }
> > >
> > > Will try this.
> >
> >
> > Andrew,
> >
> > Please add this fix to the commit 4236a642ad185 "mm: start tracking VMAs
> > with maple tree"
> >
> > I've attached the patch for your convenience.
>
> Hi Liam,
>
> I assume you are still looking at the BUG_ON problem. I'll restart my
> testing once you have something for me to try.
>
> Thanks.

No, the above fix stopped the suspicious RCU dereference. I've found
another issue in the mlock() code, which I've also fixed, but I needed
to move my allocations out from under the i_mmap rwsem lock, as
allocating there triggers a potential lockdep issue under high memory
usage: lockdep complains about the fs-reclaim lock. I have a patch set
that works, but I'm still working through making it bisectable. I think
the easiest thing is to integrate these fixes and the others sent to
Andrew into a v8. I hope to have this done by the end of the day
tomorrow.

Thanks,
Liam