Re: [PATCH] mm: slub: fix a deadlock warning in kmem_cache_destroy

From: Xin Long
Date: Tue Jan 18 2022 - 03:01:07 EST


On Mon, Jan 17, 2022 at 9:13 PM Juri Lelli <juri.lelli@xxxxxxxxxx> wrote:
>
> Hi,
>
> On 17/01/22 13:40, Vlastimil Babka wrote:
> > +CC Clark
> >
> > On 1/17/22 10:33, Sebastian Andrzej Siewior wrote:
> > > On 2022-01-17 16:32:46 [+0800], Xin Long wrote:
> > >> another issue. From the code analysis, this issue does exist on the
> > >> upstream kernel, though I couldn't build an upstream RT kernel for the
> > >> testing.
> > >
> > > This should also reproduce in v5.16 since the commit in question is
> > > there.
> >
> > Yeah. I remember we had some issues with the commit during development, but
> > I'd hope those were resolved and the commit that's ultimately merged got the
> > fixes, see this subthread:
> >
> > https://lore.kernel.org/all/0b36128c-3e12-77df-85fe-a153a714569b@xxxxxxxxxxx/
> >
> > >> > >           CPU0                        CPU1
> > >> > >           ----                        ----
> > >> > >   cpus_read_lock()
> > >> > >                                    kn->active++
> > >> > >                                    cpus_read_lock() [a]
> > >> > >   wait until kn->active == 0
> > >> > >
> > >> > > Although cpu_hotplug_lock is a RWSEM, [a] will not block in there. But as
> > >> > > lockdep annotations are added for cpu_hotplug_lock, a deadlock warning
> > >> > > would be detected:
> > >
> > > The cpu_hotplug_lock is a per-CPU RWSEM. The lock in [a] will block if
> > > there is a writer pending.
> > >
> > >> > > ======================================================
> > >> > > WARNING: possible circular locking dependency detected
> > >> > > ------------------------------------------------------
> > >> > > dmsetup/1832 is trying to acquire lock:
> > >> > > ffff986f5a0f9f20 (kn->count#144){++++}-{0:0}, at: kernfs_remove+0x1d/0x30
> > >> > >
> > >> > > but task is already holding lock:
> > >> > > ffffffffa43817c0 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_destroy+0x2a/0x120
> > >> > >
> > >
> > > I tried to create & destroy a crypt target, which creates/destroys a
> > > cache via bio_put_slab(). Either the call chain is different or
> > > something else is, but I didn't see a lockdep warning.
> >
> > RHEL-8 kernel seems to be 4.18, unless RT uses a newer one. Could be some
> > silently relevant backport is missing? How about e.g. 59450bbc12be ("mm,
> > slab, slub: stop taking cpu hotplug lock") ?
>
> Hummm, looks like we have backported commit 59450bbc12be in RHEL-8.
>
> Xin Long, would you be able to check if you still see the lockdep splat
> with latest upstream RT?
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git linux-5.16.y-rt

Hi Juri,

Thanks for sharing the RT kernel repo.

I just tried this kernel, and I couldn't reproduce the warning in my
environment. But I don't see how the upstream RT kernel can avoid the
call trace.

As the warning was triggered while the system was shutting down, it may
simply not reproduce on this kernel because of a timing change.
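
To illustrate why timing matters here: lockdep only reports a cycle once
it has recorded both lock orderings at runtime. Below is a minimal
user-space sketch of that kind of ordering check, not kernel code. The
enum values, struct names and helper functions are all hypothetical and
only model the CPU0/CPU1 scenario quoted above, treating the wait on
kn->active as a lock acquisition the way the lockdep annotation does.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for illustration; not kernel identifiers. */
enum lock_class { CPU_HOTPLUG_LOCK, KN_ACTIVE, NR_CLASSES };

#define MAX_HELD 8

/* edge[a][b] is set once some task has taken b while holding a. */
struct dep_graph {
    bool edge[NR_CLASSES][NR_CLASSES];
};

/* Locks currently held by one task, in acquisition order. */
struct task {
    enum lock_class held[MAX_HELD];
    int depth;
};

static void dep_graph_init(struct dep_graph *g)
{
    memset(g, 0, sizeof(*g));
}

/* Depth-first search: can 'to' be reached from 'from' via recorded edges? */
static bool reachable(const struct dep_graph *g, int from, int to, bool *seen)
{
    if (from == to)
        return true;
    seen[from] = true;
    for (int next = 0; next < NR_CLASSES; next++)
        if (g->edge[from][next] && !seen[next] &&
            reachable(g, next, to, seen))
            return true;
    return false;
}

/*
 * Record that task t takes lock class c. Returns true if the new
 * "held -> c" dependency closes a cycle with orderings seen earlier,
 * which is the point where a lockdep-style checker would warn.
 */
static bool acquire(struct dep_graph *g, struct task *t, enum lock_class c)
{
    bool cycle = false;

    for (int i = 0; i < t->depth; i++) {
        bool seen[NR_CLASSES] = { false };

        /* Adding held[i] -> c is circular if c already leads to held[i]. */
        if (reachable(g, c, t->held[i], seen))
            cycle = true;
        g->edge[t->held[i]][c] = true;
    }
    if (t->depth < MAX_HELD)
        t->held[t->depth++] = c;
    return cycle;
}

/* Drop every lock the task holds; recorded orderings stay in the graph. */
static void release_all(struct task *t)
{
    t->depth = 0;
}
```

In this model, a task taking CPU_HOTPLUG_LOCK and then KN_ACTIVE (the
CPU0 side) records one ordering without a warning. The warning only
fires when another task later takes KN_ACTIVE and then CPU_HOTPLUG_LOCK
(the CPU1 side). If the second ordering never happens during a run,
nothing is reported, even though the code paths exist.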

>
> Thanks!
> Juri
>