Re: [PATCH] mm: slub: fix a deadlock warning in kmem_cache_destroy

From: Juri Lelli
Date: Mon Jan 17 2022 - 08:13:16 EST


Hi,

On 17/01/22 13:40, Vlastimil Babka wrote:
> +CC Clark
>
> On 1/17/22 10:33, Sebastian Andrzej Siewior wrote:
> > On 2022-01-17 16:32:46 [+0800], Xin Long wrote:
> >> another issue. From the code analysis, this issue does exist on the
> >> upstream kernel, though I couldn't build an upstream RT kernel for the
> >> testing.
> >
> > This should also reproduce in v5.16 since the commit in question is
> > there.
>
> Yeah. I remember we had some issues with the commit during development, but
> I'd hope those were resolved and the commit that's ultimately merged got the
> fixes, see this subthread:
>
> https://lore.kernel.org/all/0b36128c-3e12-77df-85fe-a153a714569b@xxxxxxxxxxx/
>
> >> > >       CPU0                        CPU1
> >> > >       ----                        ----
> >> > >  cpus_read_lock()
> >> > >                                    kn->active++
> >> > >                                    cpus_read_lock() [a]
> >> > >  wait until kn->active == 0
> >> > >
> >> > > Although cpu_hotplug_lock is a RWSEM, [a] will not block in there. But as
> >> > > lockdep annotations are added for cpu_hotplug_lock, a deadlock warning
> >> > > would be detected:
> >
> > The cpu_hotplug_lock is a per-CPU RWSEM. The lock in [a] will block if
> > there is a writer pending.
> >
> >> > > ======================================================
> >> > > WARNING: possible circular locking dependency detected
> >> > > ------------------------------------------------------
> >> > > dmsetup/1832 is trying to acquire lock:
> >> > > ffff986f5a0f9f20 (kn->count#144){++++}-{0:0}, at: kernfs_remove+0x1d/0x30
> >> > >
> >> > > but task is already holding lock:
> >> > > ffffffffa43817c0 (slab_mutex){+.+.}-{3:3}, at: kmem_cache_destroy+0x2a/0x120
> >> > >
> >
> > I tried to create & destroy a cryptarget which creates/destroy a cache
> > via bio_put_slab(). Either the callchain is different or something else
> > is but I didn't see a lockdep warning.
>
> RHEL-8 kernel seems to be 4.18, unless RT uses a newer one. Could be some
> silently relevant backport is missing? How about e.g. 59450bbc12be ("mm,
> slab, slub: stop taking cpu hotplug lock") ?

Hmm, it looks like commit 59450bbc12be has already been backported to RHEL-8.

Xin Long, would you be able to check whether you still see the lockdep splat
with the latest upstream RT tree?

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git linux-5.16.y-rt

Thanks!
Juri