Re: [PATCH] mm/slub: fix a deadlock in shuffle_freelist()

From: Qian Cai
Date: Thu Sep 26 2019 - 08:29:40 EST


On Wed, 2019-09-25 at 18:45 +0200, Peter Zijlstra wrote:
> On Wed, Sep 25, 2019 at 11:18:47AM -0400, Qian Cai wrote:
> > On Wed, 2019-09-25 at 11:31 +0200, Peter Zijlstra wrote:
> > > On Fri, Sep 13, 2019 at 12:27:44PM -0400, Qian Cai wrote:
> > > > -> #3 (batched_entropy_u32.lock){-.-.}:
> > > > lock_acquire+0x31c/0x360
> > > > _raw_spin_lock_irqsave+0x7c/0x9c
> > > > get_random_u32+0x6c/0x1dc
> > > > new_slab+0x234/0x6c0
> > > > ___slab_alloc+0x3c8/0x650
> > > > kmem_cache_alloc+0x4b0/0x590
> > > > __debug_object_init+0x778/0x8b4
> > > > debug_object_init+0x40/0x50
> > > > debug_init+0x30/0x29c
> > > > hrtimer_init+0x30/0x50
> > > > init_dl_task_timer+0x24/0x44
> > > > __sched_fork+0xc0/0x168
> > > > init_idle+0x78/0x26c
> > > > fork_idle+0x12c/0x178
> > > > idle_threads_init+0x108/0x178
> > > > smp_init+0x20/0x1bc
> > > > kernel_init_freeable+0x198/0x26c
> > > > kernel_init+0x18/0x334
> > > > ret_from_fork+0x10/0x18
> > > >
> > > > -> #2 (&rq->lock){-.-.}:
> > >
> > > This relation is silly..
> > >
> > > I suspect the below 'works'...
> >
> > Unfortunately, the relation is still there,
> >
> > copy_process()->rt_mutex_init_task()->"&p->pi_lock"
> >
> > [24438.676716][    T2] -> #2 (&rq->lock){-.-.}:
> > [24438.676727][    T2]        __lock_acquire+0x5b4/0xbf0
> > [24438.676736][    T2]        lock_acquire+0x130/0x360
> > [24438.676754][    T2]        _raw_spin_lock+0x54/0x80
> > [24438.676771][    T2]        task_fork_fair+0x60/0x190
> > [24438.676788][    T2]        sched_fork+0x128/0x270
> > [24438.676806][    T2]        copy_process+0x7a4/0x1bf0
> > [24438.676823][    T2]        _do_fork+0xac/0xac0
> > [24438.676841][    T2]        kernel_thread+0x70/0xa0
> > [24438.676858][    T2]        rest_init+0x4c/0x42c
> > [24438.676884][    T2]        start_kernel+0x778/0x7c0
> > [24438.676902][    T2]        start_here_common+0x1c/0x334
>
> That's the 'where we took #2 while holding #1' stacktrace and not
> relevant to our discussion.

Oh, you were talking about taking #3 while holding #2. Anyway, your patch has
been working fine so far. Care to post/merge it officially, or do you want me to
post it?
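
To make the shape of the report easier to follow, below is a minimal userspace
sketch (plain pthreads, not kernel code) of the ABBA ordering lockdep is
warning about. The lock names stand in for rq->lock and the batched_entropy
lock, and the second path is purely illustrative, since the splat only shows
one direction of the cycle:

/*
 * Illustration only: deliberately deadlocks once both threads win their
 * first lock.  "rq_lock"/"entropy_lock" stand in for rq->lock and
 * batched_entropy_u32.lock; the real first path runs through
 * init_idle() -> __sched_fork() -> hrtimer_init() -> debugobjects ->
 * new_slab()/shuffle_freelist() -> get_random_u32().
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t rq_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t entropy_lock = PTHREAD_MUTEX_INITIALIZER;

/* Path A: take rq_lock, then end up needing the entropy lock. */
static void *fork_idle_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&rq_lock);
	sleep(1);				/* widen the race window */
	pthread_mutex_lock(&entropy_lock);	/* blocks once path B holds it */
	puts("fork_idle_path: got both locks");
	pthread_mutex_unlock(&entropy_lock);
	pthread_mutex_unlock(&rq_lock);
	return NULL;
}

/* Path B: the opposite order, standing in for whatever closes the cycle
 * on the entropy side in the real report. */
static void *entropy_path(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&entropy_lock);
	sleep(1);
	pthread_mutex_lock(&rq_lock);		/* classic ABBA: both sides now wait */
	puts("entropy_path: got both locks");
	pthread_mutex_unlock(&rq_lock);
	pthread_mutex_unlock(&entropy_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, fork_idle_path, NULL);
	pthread_create(&b, NULL, entropy_path, NULL);
	pthread_join(a, NULL);	/* with both sleeps hit, neither join returns */
	pthread_join(b, NULL);
	return 0;
}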

>
> > [24438.675836][    T2] -> #4 (batched_entropy_u64.lock){-...}:
> > [24438.675860][    T2]        __lock_acquire+0x5b4/0xbf0
> > [24438.675878][    T2]        lock_acquire+0x130/0x360
> > [24438.675906][    T2]        _raw_spin_lock_irqsave+0x70/0xa0
> > [24438.675923][    T2]        get_random_u64+0x60/0x100
> > [24438.675944][    T2]        add_to_free_area_random+0x164/0x1b0
> > [24438.675962][    T2]        free_one_page+0xb24/0xcf0
> > [24438.675980][    T2]        __free_pages_ok+0x448/0xbf0
> > [24438.675999][    T2]        deferred_init_maxorder+0x404/0x4a4
> > [24438.676018][    T2]        deferred_grow_zone+0x158/0x1f0
> > [24438.676035][    T2]        get_page_from_freelist+0x1dc8/0x1e10
> > [24438.676063][    T2]        __alloc_pages_nodemask+0x1d8/0x1940
> > [24438.676083][    T2]        allocate_slab+0x130/0x2740
> > [24438.676091][    T2]        new_slab+0xa8/0xe0
> > [24438.676101][    T2]        kmem_cache_open+0x254/0x660
> > [24438.676119][    T2]        __kmem_cache_create+0x44/0x2a0
> > [24438.676136][    T2]        create_boot_cache+0xcc/0x110
> > [24438.676154][    T2]        kmem_cache_init+0x90/0x1f0
> > [24438.676173][    T2]        start_kernel+0x3b8/0x7c0
> > [24438.676191][    T2]        start_here_common+0x1c/0x334
> > [24438.676208][    T2]
> > [24438.676208][    T2] -> #3 (&(&zone->lock)->rlock){-.-.}:
> > [24438.676221][    T2]        __lock_acquire+0x5b4/0xbf0
> > [24438.676247][    T2]        lock_acquire+0x130/0x360
> > [24438.676264][    T2]        _raw_spin_lock+0x54/0x80
> > [24438.676282][    T2]        rmqueue_bulk.constprop.23+0x64/0xf20
> > [24438.676300][    T2]        get_page_from_freelist+0x718/0x1e10
> > [24438.676319][    T2]        __alloc_pages_nodemask+0x1d8/0x1940
> > [24438.676339][    T2]        alloc_page_interleave+0x34/0x170
> > [24438.676356][    T2]        allocate_slab+0xd1c/0x2740
> > [24438.676374][    T2]        new_slab+0xa8/0xe0
> > [24438.676391][    T2]        ___slab_alloc+0x580/0xef0
> > [24438.676408][    T2]        __slab_alloc+0x64/0xd0
> > [24438.676426][    T2]        kmem_cache_alloc+0x5c4/0x6c0
> > [24438.676444][    T2]        fill_pool+0x280/0x540
> > [24438.676461][    T2]        __debug_object_init+0x60/0x6b0
> > [24438.676479][    T2]        hrtimer_init+0x5c/0x310
> > [24438.676497][    T2]        init_dl_task_timer+0x34/0x60
> > [24438.676516][    T2]        __sched_fork+0x8c/0x110
> > [24438.676535][    T2]        init_idle+0xb4/0x3c0
> > [24438.676553][    T2]        idle_thread_get+0x78/0x120
> > [24438.676572][    T2]        bringup_cpu+0x30/0x230
> > [24438.676590][    T2]        cpuhp_invoke_callback+0x190/0x1580
> > [24438.676618][    T2]        do_cpu_up+0x248/0x460
> > [24438.676636][    T2]        smp_init+0x118/0x1c0
> > [24438.676662][    T2]        kernel_init_freeable+0x3f8/0x8dc
> > [24438.676681][    T2]        kernel_init+0x2c/0x154
> > [24438.676699][    T2]        ret_from_kernel_thread+0x5c/0x74
> > [24438.676716][    T2]
> > [24438.676716][    T2] -> #2 (&rq->lock){-.-.}:
>
> This then shows we now have:
>
> rq->lock
> zone->lock.rlock
> batched_entropy_u64.lock
>
> Which, to me, appears to be distinctly different from the last time,
> which was:
>
> rq->lock
> batched_entropy_u32.lock
>
> Notable: "u32" != "u64".
>
> But #3 has:
>
> > [24438.676516][    T2]        __sched_fork+0x8c/0x110
> > [24438.676535][    T2]        init_idle+0xb4/0x3c0
>
> Which seems to suggest you didn't actually apply the patch; or rather,
> if you did, I'm not immediately seeing where #2 is acquired.
>
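
For what it's worth, the way those "-> #N" chains are built is by recording
which lock was taken while which other lock was already held, and flagging any
new acquisition that would close a cycle. Below is a toy userspace illustration
of that idea (not the real lockdep), fed with the chain quoted above; the
final, cycle-closing edge is hypothetical here, since the rest of the real
chain is not quoted:

/*
 * Toy lock-order tracker, illustration only.  The first two edges come
 * from the splat quoted above; the last edge is made up and exists only
 * to show how an inversion gets flagged.
 */
#include <stdio.h>

enum { RQ_LOCK, ZONE_LOCK, ENTROPY_U64_LOCK, NLOCKS };

static const char *name[NLOCKS] = {
	"rq->lock", "zone->lock", "batched_entropy_u64.lock"
};

/* order[a][b] != 0 means "b has been taken while a was held" */
static int order[NLOCKS][NLOCKS];

/* Is "to" reachable from "from" through the recorded dependencies? */
static int reaches(int from, int to)
{
	if (from == to)
		return 1;
	for (int i = 0; i < NLOCKS; i++)
		if (order[from][i] && reaches(i, to))
			return 1;
	return 0;
}

static void note_order(int held, int next)
{
	if (reaches(next, held))
		printf("cycle: taking %s while holding %s inverts the recorded order\n",
		       name[next], name[held]);
	order[held][next] = 1;
}

int main(void)
{
	/* edges from the splat quoted above */
	note_order(RQ_LOCK, ZONE_LOCK);
	note_order(ZONE_LOCK, ENTROPY_U64_LOCK);

	/* hypothetical closing edge, just to trigger the report */
	note_order(ENTROPY_U64_LOCK, RQ_LOCK);
	return 0;
}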