Re: [PATCH] slab: fix the DEADLOCK issue on l3 alien lock

From: Pekka Enberg
Date: Tue Sep 11 2012 - 12:29:49 EST


On Tue, Sep 11, 2012 at 5:50 AM, Michael Wang
<wangyun@xxxxxxxxxxxxxxxxxx> wrote:
> On 09/08/2012 04:39 PM, Pekka Enberg wrote:
>> On Fri, Sep 7, 2012 at 1:29 AM, Paul E. McKenney
>> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
>>> On Thu, Sep 06, 2012 at 11:05:11AM +0800, Michael Wang wrote:
>>>> On 09/05/2012 09:55 PM, Christoph Lameter wrote:
>>>>> On Wed, 5 Sep 2012, Michael Wang wrote:
>>>>>
>>>>>> Since the l3 alien locks of cachep and cachep->slabp_cache are in the same
>>>>>> lock class, lockdep generates a false report.
>>>>>
>>>>> Ahh... That is a key insight into why this occurs.
>>>>>
>>>>>> This should not happen since we already have init_lock_keys() which will
>>>>>> reassign the lock class for both l3 list and l3 alien.
>>>>>
>>>>> Right. I was wondering why we still get intermittent reports on this.
>>>>>
>>>>>> This patch invokes init_lock_keys() after enable_cpucache() has finished,
>>>>>> instead of before, to avoid the false DEADLOCK report.
>>>>>
>>>>> Acked-by: Christoph Lameter <cl@xxxxxxxxx>
>>>>
>>>> Thanks for your review.
>>>>
>>>> I have also added Paul to the cc list (my mailing skills are really poor...).
>>>
>>> Tested-by: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
>>
>> I'd also like to tag this for the stable tree to avoid bogus lockdep
>> reports. How far back in release history should we queue this?
> Hi, Pekka
>
> Sorry for the delayed reply. I tried to find out the reason for commit
> 30765b92 but have not figured it out yet, so I have added Peter to the cc list.
>
> The patch below, for release 3.0.0, is the one that causes the bogus report.
>
> commit 30765b92ada267c5395fc788623cb15233276f5c
> Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Date: Thu Jul 28 23:22:56 2011 +0200
>
> slab, lockdep: Annotate the locks before using them
>
> Fernando found we hit the regular OFF_SLAB 'recursion' before we
> annotate the locks, cure this.
>
> The relevant portion of the stack-trace:
>
> > [ 0.000000] [<c085e24f>] rt_spin_lock+0x50/0x56
> > [ 0.000000] [<c04fb406>] __cache_free+0x43/0xc3
> > [ 0.000000] [<c04fb23f>] kmem_cache_free+0x6c/0xdc
> > [ 0.000000] [<c04fb2fe>] slab_destroy+0x4f/0x53
> > [ 0.000000] [<c04fb396>] free_block+0x94/0xc1
> > [ 0.000000] [<c04fc551>] do_tune_cpucache+0x10b/0x2bb
> > [ 0.000000] [<c04fc8dc>] enable_cpucache+0x7b/0xa7
> > [ 0.000000] [<c0bd9d3c>] kmem_cache_init_late+0x1f/0x61
> > [ 0.000000] [<c0bba687>] start_kernel+0x24c/0x363
> > [ 0.000000] [<c0bba0ba>] i386_start_kernel+0xa9/0xaf
>
> Reported-by: Fernando Lopez-Lezcano <nando@xxxxxxxxxxxxxxxxxx>
> Acked-by: Pekka Enberg <penberg@xxxxxxxxxx>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Link: http://lkml.kernel.org/r/1311888176.2617.379.camel@laptop
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>
> It moved init_lock_keys() to before the alien caches are built, so we fail to
> reclassify the alien locks.
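
The ordering problem described above can be sketched in user-space C. This
is only an illustration under stated assumptions: struct sim_cache, the
lock_class enum and all sim_* functions are hypothetical stand-ins, not the
kernel's slab or lockdep API. The sketch assumes that the init_lock_keys()
step skips any cache whose alien locks do not exist yet, and that the
enable_cpucache() step creates the alien locks in the default class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock classes; real lockdep uses lock_class_key objects. */
enum lock_class { CLASS_DEFAULT = 0, CLASS_ON_SLAB };

/* Hypothetical, heavily reduced model of a slab cache. */
struct sim_cache {
    bool off_slab;               /* slab management kept in another cache */
    bool alien_built;            /* alien locks exist only after enable */
    enum lock_class alien_class; /* class of the alien lock */
};

/* Models enable_cpucache(): builds the alien locks in the default class. */
static void sim_enable_cpucache(struct sim_cache *c)
{
    assert(c != NULL);
    c->alien_built = true;
    c->alien_class = CLASS_DEFAULT;
}

/*
 * Models init_lock_keys(): moves the alien lock of an on-slab cache into
 * its own class. Assumption: a cache without alien locks is skipped.
 */
static void sim_init_lock_keys(struct sim_cache *c)
{
    if (c->off_slab || !c->alien_built)
        return;
    c->alien_class = CLASS_ON_SLAB;
}

/*
 * Returns true when cachep and cachep->slabp_cache end up with alien locks
 * in the same class, which is the condition for the bogus report.
 */
static bool sim_bogus_report(bool keys_first)
{
    struct sim_cache slabp_cache = { .off_slab = false };
    struct sim_cache cachep = { .off_slab = true };

    if (keys_first) {
        /* Ordering after commit 30765b92: annotate, then build. */
        sim_init_lock_keys(&slabp_cache);
        sim_init_lock_keys(&cachep);
        sim_enable_cpucache(&slabp_cache);
        sim_enable_cpucache(&cachep);
    } else {
        /* Ordering restored by the patch: build, then annotate. */
        sim_enable_cpucache(&slabp_cache);
        sim_enable_cpucache(&cachep);
        sim_init_lock_keys(&slabp_cache);
        sim_init_lock_keys(&cachep);
    }
    return cachep.alien_class == slabp_cache.alien_class;
}
```

Calling sim_bogus_report(true) models the ordering that 30765b92 introduced,
and sim_bogus_report(false) models the ordering this patch restores.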

I've queued the patch for v3.7. Thanks!