Re: [BUG] perf_event: circular lock dependency

From: Stephane Eranian
Date: Thu Jan 28 2010 - 04:48:44 EST


On Thu, Jan 28, 2010 at 10:32 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Thu, 2010-01-28 at 10:19 +0100, Stephane Eranian wrote:
>
>> On Intel Core, one of my test programs generates this kind of
>> warning when it unmaps the sampling buffer after it has closed
>> the event fds.
>
>> [ 1729.441066] the existing dependency chain (in reverse order) is:
>> [ 1729.441092]
>> [ 1729.441093] -> #1 (&mm->mmap_sem){++++++}:
>> [ 1729.441123]        [<ffffffff81077f97>] validate_chain+0xc17/0x1360
>> [ 1729.441151]        [<ffffffff81078a53>] __lock_acquire+0x373/0xb30
>> [ 1729.441170]        [<ffffffff810792ac>] lock_acquire+0x9c/0x100
>> [ 1729.441189]        [<ffffffff810e74a4>] might_fault+0x84/0xb0
>> [ 1729.441207]        [<ffffffff810c3605>] perf_read+0x135/0x2d0
>> [ 1729.441225]        [<ffffffff8110c604>] vfs_read+0xc4/0x180
>> [ 1729.441245]        [<ffffffff8110ca10>] sys_read+0x50/0x90
>> [ 1729.441263]        [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
>> [ 1729.441284]
>> [ 1729.441284] -> #0 (&ctx->mutex){+.+...}:
>> [ 1729.441313]        [<ffffffff810786cd>] validate_chain+0x134d/0x1360
>> [ 1729.441332]        [<ffffffff81078a53>] __lock_acquire+0x373/0xb30
>> [ 1729.441351]        [<ffffffff810792ac>] lock_acquire+0x9c/0x100
>> [ 1729.441369]        [<ffffffff81442e59>] mutex_lock_nested+0x69/0x340
>> [ 1729.441389]        [<ffffffff810c2ebd>] perf_event_release_kernel+0x2d/0xe0
>> [ 1729.441409]        [<ffffffff810c2f8b>] perf_release+0x1b/0x20
>> [ 1729.441426]        [<ffffffff8110d051>] __fput+0x101/0x230
>> [ 1729.441444]        [<ffffffff8110d457>] fput+0x17/0x20
>> [ 1729.441462]        [<ffffffff810e98d1>] remove_vma+0x51/0x90
>> [ 1729.441480]        [<ffffffff810ea708>] do_munmap+0x2e8/0x340
>> [ 1729.441498]        [<ffffffff810ebac0>] sys_munmap+0x50/0x80
>> [ 1729.441516]        [<ffffffff81002ceb>] system_call_fastpath+0x16/0x1b
>> [ 1729.441535]
>
> Crap, the thing is right.. you've been using group reads, which require
> holding the ctx->mutex to ensure the group doesn't change while you're
> reading it, leading to this inversion thing...
>
Correct, I am using PERF_FORMAT_GROUP and PERF_SAMPLE_READ.
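
For context, the sequence boils down to something like the sketch
below (not my actual test program; the event selection, period and
buffer size are just placeholders): a two-event group with
PERF_FORMAT_GROUP and PERF_SAMPLE_READ, a sampling buffer mmap()ed on
the leader, a group read(), and then the fds are closed before the
buffer is unmapped, so the final fput() happens from inside
do_munmap():

#include <linux/perf_event.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdint.h>
#include <stdio.h>

static int perf_event_open(struct perf_event_attr *attr, pid_t pid,
			   int cpu, int group_fd, unsigned long flags)
{
	return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

int main(void)
{
	struct perf_event_attr attr;
	uint64_t buf[16];
	size_t map_len = (1 + 8) * sysconf(_SC_PAGESIZE);
	int leader, member;
	void *map;

	/* group leader: sampled event, group read format */
	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_CPU_CYCLES;
	attr.sample_period = 100000;
	attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_READ;
	attr.read_format = PERF_FORMAT_GROUP;
	attr.disabled = 1;
	leader = perf_event_open(&attr, 0, -1, -1, 0);

	/* second group member */
	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_HARDWARE;
	attr.config = PERF_COUNT_HW_INSTRUCTIONS;
	attr.read_format = PERF_FORMAT_GROUP;
	member = perf_event_open(&attr, 0, -1, leader, 0);

	if (leader < 0 || member < 0) {
		perror("perf_event_open");
		return 1;
	}

	/* sampling buffer lives on the group leader */
	map = mmap(NULL, map_len, PROT_READ | PROT_WRITE, MAP_SHARED,
		   leader, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* group read: perf_read() takes ctx->mutex, then may fault on
	   buf and take mmap_sem via might_fault() */
	if (read(leader, buf, sizeof(buf)) < 0)
		perror("read");

	/* close the fds first ... */
	close(member);
	close(leader);

	/* ... so this munmap() drops the last reference to the leader:
	   remove_vma() -> fput() -> perf_release() takes ctx->mutex
	   while mmap_sem is held, the reverse of the read() path */
	munmap(map, map_len);
	return 0;
}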

> Not sure where to break this loop though, the hacky way is pushing all
> of perf_event_release_kernel() into a work, but that's yucky.. Let me
> ponder this a bit more.
>
>
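
Just to make sure I understand what you mean by pushing it into a
work: something along the lines of the fragment below? (rough sketch
only; the release_work member of struct perf_event is hypothetical,
I have not tried this)

static void perf_event_release_work(struct work_struct *work)
{
	struct perf_event *event =
		container_of(work, struct perf_event, release_work);

	/* process context, mmap_sem not held, safe to take ctx->mutex */
	perf_event_release_kernel(event);
}

static int perf_release(struct inode *inode, struct file *file)
{
	struct perf_event *event = file->private_data;

	/* defer the ctx->mutex part so it never nests under mmap_sem,
	   even when the final fput() comes from remove_vma() */
	INIT_WORK(&event->release_work, perf_event_release_work);
	schedule_work(&event->release_work);

	return 0;
}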



--
Stephane Eranian | EMEA Software Engineering
Google France | 38 avenue de l'Opéra | 75002 Paris
Tel : +33 (0) 1 42 68 53 00
This email may be confidential or privileged. If you received this
communication by mistake, please don't forward it to anyone else,
please erase all copies and attachments, and please let me know that
it went to the wrong person. Thanks
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/