Re: mm: uninterruptable tasks hanged on mmap_sem

From: Dmitry Vyukov
Date: Wed Feb 03 2016 - 08:19:05 EST


On Wed, Feb 3, 2016 at 2:11 PM, Jiri Kosina <jikos@xxxxxxxxxx> wrote:
> On Tue, 2 Feb 2016, Dmitry Vyukov wrote:
>
>> Hello,
>>
>> If the following program is run in a parallel loop, it eventually leaves
>> uninterruptible tasks hung on mmap_sem.
>>
>> [ 4074.740298] sysrq: SysRq : Show Locks Held
>> [ 4074.740780] Showing all locks held in the system:
>> ...
>> [ 4074.762133] 1 lock held by a.out/1276:
>> [ 4074.762427] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff816df89c>] __mm_populate+0x25c/0x350
>> [ 4074.763149] 1 lock held by a.out/1147:
>> [ 4074.763438] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff816b3bbc>] vm_mmap_pgoff+0x12c/0x1b0
>> [ 4074.764164] 1 lock held by a.out/1284:
>> [ 4074.764447] #0: (&mm->mmap_sem){++++++}, at: [<ffffffff816df89c>] __mm_populate+0x25c/0x350
>> [ 4074.765287]
>
> I've now tried to reproduce this on 4.5-rc1 (with the lock_fdc() fix
> applied), and I am not seeing any stuck tasks so far.
>
> Could you please provide more details about the reproduction scenario?
> Namely, how many parallel invocations are you typically running (and how
> many cores does the system in question have), and what is the typical time
> that you need for the problem to appear?


QEMU with 4 cores and 32 parallel processes. Just now it took 20 seconds
(2409 program executions) to hang 2 of them.
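
For anyone trying to reproduce this, a parallel loop of this kind can be
approximated with a small harness along the following lines. This is only
a sketch under assumptions, not the script actually used: the function
name run_parallel, the 5-second hang timeout, and the use of threads to
drive the child processes are all made up for illustration. The defaults
(32 workers, 20 seconds) mirror the numbers above, and ./a.out stands in
for the compiled reproducer.

```python
import subprocess
import sys
import threading
import time


def run_parallel(cmd, workers=32, duration_s=20.0, hang_timeout_s=5.0):
    """Run `cmd` repeatedly in `workers` parallel loops for about `duration_s` seconds.

    Returns (executions, suspected_hung_pids). Both the name and the
    timeout-based hang heuristic are illustrative assumptions.
    """
    deadline = time.monotonic() + duration_s
    lock = threading.Lock()
    executions = 0
    suspected_hung = []

    def worker():
        nonlocal executions
        while time.monotonic() < deadline:
            proc = subprocess.Popen(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
            try:
                proc.wait(timeout=hang_timeout_s)
            except subprocess.TimeoutExpired:
                # Do not kill-and-wait here: a task stuck in uninterruptible
                # sleep cannot be reaped, so waiting again could block forever.
                # Record the PID and stop this loop instead.
                with lock:
                    suspected_hung.append(proc.pid)
                return
            with lock:
                executions += 1

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executions, suspected_hung


# Example (hypothetical path to the reproducer binary):
#   executions, hung = run_parallel(["./a.out"])
```

A child that exceeds the timeout is only a suspect. Its state still has
to be checked separately, for example with SysRq "Show Locks Held" as in
the original report.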