Re: mm: kernel BUG at include/linux/swapops.h:131!

From: Konstantin Khlebnikov
Date: Thu Dec 26 2013 - 01:19:40 EST


Bob Liu <bob.liu@xxxxxxxxxx> wrote:
>On 12/24/2013 03:45 PM, Joonsoo Kim wrote:
>> On Tue, Dec 24, 2013 at 03:07:05PM +0900, Joonsoo Kim wrote:
>>> On Mon, Dec 23, 2013 at 10:01:10PM -0500, Sasha Levin wrote:
>>>> On 12/23/2013 09:51 PM, Joonsoo Kim wrote:
>>>>> On Mon, Dec 23, 2013 at 12:24:02PM -0500, Sasha Levin wrote:
>>>>>>> Ping?
>>>>>>>
>>>>>>> I've also Cc'ed the "this page shouldn't be locked at all" team.
>>>>> Hello,
>>>>>
>>>>> I can't find the cause of this problem.
>>>>> If it is reproducible, how about bisecting?
>>>>
>>>> While it reproduces under fuzzing it's pretty hard to bisect it with
>>>> the amount of issues uncovered by trinity recently.
>>>>
>>>> I can add any debug code to the site of the BUG if that helps.
>>>
>>> Good!
>>> It will be helpful to add dump_page() in migration_entry_to_page().
>>>
>>> Thanks.
>>>
>>
>> Minchan pointed out to me that there is a possible race condition between
>> fork and migration.
>>
>> Please consider the following situation.
>>
>>
>> Process A (do migration)                    Process B (parents)              Process C (child)
>>
>> try_to_unmap() for migration <begin>        fork
>> setup migration entry to B's vma
>> ...
>> try_to_unmap() for migration <end>
>> move_to_new_page()
>>
>>                                             link new vma
>>                                               into interval tree
>> remove_migration_ptes() <begin>
>> check and clear migration entry on C's vma
>> ...                                         copy_one_pte:
>> ...                                           now, B and C have migration entry
>> ...
>> ...
>> check and clear migration entry on B's vma
>> ...
>> ...
>> remove_migration_ptes() <end>
>>
>>
>> Eventually, a migration entry is left on C's vma.
>> And then, when C exits, the above BUG_ON() can be triggered.
>>
>
>Yes, looks like this is a potential race condition.
>
>> I'm not sure that I am right, so please think about it together. :)
>> And I'm also not sure that the above assumption is related to this
>> trigger report, since this race may have existed for a long time.
>>
>> So my question to mm folks is: is the above assumption possible, and do we
>> have any protection mechanism against this race?
>>
>
>I think we can down_read(&mm->mmap_sem) before remove_migration_ptes()
>to fix this issue, but I don't have time to verify it currently.

Hmm. This kind of race looks impossible: dup_mmap() always places the child's
vma into the rmap tree after the parent's one. For a file vma this is done
explicitly (vma_interval_tree_insert_after); for an anon vma it holds because
the rb-tree insert function goes to the right branch if elements are equal.

Thus remove_migration_ptes() sees the parent's pte first.
If the child has a copy, this function will check it after that.
And they are already synchronized by the parent's and child's pte locks.
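
To illustrate the ordering argument, here is a minimal sketch, not the kernel
code: a plain unbalanced binary search tree stands in for the rb-tree, and
struct node, insert() and walk() are hypothetical names made up for this
example. It only shows that if insertion goes right on equal keys, an in-order
walk visits equal elements in the order they were inserted (rb-tree rotations
do not change the in-order sequence, so rebalancing is left out here).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vma linked into an rmap tree. */
struct node {
	unsigned long key;	/* sort key; equal for parent and child */
	int id;			/* insertion label, for illustration only */
	struct node *left, *right;
};

/* Descend left only when strictly smaller, so equal keys go right. */
static void insert(struct node **root, struct node *n)
{
	assert(n != NULL);
	n->left = n->right = NULL;
	while (*root != NULL)
		root = (n->key < (*root)->key) ? &(*root)->left
					       : &(*root)->right;
	*root = n;
}

/* In-order walk: store ids in visit order, return the new count. */
static size_t walk(const struct node *t, int *out, size_t count)
{
	if (t == NULL)
		return count;
	count = walk(t->left, out, count);
	out[count++] = t->id;
	return walk(t->right, out, count);
}
```

If a "parent" node is inserted first and a "child" node with the same key is
inserted later, walk() reports the parent before the child, which is the
property the argument above relies on.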

Sorry for double posting, gmail cannot into plain text =)

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.