Re: [linux-next: Tree for Jun 1] __khugepaged_exit rwsem_down_write_failed lockup

From: Sergey Senozhatsky
Date: Thu Jun 02 2016 - 21:29:29 EST


On (06/03/16 10:00), Sergey Senozhatsky wrote:
> a good find by Vlastimil.
>
> Ebru, can you also re-visit __collapse_huge_page_swapin()? it's called
> from collapse_huge_page() under the down_read(&mm->mmap_sem), is there
> any reason to do the nested down_read(&mm->mmap_sem)?
>
> collapse_huge_page()
> ...
> down_read(&mm->mmap_sem);
> result = hugepage_vma_revalidate(mm, vma, address);
> if (result)
> goto out;
>
> pmd = mm_find_pmd(mm, address);
> if (!pmd) {
> result = SCAN_PMD_NULL;
> goto out;
> }
>
> if (allocstall == curr_allocstall && swap != 0) {
> if (!__collapse_huge_page_swapin(mm, vma, address, pmd)) {
> {
> : if (ret & VM_FAULT_RETRY) {
> : down_read(&mm->mmap_sem);
> : ^^^^^^^^^

oh... it's in a loop

for (_address = address; _address < address + HPAGE_PMD_NR*PAGE_SIZE;
pte++, _address += PAGE_SIZE) {
ret = do_swap_page()
if (ret & VM_FAULT_RETRY) {
down_read(&mm->mmap_sem);
^^^^^^^^^
...
}
}

so there can be multiple sem->count++ in __collapse_huge_page_swapin(),
and you don't know how many sem->count-- you need to do later? is this
correct or am I hallucinating?

-ss

> : if (hugepage_vma_revalidate(mm, vma, address))
> : return false;
> : }
> }
>
> up_read(&mm->mmap_sem);
> goto out;
> }
> }
>
> up_read(&mm->mmap_sem);
>
>
>
> so if __collapse_huge_page_swapin() retruns true we have:
> - down_read() twice, up_read() once?
>
> the locking rules here are a bit confusing. (I didn't have my morning coffee yet).
>
> -ss
>