Re: [PATCH] thp: close race between split and zap huge pages

From: Andrea Arcangeli
Date: Thu Apr 17 2014 - 16:16:28 EST


Hi everyone,

On Wed, Apr 16, 2014 at 12:48:56AM +0300, Kirill A. Shutemov wrote:
> - pmd = mm_find_pmd(mm, address);
> - if (!pmd)
> + pgd = pgd_offset(mm, address);
> + if (!pgd_present(*pgd))
> return NULL;
> + pud = pud_offset(pgd, address);
> + if (!pud_present(*pud))
> + return NULL;
> + pmd = pmd_offset(pud, address);

This fix looks good to me; it was another potential source of
trouble that made the BUG_ON flaky. But I think the rmap_walk
out-of-order problem still exists too. Possibly the testcase doesn't
exercise it.

> - if (pmd_none(*pmd))
> + if (!pmd_present(*pmd))
> goto unlock;

pmd_present is a bit slower, but functionally it's equivalent; the
pmd_present check is just more pedantic (it kind of defines the
invariants for what a mapped pmd should look like).

If we add native THP swapout later, !pmd_present would be more
correct for the VM calls to page_check_address_pmd, but something
would need changing anyway if split_huge_page is the callee, as I
don't think we can skip the conversion from trans huge swap entry to
linear swap entries and the pmd2pte conversion.

The main reason that most places that could run into a trans huge pmd
would use pmd_none and never pmd_present is that originally
pmd_present wouldn't check _PAGE_PSE, and _PAGE_PRESENT can
temporarily be cleared with pmdp_invalidate on trans huge pmds. Now
pmd_present is safe too, so there's no problem in using it on trans
huge pmds.
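
To illustrate the point about pmdp_invalidate, here is a minimal
userspace sketch, not kernel code. All toy_* names are hypothetical,
the bit positions are assumed to mirror x86, and the real x86
pmd_present considers further bits that are omitted here. It models
why a present check that ignores _PAGE_PSE would misreport an
invalidated trans huge pmd, while pmd_none and a PSE-aware
pmd_present agree on it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy flag bits; values are assumed to mirror x86, not taken from kernel headers. */
#define TOY_PAGE_PRESENT ((uint64_t)1 << 0)
#define TOY_PAGE_PSE     ((uint64_t)1 << 7)

/* Hypothetical stand-in for pmd_t: one 64-bit entry. */
typedef struct { uint64_t val; } toy_pmd_t;

/* Model of pmd_none(): the entry is completely empty. */
static bool toy_pmd_none(toy_pmd_t pmd)
{
    return pmd.val == 0;
}

/* Model of the original check: only the present bit is considered. */
static bool toy_pmd_present_old(toy_pmd_t pmd)
{
    return (pmd.val & TOY_PAGE_PRESENT) != 0;
}

/* Model of the PSE-aware check: a huge entry counts as present too. */
static bool toy_pmd_present(toy_pmd_t pmd)
{
    return (pmd.val & (TOY_PAGE_PRESENT | TOY_PAGE_PSE)) != 0;
}

/* Model of pmdp_invalidate(): clear only the present bit, keep the rest. */
static toy_pmd_t toy_pmdp_invalidate(toy_pmd_t pmd)
{
    pmd.val &= ~TOY_PAGE_PRESENT;
    return pmd;
}

/* Walk through the three interesting states of an entry. */
void toy_demo(void)
{
    toy_pmd_t empty = { 0 };
    toy_pmd_t huge = { 0x200000u | TOY_PAGE_PRESENT | TOY_PAGE_PSE };
    toy_pmd_t inval = toy_pmdp_invalidate(huge);

    /* An empty entry: both styles of check skip it. */
    assert(toy_pmd_none(empty));
    assert(!toy_pmd_present(empty));

    /* A mapped trans huge pmd: both styles of check accept it. */
    assert(!toy_pmd_none(huge));
    assert(toy_pmd_present(huge));

    /* An invalidated trans huge pmd: only the old check gets it wrong. */
    assert(!toy_pmd_none(inval));
    assert(!toy_pmd_present_old(inval));
    assert(toy_pmd_present(inval));
}
```

In this model, !toy_pmd_none() and toy_pmd_present() give the same
answer for all three states, which is the sense in which the two
checks are interchangeable here.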

So either pmd_none or !pmd_present is fine; the functional fix is the
part above.

Thanks!
Andrea