Re: [PATCH 3/3] mm, thp: Do not loose dirty bit in __split_huge_pmd_locked()

From: Kirill A. Shutemov
Date: Thu Jun 15 2017 - 04:47:12 EST


On Wed, Jun 14, 2017 at 05:31:31PM +0200, Andrea Arcangeli wrote:
> Hello,
>
> On Wed, Jun 14, 2017 at 04:18:57PM +0200, Martin Schwidefsky wrote:
> > Could we change pmdp_invalidate to make it return the old pmd entry?
>
> That to me seems the simplest fix to avoid losing the dirty bit.
>
> I earlier suggested to replace pmdp_invalidate with something like
> old_pmd = pmdp_establish(pmd_mknotpresent(pmd)) (then tlb flush could
> then be conditional to the old pmd being present). Making
> pmdp_invalidate return the old pmd entry would be mostly equivalent to
> that.
>
> The advantage of not changing pmdp_invalidate is that we could skip a
> xchg which is more costly in __split_huge_pmd_locked and
> madvise_free_huge_pmd so perhaps there's a point to keep a variant of
> pmdp_invalidate that doesn't use xchg internally (and in turn can't
> return the old pmd value atomically).
>
> If we don't want new messy names like pmdp_establish we could have a
> __pmdp_invalidate that returns void, and pmdp_invalidate that returns
> the old pmd and uses xchg (and it'd also be backwards compatible as
> far as the callers are concerned). So those places that don't need the
> old value returned and can skip the xchg, could simply
> s/pmdp_invalidate/__pmdp_invalidate/ to optimize.

We have only a few pmdp_invalidate() callers:

- clear_soft_dirty_pmd();
- madvise_free_huge_pmd();
- change_huge_pmd();
- __split_huge_pmd_locked();

Only madvise_free_huge_pmd() doesn't care about the old pmd.

__split_huge_pmd_locked() actually needs to check the dirty bit after
pmdp_invalidate(), see patch 3/3 of the patchset.

I don't think it's worth introducing one more primitive only for
madvise_free_huge_pmd().

I'll stick with a single pmdp_invalidate() that returns the old value.
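
To illustrate why returning the old value matters, here is a rough
user-space sketch, not kernel code. It models a pmd slot as a C11 atomic
and assumes the exchange-based approach discussed above. All names
(model_pmdp_invalidate(), MODEL_PMD_*) and the bit positions are
hypothetical and chosen only for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits for this model; not the real x86 layout. */
#define MODEL_PMD_PRESENT (UINT64_C(1) << 0)
#define MODEL_PMD_DIRTY   (UINT64_C(1) << 6)

typedef _Atomic uint64_t pmd_slot_t;

/*
 * Store a not-present copy of 'snapshot' and return whatever was in the
 * slot at the moment of the exchange. The return value therefore includes
 * any bits set between the caller's read and the exchange.
 */
static uint64_t model_pmdp_invalidate(pmd_slot_t *slot, uint64_t snapshot)
{
    return atomic_exchange(slot, snapshot & ~MODEL_PMD_PRESENT);
}

/*
 * Simulate the race: the entry is dirtied after the caller sampled it.
 * Returns true if the dirty bit shows up in the returned old value and
 * the slot ends up not present.
 */
static bool model_dirty_survives(void)
{
    pmd_slot_t slot;
    atomic_init(&slot, MODEL_PMD_PRESENT);

    uint64_t snapshot = atomic_load(&slot);      /* caller reads the entry */
    atomic_fetch_or(&slot, MODEL_PMD_DIRTY);     /* "hardware" sets dirty  */

    uint64_t old = model_pmdp_invalidate(&slot, snapshot);

    return (old & MODEL_PMD_DIRTY) &&
           !(atomic_load(&slot) & MODEL_PMD_PRESENT);
}
```

In this model the snapshot has no dirty bit, but the value returned by
the exchange does, so a caller that checks the returned value would not
miss it.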

--
Kirill A. Shutemov