Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility

From: Yang Shi
Date: Wed Jun 12 2019 - 16:04:14 EST




On 6/12/19 11:44 AM, Hugh Dickins wrote:
On Mon, 10 Jun 2019, Yang Shi wrote:
On 6/7/19 8:58 PM, Hugh Dickins wrote:
Yes, that is correct; and correctly placed. But a little more is needed:
see how mm/memory.c's transhuge_vma_suitable() will only allow a pmd to
be used instead of a pte if the vma offset and size permit. smaps should
not report a shmem vma as THPeligible if its offset or size prevent it.

And I see that should also be fixed on anon vmas: at present smaps
reports even a 4kB anon vma as THPeligible, which is not right.
Maybe a test like transhuge_vma_suitable() can be added into
transparent_hugepage_enabled(), to handle anon and shmem together.
I say "like transhuge_vma_suitable()", because that function needs
an address, which here you don't have.
Thanks for the remind. Since we don't have an address I'm supposed we just
need check if the vma's size is big enough or not other than other alignment
check.

And, I'm wondering whether we could reuse transhuge_vma_suitable() by passing
in an impossible address, i.e. -1 since it is not a valid userspace address.
It can be used as and indicator that this call is from THPeligible context.
Perhaps, but sounds like it will abuse and uglify transhuge_vma_suitable()
just for smaps. Would passing transhuge_vma_suitable() the address
((vma->vm_end & HPAGE_PMD_MASK) - HPAGE_PMD_SIZE)
give the the correct answer in all cases?

Yes, it looks better.


The anon offset situation is interesting: usually anon vm_pgoff is
initialized to fit with its vm_start, so the anon offset check passes;
but I wonder what happens after mremap to a different address - does
transhuge_vma_suitable() then prevent the use of pmds where they could
actually be used? Not a Number#1 priority to investigate or fix here!
but a curiosity someone might want to look into.
Will mark on my TODO list.

Even with your changes
ShmemPmdMapped: 4096 kB
THPeligible: 0
will easily be seen: THPeligible reflects whether a huge page can be
allocated and mapped by pmd in that vma; but if something else already
allocated the huge page earlier, it will be mapped by pmd in this vma
if offset and size allow, whatever THPeligible says. We could change
transhuge_vma_suitable() to force ptes in that case, but it would be
a silly change, just to make what smaps shows easier to explain.
Where did this come from? From the commit log? If so it is the example for
the wrong smap output. If that case really happens, I think we could document
it since THPeligible should just show the current status.
Please read again what I explained there: it's not necessarily an example
of wrong smaps output, it's reasonable smaps output for a reasonable case.

Yes, maybe Documentation/filesystems/proc.txt should explain "THPeligble"
a little better - "eligible for allocating THP pages" rather than just
"eligible for THP pages" would be good enough? we don't want to write
a book about the various cases.

Yes, I agree.


Oh, and the "THPeligible" output lines up very nicely there in proc.txt:
could the actual alignment of that 0 or 1 be fixed in smaps itself too?

Sure.

Thanks,
Yang


Thanks,
Hugh