Re: TTM huge page-faults WAS: Re: [RFC PATCH 1/2] x86: Don't let pgprot_modify() change the page encryption bit

From: Koenig, Christian
Date: Tue Sep 24 2019 - 08:03:15 EST


On 11.09.19 at 17:08, Thomas Hellström (VMware) wrote:
> On 9/11/19 4:06 PM, Koenig, Christian wrote:
>> On 11.09.19 at 12:10, Thomas Hellström (VMware) wrote:
>> [SNIP]
>>>>> The problem seen in TTM is that we want to be able to change the
>>>>> vm_page_prot from the fault handler, but it's problematic since we
>>>>> have the mmap_sem typically only in read mode. Hence the fake vma
>>>>> hack. From what I can tell it's reasonably well-behaved, since
>>>>> pte_modify() skips the bits TTM updates, so mprotect() and mremap()
>>>>> work OK. I think split_huge_pmd may run into trouble, but we don't
>>>>> support it (yet) with TTM.
>>>> Ah! I actually ran into this while implementing huge page support for
>>>> TTM and never figured out why that doesn't work. Dropped CPU huge page
>>>> support because of this.
>>> By coincidence, I got slightly sidetracked the other day and started
>>> looking at this as well. Got to the point where I figured out all the
>>> hairy alignment issues and actually got huge_fault() calls, but never
>>> implemented the handler. I think that's definitely something worth
>>> having. Not sure it will work for IO memory, though, (split_huge_pmd
>>> will just skip non-page-backed memory) but if we only support
>>> VM_SHARED (non COW) vmas there's no reason to split the huge pmds
>>> anyway. Definitely something we should have IMO.
>> Well, our primary use case would be IO memory, because system memory is
>> only optionally allocated as huge pages, but we nearly always allocate VRAM
>> in chunks of at least 2MB because we otherwise get a huge performance
>> penalty.
>
> But that system memory option is on by default, right? In any case, a
> request for a huge_fault would probably need to check that there is
> actually an underlying huge page and otherwise fall back to ordinary
> faults.
>
> Another requirement would be for IO memory allocations to be
> PMD_PAGE_SIZE aligned in the mappable aperture, to avoid fallbacks to
> ordinary faults. Probably increasing fragmentation somewhat. (Seems
> like pmd entries can only point to PMD_PAGE_SIZE aligned physical
> addresses) Would that work for you?

Yeah, we do it this way anyway.
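
To make the two conditions discussed above concrete (an underlying huge
page must exist, and the backing address must be PMD-size aligned), here
is a minimal user-space sketch of the eligibility check. It is not TTM
code: struct sketch_mapping, sketch_can_map_pmd() and the SKETCH_PMD_*
macros are hypothetical names, and the 2 MiB PMD size is an assumption
that holds for x86-64 with 4 KiB base pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2 MiB PMD mappings (x86-64 with 4 KiB base pages). */
#define SKETCH_PMD_SIZE ((uint64_t)2 << 20)
#define SKETCH_PMD_MASK (~(SKETCH_PMD_SIZE - 1))

static_assert((SKETCH_PMD_SIZE & (SKETCH_PMD_SIZE - 1)) == 0,
              "PMD size must be a power of two");

/* Hypothetical description of one buffer mapping; not a TTM structure. */
struct sketch_mapping {
	uint64_t vma_start;   /* user virtual address where the mapping begins */
	uint64_t vma_end;     /* end of the mapping (exclusive) */
	uint64_t phys_start;  /* physical/aperture address backing vma_start */
	bool backing_is_huge; /* backing store is contiguous in PMD-sized chunks */
};

/*
 * Return true if the fault at fault_addr could be served with a single
 * PMD-sized entry; false means "fall back to ordinary page faults".
 */
static bool sketch_can_map_pmd(const struct sketch_mapping *m,
			       uint64_t fault_addr)
{
	/* Round the faulting address down to the start of its PMD range. */
	uint64_t virt = fault_addr & SKETCH_PMD_MASK;
	uint64_t phys;

	/* There has to be an underlying huge page in the first place. */
	if (!m->backing_is_huge)
		return false;

	/* The whole PMD-sized range must lie inside the mapping. */
	if (virt < m->vma_start || virt + SKETCH_PMD_SIZE > m->vma_end)
		return false;

	/* A PMD entry can only point to a PMD-size aligned physical address. */
	phys = m->phys_start + (virt - m->vma_start);
	return (phys & ~SKETCH_PMD_MASK) == 0;
}
```

With allocations placed PMD-size aligned in the mappable aperture, the
last check passes for every fully covered 2 MiB range; an allocation that
is off by even one 4 KiB page fails it for the whole buffer and ends up
on the ordinary fault path.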

Regards,
Christian.