Re: [RFC V2 08/12] mm: Add new VMA flag VM_CDM

From: Anshuman Khandual
Date: Mon Jan 30 2017 - 23:24:16 EST


On 01/31/2017 12:22 AM, Jerome Glisse wrote:
> On Mon, Jan 30, 2017 at 09:05:49AM +0530, Anshuman Khandual wrote:
>> VMA which contains CDM memory pages should be marked with new VM_CDM flag.
>> These VMAs need to be identified in various core kernel paths for special
>> handling and this flag will help in their identification.
>>
>> Signed-off-by: Anshuman Khandual <khandual@xxxxxxxxxxxxxxxxxx>
>
>
> Why doing this on vma basis ? Why not special casing all those path on page
> basis ?

The primary motivation is cost. Won't it be too expensive to account for
and act on individual pages rather than on the VMA as a whole? For example,
page_to_nid() seemed pretty expensive when I tried tagging the VMA on an
individual page fault basis.
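
To illustrate the cost argument, here is a rough user-space sketch, not
kernel code. The struct names, the VM_CDM_SKETCH bit value and both helpers
are hypothetical and only model the two approaches: a per-VMA check is a
single bit test, while a per-page check has to look at the node of each
page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit value; the real VM_CDM value comes from the patch. */
#define VM_CDM_SKETCH 0x1UL

/* Simplified stand-ins for struct page and struct vm_area_struct. */
struct page_sketch {
	int nid;			/* node the page lives on */
};

struct vma_sketch {
	unsigned long flags;
	struct page_sketch *pages;
	size_t nr_pages;
};

/* Per-VMA check: one bit test, independent of the VMA size. */
static bool vma_is_cdm(const struct vma_sketch *vma)
{
	assert(vma != NULL);
	return (vma->flags & VM_CDM_SKETCH) != 0;
}

/* Per-page check: inspects the node of each page, O(nr_pages). */
static bool vma_has_cdm_page(const struct vma_sketch *vma, int cdm_nid)
{
	assert(vma != NULL);
	for (size_t i = 0; i < vma->nr_pages; i++)
		if (vma->pages[i].nid == cdm_nid)
			return true;
	return false;
}
```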

>
> After all you can have a big vma with some pages in it being cdm and other
> being regular page. The CPU process might migrate to different CPU in a
> different node and you might still want to have the regular page to migrate
> to this new node and keep the cdm page while the device is still working
> on them.

Right, that is the ideal thing to do. But won't it be better to split the
big VMA into smaller chunks and tag them appropriately, so that the tagged
VMAs contain as many CDM pages as possible and can then be restricted from
auto NUMA, KSM etc.?
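
As a rough illustration of the splitting idea, here is a user-space sketch,
not kernel code. It takes the node id of each page in a big range and emits
contiguous sub-ranges that are either all CDM or all regular, which is the
strictest form of the split. The struct and function names are hypothetical,
and a real implementation would presumably tolerate some mixing to avoid
creating too many small VMAs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one chunk after the split: [start, end). */
struct range_sketch {
	size_t start;
	size_t end;
	bool cdm;		/* would carry the VM_CDM tag */
};

/*
 * Walk the per-page node ids and group consecutive pages with the same
 * CDM-ness into one range. Returns the number of ranges written; stops
 * early if more than max_out ranges would be needed.
 */
static size_t split_by_cdm(const int *nid, size_t nr_pages, int cdm_nid,
			   struct range_sketch *out, size_t max_out)
{
	size_t count = 0;

	for (size_t i = 0; i < nr_pages; i++) {
		bool cdm = (nid[i] == cdm_nid);

		/* Same kind as the previous page: extend the last range. */
		if (count > 0 && out[count - 1].cdm == cdm) {
			out[count - 1].end = i + 1;
			continue;
		}
		if (count == max_out)
			return count;

		/* Kind changed: start a new range at this page. */
		out[count].start = i;
		out[count].end = i + 1;
		out[count].cdm = cdm;
		count++;
	}
	return count;
}
```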

>
> This is just an example, same can apply for ksm or any other kernel feature
> you want to special case. Maybe we can store a set of flag in node that
> tells what is allowed for page in node (ksm, hugetlb, migrate, numa, ...).
>
> This would be more flexible and the policy choice can be left to each of
> the device driver.

Hmm, that's another way of doing the special cases. The other way, as Dave
mentioned before, is to classify the coherent memory property into various
kinds, store the kind for each node, and implement a predefined set of
restrictions for each kind of coherent memory, which might cover features
like auto NUMA, HugeTLB, KSM etc. Won't maintaining two different property
sets (one for the kind of coherent memory and the other for each special
case) be too complicated?