Re: [RFC V2 02/12] mm: Isolate HugeTLB allocations away from CDM nodes

From: Dave Hansen
Date: Mon Jan 30 2017 - 20:37:23 EST


On 01/30/2017 05:03 PM, Anshuman Khandual wrote:
> On 01/30/2017 10:49 PM, Dave Hansen wrote:
>> On 01/29/2017 07:35 PM, Anshuman Khandual wrote:
>>> HugeTLB allocation/release/accounting currently spans across all the nodes
>>> under N_MEMORY node mask. Coherent memory nodes should not be part of these
>>> allocations. So use system_ram() call to fetch system RAM only nodes on the
>>> platform which can then be used for HugeTLB allocation purpose instead of
>>> N_MEMORY node mask. This isolates coherent device memory nodes from HugeTLB
>>> allocations.
>>
>> Does this end up making it impossible to use hugetlbfs to access device
>> memory?
>
> Right, that's the implementation at the moment. But going forward, if we need
> to have HugeTLB pages on the CDM node, then we can implement that through the
> sysfs interface from individual NUMA node paths instead of changing the
> generic HugeTLB path. I wrote this up in the cover letter but should have
> mentioned it in the comment section of this patch as well. Does this
> approach look okay?

The cover letter is not the most approachable document I've ever seen. :)

> "Now, we ensure complete HugeTLB allocation isolation from CDM nodes. Going
> forward if we need to support HugeTLB allocation on CDM nodes on targeted
> basis, then we would have to enable those allocations through the
> /sys/devices/system/node/nodeN/hugepages/hugepages-16384kB/nr_hugepages
> interface while still ensuring isolation from other generic sysctl and
> /sys/kernel/mm/hugepages/hugepages-16384kB/nr_hugepages interfaces."

That would be passable if that's the only way you can allocate hugetlbfs
pages. But we also have the fault-based allocations that can pull stuff
right out of the buddy allocator. This approach would break that path
entirely.
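
To make the distinction concrete, here is a minimal sketch of the two
kinds of knobs involved. The helper names are hypothetical and exist only
for illustration; the path layouts are the ones quoted above, plus the
existing nr_overcommit_hugepages file, which bounds the surplus pages that
get allocated from the buddy allocator on demand rather than from the
preallocated pool.

```c
#include <stdio.h>

/*
 * Hypothetical helper: build the per-node pool knob, i.e. the interface
 * proposed above for targeted HugeTLB allocation on a single node.
 * Returns the snprintf() result (length the full path needs).
 */
int node_hugepages_path(char *buf, size_t len, int nid, unsigned long size_kb)
{
	return snprintf(buf, len,
			"/sys/devices/system/node/node%d/hugepages/"
			"hugepages-%lukB/nr_hugepages",
			nid, size_kb);
}

/*
 * Hypothetical helper: build a global (all-node) knob for a given huge
 * page size. With knob == "nr_hugepages" this is the generic pool size;
 * with knob == "nr_overcommit_hugepages" it is the limit on surplus
 * pages, which are not preallocated through any per-node file.
 */
int global_hugepages_path(char *buf, size_t len, unsigned long size_kb,
			  const char *knob)
{
	return snprintf(buf, len,
			"/sys/kernel/mm/hugepages/hugepages-%lukB/%s",
			size_kb, knob);
}
```

Calling node_hugepages_path(buf, sizeof(buf), 1, 16384) yields the node 1
file for 16384kB pages (node 1 being a CDM node is purely an assumption
here). Nothing equivalent exists per node for the surplus path, which is
why restricting the pool interfaces alone does not cover it.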

FWIW, I think you really need to separate the true "CDM" stuff that's
*really* device-specific from the parts of this where you really just
want to implement isolation.