Re: [RFC PATCH 0/5] Memory access profiler(IBS) driven NUMA balancing

From: Bharata B Rao
Date: Sun Feb 12 2023 - 22:24:10 EST


On 2/13/2023 8:26 AM, Huang, Ying wrote:
> Bharata B Rao <bharata@xxxxxxx> writes:
>
>> On 2/8/2023 11:33 PM, Peter Zijlstra wrote:
>>> On Wed, Feb 08, 2023 at 01:05:28PM +0530, Bharata B Rao wrote:
>>>
>>>
>>>> - Hardware provided access information could be very useful for driving
>>>> hot page promotion in tiered memory systems. Need to check if this
>>>> requires different tuning/heuristics apart from what NUMA balancing
>>>> already does.
>>>
>>> I think Huang Ying looked at that from the Intel POV and I think the
>>> conclusion was that it doesn't really work out. What you need is
>>> frequency information, but the PMU doesn't really give you that. You
>>> need to process a *ton* of PMU data in-kernel.
>>
>> What I am doing here is to feed the access data into NUMA balancing which
>> already has the logic to aggregate that at task and numa group level and
>> decide if that access is actionable in terms of migrating the page. In this
>> context, I am not sure about the frequency information that you and Dave
>> are mentioning. AFAIU, existing NUMA balancing takes care of taking
>> action, IBS becomes an alternative source of access information to NUMA
>> hint faults.
>
> We do need frequency information to determine whether a page is hot
> enough to be migrated to the fast memory (promotion). What the PMU
> provides is just "recently" accessed pages, not "frequently" accessed
> pages. For the current NUMA balancing implementation, please check
> NUMA_BALANCING_MEMORY_TIERING in should_numa_migrate_memory(). In
> general, it estimates the page access frequency by measuring the
> latency between page table scanning and page fault: the shorter the
> latency, the higher the frequency. This isn't perfect, but it provides
> a starting point. You need to consider how to get frequency information
> via the PMU. For example, you may count the number of accesses to each
> page, age the counts periodically, and derive a hot threshold from
> some statistics.
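
As I read it, the latency heuristic described above reduces to roughly
the following. This is only a simplified userspace sketch, not the kernel
code: struct page_sketch, hint_fault_latency_ms() and latency_says_hot()
are made-up names, the threshold is passed in as a plain parameter, and
everything else the kernel does around that check is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-page state: when the scanner last marked the page. */
struct page_sketch {
	uint32_t scan_time_ms;
};

/*
 * Time between the page table scan and the hint fault on this page.
 * Unsigned subtraction keeps the result correct across wraparound.
 */
static uint32_t hint_fault_latency_ms(const struct page_sketch *p,
				      uint32_t now_ms)
{
	return now_ms - p->scan_time_ms;
}

/*
 * A short scan-to-fault latency is taken as a proxy for a high access
 * frequency: the page counts as hot only if it faulted soon enough.
 */
static bool latency_says_hot(const struct page_sketch *p, uint32_t now_ms,
			     uint32_t hot_threshold_ms)
{
	assert(hot_threshold_ms > 0);
	return hint_fault_latency_ms(p, now_ms) < hot_threshold_ms;
}
```

With a 1000 ms threshold, a page that faults 200 ms after being scanned
would qualify for promotion, while one that faults 1500 ms later would
not.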

For the tiered memory hot page promotion case of NUMA balancing, we will
have to maintain frequency information in software when such information
isn't available from the hardware.
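
To make the count/age/threshold idea concrete, here is a rough userspace
sketch of what such software tracking might look like. Nothing here is
from the patchset: NR_PAGES, access_count[], record_access(),
age_counters(), hot_threshold() and count_says_hot() are all hypothetical
names, aging by halving and a percentile-based threshold are just one
possible choice, and a real implementation would need a far more compact
per-page representation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define NR_PAGES 8	/* hypothetical, tiny on purpose */

static uint16_t access_count[NR_PAGES];

/* Called once per access sample that resolves to page frame 'pfn'. */
static void record_access(unsigned int pfn)
{
	assert(pfn < NR_PAGES);
	if (access_count[pfn] < UINT16_MAX)	/* saturate, don't wrap */
		access_count[pfn]++;
}

/* Periodic aging: halve every counter so that old accesses decay. */
static void age_counters(void)
{
	for (unsigned int i = 0; i < NR_PAGES; i++)
		access_count[i] >>= 1;
}

static int cmp_u16(const void *a, const void *b)
{
	uint16_t x = *(const uint16_t *)a, y = *(const uint16_t *)b;

	return (x > y) - (x < y);
}

/* Hot threshold: the access count at the given percentile (0..100). */
static uint16_t hot_threshold(unsigned int percentile)
{
	uint16_t sorted[NR_PAGES];

	assert(percentile <= 100);
	memcpy(sorted, access_count, sizeof(sorted));
	qsort(sorted, NR_PAGES, sizeof(sorted[0]), cmp_u16);
	return sorted[(NR_PAGES - 1) * percentile / 100];
}

/* A page is hot if it was accessed at all and meets the threshold. */
static bool count_says_hot(unsigned int pfn, unsigned int percentile)
{
	assert(pfn < NR_PAGES);
	return access_count[pfn] > 0 &&
	       access_count[pfn] >= hot_threshold(percentile);
}
```

The sample handler would call record_access(), a periodic timer would
call age_counters(), and the promotion path would consult
count_says_hot() in place of the latency check.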

Regards,
Bharata.