Re: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

From: Mike Kravetz
Date: Fri Aug 21 2020 - 17:13:08 EST


On 8/21/20 1:47 PM, Song Bao Hua (Barry Song) wrote:
>
>
>> -----Original Message-----
>> From: Song Bao Hua (Barry Song)
>> Sent: Saturday, August 22, 2020 7:27 AM
>> To: 'Mike Kravetz' <mike.kravetz@xxxxxxxxxx>; hch@xxxxxx;
>> m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx; will@xxxxxxxxxx;
>> ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx;
>> akpm@xxxxxxxxxxxxxxxxxxxx
>> Cc: iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
>> linux-kernel@xxxxxxxxxxxxxxx; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
>> huangdaode <huangdaode@xxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>
>> Subject: RE: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
>> per-NUMA CMA
>>
>>
>>
>>> -----Original Message-----
>>> From: Mike Kravetz [mailto:mike.kravetz@xxxxxxxxxx]
>>> Sent: Saturday, August 22, 2020 5:53 AM
>>> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>; hch@xxxxxx;
>>> m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx; will@xxxxxxxxxx;
>>> ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx;
>>> akpm@xxxxxxxxxxxxxxxxxxxx
>>> Cc: iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
>>> linux-kernel@xxxxxxxxxxxxxxx; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
>>> huangdaode <huangdaode@xxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>
>>> Subject: Re: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
>>> per-NUMA CMA
>>>
>>> Hi Barry,
>>> Sorry for jumping in so late.
>>>
>>> On 8/21/20 4:33 AM, Barry Song wrote:
>>>>
>>>> With per-NUMA CMA, the SMMU gets memory from the local NUMA node for its
>>>> command queues and page tables, so dma_unmap latency shrinks considerably.
>>>
>>> Since per-node CMA areas for hugetlb were introduced, I have been thinking
>>> about the limited number of CMA areas. In most configurations, I believe
>>> it is limited to 7. And, IIRC, it is not something that can be changed at
>>> runtime; you need to reconfigure and rebuild to increase the number. In
>>> contrast, some configs have NODES_SHIFT set to 10. I wasn't too worried
>>> because of the limited hugetlb use case. However, this series is adding
>>> another user of per-node CMA areas.
>>>
>>> With more users, should we try to sync up the number of CMA areas and the
>>> number of nodes? Or, perhaps I am worrying about nothing?
>>
>> Hi Mike,
>> The current limit is 8. If the server has 4 nodes and we enable both
>> per-NUMA CMA and hugetlb, the last node will fail to get a CMA area,
>> because the default global CMA area takes 1 of the 8. So users need to
>> change the value in menuconfig. If the server has 8 nodes and we enable
>> only one of per-NUMA CMA and hugetlb, one node will still fail to get a
>> CMA area.
>>
>> We could set the default number of CMA areas to 8 + MAX_NODES (if hugetlb
>> is enabled) + MAX_NODES (if per-NUMA CMA is enabled) if we don't expect
>> users to change the config, but right now hugetlb has no Kconfig option to
>> enable or disable it, unlike per-NUMA CMA, which has DMA_PERNUMA_CMA.
>
> I would prefer we make some changes like:
>
>  config CMA_AREAS
>         int "Maximum count of the CMA areas"
>         depends on CMA
> +       default 19 if NUMA
>         default 7
>         help
>           CMA allows to create CMA areas for particular purpose, mainly,
>           used as device private area. This parameter sets the maximum
>           number of CMA area in the system.
>
> -         If unsure, leave the default value "7".
> +         If unsure, leave the default value "7" or "19" if NUMA is used.
>
> 1 + CONFIG_CMA_AREAS should be quite enough for almost all servers on the market.
>
> With 2 NUMA nodes and both hugetlb CMA and per-NUMA CMA enabled, we need 2*2 + 1 = 5.
> With 4 NUMA nodes and both hugetlb CMA and per-NUMA CMA enabled, we need 2*4 + 1 = 9 -> default ARM64 config.
> With 8 NUMA nodes and both hugetlb CMA and per-NUMA CMA enabled, we need 2*8 + 1 = 17.
>
> The default value supports the most common cases; it is not meant to cover servers
> with NODES_SHIFT=10. Those can set their own config, just as users need to increase
> CMA_AREAS if they add many CMA areas in the device tree, even on a system without NUMA.
>
> What do you think, Mike?

I'm OK with that. I really did not want to sidetrack this series. It is
just something I thought about when looking at the hugetlb code. My 'to do'
list includes looking at a way to make the number of CMA areas dynamic.
--
Mike Kravetz