RE: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

From: Song Bao Hua (Barry Song)
Date: Fri Aug 21 2020 - 16:47:48 EST




> -----Original Message-----
> From: Song Bao Hua (Barry Song)
> Sent: Saturday, August 22, 2020 7:27 AM
> To: 'Mike Kravetz' <mike.kravetz@xxxxxxxxxx>; hch@xxxxxx;
> m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx; will@xxxxxxxxxx;
> ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx;
> akpm@xxxxxxxxxxxxxxxxxxxx
> Cc: iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> huangdaode <huangdaode@xxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>
> Subject: RE: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
> per-NUMA CMA
>
>
>
> > -----Original Message-----
> > From: Mike Kravetz [mailto:mike.kravetz@xxxxxxxxxx]
> > Sent: Saturday, August 22, 2020 5:53 AM
> > To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>; hch@xxxxxx;
> > m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx; will@xxxxxxxxxx;
> > ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx;
> > akpm@xxxxxxxxxxxxxxxxxxxx
> > Cc: iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; linux-arm-kernel@xxxxxxxxxxxxxxxxxxx;
> > linux-kernel@xxxxxxxxxxxxxxx; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> > huangdaode <huangdaode@xxxxxxxxxx>; Linuxarm <linuxarm@xxxxxxxxxx>
> > Subject: Re: [PATCH v7 0/3] make dma_alloc_coherent NUMA-aware by
> > per-NUMA CMA
> >
> > Hi Barry,
> > Sorry for jumping in so late.
> >
> > On 8/21/20 4:33 AM, Barry Song wrote:
> > >
> > > with per-numa CMA, smmu will get memory from local numa node to save
> > > command queues and page tables. that means dma_unmap latency will be
> > > shrunk much.
> >
> > Since per-node CMA areas for hugetlb was introduced, I have been thinking
> > about the limited number of CMA areas. In most configurations, I believe
> > it is limited to 7. And, IIRC it is not something that can be changed at
> > runtime, you need to reconfig and rebuild to increase the number. In
> > contrast some configs have NODES_SHIFT set to 10. I wasn't too worried
> > because of the limited hugetlb use case. However, this series is adding
> > another user of per-node CMA areas.
> >
> > With more users, should try to sync up number of CMA areas and number of
> > nodes? Or, perhaps I am worrying about nothing?
>
> Hi Mike,
> The current limitation is 8. If the server has 4 nodes and we enable both
> pernuma CMA and hugetlb, the last node will fail to get one cma area as the
> default global cma area will take 1 of 8. So users need to change menuconfig.
> If the server has 8 nodes and we enable just one of pernuma cma and hugetlb,
> one node will fail to get cma.
>
> We may set the default number of CMA areas to 8 + MAX_NODES (if hugetlb
> enabled) + MAX_NODES (if pernuma cma enabled) if we don't expect users to
> change config, but right now hugetlb does not have a Kconfig option to enable
> or disable it, the way pernuma cma has DMA_PERNUMA_CMA.

I would prefer we make some changes like:

 config CMA_AREAS
        int "Maximum count of the CMA areas"
        depends on CMA
+       default 19 if NUMA
        default 7
        help
          CMA allows to create CMA areas for particular purpose, mainly,
          used as device private area. This parameter sets the maximum
          number of CMA area in the system.

-         If unsure, leave the default value "7".
+         If unsure, leave the default value "7" or "19" if NUMA is used.

With that default, 1 + CONFIG_CMA_AREAS (i.e. 20 areas) should be enough for almost all servers on the market.

With 2 NUMA nodes and both hugetlb CMA and pernuma CMA enabled, we need 2*2 + 1 = 5 areas.
With 4 NUMA nodes and both hugetlb CMA and pernuma CMA enabled, we need 2*4 + 1 = 9 areas (this is the default ARM64 config).
With 8 NUMA nodes and both hugetlb CMA and pernuma CMA enabled, we need 2*8 + 1 = 17 areas.
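
To make the arithmetic above easy to re-check for other node counts, here is a rough sketch in Python. It is only an illustration, not kernel code: the function names are made up for this mail, and it assumes the limit is 1 + CONFIG_CMA_AREAS, with one area taken by the default global CMA area and one area per node for each per-node user (hugetlb CMA, pernuma CMA).

```python
# Illustrative only: hypothetical helper names, not kernel APIs.

def cma_areas_needed(numa_nodes, per_node_users, global_areas=1):
    """Areas required: one per node for each per-node user, plus the
    default global CMA area."""
    return per_node_users * numa_nodes + global_areas


def max_cma_areas(config_cma_areas):
    """Assumed limit: 1 + CONFIG_CMA_AREAS (8 with the current default 7)."""
    return 1 + config_cma_areas


if __name__ == "__main__":
    for nodes in (2, 4, 8):
        # Both hugetlb CMA and pernuma CMA enabled -> 2 per-node users.
        needed = cma_areas_needed(nodes, per_node_users=2)
        fits_default_7 = needed <= max_cma_areas(7)
        fits_default_19 = needed <= max_cma_areas(19)
        print(f"{nodes} nodes: need {needed}, "
              f"fits with 7: {fits_default_7}, fits with 19: {fits_default_19}")
```

Under these assumptions, a default of 7 only covers the 2-node case when both users are enabled, while a default of 19 covers up to 8 nodes.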

The default value supports the most common cases; it is not meant to cover servers
with NODES_SHIFT=10. Those can set their own config, just as users already need to increase
CMA_AREAS when they define many CMA areas in the device tree, even on a system without NUMA.

What do you think, Mike?

Thanks
Barry