RE: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve per-numa CMA

From: Song Bao Hua (Barry Song)
Date: Fri Aug 21 2020 - 05:13:50 EST




> -----Original Message-----
> From: Will Deacon [mailto:will@xxxxxxxxxx]
> Sent: Friday, August 21, 2020 8:47 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> Cc: hch@xxxxxx; m.szyprowski@xxxxxxxxxxx; robin.murphy@xxxxxxx;
> ganapatrao.kulkarni@xxxxxxxxxx; catalin.marinas@xxxxxxx;
> iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>;
> linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> huangdaode <huangdaode@xxxxxxxxxx>; Jonathan Cameron
> <jonathan.cameron@xxxxxxxxxx>; Nicolas Saenz Julienne
> <nsaenzjulienne@xxxxxxx>; Steve Capper <steve.capper@xxxxxxx>; Andrew
> Morton <akpm@xxxxxxxxxxxxxxxxxxxx>; Mike Rapoport <rppt@xxxxxxxxxxxxx>
> Subject: Re: [PATCH v6 1/2] dma-contiguous: provide the ability to reserve
> per-numa CMA
>
> On Fri, Aug 21, 2020 at 02:26:14PM +1200, Barry Song wrote:
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt
> b/Documentation/admin-guide/kernel-parameters.txt
> > index bdc1f33fd3d1..3f33b89aeab5 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -599,6 +599,15 @@
> > altogether. For more information, see
> > include/linux/dma-contiguous.h
> >
> > + pernuma_cma=nn[MG]
> > + [ARM64,KNL]
> > + Sets the size of kernel per-numa memory area for
> > + contiguous memory allocations. A value of 0 disables
> > + per-numa CMA altogether. DMA users on node nid will
> > + first try to allocate buffer from the pernuma area
> > + which is located in node nid, if the allocation fails,
> > + they will fallback to the global default memory area.
>
> What is the default behaviour if this option is not specified? Seems like
> that should be mentioned here.
>
> > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> > index 847a9d1fa634..db7a37ed35eb 100644
> > --- a/kernel/dma/Kconfig
> > +++ b/kernel/dma/Kconfig
> > @@ -118,6 +118,16 @@ config DMA_CMA
> > If unsure, say "n".
> >
> > if DMA_CMA
> > +
> > +config DMA_PERNUMA_CMA
> > + bool "Enable separate DMA Contiguous Memory Area for each NUMA
> Node"
>
> I don't understand the need for this config option. If you have DMA_CMA and
> you have NUMA, why wouldn't you want this enabled?

Christoph preferred this in the previous patchset, so that all of this code can be dropped
from the kernel text if users don't use per-NUMA CMA.

>
> > + help
> > + Enable this option to get pernuma CMA areas so that devices like
> > + ARM64 SMMU can get local memory by DMA coherent APIs.
> > +
> > + You can set the size of pernuma CMA by specifying
> "pernuma_cma=size"
> > + on the kernel's command line.
> > +
> > comment "Default contiguous memory area size:"
> >
> > config CMA_SIZE_MBYTES
> > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > index cff7e60968b9..89b95f10e56d 100644
> > --- a/kernel/dma/contiguous.c
> > +++ b/kernel/dma/contiguous.c
> > @@ -69,6 +69,19 @@ static int __init early_cma(char *p)
> > }
> > early_param("cma", early_cma);
> >
> > +#ifdef CONFIG_DMA_PERNUMA_CMA
> > +
> > +static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
> > +static phys_addr_t pernuma_size_bytes __initdata;
> > +
> > +static int __init early_pernuma_cma(char *p)
> > +{
> > + pernuma_size_bytes = memparse(p, &p);
> > + return 0;
> > +}
> > +early_param("pernuma_cma", early_pernuma_cma);
> > +#endif
> > +
> > #ifdef CONFIG_CMA_SIZE_PERCENTAGE
> >
> > static phys_addr_t __init __maybe_unused
> cma_early_percent_memory(void)
> > @@ -96,6 +109,34 @@ static inline __maybe_unused phys_addr_t
> cma_early_percent_memory(void)
> >
> > #endif
> >
> > +#ifdef CONFIG_DMA_PERNUMA_CMA
> > +void __init dma_pernuma_cma_reserve(void)
> > +{
> > + int nid;
> > +
> > + if (!pernuma_size_bytes)
> > + return;
>
> If this is useful (I assume it is), then I think we should have a non-zero
> default value, a bit like normal CMA does via CMA_SIZE_MBYTES.

The patchset used to have a CONFIG_PERNUMA_CMA_SIZE in kernel/dma/Kconfig, but Christoph was not comfortable
with it:
https://lore.kernel.org/linux-iommu/20200728115231.GA793@xxxxxx/

Would you mind hardcoding the value in CONFIG_CMDLINE in arch/arm64/Kconfig, as Christoph mentioned:
config CMDLINE
default "pernuma_cma=16M"

If you also don't like changing CMDLINE in arch/arm64/Kconfig, I guess I will have to rely on users setting it on the
command line, just like hugetlb_cma.
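
For reference, here is a rough userspace sketch of what a value such as "pernuma_cma=16M" translates to in bytes. parse_size() is a hypothetical, simplified stand-in for the kernel's memparse() (it only handles K/M/G suffixes); it is not the kernel implementation:

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical, simplified stand-in for memparse(): parse a number with an
 * optional K/M/G suffix and return the size in bytes. A result of 0 is what
 * would leave per-NUMA CMA disabled in the patch.
 */
static unsigned long long parse_size(const char *s)
{
	char *end;
	unsigned long long val;

	assert(s != NULL);
	val = strtoull(s, &end, 0);

	switch (*end) {
	case 'G': case 'g':
		val <<= 10;
		/* fall through */
	case 'M': case 'm':
		val <<= 10;
		/* fall through */
	case 'K': case 'k':
		val <<= 10;
		break;
	default:
		break;
	}
	return val;
}
```

So parse_size("16M") yields 16 * 1024 * 1024 bytes per node, and parse_size("0") yields 0, which corresponds to the early return in dma_pernuma_cma_reserve().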

>
> > + for_each_node_state(nid, N_ONLINE) {
>
> for_each_online_node() {
>
> > + int ret;
> > + char name[20];
>
> 20?
>
> Ah, wait, this is copy-pasta from hugetlb_cma_reserve(). Can you factor out
> the common parts at all?

Actually I added a "#define CMA_MAX_NAME 64" in this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=18e98e56f440

The 20 in hugetlb_cma_reserve() was also written by me. If you are not comfortable with it, I can
switch to CMA_MAX_NAME. Do you think it really matters here? 20 seems to be long
enough for this scenario.
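
To put a number on it, here is a small userspace sketch. It assumes the area name is built with a "pernuma%d" format; that format is not shown in the hunk quoted above, so please treat it and the helper name pernuma_name_bytes() as assumptions for illustration only:

```c
#include <assert.h>
#include <stdio.h>

#define NAME_LEN 20	/* buffer size used in the patch */

/*
 * Return how many bytes (including the trailing NUL) the name for node nid
 * needs, assuming a "pernuma%d" format. snprintf() reports the full length
 * even if the output had to be truncated to fit the buffer.
 */
static int pernuma_name_bytes(int nid)
{
	char name[NAME_LEN];
	int len = snprintf(name, sizeof(name), "pernuma%d", nid);

	assert(len > 0);
	return len + 1;
}
```

Under that assumption, a four-digit node id such as 1023 gives "pernuma1023", which needs 12 bytes including the NUL, so it fits in 20.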

Thanks
Barry