Re: [PATCH] mm,page_alloc,cma: configurable CMA utilization

From: Minchan Kim
Date: Wed Feb 08 2023 - 17:00:59 EST


On Sun, Feb 05, 2023 at 09:22:28PM -0800, Chris Goldsworthy wrote:
> On Wed, Feb 01, 2023 at 03:47:58PM -0800, Minchan Kim wrote:
> > Hi Chris,
> >
> > On Tue, Jan 31, 2023 at 08:06:28PM -0800, Chris Goldsworthy wrote:
> > > We're operating in a resource constrained environment, and we want to maximize
> > > the amount of memory free / headroom for GFP_KERNEL allocations on our SoCs,
> > > which are especially important for DMA allocations that use an IOMMU. We need a
> > > large amount of CMA on our SoCs for various reasons (e.g. for devices not
> > > upstream of an IOMMU), but whilst that CMA memory is not in use, we want to
> > > route all GFP_MOVABLE allocations to the CMA regions, which will free up memory
> > > for GFP_KERNEL allocations.
> >
> > I like this patch for a different reason, but for the specific problem you
> > mentioned, how about making reclaim/compaction aware of the problem?
> >
> > IOW, when a GFP_KERNEL/DMA allocation happens but there is not enough memory
> > in the zones, let's migrate movable pages in those zones into the CMA
> > area/movable zone if they have plenty of free memory.
> >
> > I guess you considered this, but did you observe some problems?
>
> Hi Minchan,
>
> This is not an approach we've considered. If you have a high-level idea of the
> key parts of vmscan.c you'd need to touch to implement this, could you point me
> to them?

I think the problem is not specific to CMA; it also applies to the
movable zone. If movable pages are allocated from non-movable zones,
the problem will happen. So what I suggested was: if the reclaimers
(e.g., background/direct reclaim) find that the request is GFP_KERNEL
and there are not enough free pages in the zone and the lower zones,
but there are movable pages in them, migrate those pages into the CMA
area and/or movable zones to make room for the GFP_KERNEL allocation
before the final failure.

It needs to touch wakeup_kswapd/kcompactd to trigger the migration, and
reclaim/compaction needs to deal with the command. I can't say where
the good places to change are until I look at further details, but
I think it's a more general solution.
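
To make the idea concrete, here is a rough userspace sketch of the
decision I have in mind. It is a toy model, not kernel code: struct
zone_model, zone_can_alloc() and migrate_movable_out() are hypothetical
names made up for illustration, and the watermark check is simplified to
a single minimum value.

```c
#include <stdbool.h>

/* Toy userspace model; none of these types or helpers exist in the kernel. */
struct zone_model {
	long free_pages;    /* pages currently free in the zone */
	long movable_pages; /* in-use pages that could be migrated away */
	long min_watermark; /* free pages that must remain after an allocation */
};

static long min_long(long a, long b)
{
	return a < b ? a : b;
}

/* Can the zone satisfy an unmovable request of `need` pages right now? */
static bool zone_can_alloc(const struct zone_model *z, long need)
{
	return z->free_pages - need >= z->min_watermark;
}

/*
 * Hypothetical step run on behalf of a GFP_KERNEL-like request: move just
 * enough movable pages out of `src` into a target with `*dst_free` free
 * pages (standing in for a CMA area or movable zone) to cover the
 * shortfall. Returns the number of pages migrated.
 */
static long migrate_movable_out(struct zone_model *src, long *dst_free,
				long need)
{
	long shortfall = need + src->min_watermark - src->free_pages;
	long n;

	if (shortfall <= 0)
		return 0; /* request already fits, nothing to do */

	/* Limited by what is movable and by free space in the target. */
	n = min_long(shortfall, min_long(src->movable_pages, *dst_free));
	if (n < 0)
		n = 0;

	src->movable_pages -= n;
	src->free_pages += n;
	*dst_free -= n;
	return n;
}
```

In this model the caller would try zone_can_alloc() first, call
migrate_movable_out() only on failure, and then retry the allocation
before giving up.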

>
> I guess one drawback with this approach is that as soon as kswapd starts,
> psi_memstall_enter() is called, which can eventually lead to LMKD running in
> user space, which we want to minimize. One aim of what we're doing is to
> delay the calling of psi_memstall_enter().

LMKD running would not be a problem, I think, but are you worried about
LMKD deciding to kill apps due to a wrong signal? I think it's an
orthogonal issue. Actually, it's a long-standing problem for userspace
memory managers, since they don't know where the memory pressure comes
from or with what constraints. Here it is the GFP_KERNEL constraint, but
LMKD can kill apps which consume much memory from movable zones or the
CMA area, so the kill cannot help the memory pressure. Furthermore, LMKD
has a bunch of knobs to affect the decision to kill apps. PSI is just an
event to wake up LMKD, not a decision policy.

>
> It would be beneficial though on top of our change: if someone called
> cma_alloc() and migrated out of the CMA regions, changing kswapd to behave like
> this would move things back into the CMA regions after cma_release() is called
> (instead of having to kill a user space process to have the CMA re-utilized upon
> further user space actions).
>
> Thanks,
>
> Chris.