Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining

From: Marcelo Tosatti
Date: Thu Jan 26 2023 - 13:38:22 EST


On Thu, Jan 26, 2023 at 08:45:36AM +0100, Michal Hocko wrote:
> On Wed 25-01-23 15:22:00, Marcelo Tosatti wrote:
> [...]
> > Remote draining reduces interruptions whether or not the CPU
> > is marked as isolated:
> >
> > - Allows isolated CPUs to benefit from pcp caching.
> > - Removes the interruption to non-isolated CPUs. See for example
> >
> > https://lkml.org/lkml/2022/6/13/2769
>
> This is talking about page allocator per cpu caches, right? In this patch
> we are talking about memcg pcp caches. Are you sure the same applies
> here?

Both can stall the users of the drain operation.

"Minchan Kim tested this independently and reported;

My workload is not NOHZ CPUs but run apps under heavy memory
pressure so they goes to direct reclaim and be stuck on
drain_all_pages until work on workqueue run."

Therefore, using a workqueue to drain memcg pcps also depends on the
remote CPU executing that work item in time, so it can stall the
callers listed below. No?
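
To make the dependency concrete, here is a minimal userspace sketch of
the two drain models. It is not kernel code: struct pcp_stock, the busy
flag and all function names below are hypothetical stand-ins, and the
locking a real remote drain would need is only noted in a comment.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4

/* Hypothetical stand-in for a per-CPU charge cache; not the kernel's struct. */
struct pcp_stock {
	unsigned int nr_pages;	/* cached charges not yet returned */
	bool drain_queued;	/* a drain work item is waiting for this CPU */
	bool busy;		/* CPU is not running work items right now */
};

static struct pcp_stock stocks[NR_CPUS];

/* Workqueue model: the requester only queues a request on every CPU. */
static void queue_drain_all(void)
{
	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		stocks[cpu].drain_queued = true;
}

/* Runs when a CPU gets around to its queued work. Returns pages freed. */
static unsigned int run_queued_work(size_t cpu)
{
	struct pcp_stock *s = &stocks[cpu];
	unsigned int freed;

	if (s->busy || !s->drain_queued)
		return 0;	/* the work item stays pending */

	freed = s->nr_pages;
	s->nr_pages = 0;
	s->drain_queued = false;
	return freed;
}

/* Number of CPUs the requester would still be waiting on. */
static size_t pending_drains(void)
{
	size_t pending = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		if (stocks[cpu].drain_queued)
			pending++;
	return pending;
}

/*
 * Remote model: the requester empties every stock itself, so it does
 * not depend on the owning CPU. Real code would take a per-stock lock.
 */
static unsigned int remote_drain_all(void)
{
	unsigned int freed = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++) {
		freed += stocks[cpu].nr_pages;
		stocks[cpu].nr_pages = 0;
		stocks[cpu].drain_queued = false;
	}
	return freed;
}
```

In the workqueue model, a single CPU that does not run its work item
keeps pending_drains() above zero, which is the wait the requester is
stuck in. In the remote model the requester finishes on its own.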

===

7 3141 mm/memory.c <<wp_page_copy>>
	if (mem_cgroup_charge(page_folio(new_page), mm, GFP_KERNEL))
8 4118 mm/memory.c <<do_anonymous_page>>
	if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
9 4577 mm/memory.c <<do_cow_fault>>
	if (mem_cgroup_charge(page_folio(vmf->cow_page), vma->vm_mm,
10 621 mm/migrate_device.c <<migrate_vma_insert_page>>
	if (mem_cgroup_charge(page_folio(page), vma->vm_mm, GFP_KERNEL))
11 710 mm/shmem.c <<shmem_add_to_page_cache>>
	error = mem_cgroup_charge(folio, charge_mm, gfp);