Re: [PATCH v5 3/3] mm/oom_kill: allow process_mrelease to run under mmap_lock protection

From: Jason Gunthorpe
Date: Thu Dec 09 2021 - 15:48:20 EST


On Thu, Dec 09, 2021 at 11:13:25AM -0800, Suren Baghdasaryan wrote:
> With exit_mmap holding mmap_write_lock during free_pgtables call,
> process_mrelease does not need to elevate mm->mm_users in order to
> prevent exit_mmap from destroying pagetables while __oom_reap_task_mm
> is walking the VMA tree. The change prevents process_mrelease from
> calling the last mmput, which can lead to waiting for IO completion
> in exit_aio.
>
> Signed-off-by: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> ---
> changes in v5
> - Removed Fixes: tag, per Michal Hocko
> - Added Acked-by's
>
> mm/oom_kill.c | 27 +++++++++++++++------------
> 1 file changed, 15 insertions(+), 12 deletions(-)

Reviewed-by: Jason Gunthorpe <jgg@xxxxxxxxxx>

There are mmget_not_zero() calls all over the place; can the others be
cleaned up after this series goes in too?

It seems like anything doing the mmget just to look at the vma list
under the mmap lock is now fine with only a mmgrab?
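
To spell out the distinction I mean, here is a rough userspace model of the
two counters. It is only a sketch: struct model_mm and every model_*()
function are made-up names for illustration, not kernel code, and teardown is
reduced to setting a flag. The assumption it encodes is the usual one:
mm_users (mmget/mmput) pins the address space, mm_count (mmgrab/mmdrop) pins
only the struct, and all mm_users together hold a single mm_count reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of the two mm_struct reference counts. */
struct model_mm {
	atomic_int mm_users;          /* address space users (mmget/mmput) */
	atomic_int mm_count;          /* struct references (mmgrab/mmdrop) */
	bool address_space_torn_down; /* stands in for exit_mmap() and friends */
	bool struct_freed;            /* stands in for freeing the struct */
};

void model_mm_init(struct model_mm *mm)
{
	atomic_init(&mm->mm_users, 1);
	atomic_init(&mm->mm_count, 1); /* the one ref held by all mm_users */
	mm->address_space_torn_down = false;
	mm->struct_freed = false;
}

/* Pin only the struct; does not keep the address space alive. */
void model_mmgrab(struct model_mm *mm)
{
	atomic_fetch_add(&mm->mm_count, 1);
}

void model_mmdrop(struct model_mm *mm)
{
	int old = atomic_fetch_sub(&mm->mm_count, 1);

	assert(old > 0);
	if (old == 1)
		mm->struct_freed = true;
}

/* Take an address space reference unless the count already hit zero. */
bool model_mmget_not_zero(struct model_mm *mm)
{
	int old = atomic_load(&mm->mm_users);

	while (old != 0) {
		/* On failure 'old' is reloaded and the zero check repeats. */
		if (atomic_compare_exchange_weak(&mm->mm_users, &old, old + 1))
			return true;
	}
	return false;
}

/* Whoever drops the last mm_users reference performs the teardown. */
void model_mmput(struct model_mm *mm)
{
	if (atomic_fetch_sub(&mm->mm_users, 1) == 1) {
		mm->address_space_torn_down = true;
		model_mmdrop(mm);
	}
}
```

In this model a walker that only did model_mmgrab() can never end up being the
one that runs the teardown from its own put, which is the property the commit
message is after for process_mrelease.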

A few I know about:

drivers/infiniband/core/umem_odp.c: if (!mmget_not_zero(umem->owning_mm)) {

This is because mmu_interval_notifier_insert() might call
mm_take_all_locks(), which was unsafe with a concurrent exit_mmap.

drivers/infiniband/core/umem_odp.c: if (!owning_process || !mmget_not_zero(owning_mm)) {

This is because it calls hmm_range_fault(), which iterates over the vma
list; that is safe now.

drivers/iommu/iommu-sva-lib.c: return mmget_not_zero(mm);
drivers/iommu/iommu-sva-lib.c: return ioasid_find(&iommu_sva_pasid, pasid, __mmget_not_zero);

It calls find_extend_vma(), but it also doesn't seem to hold a mmgrab when
it does that mmget. The RCU usage is messed up here too, so hmm.

Jason