Re: [PATCH] sched/numa: Fix NULL pointer access to mm_struct during task swap
From: Michal Hocko
Date: Thu Jul 03 2025 - 05:29:34 EST
On Thu 03-07-25 09:26:08, Peter Zijlstra wrote:
> On Thu, Jul 03, 2025 at 12:32:47AM +0800, Chen Yu wrote:
>
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 8988d38d46a3..4e06bb955dad 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3364,7 +3364,14 @@ static void __migrate_swap_task(struct task_struct *p, int cpu)
> >  {
> >  	__schedstat_inc(p->stats.numa_task_swapped);
> >  	count_vm_numa_event(NUMA_TASK_SWAP);
> > -	count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
> > +	/* exiting task has NULL mm */
> > +	if (!(p->flags & PF_EXITING)) {
> > +		WARN_ONCE(!p->mm, "swap task %d %s %x has no mm\n",
> > +			  p->pid, p->comm, p->flags);
> > +
> > +		if (p->mm)
> > +			count_memcg_event_mm(p->mm, NUMA_TASK_SWAP);
> > +	}
>
> Aside from the things already mentioned by Andrew and Michal; why not
> simply do something like:
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 87b6688f124a..8396ebfab0d5 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -969,7 +969,7 @@ static inline void count_memcg_events_mm(struct mm_struct *mm,
>  {
>  	struct mem_cgroup *memcg;
>
> -	if (mem_cgroup_disabled())
> +	if (mem_cgroup_disabled() || !mm)
>  		return;
This would impose an mm check on all other users, even though they know
their mm is valid because they operate on vma->mm or on the current task.

But thinking about this some more, this would be just as racy as the
PF_EXITING check: the remote task can still exit between the check and
the use of the mm. This is not my area, but is this a performance-sensitive
path that couldn't live with a proper find_lock_task_mm()?
I do not see any other race-free way to deal with a remote task exiting.
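
To illustrate the locking pattern, here is a minimal userspace sketch, not
kernel code. It assumes the mm pointer is only cleared with the task lock
held, which is what makes a lookup under that lock race free. All model_*
names are hypothetical stand-ins, a pthread mutex plays the role of
task_lock(), and the walk over the thread group that the real
find_lock_task_mm() performs is left out for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for mm_struct: just holds an event counter. */
struct model_mm {
	unsigned long numa_task_swap;
};

/* Hypothetical stand-in for task_struct. */
struct model_task {
	pthread_mutex_t alloc_lock;	/* plays the role of task_lock() */
	struct model_mm *mm;		/* cleared under alloc_lock on exit */
};

/* Models the exit path: the mm pointer is dropped with the lock held. */
static void model_exit_mm(struct model_task *t)
{
	pthread_mutex_lock(&t->alloc_lock);
	t->mm = NULL;
	pthread_mutex_unlock(&t->alloc_lock);
}

/*
 * Models the lookup: returns the task with its lock held if it still has
 * an mm, or NULL with the lock released if the task has already exited.
 */
static struct model_task *model_find_lock_task_mm(struct model_task *t)
{
	pthread_mutex_lock(&t->alloc_lock);
	if (t->mm)
		return t;
	pthread_mutex_unlock(&t->alloc_lock);
	return NULL;
}

/* Returns 1 if the event was counted, 0 if the task had no mm left. */
static int model_count_swap(struct model_task *p)
{
	struct model_task *t = model_find_lock_task_mm(p);

	if (!t)
		return 0;

	/* The lock is held, so the exit path cannot clear mm under us. */
	assert(t->mm != NULL);
	t->mm->numa_task_swap++;
	pthread_mutex_unlock(&t->alloc_lock);
	return 1;
}
```

The point of the sketch is that the check and the use of mm both happen
inside the same locked region, so there is no window for a concurrent exit.
Whether the scheduler path can afford taking such a lock is the open question.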
--
Michal Hocko
SUSE Labs