Re: [PATCH 1/6] mm: Replace PF_MEMALLOC_NOIO with memalloc_noio

From: Matthew Wilcox
Date: Thu Jun 25 2020 - 08:34:38 EST


On Thu, Jun 25, 2020 at 02:22:39PM +0200, Michal Hocko wrote:
> On Thu 25-06-20 12:31:17, Matthew Wilcox wrote:
> > We're short on PF_* flags, so make memalloc_noio its own bit where we
> > have plenty of space.
>
> I do not mind moving that outside of the PF_* space. Unless I
> misremember, all flags in this space were intended to be set only on
> current, which rules out any RMW races and therefore they can be
> lockless. I am not sure this holds for the bitfield you are adding this
> to. At least in_memstall seems to be set on external tasks as well. But
> this would require double checking. Maybe that is not really intended or
> just a bug.

I was going from the comment:

/* Unserialized, strictly 'current' */
(which you can't see from the context of the diff, but is above the block)
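
For anyone following along without the tree handy, the RMW concern with
adjacent bitfields can be replayed in userspace. This is only a sketch:
struct fake_task and both helper functions are made-up names, not kernel
code, and the "race" is a single-threaded replay of the bad interleaving
rather than a real concurrent test.

```c
#include <assert.h>

/* Hypothetical stand-in for a block of bitfields sharing one word. */
struct fake_task {
	unsigned in_memstall:1;
	unsigned memalloc_noio:1;
};

/*
 * Replay the interleaving that loses an update when two writers each do
 * a non-atomic read-modify-write of the same storage unit.
 * Returns the final in_memstall; 0 means writer A's store was lost.
 */
unsigned lost_update_demo(void)
{
	struct fake_task task = { 0, 0 };
	struct fake_task stale;

	/* Writer B loads the whole word, intending to set memalloc_noio. */
	stale = task;

	/* Writer A sets in_memstall in between. */
	task.in_memstall = 1;

	/* Writer B updates its stale copy and stores the whole word back. */
	stale.memalloc_noio = 1;
	task = stale;

	return task.in_memstall;
}

/* With a single writer (the "strictly current" rule) nothing is lost. */
unsigned owner_only_demo(void)
{
	struct fake_task task = { 0, 0 };

	task.in_memstall = 1;
	task.memalloc_noio = 1;

	return task.in_memstall && task.memalloc_noio;
}

/* Passes silently if the two replays behave as described above. */
void self_check(void)
{
	assert(lost_update_demo() == 0);
	assert(owner_only_demo() == 1);
}
```

Readers on other tasks are harmless in this picture; only a second writer
to the same word can clobber a bit.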

The situation with ->flags is a little more ambiguous:

/*
 * Only the _current_ task can read/write to tsk->flags, but other
 * tasks can access tsk->flags in readonly mode for example
 * with tsk_used_math (like during threaded core dumping).
 * There is however an exception to this rule during ptrace
 * or during fork: the ptracer task is allowed to write to the
 * child->flags of its traced child (same goes for fork, the parent
 * can write to the child->flags), because we're guaranteed the
 * child is not running and in turn not changing child->flags
 * at the same time the parent does it.
 */

but it wasn't unsafe to use the PF_ flags in the way that you were.
It's just crowded.

If in_memstall is set on other tasks, then it should be moved to the
PFA flags, which there are plenty of.
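
As far as I recall, the PFA flags live in task->atomic_flags and are
changed with atomic bitops (set_bit() and friends), which is why they
tolerate writers other than current. A rough userspace analogue using C11
atomics is below; the fake_* names and bit numbers are invented for the
illustration and do not correspond to real PFA_* definitions.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical bit numbers, not the kernel's PFA_* values. */
enum { FAKE_PFA_A = 0, FAKE_PFA_B = 1 };

struct fake_task_atomic {
	atomic_ulong atomic_flags;
};

/* Each update is one atomic RMW, so concurrent writers cannot lose bits. */
void fake_set_flag(struct fake_task_atomic *t, int bit)
{
	atomic_fetch_or(&t->atomic_flags, 1UL << bit);
}

void fake_clear_flag(struct fake_task_atomic *t, int bit)
{
	atomic_fetch_and(&t->atomic_flags, ~(1UL << bit));
}

int fake_test_flag(struct fake_task_atomic *t, int bit)
{
	return (atomic_load(&t->atomic_flags) >> bit) & 1UL;
}

/*
 * Set both bits, clear one, and report whether the other survived.
 * Returns 1 when bit B is still set and bit A is clear.
 */
int pfa_demo(void)
{
	struct fake_task_atomic t;

	atomic_init(&t.atomic_flags, 0);
	fake_set_flag(&t, FAKE_PFA_A);
	fake_set_flag(&t, FAKE_PFA_B);
	fake_clear_flag(&t, FAKE_PFA_A);

	return fake_test_flag(&t, FAKE_PFA_B) && !fake_test_flag(&t, FAKE_PFA_A);
}
```

The cost is an atomic operation per update, which is why flags that only
current ever writes do not need to live there.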

But a quick grep shows it only being read on other tasks and always
set on current:

kernel/sched/psi.c: *flags = current->in_memstall;
kernel/sched/psi.c: * in_memstall setting & accounting needs to be atomic wrt
kernel/sched/psi.c: current->in_memstall = 1;
kernel/sched/psi.c: * in_memstall clearing & accounting needs to be atomic wrt
kernel/sched/psi.c: current->in_memstall = 0;
kernel/sched/psi.c: if (task->in_memstall)
kernel/sched/stats.h: if (p->in_memstall)
kernel/sched/stats.h: if (p->in_memstall)
kernel/sched/stats.h: if (unlikely(p->in_iowait || p->in_memstall)) {
kernel/sched/stats.h: if (p->in_memstall)
kernel/sched/stats.h: if (unlikely(rq->curr->in_memstall))

so I think everything is fine.