Re: [regression] cpuset,mm: update tasks' mems_allowed in time(58568d2)

From: David Rientjes
Date: Wed Feb 24 2010 - 16:06:58 EST


On Wed, 24 Feb 2010, Miao Xie wrote:

> >> Sorry, could you explain what you advised?
> >> I think it is hard to fix this problem by adding a variant, because it is
> >> hard to avoid loading a word of the mask before
> >>
> >> nodes_or(tsk->mems_allowed, tsk->mems_allowed, *newmems);
> >>
> >> and then loading another word of the mask after
> >>
> >> tsk->mems_allowed = *newmems;
> >>
> >> unless we use a lock.
> >>
> >> Maybe we need a rw-lock to protect task->mems_allowed.
> >>
> >
> > I meant that we need to define synchronization only for configurations
> > that do not do atomic nodemask_t stores, it's otherwise unnecessary.
> > We'll need to load and store tsk->mems_allowed via a helper function that
> > is defined to take the rwlock for such configs and only read/write the
> > nodemask for others.
> >
>
> By investigating, we found that it is hard to guarantee consistency between
> mempolicy and mems_allowed because mempolicy was designed to be self-updating:
> it can only be changed by the task itself. Maybe we must change the
> implementation of mempolicy.
>
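
To make the multi-word interleaving from earlier in the thread concrete,
here is a minimal userspace sketch. It is not kernel code: struct mask,
mask_or(), mask_empty() and torn_read_sees_empty() are hypothetical
stand-ins for nodemask_t and its helpers, and MASK_WORDS = 2 simply
assumes MAX_NUMNODES > BITS_PER_LONG. The reader's loads are replayed by
hand rather than raced, so the result is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MASK_WORDS 2	/* assumption: the mask does not fit in one long */

/* Hypothetical stand-in for nodemask_t. */
struct mask {
	unsigned long bits[MASK_WORDS];
};

static bool mask_empty(const struct mask *m)
{
	for (size_t i = 0; i < MASK_WORDS; i++)
		if (m->bits[i])
			return false;
	return true;
}

/* Word-by-word OR, so dst may alias a or b. */
static void mask_or(struct mask *dst, const struct mask *a,
		    const struct mask *b)
{
	for (size_t i = 0; i < MASK_WORDS; i++)
		dst->bits[i] = a->bits[i] | b->bits[i];
}

/*
 * Replays one unlucky interleaving: the reader samples word 1 before the
 * writer's OR step and word 0 after the final assignment. Returns true
 * if the reader's snapshot ends up empty.
 */
static bool torn_read_sees_empty(void)
{
	struct mask allowed = { { 0x1UL, 0x0UL } };	   /* old: bit in word 0 */
	const struct mask newmems = { { 0x0UL, 0x1UL } };  /* new: bit in word 1 */
	struct mask snap;

	snap.bits[1] = allowed.bits[1];		/* reader: word 1, still old */
	mask_or(&allowed, &allowed, &newmems);	/* writer: union of old and new */
	allowed = newmems;			/* writer: final value */
	snap.bits[0] = allowed.bits[0];		/* reader: word 0, already new */

	return mask_empty(&snap);
}
```

Neither the old nor the new mask is empty, and the intermediate union is
not empty either, yet the reader's two words come from different
versions and together contain no bits.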

Before your change, cpuset nodemask changes were serialized on
manage_mutex, which would, in turn, serialize the rebinding of each
attached task's mempolicy. update_nodemask() is now serialized on
cgroup_lock(), which also protects scan_for_empty_cpusets(), so the cpuset
code protects it adequately. If a task calls set_mempolicy() concurrently,
however, its mempolicy and mems_allowed could become inconsistent with
each other.

If we protect current->mems_allowed with a rwlock or seqlock for configs
where MAX_NUMNODES > BITS_PER_LONG, then we can always guarantee that we
get the entire nodemask. The same problem is present for
current->cpus_allowed, however, with NR_CPUS > BITS_PER_LONG. We must be
able to read both masks safely, with no chance of observing a mask for
which nodes_empty() or cpus_empty() is true.
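
As a rough illustration of the seqlock variant, here is a userspace
sketch using C11 atomics. It is not the kernel's seqlock_t API (which
provides read_seqbegin() and read_seqretry()); struct guarded_mask and
the guarded_mask_*() helpers are hypothetical names, MASK_WORDS = 2
assumes a mask wider than one long, and writers are assumed to be
serialized elsewhere, as update_nodemask() is by cgroup_lock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MASK_WORDS 2	/* assumption: the mask does not fit in one long */

/* Hypothetical stand-in for nodemask_t. */
struct mask {
	unsigned long bits[MASK_WORDS];
};

/* Hypothetical mask guarded by a sequence counter. */
struct guarded_mask {
	atomic_uint seq;	/* odd while a write is in progress */
	_Atomic unsigned long bits[MASK_WORDS];
};

/* Writer side. Callers are assumed to be serialized by an outer lock. */
static void guarded_mask_store(struct guarded_mask *g, const struct mask *src)
{
	unsigned int s = atomic_load_explicit(&g->seq, memory_order_relaxed);

	atomic_store_explicit(&g->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	for (size_t i = 0; i < MASK_WORDS; i++)
		atomic_store_explicit(&g->bits[i], src->bits[i],
				      memory_order_relaxed);
	atomic_store_explicit(&g->seq, s + 2, memory_order_release);
}

/* One read attempt; returns false if a writer was or became active. */
static bool guarded_mask_try_load(struct guarded_mask *g, struct mask *dst)
{
	unsigned int s1, s2;

	s1 = atomic_load_explicit(&g->seq, memory_order_acquire);
	if (s1 & 1)
		return false;
	for (size_t i = 0; i < MASK_WORDS; i++)
		dst->bits[i] = atomic_load_explicit(&g->bits[i],
						    memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	s2 = atomic_load_explicit(&g->seq, memory_order_relaxed);
	return s1 == s2;
}

/* Reader side: retry until every word comes from the same version. */
static void guarded_mask_load(struct guarded_mask *g, struct mask *dst)
{
	while (!guarded_mask_try_load(g, dst))
		;
}
```

Readers never block the writer in this scheme; they only retry. For
configs where the mask fits in a single word, the helper could collapse
to a plain load and store, which is the case I described earlier as
needing no extra synchronization.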