Re: [PATCH] cpuset: mm: Remove memory barrier damage from the page allocator

From: Mel Gorman
Date: Fri Mar 02 2012 - 12:43:51 EST


On Fri, Mar 02, 2012 at 10:19:55AM -0600, Christoph Lameter wrote:
> On Fri, 2 Mar 2012, Mel Gorman wrote:
>
> > diff --git a/include/linux/cpuset.h b/include/linux/cpuset.h
> > index e9eaec5..ba6d217 100644
> > --- a/include/linux/cpuset.h
> > +++ b/include/linux/cpuset.h
> > @@ -92,38 +92,25 @@ extern void cpuset_print_task_mems_allowed(struct task_struct *p);
> > * reading current mems_allowed and mempolicy in the fastpath must protected
> > * by get_mems_allowed()
> > */
> > -static inline void get_mems_allowed(void)
> > +static inline unsigned long get_mems_allowed(void)
> > {
> > - current->mems_allowed_change_disable++;
> > -
> > - /*
> > - * ensure that reading mems_allowed and mempolicy happens after the
> > - * update of ->mems_allowed_change_disable.
> > - *
> > - * the write-side task finds ->mems_allowed_change_disable is not 0,
> > - * and knows the read-side task is reading mems_allowed or mempolicy,
> > - * so it will clear old bits lazily.
> > - */
> > - smp_mb();
> > + return atomic_read(&current->mems_allowed_seq);
> > }
> >
> > -static inline void put_mems_allowed(void)
> > +/*
> > + * If this returns false, the operation that took place after get_mems_allowed
> > + * may have failed. It is up to the caller to retry the operation if
> > + * appropriate
> > + */
> > +static inline bool put_mems_allowed(unsigned long seq)
> > {
> > - /*
> > - * ensure that reading mems_allowed and mempolicy before reducing
> > - * mems_allowed_change_disable.
> > - *
> > - * the write-side task will know that the read-side task is still
> > - * reading mems_allowed or mempolicy, don't clears old bits in the
> > - * nodemask.
> > - */
> > - smp_mb();
> > - --ACCESS_ONCE(current->mems_allowed_change_disable);
> > + return seq == atomic_read(&current->mems_allowed_seq);
> > }
>
> Use seqlock instead of the counter? Seems that you are recoding much of
> what a seqlock does. A seqlock also allows you to have a writer that sort
> of blocks the readers if necessary.
>
>

I considered using a seqlock but it isn't cheap. The read side is heavy:
it can start spinning if a writer is in progress and it incurs a read
barrier (looking at read_seqbegin()). The retry check incurs another read
barrier, so basically it would be no better than what is there currently
(which, at a 4% performance hit, sucks).
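
To be concrete, a seqlock-based read side would look something like the
sketch below (the seqlock itself is hypothetical here, call it
mems_allowed_seqlock):

	unsigned int seq;

retry:
	/* spins while a writer holds the lock, then implies smp_rmb() */
	seq = read_seqbegin(&mems_allowed_seqlock);

	/* ... read mems_allowed / mempolicy and attempt the allocation ... */

	/* implies another smp_rmb() before comparing the sequence */
	if (read_seqretry(&mems_allowed_seqlock, seq))
		goto retry;

so every allocation pays for two read barriers even when no cpuset update
is in flight.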

With a seqlock, a reader backs off if a writer is in progress, but the
page allocator doesn't need that, which is why I felt it was ok to special
case. Instead, it tries to allocate a page while the update is in progress
and only takes special action if the allocation is about to fail.
Allocation failure is an unusual situation that can trigger application
exit or an OOM, so it's ok to treat it as a slow path. A normal seqlock
would retry unconditionally after a write, and a successful allocation
would then have to free the page before retrying, which is pointless.
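
In other words, the allocator side ends up looking something like this
(simplified; the allocation call and its arguments are only illustrative):

	unsigned long cpuset_mems_cookie;
	struct page *page;

retry_cpuset:
	cpuset_mems_cookie = get_mems_allowed();

	/* attempt the allocation with the nodemask we just sampled */
	page = get_page_from_freelist(gfp_mask, nodemask, order, zonelist,
				      high_zoneidx, alloc_flags,
				      preferred_zone, migratetype);

	/*
	 * A page in hand is always accepted, even if mems_allowed changed
	 * underneath us. Only retry when the allocation failed *and* the
	 * cookie no longer matches, i.e. the failure may be due to a
	 * parallel cpuset update.
	 */
	if (unlikely(!put_mems_allowed(cpuset_mems_cookie) && !page))
		goto retry_cpuset;

No page is ever freed just to retry, and the retry path only triggers when
an allocation was about to fail anyway.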

--
Mel Gorman
SUSE Labs