Re: [PATCH v2 1/1] mm: fix the theoretical compound_lock() vs prep_new_page() race

From: Oleg Nesterov
Date: Thu Jan 09 2014 - 14:45:34 EST


On 01/09, Andrea Arcangeli wrote:
>
> On Thu, Jan 09, 2014 at 03:04:47PM +0100, Oleg Nesterov wrote:
> > OK. Even if I am right, we can probably make another fix.
>
> I think the confusion here was to think this was related to the futex
> code, it isn't. This was just a generic theoretical problem found
> doing the futex cleanups but it's not related to the futex code.

Yes, yes, sure. I mentioned get_futex_key() just as an example.

> > put_compound_page() and __get_page_tail() can do yet another PageTail()
> > check _before_ compound_lock().
>
> The above alternate fix looks good to me too.
>
> Only thing to sort out is in the common code (not just x86) then we
> may need a smp_mb() between PageTail check and the bit_spin_lock... We
> just can't risk writing the bit_spin_lock before reading PageTail.

I do not think we need mb() in between. Other callers of compound_lock()
look fine, get/put(page_tail) can't see a false positive after a successful
get_page_unless_zero(), and it was recently documented that the kernel can
rely on the control dependency to serialize LOAD + STORE.

But we probably need barrier() in between; we can't use ACCESS_ONCE().
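
To make the ordering concrete, here is a rough userspace model of the
idea. It is only a sketch, not the mm/swap.c code: struct fake_page, the
FAKE_PG_* bits and all the fake_*() helpers are made up for illustration,
and C11 atomics stand in for the kernel's bitops. The only point is the
order: PageTail() check, compiler barrier, bit lock, then a recheck under
the lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits; these are NOT the kernel's page flag values. */
#define FAKE_PG_TAIL          (1UL << 0)
#define FAKE_PG_COMPOUND_LOCK (1UL << 1)

/* Compiler-only barrier, the same idea as the kernel's barrier(). */
#define fake_barrier() __asm__ __volatile__("" ::: "memory")

struct fake_page {
	atomic_ulong flags;
};

static bool fake_page_tail(struct fake_page *page)
{
	return (atomic_load_explicit(&page->flags, memory_order_relaxed) &
		FAKE_PG_TAIL) != 0;
}

/* Models bit_spin_lock() on the lock bit in ->flags. */
static void fake_compound_lock(struct fake_page *head)
{
	while (atomic_fetch_or_explicit(&head->flags, FAKE_PG_COMPOUND_LOCK,
					memory_order_acquire) &
	       FAKE_PG_COMPOUND_LOCK)
		; /* spin until the previous owner clears the bit */
}

static void fake_compound_unlock(struct fake_page *head)
{
	unsigned long old = atomic_fetch_and_explicit(&head->flags,
						      ~FAKE_PG_COMPOUND_LOCK,
						      memory_order_release);

	assert(old & FAKE_PG_COMPOUND_LOCK); /* must have been locked */
}

/*
 * Take the lock on @head only if @page still looks like a tail page.
 * Returns true with the lock held, false with head->flags untouched.
 */
static bool fake_lock_head_if_tail(struct fake_page *page,
				   struct fake_page *head)
{
	if (!fake_page_tail(page))
		return false;	/* never write to head->flags in this case */

	/* Keep the compiler from moving the lock before the check. */
	fake_barrier();

	fake_compound_lock(head);
	if (!fake_page_tail(page)) {	/* recheck under the lock */
		fake_compound_unlock(head);
		return false;
	}
	return true;
}
```

If the tail bit is already clear we never write to the head's flags at
all. The sketch assumes the control dependency between the load and the
store is enough for the CPU, so only the compiler has to be restrained.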

> And regardless of gup_fast, like Linus said, for increased NUMA
> fairness we could move the compound lock from page->flags to a hashed
> array of proper spinlocks sized as a function of RAM. The contention on
> these locks is so low that I doubt we can run into lock starvation,
> but because the contention is so low, the array would be fine as well,
> and it would be more theoretically correct for NUMA usages than the
> bit spinlock. So this problem also goes away if we convert the
> bit_spin_lock to a hashed array of spin_lock.

Yes. But in this case I really think we should clean up get/put first
and add the helper, like the patch I mentioned does.
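
For reference, the hashed array could look roughly like the sketch below.
Again this is a userspace model under assumptions: FAKE_NR_LOCKS, the
hash and the fake_*() names are invented, the table is fixed-size rather
than sized by RAM, and a trivial test-and-set lock stands in for a real
spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical table size (power of two); a real one would scale with RAM. */
#define FAKE_NR_LOCKS 64

/* Static atomics are zero-initialized, i.e. every lock starts unlocked. */
static atomic_bool fake_compound_locks[FAKE_NR_LOCKS];

/* Map a head page address to one lock in the table. */
static atomic_bool *fake_lock_for(const void *head)
{
	uintptr_t v = (uintptr_t)head;

	/* Cheap illustrative hash: mix a few address bits, then mask. */
	v = (v >> 6) ^ (v >> 12);
	return &fake_compound_locks[v & (FAKE_NR_LOCKS - 1)];
}

static void fake_hashed_lock(const void *head)
{
	atomic_bool *l = fake_lock_for(head);

	while (atomic_exchange_explicit(l, true, memory_order_acquire))
		; /* spin until the current owner releases */
}

static void fake_hashed_unlock(const void *head)
{
	atomic_bool *l = fake_lock_for(head);

	assert(atomic_load_explicit(l, memory_order_relaxed)); /* held */
	atomic_store_explicit(l, false, memory_order_release);
}
```

Since the lock word no longer lives in page->flags, a plain write to the
flags of a freshly allocated page can't clobber or be clobbered by it.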

> I personally prefer to keep the complexity in one place so adding to
> get/put_page

OK. I'll send v3.

> > Although personally I'd prefer this patch. And if we change get/put
> > I think it would be better to do this on top of
> >
> > "[PATCH -mm 6/7] mm: thp: introduce get_lock_thp_head()"
> > http://marc.info/?l=linux-kernel&m=138739438800899
>
> Not against the cleanups of course, but about the order, it gets
> harder to backport it for distros if applied after the cleanups.

Oh, I don't think this highly theoretical fix should be backported,
but I agree, let's fix the bug first.

Oleg.
