Re: [patch] mm: fix anon_vma races

From: Nick Piggin
Date: Fri Oct 17 2008 - 21:53:48 EST


On Sat, Oct 18, 2008 at 01:13:16AM +0100, Hugh Dickins wrote:
> On Fri, 17 Oct 2008, Linus Torvalds wrote:
> > would be more obvious in the place where we actually fetch that "anon_vma"
> > pointer again and actually dereference it.
> >
> > HOWEVER:
> >
> > - there are potentially multiple places that do that, and putting it in
> > the anon_vma_prepare() thing not only matches things with the
> > smp_wmb(), making that whole pairing much more obvious, but it also
> > means that we're guaranteed that any anon_vma user will have done the
> > smp_read_barrier_depends(), since they all have to do that prepare
> > thing anyway.
>
> No, it's not so that any anon_vma user would have done the
> smp_read_barrier_depends() placed in anon_vma_prepare().
>
> Anyone faulting in a page would have done it (swapoff? that
> assumes it's been done, let's not worry about it right now).
>
> But they're doing it to make the page's ptes accessible to
> memory reclaim, and the CPU doing memory reclaim will not
> (unless by coincidence) have done that anon_vma_prepare() -
> it's just reading the links which the faulters are providing.

Yes, that's a very important flaw you point out with the fix. Good
spotting.

Actually another thing I was staying awake thinking about was the
pairwise consistency problem. "Apparently" Linux is supposed to
support arbitrary pairwise consistency.

This means:

CPU0
    anon_vma->initialized = 1;
    smp_wmb();
    vma->anon_vma = anon_vma;

CPU1
    if (vma->anon_vma)
        page->anon_vma = vma->anon_vma;

CPU2
    if (page->anon_vma) {
        smp_read_barrier_depends();
        assert(page->anon_vma->initialized);
    }

The assertion may trigger because the store from CPU0 may not have
propagated to CPU2 before the store from CPU1 does.
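
For anyone who wants to poke at this outside the kernel, here is a rough
user-space sketch of the same three-CPU pattern. It assumes C11 atomics
and pthreads instead of kernel primitives, and every name in it
(the_anon_vma, vma_anon_vma, page_anon_vma, run_wrc_trial) is a
hypothetical stand-in, not kernel code. Note that CPU1 here uses
acquire/release, which the pseudocode above does not have, so this shows
the ordering we would like to rely on rather than what smp_wmb() and
smp_read_barrier_depends() alone provide.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel structure in the example. */
struct anon_vma { int initialized; };

static struct anon_vma the_anon_vma;
static _Atomic(struct anon_vma *) vma_anon_vma;   /* plays vma->anon_vma  */
static _Atomic(struct anon_vma *) page_anon_vma;  /* plays page->anon_vma */

/* CPU0: initialise the object, then publish the pointer to it. */
static void *cpu0(void *arg)
{
    (void)arg;
    the_anon_vma.initialized = 1;
    atomic_store_explicit(&vma_anon_vma, &the_anon_vma,
                          memory_order_release);
    return NULL;
}

/* CPU1: forward the pointer without touching the object itself. */
static void *cpu1(void *arg)
{
    (void)arg;
    struct anon_vma *av =
        atomic_load_explicit(&vma_anon_vma, memory_order_acquire);
    if (av)
        atomic_store_explicit(&page_anon_vma, av, memory_order_release);
    return NULL;
}

/* CPU2: follow the forwarded pointer; flag it if the object looks stale. */
static void *cpu2(void *arg)
{
    int *violation = arg;
    struct anon_vma *av =
        atomic_load_explicit(&page_anon_vma, memory_order_acquire);
    if (av && !av->initialized)
        *violation = 1;
    return NULL;
}

/* Run the three "CPUs" once; returns 1 if CPU2 saw an uninitialised object. */
int run_wrc_trial(void)
{
    pthread_t t[3];
    int violation = 0;
    int rc;

    /* Reset shared state before any thread exists. */
    the_anon_vma.initialized = 0;
    atomic_store(&vma_anon_vma, (struct anon_vma *)NULL);
    atomic_store(&page_anon_vma, (struct anon_vma *)NULL);

    rc = pthread_create(&t[0], NULL, cpu0, NULL);
    assert(rc == 0);
    rc = pthread_create(&t[1], NULL, cpu1, NULL);
    assert(rc == 0);
    rc = pthread_create(&t[2], NULL, cpu2, &violation);
    assert(rc == 0);

    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return violation;
}
```

Calling run_wrc_trial() in a loop from a small main() (built with
something like cc -std=c11 -pthread) should never report a violation,
because C11 release/acquire ordering is transitive across threads. That
transitivity is the causal guarantee in question; the sketch says
nothing about whether a given architecture provides it with the weaker
kernel barriers.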

But after thinking about this a bit more, I think Linux would be
broken all over the map under such ordering schemes. I think we'd
have to mandate causal consistency. Are there any architectures we
run on where this is not guaranteed? (I think recent clarifications
to x86 ordering give us causal consistency on that architecture.)

powerpc, ia64, alpha, sparc, arm, mips? (CCed linux-arch)

