Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the anon_vmas of a mergeable VMA

From: Borislav Petkov
Date: Sat Apr 10 2010 - 15:00:37 EST


From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Date: Sat, Apr 10, 2010 at 11:21:39AM -0700

> On Sat, 10 Apr 2010, Linus Torvalds wrote:
> > On Sat, 10 Apr 2010, Borislav Petkov wrote:
> > >
> > > And I got an oops again, this time the #GP from couple of days ago.
> >
> > Oh damn. So the list corruption really does happen still.
>
> Ho humm.
>
> Maybe I'm crazy, but something started bothering me. And I started
> wondering: when is the 'page->mapping' of an anonymous page actually
> cleared?
>
> The thing is, the mapping of an anonymous page is actually cleared only
> when the page is _freed_, in "free_hot_cold_page()".
>
> Now, let's think about that. And in particular, let's think about how that
> relates to the freeing of the 'anon_vma' that the page->mapping points to.
>
> The way the anon_vma is freed is when the mapping is torn down, and we do
> roughly:
>
> tlb = tlb_gather_mmu(mm,..)
> ..
> unmap_vmas(&tlb, vma ..
> ..
> free_pgtables()
> ..
> tlb_finish_mmu(tlb, start, end);
>
> and we actually unmap all the pages in "unmap_vmas()", and then _after_
> unmapping all the pages we do the "unlink_anon_vmas(vma);" in
> "free_pgtables()". Fine so far - the anon_vma stays around until after the
> page has been happily unmapped.
>
> But "unmapped all the pages" is _not_ actually the same as "free'd all the
> pages". The actual _freeing_ of the page happens generally in
> tlb_finish_mmu(), because we can free the page only after we've flushed
> any TLB entries.
>
> So what we have in that tlb_gather structure is a list of _pending_ pages
> to be freed, while we already actually free'd the anon_vmas earlier!
>
> Now, the thing is, tlb_gather_mmu() begins a preempt-safe region (because
> we use a per-cpu variable), but as far as I can tell it is _not_ an
> RCU-safe region.
>
> So I think we might actually get a real RCU freeing event while this all
> happens. So now the 'anon_vma' that 'page->mapping' points to has not just
> been released back to the SLUB caches, the page itself might have been
> released too.

So, if I understand you correctly, the anon_vma (with its list_head)
gets freed _before_ the page itself, and therefore we still see a set
but stale page->mapping in page_lock_anon_vma(). Maybe that explains
the funny patterns in %r13. But how do they come to exist when the
anon_vma is freed? Shouldn't there be LIST_POISON or something
recognizable?

Anyways, testing...

--
Regards/Gruss,
Boris.