Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the anon_vmas of a mergeable VMA

From: KOSAKI Motohiro
Date: Tue Apr 13 2010 - 06:37:01 EST


Hi Linus,

> On Sun, 11 Apr 2010, Rik van Riel wrote:
> >
> > Another thing I just thought of.
> >
> > The anon_vma struct will not be reused for something completely
> > different due to the SLAB_DESTROY_BY_RCU flag that the anon_vma_cachep
> > is created with.
>
> Rik, we _know_ it got re-used by something totally different. That's
> clearly the problem. The page->mapping pointer does _not_ point to an
> anon_vma any more. That's the problem here.
>
> What we need to figure out is how we have a page on the LRU list that is
> still marked as 'mapped' that has that stale mapping pointer.
>
> I can easily see how the stale mapping pointer happens for a non-mapped
> page. That part is trivial. Here's a simple case:
>
> - vmscan does that whole "isolate LRU pages", and one of them is a (at
> that time mapped) anonymous page. It's now not on any LRU lists at all.
>
> - vmscan ends up waiting for pageout and/or writeback while holding that
> list of pages.
>
> - in the meantime, the process that had the page exits or unmaps,
> unmapping the page and freeing the vma and the anon_vma.
>
> - vmscan eventually gets to the page, and does that page_referenced()
> dance. page->mapping points to something that is long long gone (as in
> "IO access lifetimes", so we're talking something that has been freed
> literally milliseconds ago, rather than any RCU delays)
>
> So I can see the stale page->mapping pointer happening. That part is even
> trivial. What I don't see is how the page would be still marked 'mapped'.
> Everything that actually frees the vma/anon_vmas should also have
> unmapped the page before that - even if it didn't _free_ the page.

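Sorry, I have lost track of what is being discussed in this very long thread.
IIUC, if the anon_vma that page->mapping points to was freed milliseconds
ago, the page is no longer mapped, so the page_mapped() check marked (1)
in the code that follows fails, we jump to out:, and we never dereference
page->mapping at all.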

Am I missing something?


===================================================================
struct anon_vma *page_lock_anon_vma(struct page *page)
{
	struct anon_vma *anon_vma;
	unsigned long anon_mapping;

	rcu_read_lock();
	anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping);
	if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
		goto out;
	if (!page_mapped(page))		/* (1) here */
		goto out;

	anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
	spin_lock(&anon_vma->lock);
	return anon_vma;
out:
	rcu_read_unlock();
	return NULL;
}
=================================================


Also, I think the following patch of yours is incorrect.
The added page_mapped() check runs after spin_lock(&anon_vma->lock),
which makes it a check-after-dereference. Such a check cannot prevent
an invalid pointer dereference, I think.
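
To make the ordering concern concrete, here is a minimal userspace sketch.
It is only an illustration: the toy_* types, the counter, and both helper
functions are hypothetical stand-ins, not the real struct page, struct
anon_vma, or rmap code, and the sketch deliberately ignores RCU and
SLAB_DESTROY_BY_RCU. It only shows that a mapped check placed after the
lock cannot stop the lock itself from touching the pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's struct anon_vma / struct page. */
struct toy_anon_vma {
	int locked;
};

struct toy_page {
	struct toy_anon_vma *mapping;	/* may be stale once mapcount is 0 */
	int mapcount;
};

/* Counts lock attempts; each one dereferences the anon_vma pointer. */
static int toy_lock_calls;

static bool toy_page_mapped(const struct toy_page *page)
{
	return page->mapcount > 0;
}

static void toy_lock(struct toy_anon_vma *av)
{
	assert(av != NULL);
	toy_lock_calls++;
	av->locked = 1;		/* this write is the dereference */
}

static void toy_unlock(struct toy_anon_vma *av)
{
	av->locked = 0;
}

/* Check before dereference: an unmapped page never reaches toy_lock(). */
static struct toy_anon_vma *toy_lock_check_first(struct toy_page *page)
{
	if (!toy_page_mapped(page))
		return NULL;
	toy_lock(page->mapping);
	return page->mapping;
}

/* Check after dereference: the pointer is touched before the check runs. */
static struct toy_anon_vma *toy_lock_check_after(struct toy_page *page)
{
	struct toy_anon_vma *av = page->mapping;

	toy_lock(av);
	if (toy_page_mapped(page))
		return av;
	toy_unlock(av);
	return NULL;
}
```

With an unmapped page, toy_lock_check_first() returns NULL without calling
toy_lock(), while toy_lock_check_after() also returns NULL but has already
written through the pointer by then.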

Perhaps I'm missing something. I need to reread this whole thread from
the beginning.

---
diff --git a/mm/rmap.c b/mm/rmap.c
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -302,7 +302,11 @@ struct anon_vma *page_lock_anon_vma(struct page *page)

 	anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
 	spin_lock(&anon_vma->lock);
-	return anon_vma;
+
+	if (page_mapped(page))
+		return anon_vma;
+
+	spin_unlock(&anon_vma->lock);
 out:
 	rcu_read_unlock();
 	return NULL;






