Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the anon_vmasof a mergeable VMA

From: Linus Torvalds
Date: Sun Apr 11 2010 - 13:20:52 EST




On Sun, 11 Apr 2010, Linus Torvalds wrote:
>
> But I think the bug you see might be exactly the race between
> page_mapped() and actually getting the anon_vma spinlock. I'd have
> expected that window to be too small to ever hit, though, which is why I
> find it a bit unlikely. But it would explain why you _sometimes_
> actually get a hung spinlock too - you never get the spinlock at all,
> and somebody replaced the data with something that the spinlock code
> thinks is a locked spinlock - but is no longer a spinlock at all ]

Actually, so if it's that race, then we might get rid of the oops with
this total hack.

NOTE! If this is the race, then the hack really is just a hack, because it
doesn't really solve anything. We still take the spinlock, and if bad
things has happened, _that_ can still very much fail, and you get the
watchdog lockup message instead. So this doesn't really fix anything.

But if this patch changes behavior, and you no longer see the oops, that
tells us _something_. I'm not sure how useful that "something" is, but it
at least means that there are no _mapped_ pages that have that stale
anon_vma pointer in page->mapping.

Conversely, if you still see the oops (rather than the watchdog), that
means that we actually have pages that are still marked mapped, and that
despite that mapped state have a stale page->mapping pointer. I actually
find that the more likely case, because otherwise the window is _so_ small
that I don't see how you can hit the oops so reliably.

Anyway - probably worth testing, along with the verify_vma() patch. If
nothing else, if there is no new behavior, even that tells us something.
Even if that "something" is not a huge piece of information.

Linus

---
diff --git a/mm/rmap.c b/mm/rmap.c
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -302,7 +302,11 @@ struct anon_vma *page_lock_anon_vma(struct page *page)

anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
spin_lock(&anon_vma->lock);
- return anon_vma;
+
+ if (page_mapped(page))
+ return anon_vma;
+
+ spin_unlock(&anon_vma->lock);
out:
rcu_read_unlock();
return NULL;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/