[PATCH] anon_vma list locking bug

From: Hugh Dickins
Date: Fri Jun 25 2004 - 17:37:32 EST


Vladimir Saveliev reported anon_vma_unlink list_del BUG (LKML 24 June).
His testing is still in progress, but we believe it comes from a nasty
locking deficiency I introduced in 2.6.7's anon_vma_prepare.

Andrea's original anon_vma_prepare was fine, it needed no anon_vma lock
because it was always linking a freshly allocated structure; but my
find_mergeable enhancement let it adopt a neighbouring anon_vma, which
of course needs locking against a racing linkage from another mm -
which the earlier adjust_vma fix seems to have made more likely.

Does anon_vma->lock nest inside or outside page_table_lock? Inside, but
that's not obvious without a lock ordering list: instead of listing the
order here, update the list in filemap.c; but a separate patch because
that's less urgent and more likely to get wrong or provoke controversy.

(Could do it with anon_vma lock after dropping page_table_lock, but
a long comment explaining why some code is safe suggests it's not.)

Signed-off-by: Hugh Dickins <hugh@xxxxxxxxxxx>

--- 2.6.7/mm/rmap.c 2004-06-16 06:21:02.000000000 +0100
+++ linux/mm/rmap.c 2004-06-25 22:15:41.164841904 +0100
@@ -18,14 +18,11 @@
*/

/*
- * Locking:
- * - the page->mapcount field is protected by the PG_maplock bit,
- * which nests within the mm->page_table_lock,
- * which nests within the page lock.
- * - because swapout locking is opposite to the locking order
- * in the page fault path, the swapout path uses trylocks
- * on the mm->page_table_lock
+ * Locking: see "Lock ordering" summary in filemap.c.
+ * In swapout, page_map_lock is held on entry to page_referenced and
+ * try_to_unmap, so they trylock for i_mmap_lock and page_table_lock.
*/
+
#include <linux/mm.h>
#include <linux/pagemap.h>
#include <linux/swap.h>
@@ -79,8 +76,12 @@ int anon_vma_prepare(struct vm_area_stru
/* page_table_lock to protect against threads */
spin_lock(&mm->page_table_lock);
if (likely(!vma->anon_vma)) {
+ if (!allocated)
+ spin_lock(&anon_vma->lock);
vma->anon_vma = anon_vma;
list_add(&vma->anon_vma_node, &anon_vma->head);
+ if (!allocated)
+ spin_unlock(&anon_vma->lock);
allocated = NULL;
}
spin_unlock(&mm->page_table_lock);

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/