When isolating pages for migration, migration starts at the start of a[...]
zone while the free scanner starts at the end of the zone. Migration
avoids entering a new zone by never going beyond the free scanned.
Unfortunately, in very rare cases nodes can overlap. When this happens,
migration isolates pages without the LRU lock held, corrupting lists
which will trigger errors in reclaim or during page free such as in the
following oops
The fix is straight-forward. isolate_migratepages() has to make a
similar check to isolate_freepage to ensure that it never isolates
pages from a zone it does not hold the LRU lock for.
This was discovered in a 3.0-based kernel but it affects 3.1.x, 3.2.x
and current mainline.
Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
Cc: <stable@xxxxxxxxxxxxxxx>
---
mm/compaction.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/mm/compaction.c b/mm/compaction.c
index bd6e739..6042644 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -330,8 +330,17 @@ static isolate_migrate_t isolate_migratepages(struct zone *zone,
continue;
nr_scanned++;
- /* Get the page and skip if free */
+ /*
+ * Get the page and ensure the page is within the same zone.
+ * See the comment in isolate_freepages about overlapping
+ * nodes. It is deliberate that the new zone lock is not taken
+ * as memory compaction should not move pages between nodes.
+ */
page = pfn_to_page(low_pfn);
+ if (page_zone(page) != zone)
+ continue;
+
+ /* Skip if free */
if (PageBuddy(page))
continue;