Re: [PATCH 2/8] mm: alloc_contig_freed_pages() added

From: Michal Nazarewicz
Date: Wed Sep 21 2011 - 09:17:57 EST


On Thu, 08 Sep 2011 20:05:52 +0200, Dave Hansen <dave@xxxxxxxxxxxxxxxxxx> wrote:

On Fri, 2011-08-19 at 16:27 +0200, Marek Szyprowski wrote:
+unsigned long alloc_contig_freed_pages(unsigned long start, unsigned long end,
+ gfp_t flag)
+{
+ unsigned long pfn = start, count;
+ struct page *page;
+ struct zone *zone;
+ int order;
+
+ VM_BUG_ON(!pfn_valid(start));
+ zone = page_zone(pfn_to_page(start));

This implies that start->end are entirely contained in a single zone.
What enforces that?

In case of CMA, the __cma_activate_area() function from 6/8 has the check:

151 VM_BUG_ON(!pfn_valid(pfn));
152 VM_BUG_ON(page_zone(pfn_to_page(pfn)) != zone);

This guarantees that CMA will never try to call alloc_contig_freed_pages()
on a region that spans multiple regions.

If some higher layer enforces that, I think we probably need at least
a VM_BUG_ON() in here and a comment about who enforces it.

Agreed.

+ spin_lock_irq(&zone->lock);
+
+ page = pfn_to_page(pfn);
+ for (;;) {
+ VM_BUG_ON(page_count(page) || !PageBuddy(page));
+ list_del(&page->lru);
+ order = page_order(page);
+ zone->free_area[order].nr_free--;
+ rmv_page_order(page);
+ __mod_zone_page_state(zone, NR_FREE_PAGES, -(1UL << order));
+ pfn += 1 << order;
+ if (pfn >= end)
+ break;
+ VM_BUG_ON(!pfn_valid(pfn));
+ page += 1 << order;
+ }

This 'struct page *'++ stuff is OK, but only for small, aligned areas.
For at least some of the sparsemem modes (non-VMEMMAP), you could walk
off of the end of the section_mem_map[] when you cross a MAX_ORDER
boundary. I'd feel a little bit more comfortable if pfn_to_page() was
being done each time, or only occasionally when you cross a section
boundary.

I'm fine with that. I've used pointer arithmetic for performance reasons
but if that may potentially lead to bugs then obviously pfn_to_page() should
be used.

--
Best regards, _ _
.o. | Liege of Serenely Enlightened Majesty of o' \,=./ `o
..o | Computer Science, Michal "mina86" Nazarewicz (o o)
ooo +-----<email/xmpp: mnazarewicz@xxxxxxxxxx>-----ooO--(_)--Ooo--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/