Re: [PATCH v14 4/5] mm: support reporting free page blocks

From: Michal Hocko
Date: Mon Aug 21 2017 - 02:18:22 EST


On Fri 18-08-17 20:23:05, Michael S. Tsirkin wrote:
> On Thu, Aug 17, 2017 at 11:26:55AM +0800, Wei Wang wrote:
[...]
> > +void walk_free_mem_block(void *opaque1,
> > + unsigned int min_order,
> > + void (*visit)(void *opaque2,
>
> You can just avoid opaque2 completely I think, then opaque1 can
> be renamed opaque.
>
> > + unsigned long pfn,
> > + unsigned long nr_pages))
> > +{
> > + struct zone *zone;
> > + struct page *page;
> > + struct list_head *list;
> > + unsigned int order;
> > + enum migratetype mt;
> > + unsigned long pfn, flags;
> > +
> > + for_each_populated_zone(zone) {
> > + for (order = MAX_ORDER - 1;
> > + order < MAX_ORDER && order >= min_order; order--) {
> > + for (mt = 0; mt < MIGRATE_TYPES; mt++) {
> > + spin_lock_irqsave(&zone->lock, flags);
> > + list = &zone->free_area[order].free_list[mt];
> > + list_for_each_entry(page, list, lru) {
> > + pfn = page_to_pfn(page);
> > + visit(opaque1, pfn, 1 << order);
>
> My only concern here is inability of callback to
> 1. break out of list
> 2. remove page from the list

As I've said before, this has to be a read-only API. You cannot simply
fiddle with the page allocator internals under its feet.

> So I would make the callback bool, and I would use
> list_for_each_entry_safe.

If the bool return value tells the walker to break out of the loop, then
I agree. That sounds useful.
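
For illustration only, here is a rough userspace sketch of a read-only
walk whose callback returns a bool to stop early. It is not the patch's
code: every toy_* name is made up, the free lists are modelled as plain
singly linked lists, and the zone->lock handling from the quoted hunk is
left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ORDER 4	/* hypothetical stand-in for MAX_ORDER */

/* Toy stand-in for one free block sitting on a per-order free list. */
struct toy_block {
	unsigned long pfn;
	struct toy_block *next;
};

/* Toy stand-in for zone->free_area[]: one list per order. */
struct toy_zone {
	struct toy_block *free_list[TOY_MAX_ORDER];
};

/* Callback contract (assumed): return true to continue, false to stop. */
typedef bool (*toy_visit_fn)(void *opaque, unsigned long pfn,
			     unsigned long nr_pages);

/*
 * Read-only walk from the highest order down to min_order.  The lists
 * are only traversed, never modified.  Returns true if the walk ran to
 * completion, false if the callback asked to stop.
 */
static bool toy_walk_free_blocks(const struct toy_zone *zone, void *opaque,
				 unsigned int min_order, toy_visit_fn visit)
{
	const struct toy_block *blk;
	unsigned int order;

	/* order is unsigned, so "order < TOY_MAX_ORDER" catches the wrap. */
	for (order = TOY_MAX_ORDER - 1;
	     order < TOY_MAX_ORDER && order >= min_order; order--) {
		for (blk = zone->free_list[order]; blk; blk = blk->next) {
			if (!visit(opaque, blk->pfn, 1UL << order))
				return false;
		}
	}
	return true;
}

/* Example callback state: stop once a page budget has been reached. */
struct toy_budget {
	unsigned long seen;
	unsigned long limit;
};

static bool toy_count_until(void *opaque, unsigned long pfn,
			    unsigned long nr_pages)
{
	struct toy_budget *b = opaque;

	(void)pfn;		/* unused in this example */
	b->seen += nr_pages;
	return b->seen < b->limit;
}
```

A caller would fill in a struct toy_budget, pass it as opaque together
with toy_count_until, and check the return value to see whether the walk
was cut short. Since the walker never unlinks anything, a plain
(non-_safe) iteration is enough in this model.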
--
Michal Hocko
SUSE Labs