Re: [PATCH 2/2] Make is_mem_section_removable more conformable withofflining code

From: KAMEZAWA Hiroyuki
Date: Fri Sep 03 2010 - 06:10:31 EST


On Fri, 3 Sep 2010 11:50:49 +0200
Michal Hocko <mhocko@xxxxxxx> wrote:

> On Fri 03-09-10 18:13:27, KAMEZAWA Hiroyuki wrote:
> > On Fri, 3 Sep 2010 10:25:58 +0200
> > Michal Hocko <mhocko@xxxxxxx> wrote:
> >
> > > On Fri 03-09-10 12:14:52, KAMEZAWA Hiroyuki wrote:
> > > [...]
> [...]
> > > Cannot ZONE_MOVABLE contain different MIGRATE_types?
> > >
> > never.
>
> Then I am terribly missing something. Zone contains free lists for
> different MIGRATE_TYPES, doesn't it? Pages allocated from those free
> lists keep the migration type of the list, right?
>
> ZONE_MOVABLE just says whether it makes sense to move pages in that zone
> at all, right?
>

ZONE_MOVABLE means "it only contains MOVABLE pages."
So, we can ignore migrate-type.





> >
> > > > +
> > > > + pfn = page_to_pfn(page);
> > > > + for (found = 0, iter = 0; iter < pageblock_nr_pages; iter++) {
> > > > + unsigned long check = pfn + iter;
> > > > +
> > > > + if (!pfn_valid_within(check)) {
> > > > + iter++;
> > > > + continue;
> > > > + }
> > > > + page = pfn_to_page(check);
> > > > + if (!page_count(page)) {
> > > > + if (PageBuddy(page))
> > >
> > > Why do you check page_count as well? PageBuddy has alwyas count==0,
> > > right?
> > >
> >
> > But PageBuddy() flag is considered to be valid only when page_count()==0.
> > This is for safe handling.
>
> OK. I don't see that documented anywhere but it makes sense. Anyway
> there are some places which don't do this test (e.g.
> isolate_freepages_block, suitable_migration_target, etc.).
>
> >
> >
> > > > + iter += (1 << page_order(page)) - 1;
> > > > + continue;
> > > > + }
> > > > + if (!PageLRU(page))
> > > > + found++;
> > > > + /*
> > > > + * If the page is not RAM, page_count()should be 0.
> > > > + * we don't need more check. This is an _used_ not-movable page.
> > > > + *
> > > > + * The problematic thing here is PG_reserved pages. But if
> > > > + * a PG_reserved page is _used_ (at boot), page_count > 1.
> > > > + * But...is there PG_reserved && page_count(page)==0 page ?
> > >
> > > Can we have PG_reserved && PG_lru?
> >
> > I think never.
> >
> > > I also quite don't understand the comment.
> >
> > There an issue that "remove an memory section which includes memory hole".
> > Then,
> >
> > a page used by bootmem .... PG_reserved.
> > a page of memory hole .... PG_reserved.
> >
> > We need to call page_is_ram() or some for handling this mess.
>
> OK, I see.
>
> >
> >
> > > At this place we are sure that the page is valid and neither
> > > free nor LRU.
> > >
> [...]
> > > > +bool is_pageblock_removable(struct page *page)
> > > > +{
> > > > + struct zone *zone = page_zone(page);
> > > > + unsigned long flags;
> > > > + int num;
> > > > +
> > > > + spin_lock_irqsave(&zone->lock, flags);
> > > > + num = __count_unmovable_pages(zone, page);
> > > > + spin_unlock_irqrestore(&zone->lock, flags);
> > >
> > > Isn't this a problem? The function is triggered from userspace by sysfs
> > > (0444 file) and holds the lock for pageblock_nr_pages. So someone can
> > > simply read the file and block the zone->lock preventing/delaying
> > > allocations for the rest of the system.
> > >
> > But we need to take this. Maybe no panic you'll see even if no-lock.
>
> Yes, I think that this can only lead to a false possitive in sysfs
> interface. Isolating code holds the lock.
>

ok, let's go step by step.

I'm ok that your new patch to be merged. I'll post some clean up and small
bugfix (not related to your patch), later.
(I'll be very busy in this weekend, sorry.)


Thanks,
-Kame

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/