Re: [HMM v13 03/18] mm/ZONE_DEVICE/free_hot_cold_page: catch ZONE_DEVICE pages

From: Jerome Glisse
Date: Mon Nov 21 2016 - 07:50:37 EST


On Mon, Nov 21, 2016 at 01:48:26PM +0530, Anshuman Khandual wrote:
> On 11/18/2016 11:48 PM, Jérôme Glisse wrote:
> > Catch pages from ZONE_DEVICE in free_hot_cold_page(). This should never
> > happen, as ZONE_DEVICE pages must always have an elevated refcount.
> >
> > This is to catch refcounting issues in a sane way for ZONE_DEVICE pages.
> >
> > Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
> > Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
> > Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
> > ---
> > mm/page_alloc.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 0fbfead..09b2630 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -2435,6 +2435,16 @@ void free_hot_cold_page(struct page *page, bool cold)
> >  	unsigned long pfn = page_to_pfn(page);
> >  	int migratetype;
> >
> > +	/*
> > +	 * This should never happen ! Page from ZONE_DEVICE always must have an
> > +	 * active refcount. Complain about it and try to restore the refcount.
> > +	 */
> > +	if (is_zone_device_page(page)) {
> > +		VM_BUG_ON_PAGE(is_zone_device_page(page), page);
> > +		page_ref_inc(page);
> > +		return;
> > +	}
>
> This fixes an issue in the existing ZONE_DEVICE code; shouldn't this
> patch be sent separately rather than in this series?
>

Well, this is more of a safety-net feature; I can send it separately from the
series. It is not so much a fix for an existing issue as a trap to catch bugs.
I had refcounting bugs while working on this patchset, and having this safety
net helped me pinpoint them quickly.
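
To illustrate the idea, here is a minimal userspace sketch of the trap, not
kernel code. The names struct fake_page, fake_free_page(), fake_put_page() and
trap_hits are hypothetical and exist only for this example; the real patch
uses is_zone_device_page(), VM_BUG_ON_PAGE() and page_ref_inc() as quoted
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct page; not the kernel's definition. */
struct fake_page {
	int refcount;
	bool zone_device;	/* models is_zone_device_page() */
};

/* Counts how many times the trap fired (assumption, for illustration). */
static int trap_hits;

/*
 * Models the free path. Returns true if the page was really freed,
 * false if the trap caught a device page and restored its refcount.
 */
static bool fake_free_page(struct fake_page *page)
{
	assert(page != NULL);

	if (page->zone_device) {
		/* Complain loudly, then undo the bogus final put. */
		fprintf(stderr, "BUG: freeing device page %p\n", (void *)page);
		trap_hits++;
		page->refcount++;
		return false;
	}
	return true;
}

/* Models dropping a reference; the last put reaches the free path. */
static bool fake_put_page(struct fake_page *page)
{
	assert(page != NULL);

	if (--page->refcount == 0)
		return fake_free_page(page);
	return false;
}
```

With this model, an unbalanced put on a device page is reported at the point
where the refcount hits zero, and the page is left with a refcount of one
instead of being handed to the allocator.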

Cheers,
Jérôme