Re: [HMM v13 03/18] mm/ZONE_DEVICE/free_hot_cold_page: catch ZONE_DEVICE pages

From: Anshuman Khandual
Date: Mon Nov 21 2016 - 23:31:16 EST


On 11/21/2016 06:20 PM, Jerome Glisse wrote:
> On Mon, Nov 21, 2016 at 01:48:26PM +0530, Anshuman Khandual wrote:
>> On 11/18/2016 11:48 PM, Jérôme Glisse wrote:
>>> Catch pages from ZONE_DEVICE in free_hot_cold_page(). This should never
>>> happen, as a ZONE_DEVICE page must always have an elevated refcount.
>>>
>>> This is to catch refcounting issues in a sane way for ZONE_DEVICE pages.
>>>
>>> Signed-off-by: Jérôme Glisse <jglisse@xxxxxxxxxx>
>>> Cc: Dan Williams <dan.j.williams@xxxxxxxxx>
>>> Cc: Ross Zwisler <ross.zwisler@xxxxxxxxxxxxxxx>
>>> ---
>>> mm/page_alloc.c | 10 ++++++++++
>>> 1 file changed, 10 insertions(+)
>>>
>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>> index 0fbfead..09b2630 100644
>>> --- a/mm/page_alloc.c
>>> +++ b/mm/page_alloc.c
>>> @@ -2435,6 +2435,16 @@ void free_hot_cold_page(struct page *page, bool cold)
>>>  	unsigned long pfn = page_to_pfn(page);
>>>  	int migratetype;
>>>
>>> +	/*
>>> +	 * This should never happen ! Page from ZONE_DEVICE always must have an
>>> +	 * active refcount. Complain about it and try to restore the refcount.
>>> +	 */
>>> +	if (is_zone_device_page(page)) {
>>> +		VM_BUG_ON_PAGE(is_zone_device_page(page), page);
>>> +		page_ref_inc(page);
>>> +		return;
>>> +	}
>>
>> This fixes an issue in the existing ZONE_DEVICE code. Shouldn't this
>> patch be sent separately rather than as part of this series?
>>
>
> Well, this is more of a safety-net feature; I can send it separately from
> the series. It is not a fix for an issue per se, but rather a trap to catch
> bugs. I had refcounting bugs while working on this patchset, and having this
> safety net was helpful to quickly pinpoint issues.

Sure. At the least, move them up in the series as ZONE_DEVICE preparatory
fixes, before expanding the ZONE_DEVICE framework to accommodate the new
un-addressable memory representation.
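
For illustration only, the safety-net idea discussed in this thread can be
sketched in plain userspace C. This is not the kernel code: struct mock_page,
mock_free_page(), mock_put_page() and the is_device flag are hypothetical
stand-ins for struct page, free_hot_cold_page(), put_page() and
is_zone_device_page(). The sketch assumes a device page must never reach the
free path, so the free path complains and restores the dropped reference
instead of freeing.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct mock_page {
	int refcount;     /* number of active references */
	bool is_device;   /* true if the page models a ZONE_DEVICE page */
	bool on_freelist; /* set once the page is handed back to the allocator */
};

/*
 * Mock free path. Returns true if the page was freed, false if the
 * safety net caught a device page and refused to free it.
 */
static bool mock_free_page(struct mock_page *page)
{
	if (page->is_device) {
		/* Complain, then restore the reference that was wrongly dropped. */
		fprintf(stderr, "BUG: freeing device page, refcount=%d\n",
			page->refcount);
		page->refcount++;
		return false;
	}
	page->on_freelist = true;
	return true;
}

/* Drop one reference; the page enters the free path when it hits zero. */
static bool mock_put_page(struct mock_page *page)
{
	if (--page->refcount == 0)
		return mock_free_page(page);
	return false;
}
```

Calling mock_put_page() on a device page that holds a single reference takes
the trap branch: a message is printed, the refcount goes back to 1, and the
page never lands on the free list. A regular page with one reference is freed
as usual.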