Re: [PATCH v3 01/12] mm: dump_page(): better diagnostics for compound pages

From: John Hubbard
Date: Mon Feb 03 2020 - 14:51:16 EST


On 2/3/20 5:16 AM, Kirill A. Shutemov wrote:
> On Fri, Jan 31, 2020 at 07:40:18PM -0800, John Hubbard wrote:
>> A compound page collects the refcount in the head page, while leaving
>> the refcount of each tail page at zero. Therefore, when debugging a
>> problem that involves compound pages, it's best to have diagnostics that
>> reflect that situation. However, dump_page() is oblivious to these
>> points.
>>
>> Change dump_page() as follows:
>>
>> 1) For tail pages, print relevant head page information: refcount, in
>> particular. But only do this if the page is not corrupted so badly
>> that the pointer to the head page is all wrong.
>>
>> 2) Do a separate check to catch any (rare) cases of the tail page's
>> refcount being non-zero, and issue a separate, clear pr_warn() if
>> that ever happens.
>>
>> Suggested-by: Matthew Wilcox <willy@xxxxxxxxxxxxx>
>> Suggested-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
>> Signed-off-by: John Hubbard <jhubbard@xxxxxxxxxx>
>
> Acked-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>

Thanks for looking through all of these!

>
> A few nit-picks below.
>
>> ---
>> mm/debug.c | 34 ++++++++++++++++++++++++++++------
>> 1 file changed, 28 insertions(+), 6 deletions(-)
>>
>> diff --git a/mm/debug.c b/mm/debug.c
>> index ecccd9f17801..beb1c59d784b 100644
>> --- a/mm/debug.c
>> +++ b/mm/debug.c
>> @@ -42,6 +42,32 @@ const struct trace_print_flags vmaflag_names[] = {
>> {0, NULL}
>> };
>>
>> +static void __dump_tail_page(struct page *page, int mapcount)
>> +{
>> +	struct page *head = compound_head(page);
>> +
>> +	if ((page < head) || (page >= head + MAX_ORDER_NR_PAGES)) {
>
> I'm not sure if we want to use compound_nr() here instead of
> MAX_ORDER_NR_PAGES. Do you have any reasoning about it?


Yes: compound_nr(page) reads from the struct page, whereas MAX_ORDER_NR_PAGES
is an independent, immutable limit. When checking a struct page for corruption,
it's best to avoid relying on data stored within that struct page, which is
exactly what compound_nr() would have to do.
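
To illustrate the difference, here is a minimal userspace sketch, not kernel
code. Everything in it is a hypothetical stand-in: struct fake_page models
struct page, FAKE_MAX_ORDER_NR_PAGES models MAX_ORDER_NR_PAGES, and the nr
field models whatever compound_nr() would read back out of the page. It
compares a range check bounded by a constant against one bounded by a
value stored in the (possibly corrupted) structure itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MAX_ORDER_NR_PAGES: fixed at compile time. */
#define FAKE_MAX_ORDER_NR_PAGES 1024

/* Hypothetical stand-in for struct page. */
struct fake_page {
	struct fake_page *head;	/* models the compound head pointer */
	unsigned long nr;	/* models the size compound_nr() would report */
};

/* Distance from head to page, in pages; false if page lies before head. */
static bool tail_offset(const struct fake_page *page, size_t *off)
{
	uintptr_t p, h;

	assert(page != NULL && page->head != NULL);
	p = (uintptr_t)page;
	h = (uintptr_t)page->head;
	if (p < h)
		return false;
	*off = (p - h) / sizeof(struct fake_page);
	return true;
}

/* Bound comes from a constant: corruption in the pages cannot move it. */
static bool in_range_const(const struct fake_page *page)
{
	size_t off;

	return tail_offset(page, &off) && off < FAKE_MAX_ORDER_NR_PAGES;
}

/* Bound comes from the head page: only as trustworthy as that data. */
static bool in_range_trusting(const struct fake_page *page)
{
	size_t off;

	return tail_offset(page, &off) && off < page->head->nr;
}
```

With an intact head (nr == 8) and a tail three pages in, both checks agree.
If the size field is overwritten with a huge value and a page 3000 entries
away claims that head, only the constant-bounded check still rejects it.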


>
>> +		/*
>> +		 * Page is hopelessly corrupted, so limit any reporting to
>> +		 * information about the page itself. Do not attempt to look at
>> +		 * the head page.
>> +		 */
>> +		pr_warn("page:%px refcount:%d mapcount:%d mapping:%px "
>> +			"index:%#lx (corrupted tail page case)\n",
>> +			page, page_ref_count(page), mapcount, page->mapping,
>> +			page_to_pgoff(page));
>> +	} else {
>> +		pr_warn("page:%px compound refcount:%d mapcount:%d mapping:%px "
>> +			"index:%#lx compound_mapcount:%d\n",
>> +			page, page_ref_count(head), mapcount, head->mapping,
>> +			page_to_pgoff(head), compound_mapcount(page));
>> +	}
>> +
>> +	if (page_ref_count(page) != 0)
>> +		pr_warn("page:%px PROBLEM: non-zero refcount (==%d) on this "
>> +			"tail page\n", page, page_ref_count(page));
>
> Wrap into {}, please.


Fixed, thanks.


thanks,
--
John Hubbard
NVIDIA