Re: [PATCH] mm: be more verbose for alloc_contig_range faliures

From: Minchan Kim
Date: Mon Mar 08 2021 - 10:59:17 EST


On Mon, Mar 08, 2021 at 04:42:43PM +0100, Michal Hocko wrote:
> On Mon 08-03-21 15:13:35, David Hildenbrand wrote:
> > On 08.03.21 15:11, Michal Hocko wrote:
> > > On Mon 08-03-21 14:22:12, David Hildenbrand wrote:
> > > > On 08.03.21 13:49, Michal Hocko wrote:
> > > [...]
> > > > > Earlier in the discussion I have suggested dynamic debugging facility.
> > > > > Documentation/admin-guide/dynamic-debug-howto.rst. Have you tried to
> > > > > look into that direction?
> > > >
> > > > Did you see the previous mail this is based on:
> > > >
> > > > https://lkml.kernel.org/r/YEEUq8ZRn4WyYWVx@xxxxxxxxxx
> > > >
> > > > I agree that "nofail" is misleading. Rather something like
> > > > "dump_on_failure", just a better name :)
> > >
> > > Yeah, I have read through the email thread. I just do not get why we
> > > cannot make it pr_debug() and add -DDYNAMIC_DEBUG_MODULE for
> > > page_alloc.c (I haven't checked whether that is possible for built in
> > > compile units, maybe it is not but from a quick look it seems it should).
> > >
> > > I really do not like this to be a part of the API. alloc_contig_range is
> >
> > Which API?
>
> Any level of the alloc_contig_range api because I strongly suspect that
> once there is something on the lower levels there will be a push to have
> it in the directly consumed api as well. Besides that I think this is
> just a wrong way to approach the problem.
>
> > It does not affect alloc_contig_range() itself, it's used
> > internally only. Sure, we could simply pr_debug() for each and every
> > migration failure. As long as it's default-disabled, sure.
> >
> > I do agree that we should look into properly including this into the dynamic
> > debugging infrastructure.
>
> Yeah, unless we learn this is not feasible for some reason, which I do
> not see right now, then let's just make it pr_debug with the runtime
> control.

What problem do you see? It's the dynamic debugging facility, which
is enabled only when the admin wants to use it. Otherwise, it's a nop.
Furthermore, it doesn't need to invent a custom dump_page
implementation (including dump_page_owner) by changing it to pr_debug.
Could you clarify your requirement?

https://lore.kernel.org/linux-mm/YEEUq8ZRn4WyYWVx@xxxxxxxxxx/

Since David agreed to drop the nofail option from the API, I will
keep the patch at the URL above.
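
For reference, a minimal sketch of the runtime control discussed above,
as described in Documentation/admin-guide/dynamic-debug-howto.rst. It
assumes a kernel built with CONFIG_DYNAMIC_DEBUG and debugfs mounted at
/sys/kernel/debug. Whether page_alloc.c actually exposes any pr_debug()
call sites depends on the patch under discussion, so treat the file
match as illustrative only.

```shell
# Assumption: CONFIG_DYNAMIC_DEBUG=y and debugfs mounted at /sys/kernel/debug.
control=/sys/kernel/debug/dynamic_debug/control

# Query that turns on printing (+p) for every pr_debug() site in the file.
# Writing the same query with -p instead of +p turns them off again.
query='file page_alloc.c +p'

if [ -w "$control" ]; then
    echo "$query" > "$control"
    # Show the call sites dynamic debug knows about in that file, if any.
    grep 'page_alloc.c' "$control" || true
else
    echo "dynamic debug control file not writable: $control" >&2
fi
```

Until such a query is written, the call sites stay disabled, which is
the default-off behaviour asked for earlier in the thread.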