Re: [PATCH] mm: be more verbose for alloc_contig_range failures

From: Minchan Kim
Date: Mon Mar 08 2021 - 15:28:18 EST


On Mon, Mar 08, 2021 at 07:58:11AM -0800, Minchan Kim wrote:
> On Mon, Mar 08, 2021 at 04:42:43PM +0100, Michal Hocko wrote:
> > On Mon 08-03-21 15:13:35, David Hildenbrand wrote:
> > > On 08.03.21 15:11, Michal Hocko wrote:
> > > > On Mon 08-03-21 14:22:12, David Hildenbrand wrote:
> > > > > On 08.03.21 13:49, Michal Hocko wrote:
> > > > [...]
> > > > > > > Earlier in the discussion I suggested the dynamic debugging facility,
> > > > > > > Documentation/admin-guide/dynamic-debug-howto.rst. Have you tried
> > > > > > > looking in that direction?
> > > > >
> > > > > Did you see the previous mail this is based on:
> > > > >
> > > > > https://lkml.kernel.org/r/YEEUq8ZRn4WyYWVx@xxxxxxxxxx
> > > > >
> > > > > I agree that "nofail" is misleading. Rather something like
> > > > > "dump_on_failure", just a better name :)
> > > >
> > > > Yeah, I have read through the email thread. I just do not get why we
> > > > cannot make it pr_debug() and add -DDYNAMIC_DEBUG_MODULE for
> > > > page_alloc.c (I haven't checked whether that is possible for built in
> > > > > compile units, maybe it is not, but from a quick look it seems it should).
> > > >
> > > > I really do not like this to be a part of the API. alloc_contig_range is
> > >
> > > Which API?
> >
> > Any level of the alloc_contig_range api because I strongly suspect that
> > once there is something on the lower levels there will be a push to have
> > it in the directly consumed api as well. Besides that I think this is
> > just a wrong way to approach the problem.
> >
> > > It does not affect alloc_contig_range() itself, it's used
> > > internally only. Sure, we could simply pr_debug() for each and every
> > > migration failure. As long as it's default-disabled, sure.
> > >
> > > I do agree that we should look into properly including this into the dynamic
> > > > debugging infrastructure.
> >
> > Yeah, unless we learn this is not feasible for some reason, which I do
> > not see right now, then let's just make it pr_debug with the runtime
> > control.
>
> What problem do you see? It's the dynamic debugging facility,
> enabled only when the admin wants to use it. Otherwise, it's a nop
> unless it's enabled. Furthermore, it doesn't need to invent a
> custom dump_page implementation (including dump_page_owner) by
> changing to pr_debug.
> Could you clarify your requirement?
>
> https://lore.kernel.org/linux-mm/YEEUq8ZRn4WyYWVx@xxxxxxxxxx/
>
> Since David agreed to drop the nofail option from the API, I will
> keep the patch at the URL above.

I posted a formal patch, Cc'ing the dynamic debug maintainer.
https://lore.kernel.org/linux-mm/20210308202047.1903802-1-minchan@xxxxxxxxxx/

Let's discuss stuff related to dynamic debug there.
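
For readers following the pr_debug discussion above, here is a minimal userspace sketch of the "nop unless enabled" idea: a per-callsite flag that is off by default and can be flipped at run time. It is a model only, not the kernel's dynamic debug implementation and not the posted patch. All names (sketch_callsite, sketch_emit, SKETCH_DEBUG, migrate_fail_site, report_migration_failure) are hypothetical.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-callsite descriptor; models the idea only. */
struct sketch_callsite {
	const char *file;
	int line;
	bool enabled;	/* off by default, flipped at run time */
};

/*
 * Print only when the callsite is enabled.
 * Returns 0 when disabled, otherwise the summed results of the
 * fprintf()/vfprintf() calls (characters written on success).
 */
static int sketch_emit(const struct sketch_callsite *site, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!site->enabled)
		return 0;	/* the disabled path does no formatting at all */

	n = fprintf(stderr, "%s:%d: ", site->file, site->line);
	va_start(ap, fmt);
	n += vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}

#define SKETCH_DEBUG(site, ...) sketch_emit((site), __VA_ARGS__)

/* Hypothetical stand-in for a migration-failure report site. */
static struct sketch_callsite migrate_fail_site = { "page_alloc.c", 0, false };

static int report_migration_failure(unsigned long pfn)
{
	return SKETCH_DEBUG(&migrate_fail_site,
			    "migration failed for pfn %lu\n", pfn);
}
```

With the flag left at its default, report_migration_failure() prints nothing. Setting migrate_fail_site.enabled = true makes the same call emit a line on stderr. In a real kernel built with CONFIG_DYNAMIC_DEBUG, the equivalent switch is a query written to the dynamic debug control file, as described in Documentation/admin-guide/dynamic-debug-howto.rst.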