Re: [PATCH v2] mm/page_isolation: fix a deadlock with printk()

From: Qian Cai
Date: Tue Oct 08 2019 - 10:03:06 EST


On Tue, 2019-10-08 at 15:42 +0200, Petr Mladek wrote:
> On Tue 2019-10-08 09:23:52, Qian Cai wrote:
> > On Tue, 2019-10-08 at 09:13 -0400, Steven Rostedt wrote:
> > > On Tue, 8 Oct 2019 10:15:10 +0200
> > > Petr Mladek <pmladek@xxxxxxxx> wrote:
> > >
> > > > There are basically three possibilities:
> > > >
> > > > 1. Do crazy exercises with locks all around the kernel to
> > > > avoid the deadlocks. It is usually not worth it. And
> > > > it is a "whack a mole" approach.
> > > >
> > > > 2. Use printk_deferred() in problematic code paths. It is
> > > > a "whack a mole" approach as well. And we would end up
> > > > with printk_deferred() used almost everywhere.
> > > >
> > > > 3. Always defer the console handling in printk(). This would
> > > > also help to avoid soft lockups. Several people have pushed
> > > > back against this over the last few years because it might
> > > > reduce the chance of seeing the message if the system crashes.
> > > >
> > > > As I said, there was finally agreement a few weeks ago to
> > > > always do the offload. John Ogness is working on it.
> > > > So we might have the systematic solution for these deadlocks
> > > > rather sooner than later.
> > >
> > > Another solution is to add the printk_deferred() in these places that
> > > cause lockdep splats, and when John's work is done, it would be easy to
> > > grep for them and remove them as they would no longer be needed.
> > >
> > > This way we don't play whack-a-mole forever (only until we have a
> > > proper solution) and everyone is happy that we no longer have these
> > > false positive or I-don't-care lockdep splats which hide real lockdep
> > > splats because lockdep shuts off as soon as it discovers its first
> > > splat.
> >
> > I feel like that is what I am trying to do, but there seems to be a lot of
> > resistance to that approach, where pragmatism meets perfectionism.
>
> No, the resistance was against complicated code changes (games with
> locks) and against removing useful messages. Such changes might cause
> more harm than good.

I don't think there is any "removing useful messages" in this patch. That one
printk() in __offline_isolated_pages() is, as Michal mentioned, not that
useful, but it could be converted to printk_deferred() if anyone objects.

It is more complicated to convert dump_page() to use printk_deferred().
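
For illustration only, here is a minimal userspace sketch of the deferral idea
being discussed: the caller queues the message while it still holds its own
lock, and the console is only touched later, from a context that holds no
other locks. This is not the kernel's printk_deferred() implementation, and
log_deferred(), flush_deferred(), and the lock flag are hypothetical names
made up for the example.

```c
/* Userspace sketch only: models the idea of deferring console output.
 * All names below are hypothetical and are not kernel APIs. */
#include <assert.h>
#include <stdio.h>

#define DEFERRED_SLOTS   8
#define DEFERRED_MSG_LEN 64

static char deferred_buf[DEFERRED_SLOTS][DEFERRED_MSG_LEN];
static int deferred_count;     /* messages queued but not yet emitted */
static int console_lock_held;  /* stand-in for the console lock */
static int emitted_count;      /* messages that reached the "console" */

/* Queue a message without touching the console lock, so it is safe to
 * call while the caller holds some other lock.
 * Returns 0 on success, -1 if the buffer is full (message dropped). */
static int log_deferred(const char *msg)
{
    if (deferred_count >= DEFERRED_SLOTS)
        return -1;
    snprintf(deferred_buf[deferred_count], DEFERRED_MSG_LEN, "%s", msg);
    deferred_count++;
    return 0;
}

/* Emit the queued messages later, from a context that holds no other
 * locks. Returns the number of messages flushed. */
static int flush_deferred(void)
{
    int i, n = deferred_count;

    /* Taking the console lock while it is already held would be the
     * deadlock this pattern is meant to avoid. */
    assert(!console_lock_held);
    console_lock_held = 1;
    for (i = 0; i < n; i++) {
        fputs(deferred_buf[i], stdout);
        emitted_count++;
    }
    console_lock_held = 0;
    deferred_count = 0;
    return n;
}
```

A caller would use log_deferred() inside its locked region and flush_deferred()
after unlocking. The trade-off is the one Petr describes for option 3: a
message that is queued but not yet flushed is lost if the system dies first.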

>
> I am not an -mm maintainer, so I cannot guarantee that a patch
> using printk_deferred() will get accepted. But it will have a much
> bigger chance than the original patch.
>
> Anyway, printk_deferred() is a lost war. It is a temporary solution
> for one particular scenario. But as you said, there might be many
> others. The long-term solution is the printk rework.
>
> Best Regards,
> Petr