Re: [PATCH v3 3/4] kdb: Make kdb_printf robust to run in NMI context

From: Sumit Garg
Date: Thu May 28 2020 - 03:42:55 EST


On Wed, 27 May 2020 at 19:56, Daniel Thompson
<daniel.thompson@xxxxxxxxxx> wrote:
>
> On Wed, May 27, 2020 at 11:55:58AM +0530, Sumit Garg wrote:
> > While rounding up CPUs via NMIs, it's possible that a rounded up CPU
>
> This problem does not just impact NMI roundup (breakpoints,

I guess by breakpoints you mean that a compiled-in or runtime breakpoint
hit in console handler code while it's holding the spin lock could lead
to a deadlock, correct?

> including
> implicit breakpoint-on-oops can have the same effect).
>

Isn't the breakpoint-on-oops case already handled by the bust_spinlocks()
call in the panic handler here [1]?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/panic.c#n207

>
> > may be holding a console port lock leading to kgdb master CPU stuck in
> > a deadlock during invocation of console write operations. So in order
> > to avoid such a deadlock, enable oops_in_progress prior to invocation
> > of console handlers.
> >
> > Suggested-by: Petr Mladek <pmladek@xxxxxxxx>
> > Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> > ---
> > kernel/debug/kdb/kdb_io.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> > index 349dfcc..f848482 100644
> > --- a/kernel/debug/kdb/kdb_io.c
> > +++ b/kernel/debug/kdb/kdb_io.c
> > @@ -566,7 +566,17 @@ static void kdb_msg_write(char *msg, int msg_len)
> > for_each_console(c) {
> > if (!(c->flags & CON_ENABLED))
> > continue;
> > + /*
> > + * While rounding up CPUs via NMIs, it's possible that
>
> Ditto.
>
> > + * a rounded up CPU may be holding a console port lock
> > + * leading to kgdb master CPU stuck in a deadlock during
> > + * invocation of console write operations. So in order
> > + * to avoid such a deadlock, enable oops_in_progress
> > + * prior to invocation of console handlers.
>
> Actually looking at this comment as a whole I think it spends too many
> words on what and not enough on why (e.g. what the tradeoffs are and
> why we are not using bust_spinlocks() which would be a more obvious
> approach).
>
> Set oops_in_progress to encourage the console drivers to disregard
> their internal spin locks: in the current calling context
> the risk of deadlock is a bigger problem than risks due to
> re-entering the console driver. We operate directly on
> oops_in_progress rather than using bust_spinlocks() because
> the calls bust_spinlocks() makes on exit are not appropriate
> for this calling context.

Sounds reasonable, will use it instead.
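
To make the tradeoff in that comment concrete, here is a minimal userspace
sketch of the idea. It is not kernel code: toy_console, toy_trylock(),
toy_console_write() and toy_kdb_msg_write() are hypothetical names, the
"lock" is a plain flag, and the -1 return only stands in for a spin_lock()
that would never return. It assumes a console driver that only tries its
port lock (and carries on if that fails) when oops_in_progress is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's global oops_in_progress counter. */
static int oops_in_progress;

/* Hypothetical console; port_locked models a port lock held by a CPU. */
struct toy_console {
	bool port_locked;
	char out[128];		/* captured output */
	size_t out_len;
};

/* Take the toy lock if it is free; never blocks. */
static bool toy_trylock(struct toy_console *c)
{
	if (c->port_locked)
		return false;
	c->port_locked = true;
	return true;
}

/*
 * Models a driver write() handler. Returns the number of bytes written,
 * or -1 where a real spin_lock() would spin forever (the deadlock case).
 */
static int toy_console_write(struct toy_console *c, const char *msg, size_t len)
{
	bool locked;
	size_t room;

	assert(c != NULL && msg != NULL);

	if (oops_in_progress) {
		/* Try the lock, but write even if someone else holds it. */
		locked = toy_trylock(c);
	} else {
		if (c->port_locked)
			return -1;
		c->port_locked = true;
		locked = true;
	}

	room = sizeof(c->out) - c->out_len;
	if (len > room)
		len = room;
	memcpy(c->out + c->out_len, msg, len);
	c->out_len += len;

	/* Only release a lock this call actually took. */
	if (locked)
		c->port_locked = false;
	return (int)len;
}

/* Mirrors the shape of the patch: bump the counter around the write. */
static int toy_kdb_msg_write(struct toy_console *c, const char *msg, size_t len)
{
	int ret;

	++oops_in_progress;
	ret = toy_console_write(c, msg, len);
	--oops_in_progress;
	return ret;
}
```

With port_locked preset to true (as if a rounded up CPU held the lock),
toy_console_write() reports the would-be deadlock while toy_kdb_msg_write()
still gets the message out, at the cost of writing without the lock held.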

-Sumit

>
>
> Daniel.
>
>
> > + */
> > + ++oops_in_progress;
> > c->write(c, msg, msg_len);
> > + --oops_in_progress;
> > touch_nmi_watchdog();
> > }
> > }
> > --
> > 2.7.4
> >