Re: [PATCH v3 3/4] kdb: Make kdb_printf robust to run in NMI context

From: Daniel Thompson
Date: Wed May 27 2020 - 10:26:26 EST


On Wed, May 27, 2020 at 11:55:58AM +0530, Sumit Garg wrote:
> While rounding up CPUs via NMIs, its possible that a rounded up CPU

This problem does not just impact NMI roundup: breakpoints, including the
implicit breakpoint-on-oops, can have the same effect.


> maybe holding a console port lock leading to kgdb master CPU stuck in
> a deadlock during invocation of console write operations. So in order
> to avoid such a deadlock, enable oops_in_progress prior to invocation
> of console handlers.
>
> Suggested-by: Petr Mladek <pmladek@xxxxxxxx>
> Signed-off-by: Sumit Garg <sumit.garg@xxxxxxxxxx>
> ---
> kernel/debug/kdb/kdb_io.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/kernel/debug/kdb/kdb_io.c b/kernel/debug/kdb/kdb_io.c
> index 349dfcc..f848482 100644
> --- a/kernel/debug/kdb/kdb_io.c
> +++ b/kernel/debug/kdb/kdb_io.c
> @@ -566,7 +566,17 @@ static void kdb_msg_write(char *msg, int msg_len)
>  	for_each_console(c) {
>  		if (!(c->flags & CON_ENABLED))
>  			continue;
> +		/*
> +		 * While rounding up CPUs via NMIs, its possible that

Ditto.

> +		 * a rounded up CPU maybe holding a console port lock
> +		 * leading to kgdb master CPU stuck in a deadlock during
> +		 * invocation of console write operations. So in order
> +		 * to avoid such a deadlock, enable oops_in_progress
> +		 * prior to invocation of console handlers.

Actually, looking at this comment as a whole, I think it spends too many
words on what and not enough on why (e.g. what the tradeoffs are and
why we are not using bust_spinlocks(), which would be a more obvious
approach).

Perhaps something like:

Set oops_in_progress to encourage the console drivers to disregard
their internal spin locks: in the current calling context the risk
of deadlock is a bigger problem than the risks due to re-entering
the console driver. We operate directly on oops_in_progress rather
than using bust_spinlocks() because the calls bust_spinlocks() makes
on exit are not appropriate for this calling context.
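
To illustrate the tradeoff, here is a minimal userspace sketch (not kernel
code). Only oops_in_progress is a real kernel name; struct mock_port and the
mock_* helpers are hypothetical stand-ins. The "try the lock, then carry on
regardless" behaviour is an assumption about how a console driver might
treat oops_in_progress, not a description of any particular driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel global of the same name. */
static int oops_in_progress;

/* Hypothetical console port guarded by a trivial, non-spinning "lock". */
struct mock_port {
	bool locked;     /* true while some context holds the port lock */
	size_t written;  /* bytes emitted so far */
};

/* Take the lock if it is free; never spins, so the sketch cannot hang. */
static bool mock_trylock(struct mock_port *p)
{
	if (p->locked)
		return false;
	p->locked = true;
	return true;
}

/*
 * Assumed driver behaviour: with oops_in_progress set, try the lock but
 * write even if it is unavailable. Otherwise insist on the lock; a
 * return of false marks the point where a real spin lock would deadlock.
 */
static bool mock_console_write(struct mock_port *p, const char *msg, size_t len)
{
	bool took_lock;

	(void)msg; /* the payload is irrelevant to the locking behaviour */

	if (oops_in_progress) {
		took_lock = mock_trylock(p); /* proceed even on failure */
	} else {
		if (!mock_trylock(p))
			return false; /* real driver: spins forever here */
		took_lock = true;
	}

	p->written += len;

	if (took_lock) {
		assert(p->locked);
		p->locked = false;
	}
	return true;
}

/* Mirrors the shape of the patched loop body in kdb_msg_write(). */
static bool mock_kdb_msg_write(struct mock_port *p, const char *msg, size_t len)
{
	bool ok;

	++oops_in_progress;
	ok = mock_console_write(p, msg, len);
	--oops_in_progress;
	return ok;
}
```

With the port lock already held by a stopped CPU, the plain write gives up
(the deadlock case), while the wrapped write goes through at the cost of
re-entering the driver. The counter is restored afterwards without any of
the extra work bust_spinlocks() does on exit.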


Daniel.


> +		 */
> +		++oops_in_progress;
>  		c->write(c, msg, msg_len);
> +		--oops_in_progress;
>  		touch_nmi_watchdog();
>  	}
>  }
> --
> 2.7.4
>