Re: [PATCH 3/7] kgdb: Add request_nmi() to the io ops table for kgdboc

From: Doug Anderson
Date: Fri Jun 26 2020 - 15:44:35 EST


Hi,

On Tue, Jun 23, 2020 at 3:59 AM Daniel Thompson
<daniel.thompson@xxxxxxxxxx> wrote:
>
> On Tue, Jun 23, 2020 at 02:07:47PM +0530, Sumit Garg wrote:
> > On Mon, 22 Jun 2020 at 21:33, Daniel Thompson
> > <daniel.thompson@xxxxxxxxxx> wrote:
> > > > + irq_set_status_flags(irq, IRQ_NOAUTOEN);
> > > > + res = request_nmi(irq, fn, IRQF_PERCPU, "kgdboc", dev_id);
> > >
> > > Why do we need IRQF_PERCPU here. A UART interrupt is not normally
> > > per-cpu?
> > >
> >
> > Have a look at this comment [1] and corresponding check in
> > request_nmi(). So essentially yes UART interrupt is not normally
> > per-cpu but in order to make it an NMI, we need to request it in
> > per-cpu mode.
> >
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/irq/manage.c#n2112
>
> Thanks! This is clear.
>
> > > > + if (res) {
> > > > + res = request_irq(irq, fn, IRQF_SHARED, "kgdboc", dev_id);
> > >
> > > IRQF_SHARED?
> > >
> > > > Currently there is nothing that prevents concurrent activation of
> > > ttyNMI0 and the underlying serial driver. Using IRQF_SHARED means it
> > > becomes possible for both drivers to try to service the same interrupt.
> > > That risks some rather "interesting" problems.
> > >
> >
> > Could you elaborate more on "interesting" problems?
>
> Er... one of the serial drivers we have allowed the userspace to open
> will, at best, be stone dead and not passing any characters.
>
>
> > BTW, I noticed one more problem with this patch that is IRQF_SHARED
> > doesn't go well with IRQ_NOAUTOEN status flag. Earlier I tested it
> > with auto enable set.
> >
> > But if we agree that both shouldn't be active at the same time due to
> > > some real problems(?) then I can get rid of IRQF_SHARED as well. Also, I
> > think we should unregister underlying tty driver (eg. /dev/ttyAMA0) as
> > well as otherwise it would provide a broken interface to user-space.
>
> > I don't have a particularly strong opinion on whether IRQF_SHARED is
> correct or not correct since I think that misses the point.
>
> Firstly, using IRQF_SHARED shows us that there is no interlocking
> > between kgdb_nmi and the underlying serial driver. That probably tells
> > us more about the importance of the interlock than about IRQF_SHARED.
>
> To some extent I'm also unsure that kgdb_nmi could ever actually know
> the correct flags to use in all cases (that was another reason for the
> TODO comment about poll_get_irq() being a bogus API).

I do wonder a little bit if the architecture of the "kgdb_nmi_console"
should change. I remember looking at it in the past and thinking it a
little weird that if I wanted to get it to work I'd need to change my
"console=" command line to go through this new driver and (I guess)
change the agetty I have running on my serial port to point to
ttyNMI0. Is that how it's supposed to work? Then if I want to do a
build without kgdb then I need to go in and change my agetty to point
back at my normal serial port?

It kinda feels like a better way to do much of what the driver does would be to:

1. Allow kgdb to sniff incoming serial bytes on a port and look for
its characters. We already have this feature in the kernel to a small
extent for sniffing a break / sysrq character.

2. If userspace doesn't happen to have the serial port open then
ideally we could open the port (using all the standard APIs that
already exist) from in the kernel and just throw away all the bytes
(since we already sniffed them). As soon as userspace tried to open
the port it would get ownership, and if userspace ever closed the
port then we'd start reading / throwing away bytes again.
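
To make the hand-off in 1 and 2 a bit more concrete, here's a toy
userspace model of what I'm picturing. It is only a sketch: none of
these names (port_model, port_rx, KGDB_MAGIC, ...) exist in the kernel,
and I'm assuming the debugger's break-in byte is Ctrl-C and that the
sniffer consumes it rather than passing it on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: Ctrl-C (0x03) is the byte the debugger is looking for. */
#define KGDB_MAGIC 0x03

/* Hypothetical model of one serial port; not a real kernel structure. */
struct port_model {
	bool user_open;                /* does userspace own the port? */
	size_t delivered;              /* bytes handed to userspace */
	size_t discarded;              /* bytes drained by the in-kernel reader */
	unsigned int debugger_entries; /* times the magic byte was sniffed */
};

static void port_init(struct port_model *p)
{
	assert(p);
	p->user_open = false;
	p->delivered = 0;
	p->discarded = 0;
	p->debugger_entries = 0;
}

/* Userspace opening the port takes ownership from the in-kernel reader. */
static void port_user_open(struct port_model *p)
{
	assert(p);
	p->user_open = true;
}

/* Userspace closing the port hands it back to the in-kernel reader. */
static void port_user_close(struct port_model *p)
{
	assert(p);
	p->user_open = false;
}

/* Called once per received byte, before any owner sees it. */
static void port_rx(struct port_model *p, unsigned char c)
{
	assert(p);

	/* Step 1: the debugger sniffs every byte, whoever owns the port. */
	if (c == KGDB_MAGIC) {
		p->debugger_entries++;
		return; /* assumption: the magic byte is consumed */
	}

	/* Step 2: deliver to userspace if it owns the port, else discard. */
	if (p->user_open)
		p->delivered++;
	else
		p->discarded++;
}
```

The point of the model is just that the sniff step sits in front of
whoever currently owns the port, so opening or closing the port from
userspace never changes whether the debugger can be entered.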

If we had a solution like that:

a) No serial drivers would need to change.

b) No kernel command line parameters would need to change.

Obviously that solution wouldn't magically get you an NMI, though.
For that I'd presume the right answer would be to add a parameter for
each serial driver that can support it to run its rx interrupt in NMI
mode.

Of course, perhaps I'm just confused and crazy and the above is a
really bad idea.


Speaking of confused: is there actually any way to use the existing
kgdb NMI driver (CONFIG_SERIAL_KGDB_NMI) in mainline without
out-of-tree patches? When I looked before I assumed it was just me that was
outta luck because I didn't have NMI at the time, but I just did some
grepping and I can't find anyplace in mainline where
"arch_kgdb_ops.enable_nmi" would not be NULL. Did I miss it, or do we
need out-of-tree patches to enable this?


-Doug