[PATCH v3 0/4] arm64: Introduce new IPI as IPI_CALL_NMI_FUNC

From: Sumit Garg
Date: Thu Sep 03 2020 - 11:13:06 EST


With pseudo NMI support available, it is possible to configure SGIs to be
triggered as pseudo NMIs running in NMI context. Kernel features such as
kgdb rely on NMI support to round up CPUs which are stuck in a hard
lockup state with interrupts disabled.

This patch-set adds support for IPI_CALL_NMI_FUNC, which can be triggered
as a pseudo NMI and which kgdb in turn uses to round up CPUs.
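
To illustrate why the round-up needs an NMI, here is a rough userspace
model in C. It is only a sketch and is not code from this series: the
struct, the helper names (cpu_model, deliver_ipi, roundup_cpus) and the
24-CPU count are assumptions made for illustration. The idea it shows is
that a normal IPI never reaches a CPU that is spinning with interrupts
masked, while an NMI-like IPI still does.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 24	/* assumption: matches the 0-23 range in the example */

/* Hypothetical per-CPU state, used by this model only. */
struct cpu_model {
	bool irqs_masked;	/* CPU is spinning with normal IRQs disabled */
	bool in_debugger;	/* CPU has been rounded up into the debugger */
};

static struct cpu_model cpus[NR_CPUS];

/* Put every modelled CPU back into its initial, responsive state. */
static void reset_cpus(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		cpus[cpu].irqs_masked = false;
		cpus[cpu].in_debugger = false;
	}
}

/*
 * Model delivery of a round-up IPI to one CPU. A normal IPI is not
 * taken by a CPU with IRQs masked; an NMI-like IPI is always taken.
 */
static bool deliver_ipi(int cpu, bool as_nmi)
{
	assert(cpu >= 0 && cpu < NR_CPUS);

	if (cpus[cpu].irqs_masked && !as_nmi)
		return false;

	cpus[cpu].in_debugger = true;
	return true;
}

/* Round up every CPU except 'self'; return how many responded. */
static int roundup_cpus(int self, bool as_nmi)
{
	int rounded = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == self)
			continue;
		if (deliver_ipi(cpu, as_nmi))
			rounded++;
	}
	return rounded;
}
```

In this model, marking CPU 8 as irqs_masked and calling
roundup_cpus(10, false) leaves CPU 8 outside the debugger, whereas
roundup_cpus(10, true) brings it in along with the others.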

After this patch-set we should be able to get a backtrace for a CPU
stuck in HARDLOCKUP. Have a look at an example below from a testcase run
on Developerbox:

$ echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

# Enter kdb via Magic SysRq

[11]kdb> btc
btc: cpu status: Currently on cpu 10
Available cpus: 0-7(I), 8, 9(I), 10, 11-23(I)
<snip>
Stack traceback for pid 619
0xffff000871bc9c00 619 618 1 8 R 0xffff000871bca5c0 bash
CPU: 8 PID: 619 Comm: bash Not tainted 5.7.0-rc6-00762-g3804420 #77
Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #73 Apr 6 2020
Call trace:
dump_backtrace+0x0/0x198
show_stack+0x18/0x28
dump_stack+0xb8/0x100
kgdb_cpu_enter+0x5c0/0x5f8
kgdb_nmicallback+0xa0/0xa8
ipi_kgdb_nmicallback+0x24/0x30
ipi_handler+0x160/0x1b8
handle_percpu_devid_fasteoi_ipi+0x44/0x58
generic_handle_irq+0x30/0x48
handle_domain_nmi+0x44/0x80
gic_handle_irq+0x140/0x2a0
el1_irq+0xcc/0x180
lkdtm_HARDLOCKUP+0x10/0x18
direct_entry+0x124/0x1c0
full_proxy_write+0x60/0xb0
__vfs_write+0x1c/0x48
vfs_write+0xe4/0x1d0
ksys_write+0x6c/0xf8
__arm64_sys_write+0x1c/0x28
el0_svc_common.constprop.0+0x74/0x1f0
do_el0_svc+0x24/0x90
el0_sync_handler+0x178/0x2b8
el0_sync+0x158/0x180

Changes in v3:
- Rebased to Marc's latest IPIs patch-set [1].

[1] https://lkml.org/lkml/2020/9/1/603

Changes since RFC version [1]:
- Switch to using the generic interrupt framework to turn an IPI into an NMI.
- Dependent on Marc's patch-set [2] which turns IPIs into normal
interrupts.
- Addressed misc. comments from Doug on patch #4.
- Posted the kgdb NMI printk() fixup separately. It has since evolved
to use a different approach: changing kgdb's interception of printk()
in the common printk() code (see patch [3]).

[1] https://lkml.org/lkml/2020/4/24/328
[2] https://lkml.org/lkml/2020/5/19/710
[3] https://lkml.org/lkml/2020/5/20/418

Sumit Garg (4):
arm64: smp: Introduce a new IPI as IPI_CALL_NMI_FUNC
irqchip/gic-v3: Enable support for SGIs to act as NMIs
arm64: smp: Setup IPI_CALL_NMI_FUNC as a pseudo NMI
arm64: kgdb: Round up cpus using IPI_CALL_NMI_FUNC

arch/arm64/include/asm/kgdb.h | 8 +++++++
arch/arm64/include/asm/smp.h | 1 +
arch/arm64/kernel/kgdb.c | 21 ++++++++++++++++++
arch/arm64/kernel/smp.c | 50 ++++++++++++++++++++++++++++++++++---------
drivers/irqchip/irq-gic-v3.c | 13 +++++++++--
5 files changed, 81 insertions(+), 12 deletions(-)

--
2.7.4