Re: [PATCH] arm64: traps: add tracepoints for unusal exception cases

From: Mark Brown
Date: Mon Mar 08 2021 - 08:34:02 EST


On Fri, Mar 05, 2021 at 09:36:30PM +0900, Sangmoon Kim wrote:

> When kernel panic occurs, a kernel module can use either the
> panic_notifier or die_notifier to obtain the debugging information.

> However, in case of these exceptions like do_undefinstr(), regs and
> esr data are not passed on. Although a module might be able to find
> those data in the console messages, parsing text messages is very
> expensive behavior for a module especially on mobile devices.

> These bare tracepoints allow a module to probe regs and esr information
> for debugging purpose. _tp suffix comes from bare tracepoints of
> sched/core.c

This use case sounds a lot like what the enterprise and Android people
do via pstore - it seems like it would be better for this to integrate
via the interfaces that other systems are using for similar purposes and
then ensure that whatever information is useful is getting passed
through in a format that makes sense. That'd be more structured and
more readily usable by a wider range of systems than something that's
more of a building block, going via the trace infrastructure seems like
a bit of an indirection.

> @@ -832,6 +846,7 @@ void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
> if (regs)
> __show_regs(regs);
>
> + trace_traps_serror_panic_tp(regs, esr);
> nmi_panic(regs, "Asynchronous SError Interrupt");

One of the concerns people have with adding tracepoints is that they can
end up defining ABI so if we *are* going to add any then we need to
think carefully about how they're defined. As things currently stand
they'll pass in the full pt_regs struct which includes not only what's
defined by the hardware but also additional software defined information
we store along with it like the stackframe which would be even more of a
problem if it ends up getting used by someone in a way that ends up as
ABI. These are defined as bare tracehooks which does mitigate against
things ending up getting used in ways that cause problems but people are
still going to worry about things ending up getting relied on one way or
another.

That said it's not clear to me that this will record anything beyond the
pointer directly in the trace buffer so the value might not be useful
for terribly long, that itself feels like it might not be as robust an
interface as it should be.

Attachment: signature.asc
Description: PGP signature