Re: [REGRESSION] 5.16rc1: SA_IMMUTABLE breaks debuggers

From: Kyle Huey
Date: Wed Nov 17 2021 - 20:20:59 EST


On Wed, Nov 17, 2021 at 5:11 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Nov 17, 2021 at 4:37 PM Kyle Huey <me@xxxxxxxxxxxx> wrote:
> >
> > This fixes most of the issues with rr, but it still changes the ptrace
> > behavior for the double-SIGSEGV case
>
> Hmm. I think that's because of how "force_sigsgv()" works.
>
> I absolutely detest that function.
>
> So we have signal_setup_done() doing that
>
> if (failed)
> force_sigsegv(ksig->sig);
>
> and then force_sigsegv() has that completely insane
>
> if (sig == SIGSEGV)
> force_fatal_sig(SIGSEGV);
> else
> force_sig(SIGSEGV);
>
> behavior.
>
> And I think I know the _reason_ for that complete insanity: when
> SIGSEGV takes a SIGSEGV, and there is a handler, we need to stop
> trying to send more SIGSEGV's.

Right, in our test we setup a SIGSEGV handler on an alt stack that
doesn't actually exist, and then overflow the regular stack, and that
would loop forever trying to setup SIGSEGV handlers if it weren't for
force_sigsegv and the sigdfl=true stuff.

> But it does mean that with my change, that second SIGSEGV now ends up
> being that SA_IMMUTABLE kind, so yeah, it broke the debugger test -
> where catching the second SIGSEGV is actually somewhat sensible (ok,
> not really, but at least understandable)
>
> End result: I think we want not a boolean, but a three-way choice for
> that force_sig_info_to_task() thing:

with the following clarifications, yes

> - unconditionally fatal (for things that just want to force an exit
> and used to do do_exit())

no matter what the ptracer wants

> - ignore valid and unblocked handler (for that SIGSEGV recursion
> case, aka force "sigdfl")

but following the usual ptrace rules

> - catching signal ok
>
> So my one-liner isn't sufficient. It wants some kind of nasty enum.
>
> At least the enum can be entirely internal to kernel/signal.c, I
> think. No need to expose this all to anything else.
>
> Linus

Yeah that's one way to solve the problem. I think you're right that
fundamentally the problem here is that what SECCOMP_RET_KILL wants is
not really a signal. To the extent that it wants a signal, what it
really wants is SIGKILL, and the problem here is the code trying to
act like SIGKILL but call it SIGSYS. I assume the ship for fixing that
sailed years ago though.

- Kyle