Re: [RFC] TIF_NOTIFY_RESUME, arch/*/*/*signal*.c and all such

From: Oleg Nesterov
Date: Wed Apr 25 2012 - 08:38:49 EST


On 04/25, Al Viro wrote:
>
> FWIW, there's an interesting question rmk has brought up. Consider the
> following scenario (on any architecture):
> sigsuspend(2) sees a signal and returns ERESTARTNOHAND.
> do_signal() is called and calls get_signal_to_deliver() and gets 0,
> for whatever reason.
> We decide to restart, return address adjusted, syscall number
> returned to the right register in pt_regs. In the meanwhile, no matter what
> state interrupts used to have before, get_signal_to_deliver() has enabled
> them when returning

Afaics this doesn't really matter, TIF_SIGPENDING can be set by another CPU
once get_signal_to_deliver() drops ->siglock.

> , so we'll need to reload thread flags. And we find that
> another signal has arrived in the meanwhile.
> OK, do_signal() is called again, and this time we have a handler for
> the arrived signal. We form a stack frame and return to userland, into the
> beginning of the handler. We don't even look at the restart-related logics
> this time around, due to the usual logics protecting us from double restarts.
> Handler is executed, up to rt_sigreturn(2).
> We decode the sigcontext, restore pt_regs and return to userland.
> Right into the beginning of interrupted sigsuspend()
>
> So we have sigsuspend() hit by a signal we have a handler for. Handler is
> executed and we are stuck is sigsuspend() again, all because a signal without
> a handler has arrived just before that one - close enough for our signal to
> come right after get_signal_to_deliver() has returned zero to do_signal().

Yes, this (and the similar races) were already discussed a couple of times.
In short, regs->ax = -ERESTART* and ->ip doesn't survive after do_signal().
In this case the syscall was already restarted after the first do_signal()
even if we do not return to user-mode yet.

> AFAICS, that's a clear bug.

I do not know. So far it was decided that we do not really care, but
I won't argue if we decide to change the current behaviour.

As for sys_sigsuspend() and this race in particular:

> Arrival of a signal that has userland handler
> and that isn't blocked by the mask given to sigsuspend() should terminate
> sigsuspend().

Yes. But note that do_signal() restores the old sigmask. This means that
the signal we get after the first do_signal() was not blocked before
sigsuspend() was called. So, to some extent, we can pretend that the
handler was executed before sigsuspend() and it was never restarted.

IOW, I tend to agree with the comments from Roland, see for example

HR timers prevent an itimer from generating EINTR?
http://marc.info/?t=125210012600005

[RESEND] [RFC][PATCH X86_32 1/2]: Call do_notify_resume() with interrupts enabled
http://marc.info/?t=131955450100004

But let me repeat that I never really understood if this is "by design"
or not.

> Solution proposed last summer when that had been noticed by arm folks was
> more or less along the lines of
> * new thread flag, checked after we'd seen that no SIGPENDING et.al.
> is there. If it's set, we clear it, do syscall restart work as we would for
> handlerless signal and recheck the flags if we had to do something like
> __put_user() in process (arm might have to do that for ERESTART_RESTARTBLOCK)[1]
> * do_signal() would set that flag if
> + anti double-restart logics would not have prevented
> restarts
> + error value was ERESTART_...
> * no restart work on "no signal" path in do_signal()
> * if we have a handler and the flag is set, clear it and do what
> we normally do for restarts (including the "has ptrace mangled registers
> in a way that would prevent restarts in the current code" logics for
> architectures that have such logics - arm and sparc, at least).

Hmm. Not sure I understand this in details. But at first glance,
"do_signal() would set that flag" is not enough. We have the similar problem
if we dequeue a SA_RESTART signal first, then another signal without SA_RESTART.
Or I simply misunderstood.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/