Re: [regression] x86/signal/64: Fix SS handling for signals delivered to 64-bit programs breaks dosemu

From: Stas Sergeev
Date: Wed Aug 12 2015 - 13:00:42 EST


12.08.2015 19:19, Andy Lutomirski ÐÐÑÐÑ:
On Wed, Aug 12, 2015 at 1:02 AM, Stas Sergeev <stsp@xxxxxxx> wrote:
12.08.2015 03:38, Andy Lutomirski ÐÐÑÐÑ:
On Tue, Aug 11, 2015 at 5:17 PM, Stas Sergeev <stsp@xxxxxxx> wrote:
Hi guys, I wonder how easily the include/uapi/* is being
changed these days.
The patch:

http://lkml.kernel.org/r/405594361340a2ec32f8e2b115c142df0e180d8e.1426193719.git.luto@xxxxxxxxxx
breaks dosemu (and perhaps everyone else who used
to restore the segregs by hands). And the fix involves
both autoconf magic and run-time magic, so it is not even
trivial.
I realize this patch may be good to have in general, but
breaking userspace without a single warning is a bit
discouraging. Seems like the old "we don't break userspace"
rule have gone.
I didn't anticipate any breakage. I could have been wrong.
You changed include/uapi/*, which is obviously an asking
for problems. I applied the following changes to my local
git tree to get dosemu working again:
To be fair, I renamed a field that used to be padding. The UAPI has
to change on occasion -- it's just not supposed to break things.

https://github.com/stsp/dosemu2/commit/48b2a13a49a9fe1a456cd77df6b9a1feec675a01
Maybe I'm still missing something, but this seems like it should be
unnecessary. What goes wrong without it?
Without it, dosemu stores and fetches the ss value
elsewhere. It could use any place for it, be it even a global var.
But with your patch, dosemu _needs_ to use the sigcontext.ss,
because that's where the kernel now puts it.
As a result, dosemu had to be changed to use sigcontext.ss
to load the ss from. Sounds good? Not! :)
The reality is that you'll have to work with the old headers,
that still have no sigcontext.ss, and so you'd need to access
ss via __pad0 in pretty much 100% of real-life setups.
If there is such a need to touch uapi from time to time
(and I understand this is the case), then perhaps you should
invent some versioning or whatever, to save people from
surprises.
Yes, you took the field that was used for padding.
But this doesn't help, because this is not a new functionality.
The existing programs now _need_ to use your new field
for what they did in the past without it. So it is nearly the
same as renaming any of the existing widely used fields:
people will need the autoconf hacks to probe its existence.

https://github.com/stsp/dosemu2/commit/7898ac60d5e569964127d6cc48f592caecd20b81
So the problem is that dosemu was actually hacking around the old
buggy behavior and thus relying on it. Grr.
What else it could do? :(
Please note: I am not trying to ask you to revert the patch.
I realize it is needed, I realize dosemu did bad things without
it. But the way it is done is a bit annoying: sudden breakage,
no run-time checks or versionings, and even the build-time
hacks needed.

We might still be able to require a new sigcontext flag to be set and
to forcibly return to __USER_DS if the flag is set regardless of the
ss value in sigcontext when sigreturn is called, if that is indeed the
problem with DOSEMU. But I'm not actually sure that that's the
problem.
Well, the flag would be an ideal solution in an ideal world,
but in our world I don't know the current relevance of dosemu,
and whether or not it worth a new flag to add.
It wouldn't even help here, because the breakage isn't caused by
incompatible sigcontext formats -- it's caused by dosemu's reliance on
ss being preserved across signal delivery
I thought you mean some per-process flag that would
preserve the backward-compatibility for the unaware apps.
I probably got you wrong.

(even if it wasn't preserved
on the way back).
How so?
IIRC sometime fs/gs were restored, but I have no evidence
the ss was. Could you clarify?

In fact, DOSEMU contains this:

/* set up a frame to get back to DPMI via iret. The kernel does not
save
%ss, and the SYSCALL instruction in sigreturn() destroys it.

IRET pops off everything in 64-bit mode even if the privilege
does not change which is nice, but clobbers the high 48 bits
of rsp if the DPMI client uses a 16-bit stack which is not so
nice (see EMUfailure.txt). Setting %rsp to 0x100000000 so that
bits 16-31 are zero works around this problem, as DPMI code
can't see bits 32-63 anyway.
*/

So, if DOSEMU were to realize that both sigreturnissues it's
complaining about are fixed in recent kernels, it could sigreturn
directly back to any state.
Good, but have you added any flag for dosemu to even know
it can do this? Unless I am mistaken, you didn't. So the fix you
suggest, is not easy to detect and make portable with the older
kernels. Any suggestions?

You could probe for it directly: raise a signal, change the saved ss
and see what's in ss after sigreturn.
Umm, nope.

Let me see if I can come up with a clean kernel fix.
The check for proper sigreturn would be good.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/