Re: [PATCH] x86: execve and sigreturn syscalls must return via iret

From: Brian Gerst
Date: Mon Mar 23 2015 - 11:24:09 EST


On Mon, Mar 23, 2015 at 3:56 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>
> * Brian Gerst <brgerst@xxxxxxxxx> wrote:
>
>> Both the execve and sigreturn family of syscalls have the ability to change
>> registers in ways that may not be compatabile with the syscall path they
>> were called from. In particular, sysret and sysexit can't handle non-default
>> %cs and %ss, and some bits in eflags. These syscalls have stubs that are
>> hardcoded to jump to the iret path, and not return to the original syscall
>> path. Commit 76f5df43cab5e765c0bd42289103e8f625813ae1 (Always allocate a
>> complete "struct pt_regs" on the kernel stack) recently changed this for
>> some 32-bit compat syscalls, but introduced a bug where execve from a 32-bit
>> program to a 64-bit program would fail because it still returned via sysretl.
>> This caused Wine to fail when built for both 32-bit and 64-bit.
>>
>> This patch sets TIF_NOTIFY_RESUME for execve and sigreturn so that the iret
>> path is always taken on exit to userspace.
>>
>> Signed-off-by: Brian Gerst <brgerst@xxxxxxxxx>
>> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
>> Cc: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
>> Cc: Borislav Petkov <bp@xxxxxxxxx>
>> Cc: H. Peter Anvin <hpa@xxxxxxxxx>
>> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
>> ---
>> arch/x86/ia32/ia32_signal.c | 2 ++
>> arch/x86/include/asm/ptrace.h | 2 +-
>> arch/x86/include/asm/thread_info.h | 7 +++++++
>> arch/x86/kernel/process_32.c | 6 +-----
>> arch/x86/kernel/process_64.c | 1 +
>> arch/x86/kernel/signal.c | 2 ++
>> 6 files changed, 14 insertions(+), 6 deletions(-)
>
> Applied the fix to tip:x86/asm, thanks Brian!
>
>> +
>> +/*
>> + * force syscall return via iret by making it look as if there was
>> + * some work pending.
>> +*/
>> +#define force_iret() set_thread_flag(TIF_NOTIFY_RESUME)
>
> I extended this comment to:
>
> /*
> * Force syscall return via IRET by making it look as if there was
> * some work pending. IRET is our most capable (but slowest) syscall
> * return path, which is able to restore modified SS, CS and certain
> * EFLAGS values that other (fast) syscall return instructions
> * are not able to restore properly.
> */
> #define force_iret() set_thread_flag(TIF_NOTIFY_RESUME)
>
> Just to preserve the underlying reason for force_iret() for the future
> and such.
>
> Btw., it might be a worthwile optimization to detect non-standard SS,
> CS and EFLAGS values and only force_iret() in that case, that will
> speed up 99.9999% of execve() and sigreturn() syscalls and only force
> the 'weird' process startup modes into the slow return path.

sysret/sysexit also can't restore rcx and r11/rdx. This would not
work for execve, since it sets those registers to zero. It could
possibly work for sigreturn if the signal interrupted a syscall. We
already have the opportunistic sysret code for 64-bit returns.

--
Brian Gerst
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/