Re: Compat 32-bit syscall entry from 64-bit task!?

From: Andrew Lutomirski
Date: Wed Mar 08 2017 - 23:40:35 EST


On Wed, Mar 8, 2017 at 3:41 PM, Dmitry V. Levin <ldv@xxxxxxxxxxxx> wrote:
> Hi,
>
> On Thu, Jan 26, 2012 at 07:03:43PM +0100, Denys Vlasenko wrote:
>> Hi Linus,
>>
>> On Thu, Jan 26, 2012 at 4:47 AM, Linus Torvalds
>> <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>> >> Please look at strace source, get_scno() function, where
>> >> it reads syscall no and parameters. Let's see....
>> >> - POWERPC: has 32-bit and 64-bit mode
>> >> - X86_64: has 32-bit and 64-bit mode
>> >> - IA64: has i386-compat mode
>> >> - ARM: has more than one ABI
>> >> - SPARC: has 32-bit and 64-bit mode
>> >>
>> >> Do you want to re-invent a different arch-specific way to report
>> >> syscall type for each of these arches?
>> >
>> > I think an arch-specific one is better than trying to make some
>> > generic one that is messy.
>> >
>> > As you say, many architectures have multiple system call ABIs.
>> >
>> > But they tend to be very *different* issues. They can be about
>> > multiple ABI's, as you mention, and even when they *look* similar
>> > (32-bit vs 64-bit ABI's) they are actually totally different issues.
>> > [skip]
>>
>> I don't have a particular attachment to my solution,
>> and I think we already talk about this problem for
>> far too long.
>>
>> Looks like nobody is _strongly_ opposed to your patch
>> which uses a few bits in eflags to report bitness
>> of the x86 syscall.
>>
>> Lets just do that already. If you commit it to kernel git,
>> I will immediately change strace accordingly.
>
> Is there any progress with this (or any alternative) solution?
>
> I see the kernel side has changed a bit, and the strace part
> is in a better shape than 5 years ago (although I'm biased of course),
> but I don't see any kernel interface that would allow strace to reliably
> recognize this 0x80 case.

I am strongly opposed to fudging registers to half-arsedly slightly
improve the epically crappy ptrace(2) interface for syscalls.

To fix this right, please just add PTRACE_GET_SYSCALL_INFO or similar
to read out all the syscall details in one shot. This means: arch,
syscall number, arg0..arg5, and *whether it's entry or exit*. I propose
returning this structure:

struct ptrace_syscall_info {
	u8 op;	/* 0 for entry, 1 for exit */
	u8 pad0;
	u16 pad1;
	u32 pad2;
	union {
		struct seccomp_data syscall_entry;
		s64 syscall_exit_retval;
	};
};

because struct seccomp_data already gets this right. There's plenty
of opportunity to fine-tune this, but this way it works on all
architectures.

Since struct seccomp_data may be extended in the future, the operation
should be:

ptrace(PTRACE_GET_SYSCALL_INFO, pid,
       (void *)sizeof(struct ptrace_syscall_info), &info);

returns 0 on success and some error code if, for example, the current
ptrace stop isn't a syscall entry or exit.
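
To make the proposed layout concrete, here is a rough userspace sketch
of how a tracer might decode such a record. It does not issue the
ptrace request. Everything named *_sketch or *_mirror is hypothetical
and exists only for illustration: seccomp_data_mirror is a local copy
of the uapi struct seccomp_data layout so the example builds without
kernel headers, and the two SKETCH_AUDIT_ARCH_* constants are assumed
to match the values in <linux/audit.h>.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed to match AUDIT_ARCH_I386 / AUDIT_ARCH_X86_64 in <linux/audit.h>. */
#define SKETCH_AUDIT_ARCH_I386   0x40000003u
#define SKETCH_AUDIT_ARCH_X86_64 0xC000003Eu

enum { SYSCALL_INFO_ENTRY = 0, SYSCALL_INFO_EXIT = 1 };

/* Local mirror of the uapi struct seccomp_data (illustration only). */
struct seccomp_data_mirror {
	int32_t  nr;                   /* syscall number */
	uint32_t arch;                 /* AUDIT_ARCH_* of the entry path */
	uint64_t instruction_pointer;
	uint64_t args[6];              /* arg0..arg5 */
};

/* The structure proposed above, spelled with fixed-width userspace types. */
struct ptrace_syscall_info_sketch {
	uint8_t  op;                   /* 0 for entry, 1 for exit */
	uint8_t  pad0;
	uint16_t pad1;
	uint32_t pad2;
	union {
		struct seccomp_data_mirror syscall_entry;
		int64_t syscall_exit_retval;
	};
};

/* The explicit padding puts the union at a fixed 8-byte offset. */
static_assert(offsetof(struct ptrace_syscall_info_sketch, syscall_entry) == 8,
	      "union should start right after the 8-byte header");
static_assert(sizeof(struct ptrace_syscall_info_sketch) == 72,
	      "8-byte header plus 64-byte seccomp_data mirror");

/* Returns 1 if this is a syscall entry made through the i386 ABI
 * (e.g. int 0x80), regardless of the bitness of the task itself. */
int is_compat_entry(const struct ptrace_syscall_info_sketch *info)
{
	return info->op == SYSCALL_INFO_ENTRY &&
	       info->syscall_entry.arch == SKETCH_AUDIT_ARCH_I386;
}

/* Formats a one-line description of the stop into buf. */
int describe_stop(const struct ptrace_syscall_info_sketch *info,
		  char *buf, size_t len)
{
	if (info->op == SYSCALL_INFO_ENTRY)
		return snprintf(buf, len,
				"entry arch=0x%08" PRIx32 " nr=%" PRId32
				" arg0=0x%" PRIx64,
				info->syscall_entry.arch,
				info->syscall_entry.nr,
				info->syscall_entry.args[0]);
	return snprintf(buf, len, "exit retval=%" PRId64,
			info->syscall_exit_retval);
}
```

With a record like this, a tracer would look at the arch field to tell
an int 0x80 entry from a 64-bit task apart from a native 64-bit
syscall, instead of guessing from register contents.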

--Andy