Re: Compat 32-bit syscall entry from 64-bit task!?

From: Will Drewry
Date: Fri Jan 20 2012 - 14:46:04 EST


On Fri, Jan 20, 2012 at 11:56 AM, Roland McGrath <mcgrathr@xxxxxxxxxx> wrote:
> In arch_ptrace, task_user_regset_view is called on current.  On an x86-64
> kernel, that path is only reached for a 64-bit syscall.  compat_arch_ptrace
> doesn't use it at all, always using the 32-bit view.  So your change would
> have no effect on PTRACE_GETREGS.
>
> It would only affect PTRACE_GETREGSET, which calls task_user_regset_view on
> the target task.  Is that what you meant?

Exactly - sorry for being unclear!

> I think that would be confusing
> at best.  A caller of PTRACE_GETREGSET is expecting a particular layout
> based on what type of task he thinks he's dealing with.  The caller can
> look at the iov_len in the result to discern which layout it actually got
> filled in, but I don't think that's what callers expect.

The question of what callers expect wasn't so clear to me -- for two reasons:
1. I was misreading
2. Compat syscall numbering.

#1 I had mistakenly thought that TIF_IA32 was set on a task if
personality(2) was called with PER_LINUX/PER_LINUX32. It appears that
the thread info flag can only be set by the binfmt handlers at exec-time,
so personality(2) cannot be used to change the user_regs_struct on the
fly (just signal mappings).

#2 In the case of a 64-bit process doing a 32-bit system call without
a personality change, the 64-bit register view will be consistent,
but, as discussed, the numbering will be incorrect. So what the
caller gets back still seems not to be what they were expecting; it's
just not as far off as a different register view would be.
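
To make the numbering problem concrete, here is a rough sketch of what a
tracer sees. The tables are small excerpts of the i386 and x86-64 syscall
tables, and decode_nr() is a hypothetical helper for illustration, not
anything in the kernel or in an existing tracer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Small, incomplete excerpts of the two x86 syscall tables. */
struct nr_name { long nr; const char *name; };

static const struct nr_name i386_table[] = {
    { 1, "exit" }, { 3, "read" }, { 4, "write" }, { 20, "getpid" },
};

static const struct nr_name x86_64_table[] = {
    { 0, "read" }, { 1, "write" }, { 20, "writev" },
    { 39, "getpid" }, { 60, "exit" },
};

/* Hypothetical helper: name a raw syscall number using either the
 * compat (i386) table or the native (x86-64) table. */
static const char *decode_nr(long nr, int compat)
{
    const struct nr_name *t = compat ? i386_table : x86_64_table;
    size_t n = compat ? sizeof(i386_table) / sizeof(i386_table[0])
                      : sizeof(x86_64_table) / sizeof(x86_64_table[0]);

    for (size_t i = 0; i < n; i++)
        if (t[i].nr == nr)
            return t[i].name;
    return "unknown";
}

/* The same raw number names two different calls depending on the table. */
static void demo(void)
{
    assert(strcmp(decode_nr(20, 1), "getpid") == 0);
    assert(strcmp(decode_nr(20, 0), "writev") == 0);
}
```

A tracer that sees 20 in orig_rax and assumes the 64-bit table reports
writev, even though the task actually entered getpid via the compat path.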

In either case the output from PTRACE_GETREGS is broken for the
TS_COMPAT-64-bit process flow, but it all comes down to determining
which brokenness is worse: the silent change in system call numbering
and register truncation, or a different but accurate user_regs_struct :/
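
For reference, the iov_len check mentioned above could look roughly like
this on the tracer side. This is only a sketch: the sizes assume the usual
x86 user_regs_struct layouts (17 32-bit fields for i386, 27 64-bit fields
for x86-64), and classify_regs()/fetch_layout() are made-up names:

```c
#include <assert.h>
#include <elf.h>          /* NT_PRSTATUS */
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>

enum regs_layout { LAYOUT_UNKNOWN, LAYOUT_I386, LAYOUT_X86_64 };

/* Assumed sizes of the two general-purpose register layouts. */
#define I386_REGS_SIZE   (17 * 4)   /* 32-bit user_regs_struct */
#define X86_64_REGS_SIZE (27 * 8)   /* 64-bit user_regs_struct */

/* Map the length the kernel reported back to a register layout. */
static enum regs_layout classify_regs(size_t iov_len)
{
    if (iov_len == I386_REGS_SIZE)
        return LAYOUT_I386;
    if (iov_len == X86_64_REGS_SIZE)
        return LAYOUT_X86_64;
    return LAYOUT_UNKNOWN;
}

/* Fetch NT_PRSTATUS from a stopped tracee and report which layout
 * the kernel filled in. The buffer is sized for the larger layout. */
static enum regs_layout fetch_layout(pid_t pid)
{
    unsigned long long buf[27];
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };

    assert(pid > 0);
    if (ptrace(PTRACE_GETREGSET, pid,
               (void *)(unsigned long)NT_PRSTATUS, &iov) == -1)
        return LAYOUT_UNKNOWN;

    /* The kernel updates iov_len to the number of bytes written. */
    return classify_regs(iov.iov_len);
}
```

A caller that already does this would cope with either view; the concern
is the callers that assume a layout up front and never look at iov_len.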

> The other use of task_user_regset_view is in core dump
> (binfmt_elf.c:fill_note_info).  Off hand I don't think there's a way a core
> dump can be started while still "inside" a syscall so that TS_COMPAT could
> ever be set.  But that should be double-checked.

That was my reading, too, but additional eyes would be useful.

> As to whether it was considered before, I doubt that it was.  I don't
> really recall the sequence of events, but I think that I did all the
> user_regset code before I was really cognizant of the TS_COMPAT subtleties.

Makes sense.

Thanks!
will