Re: the x86 sysret_rip test fails on the Intel FRED architecture

From: H. Peter Anvin
Date: Sun Jan 22 2023 - 18:46:22 EST


On 1/21/23 20:34, Dave Hansen wrote:
On 1/21/23 19:38, Li, Xin3 wrote:
However, it doesn't seem to make sense to do so to me. The current behavior is
much more of an artifact than desired behavior.
We kind of have an agreement that %r11 = %flags after returning from the kernel.

And the question is, is it what we want?

Can the selftest just set r11=rflags before the syscall? The old
syscall entry path will set r11=rflags. The FRED path won't touch it.
Either case will pass an r11==rflags check.

Thinking about this some more, the whole point here is to make sure that all syscalls work consistently, so that we don't leak a kernel implementation detail.

Either r11 and rcx should both be clobbered (set to %rcx and the return %rip respectively), or none should be.

As such, I think we want to do something like:



/* Arbitrary values */
const unsigned long r11_sentinel = 0xfeedfacedeadbeef;
const unsigned long rcx_sentinel = 0x5ca1ab1e0b57ac1e;

/* An arbitrary *valid* RFLAGS value */
const unsigned long rflags_sentinel = 0x200893;

enum regs_ok {
REGS_ERROR = -1, /* Invalid register contents */
REGS_SAVED = 0, /* Registers properly preserved */
REGS_SYSRET = 1 /* Registers match syscall/sysret */
};

/*
* Returns:
* 0 = %rcx and %r11 preserved
* 1 = %rcx and %r11 set to %rflags and %rip
* -1 = %rcx and/or %r11 set to any other values
*
* Note that check_regs_syscall() set %rbx to the syscall return %rip.
*/
static enum regs_ok check_regs_result(unsigned long r11,
unsigned long rcx, unsigned long rbx)
{
if (r11 == r11_sentinel && rcx == rcx_sentinel)
return REGS_SAVED;
else if (r11 == rflags_sentinel && rcx == rbx)
return REGS_SYSRET;
else
return REGS_ERROR;
}

static enum regs_ok check_regs_syscall(int syscall,
unsigned long arg1, unsigned long arg2)
{

register unsigned long r11 asm("%r11");
unsigned long rcx, rbx, tmp;

r11 = r11_sentinel;
rcx = rcx_sentinel;

asm volatile("push %3; popf; "
"lea 1f(%%rip),%2; "
"syscall; "
"1:"
: "+r" (r11), "+c" (rcx), "=b" (rbx)
: "g" (rflags_sentinel),
"a" (syscall), "D" (arg1), "S" (arg2));

return check_regs_result(r11, rcx, rbx);
}

enum regs_ok
check_regs_getppid(void)
{
return check_regs_syscall(__NR_getppid, 0, 0);
}



Test it out with a trivial system call like __NR_getppid which is extremely likely to return with SYSRET on an IDT system; check that it returns a nonnegative value and then save the result.

All tests from that point on should return the *same* value, including the signal handler etc.

I make the result-checking function separate so that it can be fed data from e.g. the signal stack or ptrace.

-hpa