the usage of __SYSCALL_MASK in entry_SYSCALL_64/do_syscall_64 is not consistent

From: Oleg Nesterov
Date: Mon Jun 20 2016 - 14:03:42 EST


On 06/19, Andy Lutomirski wrote:
>
> Something's clearly buggy there,

The usage of __X32_SYSCALL_BIT doesn't look right either. Nothing serious,
but still.

Damn, initially I thought I had found a serious bug in entry_64.S
and it took me some time to understand why my exploit didn't work ;)
So I learned that

	andl	$__SYSCALL_MASK, %eax

in entry_SYSCALL_64_fastpath() zero-extends %rax and thus

	cmpl	$__NR_syscall_max, %eax
	...
	call	*sys_call_table(, %rax, 8)

is correct (rax <= __NR_syscall_max).
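
To spell out the zero-extension, here is a minimal user-space C model of that
andl. It is only a sketch: X32_SYSCALL_BIT and andl_syscall_mask() are names
made up for the illustration, and the 0x40000000 value is assumed from the
x32 ABI headers.

```c
#include <stdint.h>

/* Assumed value: bit 30 marks an x32 syscall number. */
#define X32_SYSCALL_BIT 0x40000000u

/*
 * Model of "andl $__SYSCALL_MASK, %eax": a 32-bit operation on %eax
 * clears bits 63:32 of %rax, so only the masked low half survives.
 */
static uint64_t andl_syscall_mask(uint64_t rax)
{
	uint32_t eax = (uint32_t)rax;	/* low half of %rax */

	eax &= ~X32_SYSCALL_BIT;	/* 32-bit mask, 0xBFFFFFFF */
	return (uint64_t)eax;		/* zero-extended back to 64 bits */
}
```

With rax == 0xFFFFFFFF0000003c this model returns 0x3c, so the table index
used by the call is within bounds.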

OK, so entry_64.S simply "ignores" the upper bits if CONFIG_X86_X32_ABI is set.
Fine, but this doesn't match the

	if (likely((nr & __SYSCALL_MASK) < NR_syscalls))

check in do_syscall_64(). So this test-case

	#include <stdio.h>

	int main(void)
	{
		// __NR_exit == 0x3c
		asm volatile ("movq $0xFFFFFFFF0000003c, %rax; syscall");

		printf("I didn't exit because I am traced\n");

		return 0;
	}

silently exits if not traced, otherwise it calls printf().
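
The mismatch can be modelled in plain C. Again just a sketch under stated
assumptions: the mask is taken to be a plain int, ~0x40000000, which
sign-extends when combined with an unsigned long, NR_SYSCALLS_MODEL is a
made-up table size, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define X32_SYSCALL_BIT		0x40000000		/* assumed: plain int */
#define SYSCALL_MASK		(~(X32_SYSCALL_BIT))	/* int, negative */
#define NR_SYSCALLS_MODEL	512UL			/* hypothetical table size */

/* Fast path model: 32-bit and, then 32-bit compare; upper bits are gone. */
static bool fastpath_dispatches(uint64_t rax)
{
	uint32_t eax = (uint32_t)rax & (uint32_t)SYSCALL_MASK;

	return eax < NR_SYSCALLS_MODEL;
}

/* do_syscall_64() model: nr is 64-bit, so the int mask sign-extends
 * to 0xFFFFFFFFBFFFFFFF and the upper bits of nr are preserved. */
static bool slowpath_dispatches(uint64_t nr)
{
	return (nr & (uint64_t)(int64_t)SYSCALL_MASK) < NR_SYSCALLS_MODEL;
}
```

For 0xFFFFFFFF0000003c the first helper returns true and the second returns
false, which matches what the test-case shows with and without a tracer.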

Should we do something, or do we not care?

Oleg.