Re: [PATCH v2 1/3] powerpc: Don't force ENOSYS as error on syscall fail

From: Purcareata Bogdan
Date: Thu Feb 12 2015 - 03:38:59 EST


On 12.02.2015 07:24, Michael Ellerman wrote:
On Wed, 2015-02-11 at 08:36 +0000, Bogdan Purcareata wrote:
In certain scenarios - e.g. seccomp filtering with ERRNO as default action -
the system call fails for other reasons than the syscall not being available.
The seccomp filter can be configured to store a user-defined error code on
return from a blacklisted syscall. Don't always set ENOSYS on
do_syscall_trace_enter failure.

v2:
- move setting ENOSYS as errno from the syscall entry assembly to
do_syscall_trace_enter, only in the specific case

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 194e46d..0111e04 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -269,7 +269,6 @@ syscall_dotrace:
b .Lsyscall_dotrace_cont

syscall_enosys:
- li r3,-ENOSYS
b syscall_exit


This still looks wrong to me.

On 64 bit we do:

CURRENT_THREAD_INFO(r11, r1)
ld r10,TI_FLAGS(r11)
andi. r11,r10,_TIF_SYSCALL_DOTRACE
bne syscall_dotrace
.Lsyscall_dotrace_cont:
cmpldi 0,r0,NR_syscalls
bge- syscall_enosys
...

syscall_enosys:
li r3,-ENOSYS
b .Lsyscall_exit


Your patch removes the load of ENOSYS.

Which means if we're not doing syscall tracing, and we get an out-of-bounds
syscall number, we'll return with something random on r3. Won't we?

Thanks for pointing this out, you are absolutely right. Perhaps the following would fix the issue. On 64-bit:

ld r10,TI_FLAGS(r11)
andi. r11,r10,_TIF_SYSCALL_T_OR_A
bne syscall_dotrace
-.Lsyscall_dotrace_cont:
cmpldi 0,r0,NR_syscalls
bge- syscall_enosys

system_call:
...

syscall_enosys:
li r3,-ENOSYS
b .Lsyscall_exit
...

syscall_dotrace:
...
addi r9,r1,STACK_FRAME_OVERHEAD
CURRENT_THREAD_INFO(r10, r1)
ld r10,TI_FLAGS(r10)
- b .Lsyscall_dotrace_cont
+ cmpldi 0,r0,NR_syscalls
+ bge syscall_exit
+ b system_call

So basically I leave the code for syscall_enosys unchanged, but use it only when not doing syscall tracing. When doing syscall tracing, I'm assuming do_syscall_trace_enter takes care of setting the errno; should it return an invalid syscall number, we go directly to syscall_exit without touching r3.
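
To make the intended control flow easier to review, here is a rough C model of it. This is only a sketch, not kernel code: model_regs, model_trace_enter, model_handler, model_syscall_entry and MODEL_NR_SYSCALLS are hypothetical names, and the behaviour of the do_syscall_trace_enter stand-in (returning -1 after storing a negative errno in r3) is an assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical table size; the real bound is NR_syscalls. */
#define MODEL_NR_SYSCALLS 4UL

/* Only the two registers this discussion cares about. */
struct model_regs {
	long r0;	/* syscall number */
	long r3;	/* return value / error slot */
};

/*
 * Stand-in for do_syscall_trace_enter(). ASSUMPTION: it returns the
 * syscall number to run, or -1 after storing a negative errno in r3.
 * filter_errno models a seccomp filter configured to return an errno.
 */
static long model_trace_enter(struct model_regs *regs, int filter_errno)
{
	if (filter_errno != 0) {
		regs->r3 = -filter_errno;	/* user-defined error code */
		return -1;
	}
	if ((unsigned long)regs->r0 >= MODEL_NR_SYSCALLS) {
		regs->r3 = -ENOSYS;		/* set here, not in the entry code */
		return -1;
	}
	return regs->r0;
}

/* Dummy syscall handler so results are easy to tell apart. */
static long model_handler(long nr)
{
	return 100 + nr;
}

/* Models the proposed entry path for both the traced and untraced case. */
static long model_syscall_entry(struct model_regs *regs, bool traced,
				int filter_errno)
{
	if (traced) {
		long nr = model_trace_enter(regs, filter_errno);

		/* Unsigned compare, like cmpldi: -1 is out of range. */
		if ((unsigned long)nr >= MODEL_NR_SYSCALLS)
			return regs->r3;	/* exit, keeping r3 as set by the hook */
		return model_handler(nr);
	}

	/* Untraced path: the entry code still loads ENOSYS itself. */
	if ((unsigned long)regs->r0 >= MODEL_NR_SYSCALLS)
		return -ENOSYS;
	return model_handler(regs->r0);
}
```

In this model an untraced out-of-range number still yields -ENOSYS, while a traced call rejected by the filter returns whatever error the hook stored.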

The 32-bit code looks more or less similar, although the label has a different
name.

Same thing for 32-bit:

_GLOBAL(DoSyscall)
lwz r11,TI_FLAGS(r10)
andi. r11,r11,_TIF_SYSCALL_T_OR_A
bne- syscall_dotrace
-syscall_dotrace_cont:
cmplwi 0,r0,NR_syscalls
lis r10,sys_call_table@h
ori r10,r10,sys_call_table@l
slwi r0,r0,2
bge 66f
+syscall_dotrace_cont:
lwzx r10,r10,r0 /* Fetch system call handler [ptr] */
mtlr r10
addi r9,r1,STACK_FRAME_OVERHEAD
...

66: li r3,-ENOSYS
b ret_from_syscall
...

syscall_dotrace:
lwz r7,GPR7(r1)
lwz r8,GPR8(r1)
REST_NVGPRS(r1)
+ cmplwi 0,r0,NR_syscalls
+ lis r10,sys_call_table@h
+ ori r10,r10,sys_call_table@l
+ slwi r0,r0,2
+ bge- ret_from_syscall
b syscall_dotrace_cont

However, I must admit that I don't like duplicating those 4 lines of code that verify the syscall number, and I can't think of a better way to do this. I also thought about leaving the check in one place and then branching differently depending on _TIF_SYSCALL_T_OR_A. Do you think that would be a better approach?
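
For comparison, the single-check alternative could look roughly like this when modelled in C. Again this is just a sketch under assumptions: alt_syscall_entry, alt_handler and ALT_NR_SYSCALLS are hypothetical names, the traced flag stands in for testing _TIF_SYSCALL_T_OR_A, and r3 is assumed to already hold the error stored by the trace hook when tracing is active.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical table size; the real bound is NR_syscalls. */
#define ALT_NR_SYSCALLS 4UL

/* Dummy syscall handler so results are easy to tell apart. */
static long alt_handler(long nr)
{
	return 100 + nr;
}

/*
 * One bounds check shared by both paths. Only the out-of-range branch
 * looks at the tracing flag to decide what ends up in r3.
 */
static long alt_syscall_entry(long nr, long r3, bool traced)
{
	/* Unsigned compare, so a negative number counts as out of range. */
	if ((unsigned long)nr >= ALT_NR_SYSCALLS) {
		if (traced)
			return r3;	/* ASSUMPTION: hook already set the error */
		return -ENOSYS;		/* untraced: entry code sets ENOSYS */
	}
	return alt_handler(nr);
}
```

The check is no longer duplicated, at the cost of one extra flag test on the (rare) invalid-number path.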

Thank you,
Bogdan P.