Re: {sys_,/dev/}epoll waiting timeout

From: Jamie Lokier (jamie@shareable.org)
Date: Tue Jan 28 2003 - 16:36:21 EST


Randy.Dunlap wrote:
> On Wed, 22 Jan 2003, Jamie Lokier wrote:
>
> | ps. sys_* system-call functions should never return "int". They
> | should always return "long" or a pointer - even if the user-space
> | equivalent returns "int". Take a look at sys_open() for an example.
> | Technical requirement of the system call return path on 64-bit targets.
>
> Is this a blanket truism? For all architectures?

I believe so, for all architecture-independent syscall functions.
(And architecture-dependent on those that need it -- see end of this
message).

Linus mentioned it in passing in the recent x86 vsyscall threads. I
have never seen it mentioned before. I always assumed that returned
longs were due to sloppy programming :)

If you look at the syscall return paths on the 64-bit architectures,
some of them always check the 64-bit return value register to see if
it is negative. If so, they set an error flag, which is what
userspace uses to decide whether to return -1, rather than checking if
the return value is >= -4096 (or >= -125, architecture implementations vary).

This convoluted ABI allows those architectures to return full 64-bit
values as legitimate values, which is needed for... ptrace(), only on
those architecture of course.

The question is therefore are "int" values returned from C functions
sign-extended to 64 bits? I don't know, maybe it varies between
architectures -- for all I know it happens to work on all the
supported 64-bit architectures -- but I believe this is the reason why
"long" (or equivalent, e.g. "ssize_t") and pointer types are
_supposed_ to be the only permitted return types from syscall
functions.

> Should current (older/all) syscalls be modified, or should only new ones
> (like epoll) be corrected?

All of them.

Here's a partial list of functions which return int from 2.5.49, based
on parsing Alpha, ARM, CRIS, IA64 and x86_64 trees:

        sys_set_tid_address
        sys_futex
        sys_sched_setaffinity
        sys_sched_getaffinity
        sys_remap_file_pages
        sys_epoll_create
        sys_epoll_ctl
        sys_epoll_wait
        sys_lookup_dcookie
        sys_pause

As far as I can guess from reading the GCC sources, 32-bit return
values aren't sign extended on x86_64 and IA64, but they are on all
the other Linux-supported 64-bit architectures (I could be mistaken
about this). Of x86_64 and IA64, only the IA64 syscall return path
tests if the return value register is negative.

Which suggests that all the architectures are fine with all these
"int" returns, except IA64.

Curiously, IA64's own sys_perfmonctl() returns int.

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jan 31 2003 - 22:00:20 EST