Re: [patch 6/9] signalfd/timerfd v1 - timerfd core ...

From: Linus Torvalds
Date: Sun Mar 11 2007 - 00:31:39 EST




On Sat, 10 Mar 2007, Nicholas Miell wrote:
>
> Ah, I see. You're just interested in fds as a generic handle concept,
> and not a more Plan 9 type thing.

Indeed. It's a "handle".

UNIX has pid's for "process" handles, and "file descriptors" for just
about everything else.

> If that's the goal, somebody should start thinking about reducing the
> contents of struct file to the bare minimum (i.e. not much more than a
> file_operations pointer).

Well, there's more there, but it really is fairly close. If you look at
it, a "struct file" ends up not having a lot more than the minimal stuff
required to use it as a a handle: it really isn't a very big structure.

The biggest part is actually the read-ahead state, which is arguably a
generic thing for a file handle, even though not all kinds will be able to
use it. We *could* make that be behind a pointer (along with the "f_pos"
thing, that really logically goes along with the read-ahead thing), of
course, but since most files probably do end up being "traditional file"
structures, it's probably not wrong to just have it in the file.

> It'd be useful if the polling interfaces could return small datums
> beyond just the POLL* flags -- having to do a read on timerfd just to
> get the overrun count has a lot of overhead for just an integer, and I
> imagine other things would like to pass back stuff too.

Well, since a lot of the interfaces harken back to "select()", we really
are stuck with basically a couple of bits total (poll extends on the
number of bits, but not a whole lot). So right now we have just "an event
happened", and if you want to know more, you do need to do a read() or
similar. That's true of all the traditional file descriptors too, of
course.

> You still want timeouts, creating/setting/destroying at timer just for
> a single call to select/poll/epoll is probably too heavy weight.

Well, since the interfaces for that already exists, I'm certainly not
going to disagree.

> timerfd() still leaves out the basic clock selection functionality
> provided by both setitimer() and timer_create().

Well, the setitimer ones do not really make sense for a timer that isn't
directly associated with one particular process. Once it's associated with
a file descriptor, it really isn't bound to any particular execution
context, and as such, virtual and profiling timers really don't make any
sense any more!

The only thing that exists outside of an execution context is really just
"relative" and "absolute". Of course, you could still specify just what
you want your timers to be based on (ie the "realtime" vs "monotonic"
thing), and possibly the resolution, but it really does boil down to just
those two choices (and the rest is just confusion).

So I really don't think you lose a lot by just limiting it to "real time"
vs "relative time". Those really *are* the choices.

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/