Re: Syslets, Threadlets, generic AIO support, v6

From: Ingo Molnar
Date: Thu May 31 2007 - 02:14:22 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> > I agree. What would be a good interface to allocate fds in such
> > area? We don't want to replicate syscalls, so maybe a special new
> > dup function?
>
> I'd do it with something like "newfd = dup2(fd, NONLINEAR_FD)" or
> similar, and just have NONLINEAR_FD be some magic value (for example,
> make it be 0x40000000 - the bit that says "private, nonlinear" in the
> first place).
>
> But what's gotten lost in the current discussion is that we probably
> don't actually _need_ such a private space. I'm just saying that if
> the *choice* is between memory-mapped interfaces and a private
> fd-space, we should probably go for the latter. "Everything is a file"
> is the UNIX way, after all. But there's little reason to introduce
> private fd's otherwise.

it's both a flexibility and a speedup thing:

flexibility: the need for libraries to open files and keep them open
comes up regularly. For example, glibc is currently quite wasteful in a
number of common networking-related functions (Ulrich, please correct me
if i'm wrong). These could be optimized if glibc could keep a netlink
channel fd open, poll() it for changes, and reuse cached results when
nothing has changed (or something like that).
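
The poll-and-cache pattern described above could look roughly like the
sketch below. This is not glibc code: struct cached_lookup,
expensive_query() and cached_get() are hypothetical names, and any
readable fd (a pipe is enough for testing) stands in for the netlink
channel. The point is only that a zero-timeout poll() on a long-lived fd
lets the library skip the expensive path when nothing has changed.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical cache state a library might keep next to a long-lived fd. */
struct cached_lookup {
    int notify_fd;   /* fd that becomes readable on change (netlink in the email) */
    int valid;       /* non-zero once 'value' has been filled in */
    int value;       /* cached result of the expensive query */
    int refreshes;   /* how often the expensive path actually ran */
};

/* Stand-in for the costly lookup; returns a new value on each call. */
static int expensive_query(void)
{
    static int generation;
    return ++generation;
}

/* Return the cached value, refreshing it only if the fd signals a change. */
static int cached_get(struct cached_lookup *c)
{
    struct pollfd pfd = { .fd = c->notify_fd, .events = POLLIN };
    /* Timeout 0: never block, just ask whether a notification is pending. */
    int changed = poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN);

    if (changed) {
        char buf[256];
        /* Consume the pending notification so the next poll() is quiet. */
        ssize_t n = read(c->notify_fd, buf, sizeof(buf));
        (void)n;
    }
    if (changed || !c->valid) {
        c->value = expensive_query();
        c->valid = 1;
        c->refreshes++;
    }
    return c->value;
}
```

Repeated calls to cached_get() cost one poll() each until something is
written to the fd. The catch, and the reason this ties into the private
fd-space discussion, is that the library has to own an fd that the
application will not close or reuse by accident.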

speedup: i suggested O_ANY 6 years ago as a speedup to Apache -
non-linear fds are cheaper to allocate/map:

http://www.mail-archive.com/linux-kernel@xxxxxxxxxxxxxxx/msg23820.html

(i definitely remember having written code for that too, but i cannot
find that in the archives. hm.) In theory we could avoid _all_ fd-bitmap
overhead as well and use a per-process list/pool of struct file buffers
plus a maximum-fd field as the 'non-linear fd allocator' (at the price
of only deallocating them at process exit time).
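
A minimal user-space sketch of such a bitmap-free allocator follows. It
is an illustration under stated assumptions, not kernel code: struct
file_stub stands in for struct file, the nl_* names are made up, the
pool has a fixed capacity, and the 0x40000000 tag bit is borrowed from
the value suggested in the quoted text above. Allocation is a single
increment of the maximum-fd field, and slots are only released when the
whole pool is torn down.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Tag bit for "private, nonlinear" fds; value taken from the quoted suggestion. */
#define NONLINEAR_FD_BIT 0x40000000

/* Hypothetical stand-in for the kernel's struct file. */
struct file_stub {
    int backing_fd;
};

/* Per-process pool: an array of slots plus a maximum-fd field. */
struct nl_fd_pool {
    struct file_stub *slots;
    size_t capacity;
    size_t max_fd;   /* index of the next never-used slot */
};

static int nl_pool_init(struct nl_fd_pool *p, size_t capacity)
{
    /* Indices must stay below the tag bit. */
    assert(capacity > 0 && capacity < (size_t)NONLINEAR_FD_BIT);
    p->slots = calloc(capacity, sizeof(*p->slots));
    if (!p->slots)
        return -1;
    p->capacity = capacity;
    p->max_fd = 0;
    return 0;
}

/* O(1) allocation: bump max_fd, no bitmap scan for the lowest free fd. */
static int nl_fd_alloc(struct nl_fd_pool *p, int backing_fd)
{
    if (p->max_fd >= p->capacity)
        return -1;
    size_t idx = p->max_fd++;
    p->slots[idx].backing_fd = backing_fd;
    return (int)(NONLINEAR_FD_BIT | idx);
}

/* Map a tagged fd back to its slot; NULL for linear or unallocated fds. */
static struct file_stub *nl_fd_lookup(struct nl_fd_pool *p, int fd)
{
    if (fd < 0 || !(fd & NONLINEAR_FD_BIT))
        return NULL;
    size_t idx = (size_t)(fd & ~NONLINEAR_FD_BIT);
    return idx < p->max_fd ? &p->slots[idx] : NULL;
}

/* The only deallocation point, mirroring "freed at process exit". */
static void nl_pool_exit(struct nl_fd_pool *p)
{
    free(p->slots);
    p->slots = NULL;
    p->capacity = 0;
    p->max_fd = 0;
}
```

There is deliberately no per-fd close here. That is the trade-off named
above: dropping the bitmap makes allocation trivial, but a long-running
process that churns through many such fds keeps growing its pool until
it exits.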

Ingo