Re: Thread implementations...

Linus Torvalds (torvalds@transmeta.com)
Sun, 28 Jun 1998 00:52:12 -0700 (PDT)


On Sun, 28 Jun 1998, MOLNAR Ingo wrote:
> On 28 Jun 1998, Michael O'Reilly wrote:
>
> > Please reconsider this? There are some things that do a LOT of
> > network-to-network copies (i.e. proxy servers), [...]
> [...]
>
> nope, this is not how squid works. It wants a local copy of fetched
> documents as well. So squid wants to use sendfile() this way:
>
> sendfile(outsidesocket,localfile);
> sendfile(localfile,clientsocket);

Actually, that would be

receivefile(outsidesocket, localfile);
sendfile(localfile, clientsocket);

to be picky. Or "copyfd() x 2".

The problem for me is that I feel that there is one case that is trivial
to implement, and that one case I am absolutely _certain_ will remain
trivial to implement in the future, regardless of any VFS or networking
changes.

That one case is the "copy from file cache" case. I wish there were other
cases, but all the other cases I've thought about have a lot of fairly
nasty problems.

For example, even just "copy TO file cache" is hard. On the face of it it
sounds like it should be just as easy as copying from the file cache, but
it isn't. Writing a file is several orders of magnitude harder than
reading it.

And if there is one thing I'm nervous about, it's adding new interfaces to
the kernel that I am not sure make 100% sense to maintain. For a simple
"sendfile()" I have no qualms whatsoever: not only was I trivially able to
implement it, I cannot see _any_ way I could ever break it even by mistake
when I wanted to add some other feature.

In order to break sendfile() I'd have to break either the "read()" system
call, or break the most fundamental of all "mmap()" cases - anonymous
read-only mappings.

The same is not true of any of the other cases for copying from one source
to another. Yes, I can imagine doing insanely clever things like just
copying skb's around from one socket to another, but it's by no means
obvious how to do it, nor whether it is a feature that I'd be ready to
support forever even if I had an initial implementation.

In short, in "sendfile()" I have something that I'm confident works, and
will always continue to do so, and I feel that the kernel will always be
able to do a good implementation of it. I don't have the same confidence
in the more generic "copyfd()" - I can certainly implement it as a
"read+write", but that wouldn't be any better than doing it in user space
and could potentially be worse (especially when it comes to error
reporting etc, but potentially also for buffering and thus performance).
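
For comparison, the "read+write" fallback is nothing more than the loop below, which any program can already run in user space. This is only a sketch: copyfd_userspace() and the 8 KB buffer are assumptions made for illustration, not an existing interface.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: copy everything from in_fd to out_fd through a
 * user-space buffer. Returns bytes copied, or -1 with errno set. */
ssize_t copyfd_userspace(int in_fd, int out_fd)
{
    char buf[8192];
    ssize_t total = 0;

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total;       /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted, retry the read */
            return -1;
        }

        /* write() may accept fewer bytes than asked; drain the buffer. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

In user space the caller knows exactly which of the two descriptors failed and how much was written before the failure; a single in-kernel call would have to fold all of that into one return value.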

The advantage of being specific is that I don't give any sweeping promises
to user space. I'd hate to have a system call that would in certain
circumstances just perform worse than doing it the straightforward way in
user space.

One of the basic problems with receiving to a file is that right now the
VFS layer is deficient when it comes to giving a pageful of data to the
lower level filesystems. We have a "readpage()" action, but we don't have
a good "writepage()" one that would be any better than just doing a normal
"write()" (which implies a copy). It will have to be fixed at some point,
but it certainly won't happen before 2.2.

Linus
