Re: Thread implementations...

Erik Corry (erik@arbat.com)
Fri, 26 Jun 1998 00:44:00 +0200


On Thu, Jun 25, 1998 at 02:27:26PM -0700, Larry McVoy wrote:
> : > : caddr_t buf = mmap(0, len, PROT_READ, MAP_FILE | MAP_SHARED, ifd, 0);
> : > : write(ofd, buf, len);
> : >
> : > 1) Your model above still does a copy.
> :
> : I have to admit, I can't see where.
>
> OK, so read() is basically
>
> find the from pages
> bcopy(from_pages, to_user_virtual_address)

I'm going to be a little awkward here. Not so much because I'm
sure I'm right, more because I don't understand why what I am
suggesting isn't possible.

Let's assume

1) The mmap is readonly
2) The process doesn't actually read anything (only write(2))
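
For reference, here is the user-space side of the pattern being argued about, as a minimal sketch: a read-only shared mapping handed straight to write(2), with the process never touching the bytes itself. The function name mmap_write_copy is made up for illustration, and MAP_FILE is left out because it is only a compatibility flag on Linux. Whether the kernel can avoid the copy underneath this is exactly the open question; the code only shows what the process does.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy the whole of ifd to ofd by mapping the input read-only and
 * passing the mapping directly to write(2).
 * Returns the number of bytes written, or -1 on error. */
ssize_t mmap_write_copy(int ifd, int ofd)
{
    struct stat st;

    if (fstat(ifd, &st) < 0)
        return -1;
    if (st.st_size == 0)
        return 0;                   /* mmap() rejects zero-length mappings */

    size_t len = (size_t)st.st_size;
    void *buf = mmap(NULL, len, PROT_READ, MAP_SHARED, ifd, 0);
    if (buf == MAP_FAILED)
        return -1;

    /* The process itself never reads through buf; only write(2) does. */
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(ofd, (const char *)buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted before writing: retry */
            munmap(buf, len);
            return -1;
        }
        done += (size_t)n;          /* short write: carry on from here */
    }

    munmap(buf, len);
    return (ssize_t)done;
}
```

Calling mmap_write_copy(ifd, ofd) with an input file opened for reading and any writable descriptor (file, pipe, socket) copies the file in one pass, looping only on short writes.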

Lazily omit the bcopy. Use a user-virtual area that is
unreadable and unwritable anyway, so you don't need to
change any TLBs or anything and you get a fault if your
lazy assumption is wrong. If the process tries to read, you
need to map it in after all and take the TLB pain. In the

> and write is
>
> find the destination pages

Find them in the page cache and lock them down.

> bcopy(from_user_virtual_address, dest_pages)

bcopy(from_kernel_virtual_address, dest_pages)

You would have to modify copy_from_user to recognise the lazy
mmap and get it from the kernel-virtual directly.
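
A toy user-space model of that check, to make the idea concrete. This is not kernel code: struct lazy_map and lazy_copy_from_user are hypothetical names invented for this sketch, and a plain char buffer stands in for the page-cache pages. It assumes the kernel keeps a record of the user-virtual range it handed out without mapping, and redirects any copy whose source falls inside that range to the kernel-side data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of one lazily mapped region: the user-virtual
 * range the process was given, and where the data really lives. */
struct lazy_map {
    uintptr_t   user_start;     /* start of the unmapped user-virtual range */
    size_t      len;            /* length of the range in bytes */
    const char *kernel_data;    /* stand-in for the page-cache pages */
};

/* Copy n bytes whose source is the user-virtual address user_src.
 * If the whole source range lies inside the lazy mapping, read from the
 * kernel-side data instead of dereferencing the user address.
 * Returns 0 on success, -1 if the range is not covered (a real kernel
 * would then fall back to the ordinary copy from user space). */
int lazy_copy_from_user(void *dst, uintptr_t user_src, size_t n,
                        const struct lazy_map *m)
{
    if (user_src < m->user_start || n > m->len)
        return -1;

    size_t off = (size_t)(user_src - m->user_start);
    if (off > m->len - n)       /* range would run past the end */
        return -1;

    memcpy(dst, m->kernel_data + off, n);
    return 0;
}
```

The point of the model is only that the user address is used as a key, never dereferenced, so the unreadable user-virtual area never has to be mapped in on this path.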

You might need this copy in order not to have to modify the
net device driver too much. But if we are willing to modify
it (we would need to do so for splice as well) then this
copy can go too.

> Your example above is passing a mmap region to write(). Unless you go
> teach write about page flipping, or unless you lock the pages and sleep
> the process calling write, you have to bcopy out of the mmapped region
> into the destination pages (or skb buffers).

If we mapped read-only, you don't need sleep-on-write; you
just need sleep-on-unmap or -remap. If we mapped r-w, we fall
back to the current behaviour.
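
Roughly the bookkeeping I have in mind, as a single-threaded toy model. The struct and function names are invented for this sketch, and real code would need locking and a proper wait queue instead of a bare counter. It assumes each zero-copy write pins the mapping until its I/O completes, and that only unmap or remap has to wait for the pins to drain.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-mapping state; the names are made up for this sketch. */
struct ro_mapping {
    bool writable;      /* false for a PROT_READ mapping */
    int  inflight;      /* writes still referencing the pages directly */
};

/* write(2) path: returns true if the pages can be handed over without
 * a copy, and pins the mapping until the I/O completes. */
bool write_may_skip_copy(struct ro_mapping *m)
{
    if (m->writable)
        return false;   /* r-w mapping: fall back to the bcopy */
    m->inflight++;
    return true;
}

/* I/O completion: drop the pin taken by write_may_skip_copy(). */
void write_done(struct ro_mapping *m)
{
    m->inflight--;
}

/* munmap/mremap path: returns true if the caller would have to sleep
 * until the outstanding writes have finished. */
bool unmap_must_sleep(const struct ro_mapping *m)
{
    return m->inflight > 0;
}
```

The writer never sleeps in this model; only an unmap or remap issued while a write is still in flight would block.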

> If you look at the mentioned splice() interfaces, you can see that it gets
> things down to the DMA in from disk and the DMA out to network (or the other
> way).

The splice locking/handling of data areas looks good, but do we
need the syscall?

> The mapping cost is not free, alpha8 of lmbench2 will try and
> quantify this in the next few days.

Is this different from the mmap latency it already measures?
(By the way, alpha7 gives HUGE mmap latency figures for
my dual-PPro - bug?)

--
Erik Corry
