Re: PATCH: Raw device IO for 2.1.131

MOLNAR Ingo (mingo@chiara.csoma.elte.hu)
Sun, 13 Dec 1998 06:03:52 +0100 (CET)


On Sat, 12 Dec 1998, Alan Cox wrote:

> > Finally, I'd put all this lock down pages stuff into a mm/*.c file.
> > I want to use it for networking if it goes in (zero copy TCP, NFS
> > directly into page cache, etc.).
>
> Well for sendfile() it seems ideal for zero copying. I'd second it being in
> mm/*.c - I want it for video capture

proper zero-copy sendfile() is, I think, quite hard (when the target is a
page-cachable inode).

we have to fix page cache writes anyway to go straight to disk, and this
will already handle the pindown (we never drop a dirty page). raw-IO will
thus be a sendfile() from smallfile-to-bigfile, with the additional logic
of aliasing target logical pages to the source pages. This aliasing is
nontrivial though. (we can represent anonymous user-pages through a special
inode in the page-cache as well, so 'smallfile' or 'anonymous buffer' are
equivalent.)

Thus we have to deal with data-page to data-page IO only. A data-page can
be present, nonpresent, or nonpresent-aliased. Zero-copy is when we do an
IO from a present page to a nonpresent-aliased page. One difficulty here
is handling state transitions while IO is pending. Also, nonpresent-aliased
page-descriptors need a separate cache, or we can restrict a given page to
be aliased to only one other page; this way we could embed the alias into
struct page. (By having this solved in the page-cache we could maybe also
have SCSI-to-SCSI copies too, it's a nonpresent => nonpresent page IO. The
problem here is that none of the pages is in the page-map.)
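
As a rough illustration of the three page states and the one-alias-per-page
restriction described above, here is a minimal userspace sketch in C. All
names (dpage, dpage_alias, dpage_is_zero_copy) are hypothetical and invented
for this example; this is not the real struct page, and it ignores locking
and the page cache entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is NOT the kernel's struct page. */
enum dpage_state {
    DPAGE_PRESENT,            /* data is in memory */
    DPAGE_NONPRESENT,         /* no data in memory, no alias */
    DPAGE_NONPRESENT_ALIASED  /* no data of its own, refers to a source page */
};

struct dpage {
    enum dpage_state state;
    struct dpage *alias;      /* at most one alias, embedded in the descriptor */
    int io_pending;           /* nonzero while IO on this page is in flight */
};

/*
 * Alias a nonpresent target page to a present source page.
 * Returns 0 on success, -1 if the target already has an alias or
 * either page is in the wrong state.
 */
static int dpage_alias(struct dpage *target, struct dpage *source)
{
    if (target->state != DPAGE_NONPRESENT || target->alias != NULL)
        return -1;
    if (source->state != DPAGE_PRESENT)
        return -1;
    target->state = DPAGE_NONPRESENT_ALIASED;
    target->alias = source;
    return 0;
}

/*
 * Zero-copy in the sense used above: IO from a present page to a
 * nonpresent-aliased page whose alias is that same present page.
 */
static int dpage_is_zero_copy(const struct dpage *source,
                              const struct dpage *target)
{
    return source->state == DPAGE_PRESENT &&
           target->state == DPAGE_NONPRESENT_ALIASED &&
           target->alias == source;
}
```

Embedding a single pointer keeps the descriptor small, at the cost of
refusing a second alias for the same target; a separate cache of alias
descriptors would lift that limit.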

another problem is what happens if the target page gets mmap()-ed while the
IO is pending (i.e. a nonpresent-aliased -> present state transition). We
can solve the problem by delaying such state transitions, or by doing an
immediate copy from the source-page (which is aliased to the target). But
it's mostly straightforward .. provided we solve the aliasing problem :)
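
The two options for the mmap()-during-IO case (delay the transition, or copy
immediately from the source page) could look roughly like the userspace
sketch below. Everything here is an assumption made for illustration: the
names (tpage, tpage_make_present, TP_*), the fixed 4096-byte page size and
the copy_now flag are hypothetical, and real code would need locking and a
way to wake up delayed mappers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumed page size, for illustration only */

enum tpage_state { TPAGE_PRESENT, TPAGE_NONPRESENT, TPAGE_NONPRESENT_ALIASED };

/* Hypothetical descriptor; not the kernel's struct page. */
struct tpage {
    enum tpage_state state;
    struct tpage *alias;       /* source page this target borrows data from */
    int io_pending;            /* nonzero while IO to this target is in flight */
    unsigned char data[SKETCH_PAGE_SIZE];
};

enum tp_result {
    TP_OK,          /* page is present now */
    TP_DELAYED,     /* transition postponed, retry once IO has completed */
    TP_NEEDS_READ   /* plain nonpresent page: would have to be read in */
};

/*
 * Called when something (e.g. an mmap() fault) needs the page present.
 * copy_now == 0: delay the transition while IO is pending.
 * copy_now != 0: break the alias at once by copying from the source page.
 */
static enum tp_result tpage_make_present(struct tpage *p, int copy_now)
{
    if (p->state == TPAGE_PRESENT)
        return TP_OK;
    if (p->state == TPAGE_NONPRESENT)
        return TP_NEEDS_READ;

    /* p is nonpresent-aliased from here on */
    if (p->io_pending && !copy_now)
        return TP_DELAYED;

    /* Either no IO is pending or an immediate copy was requested. */
    memcpy(p->data, p->alias->data, SKETCH_PAGE_SIZE);
    p->alias = NULL;
    p->state = TPAGE_PRESENT;
    return TP_OK;
}
```

Delaying keeps the path copy-free but stalls the mapper; the immediate copy
gives up zero-copy for that one page only.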

zero-copy is basically equivalent to aliasing pages to pages and doing IO on
the descriptor. The raw-IO patch does this conceptually too, but only in a
limited way.

-- mingo
