Re: Unexpected splice "always copy" behavior observed

From: Mathieu Desnoyers
Date: Wed May 19 2010 - 15:14:46 EST


* Nick Piggin (npiggin@xxxxxxx) wrote:
> On Wed, May 19, 2010 at 11:57:32AM -0400, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
> > > On Wed, 2010-05-19 at 17:33 +0200, Miklos Szeredi wrote:
> > > > On Wed, 19 May 2010, Linus Torvalds wrote:
> > > > > Btw, since you apparently have a real case - is the "splice to file"
> > > > > always just an append? IOW, if I'm not right in assuming that the only
> > > > > sane thing people would reasonable care about is "append to a file", then
> > > > > holler now.
> > > >
> > > > Virtual machines might reasonably need this for splicing to a disk
> > > > image.
> > >
> > > This comes down to balancing speed and complexity. Perhaps a copy is
> > > fine in this case.
> > >
> > > I'm concerned about high speed tracing, where we are always just taking
> > > pages from the trace ring buffer and appending them to a file or sending
> > > them off to the network. The slower this is, the more likely you will
> > > lose events.
> > >
> > > If the "move only on append to file" is easy to implement, I would
> > > really like to see that happen. The speed of splicing a disk image for a
> > > virtual machine only impacts the patience of the user. The speed of
> > > splicing tracing output, impacts how much you can trace without losing
> > > events.
> >
> > I'm with Steven here. I only care about appending full pages at the end of a
> > file. If possible, I'd also like to steal back the pages after waiting for the
> > writeback I/O to complete so we can put them back in the ring buffer without
> > stressing the page cache and the page allocator needlessly.
>
> Got to think about complexity and how much is really worth trying to
> speed up strange cases. The page allocator is the generic "pipe" in
> the kernel to move pages between subsystems when they become unused :)
>
> The page cache can be directed to be written out and discarded with
> fadvise and such.

Good point. This discard flag might do the trick and let us keep things simple.
The major concern here is to keep the page cache disturbance relatively low.
Which of new page allocation or stealing back the page has the lowest overhead
would have to be determined with benchmarks.

So I would tend to simply use this discard fadvise with new page allocation for
now.

>
> You might also consider using direct IO.

Maybe. I'm unsure about what it implies in the splice() context though.

Thanks,

Mathieu

--
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/