Re: Performance regression in write() syscall

From: Nick Piggin
Date: Tue Feb 24 2009 - 22:24:50 EST


On Wednesday 25 February 2009 02:52:34 Linus Torvalds wrote:
> On Tue, 24 Feb 2009, Nick Piggin wrote:
> > > it does make some kind of sense to try to avoid the noncached versions
> > > for small writes - because small writes tend to be for temp-files.
> >
> > I don't see the significance of a temp file. If the pagecache is
> > truncated, then the cachelines remain dirty and so you can't avoid an
> > eventual store back to RAM?
>
> No, because many small files end up being used as scratch-pads (think
> shell script sequences etc), and get read back immediately again. Doing
> non-temporal stores might just be bad simply because trying to play games
> with caching may simply do the wrong thing.

OK, from that angle it could make sense. Although, as has been noted earlier,
at this point in the copy we don't have much idea of the length of the
write passed into the vfs (and obviously will never know the higher level
intention of userspace).

I don't know if we can say that a 1 page write is nontemporal but anything
smaller is temporal. And I would worry that having these kinds of
behavioural cutoffs will create strange performance boundary conditions
in code.
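
To make the boundary concern concrete, here is a minimal userspace sketch
(not kernel code) of a one-page cutoff. Every name in it is hypothetical:
SKETCH_PAGE_SIZE, choose_copy_path(), copy_nocache_stub() and
copy_with_hint() are illustrations only, and the "nocache" path is a plain
memcpy() standing in for whatever non-temporal copy an architecture might
provide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cutoff; the thread does not settle on PAGE_SIZE or any
 * other value as the right threshold. */
#define SKETCH_PAGE_SIZE 4096u

enum copy_path { COPY_CACHED, COPY_NOCACHE };

/* Decide which path a copy of len bytes takes under a one-page cutoff. */
static enum copy_path choose_copy_path(size_t len)
{
    return len >= SKETCH_PAGE_SIZE ? COPY_NOCACHE : COPY_CACHED;
}

/* Stand-in for an arch-specific non-temporal copy. This is an assumption
 * for illustration: it is just memcpy() and does not bypass the cache. */
static void copy_nocache_stub(void *dst, const void *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Copy len bytes and report which path the cutoff selected. */
static enum copy_path copy_with_hint(void *dst, const void *src, size_t len)
{
    enum copy_path path = choose_copy_path(len);

    if (path == COPY_NOCACHE)
        copy_nocache_stub(dst, src, len);
    else
        memcpy(dst, src, len);
    return path;
}
```

With this cutoff a 4095 byte write and a 4096 byte write take different
paths, so two nearly identical workloads could see different cache
behaviour. That is the kind of boundary condition meant above.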


> > > I don't know if PAGE_SIZE is the right thing to test, and I also don't
> > > know if this is necessarily the best place to test it in, but I don't
> > > think it's necessarily wrong to do something like this.
> >
> > No, but I think it should be in arch code, and the "_nocache" suffix
> > should just be a hint to the architecture that the destination is not
> > so likely to be used.
>
> Yes. Especially since arch code is likely to need various arch-specific
> checks anyway (like the x86 code does about aligning the destination).
>
> > It would have been nice to have had some numbers to justify
> > 0812a579c92fefa57506821fa08e90f47cb6dbdd in the first place, so you have
> > a point of reference to see what happens to your speed-up-case when you
> > change things like this. Sigh.
>
> Well, there were no performance numbers for that commit, since it didn't
> actually tie it into anything, but I'm pretty sure we saw several
> performance numbers for the change.
>
> Yes, and they are in the commit logs. See "x86: cache pollution aware
> __copy_from_user_ll()", commit c22ce143d15eb288543fe9873e1c5ac1c01b69a1.
>
> But notice how that is iozone numbers. Very much about _big_ writes.

Yeah I see, thanks.
