Re: writev data loss bug in (at least) 2.6.31 and 2.6.32pre8 x86-64

From: Nick Piggin
Date: Thu Dec 03 2009 - 00:28:38 EST


On Wed, Dec 02, 2009 at 08:04:26PM +0100, Jan Kara wrote:
> > When using writev, the page we copy from is not paged in (while when we
> > use ordinary write, it is paged in). This difference might be worth
> > investigation on its own (as it is likely to heavily impact performance of
> > writev) but is irrelevant for us now - we should handle this without data
> > corruption anyway.
> I've looked into why writev reliably fails the writes. The reason is that
> iov_iter_fault_in_readable() faults in only the first IO buffer. Because
> this is just 600 bytes big, following iov_iter_copy_from_user_atomic copies
> only 600 bytes and block_write_end sets number of copied bytes to 0. Thus
> we restart the write and do it one iov per iteration which succeeds. So
> everything works as designed only it gets inefficient in this particular
> case.

Yep, this would be right. We could actually do more prefaulting; I
think I was being a little overly conservative, worried about earlier
pages being unmapped before we were able to consume them. But worrying
too much about that is optimizing an unusual case, one that probably
performs badly anyway, at the expense of more common patterns.
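
To make the cost concrete, here is a rough userspace toy model of the
retry behaviour described above. It is only a sketch, not the kernel
code: every toy_* name, the 4096-byte page size, and the "resident" flag
are made up for illustration, and it assumes the destination page is not
uptodate, so a short atomic copy is thrown away entirely (the
block_write_end behaviour Jan describes).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_BYTES 4096u /* assumed page size for this toy model */

/* Hypothetical stand-in for one iovec entry; `resident` models whether
 * the user page behind it is currently paged in. */
struct toy_seg { size_t len; bool resident; };

/* Hypothetical cursor over the segments, loosely like an iov_iter. */
struct toy_iter { struct toy_seg *segs; size_t nsegs, idx, off; };

/* Bytes left to write from the cursor position. */
static size_t toy_count(const struct toy_iter *it)
{
    size_t total = 0;
    for (size_t i = it->idx; i < it->nsegs; i++)
        total += it->segs[i].len;
    return total - it->off;
}

/* Prefault `count` segments starting at the current one. */
static void toy_fault_in(struct toy_iter *it, size_t count)
{
    for (size_t i = it->idx; i < it->nsegs && i < it->idx + count; i++)
        it->segs[i].resident = true;
}

/* Atomic copy: cannot fault, so it stops at the first non-resident
 * segment. Returns the bytes it managed to copy (at most `want`). */
static size_t toy_copy_atomic(const struct toy_iter *it, size_t want)
{
    size_t copied = 0, off = it->off;
    for (size_t i = it->idx; i < it->nsegs && copied < want; i++) {
        if (!it->segs[i].resident)
            break;
        size_t n = it->segs[i].len - off;
        if (n > want - copied)
            n = want - copied;
        copied += n;
        off = 0;
    }
    return copied;
}

static void toy_advance(struct toy_iter *it, size_t n)
{
    it->off += n;
    while (it->idx < it->nsegs && it->off >= it->segs[it->idx].len) {
        it->off -= it->segs[it->idx].len;
        it->idx++;
    }
}

/* Returns how many copy attempts are needed to write all segments when
 * each full-size attempt prefaults `prefault_segs` segments. */
static unsigned toy_write(struct toy_seg *segs, size_t nsegs,
                          size_t prefault_segs)
{
    struct toy_iter it = { segs, nsegs, 0, 0 };
    unsigned attempts = 0;
    bool single_seg = false;

    for (size_t i = 0; i < nsegs; i++)
        assert(segs[i].len > 0); /* toy model: no empty segments */

    while (it.idx < it.nsegs) {
        size_t want = toy_count(&it);
        if (want > TOY_PAGE_BYTES)
            want = TOY_PAGE_BYTES;
        if (single_seg && want > segs[it.idx].len - it.off)
            want = segs[it.idx].len - it.off;

        toy_fault_in(&it, single_seg ? 1 : prefault_segs);
        attempts++;
        if (toy_copy_atomic(&it, want) < want) {
            single_seg = true; /* short copy discarded; retry one seg */
            continue;
        }
        single_seg = false;
        toy_advance(&it, want);
    }
    return attempts;
}
```

With a 600-byte segment followed by a 3000-byte one, neither paged in,
prefaulting one segment takes three attempts in this model (one wasted
full-size attempt, then one per segment), while prefaulting both takes
a single attempt.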

Anyway, what I was doing to test this code when I wrote it was to
inject random failures into the user copy functions. I guess this could
be useful to merge into the error injection framework?
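
For illustration only, the idea looks roughly like the userspace sketch
below. It is not the test code I actually used and not a kernel API:
the toy_* names, the injector struct, and the LCG used as a
reproducible random source are all assumptions made up for this
example. The copy helper follows the copy_from_user convention of
returning the number of bytes NOT copied, and the caller has to cope
with arbitrary short copies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical injector state: a tiny LCG keeps runs reproducible. */
struct toy_inject {
    uint32_t state;        /* PRNG state (seed) */
    unsigned fail_percent; /* chance, in percent, that a copy is cut short */
    unsigned injected;     /* how many short copies were injected */
};

static uint32_t toy_rand(struct toy_inject *inj)
{
    inj->state = inj->state * 1664525u + 1013904223u;
    return inj->state >> 16;
}

/* Copies up to n bytes and returns the number of bytes NOT copied.
 * With probability fail_percent it copies only a random prefix. */
static size_t toy_copy_from_user(struct toy_inject *inj, void *dst,
                                 const void *src, size_t n)
{
    size_t done = n;

    if (n > 0 && toy_rand(inj) % 100 < inj->fail_percent) {
        done = toy_rand(inj) % n; /* prefix length in [0, n) */
        inj->injected++;
    }
    memcpy(dst, src, done);
    return n - done;
}

/* A caller that tolerates short copies by retrying from where the
 * previous attempt stopped. */
static void toy_copy_all(struct toy_inject *inj, char *dst,
                         const char *src, size_t n)
{
    size_t off = 0;

    assert(inj->fail_percent < 100); /* otherwise we might never finish */
    while (off < n) {
        size_t want = n - off;
        size_t left = toy_copy_from_user(inj, dst + off, src + off, want);
        off += want - left;
    }
}
```

Running the write path on top of a copy routine like this, then
comparing the result against the source buffer, exercises the
short-copy and retry paths far more often than real page faults would.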

Thanks,
Nick
