Re: sys_write() racy for multi-threaded append?

From: Alan Cox
Date: Fri Mar 09 2007 - 08:08:23 EST


> 1003.1 unless O_NONBLOCK is set. (Not that f_pos is interesting on a
> pipe except as a "bytes sent" indicator -- and in the multi-threaded

f_pos is undefined on a FIFO or similar object.
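
A minimal sketch of how user space can check this, assuming a POSIX
system where lseek() on a pipe, FIFO or socket fails with ESPIPE. The
helper name is_seekable() is hypothetical, and a successful lseek() does
not by itself prove the offset means anything on every device:

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * is_seekable() is a hypothetical helper, not an existing API.
 * Returns 1 if the fd accepts lseek(), 0 if it is a pipe, FIFO or
 * socket (ESPIPE), and -1 on any other error (errno left as set).
 */
static int is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}
```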

> As to what a "sane app" has to do: it's just not that unusual to write
> application code that treats a short read/write as a catastrophic
> error, especially when the fd is of a type that is known never to
> produce a short read/write unless something is drastically wrong. For

If you are working in a strictly POSIX environment, then a signal can
interrupt almost any I/O and turn it into a short write, even disk I/O.
In the sane world, the file I/O cases don't do this.
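
For code that has to be strictly portable, the usual answer is a retry
loop. The following is only a sketch: write_all() is a hypothetical
helper name, and treating a zero return as an error is an assumption
made here to keep the loop from spinning:

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * write_all() is a hypothetical helper, not an existing API.
 * Keeps calling write() until all len bytes are accepted.
 * Returns len on success, -1 on error (errno set by write()).
 */
static ssize_t write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    size_t left = len;

    while (left > 0) {
        ssize_t n = write(fd, p, left);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted before any data moved */
        if (n <= 0)
            return -1;          /* real error, or no progress at all */
        p += n;                 /* short write: skip what was accepted */
        left -= (size_t)n;
    }
    return (ssize_t)len;
}
```

Note that the loop makes the call complete, not atomic: another thread
writing to the same fd can still land between two iterations.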

> as long as the fd doesn't get screwed up. There is no reason for the
> generic sys_read code to leave a race open in which the same frame is
> read by both threads and a hardware buffer overrun results later.

Audio devices are not seekable anyway.

> concurrent reads and writes to arbitrary fd types. I'm proposing that
> it not do something blatantly stupid and easily avoided in generic
> code that makes it impossible for any fd type to guarantee that, after
> 10 successful pipelined 100-byte reads or writes, f_pos will have
> advanced by 1000.

You might want to read up on the Unix design philosophy. Things like
record-based I/O are done in user space to avoid kernel complexity, and
also so that the overhead of these things is paid only by those who need
them (it's kind of RISC for OS design).
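
As a rough illustration of what "user space" means here, the sketch
below serialises whole records with a mutex and keeps its own byte
count instead of relying on f_pos. The struct record_writer type and
record_write() function are hypothetical names, and a pthreads
environment is assumed:

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical user-space record writer; not an existing API. */
struct record_writer {
    int fd;
    pthread_mutex_t lock;       /* serialises whole records */
    unsigned long long bytes;   /* our own "bytes sent" counter */
};

/* Write one record without interleaving with other threads.
 * Returns 0 on success, -1 on error (errno set by write()). */
static int record_write(struct record_writer *w, const void *rec, size_t len)
{
    const char *p = rec;
    size_t left = len;
    int rc = 0;

    pthread_mutex_lock(&w->lock);
    while (left > 0) {
        ssize_t n = write(w->fd, p, left);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted: retry */
        if (n <= 0) {
            rc = -1;            /* give up on this record */
            break;
        }
        p += n;
        left -= (size_t)n;
    }
    w->bytes += len - left;     /* count only what was written */
    pthread_mutex_unlock(&w->lock);
    return rc;
}
```

With this, ten successful 100-byte record_write() calls advance the
counter by 1000 whatever the fd type, and only the applications that
want that guarantee pay for the lock.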

Alan