Re: Write is not atomic?

From: Dave Chinner
Date: Mon Oct 15 2012 - 19:12:59 EST


On Mon, Oct 15, 2012 at 11:36:15PM +0200, Juliusz Chroboczek wrote:
> Hi,
>
> The Linux manual page for write(2) says:
>
>     The adjustment of the file offset and the write operation are
>     performed as an atomic step.

That's wrong. The file offset update is not synchronised with the
write at all, so when an fd is shared (e.g. across fork()) concurrent
writers will race on the offset update.


> This is apparently an extension to POSIX, which says
>
>     This volume of IEEE Std 1003.1-2001 does not specify behavior of
>     concurrent writes to a file from multiple processes. Applications
>     should use some form of concurrency control.

This is how Linux behaves.
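
As a rough illustration of one such form of concurrency control (a
sketch only, not something taken from the report above): if each
writer uses pwrite(2) with its own explicit offset, nobody depends on
the shared file offset any more. The helper name
write_disjoint_regions and the path passed to it are made up for this
example.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: parent and child each write msg into their own
 * region of the file with pwrite(), which takes an explicit offset and
 * does not use or update the shared file offset.
 * Returns the final file size in bytes, or -1 on error.
 */
long write_disjoint_regions(const char *path, const char *msg)
{
    const size_t len = strlen(msg);
    int status = 0;
    struct stat st;
    long size = -1;

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    /* Parent owns bytes [0, len), child owns bytes [len, 2 * len). */
    off_t off = (pid == 0) ? (off_t)len : 0;
    ssize_t n = pwrite(fd, msg, len, off);

    if (pid == 0)
        _exit(n == (ssize_t)len ? 0 : 1);

    /* Parent: reap the child, then look at the resulting file size. */
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        n == (ssize_t)len && fstat(fd, &st) == 0)
        size = (long)st.st_size;

    close(fd);
    return size;
}
```

Called as write_disjoint_regions("/mnt/scratch/foo", "Ouille"), this
should leave a 12 byte file every time, because the two writes never
target the same range.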

> The following fragment of code
>
>     int fd;
>     fd = open("exemple", O_CREAT | O_WRONLY | O_TRUNC, 0666);
>     fork();
>     write(fd, "Ouille", 6);
>     close(fd);
>
> produces "OuilleOuille", as expected, on ext4 on two machines running
> Linux 3.2 AMD64. However, over XFS on an old Pentium III at 500 MHz
> running 2.6.32, it produces just "Ouille" roughly once in three times.

ext4, on 3.6:

$ for i in `seq 0 10000`; do ./a.out ; cat /mnt/scratch/foo ; echo ; done | sort |uniq -c
39 Ouille
9962 OuilleOuille
$

XFS, on the same kernel, hardware and block device:

$ for i in `seq 0 10000`; do ./a.out ; cat /mnt/scratch/foo ; echo ; done | sort |uniq -c
40 Ouille
9961 OuilleOuille
$

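So both filesystems behave according to the POSIX definition of
concurrent writes: the outcome is unspecified, and on both of them
one of the two writes is occasionally overwritten by the other.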
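
For anyone who wants to reproduce this, here is a minimal sketch of
such a test. It is my assumption of what a test along these lines
looks like, not the exact a.out used for the numbers above; the
helper name shared_fd_write and its extra_flags parameter are
invented for the example. Passing O_APPEND in extra_flags shows one
way to get both writes reliably: with O_APPEND the kernel moves to
end of file and writes as a single step for each write() call.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: open path, fork, and have parent and child
 * each write "Ouille" through the shared fd. extra_flags is OR-ed
 * into the open() flags (0 for the racy case, O_APPEND to append).
 * Returns the final file size in bytes, or -1 on error.
 */
long shared_fd_write(const char *path, int extra_flags)
{
    static const char msg[] = "Ouille";
    const size_t len = sizeof(msg) - 1;
    int status = 0;
    struct stat st;
    long size = -1;

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC | extra_flags, 0666);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }

    /* Both processes get here and share one file offset. */
    ssize_t n = write(fd, msg, len);

    if (pid == 0)
        _exit(n == (ssize_t)len ? 0 : 1);

    /* Parent: reap the child, then look at the resulting file size. */
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        n == (ssize_t)len && fstat(fd, &st) == 0)
        size = (long)st.st_size;

    close(fd);
    return size;
}
```

shared_fd_write(path, 0) returns 12 in the common case and 6 when the
two writes land on the same offset. shared_fd_write(path, O_APPEND)
should return 12 every time on a local filesystem; open(2) notes that
O_APPEND is not reliable over NFS.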

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx