Re: [RESEND] [PATCH] VFS: make file->f_pos access atomic on 32bit arch

From: Miklos Szeredi
Date: Thu Oct 09 2008 - 09:40:07 EST


On Thu, 9 Oct 2008, Matthew Wilcox wrote:
> On Thu, Oct 09, 2008 at 02:23:19PM +0200, Pavel Machek wrote:
> > On Tue 2008-10-07 20:52:09, Matthew Wilcox wrote:
> > > And it's worth saying that letter-of-the-standard arguments aren't
> > > necessarily enough. Linux does not honour the POSIX guarantee that
> > > writes are atomic (if they cross page boundaries, it's not certain).
> > > This seems like even more of a corner case to me.
> >
> > We have append-only files, and normal users should not be able to work
> > around that restriction.
>
> Is it possible to work around this restriction by exploiting this?
>
> IS_APPEND() forces the user to have O_APPEND in their flags.
> O_APPEND is only checked in generic_write_checks() where it sets '*pos'
> to i_size.
>
> For the majority of filesystems, generic_write_checks() is called from
> __generic_file_aio_write_nolock. __generic_file_aio_write_nolock is
> only called from generic_file_aio_write_nolock (which passes the address
> of a kiocb->ki_pos) and generic_file_aio_write (same).
>
> The filesystems that call generic_write_checks() directly are:
> XFS (xfs_write): Passes the address of a local variable
> OCFS2 (ocfs2_file_aio_write): Passes the address of a ki_pos
> CIFS (cifs_user_write): Not sure.
> NFS (nfs_file_direct_write): "Note that O_APPEND is not supported".
> NTFS (ntfs_file_aio_write_nolock): Address of a local variable
> FUSE (fuse_file_aio_write): Address of a local variable
> FUSE (fuse_direct_write): Not sure.
>
> So the only two that might be affected are CIFS and FUSE (O_DIRECT?!) as
> far as I can tell. I'm having a hard time believing this is a security
> problem.

And even in those cases it's actually a local variable, since
sys_write() does:

	loff_t pos = file_pos_read(file);
	ret = vfs_write(file, buf, count, &pos);
	file_pos_write(file, pos);

So there's no way to corrupt the starting position of an append-mode
write, as that always comes from the file size (i_size), not from f_pos.

Two concurrent append writes to the same file could still corrupt f_pos,
but that would only matter for subsequent reads or non-append mode writes.
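
To make the argument concrete, here is a rough userspace model of the
pattern above. It is only a sketch: every model_* name is hypothetical
and none of this is the actual kernel code. It assumes that the
O_APPEND handling in generic_write_checks() amounts to overwriting
*pos with i_size, as described in the quoted analysis, and it ignores
locking, error paths and the data itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel types. */
typedef long long model_loff_t;

struct model_inode {
	model_loff_t i_size;		/* current file size */
};

struct model_file {
	struct model_inode *inode;
	model_loff_t f_pos;		/* shared position, possibly corrupted */
	int append;			/* stands in for O_APPEND */
};

/* Models only the O_APPEND part of generic_write_checks(). */
static void model_write_checks(struct model_file *file, model_loff_t *pos)
{
	if (file->append)
		*pos = file->inode->i_size;
}

/*
 * Models vfs_write(): "writes" count bytes at *pos, advances *pos and
 * grows the file if needed. Returns the offset the write started at.
 */
static model_loff_t model_vfs_write(struct model_file *file, size_t count,
				    model_loff_t *pos)
{
	model_loff_t start;

	assert(pos != NULL);
	model_write_checks(file, pos);
	start = *pos;
	*pos += (model_loff_t)count;
	if (*pos > file->inode->i_size)
		file->inode->i_size = *pos;
	return start;
}

/* Mirrors the sys_write() pattern: pos is a local copy of f_pos. */
static model_loff_t model_sys_write(struct model_file *file, size_t count)
{
	model_loff_t pos = file->f_pos;		/* file_pos_read() */
	model_loff_t start = model_vfs_write(file, count, &pos);

	file->f_pos = pos;			/* file_pos_write() */
	return start;
}
```

In this model, setting f_pos to an arbitrary value before calling
model_sys_write() on a file with append set does not change where the
write starts; it always begins at i_size. A bad f_pos only shows up
once append is cleared, which corresponds to the non-append case.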

Miklos