Re: [RFC] Heads up on sys_fallocate()

From: Andrew Morton
Date: Thu Mar 01 2007 - 18:01:00 EST


On Thu, 01 Mar 2007 22:44:16 +0000
Dave Kleikamp <shaggy@xxxxxxxxxxxxxxxxxx> wrote:

> On Thu, 2007-03-01 at 14:25 -0800, Andrew Morton wrote:
> > On Fri, 2 Mar 2007 00:04:45 +0530
> > "Amit K. Arora" <aarora@xxxxxxxxxxxxxxxxxx> wrote:
>
> > > +asmlinkage long sys_fallocate(int fd, loff_t offset, loff_t len)
> > > +{
> > > + struct file *file;
> > > + struct inode *inode;
> > > + long ret = -EINVAL;
> > > + file = fget(fd);
> > > + if (!file)
> > > + goto out;
> > > + inode = file->f_path.dentry->d_inode;
> > > + if (inode->i_op && inode->i_op->fallocate)
> > > + ret = inode->i_op->fallocate(inode, offset, len);
> > > + else
> > > + ret = -ENOTTY;
> > > + fput(file);
> > > +out:
> > > + return ret;
> > > +}
> >
>
> > ENOTTY is a bit unconventional - we often use EINVAL for this sort of
> > thing. But EINVAL has other meanings for posix_fallocate() and isn't
> > really appropriate here anyway. So I'm not sure what would be better...
>
> Would EINVAL (or whatever) make it back to the caller of
> posix_fallocate(), or would glibc fall back to its current
> implementation?
>
> Forgive me if I haven't put enough thought into it, but would it be
> useful to create a generic_fallocate() that writes zeroed pages for any
> non-existent pages in the range? I don't know how glibc currently
> implements posix_fallocate(), but maybe the kernel could do it more
> efficiently, even in generic code. Maybe we don't care, since the major
> file systems can probably do something better in their own code.

Given that glibc already implements fallocate for all filesystems, it will
need to continue to do so for filesystems which don't implement this
syscall - otherwise applications would start breaking.

However with this kernel change, glibc will need to look at the errno,
so that it can correctly propagate EIO, ENOSPC and whatever. So we will
need to return a reliable and stable and sensible value so that glibc knows
when it should emulate and when it should propagate.

Perhaps Ulrich can comment.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/