Re: [PATCH 11/16] f2fs: add inode operations for special inodes

From: Dave Chinner
Date: Mon Oct 15 2012 - 18:34:05 EST


On Sun, Oct 14, 2012 at 03:19:37PM +0000, Arnd Bergmann wrote:
> On Sunday 14 October 2012, Vyacheslav Dubeyko wrote:
> > On Oct 14, 2012, at 11:09 AM, Jaegeuk Kim wrote:
> > > 2012-10-14 (Sun), 02:21 +0400, Vyacheslav Dubeyko:
> > Extended attributes are a more flexible way, from my point of view. An xattr makes it
> > possible to give a hint to the filesystem at any time, independent of what the
> > application itself supports. A documented way of using such extended attributes
> > gives the user a flexible way to adjust filesystem behavior (but I remember that
> > you don't believe in the user :-)).
> >
> > So, I think that fadvise() and extended attributes can be complementary solutions.
>
> Right. Another option is to have ext4 style attributes, see
> http://linux.die.net/man/1/chattr

Xattrs are much preferred to more "ext4 style" flags because xattrs
are filesystem independent. Indeed, some filesystems can't store any
new "ext4 style" flags without a change of disk format or
internally mapping them to an xattr. So really, xattrs are the best
way forward for such hints.
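
As a rough illustration of the filesystem independence point: userspace
can attach a hint through the generic xattr syscalls without knowing
which filesystem is underneath. The sketch below uses Python's
os.setxattr()/os.getxattr() (Linux only). The attribute name
"user.access_hint" and both helper functions are made up for this
example and are not an existing interface.

```python
import os

# Hypothetical attribute name, chosen only for this sketch.
HINT_XATTR = "user.access_hint"


def set_hint(path, hint):
    """Try to store a hint on path; return False if that is not possible."""
    if not hasattr(os, "setxattr"):  # platform without Linux xattr calls
        return False
    try:
        os.setxattr(path, HINT_XATTR, hint.encode("ascii"))
        return True
    except OSError:
        # e.g. the filesystem has no user xattr support
        return False


def get_hint(path):
    """Return the stored hint as a string, or None if there is none."""
    if not hasattr(os, "getxattr"):
        return None
    try:
        return os.getxattr(path, HINT_XATTR).decode("ascii")
    except OSError:
        # attribute not set, or xattrs unsupported on this filesystem
        return None
```

The same two calls work on any filesystem that stores user xattrs, which
is the property a per-filesystem flag ioctl does not have.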

> Unlike extended attributes, there is a limited number of those,
> and they can only be boolean flags, but that might be enough for
> this particular use case.

A boolean is not sufficient for access policy hints. An extensible
xattr format is probably the best approach to take here, so that we
can easily introduce new access policy hints as functionality is
required. Indeed, an extensible xattr could start with just a
hot/cold boolean, and grow from there....
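
To make "extensible" a bit more concrete, here is a minimal sketch of
one possible value encoding: "key=value" pairs separated by ";", where
a reader skips keys it does not understand. The encoding, the key name
"temperature" and the function names are all assumptions made for
illustration, not a proposed or existing on-disk format.

```python
# Hypothetical text encoding for an access-policy xattr value:
# "key=value" pairs joined by ";". Not an existing kernel format.
KNOWN_KEYS = {"temperature"}  # start with just the hot/cold hint


def encode_hints(hints):
    """Serialize a dict of hints into bytes suitable for an xattr value."""
    return ";".join(f"{k}={v}" for k, v in sorted(hints.items())).encode("ascii")


def decode_hints(raw):
    """Parse an xattr value, skipping keys this version does not know."""
    hints = {}
    for field in raw.decode("ascii").split(";"):
        key, sep, value = field.partition("=")
        if sep and key in KNOWN_KEYS:
            hints[key] = value
    return hints
```

Because unknown keys are ignored rather than rejected, a newer tool can
write additional hints without breaking an older reader.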

> The main reason I can see against extended attributes is that they are not stored
> very efficiently in f2fs, unless a lot of work is put into coming up with a good
> implementation. A single flags bit can trivially be added to the inode in
> comparison (if it's not there already).

That's a deficiency that should be corrected, then, because xattrs
are very common these days.

And given that stuff like access frequency tracking is being
implemented at the VFS level, access policy hints should also be VFS
functionality. A bad filesystem implementation should not dictate
the interface for generically useful functionality....

> > Anyway, hardcoding a list of file extensions or saving it in the filesystem is a nasty way. In
> > such a "black box" as a TV or smartphone, reconfiguring the filesystem with tunefs or debugfs
> > to add file extensions can be unsafe or hard for users to understand, from my point of
> > view.
>
> It is only a performance hint though, so it is not a correctness issue if the
> file system gets it wrong. In order to do efficient garbage collection, a log
> structured file system should take all the information it can get about the
> expected life of data it writes. I agree that the list, even in the form of
> mkfs time settings, is not a clean abstraction, but in the place of an Android
> phone manufacturer I would still enable it if it promises a significant
> performance advantage over not using it. I guess it would be nice if this
> could be overridden in some form, e.g. using an ioctl on the file as ext4 does.

An xattr on the root inode that holds a list like this is something
that could be set at mkfs time, but then also updated easily by new
software packages that are installed...
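
A minimal sketch of how such a list could be handled, assuming the
xattr value is a comma-separated list of extensions and that a match
marks the file's data as cold. Both of those choices, and all function
names, are assumptions for the example only.

```python
import os.path


def parse_extension_list(raw):
    """Decode a comma-separated extension list, e.g. b"mp4,jpg"."""
    parts = raw.decode("ascii").split(",")
    return {p.strip().lower() for p in parts if p.strip()}


def add_extension(raw, ext):
    """Return a new xattr value with ext added, as a package install might do."""
    exts = parse_extension_list(raw)
    exts.add(ext.lower().lstrip("."))
    return ",".join(sorted(exts)).encode("ascii")


def is_cold(filename, cold_exts):
    """Classify a file by its extension against the parsed list."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return ext in cold_exts
```

The bytes returned by add_extension() are what would be written back to
the root inode's xattr, so updating the list needs no tunefs or debugfs
run.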

> We should also take the kinds of access we have seen on a file into account.

Yes, but it should be done at the VFS level, not in the filesystem
itself. Integrated into the current hot inode/range tracking that is
being worked on right now, I'd suggest.

IOWs, these access policy issues are not unique to F2FS or its use
case. Anything to do with access hints, policy, tracking, file
classification, etc that can influence data locality, reclaim,
migration, etc needs to be dealt with at the VFS, independently of a
specific filesystem. Filesystems can make use of that information
however they please (whether in the kernel or via userspace tools), but
having filesystem specific interfaces and implementations of the
same functionality is extremely wasteful. Let's do it once, and do
it right the first time. ;)

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx