RE: [PATCH 00/16] f2fs: introduce flash-friendly file system
From: Jaegeuk Kim
Date: Tue Oct 09 2012 - 08:01:15 EST
> -----Original Message-----
> From: Lukáš Czerner [mailto:lczerner@xxxxxxxxxx]
> Sent: Tuesday, October 09, 2012 8:01 PM
> To: Jaegeuk Kim
> Cc: 'Lukáš Czerner'; 'Namjae Jeon'; 'Vyacheslav Dubeyko'; 'Marco Stornelli'; 'Jaegeuk Kim'; 'Al Viro';
> tytso@xxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; chur.lee@xxxxxxxxxxx;
> cm224.lee@xxxxxxxxxxx; jooyoung.hwang@xxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx
> Subject: RE: [PATCH 00/16] f2fs: introduce flash-friendly file system
>
> On Tue, 9 Oct 2012, Jaegeuk Kim wrote:
>
> > Date: Tue, 09 Oct 2012 19:45:57 +0900
> > From: Jaegeuk Kim <jaegeuk.kim@xxxxxxxxxxx>
> > To: 'Lukáš Czerner' <lczerner@xxxxxxxxxx>
> > Cc: 'Namjae Jeon' <linkinjeon@xxxxxxxxx>,
> > 'Vyacheslav Dubeyko' <slava@xxxxxxxxxxx>,
> > 'Marco Stornelli' <marco.stornelli@xxxxxxxxx>,
> > 'Jaegeuk Kim' <jaegeuk.kim@xxxxxxxxx>,
> > 'Al Viro' <viro@xxxxxxxxxxxxxxxxxx>, tytso@xxxxxxx,
> > gregkh@xxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx,
> > chur.lee@xxxxxxxxxxx, cm224.lee@xxxxxxxxxxx, jooyoung.hwang@xxxxxxxxxxx,
> > linux-fsdevel@xxxxxxxxxxxxxxx
> > Subject: RE: [PATCH 00/16] f2fs: introduce flash-friendly file system
> >
> > > -----Original Message-----
> > > From: linux-fsdevel-owner@xxxxxxxxxxxxxxx [mailto:linux-fsdevel-owner@xxxxxxxxxxxxxxx] On Behalf Of
> > > Lukáš Czerner
> > > Sent: Tuesday, October 09, 2012 5:32 PM
> > > To: Jaegeuk Kim
> > > Cc: 'Namjae Jeon'; 'Vyacheslav Dubeyko'; 'Marco Stornelli'; 'Jaegeuk Kim'; 'Al Viro'; tytso@xxxxxxx;
> > > gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; chur.lee@xxxxxxxxxxx; cm224.lee@xxxxxxxxxxx;
> > > jooyoung.hwang@xxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx
> > > Subject: RE: [PATCH 00/16] f2fs: introduce flash-friendly file system
> > >
> > > On Mon, 8 Oct 2012, Jaegeuk Kim wrote:
> > >
> > > > Date: Mon, 08 Oct 2012 19:52:03 +0900
> > > > From: Jaegeuk Kim <jaegeuk.kim@xxxxxxxxxxx>
> > > > To: 'Namjae Jeon' <linkinjeon@xxxxxxxxx>
> > > > Cc: 'Vyacheslav Dubeyko' <slava@xxxxxxxxxxx>,
> > > > 'Marco Stornelli' <marco.stornelli@xxxxxxxxx>,
> > > > 'Jaegeuk Kim' <jaegeuk.kim@xxxxxxxxx>,
> > > > 'Al Viro' <viro@xxxxxxxxxxxxxxxxxx>, tytso@xxxxxxx,
> > > > gregkh@xxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx,
> > > > chur.lee@xxxxxxxxxxx, cm224.lee@xxxxxxxxxxx, jooyoung.hwang@xxxxxxxxxxx,
> > > > linux-fsdevel@xxxxxxxxxxxxxxx
> > > > Subject: RE: [PATCH 00/16] f2fs: introduce flash-friendly file system
> > > >
> > > > > -----Original Message-----
> > > > > From: Namjae Jeon [mailto:linkinjeon@xxxxxxxxx]
> > > > > Sent: Monday, October 08, 2012 7:00 PM
> > > > > To: Jaegeuk Kim
> > > > > Cc: Vyacheslav Dubeyko; Marco Stornelli; Jaegeuk Kim; Al Viro; tytso@xxxxxxx;
> > > > > > gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; chur.lee@xxxxxxxxxxx; cm224.lee@xxxxxxxxxxx;
> > > > > jooyoung.hwang@xxxxxxxxxxx; linux-fsdevel@xxxxxxxxxxxxxxx
> > > > > Subject: Re: [PATCH 00/16] f2fs: introduce flash-friendly file system
> > > > >
> > > > > 2012/10/8, Jaegeuk Kim <jaegeuk.kim@xxxxxxxxxxx>:
> > > > > >> -----Original Message-----
> > > > > >> From: Vyacheslav Dubeyko [mailto:slava@xxxxxxxxxxx]
> > > > > >> Sent: Sunday, October 07, 2012 9:09 PM
> > > > > >> To: Jaegeuk Kim
> > > > > > >> Cc: 'Marco Stornelli'; 'Jaegeuk Kim'; 'Al Viro'; tytso@xxxxxxx;
> > > > > > >> gregkh@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > > > > > >> chur.lee@xxxxxxxxxxx; cm224.lee@xxxxxxxxxxx; jooyoung.hwang@xxxxxxxxxxx;
> > > > > > >> linux-fsdevel@xxxxxxxxxxxxxxx
> > > > > >> Subject: Re: [PATCH 00/16] f2fs: introduce flash-friendly file system
> > > > > >>
> > > > > >> Hi,
> > > > > >>
> > > > > >> On Oct 7, 2012, at 1:31 PM, Jaegeuk Kim wrote:
> > > > > >>
> > > > > >> >> -----Original Message-----
> > > > > >> >> From: Marco Stornelli [mailto:marco.stornelli@xxxxxxxxx]
> > > > > >> >> Sent: Sunday, October 07, 2012 4:10 PM
> > > > > >> >> To: Jaegeuk Kim
> > > > > >> >> Cc: Vyacheslav Dubeyko; jaegeuk.kim@xxxxxxxxxxx; Al Viro;
> > > > > >> >> tytso@xxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx;
> > > > > >> >> linux-kernel@xxxxxxxxxxxxxxx; chur.lee@xxxxxxxxxxx;
> > > > > > >> >> cm224.lee@xxxxxxxxxxx; jooyoung.hwang@xxxxxxxxxxx;
> > > > > > >> >> linux-fsdevel@xxxxxxxxxxxxxxx
> > > > > >> >> Subject: Re: [PATCH 00/16] f2fs: introduce flash-friendly file system
> > > > > >> >>
> > > > > > >> >> On 06/10/2012 22:06, Jaegeuk Kim wrote:
> > > > > > >> >>> 2012-10-06 (Sat), 17:54 +0400, Vyacheslav Dubeyko:
> > > > > >> >>>> Hi Jaegeuk,
> > > > > >> >>>
> > > > > >> >>> Hi.
> > > > > >> >>> We know each other, right? :)
> > > > > >> >>>
> > > > > >> >>>>
> > > > > > >> >>>>> From: Jaegeuk Kim <jaegeuk.kim@xxxxxxxxxxx>
> > > > > > >> >>>>> To: viro@xxxxxxxxxxxxxxxxxx, 'Theodore Ts'o' <tytso@xxxxxxx>,
> > > > > > >> >>>>> gregkh@xxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx,
> > > > > > >> >>>>> chur.lee@xxxxxxxxxxx, cm224.lee@xxxxxxxxxxx,
> > > > > > >> >>>>> jaegeuk.kim@xxxxxxxxxxx, jooyoung.hwang@xxxxxxxxxxx
> > > > > >> >>>>> Subject: [PATCH 00/16] f2fs: introduce flash-friendly file system
> > > > > >> >>>>> Date: Fri, 05 Oct 2012 20:55:07 +0900
> > > > > >> >>>>>
> > > > > >> >>>>> This is a new patch set for the f2fs file system.
> > > > > >> >>>>>
> > > > > >> >>>>> What is F2FS?
> > > > > >> >>>>> =============
> > > > > >> >>>>>
> > > > > > >> >>>>> NAND flash memory-based storage devices, such as SSDs, eMMC, and SD
> > > > > > >> >>>>> cards, are widely used in systems ranging from mobile to server.
> > > > > > >> >>>>> Since they are known to have different characteristics from
> > > > > > >> >>>>> conventional rotational disks, a file system, as an upper layer to
> > > > > > >> >>>>> the storage device, should be designed from scratch to adapt to
> > > > > > >> >>>>> those differences.
> > > > > > >> >>>>>
> > > > > > >> >>>>> F2FS is a new file system carefully designed for NAND flash
> > > > > > >> >>>>> memory-based storage devices. We chose a log-structured file system
> > > > > > >> >>>>> approach, but tried to adapt it to the new form of storage. We also
> > > > > > >> >>>>> remedy some known issues of the very old log-structured file system,
> > > > > > >> >>>>> such as the snowball effect of the wandering tree and high cleaning
> > > > > > >> >>>>> overhead.
> > > > > > >> >>>>>
> > > > > > >> >>>>> Because a NAND-based storage device shows different characteristics
> > > > > > >> >>>>> according to its internal geometry or flash memory management
> > > > > > >> >>>>> scheme, aka FTL, we add various parameters not only for configuring
> > > > > > >> >>>>> the on-disk layout, but also for selecting allocation and cleaning
> > > > > > >> >>>>> algorithms.
> > > > > >> >>>>>
> > > > > >> >>>>
> > > > > > >> >>>> What about F2FS performance? Could you share benchmarking results for
> > > > > > >> >>>> the new file system?
> > > > > > >> >>>>
> > > > > > >> >>>> The case of an aged file system is very interesting. How efficient is
> > > > > > >> >>>> the GC implementation? Could you share benchmarking results for a
> > > > > > >> >>>> very aged file system state?
> > > > > >> >>>>
> > > > > >> >>>
> > > > > > >> >>> Although I have benchmark results, for now I'd like to see results
> > > > > > >> >>> measured by the community as a black box. As you know, the results
> > > > > > >> >>> are very dependent on the workloads and parameters, so I think it
> > > > > > >> >>> would be better to see other results for a while.
> > > > > >> >>> Thanks,
> > > > > >> >>>
> > > > > >> >>
> > > > > >> >> 1) Actually it's a strange approach. If you have got any results you
> > > > > >> >> should share them with the community explaining how (the workload, hw
> > > > > >> >> and so on) your benchmark works and the specific condition. I really
> > > > > >> >> don't like the approach "I've got the results but I don't say
> > > > > >> >> anything,
> > > > > >> >> if you want a number, do it yourself".
> > > > > >> >
> > > > > > >> > You're definitely right, and I meant *for a while*.
> > > > > > >> > I just wanted to avoid arguing about how to age the file system at
> > > > > > >> > this time.
> > > > > > >> > Until then, I share the preliminary results as follows.
> > > > > >> >
> > > > > >> > 1. iozone in Panda board
> > > > > >> > - ARM A9
> > > > > >> > - DRAM : 1GB
> > > > > >> > - Kernel: Linux 3.3
> > > > > >> > - Partition: 12GB (64GB Samsung eMMC)
> > > > > >> > - Tested on 2GB file
> > > > > >> >
> > > > > > >> >          seq. read  seq. write  rand. read  rand. write
> > > > > > >> > - ext4:     30.753      17.066       5.06         4.15
> > > > > > >> > - f2fs:     30.71       16.906       5.073       15.204
> > > > > >> >
> > > > > >> > 2. iozone in Galaxy Nexus
> > > > > >> > - DRAM : 1GB
> > > > > >> > - Android 4.0.4_r1.2
> > > > > >> > - Kernel omap 3.0.8
> > > > > >> > - Partition: /data, 12GB
> > > > > >> > - Tested on 2GB file
> > > > > >> >
> > > > > > >> >          seq. read  seq. write  rand. read  rand. write
> > > > > > >> > - ext4:     29.88       12.83       11.43         0.56
> > > > > > >> > - f2fs:     29.70       13.34       10.79        12.82
> > > > > >> >
> > > > > >>
> > > > > >>
> > > > > > >> These are results for a non-aged filesystem state. Am I correct?
> > > > > >>
> > > > > >
> > > > > > Yes, right.
> > > > > >
> > > > > >>
> > > > > > >> > Due to company confidentiality, I expect to show other results after
> > > > > > >> > presenting f2fs at the Korea Linux Forum.
> > > > > >> >
> > > > > >> >> 2) For a new filesystem you should send the patches to linux-fsdevel.
> > > > > >> >
> > > > > >> > Yes, that was totally my mistake.
> > > > > >> >
> > > > > > >> >> 3) The pros/cons of your filesystem are not clear. Can you share with
> > > > > > >> >> us the main differences from the filesystems already in mainline? Or
> > > > > > >> >> is it a company secret?
> > > > > >> >
> > > > > > >> > After the forum, I can share the slides, and I hope they will be
> > > > > > >> > useful to you.
> > > > > >> >
> > > > > > >> > Instead, let me summarize at a glance how it compares with other file
> > > > > > >> > systems.
> > > > > > >> > Here are several log-structured file systems.
> > > > > > >> > Note that F2FS operates on top of a block device, taking the FTL
> > > > > > >> > behavior into consideration.
> > > > > > >> > So JFFS2, YAFFS2, and UBIFS are out of scope, since they are designed
> > > > > > >> > for raw NAND flash.
> > > > > > >> > LogFS was initially designed for raw NAND flash, but was expanded to
> > > > > > >> > block devices.
> > > > > > >> > However, I don't know whether it is stable or not.
> > > > > > >> > NILFS2 is one of the major log-structured file systems, and it supports
> > > > > > >> > multiple snapshots.
> > > > > > >> > IMO, that feature is quite promising and important to users, but it may
> > > > > > >> > degrade performance.
> > > > > > >> > There is a trade-off between functionality and performance.
> > > > > > >> > F2FS chose high performance without any further fancy functionality.
> > > > > >> >
> > > > > >>
> > > > > > >> Performance is a good goal. But fault tolerance is also a very
> > > > > > >> important point. Filesystems are used by users, so it is very
> > > > > > >> important to guarantee reliable data storage. Whether snapshots
> > > > > > >> degrade performance is arguable. Snapshots can protect not only
> > > > > > >> against unpredictable environmental issues but also against users'
> > > > > > >> erroneous behavior.
> > > > > >>
> > > > > >
> > > > > > > Yes, I agree. My concern was the multiple-snapshot feature.
> > > > > > > Of course, fault tolerance is very important, and a file system should
> > > > > > > support it, for example as power-off recovery.
> > > > > > > f2fs supports a recovery mechanism by adopting checkpoints, similar to
> > > > > > > snapshots.
> > > > > > > But f2fs does not support multiple snapshots for user convenience.
> > > > > > > I just focused on performance, and the multiple-snapshot feature is
> > > > > > > certainly also a good alternative approach.
> > > > > > > That may be a trade-off.
> > > > > >
> > > > > > >> As I understand, it is not possible to have perfect performance for
> > > > > > >> all possible workloads. Could you point out which workloads F2FS is
> > > > > > >> best suited for?
> > > > > >
> > > > > > > Basically, I think the following workloads will be good for F2FS.
> > > > > > > - Many random writes: this is the nature of LFS.
> > > > > > > - Small writes with frequent fsync: f2fs is optimized to reduce the
> > > > > > > fsync overhead.
> > > > > >
> > > > > >>
> > > > > > >> > Maybe, or obviously, it is possible to optimize ext4 or btrfs for
> > > > > > >> > flash storage.
> > > > > > >> > IMHO, however, they were originally designed for HDDs, so they may or
> > > > > > >> > may not suffer from their fundamental designs.
> > > > > > >> > I don't know, but why not design a new file system for flash storage
> > > > > > >> > as a counterpart?
> > > > > >> >
> > > > > >>
> > > > > > >> Yes, it is possible. But F2FS is not a flash-oriented filesystem like
> > > > > > >> JFFS2, YAFFS2, and UBIFS, but a block-oriented filesystem. So the
> > > > > > >> F2FS design is restricted by what the block layer offers for
> > > > > > >> exploiting the peculiarities of flash storage. Could you point out
> > > > > > >> the key points of the F2FS design that make it fundamentally unique?
> > > > > >
> > > > > > > As you can see in the f2fs kernel document patch, I think one of the most
> > > > > > > important features is aligning operating units between f2fs and the FTL.
> > > > > > > Specifically, f2fs has sections and zones, which are the cleaning unit and
> > > > > > > the basic allocation unit, respectively.
> > > > > > > Through these configurable units, I think f2fs is able to reduce the
> > > > > > > unnecessary operations done by the FTL.
> > > > > > > And, in order to avoid the block layer changing the IO patterns, f2fs
> > > > > > > merges some bios itself, as ext4 does.
> > > > > Hello.
> > > > > > The internals of eMMC and SSD are a black box from the user's side.
> > > > > > How does a normal user easily set the operating-unit alignment (page
> > > > > > size and physical block size?) between f2fs and the FTL in the storage
> > > > > > device?
> > > >
> > > > > I know that some work has been done to figure out the units by profiling the storage, AKA
> > > > > reverse engineering.
> > > > > In most cases, the simplest way is to measure the latencies of consecutive writes and analyze
> > > > > their patterns.
> > > > > As you mentioned, in practice users will not want to do this, so maybe we need a tool that
> > > > > profiles the storage to optimize f2fs.
> > > > > In the current state, I think profiling is another issue, and mkfs.f2fs had better include
> > > > > this work in the future.
> > > > > But, IMO, from the viewpoint of performance, the default configuration is quite enough for now.
> > > > >
> > > > > ps) f2fs doesn't care about the flash page size, but it does consider the garbage collection unit.
> > >
> > > I am sorry but this reply makes me smile. How can you design a fs
> > > relying on time attack heuristics to figure out what the proper
> > > layout should be ? Or even endorse such heuristics to be used in
> > > mkfs ? What we should be focusing on is to push vendors to actually
> > > give us such information so we can properly propagate that
> > > throughout the kernel - that's something everyone will benefit from.
> > > After that the optimization can be done in every file system.
> > >
> >
> > Frankly speaking, I agree that it would be the right direction eventually.
> > But, as you know, it's very difficult to get all flash vendors to promote and standardize that,
> > because each vendor has a different strategy for opening its internal information and also tries
> > to protect its secrets, whatever they are.
> >
> > IMO, we don't need to wait for them now.
> > Instead, I propose f2fs, which builds that kind of information into the file system design from the start.
> > In addition, I suggest using heuristics right now as a best effort.
> > Maybe in the future, if vendors provide something, f2fs will become more feasible.
> > In the meantime, I strongly hope to validate and stabilize f2fs with the community.
>
> Do not get me wrong, I do not think it is worth waiting for vendors
> to come to their senses, but it is worth constantly reminding them that
> we *need* this kind of information and that those heuristics are not
> feasible in the long run anyway.
>
> I believe that this conversation has happened several times already, but
> what about having an independent public database of all the internal
> information about hw from different vendors, where users can add
> information gathered by the time attack heuristic so others do not
> have to run this again and again? I am not sure if Linaro or someone
> else has something like that; maybe someone can post a link to it.
>
As I mentioned, I agree that we should keep pushing vendors to open that information.
And I absolutely didn't mean that it is worth waiting for vendors.
I meant that, until vendors open that information, efforts such as
proposing f2fs or gathering heuristics are also needed at the same time.
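
For illustration only, here is a minimal sketch of the kind of heuristic
discussed above: given the latencies of consecutive fixed-size writes, look
for periodic latency spikes and take their spacing as a guess at the device's
internal unit. The function name, the spike_factor threshold, and the
synthetic numbers are assumptions made for this example; this is not part of
f2fs or mkfs.f2fs, and real devices will be much noisier.

```python
from collections import Counter
from statistics import median


def estimate_unit_bytes(latencies_ms, write_size_bytes, spike_factor=3.0):
    """Guess an internal device unit from periodic write-latency spikes.

    latencies_ms: latencies of consecutive writes, each write_size_bytes long.
    spike_factor: hypothetical threshold; a write counts as a spike when its
                  latency exceeds spike_factor times the median latency.
    Returns the estimated unit in bytes, or None if no period is visible.
    """
    if len(latencies_ms) < 2:
        return None
    baseline = median(latencies_ms)
    # Indices of writes that took much longer than a typical write.
    spikes = [i for i, lat in enumerate(latencies_ms)
              if lat > spike_factor * baseline]
    # Distances (in number of writes) between consecutive spikes.
    gaps = [b - a for a, b in zip(spikes, spikes[1:])]
    if not gaps:
        return None
    # The most frequent distance is taken as the period.
    period, _count = Counter(gaps).most_common(1)[0]
    return period * write_size_bytes


if __name__ == "__main__":
    # Synthetic data: every 8th 512 KiB write is slow, hinting at a 4 MiB unit.
    synthetic = [10.0 if i % 8 == 7 else 1.0 for i in range(64)]
    print(estimate_unit_bytes(synthetic, 512 * 1024))  # prints 4194304
```

Results gathered this way are only a best-effort guess, which is why a shared
database of measured values would help.
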
Anyway, it would be very interesting to build a database gathering product information.
May I access the database?
Thanks,
> Eventually we can show this to the vendors so they see that their
> "secrets" are already public anyway and that everyone's lives would be
> easier if they just agreed to provide it from the beginning.
>
> >
> > > Promoting time attack heuristics instead of pushing vendors to tell
> > > us how their hardware should be used is a journey to hell and we've
> > > been talking about this for a looong time now. And I imagine that
> > > you especially have quite some persuasion power.
> >
> > I know. :)
> > If there comes a chance, I want to try.
> > Thanks,
>
> That's very good to hear, thank you.
>
> -Lukas
>
> >
> > >
> > > Thanks!
> > > -Lukas
> > >
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > > >
> > > > > >>
> > > > > >> With the best regards,
> > > > > >> Vyacheslav Dubeyko.
> > > > > >>
> > > > > >>
> > > > > >> >>
> > > > > >> >> Marco
> > > > > >> >
> > > > > >> > ---
> > > > > >> > Jaegeuk Kim
> > > > > >> > Samsung
> > > > > >> >
> > > > > >> > --
> > > > > >> > To unsubscribe from this list: send the line "unsubscribe linux-kernel"
> > > > > >> > in
> > > > > >> > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > > > >> > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > >> > Please read the FAQ at http://www.tux.org/lkml/
> > > > > >
> > > > > >
> > > > > > ---
> > > > > > Jaegeuk Kim
> > > > > > Samsung
> > > > > >
> > > > > > --
> > > > > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > > > > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > >
> > > >
> > > >
> > > > ---
> > > > Jaegeuk Kim
> > > > Samsung
> > > >
> > > >
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> > > > the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > > More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > >
> >
> >
> >
> > ---
> > Jaegeuk Kim
> > Samsung
> >
> >
> >