Re: [PATCH 00/22 take 3] UBI: Unsorted Block Images

From: Thomas Gleixner
Date: Mon Mar 19 2007 - 20:37:05 EST


On Mon, 2007-03-19 at 16:36 -0500, Matt Mackall wrote:
> On Mon, Mar 19, 2007 at 11:06:33PM +0200, Artem Bityutskiy wrote:
> > On Mon, 2007-03-19 at 14:54 -0500, Matt Mackall wrote:
> > > The issue is 14000 lines of patch to make a parallel subsystem.
> >
> > The parallel system has existed for a very long time. One is
> > flash->SW_or_HW_FTL->all_blkdev_stuff. The other is MTD->JFFS2. Think
> > about _why_ there are 2 of them. Hint - reliability, performance. Your
> > ranting basically says that only the first one makes sense. This is not
> > true.
>
> A better way would be for MTD to deliver a block dev with a rich
> enough interface for JFFS2 to use efficiently in the first place. Yes,
> I know that can't be done with the current block dev layer. But that's
> what the source is for.

Why the hell would JFFS2 need a block device interface?

What's the gain?

> > We enhance the second branch, not the first, please, realize this. Both
> > branches have their user base, and have always had.
> >
> > > iSCSI/nbd(6)
> > > |
> > > filesystem { swap | ext3 ext3 jffs2
> > > \ | | | /
> > > / \ | dm-crypt->snapshot(5) /
> > > device mapper -| \ \ | /
> > > | partitioning /
> > > | | partitioning(4)
> > > | wear leveling(3) /
> > > | | /
> > > | block concatenation
> > > | | | | |
> > > \ bad block remapping(2)
> > > | | | |
> > > MTD raw block { raw block devices with no smarts(1)
> > > / | \ \
> > > hardware { NAND NAND NAND NAND
> >
> > Matt, as I pointed out in the first mail, flash != block device.
>
> And as I pointed out, you're wrong. It is both block oriented
> (eraseBLOCK??) and random access. That's what a block device is. The
> fact that it doesn't look like the other things that Linux currently
> calls a block device and supports well is another matter.

It very much matters, because it is not a block device. It is a FLASH
device, and you can make as many eraseBLOCK comparisons as you want; that
does not turn FLASH into a DISK.

Again: Disks (including CF-Cards and USB-Sticks) have intelligent
controllers, which abstract the hardware oddities away and present a
block device to you.

> > In your picture I see NAND->MTD raw block. So am I right that you
> > assume that we already have a decent FTL? The fact is that we do
> > not.
>
> No. Look at the picture for more than two seconds, please.
>
> I can tell you didn't do this because you didn't manage to find (1)
> which explicitly says "with no smarts". And you also cut out the footnote
> where I explained what I meant by "with no smarts".
>
> Find the spots marked (2) and (3). These are your FTL.

And where, please, are (2) and (3) inside of device mapper?

> > Please bear in mind that a decent FTL is difficult and an FS on top of
> > an FTL is slow; the FTL hits performance considerably.
>
> ...and if you'd actually looked at the picture, you'd have seen JFFS2
> bypassing it. Along with another footnote explaining it.

The (4) partitioning and JFFS2 on top is a step back from the current
UBI functionality. Now we can have resizable partitioning even for JFFS2
and JFFS2 can utilize the UBI wear levelling, which is way better than
the crude heuristics of JFFS2.

You want to force FLASH into device mapper for some strange and
non-obvious reason. The mere coincidence of "eraseBLOCK" and
"BLOCKdevice" is not really convincing.

You impose the usage of eraseblock size on FLASH, which is simply wrong:

DISK has a 1:1 relationship of "eraseblock" and minimal I/O. FLASH does
not. I did the math in a different mail and I'm not buying your factor
32 reduction in FLASH lifetime as the price for having a few lines of
code less in the kernel.

If you are seriously considering running ext3, xfs or whatever on top of
FLASH, please go and do the homework on CF-Cards and USB-Sticks. Run them
into the fast wearout death. Device mapper does nothing to help avoid
that. Running ext3 on top of FLASH with a minimal I/O size of erase block
size is simply braindead.

tglx

