Re: [PATCH v6] fat: Batched discard support for fat

From: Lukas Czerner
Date: Tue May 24 2011 - 05:25:30 EST


On Tue, 24 May 2011, OGAWA Hirofumi wrote:

> Kyungmin Park <kmpark@xxxxxxxxxxxxx> writes:
>
> >>> It's handled in the trim implementation. It just trims the blocks FAT
> >>> is aware of, not the blocks which FAT doesn't know about.
> >>> As FAT doesn't use blocks 0 and 1, the kernel adjusts the start block.
> >>>
> >>> +       if (start < FAT_START_ENT)
> >>> +               start = FAT_START_ENT;
> >>>
> >>> and don't exceed the max cluster size.
> >>>
> >>> +       len = (len > sbi->max_cluster) ? sbi->max_cluster : len;
> >>>
> >>> +       for (count = start; count <= len; count++) {
> >>
> >> Yes. We _adjust_ from 0 to 2 here, so the end block also has to be
> >> _adjusted_.
> >>
> >> From other point of view, if userland specified 0 - max-length
> >> (i.e. number of blocks), what happens? It would trim block of 2 -
> >> (max-length - 2), right?
> >
> > No, length is not changed. so max-length is used.
>
> No, no. Userland will know max-length from statvfs, right? So, let's
> assume it is 100 (->f_blocks) * 1024 (->f_bsize).

You do not need to know the filesystem size to do the discard; the range
is adjusted within the kernel. Just specify ULLONG_MAX as the length. See
the fstrim tool in util-linux-ng.
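
As a rough illustration of that call from userspace, here is a minimal
sketch using the FITRIM ioctl and struct fstrim_range from <linux/fs.h>.
The helper names whole_fs_range() and trim_whole_fs() are made up for
this example, and the error handling is deliberately thin:

```c
#include <assert.h>
#include <fcntl.h>
#include <limits.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>	/* FITRIM, struct fstrim_range */

/* Hypothetical helper: describe "the whole filesystem" without knowing its size. */
static struct fstrim_range whole_fs_range(void)
{
	struct fstrim_range range = {
		.start  = 0,		/* byte offset to start from */
		.len    = ULLONG_MAX,	/* the kernel clamps this to the fs size */
		.minlen = 0,		/* no minimum free extent size */
	};
	return range;
}

/* Hypothetical helper: returns 0 on success, -1 on failure (errno is set). */
static int trim_whole_fs(const char *mountpoint)
{
	struct fstrim_range range = whole_fs_range();
	int fd, ret;

	assert(mountpoint != NULL);

	fd = open(mountpoint, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, FITRIM, &range);
	close(fd);
	return ret;
}
```

Calling trim_whole_fs() on a mount point typically needs root, and it
fails if the filesystem or the device does not support discard.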

>
> Now userland knows the max length, 102400, ok? Let's start to trim.
>
> Assume userland wants to trim the whole filesystem. So userland will
> specify
>
> trim(0, 102400).
>
> What actually happens in the kernel?
>
> The current implementation doesn't map blocks. So, in the case of FAT, it
> adjusts the start from 0 to 2 * 1024.
>
> So it trims between 2048 and 102400. The problem is here. The FS layout is
> actually 2048 to (102400 + 2048). I.e. userland actually has to do
>
> trim(2048, 102400 + 2048)
>
> to specify the whole filesystem. How would it know about 2048?

You do not need to know anything in userspace. If you want to trim the
whole filesystem you just do trim(0, ULLONG_MAX) - which is what fstrim
does when you do not specify a range. And you just skip the filesystem
metadata, obviously, regardless of whether it is at the beginning of the
filesystem or in the middle. Just do whatever you need to do within your
filesystem.
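
To make the in-kernel adjustment concrete, here is a small standalone
sketch (not the code from the patch). It assumes the byte offset divided
by the cluster size is used directly as the cluster number, as the quoted
patch does, and that clusters 0 and 1 are reserved. The names
clamp_trim_range(), struct cluster_range and last_valid_cluster are
invented for this example:

```c
#include <assert.h>
#include <stdint.h>

#define FAT_START_ENT 2ULL	/* first usable cluster; 0 and 1 are reserved */

struct cluster_range {
	uint64_t first;		/* first cluster to consider */
	uint64_t last;		/* last cluster to consider (inclusive) */
	int empty;		/* non-zero if nothing is left to trim */
};

/* Clamp a byte range from userspace to the clusters the filesystem knows. */
static struct cluster_range clamp_trim_range(uint64_t start, uint64_t len,
					     uint64_t cluster_size,
					     uint64_t last_valid_cluster)
{
	struct cluster_range r = { 0, 0, 1 };
	uint64_t end;

	assert(cluster_size > 0);
	if (len == 0)
		return r;

	/* Last byte of the range, saturating instead of overflowing. */
	end = (len > UINT64_MAX - start) ? UINT64_MAX : start + len - 1;

	r.first = start / cluster_size;
	r.last = end / cluster_size;

	if (r.first < FAT_START_ENT)
		r.first = FAT_START_ENT;	/* skip the reserved entries */
	if (r.last > last_valid_cluster)
		r.last = last_valid_cluster;	/* do not run past the end */

	r.empty = r.first > r.last;
	return r;
}
```

With a 1024 byte cluster size and clusters 2..101 valid, a request of
(0, ULLONG_MAX) comes out as clusters 2..101, so userspace never has to
know about the offset of 2.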

What we do in ext4 is convert the start and length passed in struct
fstrim_range into filesystem block units, then get the last allocation
group and the block offset within that group (we do the same for the
start block), and try to discard the free block ranges from the starting
block to the last block.
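
The unit conversion described above can be sketched roughly as follows.
This is a simplified standalone model, not the actual ext4 code: it
assumes the first data block is block 0, and the names struct group_pos
and byte_to_group_pos() are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

struct group_pos {
	uint64_t group;		/* allocation group number */
	uint64_t offset;	/* block offset within that group */
};

/* Map a byte offset to (group, block offset within the group). */
static struct group_pos byte_to_group_pos(uint64_t byte, uint64_t block_size,
					  uint64_t blocks_per_group)
{
	struct group_pos pos;
	uint64_t block;

	assert(block_size > 0 && blocks_per_group > 0);

	block = byte / block_size;		/* bytes -> filesystem blocks */
	pos.group = block / blocks_per_group;	/* which allocation group */
	pos.offset = block % blocks_per_group;	/* where inside that group */
	return pos;
}
```

Doing this once for the start and once for the last byte of the range
gives the first and last group to walk; the free extents in between are
the candidates for discard.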

It is really not rocket science, and since every filesystem is
different and has different internal data structures, it is up to you how
to do this. And if you shift by a block or two, it really does not matter
much, since userland does not know how the filesystem blocks are
laid out anyway, nor does it know which are free and which are not.

I agree that the interface is a little bit fuzzy, but that is mainly
because it is intended to be filesystem independent and we do have a lot
of various filesystems, so I wanted it to be as flexible as possible,
hence the start and len in bytes.

Hope it helped.

Thanks!
-Lukas

>
> See what I'm saying?
>
> FAT has a linear block space, so the problem is small compared to mapping.
> But other FSes have a bigger problem.
>
> Thanks.
>
